05-29-2010, 07:28 PM
Ron Johnson

Acroread: accelerating the search through a PDF

On 05/29/2010 01:47 PM, Merciadri Luca wrote:
> Hi,
>
> I sometimes have really long documents (>4000 p) for specs., or for

Wow. How big is that?

> other purely technical stuff. I sometimes look for a given model, or for
> a given word. The fact is that acroread reads ~8 pg/s, and, thus, if I
> do not know that my keyword is simply at the last page of the document,
> it takes 500s ~8 minutes and a half. How can I speed it up? Why is it so
> sluggish? Do not tell me that it is limited by R/W access on the HDD...

Have you tried other PDF readers? Searched for Linux-based PDF
indexers?

Do you hear the disk spin up when you start the search?

In Edit->Preferences->Search there is a knob or two you can diddle with.

Lastly, acroread is free-as-in-beer. Adobe wants you to buy Acrobat
to get the Good Stuff.


--
Dissent is patriotic, remember?


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

Archive: http://lists.debian.org/4C016AC0.4030707@cox.net
 
05-29-2010, 07:34 PM
Merciadri Luca

Ron Johnson wrote:
> On 05/29/2010 01:47 PM, Merciadri Luca wrote:
>> Hi,
>>
>> I sometimes have really long documents (>4000 p) for specs., or for
>
> Wow. How big is that?
Well, there are many bigger works, such as encyclopedias!
>
>> other purely technical stuff. I sometimes look for a given model, or for
>> a given word. The fact is that acroread reads ~8 pg/s, and, thus, if I
>> do not know that my keyword is simply at the last page of the document,
>> it takes 500s ~8 minutes and a half. How can I speed it up? Why is it so
>> sluggish? Do not tell me that it is limited by R/W access on the HDD...
>>
>
> Have you tried other PDF readers? Searched for Linux-based PDF indexers?
As I said in another topic, I am totally okay for free stuff (if it was
not the case, I would not be using Debian: thinking unfree but using
free is cowardice), but the fact is that I have not found a reader whose
range of compatibility with the PDF standard is as high as in acroread.
Acroread is slow, boring, sometimes buggy, but I need to use it as long
as I do not find a PDF reader which has such a big compatibility range.
> Do you hear the disk spin up when you start the search?
Not at all. I have an HDD load monitor, and I do not see any trace
of HDD use. Such documents often contain no pictures (only
schematics, as you might guess), and are thus light, so I do not expect
acroread to use the HDD much when looking for a word.
> In Edit->Preferences->Search there is a knob or two you can diddle with.
Yes, I tried, but it made no difference.
> Lastly, acroread is free-as-in-beer. Adobe wants you to buy Acrobat
> to get the Good Stuff.
That's a fact. That's the less attractive side of acroread.


--
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.


Big thunder. Little rain.
 
05-29-2010, 10:01 PM
Erik Heil

Hi. I'm posting this reply to the list/Usenet newsgroup so that others
can benefit from it and so it can be properly archived. I'm using
Gmail here, and there doesn't seem to be a way to disable the
"Conversations" view, so it is easy to get confused when you use the
"Quick Reply" feature.
---Erik
---------- Forwarded message ----------
From: Erik Heil <eheil1@gmail.com>
Date: Sat, 29 May 2010 16:15:20 -0400
Subject: Re: Acroread: accelerating the search through a PDF
To: Luca.Merciadri@student.ulg.ac.be

Hi.
What you may want to look at is a document
management system. For your needs, you won't need anything upscale,
just something that can process the PDF documents, index the text, and
open the result in Acroread, or optionally another reader. Since indexing
generates compressed versions of the document, you should be able to
get reasonable search times. Perhaps run a nightly cron job to index
new documents; I don't know how often you add new ones. These are just
ideas for you to play with. I'm sure Debian has something like this to
offer. Maybe not? Perhaps other people here have more knowledge
of this.
--Erik
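
To illustrate what "index the text" could mean in practice, here is a rough sketch of an in-memory inverted index. It assumes the text of each page has already been extracted by some other tool; the names build_index and lookup are hypothetical and do not come from any particular document management system.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the sorted list of 1-based pages it occurs on.

    `pages` is a list of strings, one per page (an assumption of this sketch).
    """
    index = defaultdict(set)
    for number, text in enumerate(pages, start=1):
        # \w+ is a crude tokenizer; real part numbers may need a better one.
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(number)
    return {word: sorted(numbers) for word, numbers in index.items()}


def lookup(index, word):
    """Return the pages containing `word`, or an empty list if it is absent."""
    return index.get(word.lower(), [])
```

The index is built once, which costs about as much as one full scan; every later lookup is a dictionary access instead of a pass over all pages.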



--
Archive: http://lists.debian.org/AANLkTilf9B2EuHkFMAJ5zZryZysyl0vyLYgyTI4iM9Cq@mail.gmail.com
 
05-30-2010, 12:22 AM
Ron Johnson

On 05/29/2010 02:34 PM, Merciadri Luca wrote:
> Ron Johnson wrote:

[snip]

>> Have you tried other PDF readers? Searched for Linux-based PDF indexers?
>
> As I said in another topic, I am totally okay for free stuff (if it was
> not the case, I would not be using Debian: thinking unfree but using
> free is cowardice), but the fact is that I have not found a reader whose
> range of compatibility with the PDF standard is as high as in acroread.
> Acroread is slow, boring, sometimes buggy, but I need to use it as long
> as I do not find a PDF reader which has such a big compatibility range.


Nothing says that you must only use one reader at a time.

If poppler, for example, doesn't render *exactly* but searches
/rapidly/, then you could search using poppler and "read" using
Acroread.


Alternatively, install poppler-utils for its pdftohtml. Certainly
it won't be perfect, but a browser might be faster than Acroread.
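
A minimal sketch of the "search with poppler, read with Acroread" idea, assuming poppler-utils is installed and that pdftotext separates pages with form feed characters in its output. The function names pages_containing and search_pdf are invented for illustration; this is not code anyone in the thread posted.

```python
import subprocess


def pages_containing(text, keyword):
    """Return 1-based page numbers whose text contains keyword (case-insensitive).

    Assumes pages are separated by form feed characters, as in pdftotext output.
    """
    needle = keyword.lower()
    pages = text.split("\f")
    return [number for number, page in enumerate(pages, start=1)
            if needle in page.lower()]


def search_pdf(path, keyword):
    """Extract the text of a PDF with pdftotext and report matching pages."""
    # "-" as the output file makes pdftotext write to standard output.
    result = subprocess.run(["pdftotext", "-layout", path, "-"],
                            capture_output=True, encoding="utf-8",
                            errors="replace", check=True)
    return pages_containing(result.stdout, keyword)
```

Calling search_pdf("specs.pdf", "some model") with a hypothetical file name would return a list of page numbers, which you could then jump to by hand in Acroread.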


--
Dissent is patriotic, remember?


--
Archive: http://lists.debian.org/4C01AFDD.7050004@cox.net
 
05-30-2010, 01:40 AM
Erik Heil

Hi there.
I believe that I have some solutions to your problem. First of all,
you need to see whether or not your documents are in some kind of
structured format. If they are, say, DocBook XML or something similar,
you may be able to find a quick solution to the searching problem. If
the documents are structured, you can probably parse them by entity
type. Of course, this depends on how well they are marked up. As
I stated earlier, the key item here is to generate rapidly
searchable indexes that can be queried. I'm assuming that
since you deal with highly technical data, it is more or less in a
structured form. You could even generate SQL statements and possibly
use SQLite if you don't want the overhead of a full DB. Anyway, I'm more
than willing to help in any way with this project of yours. Let me
know what you think.
--Erik
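
As a rough illustration of the SQLite suggestion, the sketch below stores (word, page) pairs in a table so that the index survives between runs. It assumes the page texts have already been extracted from the PDF; the table name word_page and the functions store_pages and find_pages are hypothetical choices for this example only.

```python
import re
import sqlite3


def store_pages(conn, pages):
    """Insert one (word, page) row per distinct word on each page."""
    conn.execute("CREATE TABLE IF NOT EXISTS word_page (word TEXT, page INTEGER)")
    # The SQL index on `word` is what makes lookups fast on large documents.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_word ON word_page (word)")
    rows = set()
    for number, text in enumerate(pages, start=1):
        for word in re.findall(r"\w+", text.lower()):
            rows.add((word, number))
    conn.executemany("INSERT INTO word_page VALUES (?, ?)", sorted(rows))
    conn.commit()


def find_pages(conn, word):
    """Return the sorted page numbers on which `word` occurs."""
    cursor = conn.execute(
        "SELECT DISTINCT page FROM word_page WHERE word = ? ORDER BY page",
        (word.lower(),))
    return [row[0] for row in cursor]
```

Passing a file path to sqlite3.connect instead of ":memory:" would keep the index on disk, so it only needs rebuilding when a document changes.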



--
Archive: http://lists.debian.org/AANLkTil7rpW7qQYqmPMBBkzvWEA2a3Ns8fqCZfa43iii@mail.gmail.com
 
05-30-2010, 07:04 AM
Merciadri Luca

Yes, why not. But if they are in PDF format, how can I (re)structure
them better? Thanks.



--
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.
 
05-30-2010, 07:26 AM
Merciadri Luca


Ron Johnson <ron.l.johnson@cox.net> writes:

> On 05/29/2010 02:34 PM, Merciadri Luca wrote:
>> Ron Johnson wrote:
> [snip]
>>>
>>> Have you tried other PDF readers? Searched for Linux-based PDF indexers?
>> As I said in another topic, I am totally okay for free stuff (if it was
>> not the case, I would not be using Debian: thinking unfree but using
>> free is cowardice), but the fact is that I have not found a reader whose
>> range of compatibility with the PDF standard is as high as in acroread.
>> Acroread is slow, boring, sometimes buggy, but I need to use it as long
>> as I do not find a PDF reader which has such a big compatibility range.
>
> Nothing says that you must only use one reader at a time.
>
> If poppler, for example, doesn't render *exactly* but searches
> /rapidly/, then you could search using poppler and "read" using
> Acroread.
>
> Alternatively, install poppler-utils for it's pdftohtml. Certainly it
> won't be perfect, but a browser might be faster than Acroread.
You're right. Why not? I'll try it out. Thanks.
--
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/

Better is the enemy of good.


--
Archive: http://lists.debian.org/87ocfx26ll.fsf@merciadriluca-station.MERCIADRILUCA
 
05-30-2010, 09:54 AM
Camaleón

On Sat, 29 May 2010 20:47:52 +0200, Merciadri Luca wrote:

> I sometimes have really long documents (>4000 p) for specs., or for
> other purely technical stuff. I sometimes look for a given model, or for
> a given word. The fact is that acroread reads ~8 pg/s, and, thus, if I
> do not know that my keyword is simply at the last page of the document,
> it takes 500s ~8 minutes and a half. How can I speed it up? Why is it so
> sluggish? Do not tell me that it is limited by R/W access on the HDD...

4000 pages? Wow, I don't think I have ever opened such a document :-)

If you provide a sample link, we could run some text search performance
tests on the file.

Also, I don't have Acrobat installed on my Linux boxes, but on Windows
there are two search facilities, "Find" and "Advanced Search". The latter
is quicker.

Greetings,

--
Camaleón


--
Archive: http://lists.debian.org/pan.2010.05.30.09.54.49@gmail.com
 
05-30-2010, 12:11 PM
Eduardo M KALINOWSKI

On 05/29/2010 04:34 PM, Merciadri Luca wrote:
> As I said in another topic, I am totally okay for free stuff (if it was
> not the case, I would not be using Debian: thinking unfree but using
> free is cowardice), but the fact is that I have not found a reader whose
> range of compatibility with the PDF standard is as high as in acroread.
> Acroread is slow, boring, sometimes buggy, but I need to use it as long
> as I do not find a PDF reader which has such a big compatibility range.



Well, if you need Adobe Acrobat Reader, complain to Adobe that it's slow
and hope they fix it.



--
Eduardo M KALINOWSKI
eduardo@kalinowski.com.br


--
Archive: http://lists.debian.org/4C0255EC.40407@kalinowski.com.br
 
05-30-2010, 02:01 PM
Merciadri Luca

Eduardo M KALINOWSKI wrote:
> Well, if you need Adobe Acrobat Reader, complain to Adobe that it's
> slow and hope they fix it.
But I find it odd that it is not faster. Adobe wants everybody
to use its client, so why don't they make it better?
Usually, if you want something to look interesting to others,
you try to make it as attractive as possible.

--
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.
 
