Exploding Garrmondo Weiner Interactive Swiss Army Penis


Hotobu Jul 27, 2007 07:30 PM

Searching for words within PDFs
 
I have a collection of 100+ .pdf documents that are interrelated. Sometimes I would like to find a common word within these documents. Is there a way in which I can search within all of them to produce a result? Thank you

neus Jul 29, 2007 11:20 PM

You could try "find | xargs grep wordswordswords", but that requires a Linux command line and text files. I'm pretty sure PDFs store their text compressed or otherwise encoded, so grepping the raw files is moot.
Try looking for a program to rip text out of a PDF and then use that command above.
Ah, here we go. Google to the rescue.
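
The "find | xargs grep" idea above, written out in full, might look like the sketch below. It assumes the text has already been extracted into .txt files; search_txt and the ./extracted directory are hypothetical names used only for illustration, not anything from the thread.

```shell
# Sketch: list every .txt file under a directory that contains a word.
# search_txt is a hypothetical helper name.
#   $1 = directory to search, $2 = word to look for
search_txt() {
    # -print0 and -0 keep file names with spaces intact.
    # grep flags: -i ignore case, -l print only the names of matching files,
    # -F treat the word as a literal string, -- end of options.
    find "$1" -type f -name '*.txt' -print0 | xargs -0 grep -ilF -- "$2"
}

# Example call (the directory name is an assumption):
#   search_txt ./extracted "invoice"
```

Only file names are printed; dropping -l from the grep flags would show the matching lines as well.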

LiquidAcid Jul 30, 2007 07:13 AM

There are pdf2txt utilities on Linux that extract the text from a PDF, so this shouldn't be a problem when scripting. You only get in trouble when the text in the PDF is in fact not text but a bitmap. Then even Acrobat Reader will fail when searching for text.
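
A rough sketch of that scripting approach follows, assuming the extractor is pdftotext (the command-line tool shipped with Poppler and Xpdf), which writes to standard output when the output file is given as "-". The function name search_pdfs and the PDF_EXTRACTOR override variable are hypothetical, introduced only for this example.

```shell
# Sketch: print the name of each PDF in a directory whose extracted text
# contains a word. search_pdfs and PDF_EXTRACTOR are hypothetical names.
#   $1 = directory holding the PDFs, $2 = word to look for
# The ( ... ) body runs in a subshell, so the variables stay local to the call.
search_pdfs() (
    dir="$1"
    word="$2"
    # Assumption: pdftotext is installed; PDF_EXTRACTOR can name another tool
    # that accepts the same "FILE -" arguments.
    extractor="${PDF_EXTRACTOR:-pdftotext}"
    for pdf in "$dir"/*.pdf; do
        [ -f "$pdf" ] || continue   # skip the unexpanded pattern when nothing matches
        # "FILE -" sends the extracted text to stdout; grep -q only reports
        # whether the word occurs (-i ignore case, -F literal string).
        if "$extractor" "$pdf" - 2>/dev/null | grep -qiF -- "$word"; then
            printf '%s\n' "$pdf"
        fi
    done
)

# Example call (the directory name is an assumption):
#   search_pdfs ./docs "warranty"
```

A scanned PDF that holds only page images yields no extracted text, so it is never listed, which matches the bitmap caveat above.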

killmoms Jul 30, 2007 08:23 AM

Macs content-index PDFs automatically, searchable in Spotlight.

I was under the impression that any of the desktop search programs for Windows did the same thing (MSN/Google Desktop Search for XP, or the built-in search in Vista). Do they not?

LiquidAcid Jul 30, 2007 05:33 PM

I think not. At least the normal search in Windows 2000 treats any file other than a standard text file as binary data.

Dyesan Aug 1, 2007 11:25 PM

Use Foxit. It's a free download.

shadoweave Aug 12, 2007 01:46 AM

Quote:

Originally Posted by LiquidAcid (Post 481180)
There are pdf2txt utilities on Linux that extract the text from a PDF, so this shouldn't be a problem when scripting. You only get in trouble when the text in the PDF is in fact not text but a bitmap. Then even Acrobat Reader will fail when searching for text.

As LiquidAcid said, there's no reliable way to search for words if the words are actually images. But if they ARE text, Adobe Acrobat Reader itself has an option to search all the PDF files in an entire directory. I believe the function is called Full Reader Search.


All times are GMT -5.