Searching for words within PDFs
I have a collection of 100+ .pdf documents that are interrelated. Sometimes I would like to find a common word within these documents. Is there a way in which I can search within all of them to produce a result? Thank you
|
You could try "find | xargs grep wordswordswords", but that'd require a Linux command line and text files. I'm pretty sure PDFs don't store their text as plain text (it's usually compressed or encoded), so grep on the raw files is moot.
Try looking for a program to rip text out of a PDF and then use that command above. Ah, here we go. Google to the rescue. |
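For anyone without a Linux shell handy, here is a rough Python sketch of what the find/grep idea does once you have plain text files. The function name grep_tree and the ".txt" suffix are my own assumptions for illustration, not anything from a particular tool.

```python
import os
import re


def grep_tree(root, word, suffix=".txt"):
    """Return (path, line_number, line) for each line under `root` containing `word`.

    Hypothetical helper: a case-insensitive stand-in for `find | xargs grep -i`.
    """
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Only look at files with the chosen suffix (text files by default).
            if not name.lower().endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps odd bytes from aborting the whole search.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        hits.append((path, number, line.rstrip("\n")))
    return hits
```

Calling grep_tree("some_folder", "wordswordswords") should give you a list of hits you can print. It only helps once the PDF text has been extracted to text files, same as the grep command.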
There are pdf2txt utilities on Linux that extract the text from a PDF, so this shouldn't be a problem when scripting. You only get into trouble when the text in the PDF is in fact not text but a bitmap. Then even Acrobat Reader will fail when searching for text.
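A minimal sketch of how that scripting could look, assuming Poppler's pdftotext command is installed and on the PATH (that is just one extraction utility, and my choice here). The names pdftotext_extract and search_pdfs are made up for the example. PDFs that yield no text at all are reported separately, since those are probably the bitmap-only scans mentioned above.

```python
import os
import shutil
import subprocess


def pdftotext_extract(path):
    """Return the text of one PDF via Poppler's `pdftotext` (assumed to be installed)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        ["pdftotext", path, "-"],  # "-" sends the extracted text to stdout
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def search_pdfs(root, word, extract=pdftotext_extract):
    """Return (matching_paths, paths_without_text) for all PDFs under `root`.

    `extract` is any function mapping a PDF path to its text, so a different
    extraction tool can be swapped in.
    """
    needle = word.lower()
    matches, no_text = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(dirpath, name)
            text = extract(path)
            if not text.strip():
                # Nothing extractable: likely scanned pages, which would need OCR.
                no_text.append(path)
            elif needle in text.lower():
                matches.append(path)
    return matches, no_text
```

Something like matches, scans = search_pdfs("my_pdfs", "word") would then give the files containing the word, plus the ones that had no searchable text.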
|
Macs content-index PDFs automatically, searchable in Spotlight.
I was under the impression that any of the desktop search programs for Windows did the same thing (MSN/Google Desktop Search for XP, or the built-in search in Vista). Do they not? |
I think not. At least the normal search in Win2k treats any file except standard text files as binary data.
|
Use Foxit. It's a free download.
|