Searching for words within PDFs
I have a collection of 100+ interrelated .pdf documents. Sometimes I would like to find a particular word across these documents. Is there a way to search within all of them at once and get a list of results? Thank you.
You could try "find . -type f -print0 | xargs -0 grep wordswordswords", but that'd require a Linux command line and plain-text files. I'm pretty sure PDFs store their text compressed or otherwise encoded rather than as plain text, so that's moot on its own.
Try looking for a program that rips the text out of a PDF, then run the command above on the extracted text. Ah, here we go. Google to the rescue.
Last edited by neus; Jul 29, 2007 at 11:23 PM.
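For what it's worth, the extract-then-search idea could look roughly like the sketch below. It assumes Poppler's `pdftotext` command-line tool is installed (the thread doesn't name a specific program), and `search_pdfs` and `pdftotext_extract` are made-up names for illustration only.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, List


def pdftotext_extract(pdf_path: Path) -> str:
    """Extract text by shelling out to Poppler's `pdftotext` (assumed installed).

    Passing "-" as the output file makes pdftotext write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def search_pdfs(
    folder: str,
    word: str,
    extract: Callable[[Path], str] = pdftotext_extract,
) -> Dict[str, List[str]]:
    """Return {pdf path: [matching lines]} for every *.pdf under `folder`.

    The match is a case-insensitive substring test, line by line.
    """
    needle = word.lower()
    hits: Dict[str, List[str]] = {}
    for pdf_path in sorted(Path(folder).rglob("*.pdf")):
        text = extract(pdf_path)
        matches = [
            line.strip()
            for line in text.splitlines()
            if needle in line.lower()
        ]
        if matches:
            hits[str(pdf_path)] = matches
    return hits
```

Something like `search_pdfs("/path/to/pdfs", "wordswordswords")` would then give back a dict of file names and the lines that matched. The `extract` parameter is only there so a different text-ripping program can be swapped in.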
Chocorific |
There are pdf2txt utilities on Linux that extract the text from a PDF, so this shouldn't be a problem when scripting. You only get into trouble when the text in the PDF is in fact not text but a bitmap. Then even Acrobat Reader will fail when searching for text.
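A rough way to spot those bitmap-only PDFs in a script is to check whether the extractor gave back any real text at all. This is just a heuristic sketch: it assumes the extraction tool returns an empty or near-empty string for scanned pages, and both the function names and the 20-character threshold are arbitrary choices for illustration.

```python
from typing import Dict, List


def likely_image_only(extracted_text: str, min_chars: int = 20) -> bool:
    """Heuristic: call a PDF image-only if extraction yields almost no letters or digits."""
    visible = sum(1 for ch in extracted_text if ch.isalnum())
    return visible < min_chars


def unsearchable_pdfs(texts: Dict[str, str], min_chars: int = 20) -> List[str]:
    """Given {pdf name: extracted text}, list the PDFs a text search would silently miss."""
    return sorted(
        name for name, text in texts.items() if likely_image_only(text, min_chars)
    )
```

Files flagged this way would need OCR before any word search can find anything in them.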
Macs content-index PDFs automatically, so they're searchable in Spotlight.
I was under the impression that any of the desktop search programs for Windows did the same thing (MSN/Google Desktop Search for XP, or the built-in search in Vista). Do they not?
killmoms - Well, don't really.
Makin' trailers er'ry day. |
Chocorific |
I think not. At least the normal search in Win2k treats any file other than a standard text file as binary data.
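To illustrate why a plain byte-level search tends to come up empty on PDFs: the page content inside a PDF is usually stored compressed (Flate, i.e. zlib, is the common case). The toy sketch below is not a real PDF parser. It just compresses a sample string with `zlib` and runs the same naive search over the plain, compressed, and decompressed bytes. `raw_search` and the sample text are made up for the demo.

```python
import zlib


def raw_search(data: bytes, word: str) -> bool:
    """Naive search: look for the word's bytes directly in the data."""
    return word.encode("ascii") in data


# Stand-in for the text on a page; a real PDF wraps this in drawing operators.
page_text = b"The quick brown fox jumps over the lazy dog. " * 3

# Roughly what ends up on disk when the content stream is Flate-compressed.
compressed_stream = zlib.compress(page_text)

found_in_plain = raw_search(page_text, "brown")
found_in_compressed = raw_search(compressed_stream, "brown")
found_after_inflate = raw_search(zlib.decompress(compressed_stream), "brown")
```

On this sample the word shows up in the plain bytes and again after decompression, but not in the compressed bytes, which is why a search tool needs a PDF-aware filter or a text extraction step first.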