I would appreciate hearing your recommendations/experiences with site searches that include PDF files.
I have a small site that includes PDFs, and I would like these to be included in the search. I am thinking of adding the Google Site Search module, but the link in the documentation goes to Google Enterprise Search. Can the custom search in Google Webmaster Tools be used instead? It would be hard to justify the cost of Google Enterprise Search. However, the Google forums contain many posts about the difficulties of searching PDF files!
As Google are now charging $100 a year for the enterprise version of their search engine, I have been attempting to set up the free version to find files (PDF, DOCX, PPTX, etc.). (Unfortunately, it does not appear to supply a cse_key that would let me use your googlesitesearch module.)
Google finds no errors in the sitemap and now appears to have indexed the site's pages. It still doesn't find any of the linked files, though! The links are in a members-only section, but I have seen a forum post asking how to prevent such files from appearing in the search results, so that should not be the problem. Google also reports no blocked URLs.
Any ideas about what newbie mistake I must be making?
I believe I have isolated the problem: Google search is not searching the pages that contain the links to the documents. It does not even find the non-link text on those pages. In the sitemap, the three pages have a 50% priority. I have increased their importance in the CMS, but this has yet to show in the sitemap. I hope that giving Google time to reindex the site will solve the problem.
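In case it helps anyone checking the same thing, here is a minimal Python sketch for listing the priority of each URL in a sitemap, so you can see whether a change made in the CMS has reached the file yet. The sample sitemap, the example.org URLs, and the function name list_priorities are all made up for illustration; only the sitemap namespace and the default priority of 0.5 come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap for illustration; a real one would be read from the site.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.org/</loc>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.org/members/documents</loc>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://example.org/members/minutes</loc>
  </url>
</urlset>"""


def list_priorities(sitemap_xml):
    """Return a list of (url, priority) pairs from a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    results = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        # The protocol treats a missing <priority> element as 0.5.
        priority = float(url.findtext("sm:priority", default="0.5", namespaces=SITEMAP_NS))
        results.append((loc, priority))
    return results


if __name__ == "__main__":
    for loc, priority in list_priorities(SAMPLE_SITEMAP):
        print(f"{priority:.1f}  {loc}")
```

As far as I understand the protocol, the priority value is only a hint about how pages rank relative to each other within the same site, so raising it may not by itself get a page indexed.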