8 March 2011 at 4:26am
Hey, I just found that indexing relations is now possible: http://code.google.com/p/lucene-silverstripe-plugin/source/detail?r=29
I'll definitely give this a try in my next SilverStripe project.
thanks for all the work!
8 March 2011 at 9:02am
Cheers for your interest!
I've just released a new version incorporating some ideas from users. It indexes files a LOT quicker, and can index older Word/Excel documents too. If you're playing with it at the moment, I'd suggest grabbing the 0.3.3 release. =]
22 June 2011 at 12:46pm
Darren - does version 0.3.3 only work with older Microsoft Word/Excel documents (i.e. Office 97)?
I ask because my install of SilverStripe (2.4.4) returns search results from .docx and .xlsx files but not from .doc and .xls files.
22 June 2011 at 1:29pm
You actually need the 'zip' extension for PHP loaded for scanning xlsx/docx/pptx documents to work. To check if you have this extension, visit a phpinfo() page and see if there is a section under 'Configuration' for 'zip'.
If you're on a Debian-based host (many hosting vendors use Debian), the zip extension is usually installed by default. If you're on WAMP/MAMP, you may need to go to 'PHP Extensions' and enable the php_zip option.
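If you'd rather not read through the whole phpinfo() page, the same check can be sketched in a couple of lines of PHP. This is only a minimal example and not part of the module. `extension_loaded()` and `class_exists()` are standard PHP functions; the messages are made up for illustration. Run it through the web server rather than the command line, since the two can load different php.ini files.

```php
<?php
// Check whether the zip extension (needed for docx/xlsx/pptx scanning) is loaded.
$zipLoaded = extension_loaded('zip');

// The zip extension provides the ZipArchive class, so this is a second sanity check.
$hasZipArchive = class_exists('ZipArchive');

echo ($zipLoaded && $hasZipArchive)
    ? "zip extension is available\n"
    : "zip extension is missing\n";
```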
Hope this helps!
22 June 2011 at 1:54pm
Thanks Darren, but as explained, I can actually scan the documents you mentioned (.docx, .xlsx); it just won't scan .doc or .xls documents.
I'm rather new to all of this, so I could certainly be wrong, but it seems to me that the Lucene 0.3.3 module only searches the newer Office formats (.docx and .xlsx) and not the older Office 97 formats (.doc and .xls). I was looking at how hard it would be to modify the code to include .doc and .xls documents, but thought I would ask here first whether they should be supported, as maybe my SilverStripe installation is at fault.
BTW - I do have the PHP Zip module installed (verified by phpinfo), and am running SS 2.4.4 on a Windows server with SQL2008 as the backend.
22 June 2011 at 2:19pm
Apologies, I'm at work atm so my mind is elsewhere...!
You need to install the command-line 'catdoc' utility suite to enable scanning doc/xls/ppt documents. This isn't in the documentation, which is a bit of an oversight!
If you're on Debian/Ubuntu you can do apt-get install catdoc. If you're on another *nix or Mac OS X, you can use your own package management system, or you may need to compile the utilities from source.
If you are using Windows then, sorry, you'll need to do some coding in the ZendSearchLuceneWrapper::index() function. There are some lines like this that won't work on Windows:
$catdoc = trim(shell_exec('which catdoc'));
You'll need to replace them with either hardcoded file paths to wherever the catdoc utilities are installed, or set up some sort of config system yourself.
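As a rough sketch of that replacement (not the module's actual code), something along these lines could stand in for the `which` call. The function name `locate_utility()` and the path `C:\catdoc\catdoc.exe` are hypothetical and only for illustration. It tries a configured path first, then falls back to `where` on Windows or `which` elsewhere.

```php
<?php
// Hypothetical helper: returns the full path to a command-line utility, or null.
function locate_utility($name, $configuredPath = null)
{
    // 1. An explicitly configured path wins, if it points at a real file.
    if ($configuredPath !== null && is_file($configuredPath)) {
        return $configuredPath;
    }

    // 2. Otherwise ask the OS: 'where' on Windows, 'which' elsewhere.
    $isWindows = strtoupper(substr(PHP_OS, 0, 3)) === 'WIN';
    $command = $isWindows
        ? 'where ' . escapeshellarg($name) . ' 2>NUL'
        : 'which ' . escapeshellarg($name) . ' 2>/dev/null';

    // shell_exec() may return null or false when there is no output.
    $output = trim((string) shell_exec($command));
    if ($output === '') {
        return null;
    }

    // 'where' can print several matches, one per line; keep the first.
    $lines = preg_split('/\r?\n/', $output);
    return trim($lines[0]);
}

// Example: the path below is an assumed install location, not a known default.
$catdoc = locate_utility('catdoc', 'C:\\catdoc\\catdoc.exe');
```

If `$catdoc` comes back as null, the utility wasn't found, so the indexing code should skip doc/xls/ppt files rather than try to run an empty command.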
Hopefully I can find time to get this going on Windows soon so I can release the current SVN trunk, where all this is fixed already!
Hope that's a better answer =]
22 June 2011 at 6:51pm
Many thanks for the quick response!
I might have a crack at getting this to work on Windows through some coding, as suggested, if only to become a little better acquainted with SilverStripe.
25 June 2011 at 5:24am
First, thanks for a great module.
I'm testing it on shared hosting, which has a limit of 64 open files. As a result, every time I rebuild the index, or try to index a page after that, I get a "Too many files open" error. Is there a way to reduce the number of open files?