[opencms-dev] Lucene and Binary Documents
M Butcher
mbutcher at grcomputing.net
Fri Oct 17 20:39:01 CEST 2003
Ben,
On Thu, 2003-10-16 at 22:46, Ben Rometsch wrote:
> Hi Matt,
>
> Thanks for the reply. If I just want to get the document title to be
> included in the Lucene index, looking at the code in the
> net.grcomputing.opencms.search.BodylessDocument class it appears to ignore
> what the CMSObject is, and attempt to index it regardless. Is this correct?
>
Correct. It will already index the title, but it will not attempt to
index the body.
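
To illustrate the behaviour described above, here is a minimal sketch in plain Java. It is not the actual BodylessDocument source: the class name BodylessFieldsSketch, the method titleOnlyFields, and the field names "path" and "title" are all hypothetical, and a plain Map stands in for a Lucene document. The point is only that the resource's bytes are never read, so the resource type does not matter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the real BodylessDocument class.
class BodylessFieldsSketch {

    // Builds the set of fields to index for a resource.
    // The content bytes are accepted but deliberately never inspected,
    // so binary and text resources are treated identically.
    static Map<String, String> titleOnlyFields(String path, String title, byte[] content) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("path", path);
        // A missing title becomes an empty string rather than null.
        fields.put("title", title == null ? "" : title);
        return fields;
    }
}
```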
> If this is the case, is it simply a matter of instructing Lucene to index
> objects other than HTML files in the VFS (i.e. Documents) ? Or would I have
> to create another class, something like
> net.grcomputing.opencms.search.FileDocument and add a new hook into that
> class via the registry.xml fragment? Or does the BodylessDocument provide
> this functionality, and it's just a matter of adding a new XML fragment to
> registry.xml?
Again, you are right -- simply adding the appropriate configuration to
the registry.xml file will suffice. I believe that you will just need to
extend the plainDocument tag set to include the additional extensions and
processors. I _think_ that binary files get handled by the plain handler.
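
As a rough picture of what such a configuration amounts to, the sketch below maps file extensions to a handler name, which is roughly what adding extensions to a tag set would do. Everything here is an assumption for illustration: the class ExtensionRegistrySketch, its methods, and the handler label "plain" are hypothetical and are not taken from the search module or from the registry.xml schema.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of an extension-to-handler lookup;
// not the module's actual configuration code.
class ExtensionRegistrySketch {

    private final Map<String, String> handlerByExtension = new HashMap<>();

    // Associates each extension (case-insensitive, without the dot) with a handler name.
    public void register(String handlerName, String... extensions) {
        for (String ext : extensions) {
            handlerByExtension.put(ext.toLowerCase(Locale.ROOT), handlerName);
        }
    }

    // Returns the handler name for a VFS path, or null when the file name
    // has no extension or the extension has not been registered.
    public String handlerFor(String path) {
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot < slash || dot == -1) {
            return null; // no dot in the file name itself
        }
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return handlerByExtension.get(ext);
    }
}
```

With this sketch, registering "pdf" and "doc" under the "plain" label makes handlerFor("/docs/Report.PDF") return "plain", while a path with an unregistered extension returns null and would simply be skipped.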
Matt
--
M Butcher <mbutcher at grcomputing.net>