RE: [opencms-dev] Lucene-search: stop words aren't displayed in search result list

Christian Steinert christian_steinert at web.de
Tue May 30 16:40:53 CEST 2006


> 
> I'm adopting this approach at the moment.  I've written a Lucene Document
> factory, implementing I_CmsDocumentFactory, which indexes all the usual
> OpenCms Fields then adds a few of its own for good measure.  The ones used
> for specialised searching are indexed with Field.Store.NO and
> Field.Index.TOKENIZED/Field.Index.UNTOKENIZED; and the others which save me
> having to hit the VFS unnecessarily are indexed with Field.Store.YES,
> Field.Index.NO.
> 

Dear Jonathan

This sounds interesting. If OpenCms really tries to display the information as it was indexed, then this would certainly be strange.

Jason: could you post your patched code? Where does OpenCms use an analyzer where it shouldn't: at index time or while retrieving result information?

=====

Jonathan: Does that mean you want to store each page in Lucene in two flavors when indexing it: once in its original form and once in an analyzer-filtered form? That sounds like a good idea. I just did not know that Lucene can store non-tokenized information alongside the tokenized information that actually gets searched. Since this seems to be possible, it would of course be easier, and probably a little faster, to store the pages redundantly in Lucene rather than read them from the VFS for this purpose.

But wouldn't the retrieval code also have to know about this, so that it takes the result from the right (unfiltered) fields before extracting the excerpts?
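
To make the question concrete, here is a minimal sketch of the idea in plain Java. It is only a model, not Lucene or OpenCms code: the class name, the stop-word list and the tokenizer are all made up for illustration. The "postings" map plays the role of a field indexed with Field.Store.NO and Field.Index.TOKENIZED, and the "stored" list plays the role of a field indexed with Field.Store.YES and Field.Index.NO, as in Jonathan's description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of "two flavors per page"; not Lucene or OpenCms code. */
class TwoFlavorIndexSketch {

    // Hypothetical stop-word list, standing in for whatever the analyzer drops.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "is", "of", "in"));

    // Searchable flavor: term -> ids of documents containing it (analyzed, not stored).
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // Display flavor: original text per document id (stored, not searched).
    private final List<String> stored = new ArrayList<>();

    /** Lower-cases, splits on non-alphanumerics (ASCII only), drops stop words. */
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    /** Indexes one page in both flavors and returns its document id. */
    int add(String originalText) {
        int id = stored.size();
        stored.add(originalText);
        for (String term : analyze(originalText)) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(id);
        }
        return id;
    }

    /** Looks a term up in the analyzed flavor, but returns the unfiltered text. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        List<String> analyzed = analyze(term);
        if (analyzed.isEmpty()) {
            return hits; // the query consisted only of stop words
        }
        Set<Integer> ids = postings.getOrDefault(analyzed.get(0), new HashSet<>());
        for (int id = 0; id < stored.size(); id++) {
            if (ids.contains(id)) {
                hits.add(stored.get(id));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TwoFlavorIndexSketch index = new TwoFlavorIndexSketch();
        index.add("The house is in the middle of a street");
        for (String hit : index.search("House")) {
            System.out.println(hit);
        }
    }
}
```

In this model the search method is the only place that has to know which flavor is which: it matches against the analyzed terms but hands back the stored original, so an excerpt built from the hit would still contain the stop words.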

C.