Lucene search (was: Re: [opencms-dev] "Managing and Customizing OpenCms 6" book will be released soon)

Jonathan Woods jonathan.woods at scintillance.com
Wed May 31 18:18:30 CEST 2006


Alessandro -

I remember there was a question on the list a while back (3 or 4 months?)
about indexing JSPs, and I replied rather naively, not having then dived
into Lucene + OpenCms...

I'm integrating full text search results with various attributes, including
priority and multi-valued hierarchical attributes (including 'topic', a bit
like category on steroids, and organisational unit).  The 'full text' comes
from XML contents, because that's how nearly all the site is made up.

For the JSP angle, assuming the JSPs are doing something which can't be
replicated with XML contents-plus-JSP template, I guess I'd fire off an HTTP
request at my server (from itself) and index the results using standard
Lucene HTML analysis.  When it comes to processing search results, there are
various ways to filter out Hits on documents which the then-current user
isn't allowed to view - at the moment I'm using

	CmsResource cmsResource =
		cmsObject.readResource(resourceRelativePath, CmsResourceFilter.DEFAULT);

and checking for null results, discarding Hits associated with these
resources.  resourceRelativePath is derived from a Field stored in each
Lucene Document in the index and therefore available via every Hit.  The
OpenCms default Field used for this purpose is called
I_CmsDocumentFactory.DOC_PATH.  Filtering can either be carried out
post-search or via Lucene Filters, though the latter require a bit of state
management as VFS contents and index contents change.
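
A minimal sketch of the post-search filtering described above, not the module's actual code. To stay self-contained it uses no OpenCms or Lucene classes: plain maps stand in for the stored fields of each Hit's Document, and a hypothetical canRead predicate stands in for the cmsObject.readResource(...) check. The class name, method name, and the "path" field name are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: maps stand in for Lucene Documents/Hits, and a
// Predicate stands in for the per-user permission check against the VFS.
class HitFilterSketch {

    // Keeps only the hits whose stored path field names a resource the
    // current user may read; hits lacking the field are discarded too.
    static List<Map<String, String>> filterReadable(
            List<Map<String, String>> hits,
            String pathField,
            Predicate<String> canRead) {
        List<Map<String, String>> visible = new ArrayList<>();
        for (Map<String, String> hit : hits) {
            String path = hit.get(pathField);
            if (path != null && canRead.test(path)) {
                visible.add(hit);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        String pathField = "path"; // assumed field name, for illustration only
        List<Map<String, String>> hits = List.of(
                Map.of(pathField, "/public/news.html"),
                Map.of(pathField, "/intranet/salaries.html"));
        // Stand-in permission rule: only resources under /public/ are readable.
        List<Map<String, String>> visible =
                filterReadable(hits, pathField, p -> p.startsWith("/public/"));
        System.out.println(visible);
    }
}
```

In a real integration the predicate would presumably wrap the readResource call for the then-current user. Because the check runs once per Hit, its cost grows with the size of the result set, which is one reason the Lucene Filter route might be worth the extra state management.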

If you have any comments on this approach, I'd welcome them - and of course
I'll let you know when things go live.

Jon

-----Original Message-----
From: opencms-dev-bounces at opencms.org
[mailto:opencms-dev-bounces at opencms.org] On Behalf Of Alessandro Magnolo
Sent: 31 May 2006 14:22
To: The OpenCms mailing list
Subject: Lucene search (was: Re: [opencms-dev] "Managing and
Customizing OpenCms 6" book will be released soon)

On 5/30/06, Jonathan Woods <jonathan.woods at scintillance.com> wrote:
> I'm just finishing development
> of a module which integrates more generic Lucene searching with the 
> collector framework, and if this e-mail is of interest to you I'll 
> send you a link to the web site incorporating it  - once it's live in 
> a couple of weeks' time.

Do you mean that you can index JSP pages in OpenCms?
Or that you can integrate full text search results with OpenCms attributes
(like priority and popularity)?

I'd be interested in seeing it in action, when published.

Thank you,
Alessandro Magnolo





