[opencms-dev] OCMS5 - Lucene indexing of page 'properties' in opencms???

Claus Priisholm cpr at codedroids.com
Wed Oct 12 09:51:47 CEST 2005


You must update the appropriate indexer classes (i.e. 
PageDocument or BodylessDocument) so that they add the wanted 
properties to the Lucene index files. Only then can you use the 
doc.get() method to retrieve the information - the indexer does not 
automagically map all file properties to fields in the Lucene index.
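
As a rough illustration of why doc.get("region") comes back null, here 
is a minimal sketch in plain Java. It is not the lucene module's 
PageDocument code: a Map stands in for the Lucene document, and the 
names IndexedFieldsSketch and buildDocument are hypothetical. The 
assumption that only "title" is copied by default is taken from the 
behaviour described in this thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a Map plays the role of the Lucene document,
// so this sketch runs without OpenCms or Lucene on the classpath.
final class IndexedFieldsSketch {

    // Copies only the listed properties into the "document".
    // Anything not listed here is simply absent at search time.
    static Map<String, String> buildDocument(Map<String, String> fileProperties,
                                             List<String> propertiesToIndex) {
        Map<String, String> doc = new HashMap<>();
        for (String name : propertiesToIndex) {
            String value = fileProperties.get(name);
            if (value != null) {
                doc.put(name, value);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("title", "Home");
        props.put("region", "north");
        props.put("pagecolour", "blue");

        // Assumed default behaviour: only "title" is copied into the index.
        Map<String, String> before = buildDocument(props, Arrays.asList("title"));
        System.out.println(before.get("region"));   // null: never indexed

        // After extending the indexer to copy "region" as well.
        Map<String, String> after = buildDocument(props, Arrays.asList("title", "region"));
        System.out.println(after.get("region"));    // north
        System.out.println(after.get("Region"));    // null: names are case-sensitive
    }
}
```

In the real indexer class the equivalent step would be adding one 
Lucene field per wanted property when the document is built.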

But you can always read the properties from the VFS by means of the 
"abs_path" - something like

((CmsJspActionElement)cms).property("region", doc.get("abs_path"));

There is an overhead, but it is really not much of a problem (if 
you're doing paging you only need to look up the files actually being 
displayed, making it even more negligible).
I typically run through the entire search result to make sure that the 
user performing the search actually has access to the found documents - 
if not, I remove the document from the result set and adjust the hit 
count accordingly. This does not make the search appear any slower (and 
is, btw., exactly what oc6's built-in search mechanism does).
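
The filtering and paging described above could be sketched roughly as 
follows. This is only an illustration in plain Java: the class 
SearchResultPagingSketch and its methods are hypothetical, the 
Predicate stands in for the real access check, and the Function stands 
in for a VFS property lookup such as cms.property("region", path).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: filter hits by access, then look up properties
// only for the hits on the page being displayed.
final class SearchResultPagingSketch {

    // Drops every hit the current user may not read; the size of the
    // returned list is the adjusted hit count.
    static List<String> filterReadable(List<String> hitPaths, Predicate<String> canRead) {
        List<String> readable = new ArrayList<>();
        for (String path : hitPaths) {
            if (canRead.test(path)) {
                readable.add(path);
            }
        }
        return readable;
    }

    // Returns the hits shown on one page. Pages are 0-based and
    // pageIndex is assumed to be non-negative.
    static List<String> page(List<String> hits, int pageIndex, int pageSize) {
        int from = Math.min(pageIndex * pageSize, hits.size());
        int to = Math.min(from + pageSize, hits.size());
        return new ArrayList<>(hits.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList(
                "/cida/a.html", "/cida/secret/b.html", "/cida/c.html", "/cida/d.html");

        // Stand-in for the permission check against the VFS.
        Predicate<String> canRead = path -> !path.contains("/secret/");
        // Stand-in for reading the "region" property from the VFS.
        Function<String, String> readRegion = path -> "north";

        List<String> readable = filterReadable(hits, canRead);
        System.out.println(readable.size()); // 3

        // Only the two hits on the first page trigger a property lookup.
        for (String path : page(readable, 0, 2)) {
            System.out.println(path + " region=" + readRegion.apply(path));
        }
    }
}
```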

PS. I think you can do doc.get("keywords") out of the box, i.e. use 
lower case. Same thing for description.

Erik Lyons wrote:
> Was this ever resolved or did it ultimately become a "feature"? 
> 
> 
> >>> jim at futurate.com 3/22/2005 2:46 AM >>>
> 
> I have seen, and use, the simple-search.jsp which is included with the
> lucene module.
> 
>  
> 
> The only property I can get to display is 'title'!
> 
>  
> 
> I have added more properties to be displayed on the search results page
> (if only for testing purposes):
> 
> Pagecolour, pagetype and region are properties that I have created 
> 
>  
> 
>             String descrip = doc.get("Description");
> 
>             String Keywords = doc.get("Keywords");
> 
>             String pagecolour= doc.get("pagecolour");
> 
>             String pagetype= doc.get("pagetype");
> 
>             String globalnav = doc.get("region");
> 
> ALL SHOW NULL!
> 
>  
> 
> The only things that work are:
> 
> hits.score(i);
> 
> cms.link(doc.get("abs_path"));
> 
> doc.get("title");
> 
>  
> 
>  
> 
> My plan is to create a 'detailed search' page which allows users to
> choose from a combination of two results filters (drop-down boxes):
> 
>  
> 
> 1.    filter on media type (e.g. .doc, .pdf, .txt, .html)
> 
> 2.    filter on page property (e.g. 'pagecolour', 'region')
> 
>  
> 
> The detailed search would search the index created from the root folder
> of the site.
> 
>  
> 
>  
> 
>  
> 
> ANY help much much appreciated.
> 
> Thank you
> 
>  
> 
>  
> 
> P.S. This is the lucene part of my registry.xml:
> 
>  
> 
> <luceneSearch>
>    <mergeFactor>100000</mergeFactor>
>    <permCheck>true</permCheck>
>    <indexDir>/opt/luceneindex/</indexDir>
>    <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
>    <subsearch>true</subsearch>
>    <project>online</project>
>    <docFactories>
>        <docFactory enabled="true" type="page">
>          <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
>        </docFactory>
>        <docFactory enabled="true" type="plain">
>           <fileType name="plaintext">
>             <extension>.txt</extension>
>            <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
>           </fileType>
>           <fileType name="taggedtext">
>             <extension>.html</extension>
>             <extension>.htm</extension>
>             <extension>.xml</extension>
>            <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
>           </fileType>
>        </docFactory>
>        <docFactory enabled="false" type="jsp">
>          <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
>        </docFactory>
>        <!--   <docFactory enabled="false" type="XML Template"/>   -->
>    </docFactories>
>    <directories>
>        <directory location="/cida/">
>          <section>Cida root</section>
>          <subsearch>true</subsearch>
>        </directory>
>    </directories>
>    <contentDefinitions>
>    </contentDefinitions>
> </luceneSearch>
> 
>  
> 
>  
> 
>  
> 
>  
> 
> -----Original Message-----
> From: opencms-dev-bounces at opencms.org 
> [mailto:opencms-dev-bounces at opencms.org] On Behalf Of Phan Dang Dinh
> Sent: 22 March 2005 01:40
> To: The OpenCms mailing list
> Subject: Re: [opencms-dev] OCMS5 - Lucene indexing of page 'properties' in opencms???
> 
>  
> 
> Please see the sample in the lucene module
> 
>  
> 
> --- James <jim at futurate.com> wrote:
> 
> 
>>How do I pick up the 'description' property of my
>>opencms pages with Lucene?
>>
>>Can I get Lucene to index ALL the properties of
>>every page it indexes????
> 
> _______________________________________________
> This mail is send to you from the opencms-dev mailing list
> To change your list options, or to unsubscribe from the list, please visit
> http://mail.opencms.org/mailman/listinfo/opencms-dev
> 

-- 
Claus Priisholm, CodeDroids ApS
Phone: +45 48 22 46 46
cpr (you know what) codedroids.com - http://www.codedroids.com
cpr (you know what) interlet.dk - http://www.interlet.dk
--
Javadocs and other OpenCms stuff: 
http://www.codedroids.com/community/opencms


