[opencms-dev] Lucene configurations

Christian Steinert christian_steinert at web.de
Wed Feb 17 23:13:36 CET 2010


Dear Ulysses
> actually I found most of what I want in this message: 
> http://osdir.com/ml/solr-user.lucene.apache.org/2009-10/msg01091.html
> ...
> | <analyzer>
> | <tokenizer class="solr.StandardTokenizerFactory"/>
> | <filter class="solr.StandardFilterFactory"/>
> | <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> |         generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> |         catenateAll="0" splitOnCaseChange="1" />
> | <filter class="solr.LowerCaseFilterFactory"/>
> | <filter class="solr.StopFilterFactory" ignoreCase="true"
> | words="stopwords.txt" />
> | <filter class="solr.SnowballPorterFilterFactory" language="French"/>
> | <filter class="solr.LowerCaseFilterFactory"/>
> | <filter class="solr.ISOLatin1AccentFilterFactory"/>
> | <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> | </analyzer>
> ...
>
> but I could not translate this configuration to opencms... for example: 
> how do I add the <tokenizer class/> and <filter class/> tags in the 
> opencms/lucene configuration ?
>   
You cannot express these options in OpenCms. OpenCms only supports the 
default analyzers of Lucene and does not let you change their 
default configuration. In addition, the Snowball stemmers for 
Lucene are supported, and for those there is an additional OpenCms 
configuration option to control the language that is supposed to be used. 
That is all you can control in the OpenCms XML configuration.

Any additional options can only be controlled by implementing your own 
analyzer or sub-classing one of the existing ones.
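
To give a rough idea of what such a custom analyzer would have to do to 
each token in order to ignore case and accents, here is a minimal sketch. 
It uses only plain JDK classes (java.text.Normalizer), not the Lucene or 
OpenCms API, and the class and method names (TokenFolding, fold) are made 
up for this example. Wiring this logic into a Lucene token filter and 
registering the resulting analyzer in OpenCms is left out, since the 
details depend on the Lucene version in use.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Lucene or OpenCms.
public class TokenFolding {

    // Returns the token in lower case with accent marks removed.
    public static String fold(String token) {
        // NFD decomposition splits an accented letter into its base
        // letter followed by one or more combining marks.
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        // Drop all combining marks (Unicode category M).
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Locale.ROOT avoids locale-specific case rules.
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of its encoding.
        System.out.println(fold("\u00C9l\u00E8ve")); // prints eleve
        System.out.println(fold("GAR\u00C7ON"));     // prints garcon
    }
}
```

The same folding has to be applied both when indexing and when parsing 
the query, otherwise the folded index terms will not match the search 
terms.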

Kind regards
Christian

> Thanks in advance,
>
> Ulysses Tannure
>   
>> ------------------------------
>>
>> Message: 7
>> Date: Fri, 12 Feb 2010 15:49:33 +0100
>> From: Christian Steinert <christian_steinert at web.de>
>> Subject: Re: [opencms-dev] Lucene configurations
>> To: The OpenCms mailing list <opencms-dev at opencms.org>
>> Message-ID: <4B756A7D.5000605 at web.de>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Ulysses Tannure wrote:
>>   
>>     
>>> Hello,
>>>
>>> I would like to know how to set Lucene to ignore upper/lower case and
>>> accents like ?, ?, ?.
>>>   
>>>     
>>>       
>> You would have to look at which analyzers are available, read their 
>> Lucene documentation and then choose the search analyzer that is best for 
>> you. You can choose the analyzer that should be used for content of a 
>> certain locale by configuring the Lucene analyzer class that you want to 
>> use in opencms-search.xml.
>>
>> By the way, I would really be surprised if there is any analyzer that 
>> does not ignore differences in case.
>>
>>
>> Kind regards
>> Christian
>>
>>   
>>     
