<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Dear Ulysses<br>
<blockquote cite="mid:4B7E7565.8030105@oriens.com.br" type="cite">
<pre wrap="">Actually, I found most of what I want in this message:
<a class="moz-txt-link-freetext" href="http://osdir.com/ml/solr-user.lucene.apache.org/2009-10/msg01091.html">http://osdir.com/ml/solr-user.lucene.apache.org/2009-10/msg01091.html</a>
...
| <analyzer>
| <tokenizer class="solr.StandardTokenizerFactory"/>
| <filter class="solr.StandardFilterFactory"/>
| <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
| generateNumberParts="1" catenateWords="1" catenateNumbers="1"
| catenateAll="0" splitOnCaseChange="1" />
| <filter class="solr.LowerCaseFilterFactory"/>
| <filter class="solr.StopFilterFactory" ignoreCase="true"
| words="stopwords.txt" />
| <filter class="solr.SnowballPorterFilterFactory" language="French"/>
| <filter class="solr.LowerCaseFilterFactory"/>
| <filter class="solr.ISOLatin1AccentFilterFactory"/>
| <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
| </analyzer>
...
but I could not translate this configuration to opencms... for example:
how do I add the <tokenizer class/> and <filter class/> tags in the
opencms/lucene configuration ?
</pre>
</blockquote>
You cannot express these options in OpenCms. OpenCms only supports the
default Lucene analyzers and does not allow you to change their
default configuration. In addition, the Snowball stemmers for
Lucene are supported, and for those there is an additional OpenCms
configuration option to control the language that is
used. That is all you can control in the OpenCms XML configuration.<br>
<br>
Any additional options can only be controlled by implementing your own
analyzer or sub-classing one of the existing ones.<br>
<br>
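To give a rough idea of what such a custom analyzer would have to do
for case and accents, here is a minimal sketch of the folding step in
plain Java. It is comparable in spirit to the LowerCaseFilterFactory and
ISOLatin1AccentFilterFactory entries in the Solr configuration quoted
above, but it uses only java.text.Normalizer from the JDK and no Lucene
or OpenCms API. The class name FoldingSketch and the method fold are
hypothetical names chosen for illustration.<br>
<pre wrap="">
```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical illustration only: not a Lucene Analyzer and not part of OpenCms.
class FoldingSketch {

    // Lower-cases a term and strips its accent marks.
    static String fold(String term) {
        // NFD splits an accented letter into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, which are removed here.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Locale.ROOT keeps the lower-casing independent of the server locale.
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // \u00C9 is an E with acute accent, \u00E8 is an e with grave accent.
        System.out.println(fold("\u00C9l\u00E8ve"));
    }
}
```
</pre>
In a real analyzer this logic would sit in a token filter and would have
to be applied both at indexing time and at query time, so that the folded
terms match on both sides.<br>
<br>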
Kind regards<br>
Christian<br>
<br>
<blockquote cite="mid:4B7E7565.8030105@oriens.com.br" type="cite">
<pre wrap="">
Thanks in advance,
Ulysses Tannure
</pre>
<blockquote type="cite">
<pre wrap="">------------------------------
Message: 7
Date: Fri, 12 Feb 2010 15:49:33 +0100
From: Christian Steinert <a class="moz-txt-link-rfc2396E" href="mailto:christian_steinert@web.de"><christian_steinert@web.de></a>
Subject: Re: [opencms-dev] Lucene configurations
To: The OpenCms mailing list <a class="moz-txt-link-rfc2396E" href="mailto:opencms-dev@opencms.org"><opencms-dev@opencms.org></a>
Message-ID: <a class="moz-txt-link-rfc2396E" href="mailto:4B756A7D.5000605@web.de"><4B756A7D.5000605@web.de></a>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Ulysses Tannure wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Hello,
I would like to know how to set Lucene to ignore upper/lower case and
accents like ?, ?, ?.
</pre>
</blockquote>
<pre wrap="">You would have to look at which analyzers are available, read their
Lucene documentation, and then choose the search analyzer that is best for
you. You can choose the analyzer that should be used for content of a
certain locale by configuring the Lucene analyzer class that you want to
use in opencms-search.xml.
By the way, I would really be surprised if there is any analyzer that
does not ignore differences in case.
Kind regards
Christian
</pre>
</blockquote>
</blockquote>
<br>
</body>
</html>