<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Dear Ulysses<br>

<blockquote cite="mid:4B7E7565.8030105@oriens.com.br" type="cite">

  <pre wrap="">Actually, I found most of what I want in this message:

<a class="moz-txt-link-freetext" href="http://osdir.com/ml/solr-user.lucene.apache.org/2009-10/msg01091.html">http://osdir.com/ml/solr-user.lucene.apache.org/2009-10/msg01091.html</a>

...

| <analyzer>

| <tokenizer class="solr.StandardTokenizerFactory"/>

| <filter class="solr.StandardFilterFactory"/>

| <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"

| generateNumberParts="1" catenateWords="1" catenateNumbers="1"

| catenateAll="0" splitOnCaseChange="1" />

| <filter class="solr.LowerCaseFilterFactory"/>

| <filter class="solr.StopFilterFactory" ignoreCase="true"

| words="stopwords.txt" />

| <filter class="solr.SnowballPorterFilterFactory" language="French"/>

| <filter class="solr.LowerCaseFilterFactory"/>

| <filter class="solr.ISOLatin1AccentFilterFactory"/>

| <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>

| </analyzer>

...

But I could not translate this configuration to OpenCms. For example,

how do I add the <tokenizer class/> and <filter class/> tags to the

OpenCms/Lucene configuration?

  </pre>

</blockquote>

You cannot express these options in OpenCms. OpenCms only supports the

default Lucene analyzers and does not allow you to change their

default configuration. In addition, the Snowball stemmers for

Lucene are supported, and for those there is an additional OpenCms

configuration option to control the language to be used.

That is all you can control in the OpenCms XML configuration.<br>

<br>

Any additional options can only be controlled by implementing your own

analyzer or sub-classing one of the existing ones.<br>
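<br>

To illustrate what such a custom analyzer would have to do for the case and accent question, here is a minimal sketch in plain Java. It is not Lucene or OpenCms code: the class name FoldingSketch and the method fold are made up for illustration, and it only shows the per-token normalisation (lower-casing and stripping accents) that a filter inside your own analyzer could apply.<br>

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical illustration only: not part of Lucene or OpenCms.
public class FoldingSketch {

    // Lower-cases the input and removes accents, so that accented and
    // unaccented spellings of a word end up as the same string.
    public static String fold(String text) {
        // NFD splits an accented letter into base letter + combining mark.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches the combining marks produced by the decomposition.
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        // Locale.ROOT avoids locale-specific case rules (e.g. Turkish i).
        return withoutMarks.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of its encoding.
        System.out.println(fold("\u00C9dition Fran\u00E7aise"));
    }
}
```

The same folding would have to be applied both when indexing and when parsing the query, otherwise the terms will not match. How this is wired into an analyzer subclass depends on the Lucene version bundled with your OpenCms release.<br>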

<br>

Kind regards<br>

Christian<br>

<br>

<blockquote cite="mid:4B7E7565.8030105@oriens.com.br" type="cite">

  <pre wrap="">

Thanks in advance,

Ulysses Tannure

  </pre>

  <blockquote type="cite">

    <pre wrap="">------------------------------

Message: 7

Date: Fri, 12 Feb 2010 15:49:33 +0100

From: Christian Steinert <a class="moz-txt-link-rfc2396E" href="mailto:christian_steinert@web.de"><christian_steinert@web.de></a>

Subject: Re: [opencms-dev] Lucene configurations

To: The OpenCms mailing list <a class="moz-txt-link-rfc2396E" href="mailto:opencms-dev@opencms.org"><opencms-dev@opencms.org></a>

Message-ID: <a class="moz-txt-link-rfc2396E" href="mailto:4B756A7D.5000605@web.de"><4B756A7D.5000605@web.de></a>

Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Ulysses Tannure wrote:

    </pre>

    <blockquote type="cite">

      <pre wrap="">Hello,

I would like to know how to set Lucene to ignore upper/lower case and

accents like ?, ?, ?.

      </pre>

    </blockquote>

    <pre wrap="">You would have to look at which analyzers are available, read their

Lucene documentation, and then choose the search analyzer that is best for

you. You can choose the analyzer that should be used for content of a

certain locale by configuring the Lucene analyzer class that you want to

use in opencms-search.xml.

By the way, I would really be surprised if there is any analyzer that 

does not ignore differences in case.

Kind regards

Christian

    </pre>

  </blockquote>


</blockquote>

<br>

</body>

</html>