[opencms-dev] Lucene 1.4, Spanish Analyzer

Hartmann, Waehrisch & Feykes GmbH hartmann at waehrisch-feykes.de
Thu Mar 4 15:54:01 CET 2004


Hi Ernesto,

It seems that your analyzer reduces the words to their stems, i.e. it removes
the inflections and declensions. This is just what a stemmer should do.
Can you verify this for verbs ending in -ir and -er and their inflections, too?
To get the right results when searching, you should also use your analyzer
to parse the search query (SearchHelper.doSimpleSearch only uses a
StopAnalyzer).
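
To illustrate why the query has to go through the same analysis as the
indexed text, here is a minimal sketch in plain Java. It does not use the
Lucene API: the class StemmedIndexDemo, its methods and the suffix list are
all hypothetical, and stem() is a toy suffix stripper, not the Snowball
Spanish algorithm. It only shows that once "teclado" and "teclear" are stored
as "tecl", an unstemmed query term can no longer match them.

```java
import java.util.*;

class StemmedIndexDemo {
    // Toy suffix stripper for illustration only; NOT the Snowball Spanish stemmer.
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : new String[] {"ado", "ear", "ar", "er", "ir"}) {
            // Keep at least three characters of the word after stripping.
            if (w.length() > suffix.length() + 2 && w.endsWith(suffix)) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Splits text on non-letter characters and stems every token.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                terms.add(stem(token));
            }
        }
        return terms;
    }

    // Builds a tiny inverted index: stemmed term -> ids of documents containing it.
    static Map<String, Set<Integer>> index(List<String> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : analyze(docs.get(id))) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // Looks up a single-word query; stemQuery toggles query-side stemming.
    static Set<Integer> search(Map<String, Set<Integer>> postings,
                               String word, boolean stemQuery) {
        String term = stemQuery ? stem(word) : word.toLowerCase(Locale.ROOT);
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> postings = index(Arrays.asList(
                "Como teclear un texto", "El teclado nuevo"));
        System.out.println(search(postings, "teclado", false)); // prints []
        System.out.println(search(postings, "teclado", true));  // prints [0, 1]
    }
}
```

With query-side stemming switched on, "teclado" finds both documents; without
it, the raw term is looked up in an index that only contains "tecl" and
nothing is returned.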

Regards,
Stephan

----- Original Message ----- 
From: "Ernesto De Santis" <ernesto.desantis at colaborativa.net>
To: "OpenCms List" <opencms-dev at opencms.org>
Sent: Thursday, March 04, 2004 3:16 PM
Subject: [opencms-dev] Lucene 1.4, Spanish Analyzer


> > AFAIK, the Spanish analyzer is in the SnowballAnalyzers package at
> > jakarta.apache.org/lucene.
> > I've never used these, since my content is in English. Please tell me if
> > you have any trouble using them with the Lucene module.
>
> Hi Matt, I remember your mail from last year.
> My SpanishAnalyzer, built with SnowballAnalyzers (SpanishStemmer), works
> very well.
>
> I have some comments:
>
> Words ending in "ado" are indexed with that suffix removed. The same
> happens with "ar".
> For example, "teclado" is indexed as "tecl", but "tecl" does not exist in
> Spanish. "teclear" is also indexed as "tecl". If the user enters
> "tecl", the search finds that document, which is not good.
> In any case, it should be indexed as "tecla".
>
> ----- spanish to english ----
> teclado = keyboard
> tecla = key
> teclear = to key in
>
> Another, more complicated case:
>
> ----- spanish to english ----
> Comer: to eat
> Como: "how to" and also "I eat"
> Come: it eats
>
> If the user writes "Mi nena no come" (My baby does not eat),
> all the "Como ..." (how to ...) documents are found.
>
> Ernesto.
>



