[opencms-dev] Lucene 1.4, Spanish Analyzer

Ernesto De Santis ernesto.desantis at colaborativa.net
Thu Mar 4 15:16:00 CET 2004


> AFAIK, the Spanish analyzer is in the SnowballAnalyzers package at
> jakarta.apache.com/lucene.
> I've never used these, since my content is in English. Please tell me if
> you have any trouble using them with the Lucene module.

Hi Matt, I remember your mail from last year.
My SpanishAnalyzer, built with SnowballAnalyzers (SpanishStemmer), works very
well.

I have some comments:

Words ending in "ado" are indexed with that suffix stripped. The same happens
with "ar".
For example, "teclado" is indexed as "tecl", but "tecl" does not exist in
Spanish. "teclear" is also indexed as "tecl". So if the user enters
"tecl", the search finds that document. That is not good.
In any case, it should index "tecla" instead.

----- spanish to english ----
teclado = keyboard
tecla = key
teclear = to key in
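
To make the collision above concrete, here is a minimal sketch in Java. It is
NOT the Snowball SpanishStemmer: ToySpanishStemmer and its suffix list are
hypothetical, chosen only to mimic the behaviour I described for these words.

```java
import java.util.Locale;

// Toy suffix stripper for illustration only. It does not implement the
// Snowball Spanish algorithm; the suffix list below is an assumption.
class ToySpanishStemmer {
    // Longer suffixes come first so that "ear" is tried before "ar".
    private static final String[] SUFFIXES = {"ado", "ear", "ar"};

    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            // Strip the suffix only if at least three characters remain.
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // All three inputs end up as the same index term, "tecl".
        for (String w : new String[] {"teclado", "teclear", "tecl"}) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

Since the query goes through the same analyzer as the documents, a search for
"tecl" lands on the same term as "teclado" and "teclear".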

Another case, more complicated:

----- spanish to english ----
Comer: to eat
Como: both "how to" and "I eat"
Come: it eats

If the user writes "Mi nena no come" (My baby does not eat),
all the "Como ..." (how to ...) documents are found.
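
A rough sketch of why that query matches, assuming the stemmer reduces both
"como" and "come" to "com". StemIndexDemo, its toy stem() rule and the sample
documents are hypothetical; they stand in for the real analyzer and index only
to show the mechanism.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index keyed by stem. Illustration only, not Lucene.
class StemIndexDemo {
    // Hypothetical rule: strip "er", "o" or "e" if three characters remain.
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : new String[] {"er", "o", "e"}) {
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Maps each stem to the ids of the documents that contain it.
    static Map<String, Set<Integer>> index(String[] docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.length; id++) {
            for (String token : docs[id].split("\\s+")) {
                postings.computeIfAbsent(stem(token), k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // The query term is stemmed the same way as the indexed text.
    static Set<Integer> search(Map<String, Set<Integer>> postings, String term) {
        return postings.getOrDefault(stem(term), Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        String[] docs = {"Como instalar Lucene", "Mi nena no come"};
        Map<String, Set<Integer>> postings = index(docs);
        // "come" is stemmed to "com", which is also the stem of "Como".
        System.out.println(search(postings, "come")); // prints [0, 1]
    }
}
```

Document 0 has nothing to do with eating, but it comes back because the index
only sees the shared stem.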

Ernesto.



