[opencms-dev] OpenCMS 5.0.1 and Lucene: searching japanese text

Shi Yusen shiys at langhua.cn
Sat Apr 2 18:58:29 CEST 2005


Hi Uwe,

Try separating your Japanese kanji with space(s), using "kanji1 kanji2" as the query string (not kanji1kanji2). If you get a correct result, then everything is OK. If not, your Lucene index may have been built incorrectly.

The statement
        String newquery = QueryParser.parse(query,"", analyzer).toString();
is one way to add spaces between the kanji. Of course, you can substitute your own method for it.
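
If you prefer your own method, a minimal sketch could look like the one below. It does not use Lucene at all: it simply inserts a space between two adjacent characters from the CJK Unified Ideographs block. The class name KanjiSpacer and the method separateKanji are hypothetical names made up for this illustration, and the sketch assumes that only kanji (not hiragana or katakana) need to be split.

```java
public final class KanjiSpacer {
    private KanjiSpacer() {}

    // True if the code point lies in the CJK Unified Ideographs block (U+4E00..U+9FFF).
    // Assumption: this block is enough for the kanji in your queries.
    static boolean isKanji(int codePoint) {
        return Character.UnicodeBlock.of(codePoint)
                == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS;
    }

    // Hypothetical helper: puts one space between every pair of adjacent kanji,
    // leaving all other characters (and existing spaces) untouched.
    public static String separateKanji(String query) {
        StringBuilder out = new StringBuilder(query.length() * 2);
        int prev = -1; // no previous code point yet
        for (int i = 0; i < query.length(); ) {
            int cp = query.codePointAt(i);
            if (prev != -1 && isKanji(prev) && isKanji(cp)) {
                out.append(' ');
            }
            out.appendCodePoint(cp);
            prev = cp;
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\u65E5\u672C" is the two-kanji word for "Japan".
        System.out.println(separateKanji("\u65E5\u672C")); // prints the two kanji separated by a space
    }
}
```

The result of separateKanji(query) can then be passed to the search in place of the parsed query string.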

http://mail.opencms.org/pipermail/opencms-dev/2005q1/015007.html

Regards,

Shi Yusen/Beijing Langhua Ltd.


-----Original Message-----
From: Uwe König [mailto:uwederkoenig at web.de]
Sent: April 2, 2005 1:34
To: The OpenCms mailing list
Subject: Re: [opencms-dev] OpenCMS 5.0.1 and Lucene: searching japanese text




Hi Shi Yusen, 

I tried the following code-snippet:

	Analyzer analyzer = new StandardAnalyzer();
	String newquery = QueryParser.parse(query,"", 
		analyzer).toString("");

which transforms my two Japanese kanji into their UTF-8 code numbers,
in double quotes. But when I call SearchHelper.doSimpleSearch() with
that parsed query string, again nothing is found.

One more question: do I have to do something to ensure the browser
knows about the form's encoding? The page encoding is UTF-8.

Best regards,

Uwe König 


