[opencms-dev] OpenCMS 5.0.1 and Lucene: searching japanese text
Shi Yusen
shiys at langhua.cn
Sat Apr 2 18:58:29 CEST 2005
Hi Uwe,
Try to separate your Japanese kanji with space(s), like this: "kanji1 kanji2" as the query string (not kanji1kanji2). If you get a correct result, then everything is OK. If not, your Lucene index may have been built incorrectly.
The statement
String newquery = QueryParser.parse(query,"", analyzer).toString();
is one way to add spaces between the kanji. Of course you can substitute your own method for it.
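If you prefer your own method over the QueryParser call, here is a minimal sketch of one way to do it. It is not taken from OpenCms or Lucene: the class name CjkQuerySpacer and its methods are made up for illustration, and it assumes Java 7 or later for Character.UnicodeScript. It simply puts a space around every Han, hiragana, or katakana character in the query string.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical helper, not part of OpenCms or Lucene.
final class CjkQuerySpacer {
    private CjkQuerySpacer() {}

    /** Returns true for Han ideographs, hiragana and katakana. */
    static boolean isCjk(int codePoint) {
        UnicodeScript script = UnicodeScript.of(codePoint);
        return script == UnicodeScript.HAN
            || script == UnicodeScript.HIRAGANA
            || script == UnicodeScript.KATAKANA;
    }

    /** Puts a space around every CJK character, then collapses runs of ASCII whitespace. */
    static String separate(String query) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < query.length(); ) {
            // Iterate by code point so supplementary characters stay intact.
            int cp = query.codePointAt(i);
            i += Character.charCount(cp);
            if (isCjk(cp)) {
                out.append(' ').appendCodePoint(cp).append(' ');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString().trim().replaceAll("\\s+", " ");
    }
}
```

With this, CjkQuerySpacer.separate("kanji1kanji2") would give "kanji1 kanji2" for two real kanji, and the result could then be passed on as the query string. Whether it matches anything still depends on the index having been built with an analyzer that tokenizes the kanji the same way.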
http://mail.opencms.org/pipermail/opencms-dev/2005q1/015007.html
Regards,
Shi Yusen/Beijing Langhua Ltd.
-----Original Message-----
From: Uwe König [mailto:uwederkoenig at web.de]
Sent: 2005-04-02 1:34
To: The OpenCms mailing list
Subject: Re: [opencms-dev] OpenCMS 5.0.1 and Lucene: searching japanese text
Hi Shi Yusen,
I tried the following code-snippet:
Analyzer analyzer = new StandardAnalyzer();
String newquery = QueryParser.parse(query, "", analyzer).toString("");
which transforms my two Japanese kanji into their UTF-8 code numbers,
double quoted. But when I use SearchHelper.doSimpleSearch() with
that parsed query string, again nothing is found.
One more question: do I have to do something to ensure the browser
knows about the form's encoding? The page encoding is UTF-8.
Best regards,
Uwe König