[opencms-dev] proper configuration solr configuration for chinese
Patric Dosch
patric.dosch at virtual-identity.com
Wed Jun 25 12:01:13 CEST 2014
Hey Kai,
I have added Chinese in a similar way. My configuration uses the SmartChineseSentenceTokenizerFactory. So far, the customer has not reported any problems.
<analyzer>
  <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
  <filter class="solr.SmartChineseWordTokenFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.PositionFilterFactory"/>
</analyzer>
In addition, I have a copyField in my schema.xml. Perhaps this helps?
<copyField source="*_zh" dest="text_zh"/>
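For reference, here is a minimal sketch of how these pieces might fit together in schema.xml. It assumes the field type keeps the name text_zh and the positionIncrementGap from the schema quoted below, and that the SmartChinese factories are available on the Solr classpath (as far as I know they ship with the analysis-extras contrib, in the lucene-analyzers-smartcn jar). It is not a verified drop-in configuration.

```xml
<!-- Sketch only: fragments that sit inside the existing <schema> root element. -->
<!-- Assumption: the type name "text_zh" and positionIncrementGap are taken from the quoted schema. -->
<types>
  <fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <!-- Splits the input into sentences, then into words -->
      <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
      <filter class="solr.SmartChineseWordTokenFilterFactory"/>
      <!-- Lowercases any Latin text mixed into the Chinese content -->
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.PositionFilterFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <!-- Catch-all field for Chinese text, filled by the copyField below -->
  <field name="text_zh" type="text_zh" indexed="true" stored="false" multiValued="true"/>
  <dynamicField name="*_zh" type="text_zh" indexed="true" stored="true"/>
</fields>

<!-- Copies every *_zh field into the catch-all field -->
<copyField source="*_zh" dest="text_zh"/>
```

Since the analyzer of a field type is applied at index time as well, the content presumably has to be reindexed after changing it.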
Regards,
Patric
From: opencms-dev-bounces at opencms.org [mailto:opencms-dev-bounces at opencms.org] On Behalf Of Schliemann, Kai
Sent: Friday, 20 June 2014 18:59
To: 'The OpenCms mailing list (opencms-dev at opencms.org)'
Subject: [opencms-dev] proper configuration solr configuration for chinese
Hi list,
Could somebody give me a hint on a proper configuration of the Solr search for Chinese, or check whether our configuration is correct, please?
I defined the following in \WEB-INF\solr\conf\schema.xml:
...
<types>
  ...
  <!-- found this on the net, but not sure if it is the right tokenizer and if I need some filters -->
  <fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.CJKTokenizerFactory"/>
    </analyzer>
  </fieldType>
  ...
</types>
...
<fields>
  ...
  <!-- just copied "text_de" fields and replaced "de" by "zh" -->
  <field name="text_zh" type="text_zh" indexed="true" stored="false" multiValued="true"/><!-- Catchall for Chinese text fields -->
  ...
  <!-- just copied "text_de" fields and replaced "de" by "zh" -->
  <dynamicField name="*_zh" type="text_zh" indexed="true" stored="true"/>
</fields>
...
I get search results, but some search phrases return nothing even though the word is in the document (checked with Luke).
Thanks a lot in advance.
Best regards
Kai