[opencms-dev] a solution of double-byte content search (based on lucene and M. Butcher's modu)
石 羽森
shi_yusen at hotmail.com
Mon Jul 5 16:53:01 CEST 2004
Hi there,
The following is a way to get double-byte content search working. Try it if
you are working on a CJK project.
1. Install net.grcomputing.opencms.search.lucene 1.5, which contains
lucene-1.3-final.jar. Lucene 1.3 fixed StandardTokenizer's handling of CJK
characters (Chinese, Japanese and Korean ideograms).
2. Modify the content branch of CmsNewExplorerFileList.java as follows:
    // Needs java.util.Hashtable, org.apache.lucene.analysis.Analyzer,
    // org.apache.lucene.analysis.standard.StandardAnalyzer,
    // org.apache.lucene.queryParser.QueryParser, org.apache.lucene.search.Hits,
    // org.apache.lucene.document.Document and DateField, plus SearchHelper
    // from the search module.
    CmsSearchFormObject searchForm = (CmsSearchFormObject)
        ((Hashtable) session.getValue("ocms_search.allfilter")).get(currentFilter);
    String query = searchForm.getValue01();
    SearchHelper search = new SearchHelper(cms);

    // Run the raw input through StandardAnalyzer so that CJK text in the
    // query is tokenized by the same analyzer before searching.
    String parsedQuery = query;
    try {
        Analyzer analyzer = new StandardAnalyzer();
        parsedQuery = QueryParser.parse(query, "", analyzer).toString();
    } catch (Exception e) {
        // Parsing failed: fall back to the raw query string instead of
        // hitting a NullPointerException further down.
    }
    Hits hits = search.doSimpleSearch(parsedQuery);
    int i, j = hits.length();
    if (j == 0) {
        content.append("<h2>Your search found no matches. Please try again.</h2>");
    } else {
        float score;
        Document doc;
        String tLastMod;
        if (j == 1) {
            content.append("<h2 class=\"search-matches\">Your search found 1 match.</h2>");
        } else {
            content.append("<h2 class=\"search-matches\">Your search found "
                + Integer.toString(j) + " matches.</h2>");
        }
        // For each hit, get the Document and print out some information
        // (including a link) about each item that matches.
        for (i = 0; i < j; ++i) {
            score = hits.score(i);
            doc = hits.doc(i);
            String lms = doc.get("last_modified");
            if (lms != null && !"".equals(lms)) {
                tLastMod = DateField.stringToDate(lms).toString();
            } else {
                tLastMod = "unknown";
            }
            content.append("<p class=\"search-hit\"><b class=\"search-hit-title\">"
                + "<a href=\"" + cms.link(doc.get("abs_path")) + "\" class=\"search-hit-link\">"
                + doc.get("title") + "</a></b><br><i class=\"search-hit-score\">");
            //out.print(score); // Score is between 0.0 and 1.0
            content.append("</i> " + doc.get("description")
                + " <br><span class=\"smalltext\">(Last modified: " + tLastMod + ")</span></p>");
        }
    }
3. Set searchbylucene in registry.xml to on.
4. Compile and update opencms.jar.
5. Restart Tomcat.
Now you can search double-byte phrases.
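For anyone wondering why the analyzer matters in step 2, here is a rough, standalone sketch of the idea. It is not Lucene's code and does not use Lucene at all. It assumes, as far as I understand StandardTokenizer's behaviour after the 1.3 fix, that each CJK character becomes its own token while Latin letters and digits are still grouped into lowercased words. The class name CjkUnigramSketch and the choice of scripts in isCjk are my own illustration, and stop-word removal is not modelled.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CjkUnigramSketch {

    // Scripts treated as CJK in this sketch (an assumption, not Lucene's table).
    public static boolean isCjk(int codePoint) {
        UnicodeScript script = UnicodeScript.of(codePoint);
        return script == UnicodeScript.HAN
            || script == UnicodeScript.HIRAGANA
            || script == UnicodeScript.KATAKANA
            || script == UnicodeScript.HANGUL;
    }

    // Splits text into tokens: one token per CJK character,
    // one lowercased token per run of other letters and digits.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isCjk(cp)) {
                flush(word, tokens);                      // end any pending word
                tokens.add(new String(Character.toChars(cp)));
            } else if (Character.isLetterOrDigit(cp)) {
                word.appendCodePoint(cp);                 // keep building the word
            } else {
                flush(word, tokens);                      // whitespace or punctuation
            }
        }
        flush(word, tokens);
        return tokens;
    }

    // Emits the pending non-CJK word, if any, in lowercase.
    private static void flush(StringBuilder word, List<String> tokens) {
        if (word.length() > 0) {
            tokens.add(word.toString().toLowerCase(Locale.ROOT));
            word.setLength(0);
        }
    }

    public static void main(String[] args) {
        // prints [opencms, 全, 文, 搜, 索]
        System.out.println(tokenize("OpenCms 全文搜索"));
    }
}
```

If the index and the query are both split this way, a query such as 搜索 turns into the adjacent tokens 搜 and 索, which is why a double-byte phrase can match without spaces between the words.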
Shi Yusen
shiys at langhua.cn