<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:st1="urn:schemas-microsoft-com:office:smarttags" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=gb2312">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"
name="PersonName"/>
<!--[if !mso]>
<style>
st1\:*{behavior:url(#default#ieooui) }
</style>
<![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
{font-family:宋体;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"\@宋体";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
{margin:0cm;
margin-bottom:.0001pt;
font-size:9.0pt;
font-family:宋体;}
span.EmailStyle17
{mso-style-type:personal;
font-family:Arial;
color:navy;}
@page Section1
{size:595.3pt 841.9pt;
margin:72.0pt 126.65pt 72.0pt 126.65pt;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=ZH-CN link=blue vlink=purple>
<div class=Section1>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>Hi Jon,<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>I think Lucene can handle your requirement. Please see
http://lucene.apache.org/java/docs/queryparsersyntax.html for the query syntax.<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>For 'all articles on --this-- subject within --this-- date range containing
these words', the Lucene search syntax could be: </span></font><font
face="Courier New"><span lang=EN-US style='font-family:"Courier New"'>content:"these words"
AND subject:"--this--" AND date:[20010101 TO 20020101]</span></font><span
lang=EN-US>.<o:p></o:p></span></p>
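<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>As a rough sketch of how such a query string could be assembled from typed
values, the Java snippet below builds the same expression from a phrase, a
subject and two dates. The class name MetadataQuery and its methods are
hypothetical and not part of OpenCms or Lucene. It also assumes the date field
is indexed as a yyyyMMdd string, so that string order matches date order in the
range clause. The field names content, subject and date are the ones used in the
query above.<o:p></o:p></span></font></p>
<pre>
```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of OpenCms or Lucene: it only assembles a
// query string in the Lucene query parser syntax.
class MetadataQuery {

    // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd, so string order is the
    // same as date order. This assumes the index stores dates the same way.
    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE;

    // Wrap a value in straight double quotes, escaping any embedded
    // backslashes and double quotes.
    static String phrase(String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Combine the free-text phrase, the subject and an inclusive date range.
    static String build(String words, String subject, LocalDate from, LocalDate to) {
        return "content:" + phrase(words)
                + " AND subject:" + phrase(subject)
                + " AND date:[" + DAY.format(from) + " TO " + DAY.format(to) + "]";
    }

    public static void main(String[] args) {
        System.out.println(build("these words", "this",
                LocalDate.of(2001, 1, 1), LocalDate.of(2002, 1, 1)));
    }
}
```
</pre>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>The resulting string could then be handed to Lucene's QueryParser. How the
parser is constructed depends on the Lucene version in use, so that step is left
out here.<o:p></o:p></span></font></p>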
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>Regards,<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>Shi Yusen/Beijing Langhua Ltd.<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>________________________________________<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span style='font-size:9.0pt'>From<span
lang=EN-US>: Jonathan Woods [mailto:jonathan.woods@scintillance.com] <o:p></o:p></span></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span style='font-size:9.0pt'>Sent<span
lang=EN-US>: January 4, 2006 22:59<o:p></o:p></span></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span style='font-size:9.0pt'>To<span
lang=EN-US>: '<st1:PersonName w:st="on">The OpenCms mailing list</st1:PersonName>'<o:p></o:p></span></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span style='font-size:9.0pt'>Subject<span
lang=EN-US>: [opencms-dev] Best way to implement searching across article
metadata<o:p></o:p></span></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>Dear list...<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=1 face="Courier New"><span lang=EN-US
style='font-size:9.0pt;font-family:"Courier New"'> </span></font><span
lang=EN-US><o:p></o:p></span></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>I need to develop an OpenCms-based site which will allow users to search
for articles with matching metadata.</span></font><font face="Courier New"><span
lang=EN-US style='font-family:"Courier New"'> </span></font><span
lang=EN-US> Some of this metadata is textual and therefore ready for a
Lucene-based search, but much of it is strongly typed - e.g. dates, article
importance, and references to other reference/standing data which it would be a
shame to model merely as text strings.</span><font face="Courier New"><span
lang=EN-US style='font-family:"Courier New"'> </span></font><span
lang=EN-US> By the way, the dates I mention are not simply dates already
meaningful to OpenCms (e.g. article creation date).</span><font
face="Courier New"><span lang=EN-US style='font-family:"Courier New"'> </span></font><span
lang=EN-US> An example search might be</span><font face="Courier New"><span
lang=EN-US style='font-family:"Courier New"'> </span></font><span
lang=EN-US>'show all articles with --this-- subject concerning a date between
--these-- dates'.<o:p></o:p></span></p>
<p class=MsoPlainText><font size=1 face="Courier New"><span lang=EN-US
style='font-size:9.0pt;font-family:"Courier New"'> </span></font><span
lang=EN-US><o:p></o:p></span></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>Could someone tell me roughly the best way to go about modelling
articles to make this kind of searching possible?</span></font><font
face="Courier New"><span lang=EN-US style='font-family:"Courier New"'> </span></font><span
lang=EN-US> Is this something which the Lucene module can cope with, provided that
articles and their metadata are hooked up to the Lucene index correctly?</span><font
face="Courier New"><span lang=EN-US style='font-family:"Courier New"'> </span></font><span
lang=EN-US> Would I model this metadata using OpenCms template properties
(which I believe are merely strings), or as XML document elements, or (God
forbid!) in a separately maintained database structure?</span><font
face="Courier New"><span lang=EN-US style='font-family:"Courier New"'> </span></font><span
lang=EN-US> I'm concerned not only about the most natural way to do this such
that content can be managed within OpenCms, but also about search performance -
I imagine searching through XML documents, for example, would be slow.<o:p></o:p></span></p>
<p class=MsoPlainText><font size=1 face="Courier New"><span lang=EN-US
style='font-size:9.0pt;font-family:"Courier New"'> </span></font><span
lang=EN-US><o:p></o:p></span></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>As if this weren't enough, I'm also wondering how best to mix metadata
searching like this with free-text searching (i.e. 'all articles on --this--
subject within --this-- date range containing these words').</span></font><font
face="Courier New"><span lang=EN-US style='font-family:"Courier New"'> </span></font><span
lang=EN-US> If Lucene is the answer above, then I can see how this can be done,
but if not then the picture isn't yet as clear.<o:p></o:p></span></p>
<p class=MsoPlainText><font size=1 face="Courier New"><span lang=EN-US
style='font-size:9.0pt;font-family:"Courier New"'> </span></font><span
lang=EN-US><o:p></o:p></span></p>
<p class=MsoPlainText><font size=1 face=宋体><span lang=EN-US style='font-size:
9.0pt'>Jon<o:p></o:p></span></font></p>
</div>
</body>
</html>