<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.2873" name=GENERATOR></HEAD>
<BODY>
<DIV>
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial size=2><SPAN class=467083106-23052006>I'd be interested in hearing
how anyone else has found it best to deal with the following problem, assuming
I've got things right.</SPAN></FONT></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial size=2><o:p><SPAN
class=467083106-23052006>I've implemented my own (working) Lucene document
factory and configured its so-called document type in
opencms-search.xml. The trouble is that it's getting hidden by other
document types - and I believe this is because in general
</SPAN></o:p></FONT></SPAN><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial><FONT size=2>document factories which declare interest in the same
document keys as other document factories may win or lose, unpredictably, in
getting their association recognised by
OpenCms.<o:p></o:p></FONT></FONT></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial><FONT size=2><SPAN class=467083106-23052006>Wouldn't it be
</SPAN>better if <SPAN class=467083106-23052006>opencms-search.xml were
processed in a way that embodied some kind of </SPAN>shadowing or priority<SPAN
class=467083106-23052006>, so that later entries beat earlier ones? Then
we could have (say) document type 'xmlcontent' dealing with all XML content
types except those which were dealt with by a custom Lucene document factory...
which is what I'm looking for! When I changed the 'xmlcontent'
document type config to handle only xmlcontent resource types, my document
factory still wasn't associated with my custom XML resource types, because the
'generic' document type took over.</SPAN></FONT></FONT></SPAN></P>
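<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial size=2>To make the shadowing idea concrete, here is a minimal sketch
of "later entries beat earlier ones". It is not OpenCms code: the class
ShadowingSketch, the method resolve and the sample type names are all made up
for illustration, and it assumes the document types are held in the order in
which they appear in opencms-search.xml.</FONT></SPAN></P>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not OpenCms code: resolves resource types to document
// types so that a later config entry shadows an earlier one.
public class ShadowingSketch {

    /**
     * @param configsInFileOrder document type name -> resource types it claims,
     *        iterated in configuration-file order (e.g. a LinkedHashMap)
     * @return resource type -> name of the document type that wins
     */
    static Map<String, String> resolve(Map<String, List<String>> configsInFileOrder) {
        Map<String, String> winners = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : configsInFileOrder.entrySet()) {
            for (String resourceType : entry.getValue()) {
                // put() overwrites, so the entry processed last wins.
                winners.put(resourceType, entry.getKey());
            }
        }
        return winners;
    }

    public static void main(String[] args) {
        // LinkedHashMap iterates in insertion order, standing in for file order.
        Map<String, List<String>> configs = new LinkedHashMap<>();
        configs.put("xmlcontent", Arrays.asList("xmlcontent", "article"));
        configs.put("mycustom", Arrays.asList("article"));

        Map<String, String> winners = resolve(configs);
        System.out.println("article    -> " + winners.get("article"));
        System.out.println("xmlcontent -> " + winners.get("xmlcontent"));
    }
}
```

<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial size=2>The only design choice that matters here is that the input is
iterated in a defined order, so the outcome depends on the configuration file
and not on hash ordering.</FONT></SPAN></P>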
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial><FONT size=2><SPAN
class=467083106-23052006></SPAN></FONT></FONT></SPAN><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial><FONT size=2>Analysis:<o:p></o:p></FONT></FONT></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial><FONT size=2><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial><FONT size=2><SPAN class=467083106-23052006>1. </SPAN>The
behaviour of indexing and searching is defined in opencms-search.xml, which
allows developers to create their own (Lucene) document factories for specific
purposes.<SPAN style="mso-spacerun: yes"> </SPAN>opencms-search.xml,
together with the <SPAN class=467083106-23052006>document </SPAN>factory
implementations, helps associate resource types and MIME types with the right
factory implementation.<SPAN style="mso-spacerun: yes"> </SPAN>The key to
this association is a so-called 'document key', which is either MIME-specific
(it contains the resource type and the MIME type) or non-MIME-specific (it
contains only the resource type).</FONT></FONT></SPAN></FONT></FONT></SPAN></P>
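<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial size=2>As a rough illustration of the two kinds of document key, here
is a small sketch. The key layout (resource type, then an optional colon and
MIME type) is my assumption for the example and may not match what OpenCms
actually builds; DocumentKeySketch and documentKey are made-up
names.</FONT></SPAN></P>

```java
// Hypothetical sketch, not OpenCms code: builds the two kinds of lookup key.
public class DocumentKeySketch {

    /**
     * Builds a document key.
     *
     * @param resourceType the resource type name, e.g. "xmlcontent"
     * @param mimeType     the MIME type, or null for a non-MIME-specific key
     */
    static String documentKey(String resourceType, String mimeType) {
        // Assumed layout: "<resourceType>" or "<resourceType>:<mimeType>".
        if (mimeType == null) {
            return resourceType;
        }
        return resourceType + ":" + mimeType;
    }

    public static void main(String[] args) {
        // Non-MIME-specific key: resource type only.
        System.out.println(documentKey("xmlcontent", null));
        // MIME-specific key: resource type plus MIME type.
        System.out.println(documentKey("binary", "application/pdf"));
    }
}
```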
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT><FONT><FONT
face=Arial><FONT size=2><SPAN class=467083106-23052006>2</SPAN>.<SPAN
style="mso-spacerun: yes"> </SPAN>At start-up time, search document types
('document types') are read from opencms-search.xml and put in a HashMap, <SPAN
style="COLOR: black">m_documentTypeConfigs</SPAN>.<o:p></o:p></FONT></FONT></FONT></FONT></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial><FONT size=2><SPAN class=467083106-23052006>3</SPAN>.<SPAN
style="mso-spacerun: yes"> </SPAN>Still at start-up, OpenCms then iterates
through the keys of that document type config map, and for each document type
config it does the following: (i) it instantiates the associated document
factory (class), (ii) it calls 'getDocumentKeys' on that instance, telling it
which combination of resource types and MIME types it may consider (as
configured in opencms-search.xml's resourcetypes and mimetypes nodes); (iii) it
uses HashMap m_documentTypes to map each of the document keys to the document
factory instance. <o:p></o:p></FONT></FONT></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial><FONT size=2><SPAN class=467083106-23052006>4</SPAN>.<SPAN
style="mso-spacerun: yes"> </SPAN>The iteration order through keys of
m_documentTypeConfigs is unpredictable, and therefore so too is the order in
which document factories are asked to declare their interest in various document
keys.<SPAN style="mso-spacerun: yes"> </SPAN>A document factory may
therefore end up replacing the value of an existing key - i.e. a document
factory instance already configured - with its own
instance.</FONT></FONT></SPAN></P>
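<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial size=2>The effect described in points 3 and 4 can be reproduced
outside OpenCms with a few lines of Java. This is only a sketch of the mechanism
as I understand it: FactoryCollisionSketch, DocType and register are invented
names, and factory instances are represented by plain strings. It registers the
same two document types in both possible orders and records every key whose
owner gets replaced.</FONT></SPAN></P>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not OpenCms code: shows that the winner of a shared
// document key depends purely on registration order.
public class FactoryCollisionSketch {

    /** Stand-in for one document type entry and the keys its factory declares. */
    static final class DocType {
        final String name;
        final List<String> keys;

        DocType(String name, List<String> keys) {
            this.name = name;
            this.keys = keys;
        }
    }

    /**
     * Registers each document type's keys in the given order.
     * Every replaced mapping is appended to 'collisions'.
     */
    static Map<String, String> register(List<DocType> order, List<String> collisions) {
        Map<String, String> keyToFactory = new HashMap<>();
        for (DocType type : order) {
            for (String key : type.keys) {
                // HashMap.put returns the previous value, or null if there was none.
                String previous = keyToFactory.put(key, type.name);
                if (previous != null) {
                    collisions.add(key + ": " + previous + " replaced by " + type.name);
                }
            }
        }
        return keyToFactory;
    }

    public static void main(String[] args) {
        DocType generic = new DocType("generic", Arrays.asList("mytype", "plain"));
        DocType custom = new DocType("mycustom", Arrays.asList("mytype"));

        List<String> collisions = new ArrayList<>();
        // Same configuration, two processing orders, two different winners.
        System.out.println(register(Arrays.asList(generic, custom), collisions).get("mytype"));
        System.out.println(register(Arrays.asList(custom, generic), collisions).get("mytype"));
        System.out.println(collisions);
    }
}
```

<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial size=2>Checking the return value of put, as the sketch does, would at
least let a replaced association be logged at start-up.</FONT></SPAN></P>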
<P class=MsoNormal style="MARGIN: 0cm 0cm 6pt"><SPAN
style="FONT-SIZE: 9pt; FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: 'Lucida Console'"><FONT
face=Arial size=2><SPAN
class=467083106-23052006>Jon</SPAN></FONT></SPAN></P></DIV></BODY></HTML>