[opencms-dev] XMLContent search settings

Tony Thul TTHUL at regina.ca
Fri Apr 1 15:55:24 CEST 2011


Thanks Paul!
 
That's just what we were looking for!
 
Tony

>>> Paul-Inge Flakstad <flakstad at npolar.no> 31/Mar/2011 8:48 pm >>>
Hi Tony,

If I understand you correctly, your xmlcontent ("structured content") contains a mix of elements that you do want to be indexed and elements that you don't want to be indexed.

In this case, add a "searchsetting" entry to the "searchsettings" section of your XSD for each element that shouldn't be indexed, like this:

<xsd:appinfo>
    ...
    <searchsettings>
        <searchsetting element="MyElementName" searchcontent="false" />
    </searchsettings>
    ...
</xsd:appinfo>

In the example above, any content inside "MyElementName" will not be indexed.

Note that for an existing index to reflect changes in the "searchsettings" section, simply rebuilding the index is not sufficient. Assuming you're using Tomcat, you'll need to stop Tomcat, delete the search index (located at {Tomcat home}/webapps/{OpenCms webapp name}/WEB-INF/index/), start Tomcat again, and then rebuild the search index in OpenCms.
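
As a rough sketch of the stop / delete / start sequence above, something like the following shell function could be used. It assumes Tomcat is controlled through its own bin/shutdown.sh and bin/startup.sh scripts (an installation managed as a system service would need different commands), and the function name reset_opencms_index is purely illustrative. The final rebuild still has to be triggered in OpenCms itself.

```shell
#!/bin/sh
# Sketch only: assumes Tomcat is started/stopped via its bundled scripts.
# reset_opencms_index is a hypothetical helper name, not part of OpenCms.

# Usage: reset_opencms_index TOMCAT_HOME WEBAPP_NAME
reset_opencms_index() {
    tomcat_home="$1"
    webapp="$2"

    # Refuse to run with missing arguments so rm never sees a partial path.
    if [ -z "$tomcat_home" ] || [ -z "$webapp" ]; then
        echo "usage: reset_opencms_index TOMCAT_HOME WEBAPP_NAME" >&2
        return 1
    fi

    index_dir="$tomcat_home/webapps/$webapp/WEB-INF/index"

    # 1. Stop Tomcat. shutdown.sh can return before the JVM has fully
    #    exited, so confirm the process is gone on a real system.
    "$tomcat_home/bin/shutdown.sh" || return 1

    # 2. Delete the on-disk search index.
    rm -rf "$index_dir"

    # 3. Start Tomcat again.
    "$tomcat_home/bin/startup.sh" || return 1

    # 4. The rebuild itself is done inside OpenCms.
    echo "Tomcat restarted; now rebuild the search index in OpenCms."
}
```

For example: reset_opencms_index /opt/tomcat opencms (both values are placeholders for your own Tomcat home and webapp name).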

Additionally, a handy feature (in case you were not aware of it):
To prevent entire files/folders from being indexed, set the property "search.exclude" to "true". (I use this approach for certain resource types.)

Hope this helps.

Best regards,
Paul
________________________________________
From: opencms-dev-bounces at opencms.org [opencms-dev-bounces at opencms.org] on behalf of Tony Thul [TTHUL at regina.ca]
Sent: 31 March 2011 19:04
To: opencms-dev at opencms.org
Subject: [opencms-dev] XMLContent search settings

We want users to be able to search both XML content pages and PDFs through one index. In our field configuration we have Content, which is mapped to "content", and XMLContent, which is mapped to Parameter = Content[1].

The problem is that the XML content pages are analyzed for both Content and XMLContent. As a result, the excerpt is built from Content, so it shows text from other elements in the XML file that we don't want in the excerpt. If we remove Content from the field configuration, XML content pages work but PDF content doesn't get indexed.

Any ideas?

Thanks!
Tony


DISCLAIMER: The information transmitted is intended only for the addressee and may contain confidential, proprietary and/or privileged material. Any unauthorized review, distribution or other use of or the taking of any action in reliance upon this information is prohibited. If you received this in error, please contact the sender and delete or destroy this message and any copies.

