[opencms-dev] can't get lucene 1.5 to work

Kelvin kelvin.ang at resonance.com.sg
Wed Jun 23 10:49:01 CEST 2004


Thanks a lot Olli. Finally got it to work.

Cheers,
Kelvin

>Here is a resend, since I seem to have missed the crucial word 'off' in my
>previous mail :) The configuration below should work. Your original
>configuration does not work because the cron job is trying to index some
>content definitions defined in the configuration, which your site most
>likely doesn't have.
>
>Olli
>
><?xml version="1.0" encoding="ISO-8859-1"?>
><registry>
>  <system>
>    <luceneSearch>
>      <!--
>        - mergeFactor and permCheck are currently ignored.
>        -->
>      <mergeFactor>100000</mergeFactor>
>      <permCheck>true</permCheck>
>      <!--
>        - directory in which lucene will store its indexes. Note: this is real
>        - fs, not VFS.
>        -->
>      <indexDir>C:\luceneindex</indexDir>
>      <!--  <indexDir>F:\luceneindex\</indexDir>  -->
>      <!--
>        - The analyzer is used for parsing documents. Choose one for your
>        - language. If language is English, use the StandardAnalyzer.
>        - There are additional analyzers at http://jakarta.apache.org/lucene
>        -->
>      <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
>      <!--  <analyzer>org.apache.lucene.analysis.de.GermanAnalyzer</analyzer>  -->
>      <!--
>        - If subsearch is true, subfolders will be searched by default.
>        - This can be turned on/off per directory.
>        -->
>      <subsearch>true</subsearch>
>      <!--
>        - Name of the project to index. Online is recommended.
>        -->
>      <project>online</project>
>      <!--
>        - docFactories determine how documents are processed. Generally, one
>        - docFactory exists for each type of content (viz. JSP, Page, Plain)
>        - that you want to index.
>        -->
>      <docFactories>
>        <!--
>          - This docFactory indexes documents with type page (e.g. HTML
>          - files edited with the WYSIWYG editor).
>          -->
>        <docFactory enabled="true" type="page">
>          <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
>        </docFactory>
>        <!--
>          - This docFactory is a little more complex. It takes documents of
>          - type "plain" and determines, by extension, what class should be
>          - used to index each particular file. In this example, we want to
>          - index plain text files exactly as they are, but any files that
>          - contain tags need the tags stripped out before they are indexed.
>          -
>          - Note that the name="" attribute is simply for pretty output, and
>          - can contain any allowable PCDATA text.
>          -->
>        <docFactory enabled="true" type="plain">
>          <fileType name="plaintext">
>            <extension>.txt</extension>
>            <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
>          </fileType>
>          <fileType name="taggedtext">
>            <extension>.html</extension>
>            <extension>.htm</extension>
>            <extension>.xml</extension>
>            <!--  This will strip tags before processing  -->
>            <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
>          </fileType>
>        </docFactory>
>        <!--
>          - This will strip JSP tags and all scriptlets. IT WILL NOT RENDER THE
>          - JSP FIRST, as JSPs are, by nature, dynamic.
>          -
>          - Usually, this is off by default.
>          -->
>        <docFactory enabled="false" type="jsp">
>          <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
>        </docFactory>
>        <!--  For the news module. Enable if you use news.  -->
>        <docFactory enabled="false" type="news">
>          <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
>        </docFactory>
>        <!--  For the forum module. Enable if you use forums.  -->
>        <docFactory enabled="false" type="forum">
>          <class>de.wfnetz.opencms.modules.forum.ContributionDocument</class>
>        </docFactory>
>        <!--  If you need to index XML Template files (bad idea) use this:  -->
>        <docFactory enabled="false" type="XML Template" />
>      </docFactories>
>      <!--
>        - <directories/> determines which directories are indexed. By default,
>        - the /system directory is never indexed, so it is safe to index root.
>        -
>        - If you want to specify only certain directories for indexing, create
>        - one <directory/> entry per directory. Again, you may use subsearch to
>        - override the default subsearch setting discussed above.
>        -->
>      <directories>
>        <directory location="/">
>          <section>Root</section>
>          <subsearch>true</subsearch>
>        </directory>
>      </directories>
>      <!--
>        - Use this section to define specific contentDefinitions. Provided below
>        - are entries for the news and forum modules.
>        -->
>      <contentDefinitions/>
>    </luceneSearch>
>    <!--
>      - END lucene config
>      -->
>
>
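
For reference, the <directories/> comment in the quoted configuration says that indexing can be limited to particular folders by adding one <directory/> entry per folder, each with its own subsearch override. A minimal sketch of what that might look like follows. The folder paths and section labels are hypothetical (the quoted configuration only shows location="/" and the section value Root), and the trailing-slash path format is an assumption.

```xml
<!-- Sketch only: folder paths and section labels below are hypothetical. -->
<directories>
  <!-- Index /products/ and, with subsearch on, its subfolders as well. -->
  <directory location="/products/">
    <section>Products</section>
    <subsearch>true</subsearch>
  </directory>
  <!-- Index /press/ with subsearch off, overriding the global default of true. -->
  <directory location="/press/">
    <section>Press</section>
    <subsearch>false</subsearch>
  </directory>
</directories>
```

This block would take the place of the single root entry inside <luceneSearch>; the rest of the quoted configuration stays unchanged.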
>----------
>
>
>---
>Incoming mail is certified Virus Free.
>Checked by AVG anti-virus system (http://www.grisoft.com).
>Version: 6.0.708 / Virus Database: 464 - Release Date: 18/06/2004
>
>---
>Outgoing mail is certified Virus Free.
>Checked by AVG anti-virus system (http://www.grisoft.com).
>Version: 6.0.708 / Virus Database: 464 - Release Date: 18/06/2004