[opencms-dev] lucene indexing doesn't start
Konstantins Dorodovs
K.Dorodovs at mebius.lv
Tue May 11 20:23:01 CEST 2004
Hi,
I have a problem with Lucene indexing
(OpenCms version: 5.0.6b1, Lucene module: 1.5, Tomcat: 4.1.30).
The cron job doesn't seem to start, judging by the log.
Entry in the Scheduler (
11 21 * * * admin Administrators
net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true
)
As far as I can tell, I did everything according to the docs
(cron is enabled: [11.05.2004 20:10:04] <opencms_init> . OpenCms
scheduler : enabled)
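
For reference, the scheduler entry above can be read field by field. The following is a minimal Java sketch, assuming the first five fields follow the usual cron order (minute, hour, day of month, month, day of week) and are followed by user, group, class name and a parameter string. `SchedulerEntrySketch` is a hypothetical helper and not part of OpenCms or the Lucene module.

```java
import java.util.Arrays;

/** Hypothetical helper (not an OpenCms class): splits a scheduler line into its fields. */
final class SchedulerEntrySketch {
    final String minute, hour, dayOfMonth, month, dayOfWeek;
    final String user, group, className, parameter;

    private SchedulerEntrySketch(String[] f) {
        // Assumed field order: standard cron fields first, then user, group, class.
        minute = f[0]; hour = f[1]; dayOfMonth = f[2]; month = f[3]; dayOfWeek = f[4];
        user = f[5]; group = f[6]; className = f[7];
        // Everything after the class name is kept as one parameter string.
        parameter = String.join(" ", Arrays.copyOfRange(f, 8, f.length));
    }

    static SchedulerEntrySketch parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 8) {
            throw new IllegalArgumentException("expected at least 8 fields, got " + f.length);
        }
        return new SchedulerEntrySketch(f);
    }

    /** Only covers the simple case: fixed minute and hour, "*" for the other three fields. */
    String describe() {
        boolean everyDay = "*".equals(dayOfMonth) && "*".equals(month) && "*".equals(dayOfWeek);
        if (everyDay && minute.matches("\\d+") && hour.matches("\\d+")) {
            return String.format("daily at %02d:%02d",
                    Integer.parseInt(hour), Integer.parseInt(minute));
        }
        return "schedule not covered by this sketch";
    }

    public static void main(String[] args) {
        SchedulerEntrySketch e = parse("11 21 * * * admin Administrators "
                + "net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true");
        System.out.println(e.describe() + " as " + e.user + "/" + e.group
                + " -> " + e.className + " (" + e.parameter + ")");
    }
}
```

Under that assumption, the entry from this post describes a daily run at 21:11. That is later than both the 20:10:04 startup line quoted above and the time this message was sent (20:23), so it may be worth checking the log again after that time.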
Below is an excerpt from my registry.xml:
Thanks
Konstantin
---------- cut ------
<tempfileproject>3</tempfileproject>
<luceneSearch>
<!--
- mergeFactor and permCheck are currently ignored.
-->
<mergeFactor>100000</mergeFactor>
<permCheck>true</permCheck>
<!--
- directory in which lucene will store its indexes. Note: this is real
- fs, not VFS.
-->
<indexDir>C:\luceneindex\</indexDir>
<!-- <indexDir>F:\luceneindex\</indexDir> -->
<!--
- The analyzer is used for parsing documents. Choose one for your
- language. If language is English, use the StandardAnalyzer.
- There are additional analyzers at http://jakarta.apache.org/lucene
-->
<analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
<!--
<analyzer>org.apache.lucene.analysis.de.GermanAnalyzer</analyzer> -->
<!--
- If subsearch is true, subfolders will be searched by default.
- This can be turned on/off per directory.
-->
<subsearch>true</subsearch>
<!--
- Name of the project to index. Online is recommended.
-->
<project>online</project>
<!--
- docFactories determine how documents are processed. Generally, one
- docFactory exists for each type of content (viz. JSP, Page, Plain)
- that you want to index.
-->
<docFactories>
<!--
- This docFactory indexes documents with type page (e.g. HTML
- files edited with the WYSIWYG editor).
-
- Note that the 'type' attribute specifies which content definition
- to use. Built in content types include page, plain, binary, and jsp
- (there are others, too). Custom content types can be used as well
- (see the contentDefinitions section below).
-->
<docFactory enabled="true" type="page">
<class>net.grcomputing.opencms.search.lucene.PageDocument</class>
</docFactory>
<!--
- This docFactory is a little more complex. It takes documents of
- type "plain" and determines, by extension, what class should be
- used to index each particular file. In this example, we want to
- index plain text files exactly as they are, but any files that
- contain tags need the tags stripped out before they are indexed.
-
- Note that the name="" attribute is simply for pretty output, and
- can contain any allowable PCDATA text.
-->
<docFactory enabled="true" type="plain">
<fileType name="plaintext">
<extension>.txt</extension>
<class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
</fileType>
<fileType name="taggedtext">
<extension>.html</extension>
<extension>.htm</extension>
<extension>.xml</extension>
<!-- This will strip tags before processing -->
<class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
</fileType>
</docFactory>
<!-- This is for binary files. PDF and DOC files are binary, as are
- CLASS and JAR files.
-->
<docFactory enabled="true" type="binary">
<!-- This is for indexing PDF files -->
<fileType name="PDF">
<extension>.pdf</extension>
<class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
</fileType>
<!-- This is for indexing MS Word documents -->
<fileType name="Word">
<extension>.doc</extension>
<extension>.dot</extension>
<class>net.grcomputing.opencms.search.lucene.WordDocument</class>
</fileType>
</docFactory>
<!--
- This will strip JSP tags and all scriptlets. IT WILL NOT RENDER THE
- JSP FIRST, as JSPs are, by nature, dynamic.
-
- Usually, this is off by default.
-->
<docFactory enabled="false" type="jsp">
<class>net.grcomputing.opencms.search.lucene.JspDocument</class>
</docFactory>
<!-- For the news module. Enable if you use news -->
<!-- <docFactory enabled="false" type="news">
<class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
</docFactory>
-->
<!-- For the forum module. Enable if you use forums. -->
<!--
<docFactory enabled="false" type="forum">
<class>de.wfnetz.opencms.modules.forum.ContributionDocument</class>
</docFactory>
-->
<!-- If you need to index XML Template files (bad idea) use this: -->
<docFactory enabled="false" type="XML Template"/>
</docFactories>
<!--
- <directories/> determines which directories are indexed. By default,
- the /system directory is never indexed, so it is safe to index root.
-
- If you want to specify only certain directories for indexing, create
- one <directory/> entry per directory. Again, you may use subsearch to
- override the default subsearch setting discussed above.
-->
<directories>
<directory location="/">
<section>Root</section>
<subsearch>true</subsearch>
</directory>
</directories>
<!--
- Use this section to define specific contentDefinitions. Provided below
- are entries for the news and forum modules.
- (Uncomment these only after you have installed the corresponding
- modules)
-->
<contentDefinitions>
<!--
<contentDefinition type="news">
-->
<!--
- <class /> determines the class of the content definition. Should
- be a subclass of com.opencms.defaults.A_CmsContentDefinition.
-->
<!--
<class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
-->
<!--
- <initClass /> is optional and has to implement
- net.grcomputing.opencms.search.lucene.I_ContentDefinitionInitialization.
- It provides you with the ability to perform some
- initialization before the content definition class can be used.
- In case of the news module the NewsChannelContentDefinition class
- has to be loaded.
-->
<!--
<initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
-->
<!--
- <listMethod /> defines the method of the content definition class
- which should be used to retrieve all content definition objects
- (or any subset).
- Usually you use this method also in the backoffice or any other
- list view.
-->
<!--
<listMethod name="getNewsList">
<param type="java.lang.Integer">1</param>
<param type="java.lang.String">-1</param>
</listMethod>
-->
<!--
- <page /> determines a page in the virtual file system that can
- display a single entry of a content definition. You must provide
- also a method of the content definition class that retrieves an
- id (or something else that has to be appended to your page uri
- to determine which entry has to be displayed). The result will
- look like:
- /news.html?__element=entry&newsid=<result of getIntId>
- for each content definition instance object.
-->
<!--
<page uri="/news.html?__element=entry">
<param method="getIntId" name="newsid"/>
</page>
-->
<!--
<page uri="/singleNews.jsp">
<param method="getIntId" name="id"/>
</page>
-->
<!--
</contentDefinition>
-->
<!-- for Forums modules
<contentDefinition type="forum">
<class>de.wfnetz.opencms.modules.forum.ContributionContentDefinition</class>
<listMethod name="getSortedList">
<param type="java.lang.String"/>
</listMethod>
<page uri="/forum.html?forumtemplate=viewcontributionentry">
<param method="getId" name="conid"/>
</page>
</contentDefinition>
-->
</contentDefinitions>
</luceneSearch>
</system>
---------- cut ------
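
The registry comments above note that indexDir is a path on the real file system, not in the VFS. As a minimal sketch, assuming the index directory must exist and be writable by the user running Tomcat (the excerpt does not say whether the module creates it on its own), a quick check could look like this. `IndexDirCheckSketch` is a hypothetical helper, not part of the Lucene module.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: reports whether a directory looks usable for storing an index. */
final class IndexDirCheckSketch {

    static String check(String indexDir) {
        Path dir = Paths.get(indexDir);
        if (!Files.exists(dir)) {
            return "missing";
        }
        if (!Files.isDirectory(dir)) {
            return "not a directory";
        }
        if (!Files.isWritable(dir)) {
            return "not writable";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // Defaults to the indexDir value from the excerpt; pass another path as an argument.
        String dir = args.length > 0 ? args[0] : "C:\\luceneindex\\";
        System.out.println(dir + ": " + check(dir));
    }
}
```

Run it as the same operating-system user that runs Tomcat, since write permissions can differ between accounts.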