[opencms-dev] lucene indexing doesn't start
Konstantins Dorodovs
K.Dorodovs at mebius.lv
Wed May 12 10:11:01 CEST 2004
I looked in %CATALINA_HOME%\logs\localhost_log.MYDATE.txt;
there are no relevant errors there :(
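
For reference, here is a minimal sketch (plain Python, not part of OpenCms or the lucene module) of how I read the scheduler entry quoted below. It assumes the usual layout of five cron time fields followed by user, group, job class and key=value parameters; `SchedulerEntry` and `parse_entry` are hypothetical names made up for this illustration.

```python
from typing import NamedTuple


class SchedulerEntry(NamedTuple):
    """One scheduler line, split into its assumed fields."""
    minute: str
    hour: str
    day_of_month: str
    month: str
    day_of_week: str
    user: str
    group: str
    job_class: str
    params: dict


def parse_entry(line: str) -> SchedulerEntry:
    """Split a scheduler line on whitespace (assumed field order, see above)."""
    parts = line.split()
    if len(parts) < 8:
        raise ValueError("expected 5 time fields, user, group and class")
    # Everything after the class name is treated as key=value parameters.
    params = dict(p.split("=", 1) for p in parts[8:] if "=" in p)
    return SchedulerEntry(*parts[:8], params)


entry = parse_entry(
    "11 21 * * * admin Administrators "
    "net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true"
)
print(f"runs at {entry.hour}:{entry.minute} as {entry.user}/{entry.group}")
print(entry.job_class, entry.params)
```

If that reading is right, the job is due at 21:11 server time, so the log lines worth checking are the ones written after that minute.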
M Butcher wrote:
>
> Any errors in the catalina.log file?
>
> Matt
>
> Konstantins Dorodovs wrote:
>
>> Hi,
>>
>> I have a problem with lucene indexing
>> (opencms version: 5.0.6b1, lucene module: 1.5, tomcat: 4.1.30)
>>
>> The cron job doesn't seem to start (I looked at the log).
>> entry in Scheduler(
>> 11 21 * * * admin Administrators
>> net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true
>> )
>>
>> As far as I can tell, I did everything according to the docs
>> (cron is enabled: [11.05.2004 20:10:04] <opencms_init> . OpenCms
>> scheduler : enabled).
>> Below is an excerpt from my registry.xml:
>>
>> Thanks
>>
>> Konstantin
>>
>>
>> ---------- cut ------
>> <tempfileproject>3</tempfileproject>
>>
>> <luceneSearch>
>> <!--
>> - mergeFactor and permCheck are currently ignored.
>> -->
>> <mergeFactor>100000</mergeFactor>
>> <permCheck>true</permCheck>
>>
>> <!--
>> - directory in which lucene will store its indexes. Note: this is real
>> - fs, not VFS.
>> -->
>> <indexDir>C:\luceneindex\</indexDir>
>> <!-- <indexDir>F:\luceneindex\</indexDir> -->
>>
>> <!--
>> - The analyzer is used for parsing documents. Choose one for your
>> - language. If language is English, use the StandardAnalyzer.
>> - There are additional analyzers at http://jakarta.apache.org/lucene
>> -->
>>
>> <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
>>
>> <!--
>> <analyzer>org.apache.lucene.analysis.de.GermanAnalyzer</analyzer> -->
>>
>> <!--
>> - If subsearch is true, subfolders will be searched by default.
>> - This can be turned on/off per directory.
>> -->
>> <subsearch>true</subsearch>
>>
>> <!--
>> - Name of the project to index. Online is recommended.
>> -->
>> <project>online</project>
>>
>> <!--
>> - docFactories determine how documents are processed. Generally, one
>> - docFactory exists for each type of content (viz. JSP, Page, Plain)
>> - that you want to index.
>> -->
>> <docFactories>
>>
>> <!--
>> - This docFactory indexes documents with type page (e.g. HTML
>> - files edited with the WYSIWYG editor).
>> -
>> - Note that the 'type' attribute specifies which content definition
>> - to use. Built in content types include page, plain, binary, and jsp
>> - (there are others, too). Custom content types can be used as well
>> - (see the contentDefinitions section below).
>> -->
>> <docFactory enabled="true" type="page">
>>
>> <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
>> </docFactory>
>>
>> <!--
>> - This docFactory is a little more complex. It takes documents of
>> - type "plain" and determines, by extension, what class should be
>> - used to index each particular file. In this example, we want to
>> - index plain text files exactly as they are, but any files that
>> - contain tags need the tags stripped out before they are indexed.
>> -
>> - Note that the name="" attribute is simply for pretty output, and
>> - can contain any allowable PCDATA text.
>> -->
>> <docFactory enabled="true" type="plain">
>> <fileType name="plaintext">
>> <extension>.txt</extension>
>>
>> <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
>> </fileType>
>> <fileType name="taggedtext">
>> <extension>.html</extension>
>> <extension>.htm</extension>
>> <extension>.xml</extension>
>> <!-- This will strip tags before processing -->
>>
>> <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
>> </fileType>
>> </docFactory>
>>
>> <!-- This is for binary files. PDF and DOC files are binary, as are
>> - CLASS and JAR files.
>> -->
>> <docFactory enabled="true" type="binary">
>> <!-- This is for indexing PDF files -->
>> <fileType name="PDF">
>> <extension>.pdf</extension>
>>
>> <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
>> </fileType>
>> <!-- This is for indexing MS Word documents -->
>> <fileType name="Word">
>> <extension>.doc</extension>
>> <extension>.dot</extension>
>>
>> <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
>> </fileType>
>> </docFactory>
>>
>> <!--
>> - This will strip JSP tags and all scriptlets. IT WILL NOT RENDER THE
>> - JSP FIRST, as JSPs are, by nature, dynamic.
>> -
>> - Usually, this is off by default.
>> -->
>> <docFactory enabled="false" type="jsp">
>> <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
>> </docFactory>
>>
>> <!-- For the news module. Enable if you use news -->
>>
>> <!-- <docFactory enabled="false" type="news">
>>
>> <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
>> </docFactory>
>> -->
>>
>> <!-- For the forum module. Enable if you use forums. -->
>> <!--
>> <docFactory enabled="false" type="forum">
>>
>> <class>de.wfnetz.opencms.modules.forum.ContributionDocument</class>
>> </docFactory>
>> -->
>>
>> <!-- If you need to index XML Template files (bad idea) use this: -->
>> <docFactory enabled="false" type="XML Template"/>
>> </docFactories>
>>
>> <!--
>> - <directories/> determines which directories are indexed. By default,
>> - the /system directory is never indexed, so it is safe to index root.
>> -
>> - If you want to specify only certain directories for indexing, create
>> - one <directory/> entry per directory. Again, you may use subsearch to
>> - override the default subsearch setting discussed above.
>> -->
>> <directories>
>> <directory location="/">
>> <section>Root</section>
>> <subsearch>true</subsearch>
>> </directory>
>> </directories>
>>
>> <!--
>> - Use this section to define specific contentDefinitions. Provided below
>> - are entries for the news and forum modules.
>> - (Uncomment these only after you have installed the corresponding
>> - modules)
>> -->
>> <contentDefinitions>
>> <!--
>> <contentDefinition type="news">
>> -->
>> <!--
>> - <class /> determines the class of the content definition. Should
>> - be a subclass of com.opencms.defaults.A_CmsContentDefinition.
>> -->
>> <!--
>>
>> <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
>> -->
>> <!--
>> - <initClass /> is optional and has to implement
>> - net.grcomputing.opencms.search.lucene.I_ContentDefinitionInitialization.
>> - It provides you with the ability to perform some
>> - initialization before the content definition class can be used.
>> - In case of the news module the NewsChannelContentDefinition class
>> - has to be loaded.
>> -->
>> <!--
>>
>> <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
>>
>> -->
>> <!--
>> - <listMethod /> defines the method of the content definition class
>> - which should be used to retrieve all content definition objects
>> - (or any subset).
>> - Usually you use this method also in the backoffice or any other
>> - list view.
>> -->
>> <!--
>> <listMethod name="getNewsList">
>> <param type="java.lang.Integer">1</param>
>> <param type="java.lang.String">-1</param>
>> </listMethod>
>> -->
>> <!--
>> - <page /> determines a page in the virtual file system that can
>> - display a single entry of a content definition. You must provide
>> - also a method of the content definition class that retrieves an
>> - id (or something else that has to be appended to your page uri
>> - to determine which entry has to be displayed). The result will
>> - look like:
>> - /news.html?__element=entry&newsid=<result of getIntId>
>> - for each content definition instance object.
>> -->
>> <!--
>> <page uri="/news.html?__element=entry">
>> <param method="getIntId" name="newsid"/>
>> </page>
>> -->
>> <!--
>> <page uri="/singleNews.jsp">
>> <param method="getIntId" name="id"/>
>> </page>
>> -->
>> <!--
>> </contentDefinition>
>> -->
>> <!-- for Forums modules
>> <contentDefinition type="forum">
>>
>> <class>de.wfnetz.opencms.modules.forum.ContributionContentDefinition</class>
>>
>> <listMethod name="getSortedList">
>> <param type="java.lang.String"/>
>> </listMethod>
>> <page uri="/forum.html?forumtemplate=viewcontributionentry">
>> <param method="getId" name="conid"/>
>> </page>
>> </contentDefinition>
>> -->
>> </contentDefinitions>
>> </luceneSearch>
>>
>> </system>
>> ---------- cut ------
>>
>>
>> _______________________________________________
>> This mail is send to you from the opencms-dev mailing list
>> To change your list options, or to unsubscribe from the list, please
>> visit
>> http://mail.opencms.org/mailman/listinfo/opencms-dev