[opencms-dev] lucene indexing doesn't start

Konstantins Dorodovs K.Dorodovs at mebius.lv
Wed May 12 10:11:01 CEST 2004


I looked in %CATALINA_HOME%\logs\localhost_log.MYDATE.txt;
there are no relevant errors there :(
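
For reference, the scheduler entry quoted below can be split into its fields. This is only a minimal sketch: it assumes the entry is a single line made of the usual five cron time fields (minute, hour, day of month, month, day of week) followed by user, group, class name and parameters. The class CronEntry and its field names are hypothetical and are not part of OpenCms or the lucene module.

```java
public class CronEntry {
    // Time fields, in the assumed standard cron order.
    final String minute, hour, dayOfMonth, month, dayOfWeek;
    // Who runs the job, which class is launched, and its parameter string.
    final String user, group, className, params;

    private CronEntry(String[] f, String params) {
        this.minute = f[0];
        this.hour = f[1];
        this.dayOfMonth = f[2];
        this.month = f[3];
        this.dayOfWeek = f[4];
        this.user = f[5];
        this.group = f[6];
        this.className = f[7];
        this.params = params;
    }

    // Splits one scheduler line on whitespace; everything after the
    // eighth field is kept together as the parameter string.
    static CronEntry parse(String line) {
        String[] f = line.trim().split("\\s+", 9);
        if (f.length < 8) {
            throw new IllegalArgumentException(
                "expected at least 8 fields, got " + f.length);
        }
        return new CronEntry(f, f.length == 9 ? f[8] : "");
    }

    public static void main(String[] args) {
        CronEntry e = parse("11 21 * * * admin Administrators "
            + "net.grcomputing.opencms.search.lucene.CronIndexManager "
            + "createIndex=true");
        System.out.println("time   = " + e.hour + ":" + e.minute);
        System.out.println("run as = " + e.user + " / " + e.group);
        System.out.println("class  = " + e.className);
        System.out.println("params = " + e.params);
    }
}
```

Under those assumptions the entry reads as minute 11 of hour 21, every day, run as user admin in group Administrators. If the line is wrapped in the scheduler table itself rather than only in this mail, the split would produce fewer than eight fields and parse() would reject it.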




M Butcher wrote:

>
> Any errors in the catalina.log file?
>
> Matt
>
> Konstantins Dorodovs wrote:
>
>> Hi,
>>
>> I have a problem with lucene indexing
>> (opencms version: 5.0.6b1, lucene module: 1.5, tomcat: 4.1.30)
>>
>> The cron job doesn't seem to start (I looked at the log).
>> Entry in the Scheduler:
>> 11 21 * * * admin Administrators net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true
>>
>> As far as I can tell, I set everything up according to the docs
>> (cron is enabled: [11.05.2004 20:10:04] <opencms_init> . OpenCms scheduler    : enabled).
>> Below is an excerpt from my registry.xml:
>>
>> Thanks
>>
>> Konstantin
>>
>>
>> ---------- cut ------
>>        <tempfileproject>3</tempfileproject>
>>
>> <luceneSearch>
>>    <!--
>>      - mergeFactor and permCheck are currently ignored.
>>      -->
>>   <mergeFactor>100000</mergeFactor>
>>   <permCheck>true</permCheck>
>>
>>    <!--
>>      - directory in which lucene will store its indexes. Note: this is real
>>      - fs, not VFS.
>>      -->
>>   <indexDir>C:\luceneindex\</indexDir>
>>   <!-- <indexDir>F:\luceneindex\</indexDir> -->
>>
>>    <!--
>>      - The analyzer is used for parsing documents. Choose one for your
>>      - language. If language is English, use the StandardAnalyzer.
>>      - There are additional analyzers at http://jakarta.apache.org/lucene
>>      -->
>>   <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
>>   <!-- <analyzer>org.apache.lucene.analysis.de.GermanAnalyzer</analyzer> -->
>>
>>    <!--
>>      - If subsearch is true, subfolders will be searched by default.
>>      - This can be turned on/off per directory.
>>      -->
>>   <subsearch>true</subsearch>
>>
>>    <!--
>>      - Name of the project to index. Online is recommended.
>>      -->
>>   <project>online</project>
>>  
>>    <!--
>>      - docFactories determine how documents are processed. Generally, one
>>      - docFactory exists for each type of content (viz. JSP, Page, Plain)
>>      - that you want to index.
>>      -->
>>   <docFactories>
>>  
>>       <!--
>>         - This docFactory indexes documents with type page (e.g. HTML
>>         - files edited with the WYSIWYG editor).
>>         -
>>         - Note that the 'type' attribute specifies which content definition
>>         - to use. Built in content types include page, plain, binary, and jsp
>>         - (there are others, too). Custom content types can be used as well
>>         - (see the contentDefinitions section below).
>>         -->
>>       <docFactory enabled="true" type="page">
>>         <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
>>       </docFactory>
>>
>>       <!--
>>         - This docFactory is a little more complex. It takes documents of
>>         - type "plain" and determines, by extension, what class should be
>>         - used to index each particular file. In this example, we want to
>>         - index plain text files exactly as they are, but any files that
>>         - contain tags need the tags stripped out before they are indexed.
>>         -
>>         - Note that the name="" attribute is simply for pretty output, and
>>         - can contain any allowable PCDATA text.
>>         -->
>>       <docFactory enabled="true" type="plain">
>>          <fileType name="plaintext">
>>            <extension>.txt</extension>
>>            <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
>>          </fileType>
>>          <fileType name="taggedtext">
>>            <extension>.html</extension>
>>            <extension>.htm</extension>
>>            <extension>.xml</extension>
>>            <!-- This will strip tags before processing -->
>>            <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
>>          </fileType>
>>       </docFactory>
>>
>>        <!-- This is for binary files. PDF and DOC files are binary, as are
>>          - CLASS and JAR files.
>>          -->
>>       <docFactory enabled="true" type="binary">
>>          <!-- This is for indexing PDF files -->
>>          <fileType name="PDF">
>>            <extension>.pdf</extension>
>>            <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
>>          </fileType>
>>          <!-- This is for indexing MS Word documents -->
>>          <fileType name="Word">
>>            <extension>.doc</extension>
>>            <extension>.dot</extension>
>>            <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
>>          </fileType>
>>       </docFactory>
>>
>>       <!--
>>         - This will strip JSP tags and all scriptlets. IT WILL NOT RENDER THE
>>         - JSP FIRST, as JSPs are, by nature, dynamic.
>>         -
>>         - Usually, this is off by default.
>>         -->
>>       <docFactory enabled="false" type="jsp">
>>         <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
>>       </docFactory>
>>
>>       <!-- For the news module. Enable if you use news -->
>>
>>       <!--
>>       <docFactory enabled="false" type="news">
>>         <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
>>       </docFactory>
>>       -->
>>
>>       <!-- For the forum module. Enable if you use forums. -->
>>       <!--
>>       <docFactory enabled="false" type="forum">
>>         <class>de.wfnetz.opencms.modules.forum.ContributionDocument</class>
>>       </docFactory>
>>       -->
>>
>>       <!-- If you need to index XML Template files (bad idea) use this: -->
>>       <docFactory enabled="false" type="XML Template"/>
>>   </docFactories>
>>  
>>    <!--
>>      - <directories/> determines which directories are indexed. By default,
>>      - the /system directory is never indexed, so it is safe to index root.
>>      -
>>      - If you want to specify only certain directories for indexing, create
>>      - one <directory/> entry per directory. Again, you may use subsearch to
>>      - override the default subsearch setting discussed above.
>>      -->
>>   <directories>
>>       <directory location="/">
>>         <section>Root</section>
>>         <subsearch>true</subsearch>
>>       </directory>
>>   </directories>
>>
>>   <!--
>>     - Use this section to define specific contentDefinitions. Provided below
>>     - are entries for the news and forum modules.
>>     - (Uncomment these only after you have installed the corresponding
>>     - modules)
>>     -->
>>   <contentDefinitions>
>>       <!--
>>       <contentDefinition type="news">
>>        -->
>>          <!--
>>            - <class /> determines the class of the content definition. Should
>>            - be a subclass of com.opencms.defaults.A_CmsContentDefinition.
>>            -->
>>         <!--
>>         <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
>>          -->
>>          <!--
>>            - <initClass /> is optional and has to implement
>>            - net.grcomputing.opencms.search.lucene.I_ContentDefinitionInitialization.
>>            - It provides you with the ability to perform some
>>            - initialization before the content definition class can be used.
>>            - In case of the news module the NewsChannelContentDefinition class
>>            - has to be loaded.
>>            -->
>>         <!--
>>         <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
>>          -->
>>           <!--
>>             - <listMethod /> defines the method of the content definition class
>>             - which should be used to retrieve all content definition objects
>>             - (or any subset).
>>             - Usually you use this method also in the backoffice or any other
>>             - list view.
>>             -->
>>         <!--
>>         <listMethod name="getNewsList">
>>           <param type="java.lang.Integer">1</param>
>>           <param type="java.lang.String">-1</param>
>>         </listMethod>
>>          -->
>>           <!--
>>             - <page /> determines a page in the virtual file system that can
>>             - display a single entry of a content definition. You must provide
>>             - also a method of the content definition class that retrieves an
>>             - id (or something else that has to be appended to your page uri
>>             - to determine which entry has to be displayed). The result will
>>             - look like:
>>             - /news.html?__element=entry&newsid=<result of getIntId>
>>             - for each content definition instance object.
>>             -->
>>         <!--
>>         <page uri="/news.html?__element=entry">
>>           <param method="getIntId" name="newsid"/>
>>         </page>
>>          -->
>>         <!--
>>           <page uri="/singleNews.jsp">
>>             <param method="getIntId" name="id"/>
>>           </page>
>>           -->
>>       <!--
>>       </contentDefinition>
>>        -->
>>        <!-- for Forums modules
>>       <contentDefinition type="forum">
>>         <class>de.wfnetz.opencms.modules.forum.ContributionContentDefinition</class>
>>         <listMethod name="getSortedList">
>>           <param type="java.lang.String"/>
>>         </listMethod>
>>         <page uri="/forum.html?forumtemplate=viewcontributionentry">
>>           <param method="getId" name="conid"/>
>>         </page>
>>       </contentDefinition>
>>       -->
>>   </contentDefinitions>
>> </luceneSearch>
>>
>>    </system>
>> ---------- cut ------
>>
>>
>> _______________________________________________
>> This mail is sent to you from the opencms-dev mailing list
>> To change your list options, or to unsubscribe from the list, please visit
>> http://mail.opencms.org/mailman/listinfo/opencms-dev


