[opencms-dev] lucene indexing doesn't start
Konstantins Dorodovs
K.Dorodovs at mebius.lv
Fri May 14 10:58:01 CEST 2004
It's OK, the task did run, only later than expected.
A new problem is:
doc.get("title") returns null when Lucene builds the index on Linux;
when I run it on Windows it seems OK.
Konstantin
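
Lucene's Document.get(name) returns null whenever the document has no
field with that name, so code that displays results can at least guard
against it. The following is a minimal sketch only: titleOrFallback is a
hypothetical helper (not part of the lucene module), and it assumes the
VFS path of the resource is available as a fallback. It does not explain
why the field is missing on Linux.

```java
public class TitleFallback {

    /**
     * Returns the indexed title, or the last path segment when the title
     * is null or blank. Hypothetical helper, not part of the lucene module.
     */
    static String titleOrFallback(String title, String path) {
        if (title != null && title.trim().length() > 0) {
            return title.trim();
        }
        if (path == null || path.length() == 0) {
            return "(untitled)";
        }
        // Drop a trailing slash so that folders yield their own name.
        String p = (path.endsWith("/") && path.length() > 1)
                ? path.substring(0, path.length() - 1)
                : path;
        int slash = p.lastIndexOf('/');
        return (slash >= 0 && slash < p.length() - 1) ? p.substring(slash + 1) : p;
    }

    public static void main(String[] args) {
        // In a result page the first argument would come from doc.get("title").
        System.out.println(titleOrFallback(null, "/docs/manual.html"));
        System.out.println(titleOrFallback("  User Manual ", "/docs/manual.html"));
    }
}
```

This only hides the symptom in the result list; the missing field itself
still has to be tracked down in the indexing step.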
M Butcher wrote:
>
> Are any other cron tasks executing? It sounds like the
> CronIndexManager is never being run.
>
> If you suspect otherwise, a simple test is to run the CronIndexManager
> from a JSP. That would print any exceptions directly to the browser
> window, which would be helpful.
>
> CmsJspActionElement cmsjsp =
>     new CmsJspActionElement(pageContext, request, response);
> CronIndexManager c = new CronIndexManager();
> c.launch(cmsjsp.getCmsObject(), "createIndex=true");
>
> Matt
>
> Konstantins Dorodovs wrote:
>
>> looked in %CATALINA_HOME%\logs\localhost_log.MYDATE.txt
>> no relevant errors there :(
>>
>>
>>
>>
>> M Butcher wrote:
>>
>>>
>>> Any errors in the catalina.log file?
>>>
>>> Matt
>>>
>>> Konstantins Dorodovs wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a problem with Lucene indexing
>>>> (OpenCms version: 5.0.6b1, lucene module: 1.5, Tomcat: 4.1.30).
>>>>
>>>> The cron job doesn't seem to start (I looked at the log).
>>>> Entry in Scheduler (
>>>> 11 21 * * * admin Administrators
>>>> net.grcomputing.opencms.search.lucene.CronIndexManager
>>>> createIndex=true
>>>> )
>>>>
>>>> As far as I can tell, I followed the docs
>>>> (cron is enabled: [11.05.2004 20:10:04] <opencms_init> . OpenCms
>>>> scheduler : enabled).
>>>> Below is an excerpt from my registry.xml:
>>>>
>>>> Thanks
>>>>
>>>> Konstantin
>>>>
>>>>
>>>> ---------- cut ------
>>>> <tempfileproject>3</tempfileproject>
>>>>
>>>> <luceneSearch>
>>>> <!--
>>>> - mergeFactor and permCheck are currently ignored.
>>>> -->
>>>> <mergeFactor>100000</mergeFactor>
>>>> <permCheck>true</permCheck>
>>>>
>>>> <!--
>>>> - directory in which lucene will store its indexes. Note: this is
>>>> - real fs, not VFS.
>>>> -->
>>>> <indexDir>C:\luceneindex\</indexDir>
>>>> <!-- <indexDir>F:\luceneindex\</indexDir> -->
>>>>
>>>> <!--
>>>> - The analyzer is used for parsing documents. Choose one for your
>>>> - language. If language is English, use the StandardAnalyzer.
>>>> - There are additional analyzers at
>>>> http://jakarta.apache.org/lucene
>>>> -->
>>>>
>>>> <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
>>>>
>>>> <!--
>>>> <analyzer>org.apache.lucene.analysis.de.GermanAnalyzer</analyzer> -->
>>>>
>>>> <!--
>>>> - If subsearch is true, subfolders will be searched by default.
>>>> - This can be turned on/off per directory.
>>>> -->
>>>> <subsearch>true</subsearch>
>>>>
>>>> <!--
>>>> - Name of the project to index. Online is recommended.
>>>> -->
>>>> <project>online</project>
>>>>
>>>> <!--
>>>> - docFactories determine how documents are processed. Generally,
>>>> - one docFactory exists for each type of content (viz. JSP, Page,
>>>> - Plain) that you want to index.
>>>> -->
>>>> <docFactories>
>>>>
>>>> <!--
>>>> - This docFactory indexes documents with type page (e.g. HTML
>>>> - files edited with the WYSIWYG editor).
>>>> -
>>>> - Note that the 'type' attribute specifies which content definition
>>>> - to use. Built in content types include page, plain, binary, and
>>>> - jsp (there are others, too). Custom content types can be used as
>>>> - well (see the contentDefinitions section below).
>>>> -->
>>>> <docFactory enabled="true" type="page">
>>>>
>>>> <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
>>>> </docFactory>
>>>>
>>>> <!--
>>>> - This docFactory is a little more complex. It takes documents of
>>>> - type "plain" and determines, by extension, what class should be
>>>> - used to index each particular file. In this example, we want to
>>>> - index plain text files exactly as they are, but any files that
>>>> - contain tags need the tags stripped out before they are indexed.
>>>> -
>>>> - Note that the name="" attribute is simply for pretty output, and
>>>> - can contain any allowable PCDATA text.
>>>> -->
>>>> <docFactory enabled="true" type="plain">
>>>> <fileType name="plaintext">
>>>> <extension>.txt</extension>
>>>>
>>>> <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
>>>> </fileType>
>>>> <fileType name="taggedtext">
>>>> <extension>.html</extension>
>>>> <extension>.htm</extension>
>>>> <extension>.xml</extension>
>>>> <!-- This will strip tags before processing -->
>>>>
>>>> <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
>>>>
>>>> </fileType>
>>>> </docFactory>
>>>>
>>>> <!-- This is for binary files. PDF and DOC files are binary, as are
>>>> - CLASS and JAR files.
>>>> -->
>>>> <docFactory enabled="true" type="binary">
>>>> <!-- This is for indexing PDF files -->
>>>> <fileType name="PDF">
>>>> <extension>.pdf</extension>
>>>>
>>>> <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
>>>> </fileType>
>>>> <!-- This is for indexing MS Word documents -->
>>>> <fileType name="Word">
>>>> <extension>.doc</extension>
>>>> <extension>.dot</extension>
>>>>
>>>> <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
>>>> </fileType>
>>>> </docFactory>
>>>>
>>>> <!--
>>>> - This will strip JSP tags and all scriptlets. IT WILL NOT RENDER
>>>> - THE JSP FIRST, as JSPs are, by nature, dynamic.
>>>> -
>>>> - Usually, this is off by default.
>>>> -->
>>>> <docFactory enabled="false" type="jsp">
>>>>
>>>> <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
>>>> </docFactory>
>>>>
>>>> <!-- For the news module. Enable if you use news -->
>>>>
>>>> <!-- <docFactory enabled="false" type="news">
>>>>
>>>> <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
>>>> </docFactory>
>>>> -->
>>>>
>>>> <!-- For the forum module. Enable if you use forums. -->
>>>> <!--
>>>> <docFactory enabled="false" type="forum">
>>>>
>>>> <class>de.wfnetz.opencms.modules.forum.ContributionDocument</class>
>>>> </docFactory>
>>>> -->
>>>>
>>>> <!-- If you need to index XML Template files (bad idea) use this: -->
>>>> <docFactory enabled="false" type="XML Template"/>
>>>> </docFactories>
>>>>
>>>> <!--
>>>> - <directories/> determines which directories are indexed. By default,
>>>> - the /system directory is never indexed, so it is safe to index root.
>>>> -
>>>> - If you want to specify only certain directories for indexing, create
>>>> - one <directory/> entry per directory. Again, you may use subsearch to
>>>> - override the default subsearch setting discussed above.
>>>> -->
>>>> <directories>
>>>> <directory location="/">
>>>> <section>Root</section>
>>>> <subsearch>true</subsearch>
>>>> </directory>
>>>> </directories>
>>>>
>>>> <!--
>>>> - Use this section to define specific contentDefinitions. Provided
>>>> - below are entries for the news and forum modules.
>>>> - (Uncomment these only after you have installed the corresponding
>>>> - modules)
>>>> -->
>>>> <contentDefinitions>
>>>> <!--
>>>> <contentDefinition type="news">
>>>> -->
>>>> <!--
>>>> - <class /> determines the class of the content definition. Should
>>>> - be a subclass of com.opencms.defaults.A_CmsContentDefinition.
>>>> -->
>>>> <!--
>>>>
>>>> <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
>>>> -->
>>>> <!--
>>>> - <initClass /> is optional and has to implement
>>>> - net.grcomputing.opencms.search.lucene.I_ContentDefinitionInitialization.
>>>> - It provides you with the ability to perform some
>>>> - initialization before the content definition class can be used.
>>>> - In case of the news module the NewsChannelContentDefinition class
>>>> - has to be loaded.
>>>> -->
>>>> <!--
>>>>
>>>> <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
>>>>
>>>> -->
>>>> <!--
>>>> - <listMethod /> defines the method of the content definition class
>>>> - which should be used to retrieve all content definition objects
>>>> - (or any subset).
>>>> - Usually you use this method also in the backoffice or any other
>>>> - list view.
>>>> -->
>>>> <!--
>>>> <listMethod name="getNewsList">
>>>> <param type="java.lang.Integer">1</param>
>>>> <param type="java.lang.String">-1</param>
>>>> </listMethod>
>>>> -->
>>>> <!--
>>>> - <page /> determines a page in the virtual file system that can
>>>> - display a single entry of a content definition. You must provide
>>>> - also a method of the content definition class that retrieves an
>>>> - id (or something else that has to be appended to your page uri
>>>> - to determine which entry has to be displayed). The result will
>>>> - look like:
>>>> - /news.html?__element=entry&newsid=<result of getIntId>
>>>> - for each content definition instance object.
>>>> -->
>>>> <!--
>>>> <page uri="/news.html?__element=entry">
>>>> <param method="getIntId" name="newsid"/>
>>>> </page>
>>>> -->
>>>> <!--
>>>> <page uri="/singleNews.jsp">
>>>> <param method="getIntId" name="id"/>
>>>> </page>
>>>> -->
>>>> <!--
>>>> </contentDefinition>
>>>> -->
>>>> <!-- for Forums modules
>>>> <contentDefinition type="forum">
>>>>
>>>> <class>de.wfnetz.opencms.modules.forum.ContributionContentDefinition</class>
>>>>
>>>> <listMethod name="getSortedList">
>>>> <param type="java.lang.String"/>
>>>> </listMethod>
>>>> <page uri="/forum.html?forumtemplate=viewcontributionentry">
>>>> <param method="getId" name="conid"/>
>>>> </page>
>>>> </contentDefinition>
>>>> -->
>>>> </contentDefinitions>
>>>> </luceneSearch>
>>>>
>>>> </system>
>>>> ---------- cut ------
>>>>
>>>>
>>>> _______________________________________________
>>>> This mail is send to you from the opencms-dev mailing list
>>>> To change your list options, or to unsubscribe from the list,
>>>> please visit
>>>> http://mail.opencms.org/mailman/listinfo/opencms-dev
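
On the scheduling side, it may help to read the Scheduler entry from the
original message field by field. Assuming the line follows the usual cron
layout (minute, hour, day of month, month, day of week) followed by user,
group, class name and parameter string, "11 21 * * *" means 21:11 server
time, not 11:21. The sketch below only splits such a line; the CronEntry
class is hypothetical and is not how OpenCms itself parses its cron table.

```java
public class CronEntry {
    final String minute, hour, dayOfMonth, month, dayOfWeek;
    final String user, group, className, parameter;

    /** Splits one scheduler line; the field order is an assumption (see above). */
    CronEntry(String line) {
        // Limit of 9 keeps any spaces inside the trailing parameter string.
        String[] f = line.trim().split("\\s+", 9);
        if (f.length < 8) {
            throw new IllegalArgumentException(
                    "expected at least 8 fields, got " + f.length);
        }
        minute = f[0];
        hour = f[1];
        dayOfMonth = f[2];
        month = f[3];
        dayOfWeek = f[4];
        user = f[5];
        group = f[6];
        className = f[7];
        parameter = f.length > 8 ? f[8] : "";
    }

    String describe() {
        return "runs " + className + " as " + user + "/" + group
                + " at hour " + hour + ", minute " + minute;
    }

    public static void main(String[] args) {
        CronEntry e = new CronEntry("11 21 * * * admin Administrators "
                + "net.grcomputing.opencms.search.lucene.CronIndexManager "
                + "createIndex=true");
        System.out.println(e.describe());
    }
}
```

If the fields are read this way, a job that appears not to start may
simply be waiting for its hour, measured on the server's clock.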