[opencms-dev] Help: Lucene Search CronIndexManager

Arthur Visser arthur.visser at student.groept.be
Sun Apr 24 23:51:40 CEST 2005


Hello,

I hope somebody can help me with the following problem. I
have installed the Lucene Search Module in OpenCms, but I
cannot get the index job to run.

I have checked everything several times and even
re-installed the module. I have also checked and changed the
registry file many times and tried different index directories.
Everything fails. I have just about run out of ideas about
what else to check. The only thing I can imagine is a
failure to write files to the indexDir on the
XP Home system. Does anyone know how to set the file write
permission for a specific folder in XP Home? Right-clicking
on a folder does not give many options (just
sharing, but no security or 'level 5' options as in XP
Professional).
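
The write-permission suspicion can be tested outside OpenCms. The following is only a minimal sketch, assuming plain Java with no OpenCms or Lucene classes. The class name IndexDirCheck, the method diagnose and the fallback directory name are made up for illustration. It tries to create the directory, list it, and write a probe file in it, which are the operations an index writer would presumably need.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper, not part of OpenCms or the Lucene search module. */
final class IndexDirCheck {

    /** Returns "ok" or a short description of the first problem found. */
    static String diagnose(String path) {
        File dir = new File(path);
        if (!dir.exists() && !dir.mkdirs()) {
            return "cannot create " + dir.getAbsolutePath();
        }
        if (!dir.isDirectory()) {
            return "not a directory: " + dir.getAbsolutePath();
        }
        // File.list() returns null if the path cannot be read as a directory.
        if (dir.list() == null) {
            return "cannot list " + dir.getAbsolutePath();
        }
        try {
            // Writing a real file is a more direct test than File.canWrite().
            File probe = File.createTempFile("probe", ".tmp", dir);
            if (!probe.delete()) {
                return "cannot delete files in " + dir.getAbsolutePath();
            }
        } catch (IOException e) {
            return "cannot write to " + dir.getAbsolutePath() + ": " + e.getMessage();
        }
        return "ok";
    }

    public static void main(String[] args) {
        // Pass the <indexDir> value from the registry as the first argument.
        // The fallback below is an arbitrary test directory (assumption).
        String fallback = new File(System.getProperty("java.io.tmpdir"),
                                   "luceneindex-test").getPath();
        String path = args.length > 0 ? args[0] : fallback;
        System.out.println(path + " -> " + diagnose(path));
    }
}
```

For the result to be meaningful, it would have to be run under the same user account and from the same working directory as Tomcat, for example `java IndexDirCheck /opt/luceneindex`.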

 

I am using: OpenCms 5.0.1 - Tomcat 5.0.18 - Java 1.4.2 - XP
Home SP2 - MySQL 4.0.22

Please find the error message and registry below.

Part of the message in my fileviewer (always the same):

13:42:10] <opencms_cronscheduler> Error running job for com.opencms.core.CmsCronEntry{42 13 * * * admin Administrators net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true} Error: java.lang.NullPointerException
      at org.apache.lucene.store.FSDirectory.create(FSDirectory.java:172)
      at org.apache.lucene.store.FSDirectory.<init>(FSDirectory.java:151)
      at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:132)
      at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:113)
      at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:151)
      at net.grcomputing.opencms.search.lucene.IndexManager.doIndex(Unknown Source)
      at net.grcomputing.opencms.search.lucene.CronIndexManager.launch(Unknown Source)
      at com.opencms.core.CmsCronScheduleJob.run(CmsCronScheduleJob.java:68)

 

My registry (many times changed - this is just one version):

 

<?xml version="1.0" encoding="ISO-8859-1"?>
<registry>
    <system>
        <luceneSearch>
            <!--
              - mergeFactor and permCheck are currently ignored.
              -->
            <mergeFactor>100000</mergeFactor>
            <permCheck>true</permCheck>

            <!--
              - directory in which lucene will store its indexes. Note: this is real
              - fs, not VFS.
              -->
            <indexDir>/opt/luceneindex</indexDir>
            <!-- <indexDir>F:\luceneindex\</indexDir> -->

            <!--
              - The analyzer is used for parsing documents. Choose one for your
              - language. If language is English, use the StandardAnalyzer.
              - There are additional analyzers at http://jakarta.apache.org/lucene
              -->
            <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
            <!-- <analyzer>org.apache.lucene.analysis.de.GermanAnalyzer</analyzer> -->

            <!--
              - If subsearch is true, subfolders will be searched by default.
              - This can be turned on/off per directory.
              -->
            <subsearch>true</subsearch>

            <!--
              - Name of the project to index. Online is recommended.
              -->
            <project>online</project>

            <!--
              - docFactories determine how documents are processed. Generally, one
              - docFactory exists for each type of content (viz. JSP, Page, Plain)
              - that you want to index.
              -->
            <docFactories>

                <!--
                  - This docFactory indexes documents with type page (e.g. HTML
                  - files edited with the WYSIWYG editor).
                  -->
                <docFactory enabled="true" type="page">
                    <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
                </docFactory>

                <!--
                  - This docFactory is a little more complex. It takes documents of
                  - type "plain" and determines, by extension, what class should be
                  - used to index each particular file. In this example, we want to
                  - index plain text files exactly as they are, but any files that
                  - contain tags need the tags stripped out before they are indexed.
                  -
                  - Note that the name="" attribute is simply for pretty output, and
                  - can contain any allowable PCDATA text.
                  -->
                <docFactory enabled="true" type="plain">
                    <fileType name="plaintext">
                        <extension>.txt</extension>
                        <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
                    </fileType>
                    <fileType name="taggedtext">
                        <extension>.html</extension>
                        <extension>.htm</extension>
                        <extension>.xml</extension>
                        <!-- This will strip tags before processing -->
                        <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
                    </fileType>
                </docFactory>

                <!--
                  - This will strip JSP tags and all scriptlets. IT WILL NOT RENDER THE
                  - JSP FIRST, as JSPs are, by nature, dynamic.
                  -
                  - Usually, this is off by default.
                  -->
                <docFactory enabled="false" type="jsp">
                    <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
                </docFactory>

                <!-- For the news module. Enable if you use news -->
                <docFactory enabled="false" type="news">
                    <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
                </docFactory>

                <!-- For the forum module. Enable if you use forums. -->
                <docFactory enabled="false" type="forum">
                    <class>de.wfnetz.opencms.modules.forum.ContributionDocument</class>
                </docFactory>

                <!-- If you need to index XML Template files (bad idea) use this: -->
                <docFactory enabled="false" type="XML Template"/>
            </docFactories>

            <!--
              - <directories/> determines which directories are indexed. By default,
              - the /system directory is never indexed, so it is safe to index root.
              -
              - If you want to specify only certain directories for indexing, create
              - one <directory/> entry per directory. Again, you may use subsearch to
              - override the default subsearch setting discussed above.
              -->
            <directories>
                <directory location="/">
                    <section>Root</section>
                    <subsearch>true</subsearch>
                </directory>
            </directories>

            <!--
              - Use this section to define specific contentDefinitions. Provided below
              - are entries for the news and forum modules.
              -->
            <contentDefinitions>
                <!--   <contentDefinition type="news">
                    <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
                    <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
                    <listMethod name="getNewsList">
                        <param type="java.lang.Integer">1</param>
                        <param type="java.lang.String">-1</param>
                    </listMethod>
                    <page uri="/news.html?__element=entry">
                        <param method="getIntId" name="newsid"/>
                    </page>
                </contentDefinition>
                <contentDefinition type="forum">
                    <class>de.wfnetz.opencms.modules.forum.ContributionContentDefinition</class>
                    <listMethod name="getSortedList">
                        <param type="java.lang.String"/>
                    </listMethod>
                    <page uri="/forum.html?forumtemplate=viewcontributionentry">
                        <param method="getId" name="conid"/>
                    </page>
                </contentDefinition> -->
            </contentDefinitions>
        </luceneSearch>
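
The active indexDir value in this registry is a Unix-style path, while the machine runs Windows XP. The following is a minimal sketch, assuming plain Java, that prints where the JVM resolves each candidate value. IndexDirResolve and resolve are hypothetical names, and the trim() call is only a precaution against stray whitespace inside the XML element. It does not reproduce how the search module itself reads the registry.

```java
import java.io.File;

/** Hypothetical helper: shows how the JVM resolves a configured path. */
final class IndexDirResolve {

    /** Returns the absolute form of the configured path on this machine. */
    static String resolve(String configured) {
        // trim() drops whitespace or line breaks around the element text
        return new File(configured.trim()).getAbsolutePath();
    }

    public static void main(String[] args) {
        // The two indexDir values that appear in the registry above.
        String[] candidates = { "/opt/luceneindex", "F:\\luceneindex\\" };
        for (int i = 0; i < candidates.length; i++) {
            System.out.println(candidates[i] + " -> " + resolve(candidates[i]));
        }
    }
}
```

The output depends on the working directory and, on Windows, on the current drive of the process, so it is only comparable to what Tomcat sees when started from the same location.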

 

 

 

 

Arthur Visser
Managing Director

I.C.E. Europe bvba
Jachthoornlaan 42
B-1170 Brussels, Belgium
Mobile :  +32-(0)477-341850
Fax. :    +32-(0)2-675.8374
<http://www.ice-eu.com>

 

           

 
