[opencms-dev] Developed an XML Indexer for Lucene but getting error

Alex ! kingofkingston at hotmail.com
Sun Mar 7 15:37:02 CET 2004


Hi,

This one's probably for Matt/Stefan.

I have written an XML indexer for the Lucene module (almost finished), which 
will take an XML file, parse it, and then add its elements and their 
contents to the Lucene index, instead of stripping the element tags and 
then indexing the remaining content as a single searchable body (as is 
currently available).
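
To illustrate the idea, here is a minimal sketch (not the actual indexer 
source) of a SAX handler that collects each element's text under the 
element name. Each resulting name/value pair would then be added to the 
Lucene Document as its own field; that step is left as a comment. The class 
name ElementFieldCollector and its parse helper are hypothetical, and the 
sketch assumes simple, non-mixed content.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: maps element names to their text content.
class ElementFieldCollector extends DefaultHandler {
    // Element name -> text, in document order.
    private final Map<String, String> fields = new LinkedHashMap<>();
    // Text seen since the most recent start or end tag.
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Simplification: text that precedes a child element is discarded.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one text node in several chunks, so accumulate.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            // Repeated elements are joined with a space under the same name.
            // In the indexer, this is where a field named qName with this
            // value would be added to the Lucene Document.
            fields.merge(qName, value, (a, b) -> a + " " + b);
        }
        text.setLength(0);
    }

    // Parses an XML string and returns the collected name/value pairs.
    static Map<String, String> parse(String xml) throws Exception {
        ElementFieldCollector handler = new ElementFieldCollector();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.fields;
    }
}
```

Elements with no text of their own (such as a wrapper element that only 
contains children) produce no entry, so no empty value is ever handed on 
to the index.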

Everything now compiles (into a separate jar, just 2 class files) and the 
cron job runs, but it gives the following error:

[07.03.2004 14:20:10] <opencms_cronscheduler> Starting job for 
com.opencms.core.CmsCronEntry{20 14 * * * admin Administrators 
net.grcomputing.opencms.search.lucene.CronIndexManager 
createIndex=true,registry=C:/dev/java/tomcat-4.1.27/webapps/opencms/WEB-INF/config/uk_lucene_registry.xml}
[07.03.2004 14:20:10] <opencms_info>
=====IndexManager=============================================================
[07.03.2004 14:20:10] <opencms_info> Analyzer: 
org.apache.lucene.analysis.standard.StandardAnalyzer
[07.03.2004 14:20:10] <opencms_info> Extension map exists to handle XML
[07.03.2004 14:20:10] <opencms_info> Page DocumentFactory loaded
[07.03.2004 14:20:10] <opencms_info> IndexManager: indexing /test/
[07.03.2004 14:20:11] <opencms_info> Created XMLDocumentHandlerSAX
[07.03.2004 14:20:11] <opencms_info> Return Document
[07.03.2004 14:20:11] <opencms_cronscheduler> Error running job for 
com.opencms.core.CmsCronEntry{20 14 * * * admin Administrators 
net.grcomputing.opencms.search.lucene.CronIndexManager 
createIndex=true,registry=C:/dev/java/tomcat-4.1.27/webapps/opencms/WEB-INF/config/epfolio_uk_lucene_registry.xml} 
Error: java.lang.NullPointerException
	at org.apache.lucene.index.FieldInfos.add(FieldInfos.java:90)
	at org.apache.lucene.index.DocumentWriter.addDocument(DocumentWriter.java:92)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:257)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:244)
	at net.grcomputing.opencms.search.lucene.IndexManager.processFile(Unknown Source)
	at net.grcomputing.opencms.search.lucene.IndexManager.processDir(Unknown Source)
	at net.grcomputing.opencms.search.lucene.IndexManager.doIndex(Unknown Source)
	at net.grcomputing.opencms.search.lucene.CronIndexManager.launch(Unknown Source)
	at com.opencms.core.CmsCronScheduleJob.run(CmsCronScheduleJob.java:68)


My registry entry for the XML files looks like this (contained in an 
external registry file):

       <!-- For XML Files :) -->
       <docFactory enabled="true" type="plain">
          <fileType name="XML">
            <extension>.xml</extension>
            <class>com.mydomain.opencms.lucene.xmlindexing.XMLDocument</class>
          </fileType>
       </docFactory>

Your help would be much appreciated.

(Should I send you the source to correct and include in your next 
patch/update?)

Many Thanks

Alex
