[opencms-dev] Developed an XML Indexer for Lucene but getting error
Alex !
kingofkingston at hotmail.com
Sun Mar 7 15:37:02 CET 2004
Hi,
This one's probably for Matt/Stefan.
I have written an XML indexer for the Lucene module (almost finished). It
takes an XML file, parses it, and adds its elements and their contents to
the Lucene index, instead of stripping the element tags and indexing the
remaining content as a single searchable body (as is currently available).
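To illustrate the idea, here is a minimal sketch of the parsing step only. It
is not the actual indexer source: the class name XmlFieldSketch and the method
extractFields are hypothetical, it uses only the JDK SAX parser and modern
Java syntax, and it leaves out the step of turning each name/value pair into
a Lucene field. As a simplification, text that appears before a child element
in mixed content is dropped, and empty elements are skipped.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: collects the text of each element under its name,
// so that each entry could later become one searchable field.
class XmlFieldSketch {

    static Map<String, String> extractFields(String xml) throws Exception {
        final Map<String, String> fields = new LinkedHashMap<>();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Start collecting text afresh for the element just opened.
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                // Skip elements with no text so no empty field is produced.
                if (!value.isEmpty()) {
                    // Repeated element names are joined into one value.
                    fields.merge(qName, value, (a, b) -> a + " " + b);
                }
                text.setLength(0);
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return fields;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> fields = extractFields(
                "<doc><title>Hello</title><body>Some text</body></doc>");
        System.out.println(fields); // prints {title=Hello, body=Some text}
    }
}
```

Each map entry would then correspond to one field in the index, keyed by
element name, rather than a single body field for the whole file.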
Everything now compiles (into a separate jar, just 2 class files) and the
cron job runs, but it gives the following error:
[07.03.2004 14:20:10] <opencms_cronscheduler> Starting job for
com.opencms.core.CmsCronEntry{20 14 * * * admin Administrators
net.grcomputing.opencms.search.lucene.CronIndexManager
createIndex=true,registry=C:/dev/java/tomcat-4.1.27/webapps/opencms/WEB-INF/config/uk_lucene_registry.xml}
[07.03.2004 14:20:10] <opencms_info>
=====IndexManager=============================================================
[07.03.2004 14:20:10] <opencms_info> Analyzer:
org.apache.lucene.analysis.standard.StandardAnalyzer
[07.03.2004 14:20:10] <opencms_info> Extension map exists to handle XML
[07.03.2004 14:20:10] <opencms_info> Page DocumentFactory loaded
[07.03.2004 14:20:10] <opencms_info> IndexManager: indexing /test/
[07.03.2004 14:20:11] <opencms_info> Created XMLDocumentHandlerSAX
[07.03.2004 14:20:11] <opencms_info> Return Document
[07.03.2004 14:20:11] <opencms_cronscheduler> Error running job for
com.opencms.core.CmsCronEntry{20 14 * * * admin Administrators
net.grcomputing.opencms.search.lucene.CronIndexManager
createIndex=true,registry=C:/dev/java/tomcat-4.1.27/webapps/opencms/WEB-INF/config/epfolio_uk_lucene_registry.xml}
Error: java.lang.NullPointerException
    at org.apache.lucene.index.FieldInfos.add(FieldInfos.java:90)
    at org.apache.lucene.index.DocumentWriter.addDocument(DocumentWriter.java:92)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:257)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:244)
    at net.grcomputing.opencms.search.lucene.IndexManager.processFile(Unknown Source)
    at net.grcomputing.opencms.search.lucene.IndexManager.processDir(Unknown Source)
    at net.grcomputing.opencms.search.lucene.IndexManager.doIndex(Unknown Source)
    at net.grcomputing.opencms.search.lucene.CronIndexManager.launch(Unknown Source)
    at com.opencms.core.CmsCronScheduleJob.run(CmsCronScheduleJob.java:68)
My registry entry for the XML files looks like this (contained in an external
registry file):
<!-- For XML Files :) -->
<docFactory enabled="true" type="plain">
    <fileType name="XML">
        <extension>.xml</extension>
        <class>com.mydomain.opencms.lucene.xmlindexing.XMLDocument</class>
    </fileType>
</docFactory>
Your help would be much appreciated.
(Should I send you the source to correct and include in your next
patch/update?)
Many Thanks
Alex