<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1276" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>As I tried to point out earlier, make sure that
you use the new registry.xml format. The tags "plainDocFactory",
"jspDocFactory" and "pageDocFactory" are obsolete and no longer used. You have
to replace them with:</FONT></DIV>
<DIV><FONT face=Arial size=2><docFactory type="plain"
enabled="true"></FONT></DIV>
<DIV><FONT face=Arial size=2><docFactory type="page"
enabled="true"></FONT></DIV>
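<DIV><FONT face=Arial size=2>A rough sketch of how the converted docFactories
block might look. It assumes that only the wrapper elements change and that the
child elements (fileType, extension, class) can be carried over unchanged from
the old entries, which is a guess based on the existing "binary" entry and not
something confirmed here. No replacement for "jspDocFactory" is named above, so
that entry is left out of the sketch. This would also fit the IndexManager log,
which reports an extension map only for doctext.</FONT></DIV>
<PRE>
<docFactories>
  <!-- Assumption: children copied as-is from the old pageDocFactory entry -->
  <docFactory type="page" enabled="true">
    <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
  </docFactory>
  <!-- Assumption: children copied as-is from the old plainDocFactory entry -->
  <docFactory type="plain" enabled="true">
    <fileType name="plaintext">
      <extension>.txt</extension>
      <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
    </fileType>
    <fileType name="taggedtext">
      <extension>.html</extension>
      <extension>.htm</extension>
      <extension>.xml</extension>
      <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
    </fileType>
  </docFactory>
  <!-- Already in the new format in the posted registry.xml; unchanged -->
  <docFactory type="binary" enabled="true">
    <fileType name="doctext">
      <extension>.doc</extension>
      <extension>.dot</extension>
      <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
    </fileType>
  </docFactory>
</docFactories>
</PRE>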
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Regards,</FONT></DIV>
<DIV><FONT face=Arial size=2>Stephan</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<BLOCKQUOTE
style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">
<DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>
<DIV
style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B>
<A title=dattaritwik@yahoo.com href="mailto:dattaritwik@yahoo.com">Ritwik
Datta</A> </DIV>
<DIV style="FONT: 10pt arial"><B>To:</B> <A title=opencms-dev@opencms.org
href="mailto:opencms-dev@opencms.org">opencms-dev@opencms.org</A> </DIV>
<DIV style="FONT: 10pt arial"><B>Sent:</B> Thursday, January 22, 2004 6:45
AM</DIV>
<DIV style="FONT: 10pt arial"><B>Subject:</B> [opencms-dev] OpenCMSLucene 1.4
search: Word doc indexing is done but not for html/txt</DIV>
<DIV><BR></DIV>
<DIV>Dear All,</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>I have compiled the opencmslucene 1.4 source from the sourceforge.net CVS
repository, and I am now able to index Word documents. However, files with
other extensions, such as .html and .txt, are not being indexed, although they
were indexed with the lucene module 1.3 for opencms. My registry.xml does
contain entries for PlainDocument, TaggedPlainDocument and, of course,
WordDocument, but the Index manager does not take any files other than Word
documents into consideration.</DIV>
<DIV>Earlier I had opencmslucene 1.3. To upgrade, I downloaded all the Java
files from the latest CVS, compiled them and uploaded the classes under
$TOMCAT_HOME/webapps/opencms/WEB-INF/classes/net/grcomputing/opencms/search/lucene,
and put jakarta-poi-1.9.0-dev-20030109.jar and tm-extractors-0.2.jar under the
$TOMCAT_HOME/webapps/opencms/WEB-INF/lib folder.</DIV>
<DIV>I am pasting the relevant contents of my registry.xml and the log entries
of the Index manager below. I need html/txt indexing as well. Please help me.
This is urgent.</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV><luceneSearch><BR>
<mergeFactor>100000</mergeFactor><BR>
<permCheck>true</permCheck><BR>
<indexDir>/opt/lucene/index/opencms/</indexDir><BR>
<analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer><BR>
<subsearch>true</subsearch><BR>
<project>online</project><BR>
<docFactories><BR>
<pageDocFactory
enabled="true"><BR>
<class>net.grcomputing.opencms.search.lucene.PageDocument</class><BR>
</pageDocFactory><BR>
<plainDocFactory
enabled="true"><BR>
<fileType
name="plaintext"><BR>
<extension>.txt</extension><BR>
<class>net.grcomputing.opencms.search.lucene.PlainDocument</class><BR>
</fileType><BR>
<fileType
name="taggedtext"><BR>
<extension>.html</extension><BR>
<extension>.htm</extension><BR>
<extension>.xml</extension><BR>
<!-- This will strip tags before processing
--><BR>
<class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class><BR>
</fileType><BR>
</plainDocFactory><BR>
<docFactory
type="binary" enabled="true"><BR>
<fileType
name="doctext"><BR>
<extension>.doc</extension><BR>
<extension>.dot</extension><BR>
<class>net.grcomputing.opencms.search.lucene.WordDocument</class><BR>
</fileType><BR>
</docFactory><BR>
<jspDocFactory
enabled="true"><BR>
<class>net.grcomputing.opencms.search.lucene.JspDocument</class><BR>
</jspDocFactory><BR>
<xmlTemplateDocFactory
enabled="false"/><BR>
</docFactories><BR>
<directories><BR>
<directory
location="/release/"><BR>
<section>Test</section><BR>
<subsearch>true</subsearch><BR>
</directory><BR>
</directories><BR>
</luceneSearch></DIV>
<DIV> </DIV>
<DIV>=====IndexManager=============================================================<BR>[22.01.2004
09:46:10] <opencms_info> Analyzer:
org.apache.lucene.analysis.standard.StandardAnalyzer<BR>[22.01.2004 09:46:10]
<opencms_info> Extension map exists to handle doctext<BR>[22.01.2004
09:46:10] <opencms_info> IndexManager: indexing /release/<BR>[22.01.2004
09:46:10] <opencms_info> IndexManager: indexing
/release/spdb/<BR>[22.01.2004 09:46:10] <opencms_info> IndexManager:
indexing /release/spdb/Assessment_Findings/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing
/release/spdb/Best_Practices/<BR>[22.01.2004 09:46:10] <opencms_info>
IndexManager: indexing /release/spdb/Business_Goals/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing
/release/spdb/CMC_Product_Information/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing
/release/spdb/CMM_Action_Plans/<BR>[22.01.2004 09:46:10] <opencms_info>
IndexManager: indexing /release/spdb/Coding_Standard/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing
/release/spdb/Dashboard/<BR>[22.01.2004 09:46:10] <opencms_info>
IndexManager: indexing /release/spdb/Defect_Prevention/<BR>[22.01.2004
09:46:10] <opencms_info> IndexManager: indexing
/release/spdb/ER_SI_Organisation_Structure/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing
/release/spdb/Estimation/<BR>[22.01.2004 09:46:10] <opencms_info>
IndexManager: indexing /release/spdb/Expert_List/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing /release/spdb/FAQ/<BR>[22.01.2004
09:46:10] <opencms_info> IndexManager: indexing
/release/spdb/IGC_OSSP_Role_Mapping/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing
/release/spdb/Metrics_and_Measurements/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing /release/spdb/OQPM/<BR>[22.01.2004
09:46:10] <opencms_info> IndexManager: indexing
/release/spdb/OSSP/<BR>[22.01.2004 09:46:10] <opencms_info>
IndexManager: indexing /release/spdb/Presentation_Library/<BR>[22.01.2004
09:46:10] <opencms_info> IndexManager: indexing
/release/spdb/Process_Change_Management/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing
/release/spdb/Projectwise_Plans/<BR>[22.01.2004 09:46:10] <opencms_info>
IndexManager: indexing /release/spdb/PROMPT/<BR>[22.01.2004 09:46:10]
<opencms_info> IndexManager: indexing
/release/spdb/Readables/<BR>[22.01.2004 09:46:10] <opencms_info>
IndexManager: indexing /release/spdb/Sample_CMM_Documents/<BR>[22.01.2004
09:46:10] <opencms_info> IndexManager: indexing
/release/spdb/SCM/<BR>[22.01.2004 09:46:11] <opencms_info> IndexManager:
indexing /release/spdb/SEPG/<BR>[22.01.2004 09:46:11] <opencms_info>
IndexManager: indexing /release/spdb/SPDB_Notes/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing
/release/spdb/SPDB_Search/<BR>[22.01.2004 09:46:11] <opencms_info>
IndexManager: indexing /release/spdb/SQA/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing /release/spdb/TCM/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Notes/<BR>[22.01.2004 09:46:11] <opencms_info>
IndexManager: indexing /release/spdb/TCM/Others/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Data/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Bilingual_2-tier_Application_to_3-tier_Conversion/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Citrix/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Compilation_Problem/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Driver_Installation/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/FTP_Service_on_Linux/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Hindi_Email/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Hindi_Integration_Development_Guidelines/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/HW_Requirement_for_Oracle9i_9iDS_9iASR2/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_9i_Application_Server_Release2_Installation/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_Forms9i_to_Forms6i_Conversion/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_Froms6i_Deployment_on_9iAS/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/ORARRP_Reusable_Components/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/OS_Problem/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Asset_Details/Red_Hat_Advance_Server_Installation/<BR>[22.01.2004
09:46:11] <opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Project_Info/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Register/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing
/release/spdb/TCM/Reusable_Assets/Training_Materials/<BR>[22.01.2004 09:46:11]
<opencms_info> IndexManager: indexing
/release/spdb/TCM/TCM_Plans/<BR>[22.01.2004 09:46:11] <opencms_info>
IndexManager: indexing /release/spdb/TCM/Templates/<BR>[22.01.2004 09:46:12]
<opencms_info> IndexManager: indexing
/release/spdb/Timesheet/<BR>[22.01.2004 09:46:12] <opencms_info>
IndexManager: indexing /release/spdb/Training/<BR>[22.01.2004 09:46:12]
<opencms_info> IndexManager: 4 documents are being
processed<BR>[22.01.2004 09:46:13] <opencms_info> IndexManager:
Index has been optimized.<BR>[22.01.2004 09:46:13] <opencms_info>
Done<BR></DIV>
</BLOCKQUOTE></BODY></HTML>