<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1276" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>As I tried to point out, please make sure that you use the 
new registry.xml format. The tags "plainDocFactory", "jspDocFactory" and 
"pageDocFactory" are obsolete and no longer used. You have to replace them 
with the new docFactory tags, for example:</FONT></DIV>
<DIV><FONT face=Arial size=2>&lt;docFactory type="plain" 
enabled="true"&gt;</FONT></DIV>
<DIV><FONT face=Arial size=2>&lt;docFactory type="page" 
enabled="true"&gt;</FONT></DIV>
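<DIV><FONT face=Arial size=2>A rough sketch of the migrated entries, assuming 
the new docFactory elements keep the same fileType, extension and class children 
that the type="binary" entry in the quoted registry.xml already uses. The child 
layout is an assumption and is not confirmed here; the class names are taken 
from the quoted configuration. The replacement for jspDocFactory is not given in 
this message, so it is left out.</FONT></DIV>
<PRE>&lt;docFactories&gt;
    &lt;!-- Assumed layout, modelled on the existing type="binary" entry --&gt;
    &lt;docFactory type="page" enabled="true"&gt;
        &lt;class&gt;net.grcomputing.opencms.search.lucene.PageDocument&lt;/class&gt;
    &lt;/docFactory&gt;
    &lt;docFactory type="plain" enabled="true"&gt;
        &lt;fileType name="plaintext"&gt;
            &lt;extension&gt;.txt&lt;/extension&gt;
            &lt;class&gt;net.grcomputing.opencms.search.lucene.PlainDocument&lt;/class&gt;
        &lt;/fileType&gt;
        &lt;fileType name="taggedtext"&gt;
            &lt;extension&gt;.html&lt;/extension&gt;
            &lt;extension&gt;.htm&lt;/extension&gt;
            &lt;extension&gt;.xml&lt;/extension&gt;
            &lt;class&gt;net.grcomputing.opencms.search.lucene.TaggedPlainDocument&lt;/class&gt;
        &lt;/fileType&gt;
    &lt;/docFactory&gt;
    &lt;!-- The existing type="binary" docFactory stays as it is --&gt;
&lt;/docFactories&gt;</PRE>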
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Regards,</FONT></DIV>
<DIV><FONT face=Arial size=2>Stephan</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<BLOCKQUOTE 
style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">
  <DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>
  <DIV 
  style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B> 
  <A title=dattaritwik@yahoo.com href="mailto:dattaritwik@yahoo.com">Ritwik 
  Datta</A> </DIV>
  <DIV style="FONT: 10pt arial"><B>To:</B> <A title=opencms-dev@opencms.org 
  href="mailto:opencms-dev@opencms.org">opencms-dev@opencms.org</A> </DIV>
  <DIV style="FONT: 10pt arial"><B>Sent:</B> Thursday, January 22, 2004 6:45 
  AM</DIV>
  <DIV style="FONT: 10pt arial"><B>Subject:</B> [opencms-dev] OpenCMSLucene 1.4 
  search: Word doc indexing is done but not for html/txt</DIV>
  <DIV><BR></DIV>
  <DIV>Dear All,</DIV>
  <DIV> </DIV>
  <DIV> </DIV>
  <DIV>I have compiled the opencmslucene 1.4 source from the sourceforge.net CVS 
  repository. Now I am able to index Word documents. However, I noticed that 
  files with other extensions, such as .html and .txt, are not being indexed. 
  This worked with the Lucene module 1.3 for OpenCms. My registry.xml does contain 
  entries for PlainDocument, TaggedPlainDocument and of course WordDocument, but 
  the index manager does not take any files other than Word documents into 
  consideration.</DIV>
  <DIV>Earlier I had opencmslucene 1.3. To upgrade, I downloaded all the 
  Java files from the latest CVS, compiled them and uploaded the classes under 
  $TOMCAT_HOME/webapps/opencms/WEB-INF/classes/net/grcomputing/opencms/search/lucene, 
  and put jakarta-poi-1.9.0-dev-20030109.jar and tm-extractors-0.2.jar in the 
  $TOMCAT_HOME/webapps/opencms/WEB-INF/lib folder.</DIV>
  <DIV>I am pasting the relevant contents of my registry.xml and the log 
  entries of the index manager below. I need html/txt indexing as well. Please 
  help me; this is urgent.</DIV>
  <DIV> </DIV>
  <DIV> </DIV>
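  <DIV>&lt;luceneSearch&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;mergeFactor&gt;100000&lt;/mergeFactor&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;permCheck&gt;true&lt;/permCheck&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;indexDir&gt;/opt/lucene/index/opencms/&lt;/indexDir&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;analyzer&gt;org.apache.lucene.analysis.standard.StandardAnalyzer&lt;/analyzer&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;subsearch&gt;true&lt;/subsearch&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;project&gt;online&lt;/project&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;docFactories&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;pageDocFactory enabled="true"&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;class&gt;net.grcomputing.opencms.search.lucene.PageDocument&lt;/class&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/pageDocFactory&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;plainDocFactory enabled="true"&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;fileType name="plaintext"&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;extension&gt;.txt&lt;/extension&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;class&gt;net.grcomputing.opencms.search.lucene.PlainDocument&lt;/class&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/fileType&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;fileType name="taggedtext"&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;extension&gt;.html&lt;/extension&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;extension&gt;.htm&lt;/extension&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;extension&gt;.xml&lt;/extension&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;!-- This will strip tags before processing --&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;class&gt;net.grcomputing.opencms.search.lucene.TaggedPlainDocument&lt;/class&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/fileType&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/plainDocFactory&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;docFactory type="binary" enabled="true"&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;fileType name="doctext"&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;extension&gt;.doc&lt;/extension&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;extension&gt;.dot&lt;/extension&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;class&gt;net.grcomputing.opencms.search.lucene.WordDocument&lt;/class&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/fileType&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/docFactory&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;jspDocFactory enabled="true"&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;class&gt;net.grcomputing.opencms.search.lucene.JspDocument&lt;/class&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/jspDocFactory&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;xmlTemplateDocFactory enabled="false"/&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;/docFactories&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;directories&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;directory location="/release/"&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;section&gt;Test&lt;/section&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;subsearch&gt;true&lt;/subsearch&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/directory&gt;<BR>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;/directories&gt;<BR>
  &lt;/luceneSearch&gt;</DIV>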
  <DIV> </DIV>
  <DIV>=====IndexManager=============================================================<BR>[22.01.2004 
  09:46:10] <opencms_info> Analyzer: 
  org.apache.lucene.analysis.standard.StandardAnalyzer<BR>[22.01.2004 09:46:10] 
  <opencms_info> Extension map exists to handle doctext<BR>[22.01.2004 
  09:46:10] <opencms_info> IndexManager: indexing /release/<BR>[22.01.2004 
  09:46:10] <opencms_info> IndexManager: indexing 
  /release/spdb/<BR>[22.01.2004 09:46:10] <opencms_info> IndexManager: 
  indexing /release/spdb/Assessment_Findings/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/Best_Practices/<BR>[22.01.2004 09:46:10] <opencms_info> 
  IndexManager: indexing /release/spdb/Business_Goals/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/CMC_Product_Information/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/CMM_Action_Plans/<BR>[22.01.2004 09:46:10] <opencms_info> 
  IndexManager: indexing /release/spdb/Coding_Standard/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/Dashboard/<BR>[22.01.2004 09:46:10] <opencms_info> 
  IndexManager: indexing /release/spdb/Defect_Prevention/<BR>[22.01.2004 
  09:46:10] <opencms_info> IndexManager: indexing 
  /release/spdb/ER_SI_Organisation_Structure/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/Estimation/<BR>[22.01.2004 09:46:10] <opencms_info> 
  IndexManager: indexing /release/spdb/Expert_List/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing /release/spdb/FAQ/<BR>[22.01.2004 
  09:46:10] <opencms_info> IndexManager: indexing 
  /release/spdb/IGC_OSSP_Role_Mapping/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/Metrics_and_Measurements/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing /release/spdb/OQPM/<BR>[22.01.2004 
  09:46:10] <opencms_info> IndexManager: indexing 
  /release/spdb/OSSP/<BR>[22.01.2004 09:46:10] <opencms_info> 
  IndexManager: indexing /release/spdb/Presentation_Library/<BR>[22.01.2004 
  09:46:10] <opencms_info> IndexManager: indexing 
  /release/spdb/Process_Change_Management/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/Projectwise_Plans/<BR>[22.01.2004 09:46:10] <opencms_info> 
  IndexManager: indexing /release/spdb/PROMPT/<BR>[22.01.2004 09:46:10] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/Readables/<BR>[22.01.2004 09:46:10] <opencms_info> 
  IndexManager: indexing /release/spdb/Sample_CMM_Documents/<BR>[22.01.2004 
  09:46:10] <opencms_info> IndexManager: indexing 
  /release/spdb/SCM/<BR>[22.01.2004 09:46:11] <opencms_info> IndexManager: 
  indexing /release/spdb/SEPG/<BR>[22.01.2004 09:46:11] <opencms_info> 
  IndexManager: indexing /release/spdb/SPDB_Notes/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/SPDB_Search/<BR>[22.01.2004 09:46:11] <opencms_info> 
  IndexManager: indexing /release/spdb/SQA/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing /release/spdb/TCM/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Notes/<BR>[22.01.2004 09:46:11] <opencms_info> 
  IndexManager: indexing /release/spdb/TCM/Others/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Data/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Bilingual_2-tier_Application_to_3-tier_Conversion/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Citrix/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Compilation_Problem/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Driver_Installation/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/FTP_Service_on_Linux/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Hindi_Email/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Hindi_Integration_Development_Guidelines/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/HW_Requirement_for_Oracle9i_9iDS_9iASR2/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_9i_Application_Server_Release2_Installation/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_Forms9i_to_Forms6i_Conversion/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_Froms6i_Deployment_on_9iAS/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/ORARRP_Reusable_Components/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/OS_Problem/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Asset_Details/Red_Hat_Advance_Server_Installation/<BR>[22.01.2004 
  09:46:11] <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Project_Info/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Register/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/Reusable_Assets/Training_Materials/<BR>[22.01.2004 09:46:11] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/TCM/TCM_Plans/<BR>[22.01.2004 09:46:11] <opencms_info> 
  IndexManager: indexing /release/spdb/TCM/Templates/<BR>[22.01.2004 09:46:12] 
  <opencms_info> IndexManager: indexing 
  /release/spdb/Timesheet/<BR>[22.01.2004 09:46:12] <opencms_info> 
  IndexManager: indexing /release/spdb/Training/<BR>[22.01.2004 09:46:12] 
  <opencms_info> IndexManager: 4 documents are being 
  processed<BR>[22.01.2004 09:46:13] <opencms_info> IndexManager:  
  Index has been optimized.<BR>[22.01.2004 09:46:13] <opencms_info> 
  Done<BR></DIV>
</BLOCKQUOTE></BODY></HTML>