[opencms-dev] Problem: docFactory in registry.xml for lucene word .doc file search:urgent

Ritwik Datta dattaritwik at yahoo.com
Wed Jan 21 10:55:02 CET 2004


OK, thanks. But tell me one thing: the link you gave has no entry registering a class for Word and PDF document search. I mean, registry.xml should have entries for the .doc and .pdf extensions, right?

"Hartmann, Waehrisch & Feykes GmbH" <hartmann at waehrisch-feykes.de> wrote:
There was a redesign between 1.3 and the (unofficial) 1.4. The CVS is based on this 1.4, so you have to make sure that your registry looks like the sample registry: http://www.aleph-null.tv/downloads/contribs/beffe/registry.txt
Also compile all files from the CVS and copy them to your classes folder.
 
Bye,
Stephan
 
----- Original Message ----- 
From: Ritwik Datta 
To: opencms-dev at opencms.org 
Sent: Wednesday, January 21, 2004 10:11 AM
Subject: Re: [opencms-dev] Problem: docFactory in registry.xml for lucene word .doc file search:urgent


Dear Stephan,
 
I have imported net.grcomputing.opencms.search.lucene_1.3.zip. That version was missing the search index facility for Word and PDF documents, so I downloaded those Java files, compiled them, and copied them under $TOMCAT-HOME/webapps/opencms/WEB-INF/classes/net/grcomputing/opencms/search/lucene.
I restarted Tomcat after that, but with no result. Please help me.
Regards,
Ritwik
 
 
 
"Hartmann, Waehrisch & Feykes GmbH" <hartmann at waehrisch-feykes.de> wrote:
Which version of the module did you use before? Did you copy only those files to classes, or all of them? Did you restart Tomcat?
 
Regards,
Stephan
----- Original Message ----- 
From: Ritwik Datta 
To: opencms-dev at opencms.org 
Sent: Wednesday, January 21, 2004 7:16 AM
Subject: [opencms-dev] Problem: docFactory in registry.xml for lucene word .doc file search:urgent



Dear All,

I have compiled the OpenCms Lucene source from the CVS repository. I now have WordDocument.class and I_Documentfactory.class in the net.grcomputing.opencms.search.lucene package. I uploaded those files to $TOMCAT-HOME/webapps/opencms/WEB-INF/classes/net/grcomputing/opencms/search/lucene. I also uploaded the third-party tm-extractors-0.2.jar to $TOMCAT-HOME/webapps/opencms/WEB-INF/lib/.

I then changed <docFactory> in registry.xml to enable Lucene search of Word .doc files. Here is the relevant segment of registry.xml:

<docFactories>......

<docFactory type="binary" enabled="true">
    <fileType name="doctext">
        <extension>.doc</extension>
        <extension>.dot</extension>
        <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
    </fileType>
</docFactory>

..........

</docFactories>
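
The fragment above can be sanity-checked outside OpenCms. What follows is only a minimal Java sketch, not the module's own loading code: it assumes nothing beyond the element names shown in the fragment (fileType, extension, class), and the class name DocFactoryCheck and its two methods are hypothetical. It lists which class each extension maps to and reports whether that class is visible on the classpath the sketch is run with.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of OpenCms or the Lucene module.
final class DocFactoryCheck {

    /** Maps each <extension> to the <class> named in the same <fileType>. */
    static Map<String, String> extensionToClass(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> result = new LinkedHashMap<>();
        NodeList fileTypes = doc.getElementsByTagName("fileType");
        for (int i = 0; i < fileTypes.getLength(); i++) {
            Element fileType = (Element) fileTypes.item(i);
            NodeList classes = fileType.getElementsByTagName("class");
            if (classes.getLength() == 0) {
                continue; // no handler class declared for this file type
            }
            String className = classes.item(0).getTextContent().trim();
            NodeList extensions = fileType.getElementsByTagName("extension");
            for (int j = 0; j < extensions.getLength(); j++) {
                result.put(extensions.item(j).getTextContent().trim(), className);
            }
        }
        return result;
    }

    /** True if the named class can be found on the current classpath. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, DocFactoryCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Same content as the fragment in the post, with the elided parts left out.
        String xml =
              "<docFactories>"
            + "<docFactory type=\"binary\" enabled=\"true\">"
            + "<fileType name=\"doctext\">"
            + "<extension>.doc</extension>"
            + "<extension>.dot</extension>"
            + "<class>net.grcomputing.opencms.search.lucene.WordDocument</class>"
            + "</fileType>"
            + "</docFactory>"
            + "</docFactories>";
        for (Map.Entry<String, String> e : extensionToClass(xml).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue()
                    + " (loadable: " + isLoadable(e.getValue()) + ")");
        }
    }
}
```

To make the loadability check meaningful, the sketch would have to be run with WEB-INF/classes and the jars in WEB-INF/lib on its classpath. A "loadable: false" there would point to a deployment problem rather than a registry problem.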

Now when I run the cron scheduler, indexing succeeds, but there is no trace of my .doc files being indexed. I also checked from simple_search.jsp: it does not return the URLs of my Word documents even when the search criteria match. I am attaching the index manager logs. They show the Page DocumentFactory, JSP DocumentFactory, and Plain DocumentFactory being loaded, but not my Word document factory. I think I am missing something. Can anyone tell me the catch? It is pretty urgent. Please help me.

=====IndexManager=============================================================
[21.01.2004 11:36:10] <opencms_info> Analyzer: org.apache.lucene.analysis.standard.StandardAnalyzer
[21.01.2004 11:36:10] <opencms_info> Page DocumentFactory loaded
[21.01.2004 11:36:10] <opencms_info> JSP DocumentFactory loaded
[21.01.2004 11:36:10] <opencms_info> Plain DocumentFactory loaded
[21.01.2004 11:36:10] <opencms_info> Extension map exists to handle plaintext
[21.01.2004 11:36:10] <opencms_info> Extension map exists to handle taggedtext
[21.01.2004 11:36:10] <opencms_info> IndexManager: indexing /release/
[21.01.2004 11:36:10] <opencms_info> IndexManager: indexing /release/spdb/
[21.01.2004 11:36:10] <opencms_info> IndexManager: indexing /release/spdb/Assessment_Findings/
[21.01.2004 11:36:10] <opencms_info> IndexManager: indexing /release/spdb/Best_Practices/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Business_Goals/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/CMC_Product_Information/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/CMM_Action_Plans/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Coding_Standard/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Dashboard/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Defect_Prevention/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/ER_SI_Organisation_Structure/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Estimation/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Expert_List/
[21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/FAQ/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/IGC_OSSP_Role_Mapping/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Metrics_and_Measurements/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/OQPM/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/OSSP/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Presentation_Library/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Process_Change_Management/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Projectwise_Plans/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/PROMPT/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Readables/
[21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Sample_CMM_Documents/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SCM/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SEPG/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SPDB_Notes/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SPDB_Search/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SQA/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/TCM/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/TCM/Notes/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/TCM/Others/
[21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Data/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Bilingual_2-tier_Application_to_3-tier_Conversion/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Citrix/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Compilation_Problem/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Driver_Installation/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/FTP_Service_on_Linux/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Hindi_Email/
[21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Hindi_Integration_Development_Guidelines/
[21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/HW_Requirement_for_Oracle9i_9iDS_9iASR2/
[21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_9i_Application_Server_Release2_Installation/
[21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_Forms9i_to_Forms6i_Conversion/
[21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_Froms6i_Deployment_on_9iAS/
[21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/ORARRP_Reusable_Components/
[21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/OS_Problem/
[21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Red_Hat_Advance_Server_Installation/
[21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Project_Info/
[21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Register/
[21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Training_Materials/
[21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/TCM/TCM_Plans/
[21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/TCM/Templates/
[21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/Timesheet/
[21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/Training/
[21.01.2004 11:36:17] <opencms_info> IndexManager: 55 documents are being processed
[21.01.2004 11:36:17] <opencms_info> IndexManager:  Index has been optimized.
[21.01.2004 11:36:17] <opencms_info> Done
=====IndexManager=============================================================
[21.01.2004 11:36:17] <opencms_cronscheduler> Successful launch of job com.opencms.core.CmsCronEntry{36 11 * * * Admin Administrators net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true} Message: CronIndexManager rebuilt the Lucene index on Wed Jan 21 11:36:17 IST 2004
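
The observation that only three factories appear in the log can be checked mechanically. The following minimal Java sketch assumes the two message shapes seen in the log above ("... DocumentFactory loaded" and "Extension map exists to handle ..."). The class name IndexLogScan and its methods are hypothetical and are not part of the Lucene module.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that summarizes which factories an index run reported.
final class IndexLogScan {

    // Matches e.g. "<opencms_info> Page DocumentFactory loaded"
    private static final Pattern LOADED =
            Pattern.compile("<opencms_info> (.+) DocumentFactory loaded$");

    // Matches e.g. "Extension map exists to handle plaintext"
    private static final Pattern HANDLED =
            Pattern.compile("Extension map exists to handle (\\S+)$");

    /** Returns the first capture group of every line the pattern matches. */
    private static List<String> collect(Pattern pattern, List<String> lines) {
        List<String> found = new ArrayList<>();
        for (String line : lines) {
            Matcher m = pattern.matcher(line.trim());
            if (m.find()) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    /** Names of the document factories the log reports as loaded. */
    static List<String> loadedFactories(List<String> lines) {
        return collect(LOADED, lines);
    }

    /** File type names for which the log reports an extension map. */
    static List<String> handledFileTypes(List<String> lines) {
        return collect(HANDLED, lines);
    }

    public static void main(String[] args) {
        // First lines of the log quoted in the post.
        List<String> lines = Arrays.asList(
            "[21.01.2004 11:36:10] <opencms_info> Page DocumentFactory loaded",
            "[21.01.2004 11:36:10] <opencms_info> JSP DocumentFactory loaded",
            "[21.01.2004 11:36:10] <opencms_info> Plain DocumentFactory loaded",
            "[21.01.2004 11:36:10] <opencms_info> Extension map exists to handle plaintext",
            "[21.01.2004 11:36:10] <opencms_info> Extension map exists to handle taggedtext");
        System.out.println("Factories loaded: " + loadedFactories(lines));
        System.out.println("File types handled: " + handledFileTypes(lines));
    }
}
```

If the "doctext" file type from registry.xml is missing from the handled list, the registry entry was probably never picked up by the running module. That would be consistent with the log above, which mentions only plaintext and taggedtext.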

Regards,

Ritwik


