[opencms-dev] Problem: docFactory in registry.xml for lucene word .doc file search:urgent

Hartmann, Waehrisch & Feykes GmbH hartmann at waehrisch-feykes.de
Wed Jan 21 11:01:01 CET 2004


I just wanted to make sure you use this format and not the one for version 1.3 or 1.2. Of course you have to add the docFactory for binary files as you posted it.
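
One way to double-check the docFactory entry before re-running the indexer is a small standalone program. This is only a sketch: the class name DocFactoryCheck and both helper methods are made up for illustration and are not part of the Lucene module or OpenCms. It assumes the fragment has the same shape as the one quoted further down (fileType elements containing extension and class children), with the elided parts left out so the XML is well-formed. The loadability check is only meaningful when the program runs with WEB-INF/classes and WEB-INF/lib on its classpath.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the module: lists extension -> class
// mappings found in a docFactories fragment and checks class loadability.
public class DocFactoryCheck {

    /** Maps every extension to the class of its enclosing fileType. */
    public static Map<String, String> extensionToClass(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> result = new LinkedHashMap<>();
        NodeList fileTypes = doc.getElementsByTagName("fileType");
        for (int i = 0; i < fileTypes.getLength(); i++) {
            Element fileType = (Element) fileTypes.item(i);
            NodeList classes = fileType.getElementsByTagName("class");
            if (classes.getLength() == 0) {
                continue; // fileType without a class entry: nothing to map
            }
            String className = classes.item(0).getTextContent().trim();
            NodeList extensions = fileType.getElementsByTagName("extension");
            for (int j = 0; j < extensions.getLength(); j++) {
                result.put(extensions.item(j).getTextContent().trim(), className);
            }
        }
        return result;
    }

    /** Returns true if the named class can be found on the current classpath. */
    public static boolean isLoadable(String className) {
        try {
            // 'false' = locate the class without running its static initializers
            Class.forName(className, false, DocFactoryCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Fragment as posted in this thread, minus the elided parts.
        String fragment =
              "<docFactories>"
            + "<docFactory type=\"binary\" enabled=\"true\">"
            + "<fileType name=\"doctext\">"
            + "<extension>.doc</extension>"
            + "<extension>.dot</extension>"
            + "<class>net.grcomputing.opencms.search.lucene.WordDocument</class>"
            + "</fileType>"
            + "</docFactory>"
            + "</docFactories>";
        for (Map.Entry<String, String> e : extensionToClass(fragment).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue()
                    + " (loadable: " + isLoadable(e.getValue()) + ")");
        }
    }
}
```

If a class is reported as not loadable, the compiled files or tm-extractors-0.2.jar are probably not where the webapp expects them. If it is loadable and the factory still does not appear in the index log, the registry format is the more likely culprit.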
  ----- Original Message ----- 
  From: Ritwik Datta 
  To: opencms-dev at opencms.org 
  Sent: Wednesday, January 21, 2004 10:54 AM
  Subject: Re: [opencms-dev] Problem: docFactory in registry.xml for lucene word .doc file search:urgent


  OK, thanks. But tell me one thing: the sample registry you linked has no entry that registers a class for Word or PDF document search. registry.xml should have an entry for the .doc and .pdf extensions, right?

  "Hartmann, Waehrisch & Feykes GmbH" <hartmann at waehrisch-feykes.de> wrote: 
    There has been a redesign from 1.3 to the (unofficial) 1.4. The CVS is based on this 1.4, and you have to make sure that your registry looks like the sample registry http://www.aleph-null.tv/downloads/contribs/beffe/registry.txt
    Also compile all files from the CVS and copy them to your classes folder.

    Bye,
    Stephan

      ----- Original Message ----- 
      From: Ritwik Datta 
      To: opencms-dev at opencms.org 
      Sent: Wednesday, January 21, 2004 10:11 AM
      Subject: Re: [opencms-dev] Problem: docFactory in registry.xml for lucene word .doc file search:urgent


      Dear Stephan,

      I have imported net.grcomputing.opencms.search.lucene_1.3.zip. That version was missing the search index facility for Word and PDF documents, so I downloaded those Java files, compiled them, and copied them under $TOMCAT-HOME/webapps/opencms/WEB-INF/classes/net/grcomputing/opencms/search/lucene.
      I restarted Tomcat after that, but with no result. Please help me.
      regards,
      Ritwik



      "Hartmann, Waehrisch & Feykes GmbH" <hartmann at waehrisch-feykes.de> wrote:
        Which version of the module did you use before? Did you copy only those to classes or all together? Did you restart tomcat?

        Regards,
        Stephan
          ----- Original Message ----- 
          From: Ritwik Datta 
          To: opencms-dev at opencms.org 
          Sent: Wednesday, January 21, 2004 7:16 AM
          Subject: [opencms-dev] Problem: docFactory in registry.xml for lucene word .doc file search:urgent


          Dear All,

          I have compiled the OpenCms Lucene source from the CVS repositories. I now have WordDocument.class and I_Documentfactory.class in the net.grcomputing.opencms.search.lucene package. I uploaded those files under $TOMCAT-HOME/webapps/opencms/WEB-INF/classes/net/grcomputing/opencms/search/lucene. I also uploaded the third-party tm-extractors-0.2.jar under $TOMCAT-HOME/webapps/opencms/WEB-INF/lib/

          I have then changed <docFactory> in registry.xml for Lucene Word .doc file search. Here is the relevant segment of registry.xml:

          <docFactories>......

            <docFactory type="binary" enabled="true">
              <fileType name="doctext">
                <extension>.doc</extension>
                <extension>.dot</extension>
                <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
              </fileType>
            </docFactory>

          ..........

          </docFactories>

          Now, when I run the cron scheduler, indexing completes successfully, but there is no trace of my .doc files being indexed. I also checked with simple_search.jsp: it does not return the URLs of my Word documents even when the search criteria match. I am attaching the index manager log. It shows the Page DocumentFactory, JSP DocumentFactory, and Plain DocumentFactory being loaded, but not my Word document factory. I think I am missing something. Can anyone tell me the catch? It is pretty urgent. Please help me.

          =====IndexManager=============================================================
          [21.01.2004 11:36:10] <opencms_info> Analyzer: org.apache.lucene.analysis.standard.StandardAnalyzer
          [21.01.2004 11:36:10] <opencms_info> Page DocumentFactory loaded
          [21.01.2004 11:36:10] <opencms_info> JSP DocumentFactory loaded
          [21.01.2004 11:36:10] <opencms_info> Plain DocumentFactory loaded
          [21.01.2004 11:36:10] <opencms_info> Extension map exists to handle plaintext
          [21.01.2004 11:36:10] <opencms_info> Extension map exists to handle taggedtext
          [21.01.2004 11:36:10] <opencms_info> IndexManager: indexing /release/
          [21.01.2004 11:36:10] <opencms_info> IndexManager: indexing /release/spdb/
          [21.01.2004 11:36:10] <opencms_info> IndexManager: indexing /release/spdb/Assessment_Findings/
          [21.01.2004 11:36:10] <opencms_info> IndexManager: indexing /release/spdb/Best_Practices/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Business_Goals/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/CMC_Product_Information/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/CMM_Action_Plans/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Coding_Standard/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Dashboard/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Defect_Prevention/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/ER_SI_Organisation_Structure/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Estimation/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/Expert_List/
          [21.01.2004 11:36:11] <opencms_info> IndexManager: indexing /release/spdb/FAQ/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/IGC_OSSP_Role_Mapping/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Metrics_and_Measurements/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/OQPM/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/OSSP/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Presentation_Library/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Process_Change_Management/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Projectwise_Plans/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/PROMPT/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Readables/
          [21.01.2004 11:36:12] <opencms_info> IndexManager: indexing /release/spdb/Sample_CMM_Documents/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SCM/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SEPG/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SPDB_Notes/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SPDB_Search/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/SQA/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/TCM/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/TCM/Notes/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/TCM/Others/
          [21.01.2004 11:36:13] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Data/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Bilingual_2-tier_Application_to_3-tier_Conversion/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Citrix/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Compilation_Problem/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Driver_Installation/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/FTP_Service_on_Linux/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Hindi_Email/
          [21.01.2004 11:36:14] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Hindi_Integration_Development_Guidelines/
          [21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/HW_Requirement_for_Oracle9i_9iDS_9iASR2/
          [21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_9i_Application_Server_Release2_Installation/
          [21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_Forms9i_to_Forms6i_Conversion/
          [21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Oracle_Froms6i_Deployment_on_9iAS/
          [21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/ORARRP_Reusable_Components/
          [21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/OS_Problem/
          [21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Asset_Details/Red_Hat_Advance_Server_Installation/
          [21.01.2004 11:36:15] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Project_Info/
          [21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Register/
          [21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/TCM/Reusable_Assets/Training_Materials/
          [21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/TCM/TCM_Plans/
          [21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/TCM/Templates/
          [21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/Timesheet/
          [21.01.2004 11:36:16] <opencms_info> IndexManager: indexing /release/spdb/Training/
          [21.01.2004 11:36:17] <opencms_info> IndexManager: 55 documents are being processed
          [21.01.2004 11:36:17] <opencms_info> IndexManager:  Index has been optimized.
          [21.01.2004 11:36:17] <opencms_info> Done
          =====IndexManager=============================================================
          [21.01.2004 11:36:17] <opencms_cronscheduler> Successful launch of job com.opencms.core.CmsCronEntry{36 11 * * * Admin Administrators net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true} Message: CronIndexManager rebuilt the Lucene index on Wed Jan 21 11:36:17 IST 2004
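
With a long log like the one above it is easy to overlook which factories were actually reported. The following is a minimal sketch that pulls out the "... DocumentFactory loaded" lines. The class name LoadedFactories and the regular expression are assumptions based only on the line format visible in this log, not on the module's source, and the wording a Word factory would log is not known from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the factory names from log lines of the form
// "[timestamp] <opencms_info> NAME DocumentFactory loaded".
public class LoadedFactories {

    // Pattern inferred from the log lines in this thread (an assumption).
    private static final Pattern LOADED =
            Pattern.compile("<opencms_info>\\s+(.+?) DocumentFactory loaded");

    public static List<String> loadedFactories(List<String> logLines) {
        List<String> names = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = LOADED.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // A few lines copied from the log above.
        List<String> lines = Arrays.asList(
            "[21.01.2004 11:36:10] <opencms_info> Analyzer: org.apache.lucene.analysis.standard.StandardAnalyzer",
            "[21.01.2004 11:36:10] <opencms_info> Page DocumentFactory loaded",
            "[21.01.2004 11:36:10] <opencms_info> JSP DocumentFactory loaded",
            "[21.01.2004 11:36:10] <opencms_info> Plain DocumentFactory loaded",
            "[21.01.2004 11:36:10] <opencms_info> Extension map exists to handle plaintext");
        System.out.println(loadedFactories(lines)); // prints [Page, JSP, Plain]
    }
}
```

Only three factories appear, which matches the observation that nothing handling .doc files was registered during this run.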

          Regards,

          Ritwik



