[opencms-dev] Lucene 1.5 pdf & doc won't index

Hartmann, Waehrisch & Feykes GmbH hartmann at waehrisch-feykes.de
Wed Mar 17 08:10:02 CET 2004


You have a mixed-up version of the module. It seems to be a common problem
that the class files are not exported correctly when a module is updated.
You can touch the directory containing the module's class files in OpenCms
and publish it directly. Then verify in your physical filesystem that the
classes in
$TOMCAT_HOME/webapps/opencms/WEB-INF/classes/net/grcomputing/opencms/search/lucene
have been overwritten with the new ones.
Don't forget to restart Tomcat after that.
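
One way to do the filesystem check is to list the .class files together with
their modification times and see whether any of them predate the module
update. The following is only a rough sketch in Python, not part of OpenCms or
the Lucene module: the helper names list_class_files and stale_class_files are
made up for illustration, and it assumes TOMCAT_HOME is set in the environment.

```python
import os
import time
from pathlib import Path

# Directory named in the message above, relative to $TOMCAT_HOME.
LUCENE_CLASSES = "webapps/opencms/WEB-INF/classes/net/grcomputing/opencms/search/lucene"


def list_class_files(classes_dir):
    """Return (relative name, mtime) pairs for all .class files, newest first."""
    root = Path(classes_dir)
    entries = [
        (p.relative_to(root).as_posix(), p.stat().st_mtime)
        for p in root.rglob("*.class")
    ]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def stale_class_files(classes_dir, updated_at):
    """Return names of .class files last modified before updated_at (epoch seconds)."""
    return [name for name, mtime in list_class_files(classes_dir) if mtime < updated_at]


if __name__ == "__main__":
    # Hypothetical usage: print every class file with its last-modified time.
    target = Path(os.environ.get("TOMCAT_HOME", ""), LUCENE_CLASSES)
    if target.is_dir():
        for name, mtime in list_class_files(target):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
            print(stamp, name)
    else:
        print("Not a directory:", target)
```

If some files carry a timestamp older than the time you updated the module,
those are the ones that were probably not exported, and touching and
publishing the class directory again should refresh them.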

Bye,
Stephan

----- Original Message ----- 
From: "Miyuru C. Ratnayake" <miyuruchanna at yahoo.com>
To: <opencms-dev at opencms.org>
Sent: Wednesday, March 17, 2004 4:27 AM
Subject: Re: [opencms-dev] Lucene 1.5 pdf & doc won't index


> M Butcher,
>
> When I use
> <plainDocFactory enebled="true">
> then plain documents get indexed.
>
> If I use
> <docFactory enebled="true" type="plain">
> then they won't get indexed; this is what I sent you in
> the previous mail.
>
> thanks,
> Miyuru.
>
> --- "Miyuru C. Ratnayake" <miyuruchanna at yahoo.com>
> wrote:
> > M Butcher,
> >
> >
> > =====IndexManager=============================================================
> > [17.03.2004 09:18:10] <opencms_info> Analyzer: org.apache.lucene.analysis.standard.StandardAnalyzer
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Certification/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Certification/123/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Certification/Resource/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Certification/Sample/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Persistence/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Persistence/Resource/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Persistence/Sample/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Persistence/Tips/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Security/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Security/Resource/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/Security/Sample/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/WebServices/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/WebServices/Resources/
> > [17.03.2004 09:18:10] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/WebServices/Samples/
> > [17.03.2004 09:18:11] <opencms_info> IndexManager: indexing /TBOKCMS/Documents/WebServices/Tips/
> > [17.03.2004 09:18:11] <opencms_info> IndexManager: 0 documents are being processed
> > [17.03.2004 09:18:11] <opencms_info> IndexManager: Index has been optimized.
> > [17.03.2004 09:18:11] <opencms_info> Done
> > =====IndexManager=============================================================
> >
> >
> > The relevant registry.xml used for this:
> >
> > <luceneSearch>
> >    <mergeFactor>100000</mergeFactor>
> >    <permCheck>true</permCheck>
> >    <indexDir>C:\lucene\TBOKCMS\</indexDir>
> >    <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
> >    <subsearch>true</subsearch>
> >    <project>online</project>
> >    <docFactories>
> >       <docFactory enabled="true" type="page">
> >          <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
> >       </docFactory>
> >       <docFactory enabled="true" type="plain">
> >          <fileType name="plaintext">
> >             <extension>.txt</extension>
> >             <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
> >          </fileType>
> >          <fileType name="taggedtext">
> >             <extension>.html</extension>
> >             <extension>.htm</extension>
> >             <extension>.xml</extension>
> >             <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
> >          </fileType>
> >       </docFactory>
> >       <docFactory enabled="true" type="binary">
> >          <fileType name="Word">
> >             <extension>.doc</extension>
> >             <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
> >          </fileType>
> >          <fileType name="PDF">
> >             <extension>.pdf</extension>
> >             <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
> >          </fileType>
> >       </docFactory>
> >       <docFactory enabled="false" type="jsp">
> >          <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
> >       </docFactory>
> >       <docFactory enabled="false" type="news">
> >          <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
> >       </docFactory>
> >       <docFactory enabled="false" type="forum">
> >          <class>de.wfnetz.opencms.modules.forum.ContributionDocument</class>
> >       </docFactory>
> >       <docFactory enabled="false" type="XML Template"/>
> >    </docFactories>
> >    <directories>
> >       <directory location="/TBOKCMS/Documents/">
> >          <section>TBOK CMS</section>
> >          <subsearch>true</subsearch>
> >       </directory>
> >    </directories>
> >    <contentDefinitions>
> >       <contentDefinition type="news">
> >          <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
> >          <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
> >          <listMethod name="getNewsList">
> >             <param type="java.lang.Integer">1</param>
> >             <param type="java.lang.String">-1</param>
> >          </listMethod>
> >          <page uri="/news.html?__element=entry">
> >             <param method="getIntId" name="newsid"/>
> >          </page>
> >       </contentDefinition>
> >       <contentDefinition type="forum">
> >          <class>de.wfnetz.opencms.modules.forum.ContributionContentDefinition</class>
> >          <listMethod name="getSortedList">
> >             <param type="java.lang.String"/>
> >          </listMethod>
> >          <page uri="/forum.html?forumtemplate=viewcontributionentry">
> >             <param method="getId" name="conid"/>
> >          </page>
> >       </contentDefinition>
> >    </contentDefinitions>
> > </luceneSearch>
> > Thanks,
> > Miyuru.
> >
> >
> > M Butcher <mbutcher at grcomputing.net> wrote:
> >
> > Miyuru,
> >
> > Can you send the section of the log that shows the
> > IndexManager entries?
> > It starts:
> >
> > =====IndexManager=========================
> >
> > And it should show what DocumentFactories and
> > extension maps were loaded.
> >
> > Matt
> >
> > Miyuru C. Ratnayake wrote:
> > > Hi,
> > >
> > > There are no errors. Only 4 documents get indexed; they are all .txt
> > > documents of the plain type. There are .pdf and .doc documents too, but
> > > they won't get indexed.
> > >
> > > Miyuru
> >
> > _______________________________________________
> > This mail is sent to you from the opencms-dev mailing list.
> > To change your list options, or to unsubscribe from the list, please visit
> > http://mail.opencms.org/mailman/listinfo/opencms-dev



