[opencms-dev] Registry xml for PDF and WORD document search

Stephan Hartmann beffe at beffe.de
Mon Nov 24 20:49:02 CET 2003


Well, I have not tried it myself yet, but it looks OK to me. The name
attribute of the fileType tag is purely informational: it only shows up in
the debugging output, so "pdftext" and "doctext" should not need to be
defined anywhere else. Other than that, you need to include the libraries
from www.textmining.org.
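
As a sketch of what I mean (untested, and assuming the name attribute really
is only a label for the debug output), a fragment like this should behave the
same as yours. The names "pdf-files" and "word-files" are made up for
illustration; the extensions and classes are the ones from your mail:

    <docFactory enabled="true" type="binary">
        <fileType name="pdf-files">
            <extension>.pdf</extension>
            <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
        </fileType>
        <fileType name="word-files">
            <extension>.doc</extension>
            <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
        </fileType>
    </docFactory>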

Bye,
Stephan


----- Original Message -----
From: "M Butcher" <mbutcher at grcomputing.net>
To: <opencms-dev at opencms.org>; "Hartmann, Waehrisch & Feykes GmbH"
<hartmann at waehrisch-feykes.de>; "Ernesto De Santis"
<ernesto.desantis at colaborativa.net>
Sent: Monday, November 24, 2003 7:56 PM
Subject: Re: [opencms-dev] Registry xml for PDF and WORD document search


>
> Trevor,
>
> I'm not sure. I think you need a Content Definition. I'm copying Stephan
> on this -- he did most of the work on this part of the module. I'll also
> copy Ernesto, who contributed the two classes.
>
> Stephan, Ernesto -- if you can answer, I'll incorporate your answer into
> the README/INSTALL files for the module.
>
> Matt
>
> Trevor Lee wrote:
> > Hi all,
> >
> > I was wondering what the registry.xml file should contain in order to
> > get Lucene to index Word and PDF files using Ernesto De Santis's
> > PDFDocument and WordDocument classes?
> >
> > I've got the following in my registry.xml file:
> >
> >                 <docFactory enabled="true" type="binary">
> >                     <fileType name="pdftext">
> >                         <extension>.pdf</extension>
> >                         <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
> >                     </fileType>
> >                     <fileType name="doctext">
> >                         <extension>.doc</extension>
> >                         <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
> >                     </fileType>
> >                 </docFactory>
> >
> > Where do I define the "pdftext" and "doctext" types?
> >
> > What else needs to be changed or included?
> >
> > Thanks in advance for your help.
> >
> > Cheers
> > Trevor
> >
> > _______________________________________________
> > This mail is send to you from the opencms-dev mailing list
> > To change your list options, or to unsubscribe from the list, please visit
> > http://mail.opencms.org/mailman/listinfo/opencms-dev