[opencms-dev] Registry xml for PDF and WORD document search

M Butcher mbutcher at grcomputing.net
Mon Nov 24 19:50:02 CET 2003


Trevor,

I'm not sure. I think you need a Content Definition. I'm copying Stephen 
on this -- he did most of the work on this part of the module. I'll also 
copy Ernesto, who contributed the two classes.

Stephen, Ernesto -- if you can answer, I'll incorporate your answer into 
the README/INSTALL files for the module.

Matt

Trevor Lee wrote:
> Hi all,
> 
> I was wondering what the registry.xml file should contain in order to get
> Lucene to index Word and PDF files using Ernesto De Santis's PDFDocument and
> WordDocument classes?
> 
> I've got the following in my registry.xml file:
> 
>                 <docFactory enabled="true" type="binary">
>                     <fileType name="pdftext">
>                         <extension>.pdf</extension>
>                         <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
>                     </fileType>
>                     <fileType name="doctext">
>                         <extension>.doc</extension>
>                         <class>net.grcomputing.opencms.search.lucene.WordDocument</class>
>                     </fileType>
>                 </docFactory>
> 
> Where do I define the "pdftext" and "doctext" types?
> 
> What else needs to be changed or included?
> 
> Thanks in advance for your help.
> 
> Cheers
> Trevor
> 
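
Until the module's README covers this, one thing that can be ruled out independently is a malformed fragment. The sketch below is only a standalone sanity check using the JDK's own XML parser. It is not how OpenCms or the module reads registry.xml, and the class name RegistryFragmentCheck and its helper methods are made up for illustration. It parses the docFactory fragment quoted above (with the <class> lines unwrapped) and lists which document class each extension maps to.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of OpenCms or the search module.
class RegistryFragmentCheck {

    // The docFactory fragment from the quoted message, <class> lines unwrapped.
    static final String FRAGMENT =
          "<docFactory enabled=\"true\" type=\"binary\">"
        + "<fileType name=\"pdftext\">"
        + "<extension>.pdf</extension>"
        + "<class>net.grcomputing.opencms.search.lucene.PDFDocument</class>"
        + "</fileType>"
        + "<fileType name=\"doctext\">"
        + "<extension>.doc</extension>"
        + "<class>net.grcomputing.opencms.search.lucene.WordDocument</class>"
        + "</fileType>"
        + "</docFactory>";

    // Parses the fragment and returns extension -> class name, in document order.
    static Map<String, String> extensionToClass(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            Map<String, String> mapping = new LinkedHashMap<>();
            NodeList fileTypes = doc.getElementsByTagName("fileType");
            for (int i = 0; i < fileTypes.getLength(); i++) {
                Element fileType = (Element) fileTypes.item(i);
                mapping.put(textOf(fileType, "extension"), textOf(fileType, "class"));
            }
            return mapping;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("fragment could not be parsed as XML", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag.
    private static String textOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            throw new IllegalStateException("<fileType name=\""
                + parent.getAttribute("name") + "\"> has no <" + tag + "> element");
        }
        return nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : extensionToClass(FRAGMENT).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

This only confirms that the fragment is well-formed and that each fileType carries an extension and a class. It says nothing about where the "pdftext" and "doctext" types have to be defined, which is still the open question for Stephen and Ernesto.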