[opencms-dev] Index pdf files with your content in lucene

Hartmann, Waehrisch & Feykes GmbH hartmann at waehrisch-feykes.de
Fri Oct 24 12:09:01 CEST 2003


Did you create a new index? To be sure, delete your old one.
As for the PDFs: PDFDocument is the class that Ernesto is still working
on, so you will have to wait until he is done. Until then you can use
BodylessDocument to index only the title, description and keywords of your
PDFs.
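
A rough sketch of what the registry.xml entry could look like in the
meantime, modelled on the PDFDocument entry quoted below. I am assuming
here that BodylessDocument lives in the same package as PDFDocument, and
the fileType name "pdfmeta" is only a placeholder, so check both against
your installed module:

```xml
<docFactory type="binary" enabled="true">
   <!-- "pdfmeta" is a placeholder name, not taken from the module -->
   <fileType name="pdfmeta">
      <extension>.pdf</extension>
      <!-- Assumption: same package as PDFDocument; verify in your module -->
      <class>net.grcomputing.opencms.search.lucene.BodylessDocument</class>
   </fileType>
</docFactory>
```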

Bye,
Stephan


> Hi,
>
> I installed your module. After that I wanted to search my multiple content
> sites, but it doesn't work. It's the same as before: I only find results
> if I search for text in the default body.
>
> Another problem is the search for .pdf documents.
> I couldn't find the class: PDFDocument
>
> Can you help me please?
> What did I do wrong?
>
> Fabian
>
>
> >> Hello Ernesto,
> >>
> >> I assume you are using the unpatched version 1.3 of the search module.
> >> As I mentioned yesterday, the plainDocFactory only indexes cmsFiles of
> >> type "plain", not of type "binary". PDF files are stored as binary.
> >> I suggest using the version I posted yesterday. Your registry.xml would
> >> then have to look like this:
> >> ...
> >> <docFactories>
> >> ...
> >>    <docFactory type="plain" enabled="true">
> >> ...
> >>    </docFactory>
> >>    <docFactory type="binary" enabled="true">
> >>       <fileType name="pdftext">
> >>          <extension>.pdf</extension>
> >>          <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
> >>       </fileType>
> >>    </docFactory>
> >> ...
> >> </docFactories>
> >>
> >> Important: The type attribute must match the file types of OpenCms
> >> (also defined in the registry.xml).
> >>
> >> Bye,
> >> Stephan
>



