[opencms-dev] Lucene Search for binary files (PDF, WORD, EXCEL etc)

vsouksav at csc.com.au vsouksav at csc.com.au
Fri Oct 24 09:25:00 CEST 2003


Hello all,

First of all, my apologies if this question has been asked before.

I have read through the archive and found hints of what to do, but
unfortunately to no avail.

I have tried to modify the registry as per Matt's suggestion, see below:

                <docFactory enabled="true" type="binary">
                    <!--
                    <fileType name="bodylesstext">
                        <extension>.pdf</extension>
                        <extension>.doc</extension>
                        <extension>.xls</extension>
                        <class>net.grcomputing.opencms.search.lucene.BodylessDocument</class>
                    </fileType>
                    -->
                </docFactory>

I even tried specifying the file types explicitly, but still no luck.
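
For reference, here is a sketch of the same block with the comment
delimiters removed. In the snippet above the fileType element sits between
<!-- and -->, so an XML parser skips it entirely. Whether the module then
picks up the mapping is an assumption on my part; the element and class
names are exactly those from Matt's suggestion, nothing new is introduced.

```xml
<docFactory enabled="true" type="binary">
    <!-- fileType is no longer commented out, so the parser can see it -->
    <fileType name="bodylesstext">
        <extension>.pdf</extension>
        <extension>.doc</extension>
        <extension>.xls</extension>
        <!-- class name copied from the suggestion; assumed to be on the classpath -->
        <class>net.grcomputing.opencms.search.lucene.BodylessDocument</class>
    </fileType>
</docFactory>
```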

I had a look at the source code, and there does not seem to be any generic
docFactory, only plainDocFactory, pageDocFactory and so forth.  Can anyone
tell me whether the above configuration is still valid?

I also noticed that version 1.3 is available at
http://opencms.al-arenal.de/ and tried it, but it still did not work.

Following on from everyone else's requests to Ivan Jenelic: Ivan, may I
also get a copy of your search module (if possible), the one that can
search static HTML and PDF files?

Thanks in advance.

Valouny




