[opencms-dev] Search Engine integration - advice on available options?
M Butcher
mbutcher at grcomputing.net
Wed Nov 5 04:11:01 CET 2003
Joe,

AFAIK, htdig basically spiders the site and builds an index. It searches
over the HTML pages generated by OpenCms. For that reason, I'm not sure
that it needs a module or anything. Configuring and running htdig is
done outside of OpenCms.

Lucene is (more or less) just a library for providing search
functionality to a Java application. The Lucene module you've seen
basically runs the indexing process on the VFS -- and it runs as part of
OpenCms. Current additions to the Lucene module provide searching of Word
and PDF documents. The advantage to doing things this way (keep in mind,
I'm biased toward this module) is that search applications can be
tightly integrated with OpenCms. For instance, configuration is done via
the registry.xml file, and since the search engine has access to the CMS
information (e.g. content types, location in VFS, permissions), it can
index in a more intelligent way. Also, search results can use CMS
templates, which can take full advantage of the CMS (common templates,
newsfeeds, etc.).
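
To make the "index in a more intelligent way" point a bit more concrete, here is a minimal Java sketch. It is not the Lucene API and not the actual module code: the class TinyCmsIndex, the Resource fields, the paths and the group names are all made up for illustration. It only shows the general idea that an indexer running inside the CMS can store metadata such as permissions next to each entry and filter hits per user, information that an external spider reading only the generated HTML does not see directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: a tiny in-memory index that keeps CMS metadata with each entry. */
class TinyCmsIndex {

    /** Metadata a CMS-integrated indexer could see; all names here are illustrative. */
    static final class Resource {
        final String vfsPath;           // location in the VFS
        final String contentType;       // stored only; could be used for filtering too
        final Set<String> readerGroups; // groups allowed to read the resource

        Resource(String vfsPath, String contentType, String... readerGroups) {
            this.vfsPath = vfsPath;
            this.contentType = contentType;
            this.readerGroups = new HashSet<>(Arrays.asList(readerGroups));
        }
    }

    // term -> VFS paths containing that term (sorted for stable output)
    private final Map<String, Set<String>> postings = new HashMap<>();
    // VFS path -> metadata
    private final Map<String, Resource> resources = new HashMap<>();

    /** Indexes the extracted text of one resource together with its metadata. */
    void add(Resource r, String text) {
        resources.put(r.vfsPath, r);
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(r.vfsPath);
            }
        }
    }

    /** Returns the paths matching the term that the given group may read. */
    List<String> search(String term, String userGroup) {
        List<String> hits = new ArrayList<>();
        Set<String> paths = postings.getOrDefault(term.toLowerCase(),
                Collections.<String>emptySet());
        for (String path : paths) {
            if (resources.get(path).readerGroups.contains(userGroup)) {
                hits.add(path);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TinyCmsIndex index = new TinyCmsIndex();
        index.add(new Resource("/intranet/handbook.html", "page", "staff", "guests"),
                "Staff handbook and holiday policy");
        index.add(new Resource("/intranet/hr/salaries.pdf", "pdf", "hr"),
                "Salary policy for 2003");

        System.out.println(index.search("policy", "staff")); // [/intranet/handbook.html]
        System.out.println(index.search("policy", "hr"));    // [/intranet/hr/salaries.pdf]
    }
}
```

Both documents contain "policy", but each group only gets back the resource it is allowed to read. A real setup would of course take the text from the VFS and the permissions from the CMS rather than from hard-coded strings.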
Matt
Joe McFadden wrote:
>Hi,
>
>We'd like to integrate OpenCMS with a search engine, and I'm looking
>for some general advice on what options are available and their relative
>merits.
>
>I've looked at the docs and the mailing list archive, and seen various
>posts on the details of setting up the lucene module, plus a few about
>htdig. However, I can only find the lucene module in the module sandbox
>- is the htdig module still available?
>
>Any comments on pros and cons of htdig vs lucene vs whatever in the
>context of integrating with OpenCMS?
>
>This is for an intranet site; I will need to be able to index Word and
>PDF documents. Using OpenCMS 5.0 on Linux.
>We currently use htdig to index our existing static intranet site
>(which we plan to migrate to OpenCMS).
>
>Regards,
>Joe
>
>PS - thanks to Thomas Maarz for his response to my post last week on
>updating control code - your suggestion worked.