[opencms-dev] Lucene search module

Ivan Jelenic ivan.jelenic at nbs.yu
Tue Jul 22 14:40:02 CEST 2003


Hi!

I have installed the Lucene module and it works fine. My registry
contains this:

<luceneSearch>
  <mergeFactor>100000</mergeFactor>
  <permCheck>true</permCheck>
  <indexDir>/lucene/opencms/</indexDir>
  <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
  <subsearch>true</subsearch>
  <project>online</project>
  <docFactories>
   <pageDocFactory enabled="true">
    <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
   </pageDocFactory>
   <plainDocFactory enabled="true">
    <fileType name="plaintext">
     <extension>.txt</extension>
     <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
    </fileType>
    <fileType name="taggedtext">
     <extension>.html</extension>
     <extension>.htm</extension>
     <extension>.xml</extension>
     <!-- This will strip tags before processing -->
     <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
    </fileType>
   </plainDocFactory>
   <jspDocFactory enabled="true">
    <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
   </jspDocFactory>
   <xmlTemplateDocFactory enabled="false"/>
  </docFactories>
  <directories>
   <directory location="/Home/">
    <section>Test</section>
    <subsearch>true</subsearch>
   </directory>
  </directories>
 </luceneSearch>
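
A quick way to double-check a fragment like the one above is to parse it and print what it maps. The following is only a minimal sketch using the JDK's DOM parser; it is not part of the Lucene module or of OpenCms, and the names RegistryCheck, SAMPLE, extensionMap and directories are hypothetical. It assumes every <fileType> has a <class> child and every <directory> has a <section> child, as in the configuration above, and it says nothing about how the module itself reads the registry.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not shipped with the module.
class RegistryCheck {

    // Shortened copy of the <luceneSearch> fragment from this message.
    public static final String SAMPLE =
          "<luceneSearch>"
        + "<docFactories><plainDocFactory enabled=\"true\">"
        + "<fileType name=\"plaintext\"><extension>.txt</extension>"
        + "<class>net.grcomputing.opencms.search.lucene.PlainDocument</class>"
        + "</fileType></plainDocFactory></docFactories>"
        + "<directories><directory location=\"/Home/\">"
        + "<section>Test</section><subsearch>true</subsearch>"
        + "</directory></directories>"
        + "</luceneSearch>";

    /** Parses an XML string into a DOM document. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Maps each <extension> to the <class> of its enclosing <fileType>. */
    public static Map<String, String> extensionMap(Document doc) {
        Map<String, String> result = new LinkedHashMap<>();
        NodeList fileTypes = doc.getElementsByTagName("fileType");
        for (int i = 0; i < fileTypes.getLength(); i++) {
            Element fileType = (Element) fileTypes.item(i);
            // Assumption: exactly one <class> per <fileType>.
            String cls = fileType.getElementsByTagName("class")
                                 .item(0).getTextContent().trim();
            NodeList exts = fileType.getElementsByTagName("extension");
            for (int j = 0; j < exts.getLength(); j++) {
                result.put(exts.item(j).getTextContent().trim(), cls);
            }
        }
        return result;
    }

    /** Maps each <directory location="..."> to its <section> name. */
    public static Map<String, String> directories(Document doc) {
        Map<String, String> result = new LinkedHashMap<>();
        NodeList dirs = doc.getElementsByTagName("directory");
        for (int i = 0; i < dirs.getLength(); i++) {
            Element dir = (Element) dirs.item(i);
            // Assumption: exactly one <section> per <directory>.
            String section = dir.getElementsByTagName("section")
                                .item(0).getTextContent().trim();
            result.put(dir.getAttribute("location"), section);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println("extensions:  " + extensionMap(doc));
        System.out.println("directories: " + directories(doc));
    }
}
```

If an extension or a directory you expect is missing from the printed maps, the fragment is probably malformed or the element is nested in the wrong place.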

Next I will try to add search for PDF documents.
I hope this helps.
Best regards,
Ivan

----- Original Message ----- 
From: "antonio reyes" <reyes_soy at hotmail.com>
To: <opencms-dev at opencms.org>
Sent: Tuesday, July 22, 2003 2:14 PM
Subject: Re: [opencms-dev] Lucene search module


> I'm interested in this module, but I don't know how to use it.
> I have followed the steps in README.txt, and this is my
> opencms.log:-------------------------------------------------------
> [22.07.2003 13:58:10] <opencms_info>
> =====IndexManager=============================================================
> [22.07.2003 13:58:10] <opencms_info> Analyzer: org.apache.lucene.analysis.standard.StandardAnalyzer
> [22.07.2003 13:58:10] <opencms_info> Page DocumentFactory loaded
> [22.07.2003 13:58:10] <opencms_info> JSP DocumentFactory loaded
> [22.07.2003 13:58:10] <opencms_info> Plain DocumentFactory loaded
> [22.07.2003 13:58:10] <opencms_info> Extension map exists to handle plaintext
> [22.07.2003 13:58:10] <opencms_info> Extension map exists to handle taggedtext
> [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/
> [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/Aplicaciones/
> [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/
> [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/cursos_formacion_creacion_empresas/
> [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/premio_trayectoria_empresarial/
> [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/programa_de_apoyo/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/programa_insercion_aprendices_talleres_artesanales/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/Enlaces/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/estudios_publicaciones/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/estudios_publicaciones/Pruebas/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/mapa_web/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/Noticias/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/programas_proyectos/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/programas_proyectos/servicios_apoyo_asesoramiento_empresarial/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/Saluda/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/Sugerencias/
> [22.07.2003 13:58:11] <opencms_info> IndexManager: 8 documents are being processed
> [22.07.2003 13:58:11] <opencms_info> Done
> =====IndexManager=============================================================
>
> --------------------------------------------------------------------------------
> This is my registry.xml: ------------------------------------------------------------
>
>         <!-- Begin luceneSearch  -->
>         <luceneSearch>
>             <mergeFactor>100000</mergeFactor>
>             <permCheck>true</permCheck>
>             <indexDir>c:\lucene</indexDir>
>             <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
>             <subsearch>true</subsearch>
>             <project>online</project>
>             <docFactories>
>                 <pageDocFactory enabled="true">
>                     <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
>                 </pageDocFactory>
>                 <plainDocFactory enabled="true">
>                     <fileType name="plaintext">
>                         <extension>.txt</extension>
>                         <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
>                     </fileType>
>                     <fileType name="taggedtext">
>                         <extension>.html</extension>
>                         <extension>.htm</extension>
>                         <extension>.xml</extension>
>                         <!-- This will strip tags before processing -->
>                         <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
>                     </fileType>
>                 </plainDocFactory>
>                 <jspDocFactory enabled="true">
>                     <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
>                 </jspDocFactory>
>                 <xmlTemplateDocFactory enabled="false"/>
>             </docFactories>
>             <directories>
>                 <directory location="/ec-web/">
>                     <section>ec-web</section>
>                     <subsearch>true</subsearch>
>                 </directory>
>             </directories>
>         </luceneSearch>
>
> --------------------------------------------------------------------------------
>
> Then I tried to run the sample JSP:
>
> http://vmaqueta2:8080/opencms/opencms/system/modules/net.grcomputing.opencms.search.lucene/elements/simple_search.jsp?q=lucene
>
> but it doesn't find anything.
>
> Can anybody help me?
>
> Thanks a lot.
>
> (Sorry for my English.)
>
> ----- Original Message -----
> From: "Olli Aro" <olli_aro at yahoo.co.uk>
> To: <opencms-dev at opencms.org>
> Sent: Tuesday, July 22, 2003 11:23 AM
> Subject: RE: [opencms-dev] Lucene search module
>
>
> > Sounds good:) Does it index the static export files only or the dynamic
> > files as well?
> >
> > Olli
> >
> > > -----Original Message-----
> > > From: opencms-dev-admin at opencms.org
> > > [mailto:opencms-dev-admin at opencms.org]On Behalf Of M Butcher
> > > Sent: 22 July 2003 07:29
> > > To: opencms-dev at opencms.org
> > > Subject: [opencms-dev] Lucene search module
> > >
> > >
> > > For those of you interested, I've released a Lucene-based search module
> > > under the GPL v.2 license. al-arenal.de has it up already (including
> > > source and Javadoc links), and I've also submitted it to the module
> > > sandbox on OpenCMS.org
> > >
> > > http://opencms.al-arenal.de/
> > >
> > > What it does (in a nutshell):
> > >
> > > The module uses the Apache Lucene search engine (part of Jakarta) to
> > > create search indexes of Resources (admin-specified, of course) in the
> > > OpenCMS online project. There is a small utility class (SearchHelper)
> > > and a simple search JSP (elements/simpleSearch.jsp) that provide the
> > > necessary components to build a user interface, and you may also use the
> > > Lucene APIs to create a more powerful UI, if you prefer.
> > >
> > > There is no admin interface currently, and the indexing is currently
> > > done via a "Scheduled Task". If you install it, make sure you read the
> > > docs/README.txt after importing with module management.
> > >
> > > Please email this list with questions, etc. I welcome feedback, patches,
> > > and suggestions.
> > >
> > > I would like to thank Niklas Eklund, Bryan LaPlante and Paul D. Bain for
> > > their interest and help, as well as the others on the list who took time
> > > to answer some of my questions.
> > >
> > > Enjoy!
> > >
> > > Matt Butcher
> > > Global Resources for Computing
> > > --
> > > M Butcher <mbutcher at grcomputing.net>
> > > _______________________________________________
> > > This mail is send to you from the opencms-dev mailing list
> > > To change your list options, or to unsubscribe from the list, please visit
> > > http://mail.opencms.org/mailman/listinfo/opencms-dev
> > > ---
> > > Incoming mail is certified Virus Free.
> > > Checked by AVG anti-virus system (http://www.grisoft.com).
> > > Version: 6.0.495 / Virus Database: 294 - Release Date: 30/06/2003
> > >