[opencms-dev] Lucene search module

Ivan Jelenic ivan.jelenic at nbs.yu
Tue Jul 22 16:07:02 CEST 2003


Yes, it's true.
I also have problems searching for text in the body of a document.
Sometimes it works and sometimes it doesn't.
Anyway, tomorrow is a new day. Have a nice evening.

Best regards, Ivan.

----- Original Message ----- 
From: "antonio reyes" <reyes_soy at hotmail.com>
To: <opencms-dev at opencms.org>
Sent: Tuesday, July 22, 2003 3:38 PM
Subject: Re: [opencms-dev] Lucene search module


> I put a file called "lucene.txt", with the body "this is the body for
> lucene file" and the property "Title" set to "lucene", in the indexed
> directory.
> Searching works fine for a property such as "Title" or "Keywords", but it
> doesn't seem to work for a word contained in the body of the file.
>
> That is:
> http://...elements/simple_search.jsp?q=title:lucene ----> FOUND
> but
> http://...elements/simple_search.jsp?q=lucene        ---> NOT  FOUND
> or
> http://...elements/simple_search.jsp?q=text:lucene  ---> NOT  FOUND
> or
> http://...elements/simple_search.jsp?q=body:lucene  ---> NOT  FOUND
>
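
One possible reading of the results above, offered as a guess rather than a diagnosis: in Lucene's query syntax a term written as field:word is matched only against that field, and a bare word is matched against whatever default field the search code hands to the query parser. If the file body is indexed under a field name other than the ones tried, the body queries would miss while title:lucene still hits. The sketch below only illustrates that routing in plain Java. It does not call Lucene, and the field name "content" and the class FieldedQuerySketch are hypothetical, not taken from the module.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration of how a "field:term" query string is routed.
// It does not use Lucene; DEFAULT_FIELD is a hypothetical name.
class FieldedQuerySketch {
    static final String DEFAULT_FIELD = "content"; // assumption, not from the module

    // Splits "field:term" into {field, term}; a bare term gets the default field.
    static String[] route(String q) {
        int colon = q.indexOf(':');
        if (colon <= 0 || colon == q.length() - 1) {
            return new String[] { DEFAULT_FIELD, q };
        }
        return new String[] { q.substring(0, colon), q.substring(colon + 1) };
    }

    // True only if the document has the routed field and that field
    // contains the term as a whole word (case-insensitive).
    static boolean matches(Map<String, String> doc, String q) {
        String[] fieldAndTerm = route(q);
        String value = doc.get(fieldAndTerm[0]);
        if (value == null) {
            return false;
        }
        String term = fieldAndTerm[1].toLowerCase();
        for (String word : value.toLowerCase().split("\\W+")) {
            if (word.equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A stand-in for the indexed lucene.txt, with the body under "content".
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "lucene");
        doc.put("content", "this is the body for lucene file");
        String[] queries = { "title:lucene", "lucene", "text:lucene", "body:lucene" };
        for (String q : queries) {
            System.out.println(q + " -> " + (matches(doc, q) ? "FOUND" : "NOT FOUND"));
        }
    }
}
```

In this toy setup text:lucene and body:lucene miss simply because the document has no field with those names, and the bare word hits only because the default field happens to be the one holding the body. Checking which field names the module really writes to the index would confirm or rule this out.
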
> Thanks a lot.
>
> Best regards, Reyes
>
> ----- Original Message -----
> From: "Ivan Jelenic" <ivan.jelenic at nbs.yu>
> To: <opencms-dev at opencms.org>
> Sent: Tuesday, July 22, 2003 2:21 PM
> Subject: Re: [opencms-dev] Lucene search module
>
>
> > Hi!
> >
> > I have set up the Lucene module and it works fine. My registry contains
> > this:
> >
> > <luceneSearch>
> >   <mergeFactor>100000</mergeFactor>
> >   <permCheck>true</permCheck>
> >   <indexDir>/lucene/opencms/</indexDir>
> >   <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
> >   <subsearch>true</subsearch>
> >   <project>online</project>
> >   <docFactories>
> >    <pageDocFactory enabled="true">
> >     <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
> >    </pageDocFactory>
> >    <plainDocFactory enabled="true">
> >     <fileType name="plaintext">
> >      <extension>.txt</extension>
> >      <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
> >     </fileType>
> >     <fileType name="taggedtext">
> >      <extension>.html</extension>
> >      <extension>.htm</extension>
> >      <extension>.xml</extension>
> >      <!-- This will strip tags before processing -->
> >      <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
> >     </fileType>
> >    </plainDocFactory>
> >    <jspDocFactory enabled="true">
> >     <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
> >    </jspDocFactory>
> >    <xmlTemplateDocFactory enabled="false"/>
> >   </docFactories>
> >   <directories>
> >    <directory location="/Home/">
> >     <section>Test</section>
> >     <subsearch>true</subsearch>
> >    </directory>
> >   </directories>
> > </luceneSearch>
> >
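
As a way of reading the plainDocFactory section above: it describes a lookup from file extension to document class. The sketch below is only an illustration in plain Java, assuming the lookup uses the lowercase suffix after the last dot (not checked against the module source). ExtensionMapSketch and its methods are made-up names, not part of the module.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the module's extension map; names are illustrative.
class ExtensionMapSketch {
    private final Map<String, String> classByExtension = new HashMap<>();

    // Mirrors one <extension>/<class> pair inside a <fileType> element.
    void register(String extension, String className) {
        classByExtension.put(extension.toLowerCase(), className);
    }

    // Looks up the document class by the suffix after the last dot, if any.
    Optional<String> classFor(String resourceName) {
        int dot = resourceName.lastIndexOf('.');
        if (dot < 0) {
            return Optional.empty();
        }
        String extension = resourceName.substring(dot).toLowerCase();
        return Optional.ofNullable(classByExtension.get(extension));
    }

    // Builds the map from the values in the registry fragment quoted above.
    static ExtensionMapSketch fromQuotedRegistry() {
        String pkg = "net.grcomputing.opencms.search.lucene.";
        ExtensionMapSketch map = new ExtensionMapSketch();
        map.register(".txt", pkg + "PlainDocument");
        map.register(".html", pkg + "TaggedPlainDocument");
        map.register(".htm", pkg + "TaggedPlainDocument");
        map.register(".xml", pkg + "TaggedPlainDocument");
        return map;
    }

    public static void main(String[] args) {
        ExtensionMapSketch map = fromQuotedRegistry();
        for (String name : new String[] { "lucene.txt", "index.html", "manual.pdf" }) {
            System.out.println(name + " -> " + map.classFor(name).orElse("(no plain-text handler)"));
        }
    }
}
```

Under that assumption a .pdf resource has no entry in the map, so it would need its own fileType and document class before it could be indexed.
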
> > Next I will try to add search for PDF documents.
> > I hope this helps.
> > Best regards, Ivan
> >
> > ----- Original Message -----
> > From: "antonio reyes" <reyes_soy at hotmail.com>
> > To: <opencms-dev at opencms.org>
> > Sent: Tuesday, July 22, 2003 2:14 PM
> > Subject: Re: [opencms-dev] Lucene search module
> >
> >
> > > I'm interested in this module, but I don't know how to use it.
> > > I followed the steps in README.txt, and this is my
> > > opencms.log: ------------------------------------------------------
> > > [22.07.2003 13:58:10] <opencms_info> =====IndexManager=============================================================
> > > [22.07.2003 13:58:10] <opencms_info> Analyzer: org.apache.lucene.analysis.standard.StandardAnalyzer
> > > [22.07.2003 13:58:10] <opencms_info> Page DocumentFactory loaded
> > > [22.07.2003 13:58:10] <opencms_info> JSP DocumentFactory loaded
> > > [22.07.2003 13:58:10] <opencms_info> Plain DocumentFactory loaded
> > > [22.07.2003 13:58:10] <opencms_info> Extension map exists to handle plaintext
> > > [22.07.2003 13:58:10] <opencms_info> Extension map exists to handle taggedtext
> > > [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/
> > > [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/Aplicaciones/
> > > [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/
> > > [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/cursos_formacion_creacion_empresas/
> > > [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/premio_trayectoria_empresarial/
> > > [22.07.2003 13:58:10] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/programa_de_apoyo/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/convocatorias_abiertas/programa_insercion_aprendices_talleres_artesanales/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/Enlaces/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/estudios_publicaciones/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/estudios_publicaciones/Pruebas/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/mapa_web/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/Noticias/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/programas_proyectos/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/programas_proyectos/servicios_apoyo_asesoramiento_empresarial/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/Saluda/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: indexing /ec-web/Sugerencias/
> > > [22.07.2003 13:58:11] <opencms_info> IndexManager: 8 documents are being processed
> > > [22.07.2003 13:58:11] <opencms_info> Done
> > > =====IndexManager=============================================================
> > >
> > > --------------------------------------------------------------------------------
> > > This is my registry.xml: ------------------------------------------------------------
> > >
> > >         <!-- Begin luceneSearch  -->
> > >         <luceneSearch>
> > >             <mergeFactor>100000</mergeFactor>
> > >             <permCheck>true</permCheck>
> > >             <indexDir>c:\lucene</indexDir>
> > >             <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
> > >             <subsearch>true</subsearch>
> > >             <project>online</project>
> > >             <docFactories>
> > >                 <pageDocFactory enabled="true">
> > >                     <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
> > >                 </pageDocFactory>
> > >                 <plainDocFactory enabled="true">
> > >                     <fileType name="plaintext">
> > >                         <extension>.txt</extension>
> > >                         <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
> > >                     </fileType>
> > >                     <fileType name="taggedtext">
> > >                         <extension>.html</extension>
> > >                         <extension>.htm</extension>
> > >                         <extension>.xml</extension>
> > >                         <!-- This will strip tags before processing -->
> > >                         <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
> > >                     </fileType>
> > >                 </plainDocFactory>
> > >                 <jspDocFactory enabled="true">
> > >                     <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
> > >                 </jspDocFactory>
> > >                 <xmlTemplateDocFactory enabled="false"/>
> > >             </docFactories>
> > >             <directories>
> > >                 <directory location="/ec-web/">
> > >                     <section>ec-web</section>
> > >                     <subsearch>true</subsearch>
> > >                 </directory>
> > >             </directories>
> > >         </luceneSearch>
> > >
> > > --------------------------------------------------------------------------------
> > >
> > > Then I tried to run the sample JSP:
> > >
> > > http://vmaqueta2:8080/opencms/opencms/system/modules/net.grcomputing.opencms.search.lucene/elements/simple_search.jsp?q=lucene
> > >
> > > but it doesn't find anything.
> > >
> > > Can anybody help me?
> > >
> > > Thanks a lot.
> > >
> > > (Sorry for my English.)
> > >
> > > ----- Original Message -----
> > > From: "Olli Aro" <olli_aro at yahoo.co.uk>
> > > To: <opencms-dev at opencms.org>
> > > Sent: Tuesday, July 22, 2003 11:23 AM
> > > Subject: RE: [opencms-dev] Lucene search module
> > >
> > >
> > > > Sounds good :) Does it index the static export files only or the
> > > > dynamic files as well?
> > > >
> > > > Olli
> > > >
> > > > > -----Original Message-----
> > > > > From: opencms-dev-admin at opencms.org
> > > > > [mailto:opencms-dev-admin at opencms.org]On Behalf Of M Butcher
> > > > > Sent: 22 July 2003 07:29
> > > > > To: opencms-dev at opencms.org
> > > > > Subject: [opencms-dev] Lucene search module
> > > > >
> > > > >
> > > > > > For those of you interested, I've released a Lucene-based search
> > > > > > module under the GPL v.2 license. al-arenal.de has it up already
> > > > > > (including source and Javadoc links), and I've also submitted it to
> > > > > > the module sandbox on OpenCMS.org
> > > > >
> > > > > http://opencms.al-arenal.de/
> > > > >
> > > > > What it does (in a nutshell):
> > > > >
> > > > > > The module uses the Apache Lucene search engine (part of Jakarta) to
> > > > > > create search indexes of Resources (admin-specified, of course) in
> > > > > > the OpenCMS online project. There is a small utility class
> > > > > > (SearchHelper) and a simple search JSP (elements/simpleSearch.jsp)
> > > > > > that provide the necessary components to build a user interface, and
> > > > > > you may also use the Lucene APIs to create a more powerful UI, if you
> > > > > > prefer.
> > > > >
> > > > > > There is no admin interface currently, and the indexing is currently
> > > > > > done via a "Scheduled Task". If you install it, make sure you read
> > > > > > the docs/README.txt after importing with module management.
> > > > > >
> > > > > > Please email this list with questions, etc. I welcome feedback,
> > > > > > patches, and suggestions.
> > > > > >
> > > > > > I would like to thank Niklas Eklund, Bryan LaPlante and Paul D. Bain
> > > > > > for their interest and help, as well as the others on the list who
> > > > > > took time to answer some of my questions.
> > > > >
> > > > > Enjoy!
> > > > >
> > > > > Matt Butcher
> > > > > Global Resources for Computing
> > > > > --
> > > > > M Butcher <mbutcher at grcomputing.net>
> > > > > > _______________________________________________
> > > > > > This mail is sent to you from the opencms-dev mailing list
> > > > > > To change your list options, or to unsubscribe from the list,
> > > > > > please visit
> > > > > > http://mail.opencms.org/mailman/listinfo/opencms-dev
> > > > > ---
> > > > > Incoming mail is certified Virus Free.
> > > > > Checked by AVG anti-virus system (http://www.grisoft.com).
> > > > > Version: 6.0.495 / Virus Database: 294 - Release Date: 30/06/2003
> > > > >
> > > > ---
> > > > Outgoing mail is certified Virus Free.
> > > > Checked by AVG anti-virus system (http://www.grisoft.com).
> > > > Version: 6.0.495 / Virus Database: 294 - Release Date: 30/06/2003