[opencms-dev] Indexing News 2.1 for Lucene Search - Please help..

Hartmann, Waehrisch & Feykes GmbH hartmann at waehrisch-feykes.de
Mon Nov 10 09:01:01 CET 2003


Hi,

It is up to you to specify which news to index.
Have a look at this section of your registry.xml:
<contentDefinition type="news">
 <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
 <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
 <listMethod name="getNewsList">
  <param type="java.lang.Integer">1</param>
  <param type="java.lang.String">-1</param>
 </listMethod>
 <page uri="/news.html?__element=entry">
   <param method="getIntId" name="newsid"/>
 </page>
</contentDefinition>

The listMethod element declares which method of the content definition class
is used to retrieve the news items. Its first parameter is the channel id.
To index further channels you can either add another complete
contentDefinition section with a different channel id, or specify a different
listMethod, for example getSortedList, which takes a single parameter of type
String that controls the sorting.
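
A minimal sketch of the first option, assuming the second channel has the
id 2 (a made-up value for illustration; use the id of your own channel).
Everything else is copied unchanged from the section above:

```xml
<!-- Sketch only: an additional section for a second news channel.
     The channel id 2 is an assumption, not taken from a real setup. -->
<contentDefinition type="news">
 <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
 <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
 <listMethod name="getNewsList">
  <!-- first parameter: the channel id to index -->
  <param type="java.lang.Integer">2</param>
  <param type="java.lang.String">-1</param>
 </listMethod>
 <page uri="/news.html?__element=entry">
   <param method="getIntId" name="newsid"/>
 </page>
</contentDefinition>
```

Such a section would sit next to the existing one inside contentDefinitions.
If the second channel is displayed by a different page, the page uri would
presumably have to be adjusted as well.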

The fix for the null values is not ready yet, sorry.

Bye,
Stephan


----- Original Message ----- 
From: "Trevor Lee" <Trevor.Lee at 4Loop.com.au>
To: <opencms-dev at opencms.org>
Sent: Monday, November 10, 2003 3:20 AM
Subject: RE: [opencms-dev] Indexing News 2.1 for Lucene Search - Please help..


> Hi,
>
> I have multiple news channels which I'd like to index using Lucene Search
> (module 1.4).
>
> I've noticed that news is only indexed for channel id=1; the other news
> channels are not indexed.
>
> What do I need to do to get it to index all news channels as part of the
> Lucene Search (1.4)?
>
> Also is there any update on the issue that was raised below - NewsDocument
> checking for null?
>
> Please help as I need to get news going as a matter of urgency.
>
> Thank you for any help received.
>
> Cheers
> Trevor
>
> -----Original Message-----
> From: opencms-dev-admin at opencms.org
> [mailto:opencms-dev-admin at opencms.org]On Behalf Of Trevor Lee
> Sent: Friday, November 07, 2003 4:52 PM
> To: opencms-dev at opencms.org
> Subject: RE: [opencms-dev] Indexing News for Lucene Search - Please
> help..
>
>
> Hi Stephan,
>
> Thank you.
>
> Will you be releasing an updated news module as a result of this fix?
>
> Thank you in advance
>
> Cheers
> Trevor
>
> -----Original Message-----
> From: opencms-dev-admin at opencms.org
> [mailto:opencms-dev-admin at opencms.org]On Behalf Of Hartmann, Waehrisch &
> Feykes GmbH
> Sent: Thursday, November 06, 2003 6:51 PM
> To: opencms-dev at opencms.org
> Subject: Re: [opencms-dev] Indexing News for Lucene Search - Please
> help..
>
>
> The fix is to check in the NewsDocument class for null values.
> I'll check this.
>
> Bye,
> Stephan
>
> ----- Original Message -----
> From: "Trevor Lee" <Trevor.Lee at 4Loop.com.au>
> To: <opencms-dev at opencms.org>
> Sent: Thursday, November 06, 2003 4:29 AM
> Subject: RE: [opencms-dev] Indexing News for Lucene Search - Please help..
>
>
> > Hi Stephan,
> >
> > Thanks for that. I manually changed the a_info1, a_info2 and a_info3 in the
> > database to replace the null elements with an empty string and the news
> > items can be indexed.
> >
> > However, it seems that with news2.1 the three info fields are no longer
> > available on the form that is used to create a news entry.
> > This means that subsequent news items created will have a_info1 etc
> > defaulted to "NULL" in the database and hence cause the indexing to fail.
> >
> > Is there an easy fix for this?
> >
> > Thanks in advance for any help.
> >
> > Cheers
> > Trevor
> >
> > -----Original Message-----
> > From: opencms-dev-admin at opencms.org
> > [mailto:opencms-dev-admin at opencms.org]On Behalf Of Hartmann, Waehrisch &
> > Feykes GmbH
> > Sent: Wednesday, November 05, 2003 7:54 PM
> > To: opencms-dev at opencms.org
> > Subject: Re: [opencms-dev] Indexing News for Lucene Search - Please
> > help..
> >
> >
> > Hi Trevor,
> >
> > these classes were written for version 1.0 of the news module, but should
> > also work with version 2.1.
> > It seems that a_info1 contains a null, which shouldn't happen. Normally all
> > fields get initialized with an empty String.
> >
> > The page tag describes the page that is called to show a single news
> > entry; the parameters listed inside it are passed to that page.
> > <param method="getIntId" name="newsid"/>
> > tells it to append a parameter newsid=123, where the id is fetched by
> > calling the method getIntId on the ContentDefinition.
> >
> > Bye,
> > Stephan
> >
> > ----- Original Message -----
> > From: "Trevor Lee" <Trevor.Lee at 4Loop.com.au>
> > To: <opencms-dev at opencms.org>
> > Sent: Wednesday, November 05, 2003 7:07 AM
> > Subject: [opencms-dev] Indexing News for Lucene Search - Please help..
> >
> >
> > > Hi
> > >
> > > I have news2.1 and Lucene Search 1.4 installed on opencms 5.0
> > >
> > > I'm trying to index news items and need this functionality working very
> > > soon, so if anyone can help ....
> > >
> > > The following is what my registry.xml looks like in relation to lucene:
> > >         <luceneSearch>
> > >             <mergeFactor>100000</mergeFactor>
> > >             <permCheck>true</permCheck>
> > >             <indexDir>C:\Jakarta-Tomcat-4.1.12\webapps\opencms\lucene\index\</indexDir>
> > >             <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
> > >             <subsearch>true</subsearch>
> > >             <project>online</project>
> > >             <docFactories>
> > >                 <docFactory enabled="true" type="page">
> > >                     <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
> > >                 </docFactory>
> > >                 <docFactory enabled="true" type="plain">
> > >                     <fileType name="plaintext">
> > >                         <extension>.txt</extension>
> > >                         <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
> > >                     </fileType>
> > >                     <fileType name="taggedtext">
> > >                         <extension>.html</extension>
> > >                         <extension>.htm</extension>
> > >                         <extension>.xml</extension>
> > >                         <!-- This will strip tags before processing -->
> > >                         <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
> > >                     </fileType>
> > >                 </docFactory>
> > >                 <docFactory enabled="true" type="binary">
> > >                     <class>net.grcomputing.opencms.search.lucene.BodylessDocument</class>
> > >                 </docFactory>
> > >                 <docFactory enabled="true" type="jsp">
> > >                     <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
> > >                 </docFactory>
> > >                 <docFactory enabled="true" type="news">
> > >                     <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
> > >                 </docFactory>
> > >                 <docFactory enabled="false" type="XML Template"/>
> > >             </docFactories>
> > >             <directories>
> > >                 <directory location="/swm/">
> > >                     <section>Test</section>
> > >                     <subsearch>true</subsearch>
> > >                 </directory>
> > >             </directories>
> > >             <contentDefinitions>
> > >                 <contentDefinition type="news">
> > >                     <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
> > >                     <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
> > >                     <listMethod name="getNewsList">
> > >                         <param type="java.lang.Integer">1</param>
> > >                         <param type="java.lang.String">-1</param>
> > >                     </listMethod>
> > >                     <page uri="/news/news.jsp?__element=entry">
> > >                         <param method="getIntId" name="newsid"/>
> > >                     </page>
> > >                 </contentDefinition>
> > >             </contentDefinitions>
> > >         </luceneSearch>
> > >
> > > The news.jsp file is the one provided in the news2.1 zip file, which I've
> > > modified as follows:
> > > <jsp:useBean id="newsbean"
> > > class="com.opencms.modules.homepage.news.NewsContentDefinition"
> > > scope="page" />
> > > <%@page session="false" import="java.util.*, java.text.*,
> > > com.opencms.modules.homepage.news.*" %>
> > > <%@ taglib prefix="cms" uri="http://www.opencms.org/taglib/cms" %>
> > > <cms:template element="entry"> <!-- added this line -->
> > > <%
> > > String sID = request.getParameter("id");
> > > :
> > > :
> > > :
> > > %>
> > > </cms:template>
> > > I've added the element "entry" as per the instructions in the message
> > > below.
> > >
> > > When the lucene cron job runs I get the following error messages:
> > > [05.11.2003 05:55:10] <opencms_cronscheduler> Starting job for
> > > com.opencms.core.CmsCronEntry{55 5 * * * admin Administrators
> > > net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true}
> > > [05.11.2003 05:55:10] <opencms_info>
> > > =====IndexManager=============================================================
> > > [05.11.2003 05:55:10] <opencms_info> Analyzer:
> > > org.apache.lucene.analysis.standard.StandardAnalyzer
> > > [05.11.2003 05:55:10] <opencms_info> Extension map exists to handle
> > > plaintext
> > > [05.11.2003 05:55:10] <opencms_info> Extension map exists to handle
> > > taggedtext
> > > [05.11.2003 05:55:10] <opencms_info> JSP DocumentFactory loaded
> > > [05.11.2003 05:55:10] <opencms_info> Bodyless DocumentFactory loaded
> > > [05.11.2003 05:55:10] <opencms_info> Page DocumentFactory loaded
> > > [05.11.2003 05:55:10] <opencms_info> IndexManager: indexing /swm/
> > > :
> > > :
> > > [05.11.2003 05:55:12] <opencms_cronscheduler> Error running job for
> > > com.opencms.core.CmsCronEntry{55 5 * * * admin Administrators
> > > net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true}
> > > Error: java.lang.IllegalArgumentException: value cannot be null
> > > at org.apache.lucene.document.Field.<init>(Unknown Source)
> > > at org.apache.lucene.document.Field.UnStored(Unknown Source)
> > > at net.grcomputing.opencms.search.lucene.NewsDocument.Document(NewsDocument.java:140)
> > > at net.grcomputing.opencms.search.lucene.IndexManager.processContentDefinitions(IndexManager.java:437)
> > > at net.grcomputing.opencms.search.lucene.IndexManager.doIndex(IndexManager.java:240)
> > > at net.grcomputing.opencms.search.lucene.CronIndexManager.launch(CronIndexManager.java:107)
> > > at com.opencms.core.CmsCronScheduleJob.run(CmsCronScheduleJob.java:68)
> > >
> > >
> > > Is the error due to the <page> element in <contentDefinition type="news">?
> > >
> > > Thank you in advance.
> > >
> > > Cheers
> > >
> > > Trevor
> > > -----Original Message-----
> > > From: opencms-dev-admin at opencms.org
> > > [mailto:opencms-dev-admin at opencms.org]On Behalf Of Hartmann, Waehrisch &
> > > Feykes GmbH
> > > Sent: Wednesday, October 22, 2003 4:51 PM
> > > To: opencms-dev at opencms.org
> > > Subject: Re: [opencms-dev] (no subject)
> > >
> > >
> > > Hi Ben,
> > >
> > > I think this won't work, since the plainDocFactory will only be used for
> > > files of type "plain" but not for files of type "binary".
> > > Recently we have made some additions to the module - commissioned by
> > > Lenord, Bauer & Co. GmbH - that could meet your needs. They introduce a
> > > more flexible way of defining docFactories, so that you can add new
> > > factories without having to recompile the whole module. Other modules
> > > (like the news) can thus bring their own docFactory, and all you have to
> > > do is edit the registry.xml.
> > > Here is an example:
> > >
> > >             <docFactories>
> > >                 <docFactory enabled="true" type="plain">
> > >                     <fileType name="plaintext">
> > >                         <extension>.txt</extension>
> > >                         <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
> > >                     </fileType>
> > >                 </docFactory>
> > >                 <docFactory enabled="true" type="news">
> > >                     <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
> > >                 </docFactory>
> > >             </docFactories>
> > >
> > > To index binary files all you need to add is this:
> > >
> > >            <docFactory enabled="true" type="binary">
> > >                <class>net.grcomputing.opencms.search.lucene.BodylessDocument</class>
> > >            </docFactory>
> > >
> > > There should be no need for an extension mapping.
> > >
> > > For those who are interested:
> > > For ContentDefinitions (like news) I introduced the following:
> > >             <contentDefinitions>
> > >                 <!-- type must match docFactory type -->
> > >                 <contentDefinition type="news">
> > >                     <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
> > >                     <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
> > >                     <listMethod name="getNewsList">
> > >                         <param type="java.lang.Integer">1</param>
> > >                         <param type="java.lang.String">-1</param>
> > >                     </listMethod>
> > >                     <page uri="/news.html?__element=entry">
> > >                         <param method="getIntId" name="newsid"/>
> > >                     </page>
> > >                 </contentDefinition>
> > >             </contentDefinitions>
> > >
> > > In short:
> > > initClass: optional. For the news, the news classes have to be loaded to
> > > initialize the db pool.
> > > listMethod: a method of the content definition class that returns a List
> > > of elements.
> > > page: the page that can display an entry, here a JSP that has a template
> > > element "entry". It also needs the id of the news item.
> > > getIntId is a method of the content definition class and newsid is the URL
> > > parameter the page needs. A link like
> > > news.html?__element=entry&newsid=xy
> > > will be generated.
> > >
> > > Best regards,
> > > Stephan
> > >
> > >
> > >
> > > _______________________________________________
> > > This mail is send to you from the opencms-dev mailing list
> > > To change your list options, or to unsubscribe from the list, please visit
> > > http://mail.opencms.org/mailman/listinfo/opencms-dev



