[opencms-dev] Indexing News 2.1 for Lucene Search - Please help..

Trevor Lee Trevor.Lee at 4Loop.com.au
Mon Nov 10 03:22:01 CET 2003


Hi,

I have multiple news channels which I'd like to index using Lucene Search
(module 1.4).

I've noticed that only news in the channel with id=1 gets indexed; the other
news channels are not indexed.

What do I need to do to get all news channels indexed as part of the
Lucene search (1.4)?

Also, is there any update on the issue that was raised below (NewsDocument
checking for null)?

Please help as I need to get news going as a matter of urgency.

Thank you for any help received.

Cheers
Trevor

-----Original Message-----
From: opencms-dev-admin at opencms.org
[mailto:opencms-dev-admin at opencms.org]On Behalf Of Trevor Lee
Sent: Friday, November 07, 2003 4:52 PM
To: opencms-dev at opencms.org
Subject: RE: [opencms-dev] Indexing News for Lucene Search - Please
help..


Hi Stephan,

Thank you.

Will you be releasing an updated news module as a result of this fix?

Thank you in advance

Cheers
Trevor

-----Original Message-----
From: opencms-dev-admin at opencms.org
[mailto:opencms-dev-admin at opencms.org]On Behalf Of Hartmann, Waehrisch &
Feykes GmbH
Sent: Thursday, November 06, 2003 6:51 PM
To: opencms-dev at opencms.org
Subject: Re: [opencms-dev] Indexing News for Lucene Search - Please
help..


The fix is to check in the NewsDocument class for null values.
I'll check this.

Bye,
Stephan
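
A minimal sketch of what such a null check could look like, assuming the fix
simply maps null field values to empty strings before they are handed to
Lucene. The class and method names (NullSafeFields, nullToEmpty,
toIndexFields) are made up for illustration and are not taken from the
NewsDocument source; only Field.UnStored is known from the stack trace quoted
further down.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullSafeFields {

    /** Returns the value unchanged, or an empty string when it is null. */
    static String nullToEmpty(String value) {
        return value == null ? "" : value;
    }

    /**
     * Hypothetical stand-in for the step that turns news columns into index fields.
     * In the real class each value would presumably be passed on to
     * Field.UnStored(name, value), the call that rejects null in the stack trace.
     */
    static Map<String, String> toIndexFields(Map<String, String> raw) {
        Map<String, String> safe = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            safe.put(entry.getKey(), nullToEmpty(entry.getValue()));
        }
        return safe;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("a_info1", null);        // as left in the database by news2.1
        row.put("a_info2", "some text");
        System.out.println(toIndexFields(row));
    }
}
```

With a guard like this in place, rows whose a_info columns are NULL would be
indexed with empty fields instead of aborting the whole cron job.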

----- Original Message -----
From: "Trevor Lee" <Trevor.Lee at 4Loop.com.au>
To: <opencms-dev at opencms.org>
Sent: Thursday, November 06, 2003 4:29 AM
Subject: RE: [opencms-dev] Indexing News for Lucene Search - Please help..


> Hi Stephan,
>
> Thanks for that. I manually changed a_info1, a_info2 and a_info3 in the
> database to replace the null values with empty strings, and the news
> items can now be indexed.
>
> However, it seems that with news2.1 the three info fields are no longer
> available on the form that is used to create a news entry.
> This means that news items created from now on will have a_info1 etc.
> defaulted to NULL in the database and will cause the indexing to fail again.
>
> Is there an easy fix for this?
>
> Thanks in advance for any help.
>
> Cheers
> Trevor
>
> -----Original Message-----
> From: opencms-dev-admin at opencms.org
> [mailto:opencms-dev-admin at opencms.org]On Behalf Of Hartmann, Waehrisch &
> Feykes GmbH
> Sent: Wednesday, November 05, 2003 7:54 PM
> To: opencms-dev at opencms.org
> Subject: Re: [opencms-dev] Indexing News for Lucene Search - Please
> help..
>
>
> Hi Trevor,
>
> these classes were written for version 1.0 of the news module, but should
> also work with version 2.1.
> It seems that a_info1 is null, which it shouldn't be. Normally all
> fields get initialized with an empty String.
>
> The page tag describes the page that is called to show a single news
> entry, with some parameters passed to it.
> <param method="getIntId" name="newsid"/>
> tells it to append a parameter newsid=123, where the id is fetched by calling
> the method getIntId on the ContentDefinition.
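
The parameter handling described here can be sketched roughly as follows. This
is an illustration only: EntryLinkBuilder and appendParam are hypothetical
names, and whether the module URL-encodes the value or how it picks the
separator is an assumption, not something stated in the thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EntryLinkBuilder {

    /** URL-encodes a single name or value as UTF-8. */
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Appends name=value to the uri, using '&' if it already has a query string. */
    static String appendParam(String uri, String name, String value) {
        char separator = uri.indexOf('?') >= 0 ? '&' : '?';
        return uri + separator + encode(name) + "=" + encode(value);
    }

    public static void main(String[] args) {
        // The id would come from calling getIntId on the content definition.
        int id = 123;
        System.out.println(
            appendParam("/news/news.jsp?__element=entry", "newsid", String.valueOf(id)));
    }
}
```

Note that the news.jsp quoted further down reads a request parameter called
id, whereas the page tag is configured to send newsid. The thread does not say
whether this difference matters for indexing.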
>
> Bye,
> Stephan
>
> ----- Original Message -----
> From: "Trevor Lee" <Trevor.Lee at 4Loop.com.au>
> To: <opencms-dev at opencms.org>
> Sent: Wednesday, November 05, 2003 7:07 AM
> Subject: [opencms-dev] Indexing News for Lucene Search - Please help..
>
>
> > Hi
> >
> > I have news2.1 and Lucene Search 1.4 installed on OpenCms 5.0.
> >
> > I'm trying to index news items and need this functionality working very
> > soon, so if anyone can help ....
> >
> > The following is what my registry.xml looks like in relation to Lucene:
> >         <luceneSearch>
> >             <mergeFactor>100000</mergeFactor>
> >             <permCheck>true</permCheck>
> >             <indexDir>C:\Jakarta-Tomcat-4.1.12\webapps\opencms\lucene\index\</indexDir>
> >             <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
> >             <subsearch>true</subsearch>
> >             <project>online</project>
> >             <docFactories>
> >                 <docFactory enabled="true" type="page">
> >                     <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
> >                 </docFactory>
> >                 <docFactory enabled="true" type="plain">
> >                     <fileType name="plaintext">
> >                         <extension>.txt</extension>
> >                         <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
> >                     </fileType>
> >                     <fileType name="taggedtext">
> >                         <extension>.html</extension>
> >                         <extension>.htm</extension>
> >                         <extension>.xml</extension>
> >                         <!-- This will strip tags before processing -->
> >                         <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
> >                     </fileType>
> >                 </docFactory>
> >                 <docFactory enabled="true" type="binary">
> >                     <class>net.grcomputing.opencms.search.lucene.BodylessDocument</class>
> >                 </docFactory>
> >                 <docFactory enabled="true" type="jsp">
> >                     <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
> >                 </docFactory>
> >                 <docFactory enabled="true" type="news">
> >                     <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
> >                 </docFactory>
> >                 <docFactory enabled="false" type="XML Template"/>
> >             </docFactories>
> >             <directories>
> >                 <directory location="/swm/">
> >                     <section>Test</section>
> >                     <subsearch>true</subsearch>
> >                 </directory>
> >             </directories>
> >             <contentDefinitions>
> >                 <contentDefinition type="news">
> >                     <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
> >                     <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
> >                     <listMethod name="getNewsList">
> >                         <param type="java.lang.Integer">1</param>
> >                         <param type="java.lang.String">-1</param>
> >                     </listMethod>
> >                     <page uri="/news/news.jsp?__element=entry">
> >                         <param method="getIntId" name="newsid"/>
> >                     </page>
> >                 </contentDefinition>
> >             </contentDefinitions>
> >         </luceneSearch>
> >
> > The news.jsp file is the same as that provided in the news2.1 zip file.
> > I've modified it:
> > <jsp:useBean id="newsbean" class="com.opencms.modules.homepage.news.NewsContentDefinition" scope="page" />
> > <%@page session="false" import="java.util.*, java.text.*, com.opencms.modules.homepage.news.*" %>
> > <%@ taglib prefix="cms" uri="http://www.opencms.org/taglib/cms" %>
> > <cms:template element="entry"> <!-- added this line -->
> > <%
> > String sID = request.getParameter("id");
> > :
> > :
> > :
> > %>
> > </cms:template>
> > I've added the element "entry" as per the instructions in the message below.
> >
> > When the lucene cron job runs I get the following error messages:
> > [05.11.2003 05:55:10] <opencms_cronscheduler> Starting job for com.opencms.core.CmsCronEntry{55 5 * * * admin Administrators net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true}
> > [05.11.2003 05:55:10] <opencms_info> =====IndexManager=============================================================
> > [05.11.2003 05:55:10] <opencms_info> Analyzer: org.apache.lucene.analysis.standard.StandardAnalyzer
> > [05.11.2003 05:55:10] <opencms_info> Extension map exists to handle plaintext
> > [05.11.2003 05:55:10] <opencms_info> Extension map exists to handle taggedtext
> > [05.11.2003 05:55:10] <opencms_info> JSP DocumentFactory loaded
> > [05.11.2003 05:55:10] <opencms_info> Bodyless DocumentFactory loaded
> > [05.11.2003 05:55:10] <opencms_info> Page DocumentFactory loaded
> > [05.11.2003 05:55:10] <opencms_info> IndexManager: indexing /swm/
> > :
> > :
> > [05.11.2003 05:55:12] <opencms_cronscheduler> Error running job for com.opencms.core.CmsCronEntry{55 5 * * * admin Administrators net.grcomputing.opencms.search.lucene.CronIndexManager createIndex=true}
> > Error: java.lang.IllegalArgumentException: value cannot be null
> > at org.apache.lucene.document.Field.<init>(Unknown Source)
> > at org.apache.lucene.document.Field.UnStored(Unknown Source)
> > at net.grcomputing.opencms.search.lucene.NewsDocument.Document(NewsDocument.java:140)
> > at net.grcomputing.opencms.search.lucene.IndexManager.processContentDefinitions(IndexManager.java:437)
> > at net.grcomputing.opencms.search.lucene.IndexManager.doIndex(IndexManager.java:240)
> > at net.grcomputing.opencms.search.lucene.CronIndexManager.launch(CronIndexManager.java:107)
> > at com.opencms.core.CmsCronScheduleJob.run(CmsCronScheduleJob.java:68)
> >
> >
> > Is the error due to the <page> element in <contentDefinition type="news">?
> >
> > Thank you in advance.
> >
> > Cheers
> >
> > Trevor
> > -----Original Message-----
> > From: opencms-dev-admin at opencms.org
> > [mailto:opencms-dev-admin at opencms.org]On Behalf Of Hartmann, Waehrisch &
> > Feykes GmbH
> > Sent: Wednesday, October 22, 2003 4:51 PM
> > To: opencms-dev at opencms.org
> > Subject: Re: [opencms-dev] (no subject)
> >
> >
> > Hi Ben,
> >
> > I think this won't work, since the plainDocFactory will only be used for
> > files of type "plain" but not for files of type "binary".
> > Recently we have made some additions to the module (commissioned by Lenord,
> > Bauer & Co. GmbH) that could meet your needs. They introduce a more flexible
> > way of defining docFactories, so that you can add new factories without
> > having to recompile the whole module. Other modules (like the news module)
> > can bring their own docFactory, and all you have to do is edit the
> > registry.xml.
> > Here is an example:
> >
> >             <docFactories>
> >                 <docFactory enabled="true" type="plain">
> >                     <fileType name="plaintext">
> >                         <extension>.txt</extension>
> >                         <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
> >                     </fileType>
> >                 </docFactory>
> >                 <docFactory enabled="true" type="news">
> >                     <class>net.grcomputing.opencms.search.lucene.NewsDocument</class>
> >                 </docFactory>
> >             </docFactories>
> >
> > To index binary files all you need to add is this:
> >
> >            <docFactory enabled="true" type="binary">
> >                <class>net.grcomputing.opencms.search.lucene.BodylessDocument</class>
> >            </docFactory>
> >
> > There should be no need for an extension mapping.
> >
> > For those interested:
> > For ContentDefinitions (like news) I introduced the following:
> >             <contentDefinitions>
> >                 <contentDefinition type="news"> <!-- must match docFactory type -->
> >                     <class>com.opencms.modules.homepage.news.NewsContentDefinition</class>
> >                     <initClass>net.grcomputing.opencms.search.lucene.NewsInitialization</initClass>
> >                     <listMethod name="getNewsList">
> >                         <param type="java.lang.Integer">1</param>
> >                         <param type="java.lang.String">-1</param>
> >                     </listMethod>
> >                     <page uri="/news.html?__element=entry">
> >                         <param method="getIntId" name="newsid"/>
> >                     </page>
> >                 </contentDefinition>
> >             </contentDefinitions>
> >
> > In short:
> > initClass is optional. For the news module, the news classes have to be
> > loaded to initialize the db pool.
> > listMethod: a method of the content definition class that returns a List of
> > elements.
> > page: the page that can display an entry. Here it is a JSP that has a
> > template element "entry". It also needs the id of the news item.
> > getIntId is a method of the content definition class, and newsid is the URL
> > parameter the page needs. A link like
> > news.html?__element=entry&newsid=xy
> > will be generated.
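
To make the listMethod description more concrete, here is a rough sketch of how
a configured method name and its typed params could be resolved and called via
reflection. It is not the module's actual code: ListMethodSketch,
DemoContentDefinition, convert and invokeListMethod are hypothetical, the
sketch assumes a static list method, and it only handles the two param types
that appear in the example configuration. What the Integer and String
arguments of getNewsList mean (for instance whether the 1 selects a news
channel) is not stated in the thread.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

public class ListMethodSketch {

    /** Hypothetical stand-in for a content definition; not the real NewsContentDefinition. */
    public static class DemoContentDefinition {
        public static List<String> getNewsList(Integer first, String second) {
            return Arrays.asList("item for " + first + "/" + second);
        }
    }

    /** Converts the text of a param element into an object of the declared type. */
    static Object convert(String typeName, String text) {
        if ("java.lang.Integer".equals(typeName)) {
            return Integer.valueOf(text);
        }
        if ("java.lang.String".equals(typeName)) {
            return text;
        }
        throw new IllegalArgumentException("unsupported param type: " + typeName);
    }

    /** Looks up methodName with the declared param types and calls it as a static method. */
    static Object invokeListMethod(Class<?> owner, String methodName, String[][] params) {
        try {
            Class<?>[] types = new Class<?>[params.length];
            Object[] values = new Object[params.length];
            for (int i = 0; i < params.length; i++) {
                types[i] = Class.forName(params[i][0]);
                values[i] = convert(params[i][0], params[i][1]);
            }
            Method method = owner.getMethod(methodName, types);
            return method.invoke(null, values); // null target: static method assumed
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Mirrors the two param elements of the listMethod example above.
        String[][] params = {
            {"java.lang.Integer", "1"},
            {"java.lang.String", "-1"},
        };
        System.out.println(
            invokeListMethod(DemoContentDefinition.class, "getNewsList", params));
    }
}
```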
> >
> > Best regards,
> > Stephan
> >
> >
> >
> > _______________________________________________
> > This mail is send to you from the opencms-dev mailing list
> > To change your list options, or to unsubscribe from the list, please
visit
> > http://mail.opencms.org/mailman/listinfo/opencms-dev