[opencms-dev] FILE_CONTENT not found

Darin Kuntze dkuntze at thinksacco.com
Thu Apr 15 20:58:02 CEST 2004


Here's the Lucene part of my registry XML:


        <luceneSearch>
            <mergeFactor>100000</mergeFactor>
            <permCheck>true</permCheck>
            <indexDir>/opt/lucene/index/opencms/</indexDir>
            <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
            <subsearch>true</subsearch>
            <project>Online</project>
            <docFactories>
                <docFactory enabled="true" type="page">
                    <class>net.grcomputing.opencms.search.lucene.PageDocument</class>
                </docFactory>
                <docFactory enabled="true" type="plain">
                    <fileType name="plaintext">
                        <extension>.txt</extension>
                        <class>net.grcomputing.opencms.search.lucene.PlainDocument</class>
                    </fileType>
                    <fileType name="taggedtext">
                        <extension>.html</extension>
                        <extension>.htm</extension>
                        <extension>.jsp</extension>
                        <!-- This will strip tags before processing -->
                        <class>net.grcomputing.opencms.search.lucene.TaggedPlainDocument</class>
                    </fileType>
                </docFactory>
                <docFactory enabled="false" type="jsp">
                    <class>net.grcomputing.opencms.search.lucene.JspDocument</class>
                </docFactory>
                <docFactory enabled="false" type="XML Template"/>
                <docFactory enabled="true" type="binary">
                    <fileType name="pdftext">
                        <extension>.pdf</extension>
                        <class>net.grcomputing.opencms.search.lucene.PDFDocument</class>
                    </fileType>
                </docFactory>
            </docFactories>
            <directories>
                <directory location="/dept/">
                    <section>Department</section>
                    <subsearch>true</subsearch>
                </directory>
                <directory location="/pdfs/">
                    <section>PDFs</section>
                    <subsearch>true</subsearch>
                </directory>
                <directory location="/primary/">
                    <section>MainSite</section>
                    <subsearch>true</subsearch>
                </directory>
                <directory location="/statements/">
                    <section>Statements</section>
                    <subsearch>true</subsearch>
                </directory>
            </directories>
        </luceneSearch>
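
Since the stack trace quoted below ends up in PDFDocument, it may be worth double-checking which docFactory entries are actually enabled in a fragment like this one. The following is only a rough sketch using the JDK's DOM parser; the class name DocFactoryLister and the method enabledTypes are made up for illustration and are not part of OpenCms or the Lucene module.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of OpenCms or the Lucene search module.
class DocFactoryLister {
    // Returns the "type" attribute of every <docFactory> whose
    // "enabled" attribute is exactly "true", in document order.
    static List<String> enabledTypes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList factories = doc.getElementsByTagName("docFactory");
            List<String> types = new ArrayList<>();
            for (int i = 0; i < factories.getLength(); i++) {
                Element factory = (Element) factories.item(i);
                if ("true".equals(factory.getAttribute("enabled"))) {
                    types.add(factory.getAttribute("type"));
                }
            }
            return types;
        } catch (Exception e) {
            // Malformed XML or parser setup problems end up here.
            throw new IllegalArgumentException("could not parse registry fragment", e);
        }
    }
}
```

Fed the luceneSearch fragment above as a string, this should list page, plain and binary, which matches the enabled="true" attributes and confirms that PDFs are being handed to the indexer.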

-----Original Message-----
From: opencms-dev-admin at opencms.org [mailto:opencms-dev-admin at opencms.org]
On Behalf Of M Butcher
Sent: Thursday, April 15, 2004 12:29 PM
To: opencms-dev at opencms.org
Subject: Re: [opencms-dev] FILE_CONTENT not found



Hmmm... that doesn't sound like it's choking on index.jsp, does it? What 
does your registry XML look like? I wonder if the PDF Box classes are 
having trouble parsing one or more of your PDF files.

Matt
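
If one or more PDFs really are tripping up the parser, a common way to narrow it down is to catch the failure per file, record the file name, and carry on with the rest. The sketch below only illustrates that idea: TextExtractor and SkipBadFiles are hypothetical names standing in for whatever does the extraction, and none of this is the actual IndexManager or PDF Box code.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the real IndexManager from the Lucene module.
class SkipBadFiles {
    // Stand-in for whatever turns file bytes into text (for example a PDF extractor).
    interface TextExtractor {
        String extract(byte[] content) throws IOException;
    }

    // Extracts text from every file. A file whose extraction throws an
    // IOException is skipped, and "name: message" is appended to failed.
    static Map<String, String> extractAll(Map<String, byte[]> files,
                                          TextExtractor extractor,
                                          List<String> failed) {
        Map<String, String> texts = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : files.entrySet()) {
            try {
                texts.put(entry.getKey(), extractor.extract(entry.getValue()));
            } catch (IOException e) {
                // Remember the offending file instead of aborting the whole run.
                failed.add(entry.getKey() + ": " + e.getMessage());
            }
        }
        return texts;
    }
}
```

Note that this only handles IOException, like the one in the log. It would not catch an OutOfMemoryError, so on its own it would not stop the memory from filling up, but the failed list would at least show which files to look at first.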

Darin Kuntze wrote:
> The log message I'm getting looks telling:
> 
> java.io.IOException: expected='obj' actual='obj<</H[576' pdfSource 0x21
>         at org.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:261)
>         at org.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:93)
>         at org.textmining.text.extraction.PDFExtractor.extractText(PDFExtractor.java:37)
>         at net.grcomputing.opencms.search.lucene.PDFDocument.Document(Unknown Source)
>         at net.grcomputing.opencms.search.lucene.IndexManager.processFile(Unknown Source)
>         at net.grcomputing.opencms.search.lucene.IndexManager.processDir(Unknown Source)
>         at net.grcomputing.opencms.search.lucene.IndexManager.doIndex(Unknown Source)
>         at net.grcomputing.opencms.search.lucene.CronIndexManager.launch(Unknown Source)
>         at com.opencms.core.CmsCronScheduleJob.run(CmsCronScheduleJob.java:68)
> 
> It eats up all the memory and then kills Tomcat.
> 
> -----Original Message-----
> From: opencms-dev-admin at opencms.org 
> [mailto:opencms-dev-admin at opencms.org]
> On Behalf Of M Butcher
> Sent: Thursday, April 15, 2004 10:18 AM
> To: opencms-dev at opencms.org
> Subject: Re: [opencms-dev] FILE_CONTENT not found
> 
> 
> Does the file have content? Sounds like CmsFile.getContents() is
> getting an error.
> 
> You can use this SQL to check the contents in the database:
> 
> select a.RESOURCE_NAME, b.FILE_CONTENT
>    from CMS_RESOURCES as a, CMS_FILES as b
>    where a.FILE_ID = b.FILE_ID and
>      a.RESOURCE_NAME = '/default/vfs/index.jsp';
> 
> Matt
> 
> 
> 
> Darin Kuntze wrote:
> 
>>I'm getting this error in my opencms.log:
>>[15.04.2004 02:03:20] <opencms_critical> IndexManager: CMS Error 
>>processing file index.jsp: com.opencms.core.CmsException: 4 Sql 
>>exception. Detailed error: [com.opencms.file.mySql.CmsDbAccess] Column 
>>'FILE_CONTENT' not found..
>> 
>>Something is causing the site to hang... I'm guessing this has
>>something to do with it.
>> 
>><http://www.thinksacco.com/> 	Darin Kuntze
>>/Senior Technologist/
>>*The Sacco Group*
>>402.392.2222 x120
>>
>> 
> 
> 
> _______________________________________________
> This mail is sent to you from the opencms-dev mailing list
> To change your list options, or to unsubscribe from the list, please 
> visit http://mail.opencms.org/mailman/listinfo/opencms-dev
> 
> 
> 
