[opencms-dev] FILE_CONTENT not found

Darin Kuntze dkuntze at thinksacco.com
Fri Apr 16 21:51:00 CEST 2004


I had to make a couple of tweaks to the build.xml file, but I was able to get it
to build fairly easily using Eclipse. It looks like it is working with the
Lucene module fairly well. It gets through my long list of PDFs without
running out of memory. I am getting the following warning, though:

java.lang.Throwable: Warning: You did not close the PDF Document
        at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:386)
        at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
        at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
        at java.lang.ref.Finalizer.access$100(Finalizer.java:14)

I'm not sure where the warning comes from; I haven't had time to look into it.
It could be in the module code, triggered by the new version of PDFBox.
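
For what it's worth, the stack trace shows the message being raised from
COSDocument.finalize(), i.e. when a document object is garbage-collected
without ever having been closed. A minimal sketch of the usual remedy follows:
close the document in a finally block so it is released on every exit path.
StubDocument here is a hypothetical stand-in, not a PDFBox class, and where
the module actually creates its document object has not been checked.

```java
import java.io.Closeable;
import java.io.IOException;

/** Sketch: always close a document handle, even when extraction fails. */
final class CloseAfterExtract {

    /** Hypothetical stand-in for a PDF document handle; not a PDFBox class. */
    static final class StubDocument implements Closeable {
        private final String text;
        private boolean closed;

        StubDocument(String text) {
            this.text = text;
        }

        /** Returns the text, or throws if the document is closed or unparseable. */
        String getText() throws IOException {
            if (closed) {
                throw new IOException("document already closed");
            }
            if (text == null) {
                throw new IOException("unparseable document");
            }
            return text;
        }

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Extracts the text and closes the document on every exit path. */
    static String extractAndClose(StubDocument doc) throws IOException {
        try {
            return doc.getText();
        } finally {
            doc.close(); // runs on normal return and when getText() throws
        }
    }

    public static void main(String[] args) {
        StubDocument good = new StubDocument("hello");
        StubDocument bad = new StubDocument(null);
        try {
            System.out.println(extractAndClose(good) + " closed=" + good.isClosed());
            extractAndClose(bad);
        } catch (IOException e) {
            System.out.println("failed, closed=" + bad.isClosed());
        }
    }
}
```

In the module, the same try/finally would wrap whatever call hands the parsed
document to the text extractor, so a parse failure on one PDF does not leave
the document open.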

-----Original Message-----
From: opencms-dev-admin at opencms.org [mailto:opencms-dev-admin at opencms.org]
On Behalf Of M Butcher
Sent: Friday, April 16, 2004 5:12 AM
To: opencms-dev at opencms.org
Subject: Re: [opencms-dev] FILE_CONTENT not found



I don't know. When you find out, tell me!

I tried building that 0.6.4 version from source several months ago, but 
there were some bugs that prevented it from compiling, so I gave up.

Matt

Darin Kuntze wrote:
> Thanks Stephan!
> 
> Matt:
> 
> If I build a copy of PDFBox 0.6.5 and deploy it with the Lucene module, 
> will I run into any compatibility issues?
> 
> -----Original Message-----
> From: opencms-dev-admin at opencms.org 
> [mailto:opencms-dev-admin at opencms.org]
> On Behalf Of Stephan Hartmann
> Sent: Thursday, April 15, 2004 2:45 PM
> To: opencms-dev at opencms.org
> Subject: Re: [opencms-dev] FILE_CONTENT not found
> 
> 
> This is a bug in PDFBox versions before 0.6.0.
> 
> Bye,
> Stephan
> 
> ----- Original Message -----
> From: "Darin Kuntze" <dkuntze at thinksacco.com>
> To: <opencms-dev at opencms.org>
> Sent: Thursday, April 15, 2004 6:59 PM
> Subject: RE: [opencms-dev] FILE_CONTENT not found
> 
> 
> The log message I'm getting looks telling:
> 
> java.io.IOException: expected='obj' actual='obj<</H[576' pdfSource 0x21
>         at org.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:261)
>         at org.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:93)
>         at org.textmining.text.extraction.PDFExtractor.extractText(PDFExtractor.java:37)
>         at net.grcomputing.opencms.search.lucene.PDFDocument.Document(Unknown Source)
>         at net.grcomputing.opencms.search.lucene.IndexManager.processFile(Unknown Source)
>         at net.grcomputing.opencms.search.lucene.IndexManager.processDir(Unknown Source)
>         at net.grcomputing.opencms.search.lucene.IndexManager.doIndex(Unknown Source)
>         at net.grcomputing.opencms.search.lucene.CronIndexManager.launch(Unknown Source)
>         at com.opencms.core.CmsCronScheduleJob.run(CmsCronScheduleJob.java:68)
> 
> It eats up all the memory then kills tomcat.
> 
> -----Original Message-----
> From: opencms-dev-admin at opencms.org 
> [mailto:opencms-dev-admin at opencms.org]
> On Behalf Of M Butcher
> Sent: Thursday, April 15, 2004 10:18 AM
> To: opencms-dev at opencms.org
> Subject: Re: [opencms-dev] FILE_CONTENT not found
> 
> 
> Does the file have content? It sounds like CmsFile.getContents() is 
> hitting an error.
> 
> You can use this SQL to check the contents in the database:
> 
> select a.RESOURCE_NAME, b.FILE_CONTENT
>    from CMS_RESOURCES as a, CMS_FILES as b
>    where a.FILE_ID = b.FILE_ID and
>      a.RESOURCE_NAME = '/default/vfs/index.jsp';
> 
> Matt
> 
> 
> 
> Darin Kuntze wrote:
> 
>>I'm getting this error in my opencms.log:
>>[15.04.2004 02:03:20] <opencms_critical> IndexManager: CMS Error
>>processing file index.jsp: com.opencms.core.CmsException: 4 Sql 
>>exception. Detailed error: [com.opencms.file.mySql.CmsDbAccess] Column 
>>'FILE_CONTENT' not found..
>>
>>Something is causing the site to hang... I'm guessing this has
>>something to do with it.
>>
>>Darin Kuntze
>>Senior Technologist
>>The Sacco Group
>><http://www.thinksacco.com/>
>>402.392.2222 x120
>>
>>
> 
> 
> _______________________________________________
> This mail is sent to you from the opencms-dev mailing list
> To change your list options, or to unsubscribe from the list, please 
> visit http://mail.opencms.org/mailman/listinfo/opencms-dev
