[opencms-dev] lucene in OpenCms 5 for Excel
Shi Yusen
shiys at langhua.cn
Sun Jan 9 14:40:45 CET 2005
Hi Peter,
Try the following code to index MS Excel files:
package net.grcomputing.opencms.search.lucene;
import com.opencms.core.CmsException;
import com.opencms.file.CmsFile;
import com.opencms.file.CmsObject;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import jxl.Cell;
import jxl.CellType;
import jxl.Sheet;
import jxl.Workbook;
import jxl.read.biff.BiffException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
/**
 * Creates an org.apache.lucene.document.Document from an MS Excel file
 * so that the text (label) cells of the file can be indexed.
 *
 * Title: ExcelDocument
 * Company: Beijing Langhua Ltd.
 *
 * @author Shi Yusen
 * @link http://www.langhua.cn/
 * @version 1.0
 */
public class ExcelDocument extends BodylessDocument {

    public static final String FACTORY_NAME = "MS Excel DocumentFactory";

    public ExcelDocument() {
    }

    public String getFactoryName() {
        return FACTORY_NAME;
    }

    public Document Document(CmsObject cmso, CmsFile f) throws CmsException {
        // Collect the text in a buffer. Starting from a null String and
        // using += would prefix the indexed body with the word "null".
        StringBuffer bodyText = new StringBuffer();
        Document doc = super.Document(cmso, f);
        f = cmso.readFile(f.getAbsolutePath());
        InputStream in = new ByteArrayInputStream(f.getContents());
        Workbook wb = null;
        try {
            wb = Workbook.getWorkbook(in);
            Sheet[] sheets = wb.getSheets();
            for (int i = 0; i < sheets.length; i++) {
                for (int j = 0; j < sheets[i].getRows(); j++) {
                    Cell[] cells = sheets[i].getRow(j);
                    for (int k = 0; k < cells.length; k++) {
                        // Only label (text) cells are indexed.
                        if (cells[k].getType().equals(CellType.LABEL)) {
                            bodyText.append(cells[k].getContents()).append("\n");
                        }
                    }
                }
            }
        } catch (BiffException e1) {
            throw new CmsException(e1.getMessage());
        } catch (IOException e1) {
            throw new CmsException(e1.getMessage());
        } finally {
            // Free the resources held by the workbook.
            if (wb != null) {
                wb.close();
            }
        }
        // Add the body fields only if some text was found.
        if (bodyText.length() > 0) {
            String text = bodyText.toString();
            doc.add(Field.Text(FIELD_BODY, text));
            doc.add(Field.UnStored(FIELD_BULK, text));
        }
        return doc;
    }

    public Document Document(CmsObject cmso, CmsFile f, HashMap h)
            throws CmsException {
        return Document(cmso, f);
    }
}
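One thing to watch when collecting the cell text: in Java, appending with += to a String that starts out as null gives a result that begins with the literal word "null", and that word would then end up in the index. Below is a minimal standalone sketch of the difference. It needs neither OpenCms nor JExcelApi, and the class and method names (LabelTextDemo, joinWithNullStart, joinWithBuilder) are made up for illustration only.

```java
import java.util.Arrays;
import java.util.List;

public class LabelTextDemo {

    // Pattern to avoid: a null String concatenated with += is
    // converted to the text "null" before the first label is appended.
    static String joinWithNullStart(List<String> labels) {
        String text = null;
        for (String label : labels) {
            text += label + "\n";
        }
        return text;
    }

    // Safer pattern: accumulate in a StringBuilder and return null
    // only when no label was found at all.
    static String joinWithBuilder(List<String> labels) {
        StringBuilder text = new StringBuilder();
        for (String label : labels) {
            text.append(label).append('\n');
        }
        return text.length() > 0 ? text.toString() : null;
    }

    public static void main(String[] args) {
        // Hypothetical label cells, standing in for Cell.getContents() values.
        List<String> labels = Arrays.asList("Name", "Total");
        System.out.println(joinWithNullStart(labels).startsWith("null"));
        System.out.println(joinWithBuilder(labels).startsWith("Name"));
    }
}
```

On a JDK older than 5, StringBuffer can be used in the same way as StringBuilder.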
Regards,
Shi Yusen / Beijing Langhua Ltd.
-----Original Message-----
From: Peter Korn [mailto:peter_korn at gmx.de]
Sent: January 7, 2005 18:02
To: opencms-dev at opencms.org
Subject: [opencms-dev] lucene in OpenCms 5 for Exel
Hi,
Is it possible to use Lucene in OpenCms 5 for Excel files?
Thanks,
Peter