net.grcomputing.opencms.search.lucene
Class TaggedPlainDocument
java.lang.Object
|
+--net.grcomputing.opencms.search.lucene.TaggedPlainDocument
- All Implemented Interfaces:
- I_DocumentConstants, I_DocumentFactory
- public class TaggedPlainDocument
- extends java.lang.Object
- implements I_DocumentConstants, I_DocumentFactory
This class serves as a document factory for OpenCMS resources. It produces
Lucene Document objects that contain the correct fields for indexing OpenCMS
resources. Unlike some of the other Lucene implementations, this one is
tightly coupled to the OpenCMS API, so it can take advantage of properties,
security settings, etc.
This class handles pages with tagged data, e.g. XML and HTML and their
derivatives.
- Author:
- Matt Butcher mbutcher@grcomputing.net
- See Also:
- http://grcomputing.net
Method Summary
Document Document(CmsObject cmso, CmsFile f)
- Takes a tagged Plain instance and builds a Lucene Document suitable for
index generation.
Document Document(CmsObject cmso, CmsFile f, java.util.HashMap p)
- Takes a Plain instance of tagged content and builds a Lucene Document
suitable for index generation.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TaggedPlainDocument
public TaggedPlainDocument()
Document
public Document Document(CmsObject cmso,
CmsFile f,
java.util.HashMap p)
throws CmsException
- Takes a Plain instance of tagged content and builds a Lucene Document
suitable for index generation. This is for Plain documents that contain
tagged data -- that is, HTML, XML, and their derivatives. Like the default
PageDocument parser, this one uses the fast tag stripper, which will
simply strip out all of the tags from a document. Information stored
in element attributes will not make it into the index. All parsed
character data will make it in, even if it is JavaScript or CSS.
- Specified by:
Document
in interface I_DocumentFactory
- Throws:
CmsException
- if it cannot work with the CmsFile or CmsObject.
- See Also:
FastTagStripper
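The tag-stripping behaviour described above can be illustrated with a minimal sketch. This is not the FastTagStripper implementation, whose source is not shown here; the class name TagStripSketch and the method stripTags are hypothetical and exist only for this example. It assumes that a tag is everything between a '<' and the next '>', so attribute values are discarded while script and style text is kept.

```java
// Hypothetical illustration only; not the real FastTagStripper.
final class TagStripSketch {
    private TagStripSketch() {}

    /** Drops everything between '<' and the next '>' and keeps the remaining text. */
    static String stripTags(String markup) {
        StringBuilder text = new StringBuilder(markup.length());
        boolean inTag = false;
        for (int i = 0; i < markup.length(); i++) {
            char c = markup.charAt(i);
            if (c == '<') {
                inTag = true;                 // start skipping tag name and attributes
            } else if (c == '>' && inTag) {
                inTag = false;
                text.append(' ');             // keep words on either side of a tag apart
            } else if (!inTag) {
                text.append(c);               // character data, including script/style text
            }
        }
        // Collapse the whitespace introduced where tags were removed.
        return text.toString().trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String html = "<p title=\"hidden\">Hello <b>world</b></p>"
                    + "<script>var x = 1;</script>";
        System.out.println(stripTags(html));
    }
}
```

In this sketch the title attribute never reaches the output, while the contents of the script element do, which matches the indexing caveat in the method description. A quoted attribute value that itself contains '>' would end the tag early here; the sketch does not try to handle that case.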
Document
public Document Document(CmsObject cmso,
CmsFile f)
throws CmsException
- Takes a tagged Plain instance and builds a Lucene Document suitable for
index generation. Convenience method.
- Specified by:
Document
in interface I_DocumentFactory
- Throws:
CmsException
- if it cannot work with the CmsFile or CmsObject.
Copyright © 2003 Matt Butcher of Global Resources for Computing. Reproduction and modification of this document are allowed in accordance with the GPL v2. Refer to COPYING.txt for information on acceptable use.