AW: [opencms-dev] Search engine / pure Java

Thomas Maerz thomasmaerz at gmx.de
Tue Jan 14 15:22:29 CET 2003


>> My thinking was to temporarily export the sites statically and then
>> create a separate Lucene index for each user. I also have the lists of
>> files that each user's access permissions cover.
>> But then I wasn't able to export the files statically: because of the
>> different access permissions, the files must not be exported. At that
>> point I chose to do something else, because of the database access
>> errors.
>
> I think I'd index them all together in one index set and just let the
> httpd's access mechanism handle the rights. If you're worried about
> accidentally revealing confidential info through the description texts
> of the search hits, you can omit those from your list of search results.
>
> Or you could include an extra field in your index to store the access
> level of the document and use it during your searches in addition to
> the user's query string.

But then there's still the problem of creating the index! I wasn't able
to export the data statically to files to create an index, because I get
database access errors and I _cannot_ export any file which has any
access restrictions. I could maybe rewrite the export classes, but I'd
rather not touch any core code of OpenCms.
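
For reference, the extra-field suggestion quoted above could look
roughly like the following. This is only a sketch in plain Java with no
Lucene calls: the class AccessFilteredIndex, its methods, and the
numeric access levels are all made up for illustration, and it does
nothing about the export problem itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a search index; none of these names come
// from OpenCms or Lucene.
class AccessFilteredIndex {

    // One indexed page: its path, its text, and the access level stored
    // with it as an extra field (0 = public, higher = more restricted).
    private static final class Entry {
        final String path;
        final String text;
        final int accessLevel;

        Entry(String path, String text, int accessLevel) {
            this.path = path;
            this.text = text;
            this.accessLevel = accessLevel;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String path, String text, int accessLevel) {
        entries.add(new Entry(path, text, accessLevel));
    }

    // Returns the paths whose text contains the term AND whose stored
    // access level does not exceed the level of the searching user.
    List<String> search(String term, int userLevel) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Entry e : entries) {
            boolean matches = e.text.toLowerCase(Locale.ROOT).contains(needle);
            boolean allowed = e.accessLevel <= userLevel;
            if (matches && allowed) {
                hits.add(e.path);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        AccessFilteredIndex index = new AccessFilteredIndex();
        index.add("/export/news.html", "OpenCms release news", 0);
        index.add("/export/intranet/plan.html", "Internal release plan", 2);
        System.out.println(index.search("release", 0)); // public pages only
        System.out.println(index.search("release", 2)); // both pages
    }
}
```

With a real index the access level would be stored as one more field of
each document and combined with the user's query at search time; the
filtering logic stays the same.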

I already had to update my own navigation class (something about
"LinkSubstitution.*"); but copy & paste isn't always sufficient.

>> By the way, I didn't adapt the Lucene module; I just created a JSP file
>> which executes a Java command to update the index:
>>
>> ,----[ not very nice, but works for me ]
>> | java -classpath /lib/lucene-1.2.jar:/ \
>> | org.apache.lucene.demo.IndexHTML -create -index /export/index /export/
>> `----
>
> Seems a nice idea, but I'd use it in addition to a cron'ed call,
> and I'd suggest spawning it into an off-JSP thread; also, using -create
> for every run looks inefficient.

Yes, this is total crap for hourly or daily use; but once a week you
may allow an admin to update the index. This isn't a bug, this is a
feature to test our machine. ;-)
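
The off-JSP thread suggestion could be sketched roughly as follows.
BackgroundIndexer and its methods are hypothetical names, not OpenCms
or Lucene code; the command to run is simply passed in, for example the
IndexHTML call shown above. The sketch only makes sure the page does not
block and that two updates never run at the same time.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: runs an external indexing command on its own
// thread so the calling JSP can return at once.
class BackgroundIndexer {

    private final List<String> command;
    private final AtomicBoolean running = new AtomicBoolean(false);
    private volatile Thread worker;
    private volatile int lastExitCode = -1;

    BackgroundIndexer(List<String> command) {
        this.command = new ArrayList<>(command);
    }

    // Starts the command unless a previous run is still active.
    // Returns true if a new run was started.
    boolean startIfIdle() {
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        Thread t = new Thread(() -> {
            lastExitCode = -1; // unknown until the process has finished
            try {
                // inheritIO sends the command's output to this JVM's
                // stdout/stderr (the container log, in a servlet setup).
                Process p = new ProcessBuilder(command).inheritIO().start();
                lastExitCode = p.waitFor();
            } catch (IOException e) {
                lastExitCode = -1; // command could not be started
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                running.set(false);
            }
        }, "index-update");
        t.setDaemon(true);
        worker = t;
        t.start();
        return true;
    }

    // Waits up to timeoutMillis for the current run; true if it is done.
    boolean awaitCompletion(long timeoutMillis) {
        Thread t = worker;
        if (t == null) {
            return true;
        }
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    boolean isRunning() {
        return running.get();
    }

    int lastExitCode() {
        return lastExitCode;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: BackgroundIndexer <command> [args...]");
            return;
        }
        BackgroundIndexer indexer = new BackgroundIndexer(Arrays.asList(args));
        System.out.println("started: " + indexer.startIfIdle());
        indexer.awaitCompletion(60_000);
        System.out.println("exit code: " + indexer.lastExitCode());
    }
}
```

A JSP would keep one shared BackgroundIndexer instance and call
startIfIdle() when the admin asks for an update; a cron job can keep
calling the same command directly.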

Regards,
Thomas


