[opencms-dev] Lucene search and German umlauts

Corsin Camichel cocaman at gmail.com
Fri Feb 24 15:23:08 CET 2006


Hi list

I finally have a solution for this problem. You can read about it here:
http://cocaman.ch/wp/opencms-create-a-search-with-umlaut-and-other-signs/

If you have any comments, please let me know.

Regards
-- Corsin

On 2/24/06, Claus Priisholm <cpr at codedroids.com> wrote:
>
> If the string you receive as a parameter in your code is garbled as soon
> as you get it, you're probably running into the problem that Tomcat uses
> some default encoding for the parameter because the request does not
> explicitly declare an encoding.
>
> It typically works when testing inside the workplace, since the workplace
> seems to set the encoding to UTF-8 before accessing any parameter. That
> does not happen when accessing the page directly (and I haven't found a
> way to consistently make it do so - maybe there is some Tomcat setting).
> Anyway, this also seems to be related to the browser and to whether
> you POST or GET...
>
> So I decided to take a defensive approach and not trust anything. I
> include a hidden field with a value like ö and check on return whether
> that value is still an ö or has been garbled. If it is garbled, I
> convert the strings from ISO-8859-1 (which, on the installations I've
> tried, seems to be what Tomcat defaults to) to UTF-8.
>
> regards
> Claus
>
> Corsin Camichel wrote:
> > Hi Jon
> >
> > Thanks for your response.
> > No, I can deactivate the translation part and I still get the wrong
> > queryString.
> >
> > Corsin
> >
> > On 2/24/06, Jonathan Woods <jonathan.woods at scintillance.com> wrote:
> >
> >     Corsin -
> >
> >     Is opencms-vfs.xml relevant here?  It has a <translations> element
> >     which does all of that nasty Anglicisation!
> >
> >     Jon
> >
> >
> >     ------------------------------------------------------------------------
> >     *From:* opencms-dev-bounces at opencms.org
> >     [mailto:opencms-dev-bounces at opencms.org] *On Behalf Of* Corsin Camichel
> >     *Sent:* 24 February 2006 09:42
> >     *To:* The OpenCms mailing list
> >     *Subject:* [opencms-dev] Lucene search and German umlauts
> >
> >     Hi List
> >
> >     While creating the search part for a site, I came across the same
> >     problem with the Lucene search that I always have. But this time, I do
> >     not want to create some nasty "search Creator" page that replaces
> >     German umlauts (äöü) with their HTML codes. I did a lot of research
> >     on this problem, and what I have found is that it has to be possible
> >     somehow. Why? The site of the "Erzbistum Köln" (created by
> >     Alkacon last year) has exactly this functionality.
> >     Link:
> >     http://www.erzbistum-koeln.de/system/modules/org.opencms.frontend.templateone/pages/search.html
> >
> >     If you do a query for "köln" and click through the result pages ( 1 | 2 |
> >     3...), the query "köln" stays as it should. In my case, it changes
> >     to something like "k%öln" and Lucene finds no more search results.
> >     I tried to set my search page to UTF-8 and to ISO-8859-1, but nothing
> >     helps.
> >
> >     Does anybody have an idea how I can solve this?
> >
> >     Hope to hear from you soon
> >
> >     Regards
> >     Corsin
> >
> >     --
> >     Corsin Camichel
> >     cocaman at gmail.com
> >
> >
> >     _______________________________________________
> >     This mail is sent to you from the opencms-dev mailing list
> >     To change your list options, or to unsubscribe from the list, please
> >     visit
> >     http://lists.opencms.org/mailman/listinfo/opencms-dev
> >
> >
> >
> >
> > --
> > Corsin Camichel
> > cocaman at gmail.com
> >
>
> --
> Claus Priisholm, CodeDroids ApS
> Phone: +45 48 22 46 46
> cpr (you know what) codedroids.com - http://www.codedroids.com
> cpr (you know what) interlet.dk - http://www.interlet.dk
> --
> Javadocs and other OpenCms stuff:
> http://www.codedroids.com/community/opencms
>
>
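
As an illustration of the hidden-field check Claus describes in the quoted
message above, here is a minimal sketch in Java. It is not code from the
thread or from OpenCms: the class name UmlautParamRepair, its method names,
and the idea of passing the two parameter values in as plain strings are
assumptions made for the example. It also assumes the only failure mode is
UTF-8 bytes being decoded as ISO-8859-1, which is what Claus reports for his
Tomcat installations.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of the hidden-field sentinel check; every name here is hypothetical. */
final class UmlautParamRepair {

    /** Value put into the hidden form field: the letter o-umlaut (U+00F6). */
    static final String SENTINEL = "\u00F6";

    /** True if the sentinel came back unchanged, i.e. the request was decoded correctly. */
    static boolean isIntact(String returnedSentinel) {
        return SENTINEL.equals(returnedSentinel);
    }

    /** Undo a wrong ISO-8859-1 decoding of bytes that were really UTF-8. */
    static String reinterpretAsUtf8(String garbled) {
        // ISO-8859-1 maps every char 0..255 to the same byte value,
        // so this recovers the original bytes sent by the browser.
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    /**
     * Returns the query unchanged when the sentinel is intact or missing,
     * and re-decodes it as UTF-8 when the sentinel shows it was garbled.
     */
    static String repair(String returnedSentinel, String query) {
        if (query == null || returnedSentinel == null || isIntact(returnedSentinel)) {
            // Without a sentinel we cannot tell what happened, so trust nothing
            // and change nothing.
            return query;
        }
        return reinterpretAsUtf8(query);
    }

    public static void main(String[] args) {
        // Simulate what arrives when UTF-8 bytes were read as ISO-8859-1.
        String garbledSentinel = "\u00C3\u00B6";
        String garbledQuery = "k\u00C3\u00B6ln";
        String fixed = repair(garbledSentinel, garbledQuery);
        System.out.println(fixed.equals("k\u00F6ln") ? "query repaired" : "query still garbled");
    }
}
```

In a JSP, the two arguments would come from request.getParameter(...) for
the hidden field and the search field; the field names are up to you.
StandardCharsets needs Java 7 or later; on older JVMs the charset names
"ISO-8859-1" and "UTF-8" can be passed to getBytes and the String
constructor as strings instead.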



--
Corsin Camichel
cocaman at gmail.com