[opencms-dev] Lucene search and German umlauts
Claus Priisholm
cpr at codedroids.com
Fri Feb 24 11:56:06 CET 2006
If the string you receive as a parameter in your code is already garbled
when you first read it, you are probably running into the problem that
Tomcat falls back to a default encoding for the parameter because the
request does not explicitly declare one.
It typically works when testing inside the workplace, since the workplace
seems to set the encoding to UTF-8 before accessing any parameter. That
does not happen when accessing the page directly (and I haven't found a
way to consistently make it do so; maybe there is some Tomcat setting).
Anyway, this also seems to depend on the browser and on whether
you POST or GET...
So I decided to take a defensive approach and not trust anything. I
include a hidden field with a value like ö and check upon return whether
that value is still an ö or has been garbled. If it is garbled, I
convert the strings from ISO-8859-1 (which, on the installations I've
tried, seems to be what Tomcat defaults to) to UTF-8.
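A minimal sketch of that check in plain Java, in case it helps. The class and method names (ParamEncodingFix, fix, reinterpret) are made up for illustration, and it assumes the browser really sent UTF-8 bytes that were then decoded as ISO-8859-1. Leaving values untouched when the hidden field is missing is also just an assumption on my part.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are illustrative, not part of OpenCms or Tomcat.
class ParamEncodingFix {
    // Value placed in the hidden form field: an "ö", written as an escape
    // so the encoding of this source file cannot interfere.
    static final String SENTINEL = "\u00f6";

    // Undo a wrong ISO-8859-1 decode: recover the raw bytes,
    // then decode those bytes as UTF-8.
    static String reinterpret(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Return the value untouched if the sentinel survived the round trip
    // (or is absent, an assumption); otherwise repair the value.
    static String fix(String sentinelReceived, String value) {
        if (value == null || sentinelReceived == null
                || SENTINEL.equals(sentinelReceived)) {
            return value;
        }
        return reinterpret(value);
    }

    public static void main(String[] args) {
        // Simulate UTF-8 bytes being decoded as ISO-8859-1 by the container.
        String garbledQuery = new String(
                "k\u00f6ln".getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        String garbledSentinel = new String(
                SENTINEL.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        // Garbled sentinel: the query gets repaired.
        System.out.println(fix(garbledSentinel, garbledQuery).equals("k\u00f6ln"));
        // Intact sentinel: the query is passed through unchanged.
        System.out.println(fix(SENTINEL, "k\u00f6ln").equals("k\u00f6ln"));
    }
}
```

In a JSP or servlet you would pass the hidden field's value and the search parameter, both as returned by request.getParameter(...), into fix(...) before handing the query to the search.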
regards
Claus
Corsin Camichel wrote:
> Hi Jon
>
> Thanks for your response.
> No; even with the translation part deactivated I still get the wrong
> queryString.
>
> Corsin
>
> On 2/24/06, *Jonathan Woods* <jonathan.woods at scintillance.com> wrote:
>
> Corsin -
>
> Is opencms-vfs.xml relevant here? It has a <translations> element
> which does all of that nasty Anglicisation!
>
> Jon
>
> ------------------------------------------------------------------------
> *From:* opencms-dev-bounces at opencms.org *On Behalf Of* Corsin Camichel
> *Sent:* 24 February 2006 09:42
> *To:* The OpenCms mailing list
> *Subject:* [opencms-dev] Lucene search and German umlauts
>
> Hi List
>
> While creating the search part for a site, I came across the same
> problem with the Lucene search that I always have. But this time, I do
> not want to create some nasty "search Creator" page that replaces
> German umlauts (äöü) with their HTML codes. I did a lot of research
> on this problem, and what I found is that it has to be possible
> somehow. Why? The site of the "Erzbistum Köln" (created by
> Alkacon last year) has exactly this functionality.
> Link:
> http://www.erzbistum-koeln.de/system/modules/org.opencms.frontend.templateone/pages/search.html
> If you do a query for "köln" and click through the result pages ( 1 | 2 |
> 3...), the query "köln" stays as it should. In my case, it changes
> to something like "k%öln" and Lucene returns no more search results.
> I tried setting my search page to UTF-8 and to ISO-8859-1, but nothing helps.
>
> Does anybody have an idea how I can solve this?
>
> Hope to hear from you soon
>
> Regards
> Corsin
>
> --
> Corsin Camichel
> cocaman at gmail.com
>
>
> _______________________________________________
> This mail is sent to you from the opencms-dev mailing list
> To change your list options, or to unsubscribe from the list, please
> visit
> http://lists.opencms.org/mailman/listinfo/opencms-dev
--
Claus Priisholm, CodeDroids ApS
Phone: +45 48 22 46 46
cpr (you know what) codedroids.com - http://www.codedroids.com
cpr (you know what) interlet.dk - http://www.interlet.dk
--
Javadocs and other OpenCms stuff:
http://www.codedroids.com/community/opencms