[opencms-dev] Problems with labels in workplace_ru.properties

Paul-Inge Flakstad flakstad at npolar.no
Wed Sep 30 20:53:48 CEST 2009


Hi Roman

I was referring only to the plain text ones - sorry if I didn't make that clear. And like you say, it is definitely correct that these plain text .properties files are handled as ISO-8859-1.
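
For anyone who wants to see the ISO-8859-1 handling for themselves, here is a minimal sketch. The class name PropertiesEncodingDemo and the key label.photo are made up for illustration; only java.util.Properties and its load(InputStream) method come from the JDK. It feeds the same Russian word to Properties twice, once as raw UTF-8 bytes and once as Unicode escapes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEncodingDemo {

    // Loads a .properties file from raw bytes via Properties.load(InputStream),
    // which always decodes the bytes as ISO-8859-1.
    static Properties loadFromBytes(byte[] fileBytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical key; the value is the Russian word for "Photo" (4 letters).
        byte[] utf8File = "label.photo=\u0424\u043e\u0442\u043e\n"
                .getBytes(StandardCharsets.UTF_8);
        // Same value, but written as backslash-u escapes in plain ASCII.
        byte[] escapedFile = "label.photo=\\u0424\\u043e\\u0442\\u043e\n"
                .getBytes(StandardCharsets.ISO_8859_1);

        String garbled = loadFromBytes(utf8File).getProperty("label.photo");
        String correct = loadFromBytes(escapedFile).getProperty("label.photo");

        System.out.println(garbled.length()); // 8: every UTF-8 byte became one char
        System.out.println(correct.length()); // 4: the escapes were decoded
    }
}
```

Since Java 6 there is also Properties.load(Reader), which lets the caller pick the encoding, but that only helps when you control the code that loads the file.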

Paul 

> -----Original Message-----
> From: opencms-dev-bounces at opencms.org 
> [mailto:opencms-dev-bounces at opencms.org] On Behalf Of Roman 
> Uhlig Maxity.de
> Sent: 30. september 2009 17:43
> To: The OpenCms mailing list
> Subject: Re: [opencms-dev] Problems with labels in 
> workplace_ru.properties
> 
>  
> In a standard Java web application (this has nothing to do with 
> OpenCms), .xml property files usually use UTF-8 encoding by 
> default (but since they are XML, the encoding is configurable). 
> As far as I know, plain text property files are always 
> handled as ISO-8859-1 by Java.
> 
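A small sketch of the XML variant, assuming nothing beyond java.util.Properties (the class name XmlPropertiesDemo and its helper methods are hypothetical). It writes a property list as XML in a chosen encoding and reads it back; the parser takes the encoding from the XML declaration, so no escaping is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

final class XmlPropertiesDemo {

    // Serializes the properties as an XML document in the given encoding.
    static byte[] toXml(Properties props, String encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            props.storeToXML(out, null, encoding); // null means "no comment"
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Parses the XML document; the encoding is read from the XML declaration.
    static Properties fromXml(byte[] xml) {
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(xml));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties original = new Properties();
        original.setProperty("label.photo", "\u0424\u043e\u0442\u043e"); // hypothetical key
        Properties copy = fromXml(toXml(original, "UTF-8"));
        System.out.println(original.equals(copy)); // true
    }
}
```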
> Roman
> 
> -----Original Message-----
> From: opencms-dev-bounces at opencms.org 
> [mailto:opencms-dev-bounces at opencms.org] On Behalf Of 
> Paul-Inge Flakstad
> Sent: Wednesday, 30 September 2009 17:19
> To: The OpenCms mailing list
> Subject: Re: [opencms-dev] Problems with labels in 
> workplace_ru.properties
> 
> I was a bit mistaken in my last post. By specification, all 
> .properties files are Latin-1 encoded, and when loading from 
> (or saving to) a stream, ISO-8859-1 character encoding is 
> used. All characters that cannot be represented in this 
> encoding must be Unicode escaped.
> 
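To illustrate what "Unicode escaped" means in practice, here is a rough sketch of a converter. UnicodeEscaper and escapeNonLatin1 are names invented for this example, not part of the JDK or OpenCms. It only handles the characters outside Latin-1 and does not try to cover the other escaping rules of the .properties format (backslashes, separators and so on).

```java
final class UnicodeEscaper {

    // Replaces every character above U+00FF (not representable in
    // ISO-8859-1) with a backslash-u escape of four hex digits.
    // Characters inside Latin-1 are copied through unchanged.
    static String escapeNonLatin1(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x00FF) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Russian "Photo" followed by a Latin-1 character (e with acute accent).
        System.out.println(escapeNonLatin1("\u0424\u043e\u0442\u043e: \u00e9"));
    }
}
```

The escaped text can then be saved as ISO-8859-1 and will load correctly regardless of the JVM default encoding.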
> This is probably old news for the more experienced 
> developers, but to me, it's news. I'm glad to finally have 
> learned the cause of my "mysterious" encoding problem, but 
> bewildered and confounded by the facts... Why why WHY???
> 
> Further reading: 
> http://www.thoughtsabout.net/blog/archives/000044.html
> 
> Guys at Alkacon: Any possibility that, in a future release, 
> you would consider disregarding the specification for 
> .properties files, and use UTF-8 instead? In my opinion, 
> OpenCms would become more user-friendly if you did - at least 
> when dealing with multilanguage sites that should support use 
> of characters not specified in the Latin-1 set.
> 
> Cheers,
> Paul
> 
> > -----Original Message-----
> > From: opencms-dev-bounces at opencms.org 
> > [mailto:opencms-dev-bounces at opencms.org] On Behalf Of 
> > Paul-Inge Flakstad
> > Sent: 30. september 2009 15:42
> > To: The OpenCms mailing list
> > Subject: Re: [opencms-dev] Problems with labels in 
> > workplace_ru.properties
> > 
> > Self-replying :)
> > 
> > Given that my assumptions are correct: 
> > The workplace_xx.properties files are read during the 
> > workplace initialization, using the default encoding of the 
> > JVM, which typically depends upon the locale and charset of 
> > the underlying operating system.
> > 
> > In my case, the workplace_ru.properties file is read as 
> > ISO-8859-1, and as a result, no strings fetched using 
> > CmsJspActionElement#label(String) make any sense - it's all 
> > gibberish.
> > 
> > The solution is something along the lines of this:
> > 
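> >     // Requires: import java.nio.charset.Charset;
> >     // Assumes the label was decoded using the JVM default charset
> >     // and that the underlying file is actually UTF-8 encoded.
> >     public String labelUnicode(String key) {
> >         String jvmDefaultCharsetName = Charset.defaultCharset().name();
> >         try {
> >             // Recover the original bytes, then decode them as UTF-8.
> >             return new String(this.label(key).getBytes(jvmDefaultCharsetName), "UTF-8");
> >         } catch (java.io.UnsupportedEncodingException e) {
> >             return "[Default label: " + this.label(key) + "]";
> >         }
> >     }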
> > 
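A standalone sketch of the same re-decoding trick, outside OpenCms, assuming the file was read as ISO-8859-1 as it turned out later in this thread. MojibakeRepair and latin1ToUtf8 are hypothetical names. The round trip is lossless because ISO-8859-1 maps every one of the 256 byte values to exactly one character, so the original bytes can always be recovered.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeRepair {

    // Turns a string that was wrongly decoded as ISO-8859-1 back into its
    // original bytes and decodes those bytes as UTF-8. Only meaningful for
    // strings that really are misdecoded UTF-8; characters above U+00FF in
    // the input would be replaced by '?'.
    static String latin1ToUtf8(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u0424\u043e\u0442\u043e"; // Russian "Photo"
        // Simulate the problem: UTF-8 bytes read as ISO-8859-1.
        String garbled = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(latin1ToUtf8(garbled).equals(original)); // true
    }
}
```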
> > This seems to be working just perfectly. I don't have to 
> > think about character encoding since utf-8 is the default 
> > encoding all over OpenCms (I can just leave the 
> > "content-encoding" property blank), and I can even mix 
> > special characters from different languages all in one 
> > .properties file.
> > 
> > I would even propose to add a method, like the one suggested 
> > above, to CmsJspActionElement. (I'm pretty sure I attempted 
> > every single possibility within OpenCms to get the correct 
> > strings from my workplace_ru.properties returned, using the 
> > "standard" label(String), but never got anything but strange 
> > symbols. If someone for some reason _needs_ to have their JVM 
> > default encoding set to ISO-8859-1, while at the same time 
> > supporting a multilanguage OpenCms system, there seems to be 
> > no method native to OpenCms that enables getting correct 
> > Russian (for example) strings from the 
> > workplace_xx.properties files.)
> > 
> > Cheers,
> > Paul
> > 
> > PS: I know I should probably set the JVM default encoding 
> > manually to UTF-8 instead, but I'm unsure of any possible 
> > side-effects. So until then, this is a pretty decent workaround.
> > 
> > 
> > > -----Original Message-----
> > > From: opencms-dev-bounces at opencms.org 
> > > [mailto:opencms-dev-bounces at opencms.org] On Behalf Of 
> > > Paul-Inge Flakstad
> > > Sent: 29. september 2009 22:57
> > > To: The OpenCms mailing list
> > > Subject: [opencms-dev] Problems with labels in 
> > > workplace_ru.properties
> > > 
> > > Hi all
> > > 
> > > In one of our multilanguage sites, Russian and English 
> > > content is mixed. Everything's working as expected (since 
> > > we're using UTF-8 encoding), but all the labels read from 
> > > workplace_ru.properties, using 
> > > CmsJspActionElement#label(String), are just gibberish... 
> > > 
> > > As a workaround, I created my own label(String, Locale) 
> > > method that does nothing more than simply read the value 
> > > straight out from the workplace_ru.properties file. When 
> > > using this method to access the labels, everything is OK, but 
> > > the .properties file is "parsed" upon each invocation, so 
> > > it's not desirable to keep using it.
> > > 
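One way to avoid parsing the file on every call is to cache the parsed result per locale. This is only a sketch under my own assumptions: CachedLabels, its loader function and the fallback text are hypothetical, and how the file content is actually obtained (from the OpenCms VFS or elsewhere) is left to the loader, which is assumed never to return null.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class CachedLabels {

    // One parsed Properties object per locale, filled on first use.
    private final Map<Locale, Properties> cache = new ConcurrentHashMap<>();
    private final Function<Locale, Properties> loader;

    CachedLabels(Function<Locale, Properties> loader) {
        this.loader = loader;
    }

    // Returns the label for the key, invoking the loader only on the
    // first lookup for a given locale.
    String label(String key, Locale locale) {
        Properties props = cache.computeIfAbsent(locale, loader);
        return props.getProperty(key, "??? " + key + " ???");
    }

    // Parses already-decoded text; Properties.load(Reader) applies no
    // byte-level encoding, so the caller chooses the charset when reading.
    static Properties parse(String fileContent) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

A loader could, for example, read workplace_ru.properties through an InputStreamReader with UTF-8 and pass the text to parse; the cache would need to be cleared when the file changes.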
> > > I've tried this:
> > > Set the HTML charset to utf-8.
> > > Set the JSP pageEncoding to utf-8. 
> > > Set the OpenCms <defaultcontentencoding> to utf-8.
> > > I also checked the HTTP response header, it also says utf-8.
> > > 
> > > Also, I've been experimenting with different combinations of 
> > > encodings (including Cyrillic iso-8859-5), but to no avail.
> > > 
> > > Can anyone please provide some insight?
> > > 
> > > (Just so there's no mistake, I'm reading the labels from the 
> > > .properties file to use them as text on a web-page, not in 
> > > the OpenCms workplace. Things like "Photo:", "Published by " 
> > > and the like.)
> > > 
> > > Cheers,
> > > Paul
> > > 
> > > _______________________________________________
> > > This mail is sent to you from the opencms-dev mailing list
> > > To change your list options, or to unsubscribe from the list, 
> > > please visit
> > > http://lists.opencms.org/mailman/listinfo/opencms-dev
> > > 

