[opencms-dev] Problems with labels in workplace_ru.properties

Paul-Inge Flakstad flakstad at npolar.no
Wed Sep 30 17:18:41 CEST 2009


I was a bit mistaken in my last post. By specification, Java .properties files are Latin-1 encoded: when they are loaded from (or saved to) a byte stream, ISO-8859-1 character encoding is used, regardless of the JVM default encoding. Any character that cannot be represented in that encoding must be written as a Unicode escape (\uXXXX).
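
For anyone who wants to see the effect in isolation, here is a minimal sketch using plain java.util.Properties. It is only an illustration: the class name, method names and the key "label.photo" are made up, and I am assuming (not claiming) that OpenCms ends up in the same stream-based loading path. It loads the same Russian label twice, once as raw UTF-8 bytes and once Unicode escaped, and also shows that the garbled value can be repaired by going back through ISO-8859-1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not part of OpenCms.
class PropertiesEncodingDemo {

    /** Loads one key through the byte-stream API, which always decodes as ISO-8859-1. */
    static String loadViaStream(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    /** Undoes the wrong decoding: back to the original bytes, then decode them as UTF-8. */
    static String repair(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Russian word for "Photo", written with escapes so the source file stays ASCII.
        String expected = "\u0424\u043e\u0442\u043e";

        // File content 1: the label saved as raw UTF-8 bytes.
        byte[] rawUtf8 = ("label.photo=" + expected + "\n").getBytes(StandardCharsets.UTF_8);

        // File content 2: the label saved as Unicode escapes (the doubled backslash
        // puts a literal backslash into the file content).
        byte[] escaped = "label.photo=\\u0424\\u043e\\u0442\\u043e\n"
                .getBytes(StandardCharsets.ISO_8859_1);

        String fromRaw = loadViaStream(rawUtf8, "label.photo");
        String fromEscaped = loadViaStream(escaped, "label.photo");

        System.out.println("raw UTF-8 file read correctly: " + expected.equals(fromRaw));
        System.out.println("escaped file read correctly:   " + expected.equals(fromEscaped));
        System.out.println("raw value after repair:        " + expected.equals(repair(fromRaw)));
    }
}
```

The repair step works because ISO-8859-1 maps every byte value to exactly one character, so the original bytes can be recovered without loss.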

This is probably old news for the more experienced developers, but to me, it's news. I'm glad to finally have learned the cause of my "mysterious" encoding problem, but bewildered and confounded by the facts... Why why WHY???

Further reading: http://www.thoughtsabout.net/blog/archives/000044.html

Guys at Alkacon: Any possibility that, in a future release, you would consider disregarding the specification for .properties files, and use UTF-8 instead? In my opinion, OpenCms would become more user-friendly if you did, at least when dealing with multilanguage sites that need characters outside the Latin-1 set.
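
To make the suggestion concrete: since Java 6, java.util.Properties can also be loaded from a Reader, and then the caller chooses the encoding. The following is only a rough sketch of that idea, not a proposal for how OpenCms should implement it; the class name, method name and keys are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper, not part of OpenCms.
class Utf8PropertiesLoader {

    /** Reads .properties content as UTF-8 by going through a Reader instead of a byte stream. */
    static Properties loadUtf8(InputStream in) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Russian and Norwegian labels mixed in one file, stored as raw UTF-8.
        // Escapes are used here only to keep this source file ASCII.
        String russian = "\u0424\u043e\u0442\u043e";
        String norwegian = "Publisert av \u00c5se";
        String fileContent = "label.photo=" + russian + "\n"
                + "label.published=" + norwegian + "\n";

        Properties props = loadUtf8(
                new ByteArrayInputStream(fileContent.getBytes(StandardCharsets.UTF_8)));

        System.out.println("Russian label intact:   " + russian.equals(props.getProperty("label.photo")));
        System.out.println("Norwegian label intact: " + norwegian.equals(props.getProperty("label.published")));
    }
}
```

Unicode escapes in the file are still understood when loading through a Reader, so files that are already escaped would keep working.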

Cheers,
Paul

> -----Original Message-----
> From: opencms-dev-bounces at opencms.org 
> [mailto:opencms-dev-bounces at opencms.org] On Behalf Of 
> Paul-Inge Flakstad
> Sent: 30. september 2009 15:42
> To: The OpenCms mailing list
> Subject: Re: [opencms-dev] Problems with labels in 
> workplace_ru.properties
> 
> Self-replying :)
> 
> Given that my assumptions are correct: 
> The workplace_xx.properties files are read during the 
> workplace initialization, using the default encoding of the 
> JVM, which typically depends upon the locale and charset of 
> the underlying operating system.
> 
> In my case, the workplace_ru.properties file is read as 
> ISO-8859-1, and as a result, no strings fetched using 
> CmsJspActionElement#label(String) make any sense - it's all gibberish.
> 
> The solution is something along the lines of this:
> 
>     public String labelUnicode(String key) {
>         String jvmDefaultCharsetName = Charset.defaultCharset().displayName();
>         try {
>             return new String(this.label(key).getBytes(jvmDefaultCharsetName), "UTF-8");
>         } catch (java.io.UnsupportedEncodingException e) {
>             return "[Default label: " + this.label(key) + "]";
>         }
>     }
> 
> This seems to be working just perfectly. I don't have to 
> think about character encoding since utf-8 is the default 
> encoding all over OpenCms (I can just leave the 
> "content-encoding" property blank), and I can even mix 
> special characters from different languages all in one 
> .properties file.
> 
> I would even propose to add a method, like the one suggested 
> above, to CmsJspActionElement. (I'm pretty sure I attempted 
> every single possibility within OpenCms to get the correct 
> strings from my workplace_ru.properties returned, using the 
> "standard" label(String), but never got anything but strange 
> symbols. If someone for some reason _needs_ to have their JVM 
> default encoding set to ISO-8859-1, while at the same time 
> supporting a multilanguage OpenCms system, there seems to be 
> no method native to OpenCms that enables getting correct 
> Russian (for example) strings from the workplace_xx.properties files.)
> 
> Cheers,
> Paul
> 
> PS: I know I should probably set the JVM default encoding 
> manually to UTF-8 instead, but I'm unsure of any possible 
> side-effects. So until then, this is a pretty decent workaround.
> 
> 
> > -----Original Message-----
> > From: opencms-dev-bounces at opencms.org 
> > [mailto:opencms-dev-bounces at opencms.org] On Behalf Of 
> > Paul-Inge Flakstad
> > Sent: 29. september 2009 22:57
> > To: The OpenCms mailing list
> > Subject: [opencms-dev] Problems with labels in 
> > workplace_ru.properties
> > 
> > Hi all
> > 
> > In one of our multilanguage sites, Russian and English 
> > content is mixed. Everything's working as expected (since 
> > we're using UTF-8 encoding), but all the labels read from 
> > workplace_ru.properties, using 
> > CmsJspActionElement#label(String), are just gibberish... 
> > 
> > As a workaround, I created my own label(String, Locale) 
> > method that does nothing more than simply read the value 
> > straight out from the workplace_ru.properties file. When 
> > using this method to access the labels, everything is OK, but 
> > the .properties file is "parsed" upon each invocation, so 
> > it's not desirable to keep using it.
> > 
> > I've tried this:
> > Set the HTML charset to utf-8.
> > Set the JSP pageEncoding to utf-8. 
> > Set the OpenCms <defaultcontentencoding> to utf-8.
> > I also checked the HTTP response header, it also says utf-8.
> > 
> > Also, I've been experimenting with different combinations of 
> > encodings (including Cyrillic iso-8859-5), but to no avail.
> > 
> > Can anyone please provide some insight?
> > 
> > (Just so there's no mistake, I'm reading the labels from the 
> > .properties file to use them as text on a web-page, not in 
> > the OpenCms workplace. Things like "Photo:", "Published by " 
> > and the like.)
> > 
> > Cheers,
> > Paul
> > 
> > _______________________________________________
> > This mail is sent to you from the opencms-dev mailing list
> > To change your list options, or to unsubscribe from the list, 
> > please visit
> > http://lists.opencms.org/mailman/listinfo/opencms-dev
> > 
> 

