<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.6000.16890" name=GENERATOR></HEAD>
<BODY text=#000000 bgColor=#ffffff>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2>Hi Christian</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2>It is indeed horrible, just imagine how much easier it
would have been to use utf-8 (or at least be able to specify the encoding)..! No
idea what Sun has been thinking (or not thinking at all) all the time they
haven't addressed this issue. Yes, the native2ascii thing is fine, but damn what
a messy "solution" that is...</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2>I agree that the format of the .properties files has a
nice simplicity to it, and I don't see anything wrong with it other than the
encoding issue. But of course, XML is a strong contender if there's ever going
to be a change. </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2>However, that would require a lot more rewriting than just
switching over to UTF-8 while keeping the format, I would assume (since you'd
have to change both the format and the encoding)? As far as I can tell from what
(little) I've read on the subject, UTF-8 is "backwards compatible" with the
ASCII part of ISO-8859-1, so files that stick to ASCII and Unicode escapes would
read the same either way. In my mind that makes it a pretty straightforward swap
for Alkacon - if they ever wanted to. </FONT></SPAN><SPAN
class=189515318-30092009><FONT face=Arial color=#0000ff
size=2>Ultimately, I guess it's up to Sun if they want to change
the specification. I just thought it would be cool if Alkacon stepped
up and set an example, OpenCms being a multilanguage CMS and
all.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2>I was considering whether or not to extend/modify the
OpenCms core in order to read .properties values as UTF-8 (probably something
like you've done already), but my method is sufficient for now, so I think I'll
put that idea on ice.</FONT></SPAN></DIV>
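<PRE wrap="">
For reference, a minimal sketch of reading .properties data as UTF-8 while
keeping the file format unchanged. It relies on the load(Reader) overload that
java.util.Properties has had since Java 6; the sketch itself assumes Java 8 or
later for the other classes it uses. The class and method names
(Utf8PropertiesSketch, loadUtf8) and the key label.photo are made up for
illustration. Nothing here is OpenCms API, and hooking it into the OpenCms
message bundles is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class Utf8PropertiesSketch {

    // Hypothetical helper: parses raw .properties bytes as UTF-8 instead of ISO-8859-1.
    static Properties loadUtf8(byte[] raw) {
        Properties props = new Properties();
        // load(Reader) receives already-decoded characters, so the charset is the caller's choice.
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(raw), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The Russian word for "Photo", written with escapes so this source stays pure ASCII.
        byte[] raw = "label.photo=\u0424\u043e\u0442\u043e:\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(loadUtf8(raw).getProperty("label.photo"));
    }
}
```

In a real setup the byte array would come from the .properties file itself,
and the loaded values would probably be cached rather than parsed on every
lookup.
</PRE>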
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2>Thanks for your reply, comforting to know I'm not the only
one who's been struggling with the properties encoding.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2>All the best,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=189515318-30092009><FONT face=Arial
color=#0000ff size=2>Paul</FONT></SPAN></DIV><BR>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> opencms-dev-bounces@opencms.org
[mailto:opencms-dev-bounces@opencms.org] <B>On Behalf Of </B>Christian
Steinert<BR><B>Sent:</B> 30. september 2009 18:55<BR><B>To:</B> The OpenCms
mailing list<BR><B>Subject:</B> Re: [opencms-dev] Problems with labels in
workplace_ru.properties<BR></FONT><BR></DIV>
<DIV></DIV>Paul-Inge Flakstad wrote:
<BLOCKQUOTE cite=mid:4D405B4872E4E54A95966E6CF909E7F325D4BF6614@anton
type="cite"><PRE wrap="">I was a bit mistaken in my last post. By specification, all .properties files are Latin-1 encoded, and when loading from (or saving to) a stream, ISO-8859-1 character encoding is used. All characters that cannot be represented in this encoding must be Unicode escaped.
This is probably old news for the more experienced developers, but to me, it's news. I'm glad to finally have learned the cause of my "mysterious" encoding problem, but bewildered and confounded by the facts... Why why WHY???
</PRE></BLOCKQUOTE>At some point I had also stumbled over this one -
horrible, isn't it? In the end I just wrote my own property file loader method
that pushes the property data string through the native2ascii implementation
from GNU Classpath and then loads the resulting property bundle.<BR><BR>I
don't know why Sun never fixed the property file spec by adding an optional
encoding signature, but somehow they never did.<BR>
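<PRE wrap="">
A rough sketch of that kind of loader, assuming the property data is already
available as a decoded Java string. It does not use the GNU Classpath
native2ascii code; the escaping step is a hand-written stand-in for it, and the
names (EscapingLoaderSketch, escapeNonAscii, load) are invented for
illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class EscapingLoaderSketch {

    // Stand-in for native2ascii: rewrites every non-ASCII UTF-16 unit as a backslash-u escape.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c > 127) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Escapes first, then hands pure-ASCII bytes to the standard ISO-8859-1 stream loader.
    static Properties load(String propertyData) {
        Properties props = new Properties();
        byte[] ascii = escapeNonAscii(propertyData).getBytes(StandardCharsets.ISO_8859_1);
        try {
            props.load(new ByteArrayInputStream(ascii));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The Russian word for "Photo", written with escapes so this source stays pure ASCII.
        Properties props = load("label.photo=\u0424\u043e\u0442\u043e:\n");
        System.out.println(props.getProperty("label.photo"));
    }
}
```

Because the escaped text is plain ASCII, the stock Properties.load(InputStream)
can parse it no matter which charset the original file was written in, as long
as the file was decoded correctly before escaping.
</PRE>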
<BLOCKQUOTE cite=mid:4D405B4872E4E54A95966E6CF909E7F325D4BF6614@anton
type="cite"><PRE wrap="">Further reading: <A class=moz-txt-link-freetext href="http://www.thoughtsabout.net/blog/archives/000044.html">http://www.thoughtsabout.net/blog/archives/000044.html</A>
Guys at Alkacon: Any possibility that, in a future release, you would consider disregarding the specification for .properties files, and use UTF-8 instead? In my opinion, OpenCms would become more user-friendly if you did - at least when dealing with multilanguage sites that should support use of characters not specified in the Latin-1 set.
</PRE></BLOCKQUOTE>Maybe it would be best to switch to XML-based resource
bundles at some point. I like the simplicity of the old-fashioned property
format, but since its encoding behavior is defined like that, it's maybe
best to leave it aside rather than bending the spec officially.<BR><BR>Best
Regards<BR>Christian<BR><BR><BR><BR>
<BLOCKQUOTE cite=mid:4D405B4872E4E54A95966E6CF909E7F325D4BF6614@anton
type="cite"><PRE wrap="">Cheers,
Paul
</PRE>
<BLOCKQUOTE type="cite"><PRE wrap="">-----Original Message-----
From: <A class=moz-txt-link-abbreviated href="mailto:opencms-dev-bounces@opencms.org">opencms-dev-bounces@opencms.org</A>
[<A class=moz-txt-link-freetext href="mailto:opencms-dev-bounces@opencms.org">mailto:opencms-dev-bounces@opencms.org</A>] On Behalf Of
Paul-Inge Flakstad
Sent: 30. september 2009 15:42
To: The OpenCms mailing list
Subject: Re: [opencms-dev] Problems with labels in
workplace_ru.properties
Self-replying :)
Given that my assumptions are correct:
The workplace_xx.properties files are read during the
workplace initialization, using the default encoding of the
JVM, which typically depends upon the locale and charset of
the underlying operating system.
In my case, the workplace_ru.properties file is read as
ISO-8859-1, and as a result, no strings fetched using
CmsJspActionElement#label(String) make any sense - it's all gibberish.
The solution is something along the lines of this:
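// Requires: import java.nio.charset.Charset;
public String labelUnicode(String key) {
    // Use the canonical charset name; displayName() may be localized
    String jvmDefaultCharsetName = Charset.defaultCharset().name();
    try {
        // Recover the raw bytes of the mis-decoded label, then decode them as UTF-8
        return new String(this.label(key).getBytes(jvmDefaultCharsetName), "UTF-8");
    } catch (java.io.UnsupportedEncodingException e) {
        // Fall back to the label as returned by label(String)
        return "[Default label: " + this.label(key) + "]";
    }
}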
This seems to be working just perfectly. I don't have to
think about character encoding since utf-8 is the default
encoding all over OpenCms (I can just leave the
"content-encoding" property blank), and I can even mix
special characters from different languages all in one
.properties file.
I would even propose to add a method, like the one suggested
above, to CmsJspActionElement. (I'm pretty sure I attempted
every single possibility within OpenCms to get the correct
strings from my workplace_ru.properties returned, using the
"standard" label(String), but never got anything but strange
symbols. If someone for some reason _needs_ to have their JVM
default encoding set to ISO-8859-1, while at the same time
supporting a multilanguage OpenCms system, there seems to be
no method native to OpenCms that enables getting correct
Russian (for example) strings from the workplace_xx.properties files.)
Cheers,
Paul
PS: I know I should probably set the JVM default encoding
manually to UTF-8 instead, but I'm unsure of any possible
side-effects. So until then, this is a pretty decent workaround.
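
For illustration, a self-contained sketch of why a re-encoding step like the
one above repairs the text, assuming the file holds UTF-8 bytes that were
decoded as ISO-8859-1. ISO-8859-1 maps every byte value to exactly one
character, so encoding the garbled string back to ISO-8859-1 recovers the
original bytes, which can then be decoded as UTF-8. The class and method names
(ReencodeSketch, misdecode, repair) are made up, and nothing here calls OpenCms.

```java
import java.nio.charset.StandardCharsets;

final class ReencodeSketch {

    // Simulates the garbled label: UTF-8 bytes decoded as if they were ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // The repair: ISO-8859-1 is a lossless byte-to-char mapping, so this recovers the bytes.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Russian word for "Photo", written with escapes so this source stays pure ASCII.
        String original = "\u0424\u043e\u0442\u043e:";
        String garbled = misdecode(original);
        System.out.println(garbled.equals(original));          // false
        System.out.println(repair(garbled).equals(original));  // true
    }
}
```

The round trip is only safe with a charset that maps all 256 byte values.
With windows-1252, for instance, a few byte values have no character assigned,
so those bytes are replaced during decoding and cannot be recovered.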
</PRE>
<BLOCKQUOTE type="cite"><PRE wrap="">-----Original Message-----
From: <A class=moz-txt-link-abbreviated href="mailto:opencms-dev-bounces@opencms.org">opencms-dev-bounces@opencms.org</A>
[<A class=moz-txt-link-freetext href="mailto:opencms-dev-bounces@opencms.org">mailto:opencms-dev-bounces@opencms.org</A>] On Behalf Of
Paul-Inge Flakstad
Sent: 29. september 2009 22:57
To: The OpenCms mailing list
Subject: [opencms-dev] Problems with labels in
</PRE></BLOCKQUOTE><PRE wrap="">workplace_ru.properties
</PRE>
<BLOCKQUOTE type="cite"><PRE wrap="">Hi all
In one of our multilanguage sites, Russian and English
content is mixed. Everything's working as expected (since
we're using UTF-8 encoding), but all the labels read from
workplace_ru.properties, using
CmsJspActionElement#label(String), are just gibberish...
As a workaround, I created my own label(String, Locale)
method that does nothing more than simply read the value
straight out from the workplace_ru.properties file. When
using this method to access the labels, everything is OK, but
the .properties file is "parsed" upon each invocation, so
it's not desirable to keep using it.
I've tried this:
Set the HTML charset to utf-8.
Set the JSP pageEncoding to utf-8.
Set the OpenCms &lt;defaultcontentencoding&gt; to utf-8.
I also checked the HTTP response header, it also says utf-8.
Also, I've been experimenting with different combinations of
encodings (including Cyrillic iso-8859-5), but to no avail.
Can anyone please provide some insight?
(Just so there's no mistake, I'm reading the labels from the
.properties file to use them as text on a web-page, not in
the OpenCms workplace. Things like "Photo:", "Published by "
and the like.)
Cheers,
Paul
_______________________________________________
This mail is sent to you from the opencms-dev mailing list
To change your list options, or to unsubscribe from the list,
please visit
<A class=moz-txt-link-freetext href="http://lists.opencms.org/mailman/listinfo/opencms-dev">http://lists.opencms.org/mailman/listinfo/opencms-dev</A>
</PRE></BLOCKQUOTE><PRE wrap="">_______________________________________________
This mail is sent to you from the opencms-dev mailing list
To change your list options, or to unsubscribe from the list,
please visit
<A class=moz-txt-link-freetext href="http://lists.opencms.org/mailman/listinfo/opencms-dev">http://lists.opencms.org/mailman/listinfo/opencms-dev</A>
</PRE></BLOCKQUOTE><PRE wrap=""><!---->
</PRE></BLOCKQUOTE><BR></BLOCKQUOTE></BODY></HTML>