<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Dear Alex<br>
    <br>
    I opened a bug and also created a pull request. Please see
    <a href="https://github.com/alkacon/opencms-core/issues/68">https://github.com/alkacon/opencms-core/issues/68</a><br>
    I re-arranged the new translation rules in such a way that upper and
    lower case are now preserved. I also converted all non-ASCII
    characters to XML character entities.<br>
    <br>
    Kind regards<br>
    Christian<br>
    <br>
    <blockquote cite="mid:003201cd76dd$e230a060$a691e120$@org"
      type="cite">
      <pre wrap="">Christian, 

thanks for this. I think we should add this to the core distribution. I'll have to see about the upper / lower case issue though, as it will be a change from the current handling and some may not like this.

As you mentioned the bug tracker: Would you be able to send a pull request on GitHub as well? That would be great.

Kind Regards,
Alex.
 
-------------------
Alexander Kandzior
                                                              
Alkacon Software GmbH  - The OpenCms Experts                 
<a class="moz-txt-link-freetext" href="http://www.alkacon.com">http://www.alkacon.com</a> - <a class="moz-txt-link-freetext" href="http://www.opencms.org">http://www.opencms.org</a>                  


-----Original Message-----
From: <a class="moz-txt-link-abbreviated" href="mailto:opencms-dev-bounces@opencms.org">opencms-dev-bounces@opencms.org</a> [<a class="moz-txt-link-freetext" href="mailto:opencms-dev-bounces@opencms.org">mailto:opencms-dev-bounces@opencms.org</a>] On Behalf Of Christian Steinert
Sent: Friday, August 10, 2012 2:05 AM
To: The OpenCms mailing list
Subject: Re: [opencms-dev] Some additional file translation rules for dealing with diacritics

Dear Alexander

at least in my email client, your reply still had all characters intact, but here are the characters in escaped form (I used Java's native2ascii).
If you want, I am happy to also create an issue in the bug tracker; just let me know. Right now the rules also convert everything to lower case, which fits my needs. To preserve upper case, the characters would have to be re-shuffled and additional rules for upper case would need to be introduced.

The first character in each rule is the standard uppercase character, which should be eliminated altogether if the intention is to preserve upper case.
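
As a rough sketch of that re-shuffling: one lower-casing rule could be split into an upper-case rule and a lower-case rule along the following lines. Python is used purely for illustration, the helper name split_rule_by_case is hypothetical, and this is not necessarily how a final set of rules would be produced.

```python
def split_rule_by_case(base, chars):
    """Split one lower-casing rule into an upper-case and a lower-case rule.

    base  -- the plain lower-case target letter, e.g. "e"
    chars -- the accented characters of the original rule, without the
             leading plain capital letter
    """
    upper = "".join(c for c in chars if c.isupper())
    lower = "".join(c for c in chars if c.islower())
    # Title-case digraphs such as U+01C5 are neither upper nor lower case;
    # they are reported separately instead of being dropped silently.
    leftover = "".join(c for c in chars if not (c.isupper() or c.islower()))
    rules = [
        "s#[" + upper + "]#" + base.upper() + "#g",
        "s#[" + lower + "]#" + base + "#g",
    ]
    return rules, leftover


rules, leftover = split_rule_by_case(
    "e", "\u00c8\u00e8\u00c9\u00e9\u00ca\u00ea\u00cb\u00eb"
)
for rule in rules:
    print(rule)
```

Any characters returned in leftover would have to be assigned to one of the two rules by hand.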

Kind regards
Christian


<translation>s#[A\u00c0\u00e0\u00c1\u00e1\u00c2\u00e2\u00c3\u00e3\u00c5\u00e5\u0100\u0101\u0102\u0103\u0104\u0105\u01cd\u01ce\u01de\u01df\u01e0\u01e1\u01fa\u01fb\u0200\u0201\u0202\u0203\u0226\u0227\u1e00\u1e01\u1e9a\u1ea0\u1ea1\u1ea2\u1ea3\u1ea4\u1ea5\u1ea6\u1ea7\u1ea8\u1ea9\u1eaa\u1eab\u1eac\u1ead\u1eae\u1eaf\u1eb0\u1eb1\u1eb2\u1eb3\u1eb4\u1eb5\u1eb6\u1eb7]#a#g</translation>
<translation>s#[B\u0180\u0181\u0253\u0182\u0183\u1e02\u1e03\u1e04\u1e05\u1e06\u1e07]#b#g</translation>
<translation>s#[C\u00c7\u00e7\u0106\u0107\u0108\u0109\u010a\u010b\u010c\u010d\u0187\u0188\u0255\u1e08\u1e09]#c#g</translation>
<translation>s#[D\u010e\u010f\u0110\u0111\u018a\u0257\u018b\u018c\u01c5\u01f2\u0221\u0256\u1e0a\u1e0b\u1e0c\u1e0d\u1e0e\u1e0f\u1e10\u1e11\u1e12\u1e13]#d#g</translation>
<translation>s#[E\u00c8\u00e8\u00c9\u00e9\u00ca\u00ea\u00cb\u00eb\u0112\u0113\u0114\u0115\u0116\u0117\u0118\u0119\u011a\u011b\u0204\u0205\u0206\u0207\u0228\u0229\u1e14\u1e15\u1e16\u1e17\u1e18\u1e19\u1e1a\u1e1b\u1e1c\u1e1d\u1eb8\u1eb9\u1eba\u1ebb\u1ebc\u1ebd\u1ebe\u1ebf\u1ec0\u1ec1\u1ec2\u1ec3\u1ec4\u1ec5\u1ec6\u1ec7]#e#g</translation>
<translation>s#[F\u0191\u0192\u1e1e\u1e1f]#f#g</translation>
<translation>s#[G\u011c\u011d\u011e\u011f\u0120\u0121\u0122\u0123\u0193\u0260\u01e4\u01e5\u01e6\u01e7\u01f4\u01f5\u1e20\u1e21]#g#g</translation>
<translation>s#[H\u0124\u0125\u0126\u0127\u021e\u021f\u0266\u1e22\u1e23\u1e24\u1e25\u1e26\u1e27\u1e28\u1e29\u1e2a\u1e2b\u1e96]#h#g</translation>
<translation>s#[I\u00cc\u00ec\u00cd\u00ed\u00ce\u00ee\u00cf\u00ef\u0128\u0129\u012a\u012b\u012c\u012d\u012e\u012f\u0130\u0197\u0268\u01cf\u01d0\u0208\u0209\u020a\u020b\u1e2c\u1e2d\u1e2e\u1e2f\u1ec8\u1ec9\u1eca\u1ecb]#i#g</translation>
<translation>s#[J\u0134\u0135\u01f0\u029d]#j#g</translation>
<translation>s#[K\u0136\u0137\u0198\u0199\u01e8\u01e9\u1e30\u1e31\u1e32\u1e33\u1e34\u1e35]#k#g</translation>
<translation>s#[L\u0139\u013a\u013b\u013c\u013d\u013e\u013f\u0140\u0141\u0142\u019a\u01c8\u0234\u026b\u026c\u026d\u1e36\u1e37\u1e38\u1e39\u1e3a\u1e3b\u1e3c\u1e3d]#l#g</translation>
<translation>s#[M\u0271\u1e3e\u1e3f\u1e40\u1e41\u1e42\u1e43]#m#g</translation>
<translation>s#[N\u00d1\u00f1\u0143\u0144\u0145\u0146\u0147\u0148\u019d\u0272\u019e\u0220\u01cb\u01f8\u01f9\u0235\u0273\u1e44\u1e45\u1e46\u1e47\u1e48\u1e49\u1e4a\u1e4b]#n#g</translation>
<translation>s#[O\u00d2\u00f2\u00d3\u00f3\u00d4\u00f4\u00d5\u00f5\u00d8\u00f8\u014c\u014d\u014e\u014f\u0150\u0151\u019f\u01a0\u01a1\u01d1\u01d2\u01ea\u01eb\u01ec\u01ed\u01fe\u01ff\u020c\u020d\u020e\u020f\u022a\u022b\u022c\u022d\u022e\u022f\u0230\u0231\u1e4c\u1e4d\u1e4e\u1e4f\u1e50\u1e51\u1e52\u1e53\u1ecc\u1ecd\u1ece\u1ecf\u1ed0\u1ed1\u1ed2\u1ed3\u1ed4\u1ed5\u1ed6\u1ed7\u1ed8\u1ed9\u1eda\u1edb\u1edc\u1edd\u1ede\u1edf\u1ee0\u1ee1\u1ee2\u1ee3]#o#g</translation>
<translation>s#[P\u01a4\u01a5\u1e54\u1e55\u1e56\u1e57]#p#g</translation>
<translation>s#[Q\u02a0]#q#g</translation>
<translation>s#[R\u0154\u0155\u0156\u0157\u0158\u0159\u0210\u0211\u0212\u0213\u027c\u027d\u027e\u1e58\u1e59\u1e5a\u1e5b\u1e5c\u1e5d\u1e5e\u1e5f]#r#g</translation>
<translation>s#[S\u015a\u015b\u015c\u015d\u015e\u015f\u0160\u0161\u0218\u0219\u0282\u1e60\u1e61\u1e62\u1e63\u1e64\u1e65\u1e66\u1e67\u1e68\u1e69]#s#g</translation>
<translation>s#[T\u0162\u0163\u0164\u0165\u0166\u0167\u01ab\u01ac\u01ad\u01ae\u0288\u021a\u021b\u0236\u1e6a\u1e6b\u1e6c\u1e6d\u1e6e\u1e6f\u1e70\u1e71\u1e97]#t#g</translation>
<translation>s#[U\u00d9\u00f9\u00da\u00fa\u00db\u00fb\u0168\u0169\u016a\u016b\u016c\u016d\u016e\u016f\u0170\u0171\u0172\u0173\u01af\u01b0\u01d3\u01d4\u01d5\u01d6\u01d7\u01d8\u01d9\u01da\u01db\u01dc\u0214\u0215\u0216\u0217\u1e72\u1e73\u1e74\u1e75\u1e76\u1e77\u1e78\u1e79\u1e7a\u1e7b\u1ee4\u1ee5\u1ee6\u1ee7\u1ee8\u1ee9\u1eea\u1eeb\u1eec\u1eed\u1eee\u1eef\u1ef0\u1ef1]#u#g</translation>
<translation>s#[V\u01b2\u028b\u1e7c\u1e7d\u1e7e\u1e7f]#v#g</translation>
<translation>s#[W\u0174\u0175\u1e80\u1e81\u1e82\u1e83\u1e84\u1e85\u1e86\u1e87\u1e88\u1e89\u1e98]#w#g</translation>
<translation>s#[X\u1e8a\u1e8b\u1e8c\u1e8d]#x#g</translation>
<translation>s#[Y\u00dd\u00fd\u00ff\u0178\u0176\u0177\u01b3\u01b4\u0232\u0233\u1e8e\u1e8f\u1e99\u1ef2\u1ef3\u1ef4\u1ef5\u1ef6\u1ef7\u1ef8\u1ef9]#y#g</translation>
<translation>s#[Z\u0179\u017a\u017b\u017c\u017d\u017e\u01b5\u01b6\u0224\u0225\u0290\u0291\u1e90\u1e91\u1e92\u1e93\u1e94\u1e95]#z#g</translation>




</pre>
      <blockquote type="cite">
        <pre wrap="">I am interested in this.

Before I copy and paste this into "my" standard configuration: I do suspect that some of the characters may have been corrupted because of the email transport. Could you translate this from your original source into the "\uXXXX" notation and send it again? Or maybe raise an issue in GitHub with the original file attached.
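
For reference, a minimal sketch of what such a conversion into the "\uXXXX" notation could look like. Python is used only for illustration, the function name to_unicode_escapes is hypothetical, and a tool such as Java's native2ascii may differ in details.

```python
def to_unicode_escapes(text):
    """Keep 7-bit ASCII as is and write every other character as \\uXXXX."""
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)
        else:
            # Encode as UTF-16 so that characters outside the Basic
            # Multilingual Plane become two escapes (a surrogate pair).
            data = ch.encode("utf-16-be")
            for i in range(0, len(data), 2):
                out.append("\\u{:02x}{:02x}".format(data[i], data[i + 1]))
    return "".join(out)


print(to_unicode_escapes("s#[C\u00c7\u00e7]#c#g"))
```

Because the result contains only ASCII characters, it should survive email transport regardless of the charset used.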

Kind Regards,
Alex.

-------------------
Alexander Kandzior

Alkacon Software GmbH  - The OpenCms Experts
<a class="moz-txt-link-freetext" href="http://www.alkacon.com">http://www.alkacon.com</a> - <a class="moz-txt-link-freetext" href="http://www.opencms.org">http://www.opencms.org</a>


-----Original Message-----
From: <a class="moz-txt-link-abbreviated" href="mailto:opencms-dev-bounces@opencms.org">opencms-dev-bounces@opencms.org</a> [<a class="moz-txt-link-freetext" href="mailto:opencms-dev-bounces@opencms.org">mailto:opencms-dev-bounces@opencms.org</a>] On Behalf Of Christian Steinert
Sent: Thursday, August 09, 2012 4:51 AM
To: The OpenCms mailing list
Subject: [opencms-dev] Some additional file translation rules for dealing with diacritics

Dear All

Maybe somebody else is interested in this. I created some additional file translation rules to deal with additional characters that have diacritical marks.
The following rules can be added to the other translation rules in opencms-vfs.xml. You should leave all other rules as they are. Please make sure that you add these rules BEFORE THE LAST TWO RULES because the last two rules allow the default characters and replace everything else with "x".

These rules convert all Latin characters into the most similar basic form, and they also convert everything to lower case.
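
To illustrate why the position of the rules matters, here is a small sketch that applies sed-style rules from top to bottom. It uses Python's re module as a stand-in, which is not necessarily how OpenCms evaluates the rules. The two catch-all rules are hypothetical stand-ins written from the description above, not copied from opencms-vfs.xml, and the function names are made up for this example.

```python
import re


def apply_rule(rule, name):
    # Split a sed-style rule "s#pattern#replacement#g" on its "#" delimiter.
    _, pattern, replacement, flags = rule.split("#")
    count = 0 if "g" in flags else 1  # "g" means: replace every match
    return re.sub(pattern, replacement, name, count=count)


def translate(name, rules):
    # Rules run top to bottom; each rule sees the output of the previous one.
    for rule in rules:
        name = apply_rule(rule, name)
    return name


# A shortened subset of the diacritic rules, for illustration only.
diacritic_rules = [
    "s#[E\u00c8\u00e8\u00c9\u00e9]#e#g",
    "s#[U\u00d9\u00f9\u00da\u00fa\u00db\u00fb]#u#g",
]
# Hypothetical stand-ins for the last two default rules.
catch_all_rules = [
    "s#[^0-9a-zA-Z_.\\-/]#!#g",
    "s#!+#x#g",
]

print(translate("Men\u00fa.html", diacritic_rules + catch_all_rules))
print(translate("Men\u00fa.html", catch_all_rules + diacritic_rules))
```

With the diacritic rules first, the accented character is mapped to its base letter. With the catch-all rules first, it has already been replaced by "x" before the diacritic rules get a chance to match.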

Kind regards
Christian


<translation>s#[AÀàÁáÂâÃãÅåĀāĂăĄąǍǎǞǟǠǡǺǻȀȁȂȃȦȧḀḁẚẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặ]#a#g</translation>
<translation>s#[BƀƁɓƂƃḂḃḄḅḆḇ]#b#g</translation>
<translation>s#[CÇçĆćĈĉĊċČčƇƈɕḈḉ]#c#g</translation>
<translation>s#[DĎďĐđƊɗƋƌDžDzȡɖḊḋḌḍḎḏḐḑḒḓ]#d#g</translation>
<translation>s#[EÈèÉéÊêËëĒēĔĕĖėĘęĚěȄȅȆȇȨȩḔḕḖḗḘḙḚḛḜḝẸẹẺẻẼẽẾếỀềỂểỄễỆệ]#e#g</translation>
<translation>s#[FƑƒḞḟ]#f#g</translation>
<translation>s#[GĜĝĞğĠġĢģƓɠǤǥǦǧǴǵḠḡ]#g#g</translation>
<translation>s#[HĤĥĦħȞȟɦḢḣḤḥḦḧḨḩḪḫẖ]#h#g</translation>
<translation>s#[IÌìÍíÎîÏïĨĩĪīĬĭĮįİƗɨǏǐȈȉȊȋḬḭḮḯỈỉỊị]#i#g</translation>
<translation>s#[JĴĵǰʝ]#j#g</translation>
<translation>s#[KĶķƘƙǨǩḰḱḲḳḴḵ]#k#g</translation>
<translation>s#[LĹĺĻļĽľĿŀŁłƚLjȴɫɬɭḶḷḸḹḺḻḼḽ]#l#g</translation>
<translation>s#[MɱḾḿṀṁṂṃ]#m#g</translation>
<translation>s#[NÑñŃńŅņŇňƝɲƞȠNjǸǹȵɳṄṅṆṇṈṉṊṋ]#n#g</translation>
<translation>s#[OÒòÓóÔôÕõØøŌōŎŏŐőƟƠơǑǒǪǫǬǭǾǿȌȍȎȏȪȫȬȭȮȯȰȱṌṍṎṏṐṑṒṓỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợ]#o#g</translation>
<translation>s#[PƤƥṔṕṖṗ]#p#g</translation>
<translation>s#[Qʠ]#q#g</translation>
<translation>s#[RŔŕŖŗŘřȐȑȒȓɼɽɾṘṙṚṛṜṝṞṟ]#r#g</translation>
<translation>s#[SŚśŜŝŞşŠšȘșʂṠṡṢṣṤṥṦṧṨṩ]#s#g</translation>
<translation>s#[TŢţŤťŦŧƫƬƭƮʈȚțȶṪṫṬṭṮṯṰṱẗ]#t#g</translation>
<translation>s#[UÙùÚúÛûŨũŪūŬŭŮůŰűŲųƯưǓǔǕǖǗǘǙǚǛǜȔȕȖȗṲṳṴṵṶṷṸṹṺṻỤụỦủỨứỪừỬửỮữỰự]#u#g</translation>
<translation>s#[VƲʋṼṽṾṿ]#v#g</translation>
<translation>s#[WŴŵẀẁẂẃẄẅẆẇẈẉẘ]#w#g</translation>
<translation>s#[XẊẋẌẍ]#x#g</translation>
<translation>s#[YÝýÿŸŶŷƳƴȲȳẎẏẙỲỳỴỵỶỷỸỹ]#y#g</translation>
<translation>s#[ZŹźŻżŽžƵƶȤȥʐʑẐẑẒẓẔẕ]#z#g</translation>

_______________________________________________
This mail is sent to you from the opencms-dev mailing list
To change your list options, or to unsubscribe from the list, please visit
<a class="moz-txt-link-freetext" href="http://lists.opencms.org/cgi-bin/mailman/listinfo/opencms-dev">http://lists.opencms.org/cgi-bin/mailman/listinfo/opencms-dev</a>



</pre>
      </blockquote>
    </blockquote>
    <br>
  </body>
</html>