[opencms-dev] Some additional file translation rules for dealing with diacritics

Christian Steinert christian at christian-steinert.de
Fri Aug 10 02:05:02 CEST 2012


Dear Alexander

At least in my email client, your reply still had all characters intact, but here are the rules again with the characters in escaped form (I used Java's native2ascii).
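
For readers who do not have native2ascii at hand, the conversion it performs can be approximated with a few lines of Python. This is only an illustrative sketch, not the tool that was actually used for the rules below; the function name `to_unicode_escapes` is made up for this example.

```python
def to_unicode_escapes(text: str) -> str:
    """Replace every non-ASCII character with a \\uXXXX escape (lowercase hex)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII is passed through unchanged.
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)
        else:
            # Characters outside the BMP become a UTF-16 surrogate pair,
            # which is how Java represents them.
            cp -= 0x10000
            out.append("\\u%04x\\u%04x" % (0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)))
    return "".join(out)


print(to_unicode_escapes("s#[CÇç]#c#g"))  # prints s#[C\u00c7\u00e7]#c#g
```
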
If you want, I am happy to also create an issue in the bug tracker; just let me know. Right now the rules also convert everything to lower case, which fits my needs. To preserve upper case, the characters would have to be regrouped and separate rules for the upper-case variants would need to be introduced.

The first character in each rule is the plain upper-case letter, which should be removed altogether if the intention is to preserve upper case.
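
As an illustration of the case-preserving variant, each character class could be split by Unicode case category and turned into two rules, one mapping to the upper-case base letter and one to the lower-case base letter. The Python sketch below does this for the "F" rule only. The helper names `split_by_case` and `make_rule` are hypothetical, and grouping titlecase characters (category Lt, such as \u01c5) with the upper-case set is an arbitrary choice made for this example.

```python
import unicodedata


def split_by_case(chars: str) -> tuple:
    """Split characters into (upper-case or titlecase, everything else)."""
    upper = [c for c in chars if unicodedata.category(c) in ("Lu", "Lt")]
    lower = [c for c in chars if unicodedata.category(c) not in ("Lu", "Lt")]
    return "".join(upper), "".join(lower)


def make_rule(chars: str, target: str) -> str:
    """Build a translation rule with the characters in \\uXXXX notation."""
    escaped = "".join("\\u%04x" % ord(c) for c in chars)
    return "<translation>s#[%s]#%s#g</translation>" % (escaped, target)


# Accented characters from the "F" rule, without the plain "F" itself.
upper, lower = split_by_case("\u0191\u0192\u1e1e\u1e1f")
upper_rule = make_rule(upper, "F")
lower_rule = make_rule(lower, "f")
print(upper_rule)  # <translation>s#[\u0191\u1e1e]#F#g</translation>
print(lower_rule)  # <translation>s#[\u0192\u1e1f]#f#g</translation>
```
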

Kind regards
Christian


<translation>s#[A\u00c0\u00e0\u00c1\u00e1\u00c2\u00e2\u00c3\u00e3\u00c5\u00e5\u0100\u0101\u0102\u0103\u0104\u0105\u01cd\u01ce\u01de\u01df\u01e0\u01e1\u01fa\u01fb\u0200\u0201\u0202\u0203\u0226\u0227\u1e00\u1e01\u1e9a\u1ea0\u1ea1\u1ea2\u1ea3\u1ea4\u1ea5\u1ea6\u1ea7\u1ea8\u1ea9\u1eaa\u1eab\u1eac\u1ead\u1eae\u1eaf\u1eb0\u1eb1\u1eb2\u1eb3\u1eb4\u1eb5\u1eb6\u1eb7]#a#g</translation>
<translation>s#[B\u0180\u0181\u0253\u0182\u0183\u1e02\u1e03\u1e04\u1e05\u1e06\u1e07]#b#g</translation>
<translation>s#[C\u00c7\u00e7\u0106\u0107\u0108\u0109\u010a\u010b\u010c\u010d\u0187\u0188\u0255\u1e08\u1e09]#c#g</translation>
<translation>s#[D\u010e\u010f\u0110\u0111\u018a\u0257\u018b\u018c\u01c5\u01f2\u0221\u0256\u1e0a\u1e0b\u1e0c\u1e0d\u1e0e\u1e0f\u1e10\u1e11\u1e12\u1e13]#d#g</translation>
<translation>s#[E\u00c8\u00e8\u00c9\u00e9\u00ca\u00ea\u00cb\u00eb\u0112\u0113\u0114\u0115\u0116\u0117\u0118\u0119\u011a\u011b\u0204\u0205\u0206\u0207\u0228\u0229\u1e14\u1e15\u1e16\u1e17\u1e18\u1e19\u1e1a\u1e1b\u1e1c\u1e1d\u1eb8\u1eb9\u1eba\u1ebb\u1ebc\u1ebd\u1ebe\u1ebf\u1ec0\u1ec1\u1ec2\u1ec3\u1ec4\u1ec5\u1ec6\u1ec7]#e#g</translation>
<translation>s#[F\u0191\u0192\u1e1e\u1e1f]#f#g</translation>
<translation>s#[G\u011c\u011d\u011e\u011f\u0120\u0121\u0122\u0123\u0193\u0260\u01e4\u01e5\u01e6\u01e7\u01f4\u01f5\u1e20\u1e21]#g#g</translation>
<translation>s#[H\u0124\u0125\u0126\u0127\u021e\u021f\u0266\u1e22\u1e23\u1e24\u1e25\u1e26\u1e27\u1e28\u1e29\u1e2a\u1e2b\u1e96]#h#g</translation>
<translation>s#[I\u00cc\u00ec\u00cd\u00ed\u00ce\u00ee\u00cf\u00ef\u0128\u0129\u012a\u012b\u012c\u012d\u012e\u012f\u0130\u0197\u0268\u01cf\u01d0\u0208\u0209\u020a\u020b\u1e2c\u1e2d\u1e2e\u1e2f\u1ec8\u1ec9\u1eca\u1ecb]#i#g</translation>
<translation>s#[J\u0134\u0135\u01f0\u029d]#j#g</translation>
<translation>s#[K\u0136\u0137\u0198\u0199\u01e8\u01e9\u1e30\u1e31\u1e32\u1e33\u1e34\u1e35]#k#g</translation>
<translation>s#[L\u0139\u013a\u013b\u013c\u013d\u013e\u013f\u0140\u0141\u0142\u019a\u01c8\u0234\u026b\u026c\u026d\u1e36\u1e37\u1e38\u1e39\u1e3a\u1e3b\u1e3c\u1e3d]#l#g</translation>
<translation>s#[M\u0271\u1e3e\u1e3f\u1e40\u1e41\u1e42\u1e43]#m#g</translation>
<translation>s#[N\u00d1\u00f1\u0143\u0144\u0145\u0146\u0147\u0148\u019d\u0272\u019e\u0220\u01cb\u01f8\u01f9\u0235\u0273\u1e44\u1e45\u1e46\u1e47\u1e48\u1e49\u1e4a\u1e4b]#n#g</translation>
<translation>s#[O\u00d2\u00f2\u00d3\u00f3\u00d4\u00f4\u00d5\u00f5\u00d8\u00f8\u014c\u014d\u014e\u014f\u0150\u0151\u019f\u01a0\u01a1\u01d1\u01d2\u01ea\u01eb\u01ec\u01ed\u01fe\u01ff\u020c\u020d\u020e\u020f\u022a\u022b\u022c\u022d\u022e\u022f\u0230\u0231\u1e4c\u1e4d\u1e4e\u1e4f\u1e50\u1e51\u1e52\u1e53\u1ecc\u1ecd\u1ece\u1ecf\u1ed0\u1ed1\u1ed2\u1ed3\u1ed4\u1ed5\u1ed6\u1ed7\u1ed8\u1ed9\u1eda\u1edb\u1edc\u1edd\u1ede\u1edf\u1ee0\u1ee1\u1ee2\u1ee3]#o#g</translation>
<translation>s#[P\u01a4\u01a5\u1e54\u1e55\u1e56\u1e57]#p#g</translation>
<translation>s#[Q\u02a0]#q#g</translation>
<translation>s#[R\u0154\u0155\u0156\u0157\u0158\u0159\u0210\u0211\u0212\u0213\u027c\u027d\u027e\u1e58\u1e59\u1e5a\u1e5b\u1e5c\u1e5d\u1e5e\u1e5f]#r#g</translation>
<translation>s#[S\u015a\u015b\u015c\u015d\u015e\u015f\u0160\u0161\u0218\u0219\u0282\u1e60\u1e61\u1e62\u1e63\u1e64\u1e65\u1e66\u1e67\u1e68\u1e69]#s#g</translation>
<translation>s#[T\u0162\u0163\u0164\u0165\u0166\u0167\u01ab\u01ac\u01ad\u01ae\u0288\u021a\u021b\u0236\u1e6a\u1e6b\u1e6c\u1e6d\u1e6e\u1e6f\u1e70\u1e71\u1e97]#t#g</translation>
<translation>s#[U\u00d9\u00f9\u00da\u00fa\u00db\u00fb\u0168\u0169\u016a\u016b\u016c\u016d\u016e\u016f\u0170\u0171\u0172\u0173\u01af\u01b0\u01d3\u01d4\u01d5\u01d6\u01d7\u01d8\u01d9\u01da\u01db\u01dc\u0214\u0215\u0216\u0217\u1e72\u1e73\u1e74\u1e75\u1e76\u1e77\u1e78\u1e79\u1e7a\u1e7b\u1ee4\u1ee5\u1ee6\u1ee7\u1ee8\u1ee9\u1eea\u1eeb\u1eec\u1eed\u1eee\u1eef\u1ef0\u1ef1]#u#g</translation>
<translation>s#[V\u01b2\u028b\u1e7c\u1e7d\u1e7e\u1e7f]#v#g</translation>
<translation>s#[W\u0174\u0175\u1e80\u1e81\u1e82\u1e83\u1e84\u1e85\u1e86\u1e87\u1e88\u1e89\u1e98]#w#g</translation>
<translation>s#[X\u1e8a\u1e8b\u1e8c\u1e8d]#x#g</translation>
<translation>s#[Y\u00dd\u00fd\u00ff\u0178\u0176\u0177\u01b3\u01b4\u0232\u0233\u1e8e\u1e8f\u1e99\u1ef2\u1ef3\u1ef4\u1ef5\u1ef6\u1ef7\u1ef8\u1ef9]#y#g</translation>
<translation>s#[Z\u0179\u017a\u017b\u017c\u017d\u017e\u01b5\u01b6\u0224\u0225\u0290\u0291\u1e90\u1e91\u1e92\u1e93\u1e94\u1e95]#z#g</translation>
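
To get a feel for what rules of this form do to a file name, the following Python sketch applies two shortened rules in sequence. It is only an approximation: it uses Python's `re` module, which is not the regular expression engine OpenCms itself uses, the helper `apply_rule` is hypothetical, and the two character classes are abbreviated versions of the "C" and "E" rules above.

```python
import re


def apply_rule(rule: str, name: str) -> str:
    """Apply one substitution rule of the form s#pattern#replacement#flags."""
    _, pattern, replacement, flags = rule.split("#")
    # The "g" flag means "replace every match"; without it only the first one.
    count = 0 if "g" in flags else 1
    return re.sub(pattern, replacement, name, count=count)


# Raw strings keep the \uXXXX escapes intact; the re module resolves them.
rules = [
    r"s#[C\u00c7\u00e7\u0106\u0107]#c#g",
    r"s#[E\u00c8\u00e8\u00c9\u00e9]#e#g",
]

result = "Caf\u00e9_\u00c7a.txt"  # "Café_Ça.txt"
for rule in rules:
    result = apply_rule(rule, result)

print(result)  # cafe_ca.txt
```

Because the rules are applied one after another, their position relative to any catch-all rule in the configuration matters.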




> I am interested in this.
>
> Before I copy and paste this into "my" standard configuration: I do suspect that some of the characters may have been corrupted because of the email transport. Could you translate this from your original source into the "\uXXXX" notation and send it again? Or maybe raise an issue in GitHub with the original file attached.
>
> Kind Regards,
> Alex.
>
> -------------------
> Alexander Kandzior
>
> Alkacon Software GmbH  - The OpenCms Experts
> http://www.alkacon.com - http://www.opencms.org
>
>
> -----Original Message-----
> From: opencms-dev-bounces at opencms.org [mailto:opencms-dev-bounces at opencms.org] On Behalf Of Christian Steinert
> Sent: Thursday, August 09, 2012 4:51 AM
> To: The OpenCms mailing list
> Subject: [opencms-dev] Some additional file translation rules for dealing with diacritics
>
> Dear All
>
> Maybe somebody else is interested in this. I created some additional file translation rules to deal with additional characters that have diacritical marks.
> The following rules can be added to the other translation rules in opencms-vfs.xml. You should leave all other rules as they are. Please make sure that you add these rules BEFORE THE LAST TWO RULES because the last two rules allow the default characters and replace everything else with "x".
>
> These rules convert all Latin characters into their most similar basic form, and they also convert everything to lower case.
>
> Kind regards
> Christian
>
>
> <translation>s#[AÀàÁáÂâÃãÅåĀāĂăĄąǍǎǞǟǠǡǺǻȀȁȂȃȦȧḀḁẚẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặ]#a#g</translation>
> <translation>s#[BƀƁɓƂƃḂḃḄḅḆḇ]#b#g</translation>
> <translation>s#[CÇçĆćĈĉĊċČčƇƈɕḈḉ]#c#g</translation>
> <translation>s#[DĎďĐđƊɗƋƌDžDzȡɖḊḋḌḍḎḏḐḑḒḓ]#d#g</translation>
> <translation>s#[EÈèÉéÊêËëĒēĔĕĖėĘęĚěȄȅȆȇȨȩḔḕḖḗḘḙḚḛḜḝẸẹẺẻẼẽẾếỀềỂểỄễỆệ]#e#g</translation>
> <translation>s#[FƑƒḞḟ]#f#g</translation>
> <translation>s#[GĜĝĞğĠġĢģƓɠǤǥǦǧǴǵḠḡ]#g#g</translation>
> <translation>s#[HĤĥĦħȞȟɦḢḣḤḥḦḧḨḩḪḫẖ]#h#g</translation>
> <translation>s#[IÌìÍíÎîÏïĨĩĪīĬĭĮįİƗɨǏǐȈȉȊȋḬḭḮḯỈỉỊị]#i#g</translation>
> <translation>s#[JĴĵǰʝ]#j#g</translation>
> <translation>s#[KĶķƘƙǨǩḰḱḲḳḴḵ]#k#g</translation>
> <translation>s#[LĹĺĻļĽľĿŀŁłƚLjȴɫɬɭḶḷḸḹḺḻḼḽ]#l#g</translation>
> <translation>s#[MɱḾḿṀṁṂṃ]#m#g</translation>
> <translation>s#[NÑñŃńŅņŇňƝɲƞȠNjǸǹȵɳṄṅṆṇṈṉṊṋ]#n#g</translation>
> <translation>s#[OÒòÓóÔôÕõØøŌōŎŏŐőƟƠơǑǒǪǫǬǭǾǿȌȍȎȏȪȫȬȭȮȯȰȱṌṍṎṏṐṑṒṓỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợ]#o#g</translation>
> <translation>s#[PƤƥṔṕṖṗ]#p#g</translation>
> <translation>s#[Qʠ]#q#g</translation>
> <translation>s#[RŔŕŖŗŘřȐȑȒȓɼɽɾṘṙṚṛṜṝṞṟ]#r#g</translation>
> <translation>s#[SŚśŜŝŞşŠšȘșʂṠṡṢṣṤṥṦṧṨṩ]#s#g</translation>
> <translation>s#[TŢţŤťŦŧƫƬƭƮʈȚțȶṪṫṬṭṮṯṰṱẗ]#t#g</translation>
> <translation>s#[UÙùÚúÛûŨũŪūŬŭŮůŰűŲųƯưǓǔǕǖǗǘǙǚǛǜȔȕȖȗṲṳṴṵṶṷṸṹṺṻỤụỦủỨứỪừỬửỮữỰự]#u#g</translation>
> <translation>s#[VƲʋṼṽṾṿ]#v#g</translation>
> <translation>s#[WŴŵẀẁẂẃẄẅẆẇẈẉẘ]#w#g</translation>
> <translation>s#[XẊẋẌẍ]#x#g</translation>
> <translation>s#[YÝýÿŸŶŷƳƴȲȳẎẏẙỲỳỴỵỶỷỸỹ]#y#g</translation>
> <translation>s#[ZŹźŻżŽžƵƶȤȥʐʑẐẑẒẓẔẕ]#z#g</translation>
>
> _______________________________________________
> This mail is sent to you from the opencms-dev mailing list
> To change your list options, or to unsubscribe from the list, please visit
> http://lists.opencms.org/cgi-bin/mailman/listinfo/opencms-dev



