[opencms-dev] Some additional file translation rules for dealing with diacritics
Christian Steinert
christian at christian-steinert.de
Tue Aug 14 17:19:39 CEST 2012
Dear Alex
I opened a bug and also created a pull-request. Please see https://github.com/alkacon/opencms-core/issues/68
I rearranged the new translation rules so that upper and lower case are now preserved. I also converted all non-ASCII characters to XML character entities.
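As an illustration of the idea only (a sketch, not copied from the pull request): a case-preserving version splits each original rule into one rule for the upper-case variants and one for the lower-case variants. The fragment below shows this for a handful of "A" characters. The hexadecimal form of the character references and the small selection of characters are assumptions made for the example.

```xml
<!-- Illustrative sketch only; the actual rules are in the GitHub issue linked above. -->
<!-- Upper-case variants (À Á Â Ã Å) are mapped to a plain upper-case "A". -->
<translation>s#[&#x00c0;&#x00c1;&#x00c2;&#x00c3;&#x00c5;]#A#g</translation>
<!-- Lower-case variants (à á â ã å) are mapped to a plain lower-case "a". -->
<translation>s#[&#x00e0;&#x00e1;&#x00e2;&#x00e3;&#x00e5;]#a#g</translation>
```

Because the XML parser resolves the character references before the rule text is read, the "#" inside each reference should not clash with the "#" used as the rule delimiter.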
Kind regards
Christian
> Christian,
>
> thanks for this. I think we should add this to the core distribution. I'll have to see about the upper / lower case issue though, as it will be a change from the current handling and some may not like this.
>
> As you mentioned the bug tracker: Would you be able to send a pull request on GitHub as well? That would be great.
>
> Kind Regards,
> Alex.
>
> -------------------
> Alexander Kandzior
>
> Alkacon Software GmbH - The OpenCms Experts
> http://www.alkacon.com - http://www.opencms.org
>
>
> -----Original Message-----
> From: opencms-dev-bounces at opencms.org [mailto:opencms-dev-bounces at opencms.org] On Behalf Of Christian Steinert
> Sent: Friday, August 10, 2012 2:05 AM
> To: The OpenCms mailing list
> Subject: Re: [opencms-dev] Some additional file translation rules for dealing with diacritics
>
> Dear Alexander
>
> At least in my email client, your reply still had all characters intact, but here are the characters in escaped form (I used Java's native2ascii).
> If you want, I am happy to also create an issue in the bug tracker - just let me know. Right now the rules also convert everything to lower case, which fits my needs. To preserve upper case, the characters would have to be re-shuffled and additional rules for upper case would need to be introduced.
>
> The first character in each rule is the standard uppercase character, which should be eliminated altogether if the intention is to preserve upper case.
>
> Kind regards
> Christian
>
>
> <translation>s#[A\u00c0\u00e0\u00c1\u00e1\u00c2\u00e2\u00c3\u00e3\u00c5\u00e5\u0100\u0101\u0102\u0103\u0104\u0105\u01cd\u01ce\u01de\u01df\u01e0\u01e1\u01fa\u01fb\u0200\u0201\u0202\u0203\u0226\u0227\u1e00\u1e01\u1e9a\u1ea0\u1ea1\u1ea2\u1ea3\u1ea4\u1ea5\u1ea6\u1ea7\u1ea8\u1ea9\u1eaa\u1eab\u1eac\u1ead\u1eae\u1eaf\u1eb0\u1eb1\u1eb2\u1eb3\u1eb4\u1eb5\u1eb6\u1eb7]#a#g</translation>
> <translation>s#[B\u0180\u0181\u0253\u0182\u0183\u1e02\u1e03\u1e04\u1e05\u1e06\u1e07]#b#g</translation>
> <translation>s#[C\u00c7\u00e7\u0106\u0107\u0108\u0109\u010a\u010b\u010c\u010d\u0187\u0188\u0255\u1e08\u1e09]#c#g</translation>
> <translation>s#[D\u010e\u010f\u0110\u0111\u018a\u0257\u018b\u018c\u01c5\u01f2\u0221\u0256\u1e0a\u1e0b\u1e0c\u1e0d\u1e0e\u1e0f\u1e10\u1e11\u1e12\u1e13]#d#g</translation>
> <translation>s#[E\u00c8\u00e8\u00c9\u00e9\u00ca\u00ea\u00cb\u00eb\u0112\u0113\u0114\u0115\u0116\u0117\u0118\u0119\u011a\u011b\u0204\u0205\u0206\u0207\u0228\u0229\u1e14\u1e15\u1e16\u1e17\u1e18\u1e19\u1e1a\u1e1b\u1e1c\u1e1d\u1eb8\u1eb9\u1eba\u1ebb\u1ebc\u1ebd\u1ebe\u1ebf\u1ec0\u1ec1\u1ec2\u1ec3\u1ec4\u1ec5\u1ec6\u1ec7]#e#g</translation>
> <translation>s#[F\u0191\u0192\u1e1e\u1e1f]#f#g</translation>
> <translation>s#[G\u011c\u011d\u011e\u011f\u0120\u0121\u0122\u0123\u0193\u0260\u01e4\u01e5\u01e6\u01e7\u01f4\u01f5\u1e20\u1e21]#g#g</translation>
> <translation>s#[H\u0124\u0125\u0126\u0127\u021e\u021f\u0266\u1e22\u1e23\u1e24\u1e25\u1e26\u1e27\u1e28\u1e29\u1e2a\u1e2b\u1e96]#h#g</translation>
> <translation>s#[I\u00cc\u00ec\u00cd\u00ed\u00ce\u00ee\u00cf\u00ef\u0128\u0129\u012a\u012b\u012c\u012d\u012e\u012f\u0130\u0197\u0268\u01cf\u01d0\u0208\u0209\u020a\u020b\u1e2c\u1e2d\u1e2e\u1e2f\u1ec8\u1ec9\u1eca\u1ecb]#i#g</translation>
> <translation>s#[J\u0134\u0135\u01f0\u029d]#j#g</translation>
> <translation>s#[K\u0136\u0137\u0198\u0199\u01e8\u01e9\u1e30\u1e31\u1e32\u1e33\u1e34\u1e35]#k#g</translation>
> <translation>s#[L\u0139\u013a\u013b\u013c\u013d\u013e\u013f\u0140\u0141\u0142\u019a\u01c8\u0234\u026b\u026c\u026d\u1e36\u1e37\u1e38\u1e39\u1e3a\u1e3b\u1e3c\u1e3d]#l#g</translation>
> <translation>s#[M\u0271\u1e3e\u1e3f\u1e40\u1e41\u1e42\u1e43]#m#g</translation>
> <translation>s#[N\u00d1\u00f1\u0143\u0144\u0145\u0146\u0147\u0148\u019d\u0272\u019e\u0220\u01cb\u01f8\u01f9\u0235\u0273\u1e44\u1e45\u1e46\u1e47\u1e48\u1e49\u1e4a\u1e4b]#n#g</translation>
> <translation>s#[O\u00d2\u00f2\u00d3\u00f3\u00d4\u00f4\u00d5\u00f5\u00d8\u00f8\u014c\u014d\u014e\u014f\u0150\u0151\u019f\u01a0\u01a1\u01d1\u01d2\u01ea\u01eb\u01ec\u01ed\u01fe\u01ff\u020c\u020d\u020e\u020f\u022a\u022b\u022c\u022d\u022e\u022f\u0230\u0231\u1e4c\u1e4d\u1e4e\u1e4f\u1e50\u1e51\u1e52\u1e53\u1ecc\u1ecd\u1ece\u1ecf\u1ed0\u1ed1\u1ed2\u1ed3\u1ed4\u1ed5\u1ed6\u1ed7\u1ed8\u1ed9\u1eda\u1edb\u1edc\u1edd\u1ede\u1edf\u1ee0\u1ee1\u1ee2\u1ee3]#o#g</translation>
> <translation>s#[P\u01a4\u01a5\u1e54\u1e55\u1e56\u1e57]#p#g</translation>
> <translation>s#[Q\u02a0]#q#g</translation>
> <translation>s#[R\u0154\u0155\u0156\u0157\u0158\u0159\u0210\u0211\u0212\u0213\u027c\u027d\u027e\u1e58\u1e59\u1e5a\u1e5b\u1e5c\u1e5d\u1e5e\u1e5f]#r#g</translation>
> <translation>s#[S\u015a\u015b\u015c\u015d\u015e\u015f\u0160\u0161\u0218\u0219\u0282\u1e60\u1e61\u1e62\u1e63\u1e64\u1e65\u1e66\u1e67\u1e68\u1e69]#s#g</translation>
> <translation>s#[T\u0162\u0163\u0164\u0165\u0166\u0167\u01ab\u01ac\u01ad\u01ae\u0288\u021a\u021b\u0236\u1e6a\u1e6b\u1e6c\u1e6d\u1e6e\u1e6f\u1e70\u1e71\u1e97]#t#g</translation>
> <translation>s#[U\u00d9\u00f9\u00da\u00fa\u00db\u00fb\u0168\u0169\u016a\u016b\u016c\u016d\u016e\u016f\u0170\u0171\u0172\u0173\u01af\u01b0\u01d3\u01d4\u01d5\u01d6\u01d7\u01d8\u01d9\u01da\u01db\u01dc\u0214\u0215\u0216\u0217\u1e72\u1e73\u1e74\u1e75\u1e76\u1e77\u1e78\u1e79\u1e7a\u1e7b\u1ee4\u1ee5\u1ee6\u1ee7\u1ee8\u1ee9\u1eea\u1eeb\u1eec\u1eed\u1eee\u1eef\u1ef0\u1ef1]#u#g</translation>
> <translation>s#[V\u01b2\u028b\u1e7c\u1e7d\u1e7e\u1e7f]#v#g</translation>
> <translation>s#[W\u0174\u0175\u1e80\u1e81\u1e82\u1e83\u1e84\u1e85\u1e86\u1e87\u1e88\u1e89\u1e98]#w#g</translation>
> <translation>s#[X\u1e8a\u1e8b\u1e8c\u1e8d]#x#g</translation>
> <translation>s#[Y\u00dd\u00fd\u00ff\u0178\u0176\u0177\u01b3\u01b4\u0232\u0233\u1e8e\u1e8f\u1e99\u1ef2\u1ef3\u1ef4\u1ef5\u1ef6\u1ef7\u1ef8\u1ef9]#y#g</translation>
> <translation>s#[Z\u0179\u017a\u017b\u017c\u017d\u017e\u01b5\u01b6\u0224\u0225\u0290\u0291\u1e90\u1e91\u1e92\u1e93\u1e94\u1e95]#z#g</translation>
>
>
> Kind regards
> Christian
>
>
>> I am interested in this.
>>
>> Before I copy and paste this into "my" standard configuration: I do suspect that some of the characters may have been corrupted because of the email transport. Could you translate this from your original source into the "\uXXXX" notation and send it again? Or maybe raise an issue in GitHub with the original file attached.
>>
>> Kind Regards,
>> Alex.
>>
>> -------------------
>> Alexander Kandzior
>>
>> Alkacon Software GmbH - The OpenCms Experts
>> http://www.alkacon.com - http://www.opencms.org
>>
>>
>> -----Original Message-----
>> From: opencms-dev-bounces at opencms.org [mailto:opencms-dev-bounces at opencms.org] On Behalf Of Christian Steinert
>> Sent: Thursday, August 09, 2012 4:51 AM
>> To: The OpenCms mailing list
>> Subject: [opencms-dev] Some additional file translation rules for dealing with diacritics
>>
>> Dear All
>>
>> Maybe somebody else is interested in this. I created some additional file translation rules to deal with additional characters that have diacritical marks.
>> The following rules can be added to the other translation rules in opencms-vfs.xml. You should leave all other rules as they are. Please make sure that you add these rules BEFORE THE LAST TWO RULES because the last two rules allow the default characters and replace everything else with "x".
>>
>> These rules convert all Latin characters into the most similar basic form, and they also convert everything to lower case.
>>
>> Kind regards
>> Christian
>>
>>
>> <translation>s#[AÀàÁáÂâÃãÅåĀāĂăĄąǍǎǞǟǠǡǺǻȀȁȂȃȦȧḀḁẚẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặ]#a#g</translation>
>> <translation>s#[BƀƁɓƂƃḂḃḄḅḆḇ]#b#g</translation>
>> <translation>s#[CÇçĆćĈĉĊċČčƇƈɕḈḉ]#c#g</translation>
>> <translation>s#[DĎďĐđƊɗƋƌDžDzȡɖḊḋḌḍḎḏḐḑḒḓ]#d#g</translation>
>> <translation>s#[EÈèÉéÊêËëĒēĔĕĖėĘęĚěȄȅȆȇȨȩḔḕḖḗḘḙḚḛḜḝẸẹẺẻẼẽẾếỀềỂểỄễỆệ]#e#g</translation>
>> <translation>s#[FƑƒḞḟ]#f#g</translation>
>> <translation>s#[GĜĝĞğĠġĢģƓɠǤǥǦǧǴǵḠḡ]#g#g</translation>
>> <translation>s#[HĤĥĦħȞȟɦḢḣḤḥḦḧḨḩḪḫẖ]#h#g</translation>
>> <translation>s#[IÌìÍíÎîÏïĨĩĪīĬĭĮįİƗɨǏǐȈȉȊȋḬḭḮḯỈỉỊị]#i#g</translation>
>> <translation>s#[JĴĵǰʝ]#j#g</translation>
>> <translation>s#[KĶķƘƙǨǩḰḱḲḳḴḵ]#k#g</translation>
>> <translation>s#[LĹĺĻļĽľĿŀŁłƚLjȴɫɬɭḶḷḸḹḺḻḼḽ]#l#g</translation>
>> <translation>s#[MɱḾḿṀṁṂṃ]#m#g</translation>
>> <translation>s#[NÑñŃńŅņŇňƝɲƞȠNjǸǹȵɳṄṅṆṇṈṉṊṋ]#n#g</translation>
>> <translation>s#[OÒòÓóÔôÕõØøŌōŎŏŐőƟƠơǑǒǪǫǬǭǾǿȌȍȎȏȪȫȬȭȮȯȰȱṌṍṎṏṐṑṒṓỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợ]#o#g</translation>
>> <translation>s#[PƤƥṔṕṖṗ]#p#g</translation>
>> <translation>s#[Qʠ]#q#g</translation>
>> <translation>s#[RŔŕŖŗŘřȐȑȒȓɼɽɾṘṙṚṛṜṝṞṟ]#r#g</translation>
>> <translation>s#[SŚśŜŝŞşŠšȘșʂṠṡṢṣṤṥṦṧṨṩ]#s#g</translation>
>> <translation>s#[TŢţŤťŦŧƫƬƭƮʈȚțȶṪṫṬṭṮṯṰṱẗ]#t#g</translation>
>> <translation>s#[UÙùÚúÛûŨũŪūŬŭŮůŰűŲųƯưǓǔǕǖǗǘǙǚǛǜȔȕȖȗṲṳṴṵṶṷṸṹṺṻỤụỦủỨứỪừỬửỮữỰự]#u#g</translation>
>> <translation>s#[VƲʋṼṽṾṿ]#v#g</translation>
>> <translation>s#[WŴŵẀẁẂẃẄẅẆẇẈẉẘ]#w#g</translation>
>> <translation>s#[XẊẋẌẍ]#x#g</translation>
>> <translation>s#[YÝýÿŸŶŷƳƴȲȳẎẏẙỲỳỴỵỶỷỸỹ]#y#g</translation>
>> <translation>s#[ZŹźŻżŽžƵƶȤȥʐʑẐẑẒẓẔẕ]#z#g</translation>
>>
>> _______________________________________________
>> This mail is sent to you from the opencms-dev mailing list
>> To change your list options, or to unsubscribe from the list, please visit
>> http://lists.opencms.org/cgi-bin/mailman/listinfo/opencms-dev