[opencms-dev] Some additional file translation rules for dealing with diacritics

Alexander Kandzior alex at opencms.org
Fri Aug 10 11:52:40 CEST 2012


Christian, 

thanks for this. I think we should add this to the core distribution. I'll have to see about the upper / lower case issue though, as it will be a change from the current handling and some may not like this.

As you mentioned the bug tracker: Would you be able to send a pull request on GitHub as well? That would be great.

Kind Regards,
Alex.
 
-------------------
Alexander Kandzior
                                                              
Alkacon Software GmbH  - The OpenCms Experts                 
http://www.alkacon.com - http://www.opencms.org                  


-----Original Message-----
From: opencms-dev-bounces at opencms.org [mailto:opencms-dev-bounces at opencms.org] On Behalf Of Christian Steinert
Sent: Friday, August 10, 2012 2:05 AM
To: The OpenCms mailing list
Subject: Re: [opencms-dev] Some additional file translation rules for dealing with diacritics

Dear Alexander

At least in my email client, your reply still had all characters intact, but here are the characters in escaped form (I used Java's native2ascii).
If you want, I am happy to also create an issue in the bug tracker; just let me know. Right now the rules also convert everything to lower case, which fits my needs. To preserve upper case, the characters would have to be re-shuffled and additional rules for upper case would need to be introduced.
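
For readers without a JDK at hand, the escaping step can be approximated in a few lines of Python. This is only a sketch that produces the same "\uXXXX" format used in the rules in this mail; the function name escape_non_ascii is made up for illustration and is not part of native2ascii or OpenCms.

```python
def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a \\uXXXX escape (lowercase hex)."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        # Encode as UTF-16 so that characters outside the BMP become two
        # escapes (a surrogate pair), which is how Java strings store them.
        data = ch.encode("utf-16-be")
        for i in range(0, len(data), 2):
            out.append("\\u%04x" % int.from_bytes(data[i:i + 2], "big"))
    return "".join(out)


# A shortened version of the "b" rule, written with the real characters.
print(escape_non_ascii("s#[B\u0180\u0181]#b#g"))  # prints s#[B\u0180\u0181]#b#g
```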

The first character in each rule is the standard uppercase character, which should be eliminated altogether if the intention is to preserve upper case.
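
The re-shuffling described above could be scripted rather than done by hand. The following is a rough sketch, assuming that Python's Unicode case properties are an acceptable way to sort the characters; split_rule is a hypothetical helper, and it emits the raw characters, so the result would still need to be escaped before going into the configuration.

```python
def split_rule(base: str, members: str) -> list:
    """Split one character list into an upper case and a lower case rule.

    `members` holds the accented characters only; the plain base letter is
    left out because mapping it onto itself would be a no-op.
    """
    lower = "".join(c for c in members if c.islower())
    # Everything that is not lower case (capital and titlecase letters)
    # goes into the upper case rule.
    upper = "".join(c for c in members if not c.islower())
    return [
        "s#[%s]#%s#g" % (upper, base.upper()),
        "s#[%s]#%s#g" % (lower, base.lower()),
    ]


# The first few characters of the "b" rule, without the leading "B".
for rule in split_rule("b", "\u0180\u0181\u0253\u0182\u0183"):
    print(rule)
```

For the five characters in this example, the first rule collects U+0181 and U+0182 and maps them to "B"; the second collects U+0180, U+0253 and U+0183 and maps them to "b".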

Kind regards
Christian


<translation>s#[A\u00c0\u00e0\u00c1\u00e1\u00c2\u00e2\u00c3\u00e3\u00c5\u00e5\u0100\u0101\u0102\u0103\u0104\u0105\u01cd\u01ce\u01de\u01df\u01e0\u01e1\u01fa\u01fb\u0200\u0201\u0202\u0203\u0226\u0227\u1e00\u1e01\u1e9a\u1ea0\u1ea1\u1ea2\u1ea3\u1ea4\u1ea5\u1ea6\u1ea7\u1ea8\u1ea9\u1eaa\u1eab\u1eac\u1ead\u1eae\u1eaf\u1eb0\u1eb1\u1eb2\u1eb3\u1eb4\u1eb5\u1eb6\u1eb7]#a#g</translation>
<translation>s#[B\u0180\u0181\u0253\u0182\u0183\u1e02\u1e03\u1e04\u1e05\u1e06\u1e07]#b#g</translation>
<translation>s#[C\u00c7\u00e7\u0106\u0107\u0108\u0109\u010a\u010b\u010c\u010d\u0187\u0188\u0255\u1e08\u1e09]#c#g</translation>
<translation>s#[D\u010e\u010f\u0110\u0111\u018a\u0257\u018b\u018c\u01c5\u01f2\u0221\u0256\u1e0a\u1e0b\u1e0c\u1e0d\u1e0e\u1e0f\u1e10\u1e11\u1e12\u1e13]#d#g</translation>
<translation>s#[E\u00c8\u00e8\u00c9\u00e9\u00ca\u00ea\u00cb\u00eb\u0112\u0113\u0114\u0115\u0116\u0117\u0118\u0119\u011a\u011b\u0204\u0205\u0206\u0207\u0228\u0229\u1e14\u1e15\u1e16\u1e17\u1e18\u1e19\u1e1a\u1e1b\u1e1c\u1e1d\u1eb8\u1eb9\u1eba\u1ebb\u1ebc\u1ebd\u1ebe\u1ebf\u1ec0\u1ec1\u1ec2\u1ec3\u1ec4\u1ec5\u1ec6\u1ec7]#e#g</translation>
<translation>s#[F\u0191\u0192\u1e1e\u1e1f]#f#g</translation>
<translation>s#[G\u011c\u011d\u011e\u011f\u0120\u0121\u0122\u0123\u0193\u0260\u01e4\u01e5\u01e6\u01e7\u01f4\u01f5\u1e20\u1e21]#g#g</translation>
<translation>s#[H\u0124\u0125\u0126\u0127\u021e\u021f\u0266\u1e22\u1e23\u1e24\u1e25\u1e26\u1e27\u1e28\u1e29\u1e2a\u1e2b\u1e96]#h#g</translation>
<translation>s#[I\u00cc\u00ec\u00cd\u00ed\u00ce\u00ee\u00cf\u00ef\u0128\u0129\u012a\u012b\u012c\u012d\u012e\u012f\u0130\u0197\u0268\u01cf\u01d0\u0208\u0209\u020a\u020b\u1e2c\u1e2d\u1e2e\u1e2f\u1ec8\u1ec9\u1eca\u1ecb]#i#g</translation>
<translation>s#[J\u0134\u0135\u01f0\u029d]#j#g</translation>
<translation>s#[K\u0136\u0137\u0198\u0199\u01e8\u01e9\u1e30\u1e31\u1e32\u1e33\u1e34\u1e35]#k#g</translation>
<translation>s#[L\u0139\u013a\u013b\u013c\u013d\u013e\u013f\u0140\u0141\u0142\u019a\u01c8\u0234\u026b\u026c\u026d\u1e36\u1e37\u1e38\u1e39\u1e3a\u1e3b\u1e3c\u1e3d]#l#g</translation>
<translation>s#[M\u0271\u1e3e\u1e3f\u1e40\u1e41\u1e42\u1e43]#m#g</translation>
<translation>s#[N\u00d1\u00f1\u0143\u0144\u0145\u0146\u0147\u0148\u019d\u0272\u019e\u0220\u01cb\u01f8\u01f9\u0235\u0273\u1e44\u1e45\u1e46\u1e47\u1e48\u1e49\u1e4a\u1e4b]#n#g</translation>
<translation>s#[O\u00d2\u00f2\u00d3\u00f3\u00d4\u00f4\u00d5\u00f5\u00d8\u00f8\u014c\u014d\u014e\u014f\u0150\u0151\u019f\u01a0\u01a1\u01d1\u01d2\u01ea\u01eb\u01ec\u01ed\u01fe\u01ff\u020c\u020d\u020e\u020f\u022a\u022b\u022c\u022d\u022e\u022f\u0230\u0231\u1e4c\u1e4d\u1e4e\u1e4f\u1e50\u1e51\u1e52\u1e53\u1ecc\u1ecd\u1ece\u1ecf\u1ed0\u1ed1\u1ed2\u1ed3\u1ed4\u1ed5\u1ed6\u1ed7\u1ed8\u1ed9\u1eda\u1edb\u1edc\u1edd\u1ede\u1edf\u1ee0\u1ee1\u1ee2\u1ee3]#o#g</translation>
<translation>s#[P\u01a4\u01a5\u1e54\u1e55\u1e56\u1e57]#p#g</translation>
<translation>s#[Q\u02a0]#q#g</translation>
<translation>s#[R\u0154\u0155\u0156\u0157\u0158\u0159\u0210\u0211\u0212\u0213\u027c\u027d\u027e\u1e58\u1e59\u1e5a\u1e5b\u1e5c\u1e5d\u1e5e\u1e5f]#r#g</translation>
<translation>s#[S\u015a\u015b\u015c\u015d\u015e\u015f\u0160\u0161\u0218\u0219\u0282\u1e60\u1e61\u1e62\u1e63\u1e64\u1e65\u1e66\u1e67\u1e68\u1e69]#s#g</translation>
<translation>s#[T\u0162\u0163\u0164\u0165\u0166\u0167\u01ab\u01ac\u01ad\u01ae\u0288\u021a\u021b\u0236\u1e6a\u1e6b\u1e6c\u1e6d\u1e6e\u1e6f\u1e70\u1e71\u1e97]#t#g</translation>
<translation>s#[U\u00d9\u00f9\u00da\u00fa\u00db\u00fb\u0168\u0169\u016a\u016b\u016c\u016d\u016e\u016f\u0170\u0171\u0172\u0173\u01af\u01b0\u01d3\u01d4\u01d5\u01d6\u01d7\u01d8\u01d9\u01da\u01db\u01dc\u0214\u0215\u0216\u0217\u1e72\u1e73\u1e74\u1e75\u1e76\u1e77\u1e78\u1e79\u1e7a\u1e7b\u1ee4\u1ee5\u1ee6\u1ee7\u1ee8\u1ee9\u1eea\u1eeb\u1eec\u1eed\u1eee\u1eef\u1ef0\u1ef1]#u#g</translation>
<translation>s#[V\u01b2\u028b\u1e7c\u1e7d\u1e7e\u1e7f]#v#g</translation>
<translation>s#[W\u0174\u0175\u1e80\u1e81\u1e82\u1e83\u1e84\u1e85\u1e86\u1e87\u1e88\u1e89\u1e98]#w#g</translation>
<translation>s#[X\u1e8a\u1e8b\u1e8c\u1e8d]#x#g</translation>
<translation>s#[Y\u00dd\u00fd\u00ff\u0178\u0176\u0177\u01b3\u01b4\u0232\u0233\u1e8e\u1e8f\u1e99\u1ef2\u1ef3\u1ef4\u1ef5\u1ef6\u1ef7\u1ef8\u1ef9]#y#g</translation>
<translation>s#[Z\u0179\u017a\u017b\u017c\u017d\u017e\u01b5\u01b6\u0224\u0225\u0290\u0291\u1e90\u1e91\u1e92\u1e93\u1e94\u1e95]#z#g</translation>
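
To get a feeling for what the rules do to a file name, they can be emulated outside of OpenCms. The sketch below applies a handful of shortened rules with Python's re module; it only assumes the "s#pattern#replacement#flags" shape visible above and is not how OpenCms itself evaluates the rules. RULES, apply_rule and translate are illustrative names.

```python
import re

# Shortened versions of four of the rules above (real rules list many more
# characters); "\uXXXX" here is Python's own escape for the same characters.
RULES = [
    "s#[E\u00c8\u00e8\u00c9\u00e9\u00ca\u00ea\u00cb\u00eb]#e#g",
    "s#[U\u00d9\u00f9\u00da\u00fa\u00db\u00fb]#u#g",
    "s#[C\u00c7\u00e7]#c#g",
    "s#[B\u0180\u0181]#b#g",
]


def apply_rule(rule: str, name: str) -> str:
    # Split "s#pattern#replacement#flags" on the "#" delimiter.
    _, pattern, replacement, flags = rule.split("#")
    # With the "g" flag every match is replaced, otherwise only the first.
    count = 0 if "g" in flags else 1
    return re.sub(pattern, replacement, name, count=count)


def translate(name: str) -> str:
    # Rules are applied one after another, in list order.
    for rule in RULES:
        name = apply_rule(rule, name)
    return name


print(translate("Cr\u00e8me_Br\u00fbl\u00e9e.jpg"))  # prints creme_brulee.jpg
```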


Kind regards
Christian


> I am interested in this.
>
> Before I copy and paste this into "my" standard configuration: I do suspect that some of the characters may have been corrupted because of the email transport. Could you translate this from your original source into the "\uXXXX" notation and send it again? Or maybe raise an issue in GitHub with the original file attached.
>
> Kind Regards,
> Alex.
>
> -------------------
> Alexander Kandzior
>
> Alkacon Software GmbH  - The OpenCms Experts
> http://www.alkacon.com - http://www.opencms.org
>
>
> -----Original Message-----
> From: opencms-dev-bounces at opencms.org [mailto:opencms-dev-bounces at opencms.org] On Behalf Of Christian Steinert
> Sent: Thursday, August 09, 2012 4:51 AM
> To: The OpenCms mailing list
> Subject: [opencms-dev] Some additional file translation rules for dealing with diacritics
>
> Dear All
>
> Maybe somebody else is interested in this. I created some additional file translation rules to deal with additional characters that have diacritical marks.
> The following rules can be added to the other translation rules in opencms-vfs.xml. You should leave all other rules as they are. Please make sure that you add these rules BEFORE THE LAST TWO RULES because the last two rules allow the default characters and replace everything else with "x".
>
> These rules convert all Latin characters into the most similar basic form, and they will also convert everything to lower case.
>
> Kind regards
> Christian
>
>
> <translation>s#[AÀàÁáÂâÃãÅåĀāĂăĄąǍǎǞǟǠǡǺǻȀȁȂȃȦȧḀḁẚẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặ]#a#g</translation>
> <translation>s#[BƀƁɓƂƃḂḃḄḅḆḇ]#b#g</translation>
> <translation>s#[CÇçĆćĈĉĊċČčƇƈɕḈḉ]#c#g</translation>
> <translation>s#[DĎďĐđƊɗƋƌDžDzȡɖḊḋḌḍḎḏḐḑḒḓ]#d#g</translation>
> <translation>s#[EÈèÉéÊêËëĒēĔĕĖėĘęĚěȄȅȆȇȨȩḔḕḖḗḘḙḚḛḜḝẸẹẺẻẼẽẾếỀềỂểỄễỆệ]#e#g</translation>
> <translation>s#[FƑƒḞḟ]#f#g</translation>
> <translation>s#[GĜĝĞğĠġĢģƓɠǤǥǦǧǴǵḠḡ]#g#g</translation>
> <translation>s#[HĤĥĦħȞȟɦḢḣḤḥḦḧḨḩḪḫẖ]#h#g</translation>
> <translation>s#[IÌìÍíÎîÏïĨĩĪīĬĭĮįİƗɨǏǐȈȉȊȋḬḭḮḯỈỉỊị]#i#g</translation>
> <translation>s#[JĴĵǰʝ]#j#g</translation>
> <translation>s#[KĶķƘƙǨǩḰḱḲḳḴḵ]#k#g</translation>
> <translation>s#[LĹĺĻļĽľĿŀŁłƚLjȴɫɬɭḶḷḸḹḺḻḼḽ]#l#g</translation>
> <translation>s#[MɱḾḿṀṁṂṃ]#m#g</translation>
> <translation>s#[NÑñŃńŅņŇňƝɲƞȠNjǸǹȵɳṄṅṆṇṈṉṊṋ]#n#g</translation>
> <translation>s#[OÒòÓóÔôÕõØøŌōŎŏŐőƟƠơǑǒǪǫǬǭǾǿȌȍȎȏȪȫȬȭȮȯȰȱṌṍṎṏṐṑṒṓỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợ]#o#g</translation>
> <translation>s#[PƤƥṔṕṖṗ]#p#g</translation>
> <translation>s#[Qʠ]#q#g</translation>
> <translation>s#[RŔŕŖŗŘřȐȑȒȓɼɽɾṘṙṚṛṜṝṞṟ]#r#g</translation>
> <translation>s#[SŚśŜŝŞşŠšȘșʂṠṡṢṣṤṥṦṧṨṩ]#s#g</translation>
> <translation>s#[TŢţŤťŦŧƫƬƭƮʈȚțȶṪṫṬṭṮṯṰṱẗ]#t#g</translation>
> <translation>s#[UÙùÚúÛûŨũŪūŬŭŮůŰűŲųƯưǓǔǕǖǗǘǙǚǛǜȔȕȖȗṲṳṴṵṶṷṸṹṺṻỤụỦủỨứỪừỬửỮữỰự]#u#g</translation>
> <translation>s#[VƲʋṼṽṾṿ]#v#g</translation>
> <translation>s#[WŴŵẀẁẂẃẄẅẆẇẈẉẘ]#w#g</translation>
> <translation>s#[XẊẋẌẍ]#x#g</translation>
> <translation>s#[YÝýÿŸŶŷƳƴȲȳẎẏẙỲỳỴỵỶỷỸỹ]#y#g</translation>
> <translation>s#[ZŹźŻżŽžƵƶȤȥʐʑẐẑẒẓẔẕ]#z#g</translation>
>
> _______________________________________________
> This mail is sent to you from the opencms-dev mailing list
> To change your list options, or to unsubscribe from the list, please visit
> http://lists.opencms.org/cgi-bin/mailman/listinfo/opencms-dev
