i18n and non-Latin: Cyrillic hacks


27 November 2008

I'm currently working on a Cyrillic site (bilingual: Russian and Ukrainian only). There are some issues with non-Latin characters in page titles:

1. sapphire/javascript/UpdateURL.js - the default JavaScript only works for Latin characters; for Cyrillic titles it "suggests" blank URL segments or a series of "-". The general solution would be to enable Unicode URL segments like this: [url]Милан[/url], but for a particular language or alphabet it can be worked around with transliteration.
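
To illustrate why a Latin-only suggestion breaks on Cyrillic titles, here is a minimal sketch. The function suggestSegmentLatinOnly is hypothetical and is not the actual code from UpdateURL.js; it only assumes that the default logic throws away everything outside a-z and 0-9.

```javascript
// Hypothetical Latin-only slug helper; NOT the actual UpdateURL.js code.
function suggestSegmentLatinOnly(title) {
   return title
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, '-')   // every run of other characters becomes one dash
      .replace(/^-+|-+$/g, '');      // trim leading and trailing dashes
}

console.log(suggestSegmentLatinOnly('Milan city guide'));   // "milan-city-guide"
console.log(suggestSegmentLatinOnly('Милан'));              // "" (blank segment)
console.log(suggestSegmentLatinOnly('Милан 2008'));         // "2008"
```

Without the trimming step the Cyrillic title would come out as a bare "-" instead, which matches the other symptom described above.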

Here's my patched sapphire/javascript/UpdateURL.js for Cyrillic:

// Parallel arrays: cyrChars[n] is transliterated to transChars[n].
// Only the uppercase pairs are shown; lowercase letters need matching entries too.
var cyrChars = new Array('А','Б','В','Г','Д','Е','Ё','Ж','З','И','Й','К','Л','М','Н','О','П','Р','С','Т','У','Ф','Х','Ч','Ц','Ш','Щ','Э','Ю','Я','Ы','Ъ','Ь');
var transChars = new Array('A','B','V','G','D','E','Jo','Zh','Z','I','J','K','L','M','N','O','P','R','S','T','U','F','H','Ch','C','Sh','Csh','E','Ju','Ja','Y','\`','\'');

function translitCyrillic(from){
   var to = "";
   var len = from.length;
   var character, isCyr, i, j;
   for(i = 0; i < len; i++){
      character = from.charAt(i);
      isCyr = false;
      // look the character up in the Cyrillic table
      for(j = 0; j < cyrChars.length; j++){
         if(character == cyrChars[j]){
            isCyr = true;
            break;
         }
      }
      // after the break, j still points at the matching entry
      to += (isCyr) ? transChars[j] : character;
   }
   return to;
}

/* some lines skipped - inserting translitCyrillic call here: */

var newSuggestion = urlSegmentField.suggestNewValue( translitCyrillic(this.value.toLowerCase()) );

/* ... */
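
The same lookup can also be sketched with an object map instead of two parallel arrays, which avoids the inner loop. This is only an illustration: the names upperPairs, buildTranslitMap and translitWithMap are made up here, the table is a small subset of the pairs listed above, and deriving the lowercase pairs from the uppercase ones is an assumption rather than what the patched file does.

```javascript
// Sketch only: a small subset of the pairs from the post. Deriving lowercase
// pairs from the uppercase ones is an assumption, not the original table.
var upperPairs = {
   'А': 'A', 'И': 'I', 'Л': 'L', 'М': 'M', 'Н': 'N', 'Ж': 'Zh', 'Ю': 'Ju'
};

// Build one lookup object that holds both the uppercase and lowercase pairs.
function buildTranslitMap(pairs) {
   var map = {};
   for (var cyr in pairs) {
      if (Object.prototype.hasOwnProperty.call(pairs, cyr)) {
         map[cyr] = pairs[cyr];
         map[cyr.toLowerCase()] = pairs[cyr].toLowerCase();
      }
   }
   return map;
}

var translitMap = buildTranslitMap(upperPairs);

// Replace each character found in the map; leave everything else untouched.
function translitWithMap(from) {
   var to = '';
   for (var i = 0; i < from.length; i++) {
      var character = from.charAt(i);
      to += Object.prototype.hasOwnProperty.call(translitMap, character)
         ? translitMap[character]
         : character;
   }
   return to;
}

console.log(translitWithMap('Милан'));                 // "Milan"
console.log(translitWithMap('Милан'.toLowerCase()));   // "milan"
```

Because the map covers both cases, it does not matter whether the title is lowercased before or after transliteration.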

2. sapphire/core/model/SiteTree.php - I moved the URL Segment input/fieldset from the "meta" tab to the "main" tab, because I have to set or check the URL for almost every page, fixing very long auto-suggested URLs and so on.

3. sapphire/core/control/ContentController.php - a small addition to render multi-language controls in templates without widgets. I don't really need complex logic here, so I just wrote a method:

/**
 * Returns just the current language code - for multilanguage switching etc.
 */
function CurrentLang() {
	return Translatable::current_lang();
}

And now my language switching code in template is really dumb'n'simple:

<% if CurrentLang == ru %>
<span class="i18n-link"><a href="$Link?lang=ru" class="active">RU</a></span>
<span class="i18n-link"><a href="$Link?lang=uk">UK</a></span>
<% end_if %>
<% if CurrentLang == uk %>
<span class="i18n-link"><a href="$Link?lang=ru">RU</a></span>
<span class="i18n-link"><a href="$Link?lang=uk" class="active">UK</a></span>
<% end_if %>

And this is only the beginning...


12 December 2008