UTF-8 characters mangled after migration from 3.0.0 to 3.1.2

deadlyllama

Community Member, 4 Posts

21 January 2014 at 2:22pm

I'm trying to migrate a site from SilverStripe CMS v3.0.0 to v3.1.2.

So far I've untarred the new version, copied over the themes and assets, copied mysite/_config.php, and symlinked "__mysite" to "mysite", which the new version seems to need. The site renders fine, except that all the UTF-8 characters appear to be double-encoded.

The site has a few pages (and page titles) in Korean, and a bunch of non-ASCII characters strewn through the English text. Prior to the upgrade, these rendered fine. I've tested upgrading to 3.0.5 and they still render OK. (3.0.6+ had some odd issues with producing invalid XML).

With 3.1.2, I see spurious "Â" characters in the text, and the page formerly titled "한국어" is now titled " 한국어". And other non-ASCII characters are similarly odd.
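For readers unfamiliar with the symptom, here is a minimal Python sketch of how a stray "Â" can arise. It assumes the mangling comes from UTF-8 bytes being misread as Latin-1 and then re-encoded as UTF-8; the `double_encode` helper is hypothetical and only simulates that assumption, it is not taken from SilverStripe or MySQL.

```python
def double_encode(text: str) -> str:
    """Simulate double encoding: take the UTF-8 bytes of `text` and
    misread them as Latin-1 (assumed legacy charset for this sketch)."""
    return text.encode("utf-8").decode("latin-1")


# A non-breaking space (U+00A0) is the bytes C2 A0 in UTF-8.
# Misread as Latin-1, C2 becomes "Â" and A0 stays a non-breaking space,
# so an extra "Â" shows up in front of what looks like an ordinary space.
mangled_nbsp = double_encode("\u00a0")

# An accented letter shows the same pattern: "é" is C3 A9 in UTF-8.
mangled_e = double_encode("é")
```

Under that assumption, non-breaking spaces in the page content would account for the extra "Â" characters, and every other non-ASCII character would expand into two or more Latin-1 characters.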

The document header includes
<meta http-equiv="Content-type" content="text/html; charset=utf-8" >

and my browser says it's displaying the page as UTF-8 (in fact, it won't let me select anything else). This was true of the old version of the site, too.

What's going wrong?

deadlyllama

Community Member, 4 Posts

22 January 2014 at 8:51am

Problem solved. I looked at a hex dump of my database dump and found a lot of double-encoded UTF-8. I tried several methods of fixing this, but the only one that worked was an in-place fix: http://www.noelboss.com/2011/11/fixing-double-encoded-utf-8-data-in-mysql/
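As a rough illustration of what such a repair does, here is a minimal Python sketch that reverses one round of double encoding on a single string. It is not the method from the linked post, which works inside MySQL; the function name `fix_double_encoded` and the choice of Latin-1 as the legacy charset are assumptions made for this example.

```python
def fix_double_encoded(text: str, legacy: str = "latin-1") -> str:
    """Undo one round of double encoding, if the text looks double-encoded.

    Assumption: the damage came from UTF-8 bytes being misread as `legacy`.
    If the round trip fails, the text is returned unchanged.
    """
    try:
        return text.encode(legacy).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text contains characters outside the legacy charset
        # (so it was not produced by the misreading), or the recovered
        # bytes are not valid UTF-8. In both cases, leave it alone.
        return text


# What double-encoded data looks like in a hex dump: "é" should be the
# two bytes C3 A9, but double-encoded it becomes four bytes.
double_encoded_hex = "é".encode("utf-8").decode("latin-1").encode("utf-8").hex(" ")

# Round trip on the Korean page title from the question.
broken_title = "한국어".encode("utf-8").decode("latin-1")
repaired_title = fix_double_encoded(broken_title)
```

Correctly encoded text passes through untouched because it either cannot be encoded as Latin-1 or does not decode as valid UTF-8 afterwards. The sketch can still misfire on text that legitimately contains a sequence such as "Ã©", so take a backup before running anything like this over real data.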