28 May 2011 at 12:46am
I recently upgraded a site from 2.3.x to 2.4.3. As part of the process I made a copy of the site, including the MySQL database, and then performed the upgrade, added the mods my client required and made some fixes so my custom modules work correctly with 2.4. I then copied the database again (via a MySQL dump) to ensure I had the latest version of the content (as my client needed to keep updating the site daily while I worked on the new version).
Ever since then I have noticed odd strings such as Ã�Â appearing in the HTML area fields. They seem to appear next to / in place of special characters such as  c; and © , etc. They appear when clicking "Save" or "Save and Publish", and they build up with each save. This means some pages (after being saved a few times) have strings of these "junk" characters in them, e.g. Ã�ÂÃ�ÂÃ�ÂÃ�ÂÃ�Â
I've tried looking into issues surrounding htmlentities and php and also looking at the forums on the tinymce web site. So far I've not been able to come up with a fix.
Can anyone help or at least steer me in the right direction?
28 May 2011 at 1:23am
I've been logging my progress with this annoying one...
28 May 2011 at 10:08pm Last edited: 28 May 2011 10:09pm
I've had that happen just recently, after an upgrade from 2.3 to 2.4. In mysite/_config.php there's a line that sets the character set for the db connection to utf8. After I replaced it with latin1, all was well again.
Obviously this feels like it should be a temporary fix, but it works while trying to figure out the best course of action...
29 May 2011 at 4:26am
Thanks, I tried changing the line but things are exactly the same. Junk characters by the truckload when I save the pages.
30 May 2011 at 12:19am
Just found that the site was in dev mode... I thought I had changed that when I made it live, but I guess I missed it... Anyway, that seems to have fixed it, though I have left in the line martimiz suggested.
30 May 2011 at 2:19am
Apparently utf-8 processing in 2.3 wasn't quite consistent:
read this post: http://silverstripe.org/migrating-a-site-to-silverstripe/show/5849
see this ticket: http://open.silverstripe.org/ticket/3746
So the encoding of some characters in a 2.3 database might very well be multibyte but not quite utf8? Some of these multibyte characters might be recognized by 2.4 as being two separate characters, that are then both converted to utf8 when you (re)publish a page? This could result in a nonbreakable space being converted into a Â followed by a nonbreakable space. On the other hand some characters are suddenly interpreted correctly if you set the charset to latin. Things like this confuse me a lot...
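For what it's worth, here is a minimal sketch of the mechanism guessed at above, assuming the damage comes from UTF-8 bytes being read back as Latin-1 and then re-encoded as UTF-8 on each save. The function name is hypothetical and this is plain Python, not SilverStripe code. Python's "latin-1" codec also differs slightly from MySQL's latin1 charset, so treat it as an illustration only.

```python
# Hypothetical illustration: one "save" cycle in which UTF-8 bytes are
# misread as Latin-1 and the result is stored as UTF-8 again.
def misread_round_trip(text: str) -> str:
    # Encode correctly as UTF-8, then decode with the wrong charset.
    return text.encode("utf-8").decode("latin-1")


nbsp = "\u00a0"  # non-breaking space, as TinyMCE inserts it

once = misread_round_trip(nbsp)    # "Â" followed by a non-breaking space
twice = misread_round_trip(once)   # "Ã", an invisible control char, "Â", nbsp

# Every further cycle makes the string longer, which would explain
# junk that builds up with each save.
for label, value in (("original", nbsp), ("1 save", once), ("2 saves", twice)):
    print(label, [hex(ord(ch)) for ch in value])
```

After two cycles the single space has become four characters, starting with Ã, an unprintable character and Â, which looks a lot like the strings in the first post.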
30 May 2011 at 2:46am
That certainly seems consistent with what I have experienced. I now have over 500 pages, some of which have errors and some of which don't. Although it appears that no more of these characters are being introduced, some were saved in the intervening period. I know I could try to remove them with an SQL query run directly on the database, but I am very loath to do that lest I make more problems for myself!
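If the junk really is repeated UTF-8-read-as-Latin-1 damage, one cautious option is to repair exported content outside the database first and inspect the result before touching live data. The following is only a sketch under that assumption. `undo_misread` is a hypothetical helper, not part of SilverStripe or MySQL.

```python
# Hypothetical helper: reverse the "UTF-8 read as Latin-1" damage one
# layer at a time, stopping as soon as a reversal is no longer valid.
def undo_misread(text: str, max_rounds: int = 10) -> str:
    for _ in range(max_rounds):
        try:
            repaired = text.encode("latin-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not (or no longer) double-encoded: keep what we have.
            break
        if repaired == text:
            # Pure ASCII or otherwise unchanged: nothing left to undo.
            break
        text = repaired
    return text


damaged = "caf\u00c3\u0083\u00c2\u00a9"  # "café" after two bad round trips
print(undo_misread(damaged))
```

This works on a whole string at once, so a field that mixes damaged text with characters outside Latin-1 is left untouched, and text that legitimately contains a sequence like Â followed by a non-breaking space would be changed wrongly. Run it against a copy and compare before and after.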
3 June 2011 at 10:37pm
I've tried the MySQLDatabase::set_connection_charset('latin1'); fix, which at first appeared to solve the problem, but I have found it throws up a new error if there is a non-breaking space in the content. TinyMCE inserts these in several cases, mainly into empty table cells. I've included a screenshot of the error.
Any ideas of a way around this without hacking the core to change to  ?