PHP UTF-8 tips
It’s not as hard as you might have heard. I’ve finally left my cosy ASCII world, and the move was less painful than I expected.
Start with this: http://www.nicknettleton.com/zine/php/php-utf-8-cheatsheet (a checklist of what you need to do).
Good reading here: http://www.phpwact.org/php/i18n/charsets. See also the links at the bottom of that page.
I’ve also added support for sending emails in different character sets to NMC_Email.php, and started a library of useful functions in NMC_UTF.php (in the repository soon).
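To give an idea of what the email side involves, here is a minimal sketch. It is not the actual code from NMC_Email.php: the function name build_mail_parts is made up for illustration, and it assumes the mbstring extension is loaded and that the subject and body are already in the charset you pass in.

```php
<?php
// Hypothetical helper, NOT the real NMC_Email.php API.
// Assumes $subject and $body are already encoded in $charset.
function build_mail_parts(string $subject, string $body, string $charset = 'UTF-8'): array
{
    // mb_encode_mimeheader() reads its input in the internal encoding,
    // so tell mbstring what the strings are encoded in.
    mb_internal_encoding($charset);

    // Non-ASCII subjects have to be sent as MIME "encoded-words" (RFC 2047).
    $encodedSubject = mb_encode_mimeheader($subject, $charset, 'B');

    $headers = [
        'MIME-Version: 1.0',
        "Content-Type: text/plain; charset={$charset}",
        'Content-Transfer-Encoding: base64',
    ];

    return [
        'subject' => $encodedSubject,
        'headers' => implode("\r\n", $headers),
        // Base64 keeps the body 7-bit safe whatever the charset is.
        'body'    => chunk_split(base64_encode($body)),
    ];
}
```

The parts can then be handed to mail(), for example mail($to, $parts['subject'], $parts['body'], $parts['headers']). Base64 for the body is just one safe choice; quoted-printable would work as well.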
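Something to watch out for: the AJAX library I was using ran JavaScript’s escape() over the parameters it sent to the server. I stopped it doing this because escape() doesn’t produce UTF-8: characters up to U+00FF come out as a single Latin-1 byte (é becomes %E9) and everything above that as a non-standard %uXXXX sequence. encodeURIComponent() is the function that percent-encodes the UTF-8 bytes. (By the way, XMLHttpRequest sends string request bodies as UTF-8 regardless of the encoding of the page.)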
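Seen from the PHP side, the escape() problem looks roughly like this. The sample strings are what the two JavaScript functions produce for "ł" and "é"; the helper utf8_param is a made-up name for illustration and assumes the mbstring extension.

```php
<?php
// encodeURIComponent('ł') sends the UTF-8 bytes, percent-encoded.
$fromEncodeURIComponent = '%C5%82';
// escape('ł') sends a non-standard %uXXXX sequence instead.
$fromEscapeHigh = '%u0142';
// escape('é') sends a single Latin-1 byte.
$fromEscapeLow = '%E9';

$good    = rawurldecode($fromEncodeURIComponent); // bytes C5 82, i.e. "ł" in UTF-8
$literal = rawurldecode($fromEscapeHigh);         // PHP leaves "%u0142" untouched
$latin1  = rawurldecode($fromEscapeLow);          // lone byte E9, not valid UTF-8

// Hypothetical helper: accept a parameter only if it is well-formed UTF-8.
function utf8_param(string $value): ?string
{
    return mb_check_encoding($value, 'UTF-8') ? $value : null;
}
```

Note that a check like utf8_param() only catches the stray Latin-1 byte. The "%u0142" text is plain ASCII and passes straight through, which is why fixing the encoding on the JavaScript side is the better cure.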
I’d recommend going with UTF-8 if there is any chance that characters outside Latin-1 (ISO-8859-1) will be needed in the future. It saves a lot of hassle. I messed around with running parallel sites, one in Latin-1 for English and one in Latin-2 (ISO-8859-2) for Polish, but I needn’t have bothered.
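If you already have content sitting in Latin-1 or Latin-2, moving it to UTF-8 is mostly a matter of knowing which charset each piece was really written in. A small sketch, assuming the mbstring extension; the function name legacy_to_utf8 is invented for this example.

```php
<?php
// Hypothetical migration helper: convert text from a known legacy charset to UTF-8.
function legacy_to_utf8(string $text, string $fromCharset): string
{
    return mb_convert_encoding($text, 'UTF-8', $fromCharset);
}

// The byte 0xB3 means "ł" in Latin-2 but "³" in Latin-1,
// so the source charset has to be stated correctly.
$polish = legacy_to_utf8("z\xB3oty", 'ISO-8859-2'); // "złoty"
$wrong  = legacy_to_utf8("z\xB3oty", 'ISO-8859-1'); // "z³oty"
```

The second call shows what goes wrong when Polish text is treated as Latin-1: the conversion succeeds, but the result is the wrong character.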
Having all translatable text in one place from the start also saves time. Otherwise you have to go through all the nooks and crannies of the site to find it.
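One simple way to keep the text in one place is a single catalogue array plus a lookup function. This is only a sketch of the idea: the $messages layout and the t() function are invented for illustration, and the file has to be saved as UTF-8.

```php
<?php
// Hypothetical catalogue: every user-visible string lives here,
// keyed by language code and then by message id.
$messages = [
    'en' => ['greeting' => 'Hello, %s!', 'logout' => 'Log out'],
    'pl' => ['greeting' => 'Cześć, %s!', 'logout' => 'Wyloguj'],
];

// Look up a message, falling back to English and then to the id itself.
function t(array $messages, string $lang, string $id, string ...$args): string
{
    $text = $messages[$lang][$id] ?? $messages['en'][$id] ?? $id;
    return vsprintf($text, $args);
}

echo t($messages, 'pl', 'greeting', 'Ania'), "\n";
```

Falling back to the message id makes missing translations easy to spot on the page without breaking it.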
Good overview of character sets and encodings: Joel Spolsky’s “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)”.