UTF-8 Encoding Tips

Version 10.1 by Antoine Berry on 2012/09/20 10:12

Encoding questions come up frequently on the mailing list. This page is a collection of tips for using UTF-8, a checklist of sorts. Make sure you have done everything listed here before pitching your computer into the ocean :)

Check your database

The database needs to store its values in UTF-8. If it doesn't, all your effort is wasted. For example, on MySQL that means a JDBC connection URL that explicitly asks for Unicode with UTF-8 as the character encoding.
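As a rough illustration of such a URL, the sketch below assembles one in Java. It assumes MySQL Connector/J and its useUnicode and characterEncoding connection properties; the host name, database name, class name and method name are hypothetical placeholders, not values taken from this page.

```java
// A minimal sketch, assuming MySQL Connector/J as the JDBC driver.
// "localhost" and "mydb" are hypothetical placeholders.
class JdbcUrlSketch {

    /** Builds a MySQL JDBC URL that asks the driver to use UTF-8 on the connection. */
    static String utf8Url(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // Prints the URL you would paste into your connection dictionary.
        System.out.println(utf8Url("localhost", "mydb"));
    }
}
```

Setting the encoding on the connection only helps if the tables themselves are stored in UTF-8, which is why the server defaults matter as well.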

You also need to set the default character set and collation in your my.cnf file:

options.png

Fonts & CSS

A given font may not contain glyphs for every character you need to display. If you're using a default stylesheet, browsers may render the same page differently simply because they pick different fonts. Speaking of stylesheets, you probably want to encode those in UTF-8 as well. Start your stylesheet with a declaration like @charset "UTF-8"; on its very first line.

Set Eclipse encoding

preferences.png

Use Project Wonder

This goes without saying, but: use Wonder. Set the encoding in the properties file. Notice that it is UTF-8 with a hyphen. It is always UTF-8 with a hyphen... well, except in the MySQL image above, because they excel at doing things differently :)

Set encoding in your page wrapper

Declare the charset in the <head> of your page wrapper, for example with <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />.

Localizable.strings should be in UTF-16!

Localizable.strings files should be encoded in UTF-16. The localizer can detect UTF-16 reliably, whereas it can confuse UTF-8 with other encodings. Pascal says to use UTF-16LE if you want to be explicit about things, especially if you edit your strings files in an external editor such as BBEdit. I use the Eclipse editor with UTF-16 myself and it all seems to work fine, so to each their own.
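One plausible reason UTF-16 is easier to detect is the byte order mark that many editors write at the start of a UTF-16 file. The sketch below is not how the localizer itself works; it is a hypothetical helper (the class name and method name are made up for illustration) that shows the two bytes a reader can look for, and that plain UTF-8 text carries no such marker.

```java
import java.nio.charset.StandardCharsets;

// A minimal sketch of byte order mark (BOM) detection; names are hypothetical.
class StringsEncodingSketch {

    /** Returns "UTF-16BE" or "UTF-16LE" when a UTF-16 BOM is present, otherwise null. */
    static String detectUtf16Bom(byte[] data) {
        if (data.length >= 2) {
            int b0 = data[0] & 0xFF;
            int b1 = data[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
            if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
        }
        return null;
    }

    public static void main(String[] args) {
        String line = "\"price\" = \"10 \u20AC\";";
        // Java's UTF_16 charset encodes big-endian and prepends a BOM.
        byte[] utf16 = line.getBytes(StandardCharsets.UTF_16);
        // UTF-8 output from Java carries no BOM, so there is nothing to detect.
        byte[] utf8 = line.getBytes(StandardCharsets.UTF_8);
        System.out.println(detectUtf16Bom(utf16));
        System.out.println(detectUtf16Bom(utf8));
    }
}
```

Note that Java's UTF_16LE charset writes no BOM when encoding, so a file saved that way relies on the reader being told the encoding explicitly.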

Build your files in UTF-8.

If you have special characters in your code (like '€', for example), you need to specify the source encoding when you build your application. To do that, modify your build.xml file by adding the attribute encoding="utf-8" to your <wocompile> element.
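To see why the compile-time encoding matters, the sketch below decodes the bytes of a UTF-8 encoded '€' twice: once as UTF-8 and once as ISO-8859-1, which stands in for whatever non-UTF-8 platform default a build machine might have. The class name and method name are hypothetical, and this only imitates what a compiler does when it reads source bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// A minimal sketch of a source-encoding mismatch; names are hypothetical.
class SourceEncodingSketch {

    /** Decodes source bytes the way a compiler would under the given encoding. */
    static String decode(byte[] sourceBytes, Charset charset) {
        return new String(sourceBytes, charset);
    }

    public static void main(String[] args) {
        // The euro sign as it sits on disk in a UTF-8 source file: bytes E2 82 AC.
        byte[] onDisk = "\u20AC".getBytes(StandardCharsets.UTF_8);
        // Read with the right encoding, the literal is a single character.
        System.out.println(decode(onDisk, StandardCharsets.UTF_8).length());
        // Read with the wrong encoding, each byte becomes its own character.
        System.out.println(decode(onDisk, StandardCharsets.ISO_8859_1).length());
    }
}
```

The mismatch produces no compile error, which is why the garbled literal usually only shows up at runtime.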