Wiki source code of UTF-8 Encoding Tips
Version 8.1 by Ramsey Gurley on 2009/11/17 19:39
| author | version | line-number | content |
|---|---|---|---|
| 1 | = UTF-8 Encoding Tips = | ||
| 2 | |||
| 3 | Encoding questions are asked frequently on the mailing list. This is just a collection of tips for using UTF-8. It's a checklist of sorts. Make sure you've done all the things specified here before pitching your computer into the ocean :) | ||
| 4 | |||
| 5 | == Check your database == | ||
| 6 | |||
| 7 | The database needs to store values in UTF-8. If it doesn't, all your other effort is wasted. On MySQL, for example, that means a JDBC connection URL like | ||
| 8 | |||
| 9 | {{noformat}} | ||
| 10 | |||
| 11 | jdbc:mysql://localhost/Example?capitalizeTypenames=true&zeroDateTimeBehavior=convertToNull&useUnicode=true&characterEncoding=UTF-8 | ||
| 12 | |||
| 13 | {{/noformat}} | ||
| 14 | |||
| 15 | You also need to set the default charset and collation in your my.cnf file | ||
| 16 | |||
| 17 | [[image:options.png]] | ||
| 18 | |||
| 19 | == Fonts & CSS == | ||
| 20 | |||
| 21 | Not every font has glyphs for every character. If you're using a default stylesheet, browsers may render the same text differently simply because they pick different fonts. Speaking of stylesheets, you probably want to encode those in UTF-8 as well. Start your stylesheet with something like | ||
| 22 | |||
| 23 | {{noformat}} | ||
| 24 | |||
| 25 | @charset "UTF-8"; | ||
| 26 | @import url("reset.css"); | ||
| 27 | |||
| 28 | /* Begin site CSS */ | ||
| 29 | |||
| 30 | {{/noformat}} | ||
| 31 | |||
| 32 | == Set Eclipse encoding == | ||
| 33 | |||
| 34 | [[image:preferences.png]] | ||
| 35 | |||
| 36 | == Use Project Wonder == | ||
| 37 | |||
| 38 | I think this goes without saying, but: **Use Wonder**. Set the encoding in the properties file. Notice it is UTF-8 with a hyphen. It is always UTF-8 with a hyphen... well, except in the MySQL image above, because they excel at doing things differently :) | ||
| 39 | |||
| 40 | {{noformat}} | ||
| 41 | |||
| 42 | # Project Encoding | ||
| 43 | er.extensions.ERXApplication.DefaultEncoding=UTF-8 | ||
| 44 | |||
| 45 | {{/noformat}} | ||
| 46 | |||
| 47 | == Set encoding in your page wrapper == | ||
| 48 | |||
| 49 | {{noformat}} | ||
| 50 | |||
| 51 | <?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
| 52 | <!DOCTYPE html PUBLIC | ||
| 53 | "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN" | ||
| 54 | "http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg-flat.dtd"> | ||
| 55 | |||
| 56 | {{/noformat}} | ||
| 57 | |||
| 58 | == Localizable strings should be in UTF-16! == | ||
| 59 | |||
| 60 | Localizable.strings should be encoded in UTF-16. The localizer can detect UTF-16 without error, whereas it can confuse UTF-8 with other encodings. Pascal says to use UTF-16LE if you want to be explicit about things, especially if you are editing your strings files in an external editor like BBEdit. I use the Eclipse editor and UTF-16 myself, and it all seems to work fine for me. So to each his own. | ||
| 61 | |||
| 62 | == Your tips go here. == | ||
| 63 | |||
| 64 | It's a wiki, ya know :) |
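
Here is one such tip, as a rough illustration of the Localizable.strings point above. It is a minimal Java sketch showing why UTF-16 is easy to recognize and UTF-8 is easy to misread. It assumes detection by byte order mark (BOM) only; how Wonder's localizer actually detects encodings is not shown here. The class `EncodingSketch` and the helpers `sniffBom` and `misread` are hypothetical names made up for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not part of Wonder or WebObjects.
class EncodingSketch {

    // Guess an encoding from a leading byte order mark, if there is one.
    static String sniffBom(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return "unknown"; // no BOM: the reader has to guess
    }

    // Encode as UTF-8, then decode as ISO-8859-1, the way a wrong guess would.
    static String misread(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String entry = "\"greeting\" = \"caf\u00e9\";";

        // Java's "UTF-16" charset writes a big-endian BOM when encoding.
        System.out.println(sniffBom(entry.getBytes(StandardCharsets.UTF_16))); // UTF-16BE

        // Java's UTF-8 encoder writes no BOM, so there is nothing to sniff.
        System.out.println(sniffBom(entry.getBytes(StandardCharsets.UTF_8)));  // unknown

        // A wrong guess turns the two UTF-8 bytes of U+00E9 into two characters.
        System.out.println(misread("\u00e9").equals("\u00c3\u00a9"));          // true
    }
}
```

Note that Java's explicit `UTF_16LE` and `UTF_16BE` charsets do not write a BOM when encoding, so if you generate strings files from code, check what your editor or tool actually writes.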