Wiki source code of UTF-8 Encoding Tips

Version 19.1 by Ramsey Gurley on 2013/06/12 13:30

= UTF-8 Encoding Tips =

Encoding questions come up frequently on the mailing list. This page is a collection of tips for using UTF-8, arranged as a checklist of sorts. Make sure you have done everything listed here before pitching your computer into the ocean.

== Check your database ==

The database needs to store its values in UTF-8. If it doesn't, all your other effort is wasted. For example, on MySQL that means a database URL like

{{noformat}}
jdbc:mysql://localhost/Example?capitalizeTypenames=true&zeroDateTimeBehavior=convertToNull&useUnicode=true&characterEncoding=UTF-8

{{/noformat}}

and setting the default charset and collation in your my.cnf file:

[[image:attach:options.png]]

== Fonts & CSS ==

Not every font has glyphs for every character. If you're using a default stylesheet, browsers may render the same text differently simply because they pick different fonts. Speaking of stylesheets, you probably want to encode those in UTF-8 as well. Start your stylesheet with something like

{{noformat}}
@charset "UTF-8";
@import url("reset.css");

/* Begin site CSS */

{{/noformat}}

== Set Eclipse encoding ==

[[image:attach:preferences.png]]

== Use Project Wonder ==

I think this goes without saying, but: **Use Wonder**. Set the encoding in the properties file. Notice that it is UTF-8 with a hyphen. It is always UTF-8 with a hyphen... well, except in the MySQL screenshot above, because they excel at doing things differently.

{{noformat}}
# Project Encoding
er.extensions.ERXApplication.DefaultEncoding=UTF-8

{{/noformat}}

== Set encoding in your page wrapper ==

{{noformat}}
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg-flat.dtd">

{{/noformat}}

== Localizable strings should be in UTF-16! ==

Localizable.strings files should be encoded in UTF-16. The localizer can detect UTF-16 reliably, whereas it can confuse UTF-8 with other encodings. Specifically, use UTF-16 BE with no BOM if you edit the file in an external text editor instead of Eclipse.

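If you are unsure what "UTF-16 BE with no BOM" looks like on disk, the following is a minimal sketch in plain Java. The class name Utf16Demo and its methods are made up for illustration and are not part of Wonder. It only shows that Java's UTF-16BE charset writes no byte order mark, while the plain UTF-16 charset prepends FE FF when encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not part of Wonder.
class Utf16Demo {

    // UTF-16 big-endian: two bytes per BMP character, no byte order mark.
    static byte[] encodeNoBom(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    // Java's plain UTF-16 encoder writes a big-endian BOM (FE FF) first.
    static byte[] encodeWithBom(String s) {
        return s.getBytes(StandardCharsets.UTF_16);
    }

    // Render bytes as lowercase hex so the layout is easy to compare.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A Unicode escape keeps this source file independent of its own encoding.
        String sample = "\u00e9";
        System.out.println(hex(encodeNoBom(sample)));   // 00e9
        System.out.println(hex(encodeWithBom(sample))); // feff00e9
    }
}
```

If a hex dump of your Localizable.strings file starts with feff or fffe, your editor saved it with a BOM.
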
== Build your files in UTF-8 ==

If your source code contains special characters (such as '€'), you need to specify the encoding when you build your application. To do that, edit your build.xml file and add the attribute encoding="utf-8" to your <wocompile> element.

{{code language="xml"}}
<wocompile srcdir="Sources" destdir="bin" encoding="utf-8">

{{/code}}
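To see why the build encoding matters, here is a rough sketch of what can happen to '€' when UTF-8 bytes are read with the wrong charset. The class EuroDemo is hypothetical, and ISO-8859-1 is only an assumed stand-in for whatever default encoding your build machine would otherwise fall back to.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; ISO-8859-1 is an assumed "wrong" default encoding.
class EuroDemo {

    // The euro sign, written as an escape so this file compiles under any encoding.
    static final String EURO = "\u20ac";

    // Decode the UTF-8 bytes of the euro sign as ISO-8859-1: one character becomes three.
    static String misread() {
        byte[] utf8 = EURO.getBytes(StandardCharsets.UTF_8); // bytes e2 82 ac
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Decode the same bytes as UTF-8: the original character comes back.
    static String readCorrectly() {
        byte[] utf8 = EURO.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(misread().length());       // 3
        System.out.println(readCorrectly().length()); // 1
    }
}
```

A compiler that reads a UTF-8 source file with the wrong encoding makes the same mistake on every string literal, which is why the encoding attribute belongs in the build file.
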