
The relationship between UTF-8 and Unicode in Erlang

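2016-06-13 01:58

Erlang has no dedicated string data type; strings are represented as lists of integers. To understand how Erlang handles Unicode, you need to understand lists, binaries, and how each of them relates to character encodings.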
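The list and binary representations mentioned above can be sketched as follows. This is a minimal illustration, not code from the original post: the module name string_repr_demo and the function run/0 are hypothetical, and the two code points are the ones used in the example later in this article.

```erlang
-module(string_repr_demo).   % hypothetical module name, for illustration only
-export([run/0]).

%% Shows that an Erlang "string" is a list of integers, and that a binary
%% can hold the same text as UTF-8 encoded bytes.
run() ->
    %% A string literal and the equivalent list of character codes are equal.
    true = ("abc" =:= [97, 98, 99]),
    %% Code points above 255 are ordinary list elements too.
    Chars = [21069, 20914],              % U+524D and U+51B2
    %% Encoding the code points as UTF-8 yields a 6-byte binary.
    Bin = unicode:characters_to_binary(Chars),
    6 = byte_size(Bin),
    %% Decoding the UTF-8 binary gives the original code points back.
    Chars = unicode:characters_to_list(Bin),
    {Chars, Bin}.
```

Calling string_repr_demo:run() returns the code point list together with the UTF-8 binary, which makes the two representations easy to compare in the shell.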



Character  UTF-8 (binary)              UTF-8 (decimal)  UTF-8 (hex)  Unicode (binary)  Unicode (decimal)  Unicode (hex)
前         11100101 10001001 10001101  229 137 141      E5 89 8D     101001001001101   21069              524D
冲         11100101 10000110 10110010  229 134 178      E5 86 B2     101000110110010   20914              51B2
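The correspondence between the UTF-8 bytes and the Unicode code points above follows from the three-byte UTF-8 layout 1110xxxx 10yyyyyy 10zzzzzz. The sketch below reproduces it with Erlang's bit syntax. The module name utf8_bits_demo and the functions encode3/1 and decode3/1 are hypothetical names chosen for this illustration; it only handles the three-byte range and, for brevity, does not reject surrogates or overlong forms.

```erlang
-module(utf8_bits_demo).   % hypothetical module name, for illustration only
-export([encode3/1, decode3/1]).

%% Encode a code point in the range U+0800..U+FFFF as three UTF-8 bytes.
%% The 16 bits of the code point are split 4 + 6 + 6 and prefixed with
%% 1110, 10 and 10. Surrogates (U+D800..U+DFFF) are not checked here.
encode3(CodePoint) when CodePoint >= 16#0800, CodePoint =< 16#FFFF ->
    <<X:4, Y:6, Z:6>> = <<CodePoint:16>>,
    <<2#1110:4, X:4, 2#10:2, Y:6, 2#10:2, Z:6>>.

%% Decode a three-byte UTF-8 sequence back into its code point.
%% Overlong encodings are not rejected in this sketch.
decode3(<<2#1110:4, X:4, 2#10:2, Y:6, 2#10:2, Z:6>>) ->
    <<CodePoint:16>> = <<X:4, Y:6, Z:6>>,
    CodePoint.
```

For example, utf8_bits_demo:encode3(16#524D) gives <<229,137,141>>, the same bytes as the built-in bit syntax expression <<16#524D/utf8>>. In real code the utf8 segment type or the unicode module is the better choice, since both validate their input.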
Example

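%% -*- coding: latin-1 -*-
%% Assumption: this file is saved as UTF-8 but read as Latin-1 (the default
%% before OTP 17), so the literal below becomes a list of UTF-8 bytes. The
%% output shown relies on that.
-module(test_unicode).
-export([test_unicode/0]).

test_unicode() ->
    %% A list holding the six UTF-8 bytes of the two characters.
    Name_utf8_list = "前冲",
    io:fwrite("test_unicode:test_unicode/0:Name_utf8=~p~n",[Name_utf8_list]),
    %% Pack the byte list into a binary; the bytes themselves are unchanged.
    Name_utf8_bin = iolist_to_binary(Name_utf8_list),
    io:fwrite("test_unicode:test_unicode/0:Name_utf8_bin=~ts~n",[Name_utf8_bin]),
    io:fwrite("test_unicode:test_unicode/0:Name_utf8_bin=~p~n",[Name_utf8_bin]),
    %% Decode the UTF-8 binary into a list of Unicode code points.
    Name_unicode = unicode:characters_to_list(Name_utf8_bin),
    io:fwrite("test_unicode:test_unicode/0:Name_unicode=~ts~n",[Name_unicode]),
    io:fwrite("test_unicode:test_unicode/0:Name_unicode=~p~n",[Name_unicode]),
    %% Code point lists concatenate like any other list.
    S = "my name is " ++ Name_unicode,
    io:fwrite("test_unicode:test_unicode/0:S=~ts~n",[S]),
    io:fwrite("test_unicode:test_unicode/0:S=~p~n",[S]),
    %% Encode the code point list back into a UTF-8 binary.
    S_utf8_bin = unicode:characters_to_binary(S),
    io:fwrite("test_unicode:test_unicode/0:Name_utf8_bin=~p~n",[S_utf8_bin]).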

Output

test_unicode:test_unicode/0:Name_utf8=[229,137,141,229,134,178]
test_unicode:test_unicode/0:Name_utf8_bin=前冲
test_unicode:test_unicode/0:Name_utf8_bin=<<229,137,141,229,134,178>>
test_unicode:test_unicode/0:Name_unicode=前冲
test_unicode:test_unicode/0:Name_unicode=[21069,20914]
test_unicode:test_unicode/0:S=my name is 前冲
test_unicode:test_unicode/0:S=[109,121,32,110,97,109,101,32,105,115,32,21069,
                               20914]
test_unicode:test_unicode/0:Name_utf8_bin=<<109,121,32,110,97,109,101,32,105,
                                            115,32,229,137,141,229,134,178>>
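The output above contains two different kinds of list: one holding UTF-8 bytes and one holding Unicode code points. Mixing them up is a common source of errors, and the sketch below shows two typical symptoms. It is an illustration under stated assumptions, not part of the original example: the module name unicode_pitfalls_demo and the function run/0 are hypothetical, and the lists are written as integers so the result does not depend on the source file encoding.

```erlang
-module(unicode_pitfalls_demo).   % hypothetical module name, for illustration only
-export([run/0]).

run() ->
    Bytes = [229,137,141,229,134,178],   % UTF-8 bytes, one per list element
    CodePoints = [21069,20914],          % Unicode code points

    %% Byte list -> code points: pack into a binary, then decode it as UTF-8.
    CodePoints = unicode:characters_to_list(list_to_binary(Bytes), utf8),

    %% iolist_to_binary/1 accepts only integers 0..255, so a code point
    %% list with larger values fails with badarg.
    badarg = try iolist_to_binary(CodePoints)
             catch error:Reason -> Reason
             end,

    %% Treating the byte list as code points encodes every byte again as a
    %% two-byte character: 12 bytes instead of 6 (double encoding).
    DoubleEncoded = unicode:characters_to_binary(Bytes),
    12 = byte_size(DoubleEncoded),
    ok.
```

unicode_pitfalls_demo:run() returns ok when all three checks hold. A practical rule that follows: use iolist_to_binary/1 or list_to_binary/1 only on byte lists, and unicode:characters_to_binary/1 only on code point lists.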
