Erlang Notes

unicode in erlang

  1. Erlang's first impressions for Unicode are dire
  2. Strings vs. lists of numbers

Erlang's first impressions for Unicode are dire

1> "тест".  % not rendered as a string by default
[1090,1077,1089,1090]
2> <<"тест">>.  % mangled binary literals by default
<<"B5AB">>
3> [0|binary_to_list(<<"тест">>)].
[0,66,53,65,66]
4> io:fwrite("~s", ["тест"]).  % unprintable by default
** exception error: bad argument
     in function  io:fwrite/2
        called as io:fwrite("~s",[[1090,1077,1089,1090]])
        *** argument 1: failed to format string
5> io:fwrite("~s", [<<"тест"/utf8>>]).  % mangled output by default
ÑеÑÑ   ok
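
The mangled binary in steps 2 and 3 is consistent with each code point being cut down to its low 8 bits: 1090 is 16#442, and 16#42 is $B. The module below is a minimal sketch of that reading, not a statement of how the compiler is implemented; the module name truncate_demo and both function names are made up for illustration. Code points are written as integers so the example does not depend on source file encoding.

```erlang
-module(truncate_demo).
-export([low_bytes/1, utf8/1]).

%% Keep only the low 8 bits of each code point. This mirrors what an
%% integer segment of the default size (8 bits) does to a value above 255.
low_bytes(CodePoints) ->
    << <<(C band 255)>> || C <- CodePoints >>.

%% Encode the same code points as UTF-8 instead, which is what the
%% /utf8 type specifier asks for in a binary literal.
utf8(CodePoints) ->
    unicode:characters_to_binary(CodePoints).
```

With the list from step 1, truncate_demo:low_bytes([1090,1077,1089,1090]) gives <<"B5AB">>, matching step 2, while truncate_demo:utf8/1 gives the 8-byte UTF-8 encoding.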

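But past those first impressions, it does better. Start the shell with +pc unicode, write UTF-8 binaries with the ~"..." sigil, and format with ~ts: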

$ erl +pc unicode
Erlang/OTP 27 [RELEASE CANDIDATE 3] [erts-15.0] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

Eshell V15.0 (press Ctrl+G to abort, type help(). for help)
1> "тест".
"тест"
2> ~"тест".
<<"тест"/utf8>>
3> io:fwrite("~ts\n~ts\n", ["тест", ~"тест"]).
тест
тест

cf. https://www.erlang.org/doc/apps/stdlib/unicode_usage.html
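
The ~s versus ~ts difference can also be checked without printing anything, through io_lib:format/2. The following is a sketch under the assumption that io_lib:format/2 fails in the same way as io:fwrite/2 does in the first transcript; the module name ts_demo and its function names are hypothetical.

```erlang
-module(ts_demo).
-export([format_ts/1, format_s/1]).

%% Format with the Unicode translation modifier and return a flat list
%% of code points. Accepts a code point list or a UTF-8 binary.
format_ts(Chars) ->
    unicode:characters_to_list(io_lib:format("~ts", [Chars])).

%% Format without the modifier. Code points above 255 are rejected,
%% so the badarg is turned into a return value for easy comparison.
format_s(Chars) ->
    try io_lib:format("~s", [Chars]) of
        IoData -> {ok, lists:flatten(IoData)}
    catch
        error:badarg -> {error, badarg}
    end.
```

Here ts_demo:format_ts/1 returns the same code point list for [1090,1077,1089,1090] and for its UTF-8 binary, while ts_demo:format_s/1 returns {error, badarg} for the list and {ok, "hello"} for "hello".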

Strings vs. lists of numbers

$ erl
Erlang/OTP 27 [RELEASE CANDIDATE 3] [erts-15.0] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

Eshell V15.0 (press Ctrl+G to abort, type help(). for help)
1> "hello".
"hello"
2> [0|"hello"].
[0,104,101,108,108,111]
3> io:fwrite("~lp\n", ["hello"]).
[104,101,108,108,111]
ok
4> io:fwrite("~ts\n", ["тест"]). % works without +pc unicode
тест
ok
5> io:fwrite("~tp\n", ["тест"]). % without +pc unicode, the t changes nothing here!
[1090,1077,1089,1090]


$ erl +pc unicode
Erlang/OTP 27 [RELEASE CANDIDATE 3] [erts-15.0] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

Eshell V15.0 (press Ctrl+G to abort, type help(). for help)
1> io:fwrite("~tp\n", ["тест"]).
"тест"
ok
2> io:fwrite("~p\n", [[["тест"]]]).
[[[1090,1077,1089,1090]]]
3> io:fwrite("~tp\n", [[["тест"]]]).
[["тест"]]
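
Whether ~p or ~tp shows a list as a string comes down to whether every element counts as printable. The sketch below uses io_lib:printable_latin1_list/1 and io_lib:printable_unicode_list/1, which do not depend on the +pc flag, to sort a term into one of three cases. The module name printable_demo and the atoms it returns are made up for illustration, and this is only a rough model of the decision, not the pretty-printer itself.

```erlang
-module(printable_demo).
-export([classify/1]).

%% Classify a term by the narrowest printable range that covers it.
classify(Term) ->
    case {io_lib:printable_latin1_list(Term),
          io_lib:printable_unicode_list(Term)} of
        {true, _}      -> latin1_string;   % shown as a string even by ~p
        {false, true}  -> unicode_string;  % shown as a string by ~tp with +pc unicode
        {false, false} -> not_a_string     % always shown as a list of integers
    end.
```

For the terms used in the transcripts, printable_demo:classify("hello") is latin1_string, classify([1090,1077,1089,1090]) is unicode_string, and classify([0|"hello"]) is not_a_string. In a running shell, io:printable_range/0 reports which of the two ranges is in effect.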