[Proj] More Unicode etc.

support.mn at elisanet.fi
Tue Jun 9 12:37:20 PDT 2009


I have included hex dumps of a simple text file saved in different (Windows)
encodings.

"Test file: abcdöäå<CR><LF><CR><LF>"

The hex dumps were produced with:

http://nobodysoft.com/

At the end of the text there are some special characters (öäå).

RESULTS:
-------------------

ANSI:

00000000 54 65 73 74 20 66 69 6C 65 3A 20 61 62 63 64 F6 Test file: abcd.
00000010 E4 E5 0D 0A 0D 0A 

Unicode:

00000000 FF FE 54 00 65 00 73 00 74 00 20 00 66 00 69 00 ..T.e.s.t. .f.i.
00000010 6C 00 65 00 3A 00 20 00 61 00 62 00 63 00 64 00 l.e.:. .a.b.c.d.
00000020 F6 00 E4 00 E5 00 0D 00 0A 00 0D 00 0A 00 

Unicode Big Endian:

00000000 FE FF 00 54 00 65 00 73 00 74 00 20 00 66 00 69 ...T.e.s.t. .f.i
00000010 00 6C 00 65 00 3A 00 20 00 61 00 62 00 63 00 64 .l.e.:. .a.b.c.d
00000020 00 F6 00 E4 00 E5 00 0D 00 0A 00 0D 00 0A 

UTF-8:

00000000 54 65 73 74 20 66 69 6C 65 3A 20 61 62 63 64 C3 Test file: abcd.
00000010 B6 C3 A4 C3 A5 0D 0A 0D 0A 

UTF-8 + BOM (Byte Order Mark):

00000000 EF BB BF 54 65 73 74 20 66 69 6C 65 3A 20 61 62 ...Test file: ab
00000010 63 64 C3 B6 C3 A4 C3 A5 0D 0A 0D 0A 
-------------------
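
For anyone who wants to reproduce the dumps above without EditPlus, here is a
minimal Python sketch. It is not the tool used for this message. The mapping
of "ANSI" to the cp1252 code page is an assumption (it is the usual Western
European Windows code page), and the names TEXT, FORMATS, encode_as and
hex_dump are made up for this illustration.

```python
# The test string from this message: text, three special characters, two CR LF.
TEXT = "Test file: abcd\u00f6\u00e4\u00e5\r\n\r\n"

# Label -> (BOM bytes written at the start of the file, Python codec name).
# ASSUMPTION: "ANSI" means Windows code page 1252.
FORMATS = {
    "ANSI": (b"", "cp1252"),
    "Unicode": (b"\xff\xfe", "utf-16-le"),
    "Unicode Big Endian": (b"\xfe\xff", "utf-16-be"),
    "UTF-8": (b"", "utf-8"),
    "UTF-8 + BOM": (b"\xef\xbb\xbf", "utf-8"),
}


def encode_as(label):
    """Return the bytes the file would contain in the given format."""
    bom, codec = FORMATS[label]
    return bom + TEXT.encode(codec)


def hex_dump(data, width=16):
    """Format bytes as 'OFFSET HH HH ...' lines, 16 bytes per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        lines.append("%08X %s" % (offset, " ".join("%02X" % b for b in chunk)))
    return "\n".join(lines)


if __name__ == "__main__":
    for label in FORMATS:
        print(label + ":")
        print(hex_dump(encode_as(label)))
        print()
```

The only differences between the five files are the optional BOM at the start
and the number of bytes used for each character.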

The text files were saved in each format with EditPlus:

http://www.editplus.com/

I expect that unless the proj-4 reader has internal support for different
encodings, it will most probably fail on some of these files. The EditPlus
scanner can detect the encoding just by looking into the text file. The
Windows Notepad editor can also detect these different text file formats.
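
As a rough illustration of what "looking into the text file" can mean, the
sketch below checks the first bytes for a BOM and otherwise tries strict
UTF-8 before falling back to a single-byte code page. This is only a guess at
the general approach, not how EditPlus or Notepad actually work, and nothing
here is part of proj-4. The names sniff_encoding and read_text and the cp1252
fallback are assumptions made for this example.

```python
import codecs

# BOM signatures and the Python codec to use when one is found.
# The "utf-16" and "utf-8-sig" codecs consume the BOM while decoding.
# UTF-32 is deliberately not handled in this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),    # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16"),   # FF FE
    (codecs.BOM_UTF16_BE, "utf-16"),   # FE FF
]


def sniff_encoding(data, fallback="cp1252"):
    """Guess a codec name for raw file bytes.

    ASSUMPTION: files without a BOM that are not valid UTF-8 are in the
    'fallback' code page (cp1252 here, i.e. Western European "ANSI").
    """
    for bom, codec in BOMS:
        if data.startswith(bom):
            return codec
    try:
        data.decode("utf-8")  # strict: raises on invalid byte sequences
        return "utf-8"
    except UnicodeDecodeError:
        return fallback


def read_text(data):
    """Decode raw file bytes using the guessed codec."""
    return data.decode(sniff_encoding(data))
```

With the ANSI dump above, the byte F6 is not valid UTF-8, so the fallback is
chosen; the other four files are recognised by their BOM or by decoding
cleanly as UTF-8. A file in some other single-byte code page would be
misread, which is why this can only be a heuristic.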

Janne.



