[Proj] Unicode
Thomas Knudsen
knudsen.thomas at gmail.com
Mon Jun 8 11:52:02 PDT 2009
2009/6/8 Gerald I. Evenden <geraldi.evenden at gmail.com>
>
> Besides, why take up vast quantities of bytes with 16 bit code where special
> inline doublet or triplet escape coding like LaTeX uses can do the job with
> 7 bit ASCII with ease.
>
That's what UTF-8 is for.
Markus Kuhn gives all the details at
http://www.cl.cam.ac.uk/~mgk25/unicode.html, from which I quote the
following item of interest:
C support for Unicode and UTF-8
Starting with GNU glibc 2.2, the type wchar_t is officially intended to be
used only for 32-bit ISO 10646 values, independent of the currently used
locale. This is signalled to applications by the definition of the
__STDC_ISO_10646__ macro as required by ISO C99. The ISO C multi-byte
conversion functions (mbsrtowcs(), wcsrtombs(), etc.) are fully implemented
in glibc 2.2 or higher and can be used to convert between wchar_t and any
locale-dependent multibyte encoding, including UTF-8, ISO 8859-1, etc.
For example, you can write
#include <stdio.h>
#include <locale.h>

int main()
{
    if (!setlocale(LC_CTYPE, "")) {
        fprintf(stderr, "Can't set the specified locale! "
                "Check LANG, LC_CTYPE, LC_ALL.\n");
        return 1;
    }
    printf("%ls\n", L"Schöne Grüße");
    return 0;
}
Call this program with the locale setting LANG=de_DE and the output will be
in ISO 8859-1. Call it with LANG=de_DE.UTF-8 and the output will be in
UTF-8. The %ls format specifier in printf calls wcsrtombs in order to
convert the wide character argument string into the locale-dependent
multi-byte encoding.
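
To make the conversion step explicit, here is a minimal sketch of calling wcsrtombs() and mbsrtowcs() directly instead of going through printf. The helper names wide_to_locale() and locale_to_wide() are hypothetical, made up for illustration; only the ISO C conversion functions themselves come from the text above. Error handling is kept deliberately simple.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a wide string into the multibyte encoding
 * of the current LC_CTYPE locale. Returns a malloc'd string that the
 * caller must free, or NULL on failure. */
char *wide_to_locale(const wchar_t *ws)
{
    mbstate_t state;
    const wchar_t *src = ws;
    size_t len;
    char *out;

    memset(&state, 0, sizeof state);
    /* With a NULL destination, wcsrtombs only counts the bytes needed,
     * not including the terminating null byte. */
    len = wcsrtombs(NULL, &src, 0, &state);
    if (len == (size_t)-1)
        return NULL;    /* some character has no encoding in this locale */

    out = malloc(len + 1);
    if (!out)
        return NULL;

    src = ws;
    memset(&state, 0, sizeof state);
    wcsrtombs(out, &src, len + 1, &state);
    return out;
}

/* Hypothetical helper: the reverse direction, from the locale's multibyte
 * encoding to wchar_t. Returns a malloc'd wide string or NULL on failure. */
wchar_t *locale_to_wide(const char *s)
{
    mbstate_t state;
    const char *src = s;
    size_t len;
    wchar_t *out;

    memset(&state, 0, sizeof state);
    /* Count the wide characters needed, not including the terminator. */
    len = mbsrtowcs(NULL, &src, 0, &state);
    if (len == (size_t)-1)
        return NULL;    /* input is not valid in the locale's encoding */

    out = malloc((len + 1) * sizeof *out);
    if (!out)
        return NULL;

    src = s;
    memset(&state, 0, sizeof state);
    mbsrtowcs(out, &src, len + 1, &state);
    return out;
}
```

As in the quoted example, a program would call setlocale(LC_CTYPE, "") first; without it the conversions run in the default "C" locale, where only plain ASCII is guaranteed to convert. Under LANG=de_DE.UTF-8, wide_to_locale(L"Grüße") should yield UTF-8 bytes, but that depends on the locale being installed on the system.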