[GRASS-dev] charset and locales

Mon Jul 31 17:43:47 EDT 2006

Glynn,

OK, thanks for the clear explanation.

Cheers Martin

2006/7/31, Glynn Clements <glynn at gclements.plus.com>:
>
> Martin Landa wrote:
>
> > I would like to ask about the different charsets used in the po-files. There are
> >
> > UTF-8:
> > grep charset=UTF-8 *.po | wc -l
> > 23
> >
> > non-UTF-8:
> > grep charset= *.po | grep -v UTF-8 | wc -l
> > 21
> >
> > Is there a particular reason for this diversity? Why not use only UTF-8?
>
> Unibyte encodings are more widely supported, and are more compatible
> with non-GNU gettext() implementations[1]. It also acts as a safety
> measure against "gratuitous" use of characters outside of the locale's
> normal repertoire.
>
> Also, in locales whose primary language doesn't use the Roman alphabet
> (e.g. Russian), the historical encoding(s) are usually sufficiently
> well entrenched that UTF-8 isn't a realistic option.
>
> [1] GNU gettext will automatically convert UTF-8 message catalogues to
> the locale's encoding, while other versions may just pass the strings
> through untouched, meaning that a UTF-8 message catalogue will only
> work in a UTF-8 locale.
>
> --
> Glynn Clements <glynn at gclements.plus.com>
>
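
For anyone who wants to see which encodings are actually in use rather than just the UTF-8 / non-UTF-8 counts quoted above, a rough sketch follows. It assumes each po-file header contains the usual "Content-Type: text/plain; charset=..." line and that grep supports -m and -o (GNU grep does). The function name po_charsets is made up for illustration and is not part of GRASS or gettext.

```shell
# Tally the charset declared in the header of each .po file in a directory.
# po_charsets is a hypothetical helper name; the directory defaults to ".".
po_charsets() {
    dir=${1:-.}
    # -h: no file names, -m 1: first match per file, -o: print only the match
    grep -h -m 1 -i -o 'charset=[A-Za-z0-9_.:-]*' "$dir"/*.po \
        | cut -d= -f2 \
        | tr '[:lower:]' '[:upper:]' \
        | sort | uniq -c | sort -rn
}
```

Run as "po_charsets locale/po" (or whichever directory holds the catalogues); it prints one line per encoding with the number of files declaring it, most common first.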