[GRASS-dev] Re: [GRASS-stats] Sys.setlocale for GRASS6.4

Glynn Clements glynn at gclements.plus.com
Fri Sep 4 15:26:41 EDT 2009


Markus Neteler wrote:

> @grass-dev: There are encoding issues with --interface-description,
>    which states UTF-8 even when the actual language encoding is different:
> 
> 2009/9/4 Roger Bivand <Roger.Bivand at nhh.no>:
> > OK, don't worry about that. I'll try to send an updated spgrass6 when I have
> > adequate net access. The problem is that GRASS writes UTF-8 as the encoding
> > into the XML header from --interface-description, but the French translations
> > seem (on my XP machine) to be in latin1, that is the 0x.. spell eacute
> > (\'{e}) in latex, then "fin" from "défine...", the first non-ASCII string in
> > the output for g.region.
> >
> > The fix is to insert "latin1" into the header, so I've updated the package
> > to allow the user to do this from within R
> 
> This gave me another idea to check the .po files in GRASS 6.4:
> 
> grep charset locale/po/grass*_fr.po
> locale/po/grasslibs_fr.po:"Content-Type: text/plain; charset=ISO-8859-1\n"
> locale/po/grassmods_fr.po:"Content-Type: text/plain; charset=ISO-8859-1\n"
> locale/po/grasstcl_fr.po:"Content-Type: text/plain; charset=UTF-8\n"
> locale/po/grasswxpy_fr.po:"Content-Type: text/plain; charset=UTF-8\n"
> 
> Apparently the translators mixed several charsets instead of using one;
> this holds for various languages supported in GRASS.
> 
> @grass-dev: Should we harmonize the encodings to UTF-8 for all/subset
>   of languages? If yes, how? With iconv?

No.

For systems using GNU libc, the .mo files use unicode, which is
converted to the locale's encoding automatically at run time. If your
locale uses ISO-8859-1, that's what the program will get regardless of
the encoding used in the .po files.

Additionally, using "historical" encodings provides better
compatibility. Systems which support unicode will typically convert
the .po files to unicode in the .mo files, then convert this to the
locale's encoding at run time, so it doesn't matter which encoding is
used. Systems which don't support unicode will require the .po files
to use the locale's historical encoding (ISO-8859-*, EUC-JP, etc).
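
As a rough illustration of the point above (a simplified model in Python, not a description of how gettext is implemented), the sketch below stores one hypothetical translated string in two different .po encodings, decodes both to Unicode as the compile step is described as doing, and then encodes for an assumed ISO-8859-1 locale. The string and the helper name at_run_time are made up for the example.

```python
# Simplified model only: the same message stored in two different .po encodings.
message = "défini"  # hypothetical translated string

po_latin1 = message.encode("iso-8859-1")  # bytes as stored in an ISO-8859-1 .po file
po_utf8 = message.encode("utf-8")         # bytes as stored in a UTF-8 .po file

# "Compile" step: each file is decoded using its own declared charset,
# so both end up as the same Unicode text.
mo_from_latin1 = po_latin1.decode("iso-8859-1")
mo_from_utf8 = po_utf8.decode("utf-8")


def at_run_time(text, locale_encoding):
    """Encode Unicode text into the (assumed) encoding of the user's locale."""
    return text.encode(locale_encoding)


# With an ISO-8859-1 locale the program receives identical bytes either way.
out_a = at_run_time(mo_from_latin1, "iso-8859-1")
out_b = at_run_time(mo_from_utf8, "iso-8859-1")
```

The on-disk bytes differ between the two .po files, but what the program receives depends only on the locale's encoding.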

The --interface-description option needs to either determine the
locale's encoding via e.g. nl_langinfo() and use that in the header,
or convert the data to UTF-8. The latter has the advantage of
relieving the reader of the burden of handling multiple encodings.
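
The two options can be sketched as follows. This is Python for brevity, not the GRASS C code; the function names and the sample element are invented for the example, and the fallback to locale.getpreferredencoding() is an assumption for platforms where nl_langinfo() is unavailable.

```python
import locale


def locale_encoding():
    """Return the current locale's codeset, e.g. 'ISO-8859-1' or 'UTF-8'."""
    locale.setlocale(locale.LC_ALL, "")
    try:
        return locale.nl_langinfo(locale.CODESET)
    except AttributeError:
        # nl_langinfo() is not provided on every platform (e.g. Windows).
        return locale.getpreferredencoding(False)


def header_with_locale_encoding(body, encoding):
    """Option 1: leave the bytes untouched and declare their real encoding."""
    decl = '<?xml version="1.0" encoding="%s"?>\n' % encoding
    return decl.encode("ascii") + body


def convert_to_utf8(body, encoding):
    """Option 2: transcode the bytes to UTF-8 and declare UTF-8."""
    text = body.decode(encoding)
    return b'<?xml version="1.0" encoding="UTF-8"?>\n' + text.encode("utf-8")


# Hypothetical output fragment, as bytes in an ISO-8859-1 locale.
sample = "<description>défini</description>".encode("iso-8859-1")
option1 = header_with_locale_encoding(sample, "ISO-8859-1")
option2 = convert_to_utf8(sample, "ISO-8859-1")
```

In a real program the encoding argument would come from locale_encoding(); it is passed explicitly here so the example does not depend on the machine's locale. With option 2 the reader only ever has to handle UTF-8.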

-- 
Glynn Clements <glynn at gclements.plus.com>

