[GRASS-dev] Re: [bug #5195] (grass) ps.map sets encoding to iso-8859-1

Moritz Lennert mlennert at club.worldonline.be
Tue Oct 10 09:35:08 EDT 2006


Glynn Clements wrote:
> Moritz Lennert wrote:

>> Most GNU/Linux distributions 
>> come with UTF-8 as default system encoding nowadays and so users will 
>> have that problem.
> 
> The default locale's encoding doesn't matter. What matters is the
> encoding of the text in the ps.map input file.

Yes, but I would assume that in most cases (i.e. in those where people 
don't even think about encoding issues) files will be encoded in the 
locale's encoding.

> 
> If they have text in UTF-8, they'll need to convert it to ISO-8859-1
> first. If you have text outside of the ISO-8859-1 repertoire, you lose
> regardless of what ps.map does, because your printer probably doesn't
> have those glyphs.
> 
> About the only thing which ps.map can do here is to convert UTF-8 to
> ISO-8859-1. But then it would need some way to determine that the text
> is in UTF-8 (if it assumes it, users would first have to convert any
> ISO-8859-1 text to UTF-8 just so that ps.map can convert it back to
> ISO-8859-1).

This is obviously no solution.
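
For reference, the conversion step itself is just a decode followed by an encode. A minimal sketch in Python, assuming the input really is UTF-8 (the function name utf8_to_latin1 is made up for this illustration and is not part of GRASS or ps.map):

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1 (hypothetical helper)."""
    # Assumption: the input is valid UTF-8; otherwise decode() raises
    # UnicodeDecodeError.
    text = data.decode("utf-8")
    # Characters outside the ISO-8859-1 repertoire raise UnicodeEncodeError,
    # which is the "you lose regardless" case from the quoted message.
    return text.encode("iso-8859-1")


# U+00E9 (e with acute accent) is in ISO-8859-1, so this converts.
print(utf8_to_latin1("caf\u00e9".encode("utf-8")))

# U+0142 (l with stroke) is not in ISO-8859-1, so this fails.
try:
    utf8_to_latin1("\u0142".encode("utf-8"))
except UnicodeEncodeError as err:
    print("cannot convert:", err)
```

This only works when the caller already knows the input encoding, which is exactly the piece of information ps.map does not have.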

> 
>> I imagine there is no way of automatically identifying the encoding of a 
>> file ?
> 
> Correct. At least, not reliably. You can use various heuristics; e.g. 
> bytes \x80-\x9F aren't valid in any ISO-8859-* encodings, certain
> combinations aren't valid in UTF-8 etc.
> 
> But it's entirely possible to create a text file which is perfectly
> valid in multiple encodings. E.g. if you have an ISO-8859-* file which
> is almost entirely ASCII but with a small number of isolated non-ASCII
> characters, it's almost impossible for a program to determine exactly
> which ISO-8859-* encoding it's meant to be.
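
To illustrate the heuristics described above, here is a rough sketch in Python. The function name guess_encoding and the labels it returns are invented for this example, and the rules are only the two mentioned in the quoted text, so it is a guess and not a reliable detector:

```python
def guess_encoding(data: bytes) -> str:
    """Return a rough guess at the encoding of data (hypothetical helper)."""
    # Pure ASCII is valid in UTF-8 and in every ISO-8859-* encoding.
    if all(b < 0x80 for b in data):
        return "ascii"
    # Certain byte combinations are not valid UTF-8; let the decoder check.
    try:
        data.decode("utf-8")
        return "utf-8"  # valid UTF-8, but it may still be ISO-8859-* text
    except UnicodeDecodeError:
        pass
    # Bytes 0x80-0x9F are not printable characters in any ISO-8859-* set.
    if any(0x80 <= b <= 0x9F for b in data):
        return "unknown"
    # We cannot tell which member of the ISO-8859 family this is.
    return "iso-8859-*"


print(guess_encoding("caf\u00e9".encode("utf-8")))
print(guess_encoding("caf\u00e9".encode("iso-8859-1")))
print(guess_encoding(b"\xc3\xa9"))
```

The last call shows the ambiguity: the bytes C3 A9 are reported as UTF-8 (one accented e), yet they are equally valid ISO-8859-1 text (two characters), so the guess can be wrong.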

Ok, so the only thing to do seems to be a note in the man page.

Does the attached patch look all right?

Moritz
-------------- next part --------------
Index: ps/ps.map/description.html
===================================================================
RCS file: /grassrepository/grass6/ps/ps.map/description.html,v
retrieving revision 1.45
diff -u -r1.45 description.html
--- ps/ps.map/description.html	19 Jul 2006 10:08:31 -0000	1.45
+++ ps/ps.map/description.html	10 Oct 2006 13:31:33 -0000
@@ -20,7 +20,9 @@
 This program has two distinct modes of operation.  The command-line
 mode requires the user to prepare a file of mapping instructions prior
 to running <EM>ps.map</EM> that describes the various spatial and textual
-information to be printed.
+information to be printed. Users who want to use special characters (such as accented characters) should note that ps.map uses the ISO-8859-1 (Latin-1) encoding, so the instructions file must be encoded in ISO-8859-1. If you normally work in a different encoding (such as UTF-8), convert your file to ISO-8859-1 first, for example with <EM>iconv</EM>:
+
+<EM>iconv -f UTF-8 -t ISO_8859-1 utf_file > iso_file</EM>
 
 The interactive mode (i.e., no command-line arguments) will prompt the
 user for items to be mapped and does not require the user

