[GRASS-dev] Re: [bug #5195] (grass) ps.map sets encoding to iso-8859-1

Moritz Lennert mlennert at club.worldonline.be
Thu Oct 12 03:55:48 EDT 2006


Glynn Clements wrote:
> Moritz Lennert wrote:
> 
>>>> Most GNU/Linux distributions 
>>>> come with UTF-8 as default system encoding nowadays and so users will 
>>>> have that problem.
>>> The default locale's encoding doesn't matter. What matters is the
>>> encoding of the text in the ps.map input file.
>> Yes, but I would assume that in most cases (i.e. in those where people 
>> don't even think about encoding issues) files will be encoded in the 
>> locale's encoding.
> 
> Not necessarily.
> 
> If they get the data from an external source (email, web page), the
> file will probably be in whatever format the file's creator chose. 
> Programs which save some data to text files typically just save the
> bytes as they found them.
> 
> If they created the data themselves, it's likely to be in the default
> encoding of their preferred text editor, which isn't necessarily the
> same as the locale's encoding.
> 
>>>> I imagine there is no way of automatically identifying the encoding of a 
>>>> file ?
>>> Correct. At least, not reliably. You can use various heuristics; e.g. 
>>> bytes \x80-\x9F aren't valid in any ISO-8859-* encoding, certain
>>> combinations aren't valid in UTF-8 etc.
>>>
>>> But it's entirely possible to create a text file which is perfectly
>>> valid in multiple encodings. E.g. if you have an ISO-8859-* file which
>>> is almost entirely ASCII but with a small number of isolated non-ASCII
>>> characters, it's almost impossible for a program to determine exactly
>>> which ISO-8859-* encoding it's meant to be.
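
For the archive, the two heuristics mentioned above can be sketched in a
few lines of Python. This is only an illustration, not something ps.map
does; the function name plausible_encodings is made up for the example,
and it can only rule encodings out, never prove which one was meant.

```python
from typing import List


def plausible_encodings(data: bytes) -> List[str]:
    """Return the encodings that the given bytes do not rule out.

    Hypothetical helper for illustration only; it applies just the two
    heuristics from the discussion and cannot identify an encoding.
    """
    # Pure ASCII is byte-identical in UTF-8 and in every ISO-8859-* set.
    if all(b < 0x80 for b in data):
        return ["ascii"]

    candidates = []

    # Certain byte combinations are invalid in UTF-8; a strict decode
    # raises UnicodeDecodeError when it meets one.
    try:
        data.decode("utf-8")
        candidates.append("utf-8")
    except UnicodeDecodeError:
        pass

    # Bytes 0x80-0x9F are not printable characters in ISO-8859-*, so
    # ordinary text in those encodings should not contain them.
    if not any(0x80 <= b <= 0x9F for b in data):
        candidates.append("iso-8859-*")

    return candidates


if __name__ == "__main__":
    # "cafe" with an accented e, encoded two different ways.
    print(plausible_encodings("caf\u00e9".encode("iso-8859-1")))
    print(plausible_encodings("caf\u00e9".encode("utf-8")))
```

Note that the UTF-8 bytes for the accented "e" pass both checks, since
they also happen to be printable ISO-8859-1 characters. That is exactly
the ambiguity described above, and the sketch makes no attempt to tell
the individual ISO-8859-* variants apart.
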
>> Ok, so the only thing to do seems to be a note in the man page.
>>
>> Does the attached patch look alright ?
> 
>> +information to be printed. For users wanting to use special
>> characters (such as accented characters) it is important to note that
>> ps.map uses ISO-Latin1 encoding.
> 
> Technically, the encoding is ISO-8859-1. "ISO Latin 1" refers to the
> repertoire (the set of characters) rather than the byte-sequences used
> to encode them.
> 
> Other than that, it seems okay.
> 
> One error I noticed in the ps.map manpage; in the description of the
> "text" command, it says:
> 
>         wrong??   font:  cyrilc  gothgbt gothgrt gothitt greekc greekcs greekp
>        greeks italicc italiccs italict romanc  romancs  romand  romans  romant
>        scriptc scripts (The default font is romans);
> 
> Yep, that's definitely wrong. Valid arguments to the "font" option are
> whatever fonts your printer (or Ghostscript, etc) supports. Safe
> choices, present in all PostScript implementations, are:
> 
> 	Times-Roman
> 	Times-Italic
> 	Times-Bold
> 	Times-BoldItalic
> 	Helvetica
> 	Helvetica-Oblique
> 	Helvetica-Bold
> 	Helvetica-BoldOblique
> 	Courier
> 	Courier-Oblique
> 	Courier-Bold
> 	Courier-BoldOblique
> 
> The default is Helvetica.

Just committed changes to description.html, including corrections from
Hamish and Glynn and the above information about fonts.

Bug can be closed (but we probably should think about how to handle this 
more elegantly in the future).
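
Until then, one possible stopgap is to re-encode the input file before
handing it to ps.map. The sketch below assumes the file really is UTF-8
(which, as discussed, cannot be taken for granted); the function names
and any file paths passed to them are hypothetical.

```python
from pathlib import Path


def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1.

    Raises UnicodeDecodeError if the input is not valid UTF-8, and
    UnicodeEncodeError if it contains a character that ISO-8859-1
    cannot represent (the euro sign, for instance).
    """
    return data.decode("utf-8").encode("iso-8859-1")


def convert_file(src: str, dst: str) -> None:
    """Write an ISO-8859-1 copy of the UTF-8 file src to dst."""
    # Working on raw bytes leaves line endings exactly as they were.
    Path(dst).write_bytes(utf8_to_latin1(Path(src).read_bytes()))
```

Failing loudly on characters outside ISO-8859-1 seems preferable to
silently replacing them, since the user would otherwise only notice the
problem in the printed map.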

Moritz



