[GRASS-dev] Re: [bug #5195] (grass) ps.map sets encoding to iso-8859-1

Tue Oct 10 03:53:27 EDT 2006

Glynn Clements via RT wrote:
> Request Tracker wrote:
> 
>> this bug's URL: http://intevation.de/rt/webrt?serial_num=5195
>> -------------------------------------------------------------------------
>>
>> Subject: ps.map sets encoding to iso-8859-1
>>
>> Platform: GNU/Linux/x86
>> grass obtained from: CVS
>> grass binary for platform: Compiled from Sources
>> GRASS Version: cvs_head_20060921
>>
>> On line 92 of ps/ps.map/prolog.ps the encoding is set to ISOLatin1Encoding.
>>
>> If I understand correctly (and some testing confirms this) this
>> means that the instructions file for ps.map has to be encoded in
>> iso-8859-1 (or similar) to work, i.e. to be able to print accented
>> characters. If you are in a UTF-8 environment, ps.map creates a ps
>> file which doesn't show correct accented characters be it in iso or
>> in utf.
>>
>> Is there any reason why ps.map hardcodes the encoding ? Is it
>> possible to automatically use the users encoding ?
> 
> The reason why we force the font's encoding to ISOLatin1Encoding is
> that the default encoding for most Latin fonts is StandardEncoding,
> which (contrary to its name) is a completely non-standard encoding
> which (AFAICT) is not used by anything except PostScript.
> 
> The value of the Encoding property is an array of 256 glyph names, so
> you can use any unibyte encoding (e.g. ISO-646-*, ISO-8859-*,
> windows-12?? etc).
> 
> If you want to support more complex encodings, you need to use
> CID-keyed fonts. Apart from being rather complex, CID-keyed fonts may
> not be supported by PostScript printers sold outside of South-East
> Asia.

Does UTF-8 count as a 'complex encoding'? Most GNU/Linux distributions
now ship with UTF-8 as the default system encoding, so users will run
into this problem.

> 
> In short, allowing the encoding to be changed to other unibyte
> encodings is simple enough. Anything else will require a willing
> volunteer (i.e. not me), and will need to be implemented in such a way
> that users don't end up accidentally producing documents which show up
> fine in (recent versions of) Ghostscript but which will be rejected by
> every PostScript printer on this half of the planet.

Hamish wrote:
> FWIW, if all you want is accents, ps.map should be able to pass through
> the standard ascii extended chars.  e.g. I use the (c), ^2, degree
> symbols a lot. gnome-terminal doesn't like them, but they are fine
> passed from an input file. 

We haven't been able to get accents in the ps.map output when the
instruction file containing the accents was UTF-8 encoded.

> Alternately direct insertion using rxvt+vi on
> the output PostScript file can get the job done. If you try this & have
> many to do, I suggest leaving some breadcrumbs for the search&replace to
> find.

Yes, doing it manually is no problem (see below), but it is quite a
hassle. IMO accents should be easily available in GRASS.

If I understood my testing correctly, the locale does not matter: even
in a UTF-8 environment it is enough that the ps.map instruction file is
encoded in a unibyte encoding. The test went as follows:

iconv -f ISO_8859-15 -t UTF-8 test.psmap.iso > test.psmap.utf8

LC_ALL=iso-8859-15

ps.map in=test.psmap.iso out=map.ps
accents are there.

ps.map in=test.psmap.utf8 out=map.ps
garbled accents

LC_ALL=UTF-8

ps.map in=test.psmap.iso out=map.ps
accents are there

ps.map in=test.psmap.utf8 out=map.ps
garbled accents
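
A rough illustration of why the UTF-8 file comes out garbled, assuming
ps.map passes the bytes of the instruction file through unchanged and
the font looks up each byte separately in its 256-entry Encoding array
(I have not checked the ps.map code for this; Python is used only to
show the bytes, it is not part of ps.map):

```python
# Illustration only: one accented character as bytes in each encoding.
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

latin1_bytes = text.encode("iso-8859-1")  # one byte: 0xE9
utf8_bytes = text.encode("utf-8")         # two bytes: 0xC3 0xA9

# Assumption: every byte selects one glyph on its own, the way a font
# re-encoded to ISOLatin1Encoding would treat it. The two UTF-8 bytes
# then become two unrelated Latin-1 characters (A-tilde, copyright sign).
seen_by_latin1_font = utf8_bytes.decode("iso-8859-1")
```

So a single accented letter in the UTF-8 file would show up as two
wrong glyphs, which matches what I see.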

So, maybe we should at least add a hint to the ps.map man page that
users in a UTF-8 environment who want non-ASCII characters should
convert their UTF-8 instruction files to a unibyte encoding before
using them.
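
A minimal sketch of such a conversion step, in case someone wants to
script it. The function name and the default target encoding are my
own choices for illustration, nothing of this exists in GRASS:

```python
def utf8_to_unibyte(data: bytes, target: str = "iso-8859-1") -> bytes:
    """Re-encode UTF-8 bytes into a single-byte encoding.

    Raises UnicodeDecodeError if the input is not valid UTF-8, and
    UnicodeEncodeError if a character has no equivalent in the target
    encoding (nothing is silently dropped or replaced).
    """
    return data.decode("utf-8").encode(target)
```

On the command line the same thing is the reverse of the iconv call
above, i.e. iconv -f UTF-8 -t ISO_8859-15 test.psmap.utf8 > test.psmap.iso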

I imagine there is no way of automatically identifying the encoding of a
file?
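
As far as I know it cannot be done with certainty, but one heuristic
might be good enough for this case: text in a unibyte encoding that
contains accented characters is very rarely valid UTF-8, so a strict
UTF-8 decode can serve as a test. A sketch (the function name is made
up, and it cannot tell the different unibyte encodings apart):

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8.

    Pure ASCII also returns True, which is harmless here because ASCII
    bytes are identical in UTF-8 and in the ISO-8859-* encodings.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

This only says "UTF-8 or not"; which unibyte encoding a file uses would
still have to come from the user or the locale.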

Moritz