[GRASS-dev] Re: [bug #5195] (grass) ps.map sets encoding to iso-8859-1

Glynn Clements glynn at gclements.plus.com
Wed Oct 11 00:55:17 EDT 2006


Moritz Lennert wrote:

> >> Most GNU/Linux distributions 
> >> come with UTF-8 as default system encoding nowadays and so users will 
> >> have that problem.
> > 
> > The default locale's encoding doesn't matter. What matters is the
> > encoding of the text in the ps.map input file.
> 
> Yes, but I would assume that in most cases (i.e. in those where people 
> don't even think about encoding issues) files will be encoded in the 
> locale's encoding.

Not necessarily.

If they get the data from an external source (email, web page), the
file will probably be in whatever format the file's creator chose. 
Programs which save some data to text files typically just save the
bytes as they found them.

If they created the data themselves, it's likely to be in the default
encoding of their preferred text editor, which isn't necessarily the
same as the locale's encoding.

> >> I imagine there is no way of automatically identifying the encoding of a 
> >> file ?
> > 
> > Correct. At least, not reliably. You can use various heuristics; e.g. 
> > bytes \x80-\x9F aren't valid in any ISO-8859-* encodings, certain
> > combinations aren't valid in UTF-8 etc.
> > 
> > But it's entirely possible to create a text file which is perfectly
> > valid in multiple encodings. E.g. if you have an ISO-8859-* file which
> > is almost entirely ASCII but with a small number of isolated non-ASCII
> > characters, it's almost impossible for a program to determine exactly
> > which ISO-8859-* encoding it's meant to be.
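
To make those two heuristics concrete, here is a rough Python sketch. It
is only an illustration of the checks described above, not anything
ps.map does; the function name guess_encodings is made up for this
example.

```python
def guess_encodings(data):
    """Return the candidate encodings that the heuristics fail to rule out."""
    # Pure ASCII is identical in UTF-8 and in every ISO-8859-* encoding.
    if all(b < 0x80 for b in data):
        return ["ascii"]

    candidates = []

    # Heuristic 1: non-ASCII bytes in UTF-8 must form well-formed
    # multi-byte sequences; many byte combinations are invalid.
    try:
        data.decode("utf-8")
        candidates.append("utf-8")
    except UnicodeDecodeError:
        pass

    # Heuristic 2: bytes 0x80-0x9F are C1 control codes in ISO-8859-*,
    # so they should not appear in ordinary text.
    if not any(0x80 <= b <= 0x9F for b in data):
        candidates.append("iso-8859-*")

    return candidates


if __name__ == "__main__":
    for sample in (b"caf\xe9", b"caf\xc3\xa9", b"\xe2\x82\xac"):
        print(sample, guess_encodings(sample))
```

Note that b"caf\xc3\xa9" passes both checks, and that the second check
cannot say which ISO-8859-* encoding was intended, which is exactly the
ambiguity described above.
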
> 
> Ok, so the only thing to do seems to be a note in the man page.
> 
> Does the attached patch look alright ?

> +information to be printed. For users wanting to use special
> characters (such as accented characters) it is important to note that
> ps.map uses ISO-Latin1 encoding.

Technically, the encoding is ISO-8859-1. "ISO Latin 1" refers to the
repertoire (the set of characters) rather than the byte-sequences used
to encode them.

Other than that, it seems okay.
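
To illustrate the repertoire/encoding distinction: the same character
from the Latin 1 repertoire is a different byte sequence in UTF-8 than
in ISO-8859-1. A minimal Python sketch of converting UTF-8 text to
ISO-8859-1 follows; it assumes the input really is UTF-8, and the
function name utf8_to_latin1 is hypothetical.

```python
def utf8_to_latin1(data):
    """Re-encode UTF-8 bytes as ISO-8859-1 bytes.

    Raises UnicodeDecodeError if the input is not valid UTF-8, and
    UnicodeEncodeError if it contains a character outside the
    Latin 1 repertoire (e.g. the euro sign).
    """
    text = data.decode("utf-8")       # bytes -> characters
    return text.encode("iso-8859-1")  # characters -> one byte each


if __name__ == "__main__":
    utf8_bytes = "é".encode("utf-8")
    print(utf8_bytes)                  # two bytes in UTF-8
    print(utf8_to_latin1(utf8_bytes))  # a single byte in ISO-8859-1
```

Failing loudly on characters outside the repertoire seems preferable to
silently substituting "?", since the user then knows which text cannot
be represented.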

One error I noticed in the ps.map manpage; in the description of the
"text" command, it says:

        wrong??   font:  cyrilc  gothgbt gothgrt gothitt greekc greekcs greekp
       greeks italicc italiccs italict romanc  romancs  romand  romans  romant
       scriptc scripts (The default font is romans);

Yep, that's definitely wrong. Valid arguments to the "font" option are
whatever fonts your printer (or Ghostscript, etc.) supports. Safe
choices, present in all PostScript implementations, are:

	Times-Roman
	Times-Italic
	Times-Bold
	Times-BoldItalic
	Helvetica
	Helvetica-Oblique
	Helvetica-Bold
	Helvetica-BoldOblique
	Courier
	Courier-Oblique
	Courier-Bold
	Courier-BoldOblique

The default is Helvetica.

-- 
Glynn Clements <glynn at gclements.plus.com>



