[GRASS-user] Re: GCC vs. locale

Ivan Shmakov ivan at theory.asu.ru
Sat Feb 2 08:25:58 EST 2008


>>>>> Glynn Clements <glynn at gclements.plus.com> writes:

 >>>>> driver.h:70: error: \u2018BOUND_BOX\u2019 does not name a type
 >>>>> make: *** [OBJ.i686-pc-linux-gnu/grass6_wxvdigit_wrap.o] Error 1

 >>>> "\u2018" -- unicode characters sneaking into what should be a flat
 >>>> ASCII file?

 >>> Those are just the quotes added by the compiler. Recent versions of
 >>> gcc have taken to using gratuitous non-ASCII punctuation in
 >>> diagnostic messages.

 >> Is it due to a locale setting?  It seems reasonable for GCC to put
 >> UTF-8 quotes when asked for such a locale.

 > ASCII is a subset of UTF-8, so there's no problem with using the
 > ASCII quote characters in that situation.

	That's the problem -- ASCII lacks dedicated single quotation
	mark characters; it has only APOSTROPHE and GRAVE ACCENT.

$ zcat /usr/share/i18n/charmaps/ANSI_X3.4-1968.gz 
<code_set_name> ANSI_X3.4-1968
...

...
% alias ASCII
...
<U0027>     /x27         APOSTROPHE
...
<U0060>     /x60         GRAVE ACCENT
...
$ 

 > It might be different if the locale was one which doesn't normally
 > use "..." for quotations. E.g. using «...» in a French locale or
 > 「...」 in a Japanese locale might be reasonable. But the error message
 > was quite clearly in English.

 > If it was using non-ASCII characters in the C/POSIX locale, that
 > would be an unequivocal bug.

	In the C/POSIX locale (as well as in the rest of the ASCII
	world), APOSTROPHE and GRAVE ACCENT are traditionally used to
	represent single quote characters.  However, there's no need
	for such behaviour when the character set has ``real'' single
	quote characters.

$ zcat /usr/share/i18n/charmaps/UTF-8.gz 
<code_set_name> UTF-8
...

% alias ISO-10646/UTF-8
CHARMAP
...
<U2018>     /xe2/x80/x98 LEFT SINGLE QUOTATION MARK
<U2019>     /xe2/x80/x99 RIGHT SINGLE QUOTATION MARK
...
$ 
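
	The byte sequences in the two charmaps can be cross-checked with
	a minimal Python 3 sketch.  The helper name encodable() is made
	up for this illustration; the code points and byte values are
	the ones listed in the charmaps above.

```python
# Code points as listed in the ASCII and UTF-8 charmaps above.
APOSTROPHE = "\u0027"
GRAVE_ACCENT = "\u0060"
LEFT_QUOTE = "\u2018"   # LEFT SINGLE QUOTATION MARK
RIGHT_QUOTE = "\u2019"  # RIGHT SINGLE QUOTATION MARK


def encodable(ch, charset):
    """Hypothetical helper: True if ch can be encoded in charset."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Print each character's UTF-8 bytes and whether ASCII can hold it.
for ch in (APOSTROPHE, GRAVE_ACCENT, LEFT_QUOTE, RIGHT_QUOTE):
    utf8_hex = ch.encode("utf-8").hex()
    print("U+%04X  utf-8: %s  in ascii: %s"
          % (ord(ch), utf8_hex, encodable(ch, "ascii")))
```

	Only the first two characters fit into ASCII; the two quotation
	marks each need three bytes in UTF-8.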

 > As it is, it's merely a poor choice.

	I'd say that it's behaviour consistent with that of other
	applications.  I cannot imagine a case where this behaviour
	could actually be useful (other than inserting the program
	output into a UTF-8 printed document), but it's consistent.

PS.  Is `utf8' a valid MIME charset name?

From: Glynn Clements <...>
Subject: Re: GCC vs. locale
Date: Sat, 2 Feb 2008 13:53:18 +0000
...
Message-ID: <18340.30158.625426.938920 at ...>
...
Mime-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: quoted-printable

	My Gnus seemingly misrecognized it.


