[GRASS-dev] Re: [GRASS GIS] #1193: Python Menu: Japanese (double
byte character) in menu may cause parser error.
GRASS GIS
trac at osgeo.org
Sun Oct 10 23:30:23 EDT 2010
#1193: Python Menu: Japanese (double byte character) in menu may cause parser
error.
-------------------------+--------------------------------------------------
 Reporter:  naokiueda    |       Owner:  grass-dev@…
     Type:  defect       |      Status:  new
 Priority:  major        |   Milestone:  6.4.1
Component:  Python       |     Version:  6.4.0
 Keywords:               |    Platform:  MSWindows 7
      Cpu:  Unspecified  |
-------------------------+--------------------------------------------------
Comment(by glynn):
Replying to [comment:1 neteler]:
> I have tried on Linux and I could launch r.reclass in Japanese without
> problems. Perhaps it is a Windows-only problem.
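AFAICT, it's a problem with Shift-JIS (cp932), which isn't compatible with
ASCII. Unix systems use EUC-JP, which doesn't have this problem: every byte
of a multi-byte EUC-JP character has the top bit set, so none of them can
be mistaken for an ASCII character.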
Shift-JIS is a multi-byte encoding. Non-ASCII characters have a first byte
with the top bit set, but the second byte can be any value >= 64 (0x40)
except 0x7f. While this excludes the digits and most of the punctuation
characters, it includes the ASCII letters as well as `[\]^_{|}~`.
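
As a rough check of the claim above, the following sketch asks Python's
"cp932" codec which 7-bit values it accepts as the second byte of a
two-byte character. The lead-byte ranges are written out by hand and are an
assumption of this example, not something taken from GRASS.

```python
# Lead-byte ranges for two-byte characters in Shift-JIS / cp932
# (hand-written for this example).
LEAD_BYTES = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))

ascii_trail_bytes = set()
for lead in LEAD_BYTES:
    for trail in range(0x80):          # only look at the 7-bit (ASCII) range
        try:
            bytes([lead, trail]).decode("cp932")
        except UnicodeDecodeError:
            continue                   # not a valid two-byte sequence
        ascii_trail_bytes.add(trail)

# Show the ASCII characters that can turn up as a second byte.
print("".join(chr(b) for b in sorted(ascii_trail_bytes)))
```

Any character printed here is one that a byte-oriented parser could
misread in the middle of Japanese text.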
This makes it incompatible with any code which parses a stream of bytes
without reference to the encoding, as e.g. '\' (0x5c) might be an ASCII
'\' or it might be the second byte of a JIS X 0208 character; you can't
tell without scanning from the start of the string and tracking whether
the previous byte was the first byte of a two-byte character.
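
A minimal illustration of the ambiguity, using Python's standard "cp932"
and "euc_jp" codecs. The sample string is chosen for this example only; it
is not the menu text from the ticket.

```python
text = "表示"  # sample string; its first character is U+8868

sjis = text.encode("cp932")
eucjp = text.encode("euc_jp")

# In cp932 the second byte of the first character is 0x5c (ASCII backslash).
has_backslash_sjis = b"\\" in sjis
# In EUC-JP every byte of a multi-byte character has the top bit set.
has_backslash_euc = b"\\" in eucjp

# A byte-oriented split on backslash cuts the first character in half.
pieces = sjis.split(b"\\")

# Decoding first and then splitting leaves the text intact.
decoded_pieces = sjis.decode("cp932").split("\\")

print(has_backslash_sjis, has_backslash_euc, pieces, decoded_pieces)
```

The same effect applies to any parser that looks for '\', '|', '[' and so
on in the raw bytes.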
Unfortunately, the only Japanese encoding which is supported by Windows'
codepage-based API is Shift-JIS (actually, codepage 932, which is Shift-
JIS plus the usual Microsoft-specific extensions). There is no UTF-8
codepage (cp 65001 is UTF-8, but it can't be used as a normal codepage).
I don't think that there's any solution to this, other than "don't use
kanji (or hiragana or full-width katakana) in command lines". GRASS is
stuck using the codepage-based API (unless someone wants to implement
UTF-8 equivalents of all of the ANSI C and POSIX functions, and change all
of GRASS to use them), and expecting every function which deals with char*
to decode it according to the current locale isn't feasible.
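
For a sense of what "decoding according to the encoding" would mean for
even a single byte-level search, here is a hedged sketch in Python. The
helper names `is_cp932_lead` and `find_ascii_byte` are hypothetical and are
not part of GRASS; the lead-byte ranges are written out by hand.

```python
def is_cp932_lead(byte):
    """True if `byte` starts a two-byte character (hand-written ranges)."""
    return 0x81 <= byte <= 0x9F or 0xE0 <= byte <= 0xFC


def find_ascii_byte(data, target):
    """Return the offsets where `target` occurs as a real ASCII byte."""
    offsets = []
    i = 0
    while i < len(data):
        if is_cp932_lead(data[i]):
            i += 2              # skip the lead byte and its second byte
            continue
        if data[i] == target:
            offsets.append(i)
        i += 1
    return offsets


# Hypothetical Windows-style path containing one kanji.
path = "C:\\表\\x".encode("cp932")

naive = [i for i, b in enumerate(path) if b == 0x5C]
aware = find_ascii_byte(path, 0x5C)
print(naive, aware)
```

Every routine that scans a char* for a delimiter would need the same
treatment, which is why this isn't a realistic fix.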
--
Ticket URL: <http://trac.osgeo.org/grass/ticket/1193#comment:2>
GRASS GIS <http://grass.osgeo.org>