[GRASS-dev] [GRASS GIS] #2532: TypeError: environment can only contain string when launching script on Windows
GRASS GIS
trac at osgeo.org
Tue Jan 20 15:22:53 PST 2015
#2532: TypeError: environment can only contain string when launching script on
Windows
-------------------------+--------------------------------------------------
Reporter: annakrat | Owner: grass-dev@…
Type: defect | Status: new
Priority: normal | Milestone: 7.0.0
Component: Default | Version: svn-trunk
Keywords: encoding | Platform: MSWindows 8
Cpu: Unspecified |
-------------------------+--------------------------------------------------
Comment(by glynn):
Replying to [comment:15 annakrat]:
> No, when I print the string I get xml, seems to be valid:
>
{{{
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task SYSTEM "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\xml\grass-interface.dtd">
<task name="test_workshopá.py">
}}}
> I don't understand what's wrong with it.
The name= attribute will fail to decode due to not being valid UTF-8. The
"á" will be encoded in cp1252 (i.e. '\xe1'); attempting to decode that as
UTF-8 will fail (non-ASCII characters are encoded as multi-byte sequences;
an isolated byte >= 128 can never occur in UTF-8).
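
A minimal sketch of the point above, not taken from the GRASS sources: the same character is one byte in cp1252 and two bytes in UTF-8, and the lone cp1252 byte is rejected by a UTF-8 decoder. The helper name `decodes_as_utf8` is made up for illustration.

```python
# "á" is U+00E1; cp1252 stores it as one byte, UTF-8 as two bytes.
char = u"\u00e1"
cp1252_bytes = char.encode("cp1252")   # b'\xe1'
utf8_bytes = char.encode("utf-8")      # b'\xc3\xa1'


def decodes_as_utf8(data):
    """Return True if the byte string is valid UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(cp1252_bytes))  # False: isolated byte >= 128
print(decodes_as_utf8(utf8_bytes))    # True: proper two-byte sequence
```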
> > In any case, the GUI should be encoding the arguments which it passes
> > to Popen(); it shouldn't be passing unicode values.
>
> Should the encoding be moved to `get_interface_description` in task.py?
No. The GUI shouldn't be passing unicode values to the grass.script
library; it should be converting them to strings itself.
> The `EncodeString` function is in gui, not in python scripting library.
grass.script.core has encode() and decode().
> If I try to run the script (this time the script name is only ascii, but
> the path has some non-ascii characters which are in cp1252), I get the gui
> dialog and when I run it, I get an error:
>
{{{
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\core\gcmd.py", line 92, in EncodeString
    return string.encode(_enc)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 38: ordinal not in range(128)
}}}
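Ugh. I couldn't figure out what was happening here until I read the next
sentence. It appears that str.encode() actually exists in Python 2; it first
tries to convert the byte string to unicode (using the default encoding,
ASCII) so that it can then encode it. That implicit decode is what fails,
which is why an encode() call raises UnicodeDecodeError.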
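
The two hidden steps can be spelled out as a rough emulation. This is a sketch, not Python's actual implementation: `py2_style_str_encode` is a hypothetical helper, the path is made up, and the default encoding is assumed to be ASCII as the traceback suggests.

```python
def py2_style_str_encode(byte_string, encoding, default_encoding="ascii"):
    """Emulate Python 2's str.encode() on a byte string (hypothetical helper)."""
    # Step 1: implicit decode with the default encoding; non-ASCII bytes fail here.
    text = byte_string.decode(default_encoding)
    # Step 2: the encode that the caller actually asked for.
    return text.encode(encoding)


# Made-up path containing the cp1252 byte for "á".
path = b"C:\\Users\\caf\xe1\\script.py"

try:
    py2_style_str_encode(path, "cp1252")
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

print(error_name)  # UnicodeDecodeError, raised by step 1, not by the encode
```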
> because in the Popen class in
> [http://trac.osgeo.org/grass/browser/grass/trunk/gui/wxpython/core/gcmd.py#L161 gcmd.py]
> some of the arguments are of type `str`, some are `unicode`. So
> if I encode only the unicode ones, it starts to work.
That makes sense. But the encoding should ideally be done at a higher
level, at the point that wxGUI "knows" that it's dealing with a unicode
value.
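
For reference, the "encode only the unicode ones" workaround might look roughly like the sketch below. `encode_args` is a hypothetical helper, not a function from gcmd.py or grass.script, and the argument values are invented; ideally the conversion would happen earlier, where the value is known to be unicode.

```python
text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def encode_args(args, encoding):
    """Encode text arguments to bytes; leave byte strings untouched.

    Hypothetical helper: calling .encode() on an existing byte string is
    exactly what triggers the implicit ASCII decode in Python 2.
    """
    return [a.encode(encoding) if isinstance(a, text_type) else a
            for a in args]


# Made-up argument list mixing byte strings and unicode values.
mixed = [b"g.region", u"vector=caf\u00e9", b"-p"]
encoded = encode_args(mixed, "cp1252")
print(encoded)
```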
This is the main reason why I dislike dynamically-typed languages for
large-scale projects (I'd never have suggested Python if I'd known
that wxGUI was going to turn into such a behemoth). In C/C++, you'd just
get a compile error if you pass a wchar_t*/std::wstring() where a
char*/std::string was expected. In Python, you get something which appears
to work until it starts getting decent test coverage.
I'm wondering if sys.setdefaultencoding("EBCDIC-CP-BE") would work ...
--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:16>
GRASS GIS <http://grass.osgeo.org>