[GRASS-dev] Python 3 porting and unicode

Laurent C. lrntct at gmail.com
Mon Nov 27 09:38:02 PST 2017


Hi Vaclav,

I think that it would make much more sense to have the GRASS python
libraries using unicode, and to add an interface managing the
translation to/from bytes when dealing with C code.
Python programmers using the GRASS libraries will expect unicode strings.
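
A minimal sketch of what such an interface could look like, assuming UTF-8 is the encoding used at the C boundary. The names `encode`, `decode` and `roundtrip_through_c` are hypothetical and not part of the GRASS API; `ctypes.c_char_p` only stands in for a real call into the C library.

```python
import ctypes

ENCODING = "utf-8"  # assumption: the encoding agreed on for the C side


def encode(text, encoding=ENCODING):
    """Return bytes for the C side; bytes pass through unchanged."""
    if isinstance(text, bytes):
        return text
    return text.encode(encoding)


def decode(data, encoding=ENCODING):
    """Return unicode for the Python side; str passes through unchanged."""
    if isinstance(data, str):
        return data
    return data.decode(encoding)


def roundtrip_through_c(text):
    """Hand text to ctypes as a C string and read it back as unicode."""
    c_string = ctypes.c_char_p(encode(text))  # what C code would see as char *
    return decode(c_string.value)             # .value is bytes in Python 3


name = "élévation"
print(roundtrip_through_c(name) == name)
```

With helpers like these, the library code itself only handles unicode, and the conversion stays in one place next to the ctypes calls.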

Laurent


2017-11-26 21:21 GMT-06:00 Vaclav Petras <wenzeslaus at gmail.com>:
> Dear all,
>
> after looking at different Python 2 to 3 porting issues, doing r71849, and
> reading #3392, I understand the following:
>
> * Several solutions for porting exist. The most recent one is the
> python-future project, but only from __future__ import ... is part of the
> standard library and thus guaranteed with a recent Python 2.7. (We can
> discuss concrete steps separately.)
>
> * However, the most challenging part of the porting will be the unicode.
>
> * There is no way around unicode when using Python 3. Unicode is an
> inherent part of the language; even things such as os.environ or
> sys.stdout.write() work only with unicode. I'm not sure what exactly the
> rule is here, but it seems to be everywhere.
>
> * I haven't seen any simple fix which would limit the changes in the code
> in the way that, e.g., the print statement can be fixed.
>
> * GUI will always use unicode because that's how the libraries and
> interfaces are set.
>
> * In relation to the previous point, one of the reasons why unicode is used
> is that things like text[:10] actually return 10 characters to display.
>
> * C library will not use unicode for now.
>
> * Users of the Python API who are using Python 3 will expect unicode strings
> to work, i.e. expect run_command('g.region', flags='p') to work (not just
> run_command(b'g.region', flags=b'p')).
>
> * If Python libraries are unicode, there will need to be an interface to
> work with ctypes which would add to existing code for transferring from C
> world to Python and back.
>
> * If Python libraries are bytes, there will need to be an interface to work
> with GUI in unicode as well as with users of the API who will expect unicode
> to work. In other words, internally it would use bytes, but interface must
> be both bytes (for modules and internal use) and unicode (for GUI and
> users).
>
> * Having a unicode-based library means encoding and decoding on any
> "external" interface such as file reading or ctypes.
>
> * Having a bytes-based library means encoding and decoding on any Python 3
> interface which requires unicode, such as os.environ, and additionally
> rewriting all string literals ("abc") to bytes (b"abc").
>
> * It seems hard to predict when we will know the right encoding of the text.
> It seems that we will need it with any solution, since garbage-in-garbage-out
> stops working when you need to use some system interface function in
> Python 3 which requires unicode. Although e.g. sys.stdout.write() has a
> (less generic) sys.stdout.buffer.write() alternative, os.environb does not
> work on MS Windows.
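
A short sketch of two of the points in the list above, using only the Python 3 standard library: slicing unicode counts characters while slicing bytes counts bytes, and os.environ accepts only unicode. The sample string and the UTF-8 encoding are assumptions chosen for illustration.

```python
import os

text = "Zürich élévation"
data = text.encode("utf-8")  # assumption: UTF-8 encoded input

# Unicode is indexed by character, bytes by byte.
print(len(text), len(data))
print(text[:2])  # two characters
print(data[:2])  # two bytes: 'Z' plus the first half of 'ü'

# Slicing bytes can cut a multi-byte character in half.
try:
    data[:2].decode("utf-8")
    truncated_ok = True
except UnicodeDecodeError:
    truncated_ok = False

# os.environ rejects bytes keys and values in Python 3.
try:
    os.environ[b"abc"] = b"def"
    bytes_env_ok = True
except TypeError:
    bytes_env_ok = False

os.environ["abc"] = "def"  # unicode works
print(truncated_ok, bytes_env_ok)
```

A bytes-based library would therefore need the encoding at hand every time it touches such an interface, not only when text is displayed.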
>
> An example fix in r71849 is done using a (custom) decode function which
> creates unicode (the standard string in Python 3) when file content is read.
> An alternative to this change would be changing all the strings in the file
> to bytes (b'abc' as opposed to 'abc').
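
To make the two alternatives concrete, here is a rough sketch that is not the code from r71849. The `decode` helper and the file content are hypothetical, and UTF-8 is an assumed encoding.

```python
import os
import tempfile


def decode(data, encoding="utf-8"):
    """Return unicode text; bytes are decoded, str passes through."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# Write some bytes, standing in for a file produced elsewhere.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write("name=élévation\n".encode("utf-8"))
    path = handle.name

with open(path, "rb") as handle:
    raw = handle.read()
os.remove(path)

# Option 1: decode once when reading, keep ordinary 'abc' literals.
content = decode(raw)
key, value = content.strip().split("=")

# Option 2: stay in bytes, so every literal must become b'abc'.
key_b, value_b = raw.strip().split(b"=")

print(key, value)
print(key_b, value_b)
```

The first option keeps the rest of the file unchanged at the cost of knowing the encoding at read time; the second postpones that question but touches every string literal.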
>
> Please comment or link other related discussions.
>
> Thanks,
> Vaclav
>
>
> python3 -c "import os; os.environ[b'abc'] = b'def'"
> python3 -c "import os; os.environb[b'abc'] = b'def'"
> python3 -c "import sys; sys.stdout.write(b'abc\n')"
> python3 -c "import sys; sys.stdout.buffer.write(b'abc\n')"
> python3 -c "import os; print(type(os.name))"
> https://trac.osgeo.org/grass/changeset/71849
> https://trac.osgeo.org/grass/ticket/2708
> https://trac.osgeo.org/grass/ticket/3392
> https://trac.osgeo.org/grass/query?status=!closed&keywords=~python3
> https://trac.osgeo.org/grass/query?status=!closed&keywords=~encoding
> https://trac.osgeo.org/grass/query?status=!closed&keywords=~unicode
>
> _______________________________________________
> grass-dev mailing list
> grass-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/grass-dev

