[gdal-dev] Are unicode field values supported in the Python bindings for GDAL?

Roy Hyunjin Han starsareblueandfaraway at gmail.com
Wed Apr 11 06:43:06 EDT 2012


On 11 April 2012 03:38, Paolo Corti <pcorti at gmail.com> wrote:
> this will work:
> feat.SetField(0, u'xxx'.encode('utf-8'))

Yes, but you can't decode it after saving the file.

    feature.SetField2() forces conversion using str()
    feature.SetField() tries to convert unicode to latin-1
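
The first failure mode can be shown without GDAL at all. In Python 2, calling str() on a unicode object implicitly encodes it with the ASCII codec; the sketch below triggers the same codec error with an explicit .encode('ascii'), which behaves the same way on Python 2 and Python 3. The helper name ascii_failure is hypothetical and is not part of GDAL or geometryIO.

```python
# -*- coding: utf-8 -*-
WORD = u'Спасибо'


def ascii_failure(text):
    """Return the UnicodeEncodeError raised when text is forced to ASCII.

    Hypothetical helper for illustration only. Returns None when the
    text is pure ASCII and encodes cleanly.
    """
    try:
        text.encode('ascii')
    except UnicodeEncodeError as error:
        return error
    return None


error = ascii_failure(WORD)
# The error records which codec failed and the span of offending characters.
print(error.encoding, error.start, error.end)
```

Any value that has to pass through str() on Python 2 therefore needs to be encoded to bytes explicitly first.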

git clone git@github.com:invisibleroads/geometryIO.git  # Uses feature.SetField2()
cd geometryIO

    from geometryIO import save, load, proj4LL
    from osgeo import ogr
    from shapely.geometry import Point

    WORD = 'Спасибо'.decode('utf-8')

    # Raises exception because feature.SetField2() uses str()
    save('test.shp', proj4LL, [Point(0,0)], [(WORD,)], [('String', ogr.OFTString)])
    # UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)

git checkout 5344d867a6d665cf3961c795aa771f357f5f3275 # Uses feature.SetField()

    from geometryIO import save, load, proj4LL
    from osgeo import ogr
    from shapely.geometry import Point

    WORD = 'Спасибо'.decode('utf-8')

    # Raises exception because feature.SetField() does not recognize unicode
    save('test.shp', proj4LL, [Point(0,0)], [(WORD,)], [('String', ogr.OFTString)])
    # NotImplementedError: Wrong number of arguments for overloaded function 'Feature_SetField'

    # Emits warning that characters cannot be converted to latin-1
    save('test.shp', proj4LL, [Point(0,0)], [(WORD.encode('utf-8'),)], [('String', ogr.OFTString)])
    # Warning 1: One or several characters couldn't be converted correctly from UTF-8 to ISO-8859-1.
    print load('test.shp')[2][0][0]
    #  '???????'
    print load('test.shp')[2][0][0].decode('utf-8')
    #  '???????'
    print load('test.shp')[2][0][0].decode('latin-1')
    #  '???????'

    # Some characters cannot be converted to latin-1
    WORD.encode('latin-1')
    # UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-6: ordinal not in range(256)
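
To illustrate why the saved value cannot be decoded afterwards, here is a minimal sketch in plain Python, independent of GDAL. It assumes the shapefile write path substitutes '?' for every character that has no ISO-8859-1 equivalent; that assumption matches the warning and the '???????' output above, but it is not confirmed from the GDAL source.

```python
# -*- coding: utf-8 -*-
WORD = u'Спасибо'

# UTF-8 can represent every character, so this round trip is lossless.
utf8_bytes = WORD.encode('utf-8')
assert utf8_bytes.decode('utf-8') == WORD

# Latin-1 has no Cyrillic letters. With errors='replace', each character
# that cannot be encoded becomes a literal '?' byte (assumed here to mimic
# what ends up in the file).
latin1_bytes = WORD.encode('latin-1', 'replace')

# Decoding can only return the question marks; the original text is gone.
recovered = latin1_bytes.decode('latin-1')
print(repr(recovered))
```

Once the replacement has happened, no choice of codec on the reading side can bring the original characters back, which is consistent with both decode() calls above printing the same question marks.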

