[gdal-dev] RFC 30: Unicode Filenames - call for discussion

Frank Warmerdam warmerdam at pobox.com
Tue Sep 21 15:59:06 EDT 2010


Christopher Barker wrote:
> Frank,
> 
> This looks great!
> 
> One comment about the python bindings:
> 
> """
> In theory functions that return filenames, such as gdal.ReadDir() 
> should return unicode strings for filenames, but from my perspective it 
> seems adequate to always return utf-8 strings and let the application 
> translate if needed.
> """
> 
> I think that's a mistake, if an app gets back a utf-8 string it has no 
> way of knowing what the heck it is (except by knowing the GDAL 
> convention). So it will essentially always have to translate. And it's 
> ripe for bugs: if, when testing, the utf-8 strings happen to be ascii 
> compatible, things will just work, and then break when some odd 
> character is inserted later in production.
> 
> Better for GDAL to return a unicode object. If the user really needs it 
> as a byte string, they can convert, but the normal stuff like:
> 
> file()
> os.path.*

Chris,

I guess my concern is whether there will be significant backward
compatibility problems if the GDAL interfaces that previously returned
regular strings start returning unicode strings in GDAL 1.8.  However,
thinking about it, there aren't many such interfaces that were previously
exposed through GDAL.

On further consideration I agree with you and have updated the RFC to
indicate that APIs like ReadDir() and GetFileList() that return filenames
should return them as unicode objects rather than as regular strings (in
Python).
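
To illustrate the failure mode described above, here is a minimal sketch.
It does not call GDAL at all: read_dir_as_utf8_bytes() and
read_dir_as_unicode() are hypothetical stand-ins for a filename-returning
API, and the file names are made up. It only shows why a byte string that
happens to be ASCII hides a wrong guess about the encoding, while a
unicode object leaves nothing to guess.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-ins for an API that returns filenames.
# These are NOT GDAL functions; they only mimic the two conventions.

def read_dir_as_utf8_bytes(names):
    """Mimic an API that hands back UTF-8 encoded byte strings."""
    return [n.encode("utf-8") for n in names]

def read_dir_as_unicode(names):
    """Mimic an API that hands back unicode (text) objects."""
    return list(names)

# Made-up file names: one pure ASCII, one with a non-ASCII character.
names = [u"dem.tif", u"caf\u00e9.tif"]

raw = read_dir_as_utf8_bytes(names)

# A caller unaware of the UTF-8 convention might guess another encoding.
# The ASCII name survives the wrong guess; the other one is corrupted.
guessed = [b.decode("latin-1") for b in raw]

# A caller that knows the convention has to decode explicitly every time.
decoded = [b.decode("utf-8") for b in raw]

# With unicode objects there is nothing for the caller to decode.
direct = read_dir_as_unicode(names)
```

In this sketch the wrong guess goes unnoticed for "dem.tif" and only shows
up once a name such as the second one appears, which is the kind of bug
that slips through testing.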

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent


