[gdal-dev] UTF-8 String Support in GDALOpen() and OGRSFDriverRegistrar::Open()

Frank Warmerdam warmerdam at pobox.com
Fri Sep 4 12:07:16 EDT 2009

Even Rouault wrote:
> Louis, Chaintanya,
> I just wanted to mention that the topic of encoding for filenames handled by GDAL
> or OGR is a known issue that has not been addressed yet. You can read
> http://trac.osgeo.org/gdal/wiki/rfc5_unicode which was a proposal but has not
> been implemented. Some infrastructure for re-encoding has been introduced during
> the implementation of http://trac.osgeo.org/gdal/wiki/rfc23_ogr_unicode (but
> RFC23 only addresses the issue of encoding in OGR field values, not for
> filenames).
> My understanding is that:
> * on Windows, the current API used by GDAL/OGR does not expect UTF-8 or Unicode
> but ANSI.
> * on Linux systems, UTF-8 is now assumed.


I wonder if we should implement some mechanism to support UTF-8 filenames
on Windows (and generally) before the GDAL 1.7 release?

How dangerous would it be for us to always assume filenames are UTF-8 and
act accordingly?

One theoretical downside to treating filenames as UTF-8 is that we do a lot
of filename parsing that has no concept that some bytes in the name might
be part of a multi-byte sequence.  So if there was a UTF-8 multi-byte
character that happened to include ASCII 92 '\' or ASCII 47 '/' it would
confuse the path parsers.  Also for subdatasets, database connections and
other esoteric datasource names we do a lot of parsing - splitting on
spaces, commas, quotes and other special characters.  Some of this could be
confused by unfortunate UTF-8 characters.  I suppose we really ought to
be migrating to doing these manipulations on wchar_t's or perhaps UCS-4.
Hmm, this is getting rather complicated to address fully.

But at least as a hack we could provide a build or runtime mechanism to
tell the cpl_vsil_win32.cpp code that the passed-in filename should be
handled as UTF-8 instead of local code page characters on Windows.  Would
that be worth implementing?
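
A rough sketch of what such a runtime switch might look like, written in
Python for brevity rather than the C++ of cpl_vsil_win32.cpp. The option name
FILENAME_IS_UTF8, the helper name and the cp1252 fallback are all assumptions
made for illustration, not existing GDAL names. On Windows the equivalent step
would be converting the bytes to UTF-16 (for example with MultiByteToWideChar
using CP_UTF8 rather than CP_ACP) and then calling the wide-character file
APIs.

```python
import os


def filename_to_text(raw, environ=os.environ, ansi_codepage="cp1252"):
    """Decode a byte-string filename according to a runtime switch.

    FILENAME_IS_UTF8 is a hypothetical option name; cp1252 stands in for
    whatever the local ANSI code page happens to be.
    """
    flag = environ.get("FILENAME_IS_UTF8", "NO").upper()
    if flag in ("YES", "TRUE", "ON", "1"):
        # Caller promises UTF-8: decode strictly so bad input fails loudly.
        return raw.decode("utf-8")
    # Current behaviour: interpret the bytes in the local code page.
    return raw.decode(ansi_codepage)


raw_name = "é.tif".encode("utf-8")  # b'\xc3\xa9.tif'
as_utf8 = filename_to_text(raw_name, {"FILENAME_IS_UTF8": "YES"})
as_ansi = filename_to_text(raw_name, {"FILENAME_IS_UTF8": "NO"})
```

With the switch on, the name round-trips as "é.tif". With it off, the same
bytes are read as two cp1252 characters, which is the mismatch users would see
today when passing UTF-8 names on Windows.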

Best regards,
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent
