[gdal-dev] Usage of GDAL_FILENAME_IS_UTF8 config option
Ray Gardener
rayg at daylongraphics.com
Fri Jan 13 15:08:54 PST 2017
Just to add a small note regarding OS X (and iOS) and UTF-8 filenames:
The HFS+ filesystem stores accented characters in decomposed form, which
can differ from the filename given to an API that creates the file such
as fopen(..., "wb").
Applications that store a filename (e.g. in a preferences or MRU file)
might store it in precomposed form, which won't match what the
filesystem uses and risks a file-not-found error. For example, on iOS,
text input yields strings in precomposed form.
More info is available at
http://stackoverflow.com/questions/6153345/different-utf8-encoding-in-filenames-os-x
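
To make the mismatch concrete, here is a minimal sketch in Python using
the standard unicodedata module. The filename "café.tif" and the helper
same_filename() are made up for illustration, and plain NFD is only an
approximation of what HFS+ does (it uses its own variant of
decomposition), so treat this as a demonstration of the idea rather than
a description of the filesystem's exact behaviour.

```python
import unicodedata


def same_filename(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# "é" as one precomposed code point (U+00E9), e.g. as typed by a user.
precomposed = "caf\u00e9.tif"
# "e" followed by a combining acute accent (U+0301), roughly what a
# decomposing filesystem could hand back in a directory listing.
decomposed = unicodedata.normalize("NFD", precomposed)

print(precomposed == decomposed)               # False: different code points
print(len(precomposed), len(decomposed))       # 8 9
print(precomposed.encode("utf-8").hex(" "))    # UTF-8 bytes, precomposed
print(decomposed.encode("utf-8").hex(" "))     # UTF-8 bytes, decomposed
print(same_filename(precomposed, decomposed))  # True once both are normalized
```

Both strings are valid UTF-8 and display identically, so a stored name
should be normalized to one form before it is compared with a name that
came back from the filesystem.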
Ray
On 1/9/2017, Monday 2:03 AM, Damian Dixon wrote:
> Hi Victor,
>
> If you set GDAL_FILENAME_IS_UTF8 to YES then you need to pass in
> filenames and paths encoded as UTF8.
>
> This means that on Windows you will need to do additional work to
> convert from MBCS or UTF16/UCS2 to UTF8.
>
> If your application is built as MBCS then what you essentially have is
> a multi-byte string encoding which is the Windows local code page.
>
> If your application is built for Unicode then you have UTF16/UCS2 so
> you have to convert the filenames to UTF8 for GDAL/OGR to work.
>
> If you save the filenames as part of an application specific
> configuration then you need to consider how you will read that
> data back in if the Windows code page changes. This is not an easy
> task unless you also save the code page as well. It also becomes a bit
> of a mess supporting this on non-Windows.
>
> The approach we took was to convert our Windows applications to
> Unicode and store/use all paths/filenames as UTF8 for portability to
> Linux/Android/Solaris.
>
> Regards
> Damian
>
> PS. Microsoft has deprecated the MBCS build of MFC.
>
>
>
> On 9 January 2017 at 09:31, Poughon Victor <Victor.Poughon at cnes.fr> wrote:
>
> Hi,
>
> We are using GDAL in OTB and recently we had a bug report about opening
> non-ASCII filenames on Windows 10 [0]. They suggest a fix using:
>
> > CPLSetConfigOption("GDAL_FILENAME_IS_UTF8","NO");
>
> The test case is GDALOpen() on a file named 你好.tif, which I confirmed
> works fine on Linux, but not on Windows 7 or 10.
>
> So my question is to have some clarification on this option, to know if
> it's potentially the correct fix for this problem. The doc says:
>
> > This effectively restores the pre-GDAL1.8 behavior for handling
> > filenames on Windows and might be appropriate for applications that
> > treat filenames as being in the local encoding.
>
> What does it mean exactly to consider filenames to be in the local
> encoding? And how do I know if my application [1] does that?
>
> Cheers,
>
> [0] https://github.com/orfeotoolbox/OTB/pull/14
> [1] https://github.com/janestar/OTB/blob/f6ffdc17ab3d7aa91726f03ed619fee806eb508a/Modules/IO/IOGDAL/src/otbGDALDriverManagerWrapper.cxx#L55
>
> Victor Poughon
>
>
>
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org <mailto:gdal-dev at lists.osgeo.org>
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
> <http://lists.osgeo.org/mailman/listinfo/gdal-dev>