[Gdal-dev] Wide-character filenames with GDAL file IO?

Ben Discoe ben at vterrain.org
Sat Sep 9 20:31:48 EDT 2006


Hi Frank,

> There is currently no support in GDAL for wide strings, or
> wide string filenames.  As far as I know
> fopen() is the only low level open function used by GDAL
> (well there might be a few exceptions in sublibraries using
> open() instead).

Yes, fortunately the problem seems isolated to fopen(), so hopefully we can
find a single-point fix.  It's good that the resulting FILE* is no different
regardless of the filename's charset or how it was opened.

> > Another option is for me to open the file first, e.g. for Windows:
> >         std::wstring fname;
> >         FILE *fp = _wfopen(fname.c_str(), L"rb");
> >         int fd = _fileno(fp);
> > Then i could pass the fd into GDAL, if it had a way of accepting it.
>
> There is no support for this in GDAL.

That might be a (relatively) painless solution then: for the few places in
GDAL that take a (const char *) filename, add a second entry point that takes
a (FILE *).  I will explore this option, but you may already understand the
ramifications better?
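
To make the idea concrete, here is a rough sketch of what i mean.  Both
OpenWide() and CountBytes() are names i made up for illustration; neither
exists in GDAL.  OpenWide() hides the platform difference and returns a plain
FILE*, and CountBytes() just stands in for any entry point that would accept
an already-opened FILE* instead of a filename.  The non-Windows branch assumes
the caller has already called setlocale() and that the name is representable
in that locale's encoding.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not a GDAL function): open a file by wide-character
// name and return an ordinary FILE*, or nullptr on failure.
std::FILE *OpenWide(const std::wstring &name, const wchar_t *mode)
{
    assert(mode != nullptr);
#ifdef _WIN32
    // The Microsoft C runtime accepts the wide name directly.
    return _wfopen(name.c_str(), mode);
#else
    // Elsewhere, convert the name to the current locale's multibyte
    // encoding and fall back to fopen().
    std::vector<char> narrowName(name.size() * MB_CUR_MAX + 1);
    const std::size_t n =
        std::wcstombs(narrowName.data(), name.c_str(), narrowName.size());
    if (n == static_cast<std::size_t>(-1))
        return nullptr;  // name not representable in this locale

    // Mode strings such as L"rb" are plain ASCII, so a cast is enough.
    std::string narrowMode;
    for (const wchar_t *p = mode; *p != L'\0'; ++p)
        narrowMode += static_cast<char>(*p);

    return std::fopen(narrowName.data(), narrowMode.c_str());
#endif
}

// Hypothetical stand-in for a FILE*-based entry point: it never sees the
// filename, so the name's character set no longer matters to it.
long CountBytes(std::FILE *fp)
{
    long count = 0;
    while (std::fgetc(fp) != EOF)
        ++count;
    return count;
}
```

The application would call OpenWide(fname, L"rb") itself and hand the
resulting FILE* to the library, closing it afterwards.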

> For something like CSLLoad(), it is easy enough to
> reimplement yourself.  But it is a big issue for all the
> files opened through GDALOpen().

Might be.  I hope not too bad.

> Actually, in thinking about it, you could in theory implement
> some sort of special VSI*L handler that knows how to handle
> wide strings.

But the methods above it (like CSLLoad) would still need to accept the wide
string in order to pass it on to the VSI*L handler, i think?

> Are you sure there isn't some way of turning wide string
> filenames into something compatible with
> fopen() on windows?

Nope, fopen() on Windows (and most Unices, AFAIK) assumes that the (const
char *) is in the current charset (local code page, multi-byte).  This means
you can open Chinese-named files on a Chinese-locale OS, and European-named
files on a European-locale OS, but not in any other situation.  That puts a
damper on international file exchange, which is the problem i am trying to fix.

> I have tried opening a variety of
> unusual filenames in different character sets, including
> double byte names, on windows successfully (I think). I
> believe I just passed whatever got passed in as a filename
> after using shell completion.

What shell are you using?
I tried with Cygwin, and this is what i get.  I made a couple of sample files,
one with Western European non-ASCII characters, and one with Chinese
characters.  When i 'ls', the non-ASCII characters are displayed as question
marks:

$ ls
??City.dem  Pa?sCatalu?a.dem

That should be (Bei)(Jing)City.dem, and PaísCataluña.dem (accent on i, tilde
on n).

Using shell completion, it works for the Western European name (since
that's my machine's locale) but fails for the Chinese one.  The shell
apparently tries to make escape sequences:

$ gdalinfo Pa\355sCatalu\361a.dem
Driver: USGSDEM/USGS Optional ASCII DEM (and CDED)
Size is 374, 466
Coordinate System is: [....]

$ gdalinfo \?\?City.dem
ERROR 4: `??City.dem' does not exist in the file system,
and is not recognised as a supported dataset name.
GDALOpen failed - 4

This could be a limitation of Cygwin's default command line, but even if
the command line handled it, we know it would fail when it gets to GDALOpen.

-Ben
