[gdal-dev] RFC 30: Unicode Filenames - call for discussion

Ari Jolma ari.jolma at gmail.com
Tue Sep 21 16:33:14 EDT 2010


On 09/21/2010 10:01 PM, Frank Warmerdam wrote:
> Ari Jolma wrote:
>> The idea of this RFC, as I understand it, is to build a layer into
>> GDAL that takes care of conversions between utf-8 and utf-16
>> (Windows end) transparently, making Windows behave like the current
>> case of a utf-8 filesystem on unix. Everything should work fine as it
>> is now, but I'll add an encode step (to utf8 by default) to be on the
>> safe side.
>>
>> In the case of unix with a non-utf8 filesystem, determining the
>> filename encoding is left to the user. The encoding is utf8 by
>> default but can be changed.
>
> Ari,
>
> I'm a bit uncertain about where we stand on Perl.  Is it true that
> currently the filenames are just treated as "plain strings" in Perl

"Plain strings" in Perl are stored in Perl's internal format, which can
handle Unicode text.

> and that these strings have no obvious character set or encoding
> associated with them?  If so, I'm not sure that "encoding to utf-8 by
> default" will necessarily make sense if they are already in utf-8.  If
> you "encode to utf-8", is it assumed the encoding is being done from
> whatever the locale charset is?

Basically, Perl should be told the encoding of text when text is given
to it, and told which encoding to produce when text is wanted from it.
If GDAL functions assume and produce text in a known encoding, all is
well, since we can tell that to Perl.
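
A minimal sketch of that decode-on-input, encode-on-output idea follows.
It is written in Python rather than Perl only to keep it short, and the
function names (to_gdal_filename, to_windows_wide) are hypothetical
helpers for illustration, not part of GDAL or of the Perl bindings. It
assumes GDAL expects utf-8 filenames, as the RFC proposes.

```python
def to_gdal_filename(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Decode bytes from a stated encoding, then encode to UTF-8.

    Hypothetical helper: the caller must say what encoding `raw` is in,
    because the bytes themselves do not carry that information.
    """
    text = raw.decode(source_encoding)  # bytes -> Unicode text
    return text.encode("utf-8")         # Unicode text -> UTF-8 bytes


def to_windows_wide(utf8_name: bytes) -> bytes:
    """Convert a UTF-8 filename to UTF-16 (little endian), the kind of
    conversion a Windows-side layer would do (assumed, for illustration)."""
    return utf8_name.decode("utf-8").encode("utf-16-le")


# A filename that arrives in Latin-1, e.g. from a non-utf8 unix filesystem.
latin1_name = "j\u00e4rvi.tif".encode("latin-1")
utf8_name = to_gdal_filename(latin1_name, "latin-1")

# Stating the wrong source encoding for bytes that are already UTF-8
# encodes them a second time and corrupts the name.
double_encoded = to_gdal_filename(utf8_name, "latin-1")
```

With the source encoding stated correctly, utf-8 input passes through
unchanged, so encoding to utf-8 is safe for names that are already
utf-8. The damage only happens when the stated encoding is wrong, as in
the last line.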

Best regards,

Ari
