[Gdal-dev] Re: OGR - Character Encodings
Charlie Savage
cfis at interserv.com
Fri Oct 14 14:52:30 EDT 2005
> As you suspect, OGR is completely encoding-ignorant currently.
> I would encourage you to commit the change setting the encoding to
> LATIN1 for now, as I gather that is a more inclusive character set than
> the default UTF8.
Actually, I would say UTF8 is more inclusive, since the lower 128 code
points match ASCII and beyond that you can encode any Unicode character
(at a cost of 2 to 4 bytes per non-ASCII character under RFC 3629; the
original UTF8 design allowed sequences of up to 6 bytes).
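To make the byte counts concrete, here is a minimal sketch of a UTF8 encoder in C. The name utf8_encode is hypothetical and is not part of OGR; the sketch assumes the RFC 3629 definition, which caps a sequence at 4 bytes and rejects surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an OGR function.
 * Encode code point `cp` as UTF-8 into `out`, which must have room for
 * at least 4 bytes. Returns the number of bytes written, or 0 if `cp`
 * is not a valid Unicode scalar value. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp < 0x80) {                      /* 0xxxxxxx: identical to ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 11110xxx followed by 3 x 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

For example, 'A' (U+0041) stays one byte, 'é' (U+00E9) becomes the two bytes C3 A9, and the euro sign (U+20AC) becomes the three bytes E2 82 AC.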
I'll give UTF8 a try, it should work. If it does, that would be a
better choice in the long run. However, without the rest of the code
having support for it, it's kind of a moot point (I would guess any
multibyte UTF8 characters would get chomped by the string handling code
currently used).
>> Second, what happens when you want to load maps for Asian countries? Is
>> that a no-go at the moment?
>
> OGR provides no special support for this. In cases where double
> byte text has been encountered it is treated as if it were single byte
> which will presumably not work well with Postgres.
Well, if Postgres knows what is coming it will be ok (back to the
original issue!).
I'm more worried about all the string handling code that loops over
char* pointers - I think there are places where the implicit
assumption is that a character is always 1 byte long. Thus I would
guess any multibyte string would get mangled long before you tried to
post it to PostgreSQL, or any other data source for that matter.
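As an illustration of the kind of mangling I mean, here is a small sketch (hypothetical helper names, not OGR code) that assumes the input is valid, NUL-terminated UTF8. It shows that the byte length and the character count differ, and how a truncation has to back up to a character boundary instead of cutting at an arbitrary byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, not OGR functions. Both assume valid UTF-8. */

/* Count characters (code points) by skipping continuation bytes,
 * which always have the bit pattern 10xxxxxx. */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Return the largest byte length <= max_bytes at which `s` can be cut
 * without splitting a multibyte sequence. */
size_t utf8_safe_cut(const char *s, size_t max_bytes)
{
    size_t len;
    assert(s != NULL);
    len = strlen(s);
    if (len <= max_bytes)
        return len;                       /* nothing to cut */
    len = max_bytes;
    /* s[len] is the first byte that would be dropped. If it is a
     * continuation byte, the cut falls inside a character, so back up
     * until the cut lands on the start of a sequence. */
    while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
        --len;
    return len;
}
```

With "caf\xC3\xA9" (café), strlen() reports 5 bytes but there are only 4 characters, and a naive cut at 4 bytes would leave a dangling C3 lead byte; utf8_safe_cut backs up to 3 bytes instead.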
> There are no plans currently to support encoding-awareness in OGR.
>
> /me buries his head in the sand for a couple more years...
LOL. I wonder how much work it would really be though. Maybe one
approach would be to update the core OGR string handling code to be
encoding aware, thereby limiting the scope in the first go.
Then you could update various drivers, as the need arose, to be encoding
aware. And probably there are a number of datasources which just assume
LATIN1 anyways, so you wouldn't have to touch them (out of curiosity,
does Shape support encodings?).
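If the core did settle on UTF8, a driver whose source is known to be LATIN1 would mostly need a conversion step when reading strings. A rough sketch of that step, assuming ISO-8859-1 input and a caller-supplied buffer (latin1_to_utf8 is a hypothetical name, not an existing OGR function):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an OGR function.
 * Convert NUL-terminated ISO-8859-1 text in `src` to UTF-8 in `dst`.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 if `dst_size` is too small. */
size_t latin1_to_utf8(const char *src, char *dst, size_t dst_size)
{
    size_t o = 0;
    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        size_t need = (c < 0x80) ? 1 : 2;

        if (o + need >= dst_size)         /* keep one byte for the NUL */
            return (size_t)-1;

        if (c < 0x80) {
            dst[o++] = (char)c;           /* ASCII passes through unchanged */
        } else {
            /* Latin-1 0x80..0xFF maps to U+0080..U+00FF: two UTF-8 bytes */
            dst[o++] = (char)(0xC0 | (c >> 6));
            dst[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    if (o >= dst_size)
        return (size_t)-1;
    dst[o] = '\0';
    return o;
}
```

Since every LATIN1 byte maps to at most two UTF8 bytes, a buffer of 2 * strlen(src) + 1 bytes is always large enough.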
Something to ponder.
Charlie
> ---------------------------------------+--------------------------------------
> I set the clouds in motion - turn up | Frank Warmerdam, warmerdam at pobox.com
> light and sound - activate the windows | http://pobox.com/~warmerdam
> and watch the world go round - Rush | Geospatial Programmer for Rent