[Gdal-dev] Re: OGR - Character Encodings

Charlie Savage cfis at interserv.com
Fri Oct 14 14:52:30 EDT 2005


> As you suspect, OGR is completely encoding-ignorant currently.
> I would encourage you to commit the change setting the encoding to
> LATIN1 for now, as I gather that is a more inclusive character set than
> the default UTF8.

Actually, I would say UTF8 is more inclusive: the lower 128 code points
match ASCII, and beyond that it can encode any Unicode character (at the
cost of 2 to 4 bytes for each non-ASCII character).
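
To make that concrete, here is a small sketch in Python (just an
illustration, nothing to do with OGR itself; the helper name
fits_in_latin1 is made up for the example). It shows the byte counts
UTF8 needs for a few characters, that plain ASCII text is identical in
all three encodings, and that LATIN1 simply cannot represent anything
outside its 256 code points. UTF8 as currently specified (RFC 3629)
tops out at four bytes per character.

```python
# Sample characters: 'A', e-acute, the euro sign, and an emoji (U+1F600).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of bytes each character occupies once encoded as UTF-8.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(utf8_lengths)

# Pure ASCII text is byte-for-byte identical in ASCII, LATIN1 and UTF-8.
ascii_same = (
    "OGR".encode("ascii") == "OGR".encode("latin-1") == "OGR".encode("utf-8")
)
print(ascii_same)


def fits_in_latin1(text: str) -> bool:
    """Return True if every character of text can be encoded as LATIN1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_latin1("Z\u00fcrich"))   # u-umlaut is part of LATIN1
print(fits_in_latin1("\u6771\u4eac"))  # Tokyo in kanji is not
```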

I'll give UTF8 a try; it should work.  If it does, that would be a
better choice in the long run.  However, without the rest of the code
having support for it, it's kind of a moot point (I would guess any
multibyte UTF8 characters would get chomped by the string handling code
currently used).

>> Second, what happens when you want to load maps for Asian countries?  Is
>> that a no-go at the moment?
> 
> OGR provides no special support for this.  In cases where double
> byte text has been encountered it is treated as if it were single byte
> which will presumably not work well with Postgres.

Well, if Postgres knows what is coming it will be ok (back to the 
original issue!).

I'm more worried about all the string handling code that loops over
char* pointers - I think there are places where the implicit
assumption is that characters are always 1 byte long.  Thus I would
guess any multibyte string would get mangled long before you tried to
post it to PostgreSQL, or any other data source for that matter.
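
Here is the kind of failure I have in mind, sketched in Python rather
than C (the function names are made up for the example, and this is
not a claim about any particular spot in the OGR code). Cutting a
string at a fixed byte width, as fixed-width field handling tends to
do, can split a multibyte character in half. The second function backs
up to a character boundary by skipping UTF8 continuation bytes (the
ones of the form 10xxxxxx).

```python
def truncate_bytes_naive(data: bytes, max_bytes: int) -> bytes:
    """Cut at a byte offset, assuming one byte per character."""
    return data[:max_bytes]


def truncate_utf8_safe(data: bytes, max_bytes: int) -> bytes:
    """Cut valid UTF-8 data at or before max_bytes, on a character boundary."""
    if len(data) <= max_bytes:
        return data
    end = max_bytes
    # If the first dropped byte is a continuation byte (10xxxxxx), the cut
    # falls inside a character, so back up to that character's lead byte.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


name = "Z\u00fcrich"            # 6 characters
encoded = name.encode("utf-8")  # 7 bytes, the u-umlaut takes two
print(len(name), len(encoded))

naive = truncate_bytes_naive(encoded, 2)
try:
    naive.decode("utf-8")
    naive_is_valid = True
except UnicodeDecodeError:
    # The cut landed in the middle of the two-byte u-umlaut.
    naive_is_valid = False
print(naive_is_valid)

safe = truncate_utf8_safe(encoded, 2)
print(safe.decode("utf-8"))
```

The safe version loses one more byte than asked for, but what is left
is still valid text.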

> There are no plans currently to support encoding-awareness in OGR.
> 
> /me buries his head in the sand for a couple more years...

LOL.  I wonder how much work it would really be though.  Maybe one
approach would be to update the core OGR string handling code to be
encoding-aware first, thereby limiting the scope of the initial pass.

Then you could update various drivers, as the need arose, to be
encoding-aware.  And there are probably a number of data sources which
just assume LATIN1 anyway, so you wouldn't have to touch them (out of
curiosity, does Shape support encodings?).
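
Roughly what I am picturing, again as a Python sketch and purely
hypothetical (recode_to_utf8 is a name I made up, not an existing OGR
function): each driver would state the encoding of the bytes it reads,
defaulting to LATIN1, and the text would be converted to a single
internal encoding at the driver boundary.

```python
def recode_to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Convert bytes in a driver's declared encoding to UTF-8.

    Hypothetical helper: a driver that knows nothing about encodings
    falls back to the LATIN1 default.
    """
    return raw.decode(source_encoding).encode("utf-8")


# A driver that assumes LATIN1 needs no changes: 0xFC is u-umlaut.
latin1_field = b"Z\xfcrich"
utf8_from_latin1 = recode_to_utf8(latin1_field)
print(utf8_from_latin1)

# An encoding-aware driver declares what it reads, e.g. Shift JIS.
sjis_field = "\u6771\u4eac".encode("shift_jis")  # Tokyo in kanji
utf8_from_sjis = recode_to_utf8(sjis_field, "shift_jis")
print(utf8_from_sjis.decode("utf-8") == "\u6771\u4eac")
```

The point of doing it at the boundary is that everything downstream
only ever has to deal with one encoding.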

Something to ponder.

Charlie

> ---------------------------------------+--------------------------------------
> I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
> light and sound - activate the windows | http://pobox.com/~warmerdam
> and watch the world go round - Rush    | Geospatial Programmer for Rent




