[gdal-dev] Issues when exporting from PostGIS to shape

Peter Hopfgartner peter.hopfgartner at r3-gis.com
Sat Mar 13 16:41:15 EST 2010

Frank Warmerdam wrote:
> Peter Hopfgartner wrote:
> (...)
>>>> ii) the resulting shape file has a language identifier (LDID) set 
>>>> to 0x57 (ANSI), but it really is UTF-8 [1][2].
>>> That is correct. The Shapefile driver is currently unaware of
>>> encoding issues and always marks the generated dbf files with the
>>> default (ANSI) setting. Internally OGR attempts to manage text
>>> attributes in UTF-8, and the postgres driver does honour that.
>>> The Shapefile driver really needs to be upgraded to be encoding aware;
>>> however, there are manpower and technical issues around how to do that
>>> properly.
>> Where could one start to work on this issue? Is this in the realm of 
>> shapelib? We did some analysis on this in our company and maybe we 
>> can help with this.
> Some work would need to be done within shapelib's dbfopen.c code to
> read and write the encoding indicators.  Some work would need to be
> done in the OGR shapefile driver to translate to UTF-8 when reading
> and to translate to the target encoding on output with a creation
> option to control encoding.
> I am very concerned about compatibility issues, so some care would
> be necessary.
> There are tickets on the issue that may have some information.
> Best regards,
Hello Frank,

As far as I can tell from some simple test programs (see attachment), 
shapelib already handles the LDID and the cpg file perfectly well, both 
when reading and when writing.
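
For anyone who wants to check what a given file claims about its 
encoding without going through shapelib, here is a minimal Python sketch. 
It is not one of the attached test programs and does not use shapelib; it 
only relies on the LDID being the byte at offset 29 of the dbf header and 
on the cpg file being a small text file next to the dbf. The function 
names are made up for this illustration.

```python
import os

LDID_OFFSET = 29  # byte 29 of the dBase header holds the language driver ID


def read_ldid(dbf_path):
    """Return the LDID byte of a .dbf file as an integer."""
    with open(dbf_path, "rb") as f:
        header = f.read(32)  # the fixed part of the dBase header is 32 bytes
    if len(header) < 32:
        raise ValueError("file too short to contain a dBase header")
    return header[LDID_OFFSET]


def read_cpg(dbf_path):
    """Return the stripped contents of the .cpg sidecar, or None if absent."""
    cpg_path = os.path.splitext(dbf_path)[0] + ".cpg"
    if not os.path.exists(cpg_path):
        return None
    with open(cpg_path, "r", encoding="ascii", errors="replace") as f:
        return f.read().strip()


def describe_encoding(dbf_path):
    """Collect both encoding hints without deciding which one wins."""
    return {"ldid": read_ldid(dbf_path), "cpg": read_cpg(dbf_path)}
```

A file exported as described earlier in this thread would report an LDID 
of 0x57 even though the attribute bytes are UTF-8. Which of the two hints 
should take precedence when they disagree is left open here.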

I will try to have a look at the OGR code in the next few days.



