[gdal-dev] WindowsLatin1 encoded strings

Even Rouault even.rouault at spatialys.com
Mon Jan 19 04:17:11 PST 2015


On Monday 19 January 2015 12:46:20, MITANCHEY Richard - CEREMA/DTecTV/ESI/GNSI 
wrote:
> Hi,
> I need to read WindowsLatin1-encoded strings (MapInfo .tab files), and I
> cannot really convert the original data to UTF-8 beforehand...
> I'm using OGR (GDAL Java bindings) with GetFieldAsString(), but the string
> lengths (and the characters within) are incorrect most of the time.
> Is there any way to specify the read and write string encodings?
> Could this be a problem with the GDAL Java bindings?

Richard,

This is a problem in the TAB driver, which should recode strings to UTF-8 
internally, since UTF-8 is the conventional encoding adopted in OGR.
It is also a problem in the Java bindings, which should offer a binary 
interface for this case, since GetFieldAsString() can only convert native 
UTF-8 strings into Java Unicode strings.
Both issues could potentially be fixed.
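
To illustrate why the characters come out wrong, here is a minimal sketch in
plain Java (no GDAL involved). It assumes that MapInfo's WindowsLatin1
corresponds to the Java charset "windows-1252"; the class and method names are
hypothetical, and the real bindings may handle invalid bytes differently from
the JDK decoder used here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1DecodeSketch {
    // Assumption: MapInfo "WindowsLatin1" maps to Windows code page 1252.
    public static final Charset WINDOWS_LATIN1 = Charset.forName("windows-1252");

    // What effectively happens when Latin1 bytes are treated as UTF-8.
    public static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // What a binary interface would allow: decode with the real encoding.
    public static String decodeAsLatin1(byte[] raw) {
        return new String(raw, WINDOWS_LATIN1);
    }

    public static void main(String[] args) {
        String original = "H\u00e9rault"; // "Herault" with an e-acute
        byte[] raw = original.getBytes(WINDOWS_LATIN1);

        // The e-acute is the single byte 0xE9, which is not valid UTF-8 on
        // its own, so the UTF-8 decoding cannot reproduce the original.
        System.out.println(decodeAsUtf8(raw).equals(original));
        System.out.println(decodeAsLatin1(raw).equals(original));
    }
}
```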

A potential workaround is to convert the .tab into a .shp with ogr2ogr, 
passing --config SHAPE_ENCODING "" so that the Latin1 strings are written 
to the shapefile unmodified. Then read the shapefile: in that case the 
strings are recoded from Latin1 to UTF-8, and you can use GetFieldAsString().
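
The conversion step of the workaround above could be sketched from Java as
follows. The file names communes.tab and communes.shp, and the class and
method names, are hypothetical. The sketch only builds and prints the ogr2ogr
argument list; actually launching it (for example with ProcessBuilder) assumes
ogr2ogr is on the PATH.

```java
import java.util.ArrayList;
import java.util.List;

public class TabToShpSketch {
    // Build the ogr2ogr argument list for the .tab -> .shp conversion.
    public static List<String> buildCommand(String inputTab, String outputShp) {
        List<String> cmd = new ArrayList<>();
        cmd.add("ogr2ogr");
        cmd.add("--config");
        cmd.add("SHAPE_ENCODING");
        cmd.add("");               // empty value: strings are written unmodified
        cmd.add("-f");
        cmd.add("ESRI Shapefile");
        cmd.add(outputShp);        // ogr2ogr takes the destination first
        cmd.add(inputTab);         // then the source
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand("communes.tab", "communes.shp");
        // Show the empty argument explicitly as "" when printing.
        StringBuilder sb = new StringBuilder();
        for (String arg : cmd) {
            sb.append(arg.isEmpty() || arg.contains(" ") ? "\"" + arg + "\"" : arg);
            sb.append(' ');
        }
        System.out.println(sb.toString().trim());
        // To run it: new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

The resulting shapefile can then be opened in a second step and its fields
read with GetFieldAsString().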

Even


> TIA for your answers
> Richard

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com
