[gdal-dev] WindowsLatin1 encoded strings
Even Rouault
even.rouault at spatialys.com
Mon Jan 19 04:17:11 PST 2015
On Monday 19 January 2015 12:46:20, MITANCHEY Richard - CEREMA/DTecTV/ESI/GNSI
wrote:
> Hi,
> I need to read WindowsLatin1-encoded strings (MapInfo .tab files), and I
> cannot really convert the original data to UTF-8 beforehand...
> I'm using OGR (GDAL Java bindings) with GetFieldAsString(), but the string
> lengths (and the characters within) are incorrect most of the time.
> Is there any way to specify the read and write string encodings?
> Could it be a problem with the GDAL Java bindings?
Richard,
This is a problem in the TAB driver, which should recode strings to UTF-8
internally, since UTF-8 is the conventional encoding adopted in OGR.
It is also a problem in the Java bindings, which should offer a binary
interface for this case, since GetFieldAsString() can only be used to convert
genuine UTF-8 strings into Java Unicode strings.
Both issues could potentially be fixed.
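
To illustrate why Latin1 bytes do not survive a UTF-8 decode, here is a minimal
sketch in plain Java. It does not use the GDAL bindings at all; the class name
Latin1Decoding and the sample bytes are made up for the illustration, and the
exact behaviour inside the bindings may differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1Decoding {
    // Decode raw field bytes with an explicit charset.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        // Hypothetical field value: E-acute (single byte 0xC9 in Windows
        // Latin1) followed by "vry".
        byte[] raw = {(byte) 0xC9, 'v', 'r', 'y'};

        // 0xC9 is not a valid UTF-8 sequence on its own, so the decoder
        // substitutes the replacement character U+FFFD.
        String asUtf8 = decode(raw, StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in recovers
        // the intended text.
        String asLatin1 = decode(raw, Charset.forName("windows-1252"));

        System.out.println("UTF-8 decode contains U+FFFD: "
                + (asUtf8.indexOf('\uFFFD') >= 0));
        System.out.println("windows-1252 decode is correct: "
                + asLatin1.equals("\u00C9vry"));
    }
}
```

If the bindings exposed the raw bytes, the second decode is what a caller would
want to apply.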
A possible workaround is to convert the .tab into a .shp with ogr2ogr, passing
--config SHAPE_ENCODING "" so that the Latin1 strings are written to the
shapefile unmodified. Then read the shapefile instead: in that case the strings
will be recoded from Latin1 to UTF-8, and you can use GetFieldAsString().
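
As a rough sketch of that conversion step, the Java snippet below only
assembles the ogr2ogr argument list and prints it; it does not run anything.
The class name TabToShpCommand and the file names input.tab and output.shp are
placeholders, and the explicit -f "ESRI Shapefile" is an addition of mine.

```java
import java.util.ArrayList;
import java.util.List;

public class TabToShpCommand {
    // Build the ogr2ogr argument list; nothing is executed here.
    static List<String> build(String srcTab, String dstShp) {
        List<String> cmd = new ArrayList<>();
        cmd.add("ogr2ogr");
        cmd.add("--config");
        cmd.add("SHAPE_ENCODING");
        cmd.add("");               // empty value: write strings unmodified
        cmd.add("-f");
        cmd.add("ESRI Shapefile"); // output format
        cmd.add(dstShp);           // ogr2ogr takes the destination first
        cmd.add(srcTab);           // then the source
        return cmd;
    }

    public static void main(String[] args) {
        // Placeholder file names, for illustration only.
        List<String> cmd = build("input.tab", "output.shp");
        System.out.println(String.join(" ", cmd));
    }
}
```

The list could be handed to a ProcessBuilder to run the conversion, assuming
ogr2ogr is on the PATH; how an empty argument is passed through may depend on
the platform.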
Even
> TIA for your answers
> Richard
--
Spatialys - Geospatial professional services
http://www.spatialys.com