[gdal-dev] WindowsLatin1 encoded strings

MITANCHEY Richard - CEREMA/DTecTV/ESI/GNSI Richard.Mitanchey at cerema.fr
Mon Jan 19 04:46:58 PST 2015


Even,
Thank you for your excellent suggestion (as always).
I'll try to implement it in my Talend TOS project (somewhat
complicated, but it may work).
Richard

On 19/01/2015 13:17, Even Rouault (via Internet) wrote:
> On Monday 19 January 2015 12:46:20, MITANCHEY Richard - CEREMA/DTecTV/ESI/GNSI
> wrote:
>> Hi,
>> I need to read WindowsLatin1-encoded strings (MapInfo .tab files), and I
>> cannot really convert the original data to UTF-8 beforehand...
>> I'm using OGR (GDAL Java binding) with GetFieldAsString(), but string
>> lengths (and the characters within) are incorrect most of the time.
>> Is there any way to specify the read and write string encodings?
>> Could this be a problem with the GDAL Java binding?
> Richard,
>
> This is a problem of the TAB driver, which should recode strings to UTF-8
> internally, as this is the conventional encoding decided on in OGR.
> It is also a problem of the Java bindings, which should offer a binary interface
> in that case, since GetFieldAsString() can only be used to convert native
> UTF-8 strings into Java Unicode strings.
> Both issues could potentially be fixed.
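
A small Java sketch of the mismatch described above, independent of GDAL. It decodes the same bytes once as UTF-8 (which, per the explanation above, is what GetFieldAsString() expects) and once as windows-1252. The class name, the method names and the sample value are assumptions made for illustration; the byte array merely stands in for a field value as it might be stored in a .tab file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingMismatchDemo {
    // "windows-1252" is the Java charset name for what the thread calls WindowsLatin1.
    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Decodes raw bytes the way a UTF-8-only string API would. */
    static String decodeAsUtf8(byte[] raw) {
        // Malformed UTF-8 sequences are silently replaced with U+FFFD.
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Decodes the same bytes with the encoding they were actually written in. */
    static String decodeAsCp1252(byte[] raw) {
        return new String(raw, CP1252);
    }

    public static void main(String[] args) {
        // Hypothetical field value: E-acute, 't', e-acute encoded in windows-1252.
        byte[] raw = {(byte) 0xC9, 0x74, (byte) 0xE9};
        System.out.println("as UTF-8:        " + decodeAsUtf8(raw));
        System.out.println("as windows-1252: " + decodeAsCp1252(raw));
    }
}
```

Accented bytes in windows-1252 are not valid UTF-8 on their own, so the UTF-8 decoding loses them, which would be consistent with the wrong characters reported in the question.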
>
> A potential workaround is to convert the .tab into a .shp by using --config
> SHAPE_ENCODING "" in ogr2ogr, so that the Latin1 strings are written
> unmodified. Then read the shapefile, in which case the strings will be recoded from
> Latin1 to UTF-8, and you can use GetFieldAsString().
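
The conversion step of this workaround could be driven from Java roughly as follows. This is only a sketch: the file names input.tab and output.shp, the class name and the helper names are hypothetical, and it assumes ogr2ogr is on the PATH and that the platform passes an empty command-line argument through unchanged. Only the --config SHAPE_ENCODING "" part comes from the message above.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class TabToShpWorkaround {
    /** Builds the ogr2ogr call with SHAPE_ENCODING set to the empty string. */
    static List<String> buildCommand(String srcTab, String dstShp) {
        return Arrays.asList(
            "ogr2ogr",
            "--config", "SHAPE_ENCODING", "", // empty value, as suggested in the thread
            "-f", "ESRI Shapefile",
            dstShp,                            // ogr2ogr takes the destination first
            srcTab);                           // then the source
    }

    /** Runs the command and returns its exit code (requires ogr2ogr to be installed). */
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) {
        // Hypothetical file names; the command is only printed here, not executed.
        List<String> cmd = buildCommand("input.tab", "output.shp");
        System.out.println(String.join(" ", cmd));
    }
}
```

Calling run(cmd) would perform the conversion; the resulting shapefile would then be opened with the OGR Java bindings as usual.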
>
> Even
>
>
>> TIA for your answers
>> Richard
