[gdal-dev] Unicode support in OGR Shape/DBF
Even Rouault
even.rouault at mines-paris.org
Tue Sep 6 17:08:05 EDT 2011
On Tuesday 06 September 2011 19:42:18, Hilda Villegas wrote:
> Hi,
>
> I'm trying to use the preliminary encoding support for shapefile/dbf as
> you said in the Ticket #882, SHAPE_ENCODING configuration variable can
> be used to override the interpretation, but I cannot find the valid
> values for this SHAPE_ENCODING anywhere, What value should I use if I
> want to write Unicode characters (UTF-8) in the DBF?
>
To override the encoding when *writing*, you should use the ENCODING layer
creation option:
ogr2ogr out.shp indatasource -lco ENCODING=UTF-8
Note: in that case, the input datasource must already be encoded in UTF-8,
which is the pivot encoding for OGR. The effect of -lco ENCODING=UTF-8 is
essentially to write a .cpg file with UTF-8 as its content. Apart from
values of the form LDID/a_numeric_value, where a_numeric_value is a value in
the first column of table 9 of
http://www.autopark.ru/ASBProgrammerGuide/DBFSTRUC.HTM (from 1 to 204), it is
not entirely clear which other values of the ENCODING parameter are
interoperable with other systems.
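
Since the ENCODING option only declares the encoding and does not convert the
data, it can be worth checking beforehand that the input text really is UTF-8.
The following is a minimal sketch of such a check using iconv, not something
OGR does itself. The sample strings and the variable names valid_check and
invalid_check are made up for illustration.

```shell
# Assumption: iconv exits with a non-zero status when its input is not
# valid in the declared source encoding (here UTF-8).

# "café" encoded as UTF-8 (é = 0xC3 0xA9) should be accepted.
if printf 'caf\303\251 au lait' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    valid_check=ok
else
    valid_check=invalid
fi

# The same word encoded as ISO-8859-1 (é = 0xE9) is not valid UTF-8.
if printf 'caf\351 au lait' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    invalid_check=ok
else
    invalid_check=invalid
fi

echo "UTF-8 sample: $valid_check, ISO-8859-1 sample: $invalid_check"
```

If the second case applies to your data, the strings need to be recoded to
UTF-8 before they reach OGR.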
The SHAPE_ENCODING configuration option/environment variable is to be used when
you want to override the encoding indicated in the .dbf/.cpg file when reading
a shapefile. It can be set to any valid value recognized by the iconv library,
whose list you can get with iconv -l on a system with iconv binaries
installed. In that case, OGR will recode from SHAPE_ENCODING to UTF-8. If you
want no recoding to happen, you can set SHAPE_ENCODING="".
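
To picture the recoding that happens on read, here is a small sketch that
uses iconv directly on one sample byte. It assumes a source encoding of
ISO-8859-1. The sample byte, the variable name recoded_hex and the file name
in.shp in the comments are placeholders, not taken from your data.

```shell
# Hypothetical sample: the single byte 0xE9 (octal 351) is "é" in ISO-8859-1.
# Recode it to UTF-8, the way OGR is described as doing when
# SHAPE_ENCODING=ISO-8859-1 is set, then dump the result as hex.
recoded_hex=$(printf '\351' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')
echo "$recoded_hex"

# Corresponding OGR invocations (not run here; in.shp is a placeholder):
#   SHAPE_ENCODING=ISO-8859-1 ogrinfo -al in.shp   # recode ISO-8859-1 to UTF-8
#   SHAPE_ENCODING="" ogrinfo -al in.shp           # no recoding at all
```

The one input byte becomes the two-byte UTF-8 sequence for the same
character, which is why attribute values can grow in size after recoding.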