[gdal-dev] Shapefiles encoded in UTF-16 ?
Even Rouault
even.rouault at spatialys.com
Thu Feb 21 05:01:23 PST 2019
> I have never encountered a shapefile in UTF-16, but I am beginning to wonder
> if we ought to support them.
As long as nobody has ever seen such a file, that remains a rather theoretical
exercise :-)
> I guess they would be more space-efficient for
> languages like Chinese and Japanese, where most characters need three UTF-8
> bytes but only two UTF-16 bytes. This could be important since DBF reserves
> only 10 bytes for field names.
>
> Some questions:
>
> Can the OGR Shape driver handle UTF-16?
Probably not. I guess it would have issues with the NUL bytes that UTF-16 uses
to encode characters of the ASCII subset. The shapelib DBF API assumes
NUL-terminated strings.
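
To illustrate, here is a minimal Python sketch of the problem. It is not
shapelib code: c_string_view is a hypothetical helper that only mimics how a
reader of NUL-terminated strings scans a buffer, and the sample names are made
up. It also shows the byte counts behind the space-efficiency argument.

```python
def c_string_view(buf: bytes) -> bytes:
    """Return the bytes a NUL-terminated string reader would see in buf.

    Hypothetical helper for illustration only; it stops at the first
    0x00 byte, as C string functions do.
    """
    end = buf.find(b"\x00")
    return buf if end == -1 else buf[:end]


ascii_name = "NAME"   # made-up ASCII field name
cjk_name = "地名"      # made-up field name with two CJK characters

# In UTF-16, every ASCII character carries a 0x00 byte.
utf8_ascii = ascii_name.encode("utf-8")
utf16_ascii = ascii_name.encode("utf-16-le")

seen_utf8 = c_string_view(utf8_ascii)     # whole name survives
seen_utf16 = c_string_view(utf16_ascii)   # cut at the first NUL byte

# Byte counts for the CJK name in each encoding.
cjk_utf8_len = len(cjk_name.encode("utf-8"))
cjk_utf16_len = len(cjk_name.encode("utf-16-le"))

print(utf16_ascii, seen_utf16, cjk_utf8_len, cjk_utf16_len)
```

With these inputs the UTF-16 name is cut off after its first byte, while the
CJK name takes 6 bytes in UTF-8 and 4 bytes in UTF-16.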
>
> (I also wonder if shapefiles in UTF-16 is a good idea, or if the GIS
> community just ought to forget about them, but I guess there is no definite
> answer to that!)
I'd say unless such beasts are widely found in the wild, let's not bother too
much about that...
Even
--
Spatialys - Geospatial professional services
http://www.spatialys.com