[gdal-dev] Shapefiles encoded in UTF-16 ?

Even Rouault even.rouault at spatialys.com
Thu Feb 21 05:01:23 PST 2019


> I have never encountered a shapefile in UTF-16, but I am beginning to wonder
> if we ought to support them.

As long as nobody has actually seen such a file, that remains a rather 
theoretical exercise :-)

> I guess they would be more space-efficient for
> languages like Chinese and Japanese, where most characters need three UTF-8
> bytes but only two UTF-16 bytes. This could be important since DBF reserves
> only 10 bytes for field names.
> 
> Some questions:
> 
> Can the OGR Shape driver handle UTF-16?

Probably not. I guess it would have issues with the NUL (0x00) bytes that 
UTF-16 uses when encoding characters of the ASCII range: the shapelib DBF API 
assumes NUL-terminated strings, so such values would likely be cut short.
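
To illustrate the NUL issue, here is a small Python sketch. It is not shapelib 
code: c_string_view is a made-up helper that only mimics where a C routine 
such as strlen() would stop reading, and the sample strings are arbitrary. It 
also shows the byte counts behind the space-efficiency argument.

```python
# Illustration only: emulate how a NUL-terminated C string API would "see"
# UTF-16 bytes. c_string_view is a hypothetical helper, not a shapelib call.

def c_string_view(raw: bytes) -> bytes:
    """Return the bytes up to (not including) the first NUL byte."""
    end = raw.find(b"\x00")
    return raw if end == -1 else raw[:end]

name = "NAME"
utf8 = name.encode("utf-8")
utf16_le = name.encode("utf-16-le")  # little-endian, no BOM
utf16_be = name.encode("utf-16-be")  # big-endian, no BOM

print(utf8)                     # b'NAME'
print(utf16_le)                 # b'N\x00A\x00M\x00E\x00'
print(c_string_view(utf8))      # b'NAME'
print(c_string_view(utf16_le))  # b'N'  (cut at the first NUL)
print(c_string_view(utf16_be))  # b''   (the very first byte is NUL)

# Byte counts for two CJK characters from the Basic Multilingual Plane
cjk = "東京"
cjk_utf8_len = len(cjk.encode("utf-8"))       # 3 bytes per character
cjk_utf16_len = len(cjk.encode("utf-16-le"))  # 2 bytes per character
print(cjk_utf8_len, cjk_utf16_len)            # 6 4
```

So the size gain for CJK text is real, but any ASCII character in the same 
string would be enough to truncate it in code that relies on NUL termination.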

> 
> (I also wonder if shapefiles in UTF-16 is a good idea, or if the GIS
> community just ought to forget about them, but I guess there is no definite
> answer to that!)

I'd say unless such beasts are widely found in the wild, let's not bother too 
much about that...

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

