[gdal-dev] Wrapper string encodings are inconsistent
Michael
mbucari1 at gmail.com
Wed Mar 25 12:00:25 PDT 2026
Every function which returns char** has the "char **CSL" typemap applied,
which causes strings in the returned array to be decoded with UTF-8.
Every function which accepts a char** parameter has either the "char
**options", "char **dict", or "char **dictAndCSLDestroy" typemap applied,
which causes strings in the parameter's array to be encoded with UTF-8.
However, many functions which return a single string value or accept single
strings as arguments do not use UTF-8 encoding. This causes several
inconsistencies in the wrapper's behavior.
For example, strings taken from the UTF-8-decoded string arrays are often
passed straight to other functions which do not encode them as UTF-8.
Some examples:
- AlgorithmRegistry.GetAlgNames() returns a string array of algorithm names
decoded with UTF-8, but AlgorithmRegistry.InstantiateAlg(string algName)
does not encode algName with UTF-8.
- Algorithm.GetArgNames() returns a string array of argument names decoded
with UTF-8, but Algorithm.GetArg(string argName) does not encode argName
with UTF-8.
- GeomCoordinatePrecision.GetFormats() returns a string array of format
names decoded with UTF-8, but
GeomCoordinatePrecision.GetFormatSpecificOptions(string formatName) does
not encode formatName with UTF-8.
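To make the round-trip problem in these examples concrete, here is a minimal sketch in Python. It does not call GDAL or the wrapper. The names NATIVE_NAMES, get_names, and lookup are hypothetical stand-ins, and cp1252 is only an assumed placeholder for whatever non-UTF-8 encoding the single-string path ends up using.

```python
# Hypothetical native-side registry: names stored as UTF-8 bytes.
NATIVE_NAMES = ["r\u00e9sum\u00e9".encode("utf-8"), b"plain"]


def get_names():
    """Stand-in for an array-returning call: each entry is decoded as UTF-8."""
    return [raw.decode("utf-8") for raw in NATIVE_NAMES]


def lookup(name, encoding):
    """Stand-in for a single-string call: the argument is encoded with
    `encoding` before it is compared against the native bytes."""
    return name.encode(encoding) in NATIVE_NAMES


names = get_names()

# Passing the names back as UTF-8 finds every entry.
utf8_hits = [lookup(n, "utf-8") for n in names]

# Passing them back in an assumed legacy encoding only finds the ASCII one.
legacy_hits = [lookup(n, "cp1252") for n in names]

print(utf8_hits, legacy_hits)
```

Pure ASCII names survive either way, which is probably why the mismatch is easy to miss until a name contains a non-ASCII character.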
Also, some functions which return a string array have related functions
which return a single string value, but the strings in the array are
decoded with UTF-8 while the single string values are not. For example,
AlgorithmArg.GetAsStringList() returns an array of strings decoded with
UTF-8, but AlgorithmArg.GetAsString() does not decode its returned string
with UTF-8.
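The same mismatch on the return side can be sketched in a few lines of Python. Again this does not use the wrapper itself. The variable names are hypothetical, and cp1252 is an assumed stand-in for the encoding used by the single-string path.

```python
# The same native UTF-8 bytes, as they might come back from either call.
raw = "Z\u00fcrich".encode("utf-8")

# Array path: decoded as UTF-8, so the text comes back intact.
as_list_element = raw.decode("utf-8")

# Single-string path under the assumed legacy encoding: each byte of the
# two-byte UTF-8 sequence becomes its own character.
as_single_value = raw.decode("cp1252")

print(as_list_element == as_single_value)
```

The two results compare unequal even though they come from identical native bytes, so a value read through one call cannot be reliably matched against the other.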
Finally, many other functions which accept or return single strings without
UTF-8 conversion probably _should use UTF-8_.
Some examples:
- Any "Get*Name" function or "name" property
- Any "Get*Description" function
- Any "Create*", "Delete*", or "Get*" function which accepts a "*name"
parameter
Really, are there _any_ strings which _shouldn't_ be encoded with UTF-8? I
can't find a single reason why every string passed to the wrapper should
not be encoded as UTF-8, or why every string retrieved from the wrapper
should not be decoded with UTF-8.
--
Michael Bucari