[gdal-dev] unicode support in GDAL

Claudiu Cochior Claudiu.Cochior at bentley.com
Fri Aug 13 09:24:20 PDT 2021


Thanks for the advice,

My strings were in UTF-8 but were not displaying correctly, even with a .cpg file. Setting the encoding for the layer did the trick. Thanks.

From: Even Rouault <even.rouault at spatialys.com>
Sent: Thursday, August 12, 2021 12:24 PM
To: Claudiu Cochior <Claudiu.Cochior at bentley.com>; gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] unicode support in GDAL



Claudiu,

OGR supports Unicode strings, and expects/outputs them in UTF-8 as the pivot encoding. See https://gdal.org/development/rfc/rfc23_ogr_unicode.html for details.
For shapefiles, on writing, you'll need to pass the ENCODING=UTF-8 layer creation option (see https://gdal.org/drivers/vector/shapefile.html#layer-creation-options), or another value that is compatible with Cyrillic characters (e.g. CP1251).

Demo (in UTF-8 console)

$ cat cyrillic.csv
id,txt
1,"Привет"
$ ogr2ogr cyrillic.shp cyrillic.csv -lco ENCODING=CP1251

$ ogrinfo cyrillic.dbf -al -q

Layer name: cyrillic
Metadata:
  DBF_DATE_LAST_UPDATE=2021-08-12
OGRFeature(cyrillic):0
  id (String) = 1
  txt (String) = Привет
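
As a rough illustration of what happens at the byte level, here is a small sketch using plain Python codecs rather than GDAL itself. It assumes that a layer written without an ENCODING option ends up in a Latin-1 style code page, which would explain the "?" characters reported in the original question; that default is an assumption here, so check the shapefile driver documentation for your GDAL version.

```python
# Illustration only: standard Python codecs, not GDAL calls.
text = "Привет"

# OGR expects field strings in UTF-8 (the pivot encoding).
utf8_bytes = text.encode("utf-8")

# With ENCODING=CP1251 the attribute bytes stored in the .dbf
# would be the CP1251 form of the same text.
cp1251_bytes = text.encode("cp1251")

print(len(utf8_bytes), len(cp1251_bytes))  # 12 6

# Assumption: a Latin-1 target has no Cyrillic letters, so a lossy
# recode has to substitute a placeholder for every character.
latin1_bytes = text.encode("latin-1", errors="replace")
print(latin1_bytes)  # b'??????'

# Reading back with the matching code page restores the text.
roundtrip = cp1251_bytes.decode("cp1251")
print(roundtrip == text)  # True
```

The point is only that the target encoding must be able to represent the characters. UTF-8 and CP1251 both can, a Latin-1 code page cannot.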

Even

On 12/08/2021 at 17:54, Claudiu Cochior via gdal-dev wrote:
Hello everybody,

I have a question related to Unicode strings in GDAL.

We are using GDAL 3.0.4, and at some point we would like to write to a shapefile a string that contains Russian characters. To give you some context: we are in C++, we created an OGRFeature, and we want to set a field to the string that contains the Russian characters. My machine is in English, and the language for non-Unicode programs is set to English. I didn’t find in the GDAL docs a definitive answer on whether GDAL supports Unicode strings for field values. As a test, I converted the System::String to UTF-8, but the shapefile displays only ? for the characters. If I change the language for non-Unicode programs to Russian, then I can safely extract the ANSI string and pass it to the SetField method, and the result is OK.

So, does GDAL support Unicode field string values somehow?

Thanks,


Claudiu

_______________________________________________
gdal-dev mailing list
gdal-dev at lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev

--
http://www.spatialys.com

My software is free, but my time generally not.