[postgis-users] How to fix double-encoded UTF8 characters

Hugo Nicolau Barbosa de Gusmão hugonbgg at gmail.com
Wed Dec 16 19:30:49 PST 2020


I have a dataset (shapefile) with the same problem as the post below:

"A previous LOAD DATA INFILE was run under the assumption that the CSV file
is latin1-encoded. During this import the multibyte characters were
interpreted as two single character and then encoded using utf-8 (again).

This double-encoding created anomalies like Ã± instead of ñ.

How to correct these strings?"
https://stackoverflow.com/questions/11436594/how-to-fix-double-encoded-utf8-characters-in-an-utf-8-table


However, the solution given there is for MySQL, not PostgreSQL. I tried it
on PostgreSQL and it did not work; it only works on MySQL:

UPDATE tablename SET
    field = CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8);
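
To make clear what that statement is doing, here is a minimal Python sketch
of the same round trip (my own illustration, not taken from the linked post).
It assumes the wrong interpretation was plain ISO-8859-1; if the data was
actually misread as Windows-1252, "cp1252" would have to be used instead.
The function names are made up for the example.

```python
def double_encode(text: str) -> str:
    """Reproduce the fault: UTF-8 bytes misread as Latin-1 characters."""
    return text.encode("utf-8").decode("latin-1")


def fix_double_encoded(text: str) -> str:
    """Undo the fault: recover the original bytes, then decode them as UTF-8."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The string does not look double-encoded, so leave it untouched.
        return text


broken = double_encode("España")
print(broken)
print(fix_double_encoded(broken))
```

The try/except matters on mixed data: a value that was never double-encoded
usually fails one of the two conversions and is returned unchanged.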

I need to import and fix this shapefile in PostgreSQL because I will use
PostGIS to run various spatial analyses.

How can I solve this using postgis?

Many thanks