[postgis-users] How to fix double-encoded UTF8 characters

Paul Ramsey pramsey at cleverelephant.ca
Thu Dec 17 08:16:25 PST 2020


You could also skip the database entirely: use shp2pgsql to turn the shapefile into SQL text, then pass that text through iconv a couple of times to do the conversion steps you need.
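
As a rough sketch of that idea, assuming the attribute text really is UTF-8 that was misread as Latin-1 and encoded to UTF-8 a second time, a single iconv pass from UTF-8 to ISO-8859-1 undoes the extra layer. The helper name fix_double_utf8 and the file, table, and database names in the commented pipeline are placeholders, not anything from the original dataset.

```shell
#!/bin/sh
# Undo one layer of double encoding: read UTF-8 and write each code point
# back out as a single Latin-1 byte, which restores the original UTF-8 bytes.
fix_double_utf8() {
    iconv -f UTF-8 -t ISO-8859-1
}

# Self-contained check on a sample string (no shapefile needed).
# "ñ" in UTF-8 is the two bytes C3 B1; misreading them as Latin-1 and
# re-encoding to UTF-8 produces the four-byte mojibake "Ã±".
broken=$(printf '\303\261' | iconv -f ISO-8859-1 -t UTF-8)
fixed=$(printf '%s' "$broken" | fix_double_utf8)
printf 'broken: %s\nfixed:  %s\n' "$broken" "$fixed"

# Hypothetical real run (placeholder names: municipios.shp, public.municipios, mydb):
# shp2pgsql -W UTF-8 municipios.shp public.municipios | fix_double_utf8 | psql -d mydb
```

If iconv stops with a conversion error, the data may be only partly double-encoded, or the original misreading may have used Windows-1252 rather than Latin-1; in that case trying CP1252 in place of ISO-8859-1 is worth a shot.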


> On Dec 16, 2020, at 7:30 PM, Hugo Nicolau Barbosa de Gusmão <hugonbgg at gmail.com> wrote:
> 
> I have a dataset (shapefile) with the same problem as the post below:
> 
> "A previous LOAD DATA INFILE was run under the assumption that the CSV file is latin1-encoded. During this import the multibyte characters were interpreted as two single characters and then encoded using utf-8 (again).
> 
> This double-encoding created anomalies like Ã± instead of ñ.
> 
> How to correct these strings?"
> https://stackoverflow.com/questions/11436594/how-to-fix-double-encoded-utf8-characters-in-an-utf-8-table
> 
> 
> However, the solution given is for MySQL, not Postgres. I tried it on Postgres and it didn't work; it only works on MySQL:
> 
> UPDATE tablename SET
>     field = CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8);
> 
> I need to import and fix this shapefile using Postgres because I will need to use PostGIS to do various spatial analyses.
> 
> How can I solve this using postgis?
> 
> Many thanks 
> 
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users


