[postgis-users] shp2pgsql encoding problem - Thai Language

Mark Cave-Ayland mark.cave-ayland at siriusit.co.uk
Wed Sep 16 01:30:10 PDT 2009


Ben Madin wrote:

> G'day all,
> 
> I have a set of boundaries, with the locations in Thai. As best as I can 
> tell, the encoding is ISO-8859-11 (not an official encoding, but similar 
> to other latin/non-latin encodings)
> 
> Does that mean I'm stumped when it comes to importing it using 
> shp2pgsql? I normally set -W UTF-8, but in this case I get utf8: Illegal 
> byte sequence. (the db is utf-8)
> 
> If I try using -W WIN874 (the only option in the postgres manual that 
> mentions Thai) I get utf8: iconv_open: Invalid argument
> 
> Is this a hopeless case - do I need to (can I?) edit the .dbf file to 
> remove the columns with the Thai encoding and then type them all back 
> in!? Is there another way?
> 
> cheers
> 
> Ben

Hi Ben,

AFAICT shp2pgsql always outputs UTF-8 encoded text whenever the input 
encoding is specified via the -W option.

Note that shp2pgsql does its conversion with iconv, and not with the 
PostgreSQL conversion routines, so -W expects an iconv encoding name 
rather than a PostgreSQL one (which is why WIN874 gave you iconv_open: 
Invalid argument). If you take a look at the iconv documentation here: 
http://www.gnu.org/software/libiconv/ you can see that ISO-8859-11 is 
supported, so it can be passed as the -W parameter.
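
As a rough illustration of why -W UTF-8 fails while naming the real 
source encoding works, here is a small Python sketch. It does not use 
shp2pgsql or iconv at all; it only assumes that Python's iso8859_11 
codec behaves like the iconv ISO-8859-11 converter, and the sample 
string is made up for the example.

```python
# Hypothetical sample: the Thai word for "Bangkok", written with escapes
# so the example does not depend on the encoding of this file.
text = "\u0e01\u0e23\u0e38\u0e07\u0e40\u0e17\u0e1e"

# Encode it the way a .dbf attribute might store it: one byte per character.
raw = text.encode("iso8859_11")

# Treating those bytes as UTF-8 fails, which is analogous to the
# "Illegal byte sequence" error seen with -W UTF-8.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Declaring the actual source encoding lets the conversion to UTF-8 succeed.
converted = raw.decode("iso8859_11").encode("utf-8")

print("valid as UTF-8:", utf8_ok)
print("bytes before/after conversion:", len(raw), len(converted))
```

The same idea applies to the import: the tool has to be told what the 
bytes currently are, not what you would like them to become.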

Once your output file has been produced, you should be able to run it 
directly into a UTF-8 encoded database without any problems. I do agree 
that the documentation on this is a little vague though (I had to refer 
to the source to see what was happening...)
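
For what it's worth, the two steps could be sketched like this. The 
Python below only builds and prints the command lines, it does not run 
them. The file name boundaries.shp, the table name thai_boundaries, the 
database name gisdb and the helper function names are all placeholders 
I made up, so substitute your own.

```python
import shlex


def shp2pgsql_command(shp_path, table, encoding="ISO-8859-11"):
    """Build the shp2pgsql call; -W names the encoding of the input data."""
    return ["shp2pgsql", "-W", encoding, shp_path, table]


def psql_command(database, sql_path):
    """Build the psql call that loads the generated SQL file."""
    return ["psql", "-d", database, "-f", sql_path]


# Placeholder names, purely for illustration.
convert = shp2pgsql_command("boundaries.shp", "thai_boundaries")
load = psql_command("gisdb", "boundaries.sql")

# shp2pgsql writes SQL to stdout, so redirect it to a file first.
print(shlex.join(convert), ">", "boundaries.sql")
print(shlex.join(load))
```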


ATB,

Mark.

-- 
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


