[postgis-users] shp2pgsql encoding problem - Thai Language
Mark Cave-Ayland
mark.cave-ayland at siriusit.co.uk
Wed Sep 16 01:30:10 PDT 2009
Ben Madin wrote:
> G'day all,
>
> I have a set of boundaries, with the locations in Thai. As best as I can
> tell, the encoding is ISO-8859-11 (not an official encoding, but similar
> to other latin/non-latin encodings)
>
> Does that mean I'm stumped when it comes to importing it using
> shp2pgsql? I normally set -W UTF-8, but in this case I get utf8: Illegal
> byte sequence. (the db is utf-8)
>
> If I try using -W WIN874 (the only option in the postgres manual that
> mentions Thai) I get utf8: iconv_open: Invalid argument
>
> Is this a hopeless case - do I need to (can I?) edit the .dbf file to
> remove the columns with the Thai encoding and then type them all back
> in!? Is there another way?
>
> cheers
>
> Ben
Hi Ben,
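AFAICT shp2pgsql always outputs UTF-8 encoded text whenever the input
encoding is specified via the -W option, so -W should name the encoding
of your .dbf file rather than that of the database. Passing -W UTF-8
tells shp2pgsql that the Thai text is already UTF-8, which is why you
get the "Illegal byte sequence" error.
Note that shp2pgsql uses its own iconv conversion routines, and not the
PostgreSQL conversion routines, so -W takes iconv encoding names rather
than the names listed in the PostgreSQL manual. That is why WIN874 fails
with "iconv_open: Invalid argument". If you take a look at the iconv
documentation here: http://www.gnu.org/software/libiconv/ you can see
that ISO-8859-11 is supported, so it can be used as the -W parameter.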
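To illustrate the kind of conversion involved, here is a minimal Python sketch. It is not shp2pgsql's code, and the sample word and helper names are only assumptions for the example. It shows that Thai text stored as ISO-8859-11 is not valid UTF-8, and that it converts cleanly once the correct source encoding is named:

```python
# Illustration only: Python's codecs stand in for iconv here.
# Sample Thai word, written with escapes so this source stays ASCII.
thai_text = "\u0e01\u0e23\u0e38\u0e07\u0e40\u0e17\u0e1e"

# Bytes as they might sit in a .dbf attribute stored in ISO-8859-11.
dbf_bytes = thai_text.encode("iso8859_11")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw can be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw using source_encoding and re-encode it as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Treating the bytes as UTF-8 fails, much like the "Illegal byte sequence" error.
print(is_valid_utf8(dbf_bytes))

# Naming the real source encoding gives valid UTF-8 output.
utf8_bytes = to_utf8(dbf_bytes, "iso8859_11")
print(is_valid_utf8(utf8_bytes))
```

The first check fails because the ISO-8859-11 bytes for Thai letters all lie in the 0xA1-0xFB range, and a run of them does not form a valid UTF-8 sequence.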
Once your output file has been produced, you should be able to run it
directly into a UTF-8 encoded database without any problems. I do agree
that the documentation on this is a little vague though (I had to refer
to the source to see what was happening...)
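As a rough sketch of the invocation, the small Python helper below only builds the command line and does not run it. The file name boundaries.shp, the table name thai_boundaries and the helper itself are placeholders I have made up for the example:

```python
import shlex


def shp2pgsql_command(shapefile: str, table: str, source_encoding: str) -> list:
    """Build a shp2pgsql argument list (hypothetical helper, nothing is executed).

    -W names the encoding of the shapefile's attribute data, using an
    iconv encoding name.
    """
    return ["shp2pgsql", "-W", source_encoding, shapefile, table]


# Placeholder file and table names, assumed for illustration.
cmd = shp2pgsql_command("boundaries.shp", "thai_boundaries", "ISO-8859-11")
command_line = shlex.join(cmd)
print(command_line + " > boundaries.sql")
```

The resulting boundaries.sql should then load into the UTF-8 database with something like psql -d yourdb -f boundaries.sql, where yourdb is again a placeholder.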
ATB,
Mark.
--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063
Sirius Labs: http://www.siriusit.co.uk/labs