[postgis-users] shp2pgsql and encoding issues

Markus Schaber schabi at logix-tt.com
Wed May 25 00:11:50 PDT 2005


Hi, Mike,

Mike Leahy wrote:

> I had the same problem a while ago, and found that if my database is set
> to SQL_ASCII it would work fine...though I think you can only set the
> encoding when the database is created.  I tried using some text editors
> to convert the characters in the script created by the shp2pgsql tool,
> but that didn't work.  I was able to use find/replace to change special
> characters (in Spanish) to unicode-compatible characters, which sort-of
> worked, but I didn't really want to get rid of those characters.  I
> found it was just easier to create my database in SQL_ASCII from the
> start.  Perhaps somebody else has a more effective solution...

One should _never_ use SQL_ASCII, except for existing databases that
cannot be migrated.

By definition, ASCII only covers the code points 0 to 127. PostgreSQL
accepts so-called international characters in a SQL_ASCII database only
"by accident": it passes whole bytes through and performs no encoding
checks under SQL_ASCII. This behaviour exists to support legacy
applications from the time when PostgreSQL did not support encodings at
all.
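
To make the byte issue concrete, here is a small sketch. It assumes iconv
and od are installed, and latin1_to_utf8_hex is just a made-up helper name.
It shows that a single Latin1 byte above 127 turns into a different,
two-byte sequence in UTF-8, which is why bytes stored unchecked under
SQL_ASCII cannot be read reliably by a client expecting another encoding:

```shell
# Hypothetical helper: convert Latin1 (ISO-8859-1) bytes on stdin to UTF-8
# and print the resulting bytes as hex digits, without separators.
latin1_to_utf8_hex() {
    iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

# Byte 0xF1 (octal 361) is the Spanish n-with-tilde in Latin1; it lies
# outside the 0..127 range that ASCII defines.
printf '\361' | latin1_to_utf8_hex
echo
```

The single input byte f1 comes out as the two bytes c3 b1.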

To insert data in an encoding other than your default, you can either
make shp2pgsql encoding-aware using iconv (as explained by strk), or use
a pipe to prepend the appropriate encoding setting command to your
input. That is what I did to insert Latin1 data into a UNICODE database,
forcing PostgreSQL to do the conversion:

(echo 'set client_encoding to latin1;' ; shp2pgsql [some options]) | psql yourdatabase
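
If you do this often, the same idea can be wrapped in a small shell
function. This is only a sketch: with_client_encoding is a name I made up,
not something shipped with shp2pgsql or psql, and it assumes the encoding
name you pass is one PostgreSQL recognises.

```shell
# Hypothetical helper: emit a SET command for the given client encoding,
# then copy everything from stdin through unchanged.
with_client_encoding() {
    printf "SET client_encoding TO '%s';\n" "$1"
    cat
}

# Intended use, with your own shp2pgsql options and database name:
#   shp2pgsql [some options] | with_client_encoding latin1 | psql yourdatabase
```

The SET line reaches psql before the first INSERT, so the server converts
every following string from the declared client encoding to the database
encoding.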

HTH,
Markus




