[postgis-users] shp2pgsql character set conversions

Stephen Woodbridge woodbri at swoodbridge.com
Sat Mar 12 06:52:21 PST 2011


On 3/12/2011 3:21 AM, Sandro Santilli wrote:
> On Sat, Mar 12, 2011 at 12:57:50AM -0500, Stephen Woodbridge wrote:
>> Hi All,
>>
>> I am trying to load a shapefile that is in UTF-8, but it contains some
>> stray character that is not. This causes a load error and a rollback of
>> the transaction.
>>
>> I notice looking at:
>>
>> man 3 iconv_open
>>
>> that it is possible to add //IGNORE and/or //TRANSLIT to the tocode string.
>
> Someone submitted a patch some time ago supporting this.
> Generally, I think it'd be good to have a switch to set a policy
> on encoding errors, as we have for NULL geometry handling (-N).

Hi strk,

Thanks. I agree that this should be added to the code.

To work around my problem, I applied this hack:

diff -urNad postgis-1.5.1/loader/shp2pgsql-core.c postgis-1.5.1a/loader/shp2pgsql-core.c
--- postgis-1.5.1/loader/shp2pgsql-core.c       2010-02-03 17:42:13.000000000 -0500
+++ postgis-1.5.1a/loader/shp2pgsql-core.c      2011-03-12 00:55:13.000000000 -0500
@@ -87,7 +87,7 @@

         inbytesleft = strlen(inputbuf);

-       cd = iconv_open("UTF-8", fromcode);
+       cd = iconv_open("UTF-8//IGNORE", fromcode);
         if ( cd == ((iconv_t)(-1)) )
                 return NULL;

Data is loading now, so I'll see if that resolves my problem.
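
To try the effect of this change outside the loader, here is a minimal
standalone sketch. It assumes glibc or GNU libiconv, where the //IGNORE
suffix is honored; other iconv implementations may reject it. The helper
name to_utf8_lossy and the 4x output buffer are illustrative assumptions,
not code taken from shp2pgsql.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostGIS): convert a NUL-terminated
 * string from `fromcode` to UTF-8, asking iconv to drop any input it
 * cannot convert. Returns a malloc'd string, or NULL on failure.
 */
char *to_utf8_lossy(const char *input, const char *fromcode)
{
    assert(input != NULL && fromcode != NULL);

    /* "//IGNORE" is an extension of glibc and GNU libiconv. */
    iconv_t cd = iconv_open("UTF-8//IGNORE", fromcode);
    if (cd == (iconv_t)(-1))
        return NULL;

    size_t inbytesleft = strlen(input);
    /* Assumption: 4 output bytes per input byte is enough for the
       encodings of interest here. */
    size_t outsize = inbytesleft * 4 + 1;
    size_t outbytesleft = outsize - 1;
    char *outbuf = malloc(outsize);
    if (outbuf == NULL) {
        iconv_close(cd);
        return NULL;
    }

    /* iconv() takes char **, but it does not modify the input bytes. */
    char *inptr = (char *)input;
    char *outptr = outbuf;
    size_t rc = iconv(cd, &inptr, &inbytesleft, &outptr, &outbytesleft);
    int saved_errno = errno;
    iconv_close(cd);

    /* glibc may still report EILSEQ after skipping the bad bytes and
       consuming all input; accept only that case as success. */
    if (rc == (size_t)(-1) &&
        !(saved_errno == EILSEQ && inbytesleft == 0)) {
        free(outbuf);
        return NULL;
    }

    *outptr = '\0';
    return outbuf;
}
```

Called as to_utf8_lossy(value, "UTF-8"), this should return the value with
the invalid bytes removed. Note that the dropped bytes are lost silently,
so loaded attribute values may differ from the source data.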

-Steve


