[postgis-users] "Linux" geocoder script ?

Don harterc1 at comcast.net
Tue Apr 12 00:08:06 PDT 2011


My database is encoded as UTF8:
  geocoder  | drh      | UTF8     | C         | en_US.UTF-8 |
All my shp2pgsql statements use the -W option, like this:
${loader} -a -s 4269 -g the_geom -W "latin1" $z ${staging_schema}.${state_abbrev}_${table_name} | $PGBIN/psql -d $PGDATABASE;

Here is the bug that I was referring to.
http://trac.osgeo.org/postgis/ticket/808
In one case, a very large number of inserts for the shapefile were 
processed before I got that error.
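
The bytes in the error message quoted below (0xed6f20) can be checked by hand. The following is only a rough sketch, assuming a POSIX shell with iconv and od available; the function names error_bytes and is_valid_in are made up for illustration. It tests whether the three bytes decode as UTF-8, then reinterprets them as ISO-8859-1 (Latin-1) and prints the UTF-8 result in hex.

```shell
#!/bin/sh
# Emit the three bytes from the error message: 0xed 0x6f 0x20.
# Octal escapes are used because POSIX printf does not guarantee \x support.
error_bytes() {
    printf '\355\157\040'
}

# Return success if stdin is valid text in the encoding named by $1.
# Converting to UTF-16 forces iconv to decode every input byte.
is_valid_in() {
    iconv -f "$1" -t UTF-16 >/dev/null 2>&1
}

if error_bytes | is_valid_in UTF-8; then
    echo "valid UTF-8"
else
    echo "not valid UTF-8"
fi

# Reinterpret the same bytes as Latin-1 and show the UTF-8 result as hex.
error_bytes | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1
```

In Latin-1, 0xed is the letter í, so the sequence reads as "ío " followed by whatever came next in the field. That is consistent with Latin-1 text reaching the server without having been converted, though it does not show where in the pipeline the conversion was skipped.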

Your link says:
"To enable automatic character set conversion, you have to tell 
PostgreSQL the character set (encoding) you would like to use in the 
client. There are several ways to accomplish this: "
Perhaps I need to use

SET CLIENT_ENCODING TO 'value';

in psql, or is shp2pgsql supposed to do that when I use the -W option?
It looks as if PostGIS is expecting UTF-8 when it should be expecting Latin-1 and converting it to UTF-8.
Could the data type of a column have some effect on this?
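
One way to look into the question above is to inspect what actually reaches psql. This is a sketch only: the helper name show_client_encoding and the file and table names in the commented usage lines are hypothetical, and it makes no claim about what shp2pgsql emits. It simply filters the first lines of an SQL stream for any client_encoding statement. SHOW client_encoding is standard PostgreSQL and reports the encoding the session is using.

```shell
#!/bin/sh
# Print any client_encoding statements found in the first 20 lines
# of an SQL stream read from stdin.
show_client_encoding() {
    head -n 20 | grep -i 'client_encoding'
}

# Hypothetical usage (file and table names are placeholders):
#   shp2pgsql -a -s 4269 -g the_geom -W "latin1" some_file.shp staging.some_table | show_client_encoding
#
# To see which client encoding a psql session ends up with:
#   psql -d "$PGDATABASE" -c 'SHOW client_encoding;'
```

If the generated SQL already sets the client encoding itself, a SET issued earlier in the psql session would be overridden by it, so checking the stream first avoids guessing.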



On 04/11/2011 08:52 PM, Sylvain Racine wrote:
> Hello,
>
> This is not a shp2pgsql bug. You get this error when you try to insert 
> string data into PostgreSQL in an encoding other than the one of your 
> database. Example: your data is encoded in Latin1 (ISO-8859-1) and you 
> insert it into a UTF-8 database. To fix the error message, you need to 
> convert your data.
>
> PostgreSQL has an internal converter. shp2pgsql has one too. Try 
> shp2pgsql -W <encoding>, where <encoding> is the encoding of your dBase 
> file (.dbf). This is called the "client encoding" in PostgreSQL. See 
> the list of valid encodings:
> http://www.postgresql.org/docs/9.0/static/multibyte.html
>
> Don't confuse it with the database encoding, which is the one you use to 
> create your database. There is also a default database encoding, 
> depending on your OS. It is the one used to create the template1 
> database during initdb. Mine is "UTF8" on Ubuntu.
>
> Hope that this information will help you
>
> Regards
>
> Sylvain Racine
>
> On 2011-04-11 21:22, Don wrote:
>> I have got the tiger2010 geocoder to work on my openSUSE system.
>> geocoder=#
>> geocoder=# SELECT g.rating,
>> geocoder-#         ST_X(geomout) As lon,
>> geocoder-#         ST_Y(geomout) As lat, (addy).*
>> geocoder-# FROM geocode('1731 New Hampshire Avenue Northwest, Washington, DC 20010') As g;
>>  rating |        lon        |       lat        | address | predirabbrev |  streetname   | streettypeabbrev | postdirabbrev | internal |  location  | stateabbrev |  zip  | parsed
>> --------+-------------------+------------------+---------+--------------+---------------+------------------+---------------+----------+------------+-------------+-------+--------
>>       0 | -77.0399013800607 | 38.9134181361424 |    1731 |              | New Hampshire | Ave              | NW            |          | Washington | DC          | 20009 | t
>> (1 row)
>> There are a few glitches.  I noticed that I am getting this message 
>> sometimes.
>> INSERT 0 1
>> INSERT 0 1
>> INSERT 0 1
>> INSERT 0 1
>> ERROR:  invalid byte sequence for encoding "UTF8": 0xed6f20
>> HINT:  This error can also happen if the byte sequence does not match 
>> the encoding expected by the server, which is controlled by 
>> "client_encoding".
>> ERROR:  current transaction is aborted, commands ignored until end of 
>> transaction block
>> ERROR:  current transaction is aborted, commands ignored until end of 
>> transaction block
>> ERROR:  current transaction is aborted, commands ignored until end of 
>> transaction block
>> I researched this some and it appears to be a  shp2pgsql bug.
>> But I am using postgis-utils-2.0.0SVN-1.2.x86_64
>> postgis-2.0.0SVN-1.2.x86_64  where this has supposedly been fixed.  
>> Or could the census data be corrupted?
>> So I have "lost" some of the data due to this error.
>> I also had problems with psql generating Ctrl-M (carriage return) 
>> instead of \n, which would really mess up the script when it ran.
>> So after I generated my load_tiger script, I ran this command:
>> tr "\r" "\n" < load_tiger > load_tiger2
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>