[postgis-users] Newbie Geocoding Error

Dan Putler dan.putler at sauder.ubc.ca
Sun Jul 10 12:25:39 PDT 2011


Hi,

The zip code is just wrong, 94115 is in located in San Francisco. PAGC 
can make a good probabilistic match on the address largely because Steve 
PAGC service is working with an augmented TIGER database that has has 
both the state and place appended to the edges which PAGC can work with 
for what we call the "macro" address component. With just the "stock" 
TIGER edges, the only macro address component available is the five 
digit zipcode, and if that is wrong, the PostGIS TIGER geocoder 
struggles. PAGC can actually handle the address string 477 Camino del 
Rio South, 94115, and get the correct address as the most likely 
candidate, but this is likely to be luck as much as anything else.

Dan

On 07/10/2011 12:05 PM, Paragon Corporation wrote:
>  Mike,
>
> > 1. the upgrade_geocoder script ran with most of the sql returning 
> 'already exists' errors.  Do I need to modify the script to drop 
> before recreating?
> No you should be fine.  The already exists and already exists 
> skipping issue can be ignored.  I didn't want to destroy some of these 
> structures if they existed because they could be tied to data.
> That is why I don't have the whole upgrade script in a transaction 
> because it would fail the whole thing when it doesn't need to.  
> Unfortunately PostgresQL doesn't yet support the CREATE IF NOT EXISTS 
> for a lot of things like new columns I needed to add to lookup etc.
>
> > 2. My original error went away.
>
> > So, now onto another address that does not geocode.  The results 
> don't even include the street in the query.
> > 477 Camino del Rio South, San Diego, CA 94115
>
> > SELECT * from normalize_address returns
> > address    predirabbrev    streetname    streettypeabbrev    
> postdirabbrev    internal    location    stateabbrev    zip    parsed
> > 477 <null>    Camino del Rio <null>    S <null>    San Diego    CA    
> 94115    true
>
> > I don't have enough knowledge of postGIS right now to know if it's a 
> bug or operator error.
>  As Steve mentioned in another post, it could be a data issue since 
> the normalization looks right.  I looked at the results
> for California and I see what you mean that the street appears nowhere 
> in results.
> If I look at my underlying Tiger edges and featnames for California.  
> ca_edges, ca_featnames
> There is a Camino del Rio  Ct, but has its zip listed as 93308   which 
> is no where near 94115 as far as numeric distance goes.
> If I look at the street ranges in ca_addr for that tlid, It gives me 
> ranges from 2400 - 3199 again which is no where near what that is. So 
> your address
> fails the only Camino del Rio match it could possibly match to in a 
> big way.
> In fact when I look thru all the street names in ca_edges with zip 
> 94115, none start with Camino anything.
> So even taking off the zip doesn't help.  It does seem to be a data issue.
> -- Steve,
> I don't think its the zip issue per se because I do have that zip 
> listed in my ca_zip_state and as I recall, I don't think I have
> my loader generate the zips from the zcta5 file, since that would be 
> inaccurate since zips aren't really polygons and also aren't updated
> as frequently as you stated.
> It could be this street is known by another name and Tiger doesn't 
> have this particular name listed in its featnames alias or it somehow 
> missed this place entirely.
> Leo says his friend works around there :) so I guess its an important 
> place to miss.
> Hope that helps,
> Regina and Leo
> http://www.postgis.us
>
> Message: 10
> Date: Sat, 9 Jul 2011 14:23:09 -0400
> From: "Paragon Corporation" <lr at pcorp.us <mailto:lr at pcorp.us>>
> Subject: Re: [postgis-users] Newbie Geocoding Error
> To: "'PostGIS Users Discussion'"
> <postgis-users at postgis.refractions.net 
> <mailto:postgis-users at postgis.refractions.net>>
> Message-ID: <20AE3B391EAA4F1A8CBE0347002383E2 at J>
> Content-Type: text/plain; charset="us-ascii"
>
> Mike,
>
> Which version are you running with?  For the newer ones -- if you look at
> the normalize_address function , you should see a stamp in the 
> beginning of
> the code that has
>
> --$Id: normalize_address.sql 7616 2011-07-07 12:41:13Z robe $-
>
> That is the latest version stamp. Really old versions don't even have a
> stamp.  If you are running something older than a week or 2 ago, that is
> probably why you are having these issues.
> The easiest way to upgrade to the latest is:
> 1) Download the PostGIS 2.0  tar ball from here --
> http://www.postgis.org/download/  (the tiger_geocoder is in
> extras/tiger_geocoder/tiger_2010)
> -- (This new version requires PostGIS 1.5+ and PostgreSQL 8.4+)
>
> 2) Edit the  upgrade_geocoder.sh file with your postgres settings and then
> run it.
>
> 3) Run the Missing_indexes_Generate_Script() to get the commands to build
> indexes you may be missing.
>
> http://www.postgis.org/documentation/manual-svn/Missing_Indexes_Generate_Scr
> ipt.html
> and then execute that generated script
>
> ----------------------------------------------------------------------------
> --------------
> I think this is a bug I might have fixed.  When I run normalize on this I
> don't get an error, and yours seems to be breaking in the normalize step.
>
> When I run
> SELECT * FROM normalize_address('3261 South West 160 Avenue, Miramar, FL
> 33027')
>
> I get the below
>
>
>
> address | predirabbrev | streetname | streettypeabbrev | postdirabbrev |
> internal | location | stateabbrev |  zip  | parsed
> ---------+--------------+------+------------------+---------------+---------
> -+----------+-------------+-------+--------
>     3261 | SW              | 160            | Ave              |
> | Miramar  | FL          | 33027 | t
>
>
> Unfortunately I don't have Florida data loaded so can't tell if it will
> geocode right or not.
>
> Regarding bad rating results, the issue is often what the address gets
> normalized to and  a bad normalization is often the cause of bad geocoding
> results.  So probably the
> best check to do is do a simple
>
> SELECT  * FROM normalize_address(...);
>
> and see what it comes with and if that normalized addess makes sense.
>
> There are many cases where it doesn't that I have fixed recently and some
> more that are known bugs I am working on.  Check out the list of 
> issues and
> closed ones for details:
> http://trac.osgeo.org/postgis/query?component=tiger+geocoder
>
>
> The other issue I see with your code -- which perhaps it was for 
> testing, is
> that you are normalizing twice.
>
> You should just be doing a:
> geocode('3261 South West 160 Avenue, Miramar, FL 33027')
>
> No need for those extra steps.
>
> You might want to take a look at the manual for recent improvements 
> made too
> and examples:
> http://www.postgis.org/documentation/manual-svn/Geocode.html
>
> Hope that helps,
> Regina
> http://www.postgis.us <http://www.postgis.us/>




More information about the postgis-users mailing list