[postgis-users] Newbie Geocoding Error

Stephen Woodbridge woodbri at swoodbridge.com
Sun Jul 10 12:55:53 PDT 2011


Dan,

Thanks for pointing out that I augmented the Tiger data. But it should 
be noted that I only added the preferred postal name for the zipcode as 
an alias place name. So if the zipcode on the record is bad then I add a 
potentially bad alias place name. I do join the tiger records and look 
up the census place, county sub-division, and county names for each 
street record so PAGC can fall back on that is the zip is bad.

-Steve

On 7/10/2011 3:25 PM, Dan Putler wrote:
> Hi,
>
> The zip code is just wrong, 94115 is in located in San Francisco. PAGC
> can make a good probabilistic match on the address largely because Steve
> PAGC service is working with an augmented TIGER database that has has
> both the state and place appended to the edges which PAGC can work with
> for what we call the "macro" address component. With just the "stock"
> TIGER edges, the only macro address component available is the five
> digit zipcode, and if that is wrong, the PostGIS TIGER geocoder
> struggles. PAGC can actually handle the address string 477 Camino del
> Rio South, 94115, and get the correct address as the most likely
> candidate, but this is likely to be luck as much as anything else.
>
> Dan
>
> On 07/10/2011 12:05 PM, Paragon Corporation wrote:
>> Mike,
>>
>> > 1. the upgrade_geocoder script ran with most of the sql returning
>> 'already exists' errors. Do I need to modify the script to drop before
>> recreating?
>> No you should be fine. The already exists and already exists skipping
>> issue can be ignored. I didn't want to destroy some of these
>> structures if they existed because they could be tied to data.
>> That is why I don't have the whole upgrade script in a transaction
>> because it would fail the whole thing when it doesn't need to.
>> Unfortunately PostgresQL doesn't yet support the CREATE IF NOT EXISTS
>> for a lot of things like new columns I needed to add to lookup etc.
>>
>> > 2. My original error went away.
>>
>> > So, now onto another address that does not geocode. The results
>> don't even include the street in the query.
>> > 477 Camino del Rio South, San Diego, CA 94115
>>
>> > SELECT * from normalize_address returns
>> > address predirabbrev streetname streettypeabbrev postdirabbrev
>> internal location stateabbrev zip parsed
>> > 477 <null> Camino del Rio <null> S <null> San Diego CA 94115 true
>>
>> > I don't have enough knowledge of postGIS right now to know if it's a
>> bug or operator error.
>> As Steve mentioned in another post, it could be a data issue since the
>> normalization looks right. I looked at the results
>> for California and I see what you mean that the street appears nowhere
>> in results.
>> If I look at my underlying Tiger edges and featnames for California.
>> ca_edges, ca_featnames
>> There is a Camino del Rio Ct, but has its zip listed as 93308 which is
>> no where near 94115 as far as numeric distance goes.
>> If I look at the street ranges in ca_addr for that tlid, It gives me
>> ranges from 2400 - 3199 again which is no where near what that is. So
>> your address
>> fails the only Camino del Rio match it could possibly match to in a
>> big way.
>> In fact when I look thru all the street names in ca_edges with zip
>> 94115, none start with Camino anything.
>> So even taking off the zip doesn't help. It does seem to be a data issue.
>> -- Steve,
>> I don't think its the zip issue per se because I do have that zip
>> listed in my ca_zip_state and as I recall, I don't think I have
>> my loader generate the zips from the zcta5 file, since that would be
>> inaccurate since zips aren't really polygons and also aren't updated
>> as frequently as you stated.
>> It could be this street is known by another name and Tiger doesn't
>> have this particular name listed in its featnames alias or it somehow
>> missed this place entirely.
>> Leo says his friend works around there :) so I guess its an important
>> place to miss.
>> Hope that helps,
>> Regina and Leo
>> http://www.postgis.us
>>
>> Message: 10
>> Date: Sat, 9 Jul 2011 14:23:09 -0400
>> From: "Paragon Corporation" <lr at pcorp.us <mailto:lr at pcorp.us>>
>> Subject: Re: [postgis-users] Newbie Geocoding Error
>> To: "'PostGIS Users Discussion'"
>> <postgis-users at postgis.refractions.net
>> <mailto:postgis-users at postgis.refractions.net>>
>> Message-ID: <20AE3B391EAA4F1A8CBE0347002383E2 at J>
>> Content-Type: text/plain; charset="us-ascii"
>>
>> Mike,
>>
>> Which version are you running with? For the newer ones -- if you look at
>> the normalize_address function , you should see a stamp in the
>> beginning of
>> the code that has
>>
>> --$Id: normalize_address.sql 7616 2011-07-07 12:41:13Z robe $-
>>
>> That is the latest version stamp. Really old versions don't even have a
>> stamp. If you are running something older than a week or 2 ago, that is
>> probably why you are having these issues.
>> The easiest way to upgrade to the latest is:
>> 1) Download the PostGIS 2.0 tar ball from here --
>> http://www.postgis.org/download/ (the tiger_geocoder is in
>> extras/tiger_geocoder/tiger_2010)
>> -- (This new version requires PostGIS 1.5+ and PostgreSQL 8.4+)
>>
>> 2) Edit the upgrade_geocoder.sh file with your postgres settings and then
>> run it.
>>
>> 3) Run the Missing_indexes_Generate_Script() to get the commands to build
>> indexes you may be missing.
>>
>> http://www.postgis.org/documentation/manual-svn/Missing_Indexes_Generate_Scr
>>
>> ipt.html
>> and then execute that generated script
>>
>> ----------------------------------------------------------------------------
>>
>> --------------
>> I think this is a bug I might have fixed. When I run normalize on this I
>> don't get an error, and yours seems to be breaking in the normalize step.
>>
>> When I run
>> SELECT * FROM normalize_address('3261 South West 160 Avenue, Miramar, FL
>> 33027')
>>
>> I get the below
>>
>>
>>
>> address | predirabbrev | streetname | streettypeabbrev | postdirabbrev |
>> internal | location | stateabbrev | zip | parsed
>> ---------+--------------+------+------------------+---------------+---------
>>
>> -+----------+-------------+-------+--------
>> 3261 | SW | 160 | Ave |
>> | Miramar | FL | 33027 | t
>>
>>
>> Unfortunately I don't have Florida data loaded so can't tell if it will
>> geocode right or not.
>>
>> Regarding bad rating results, the issue is often what the address gets
>> normalized to and a bad normalization is often the cause of bad geocoding
>> results. So probably the
>> best check to do is do a simple
>>
>> SELECT * FROM normalize_address(...);
>>
>> and see what it comes with and if that normalized addess makes sense.
>>
>> There are many cases where it doesn't that I have fixed recently and some
>> more that are known bugs I am working on. Check out the list of issues
>> and
>> closed ones for details:
>> http://trac.osgeo.org/postgis/query?component=tiger+geocoder
>>
>>
>> The other issue I see with your code -- which perhaps it was for
>> testing, is
>> that you are normalizing twice.
>>
>> You should just be doing a:
>> geocode('3261 South West 160 Avenue, Miramar, FL 33027')
>>
>> No need for those extra steps.
>>
>> You might want to take a look at the manual for recent improvements
>> made too
>> and examples:
>> http://www.postgis.org/documentation/manual-svn/Geocode.html
>>
>> Hope that helps,
>> Regina
>> http://www.postgis.us <http://www.postgis.us/>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list