[postgis-users] Newbie Geocoding Error
Dan Putler
dan.putler at sauder.ubc.ca
Sun Jul 10 13:41:16 PDT 2011
Hi Steve,
Thanks for the clarification, are you getting the census place name via
a TLID to TFID translation?
Dan
On 07/10/2011 12:55 PM, Stephen Woodbridge wrote:
> Dan,
>
> Thanks for pointing out that I augmented the Tiger data. But it should
> be noted that I only added the preferred postal name for the zipcode as
> an alias place name. So if the zipcode on the record is bad then I add a
> potentially bad alias place name. I do join the tiger records and look
> up the census place, county sub-division, and county names for each
> street record so PAGC can fall back on that is the zip is bad.
>
> -Steve
>
> On 7/10/2011 3:25 PM, Dan Putler wrote:
>> Hi,
>>
>> The zip code is just wrong, 94115 is in located in San Francisco. PAGC
>> can make a good probabilistic match on the address largely because Steve
>> PAGC service is working with an augmented TIGER database that has has
>> both the state and place appended to the edges which PAGC can work with
>> for what we call the "macro" address component. With just the "stock"
>> TIGER edges, the only macro address component available is the five
>> digit zipcode, and if that is wrong, the PostGIS TIGER geocoder
>> struggles. PAGC can actually handle the address string 477 Camino del
>> Rio South, 94115, and get the correct address as the most likely
>> candidate, but this is likely to be luck as much as anything else.
>>
>> Dan
>>
>> On 07/10/2011 12:05 PM, Paragon Corporation wrote:
>>> Mike,
>>>
>>>> 1. the upgrade_geocoder script ran with most of the sql returning
>>> 'already exists' errors. Do I need to modify the script to drop before
>>> recreating?
>>> No you should be fine. The already exists and already exists skipping
>>> issue can be ignored. I didn't want to destroy some of these
>>> structures if they existed because they could be tied to data.
>>> That is why I don't have the whole upgrade script in a transaction
>>> because it would fail the whole thing when it doesn't need to.
>>> Unfortunately PostgresQL doesn't yet support the CREATE IF NOT EXISTS
>>> for a lot of things like new columns I needed to add to lookup etc.
>>>
>>>> 2. My original error went away.
>>>> So, now onto another address that does not geocode. The results
>>> don't even include the street in the query.
>>>> 477 Camino del Rio South, San Diego, CA 94115
>>>> SELECT * from normalize_address returns
>>>> address predirabbrev streetname streettypeabbrev postdirabbrev
>>> internal location stateabbrev zip parsed
>>>> 477<null> Camino del Rio<null> S<null> San Diego CA 94115 true
>>>> I don't have enough knowledge of postGIS right now to know if it's a
>>> bug or operator error.
>>> As Steve mentioned in another post, it could be a data issue since the
>>> normalization looks right. I looked at the results
>>> for California and I see what you mean that the street appears nowhere
>>> in results.
>>> If I look at my underlying Tiger edges and featnames for California.
>>> ca_edges, ca_featnames
>>> There is a Camino del Rio Ct, but has its zip listed as 93308 which is
>>> no where near 94115 as far as numeric distance goes.
>>> If I look at the street ranges in ca_addr for that tlid, It gives me
>>> ranges from 2400 - 3199 again which is no where near what that is. So
>>> your address
>>> fails the only Camino del Rio match it could possibly match to in a
>>> big way.
>>> In fact when I look thru all the street names in ca_edges with zip
>>> 94115, none start with Camino anything.
>>> So even taking off the zip doesn't help. It does seem to be a data issue.
>>> -- Steve,
>>> I don't think its the zip issue per se because I do have that zip
>>> listed in my ca_zip_state and as I recall, I don't think I have
>>> my loader generate the zips from the zcta5 file, since that would be
>>> inaccurate since zips aren't really polygons and also aren't updated
>>> as frequently as you stated.
>>> It could be this street is known by another name and Tiger doesn't
>>> have this particular name listed in its featnames alias or it somehow
>>> missed this place entirely.
>>> Leo says his friend works around there :) so I guess its an important
>>> place to miss.
>>> Hope that helps,
>>> Regina and Leo
>>> http://www.postgis.us
>>>
>>> Message: 10
>>> Date: Sat, 9 Jul 2011 14:23:09 -0400
>>> From: "Paragon Corporation"<lr at pcorp.us<mailto:lr at pcorp.us>>
>>> Subject: Re: [postgis-users] Newbie Geocoding Error
>>> To: "'PostGIS Users Discussion'"
>>> <postgis-users at postgis.refractions.net
>>> <mailto:postgis-users at postgis.refractions.net>>
>>> Message-ID:<20AE3B391EAA4F1A8CBE0347002383E2 at J>
>>> Content-Type: text/plain; charset="us-ascii"
>>>
>>> Mike,
>>>
>>> Which version are you running with? For the newer ones -- if you look at
>>> the normalize_address function , you should see a stamp in the
>>> beginning of
>>> the code that has
>>>
>>> --$Id: normalize_address.sql 7616 2011-07-07 12:41:13Z robe $-
>>>
>>> That is the latest version stamp. Really old versions don't even have a
>>> stamp. If you are running something older than a week or 2 ago, that is
>>> probably why you are having these issues.
>>> The easiest way to upgrade to the latest is:
>>> 1) Download the PostGIS 2.0 tar ball from here --
>>> http://www.postgis.org/download/ (the tiger_geocoder is in
>>> extras/tiger_geocoder/tiger_2010)
>>> -- (This new version requires PostGIS 1.5+ and PostgreSQL 8.4+)
>>>
>>> 2) Edit the upgrade_geocoder.sh file with your postgres settings and then
>>> run it.
>>>
>>> 3) Run the Missing_indexes_Generate_Script() to get the commands to build
>>> indexes you may be missing.
>>>
>>> http://www.postgis.org/documentation/manual-svn/Missing_Indexes_Generate_Scr
>>>
>>> ipt.html
>>> and then execute that generated script
>>>
>>> ----------------------------------------------------------------------------
>>>
>>> --------------
>>> I think this is a bug I might have fixed. When I run normalize on this I
>>> don't get an error, and yours seems to be breaking in the normalize step.
>>>
>>> When I run
>>> SELECT * FROM normalize_address('3261 South West 160 Avenue, Miramar, FL
>>> 33027')
>>>
>>> I get the below
>>>
>>>
>>>
>>> address | predirabbrev | streetname | streettypeabbrev | postdirabbrev |
>>> internal | location | stateabbrev | zip | parsed
>>> ---------+--------------+------+------------------+---------------+---------
>>>
>>> -+----------+-------------+-------+--------
>>> 3261 | SW | 160 | Ave |
>>> | Miramar | FL | 33027 | t
>>>
>>>
>>> Unfortunately I don't have Florida data loaded so can't tell if it will
>>> geocode right or not.
>>>
>>> Regarding bad rating results, the issue is often what the address gets
>>> normalized to and a bad normalization is often the cause of bad geocoding
>>> results. So probably the
>>> best check to do is do a simple
>>>
>>> SELECT * FROM normalize_address(...);
>>>
>>> and see what it comes with and if that normalized addess makes sense.
>>>
>>> There are many cases where it doesn't that I have fixed recently and some
>>> more that are known bugs I am working on. Check out the list of issues
>>> and
>>> closed ones for details:
>>> http://trac.osgeo.org/postgis/query?component=tiger+geocoder
>>>
>>>
>>> The other issue I see with your code -- which perhaps it was for
>>> testing, is
>>> that you are normalizing twice.
>>>
>>> You should just be doing a:
>>> geocode('3261 South West 160 Avenue, Miramar, FL 33027')
>>>
>>> No need for those extra steps.
>>>
>>> You might want to take a look at the manual for recent improvements
>>> made too
>>> and examples:
>>> http://www.postgis.org/documentation/manual-svn/Geocode.html
>>>
>>> Hope that helps,
>>> Regina
>>> http://www.postgis.us<http://www.postgis.us/>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
More information about the postgis-users
mailing list