[postgis-tickets] [PostGIS] #4826: Geocoder gives goofy results for 1 Main St, Hanover, MA
PostGIS
trac at osgeo.org
Thu Jan 7 21:36:58 PST 2021
#4826: Geocoder gives goofy results for 1 Main St, Hanover, MA
-----------------------------+---------------------------
  Reporter:  robe            |      Owner:  robe
      Type:  defect          |     Status:  new
  Priority:  medium          |  Milestone:  PostGIS 3.1.1
 Component:  tiger geocoder  |    Version:  3.1.x
Resolution:                  |   Keywords:
-----------------------------+---------------------------
Comment (by turova):
Reverse engineering your solution a bit, I think I see where this goes
wrong, though I don't yet have a clear idea of how to solve it. I think
the issue is that a city can span multiple zip codes, and pairing a street
with the wrong one can return a result in a different city. For the
example we've been looking at, I see this:
{{{
# SELECT * FROM zip_lookup_base WHERE city = 'Hanover';
  zip  | state |   county   |  city   | statefp
-------+-------+------------+---------+---------
... Unrelated rows skipped ...
 02061 | MA    | Plymouth   | Hanover | 25
 02239 | MA    | Plymouth   | Hanover | 25
 02339 | MA    | Plymouth   | Hanover | 25
}}}
So Hanover, MA is partially in three different zip codes, and it shares
02061 with Norwell, which is the city that has a 1 Main St in that zip
code. Hanover's 1 Main St is in 02339, and the correct result is returned
if 02339 (or even 02239, which I can't find on a map) is passed
explicitly. If 02061 is passed, either explicitly or implicitly via the
new lookup table, the result comes back as Norwell.
I believe this means that a zip->city lookup can't work with the current
geocode function, because it will only be correct for some of the
addresses. Two slow solutions would be to look the address up with each
zip code and return the result with the matching city, or to keep a
lookup table of city,street->zip. Another possibility, if the data allows
it, is adjusting geocode() to use the zip code to find the proper city and
then somehow correct the zip code during the search.
If having a big lookup table ends up taking up 2x the space, that's not a
big deal to me as long as it returns the correct results. Likewise, if
there's a short-term way to do 2 sequential lookups that would get me the
correct result, I could make that work as well, but hopefully you have a
better idea of how to wrangle this data to get the desired result.
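As a rough illustration of the "look the address up with each zip code"
workaround described above, here is a minimal, untested sketch. It assumes
the tiger geocoder's geocode() and pprint_addy() functions, and that the
location field of the returned norm_addy holds the city name. The address
literal, the max_results value of 1, and the choice to keep only the
best-rated match are illustrative assumptions, not a proposed fix.
{{{
-- Sketch only: try every zip code listed for the city and keep the best
-- candidate whose normalized city still matches the requested one.
-- Costs one geocode() call per zip code, so it is slow by design.
SELECT g.rating,
       pprint_addy(g.addy) AS address,
       (g.addy).zip        AS zip
FROM zip_lookup_base AS z
CROSS JOIN LATERAL
     geocode(('1 Main St, Hanover, MA ' || z.zip)::varchar, 1) AS g
WHERE z.city = 'Hanover'
  AND z.state = 'MA'
  -- discard candidates that landed in another city (e.g. Norwell via 02061)
  AND lower((g.addy).location) = lower(z.city)
ORDER BY g.rating   -- a lower rating means a closer match
LIMIT 1;
}}}
If no candidate survives the city filter, the query returns no rows, so a
caller would still need a fallback such as the plain geocode() result.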
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/4826#comment:5>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.