[postgis-tickets] [PostGIS] #4826: Geocoder gives goofy results for 1 Main St, Hanover, MA

PostGIS trac at osgeo.org
Thu Jan 7 21:36:58 PST 2021


#4826: Geocoder gives goofy results for 1 Main St, Hanover, MA
-----------------------------+---------------------------
  Reporter:  robe            |      Owner:  robe
      Type:  defect          |     Status:  new
  Priority:  medium          |  Milestone:  PostGIS 3.1.1
 Component:  tiger geocoder  |    Version:  3.1.x
Resolution:                  |   Keywords:
-----------------------------+---------------------------

Comment (by turova):

 Reverse engineering your solution a bit, I think I see where this might go
 wrong, though I don't yet have a clear understanding of how to solve it. I
 think the issue is that a city can have multiple zip codes, and using the
 wrong one for the street can give a result that's in a different city.
 For the example we've been looking at, I see this:

 {{{
 # SELECT * FROM zip_lookup_base WHERE city = 'Hanover';
   zip  | state |   county   |  city   | statefp
 -------+-------+------------+---------+---------
 ... Unrelated rows skipped ...
  02061 | MA    | Plymouth   | Hanover | 25
  02239 | MA    | Plymouth   | Hanover | 25
  02339 | MA    | Plymouth   | Hanover | 25
 }}}

 So Hanover MA is partially in 3 different zip codes (and shares 02061 with
 Norwell, which is the city that has a 1 Main St at that zip code).
 Hanover's 1 Main St is in 02339 and the correct result is returned if
 02339 (or even 02239, which I can't find on a map) is explicitly passed.
 If 02061 is passed explicitly or implicitly via the new lookup table, the
 result comes back as Norwell.

 I believe this means that you can't have a zip->city lookup with the
 current geocode function because it will only be correct for some of the
 addresses. Slow solutions would be to look the address up with each zip
 code and then return the one with the matching city, or to have a lookup
 table of city,street->zip. Adjusting geocode() to use the zip code to find
 the proper city, and then somehow correctly adjusting the zip code during
 the search, sounds like a possible solution if the data allows it.
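
 A minimal sketch of the first slow approach (try each zip code, keep the
 result whose city matches) might look like the following. It is written in
 Python against a stand-in geocoder, not the real geocode() function:
 geocode_matching_city, stub_geocode and the "return the matched city name"
 interface are all assumptions made for illustration. The stub only
 reproduces the behaviour described above for 1 Main St, Hanover, MA.

```python
from typing import Callable, Iterable, Optional, Tuple


def geocode_matching_city(
    street: str,
    city: str,
    state: str,
    zips: Iterable[str],
    geocode_fn: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Try each candidate zip and return (zip_tried, matched_city) for the
    first lookup whose matched city equals the requested city."""
    wanted = city.strip().lower()
    for zip_code in zips:
        # Hypothetical interface: geocode_fn returns the matched city or None.
        matched_city = geocode_fn(f"{street}, {city}, {state} {zip_code}")
        if matched_city is not None and matched_city.strip().lower() == wanted:
            return (zip_code, matched_city)
    return None


def stub_geocode(address: str) -> Optional[str]:
    """Stand-in for the real geocoder; mimics only what this ticket reports."""
    zip_code = address.rsplit(" ", 1)[-1]
    if zip_code == "02061":
        return "Norwell"   # the wrong city comes back for this zip
    if zip_code in ("02239", "02339"):
        return "Hanover"   # the correct city comes back for these zips
    return None


# Zip codes listed for Hanover, MA in zip_lookup_base.
hanover_zips = ["02061", "02239", "02339"]
result = geocode_matching_city("1 Main St", "Hanover", "MA",
                               hanover_zips, stub_geocode)
print(result)
```

 The cost is one geocoder call per candidate zip code, which is why this
 counts as a slow solution.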

 If having a big lookup table ends up taking up 2x the space, that's not a
 big deal to me as long as it returns the correct results. Likewise, if
 there's a short-term way to do 2 sequential lookups that would get me the
 correct result, I could make that work as well, but hopefully you have a
 better idea of how to wrangle this data to get the desired result.
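
 The lookup-table alternative (city,street->zip) can be sketched in the same
 hedged way. The table below holds only the two facts stated in this
 comment; make_key, zip_for and address_with_zip are hypothetical helpers,
 and the sketch assumes a street maps to a single zip code within a city,
 which may not hold in the real data.

```python
from typing import Dict, Optional, Tuple

StreetKey = Tuple[str, str, str]  # (state, city, street), normalized


def make_key(state: str, city: str, street: str) -> StreetKey:
    """Normalize case and surrounding whitespace so lookups are forgiving."""
    return (state.strip().lower(), city.strip().lower(), street.strip().lower())


# Illustrative entries only, taken from the example in this ticket.
STREET_ZIP: Dict[StreetKey, str] = {
    make_key("MA", "Hanover", "Main St"): "02339",
    make_key("MA", "Norwell", "Main St"): "02061",
}


def zip_for(state: str, city: str, street: str) -> Optional[str]:
    """Return the zip recorded for this street in this city, if any."""
    return STREET_ZIP.get(make_key(state, city, street))


def address_with_zip(number: str, street: str, city: str, state: str) -> str:
    """Build the address string, appending the zip when the table has one."""
    base = f"{number} {street}, {city}, {state}"
    zip_code = zip_for(state, city, street)
    return f"{base} {zip_code}" if zip_code else base


print(address_with_zip("1", "Main St", "Hanover", "MA"))
```

 The resulting string carries an explicit zip code, so a single geocoder
 call would suffice, at the price of storing one row per city and street.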

-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/4826#comment:5>
PostGIS <http://trac.osgeo.org/postgis/>