[postgis-devel] [PostGIS] #1052: Tiger Geocoder 2010 Geocode() "squishing" toward end of block

PostGIS trac at osgeo.org
Thu Oct 13 13:11:14 PDT 2011


#1052: Tiger Geocoder 2010 Geocode() "squishing" toward end of block
-----------------------------+----------------------------------------------
  Reporter:  mikepease       |       Owner:  robe         
      Type:  enhancement     |      Status:  reopened     
  Priority:  medium          |   Milestone:  PostGIS 2.0.0
 Component:  tiger geocoder  |     Version:  trunk        
Resolution:                  |    Keywords:               
-----------------------------+----------------------------------------------

Comment(by mikepease):

 Hmm.  So, if I understand this correctly, "tohn" doesn't actually mean
 what the geocoder is using it to mean.

 I think the geocoder assumes the "tohn" is the highest address number that
 actually EXISTS.  But I think you're saying that the real definition of
 this column is that "tohn" is the highest POSSIBLE address number, not
 necessarily one that exists.

 Am I understanding that correctly?  If so, that seems like a fundamental
 limitation on the precision of the Tiger geocoder.  You get a house to the
 correct block, but its location within that block can't be determined
 precisely unless the highest EXISTING house number is somehow known.  And
 when the highest existing number is lower than the highest possible
 number, it will follow this "squishing" pattern that we've been seeing.
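
 A minimal Python sketch of that effect, assuming the geocoder places a
 house by plain linear interpolation between the ends of the address
 range (which is what this discussion implies, not something checked
 against the geocoder source).  The function name, the starting number
 3200, the 1000m block length and the list of existing houses are all
 hypothetical illustration values.

```python
def interpolate_fraction(house_number, fromhn, tohn):
    """Fraction of the way along the block by plain linear interpolation.

    Hypothetical helper for illustration; not the geocoder's own code.
    """
    if tohn == fromhn:
        return 0.0
    return (house_number - fromhn) / (tohn - fromhn)


# Assumed block: address range 3200-3299, but the last real house is 3250.
fromhn, tohn = 3200, 3299
block_length_m = 1000.0
existing_houses = [3200, 3210, 3220, 3230, 3240, 3250]

offsets_m = [
    interpolate_fraction(n, fromhn, tohn) * block_length_m
    for n in existing_houses
]

for n, offset in zip(existing_houses, offsets_m):
    print(n, round(offset), "m from the start of the block")
```

 Under these assumptions the last existing house, 3250, lands at roughly
 505m, so every real house is placed in the first half of the block and
 the second half stays empty.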

 I suspect my client won't be satisfied with that level of precision.
 Do you know if this is prevalent throughout all the Tiger address data?

 I wonder if there is any clever solution to this.

 You could "cheat" if this condition is known to be true most/all of the
 time.
 For example, if tohn=3299 but you're pretty sure the real max is 3250,
 then the offset from the start of the block could be roughly doubled.

 With a range of, say, 3200-3299, 3250 is at the 50% mark of the address
 range, but really at the far end of the block.  If the full block were
 1000m and the geocoder originally says the 3250 house is 500m from the
 start, you could scale that offset up to 1000m.
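
 The scaling above could be sketched in Python as interpolating against
 the assumed real maximum instead of "tohn".  This is only a sketch: the
 function name is hypothetical, and the starting number 3200 is an
 assumption chosen so that 3250 falls at about the 50% mark of the range.

```python
def rescaled_offset_m(house_number, fromhn, assumed_max, block_length_m):
    """Offset from the start of the block, interpolating against an assumed
    highest existing house number rather than tohn (hypothetical helper)."""
    if assumed_max <= fromhn:
        raise ValueError("assumed_max must be greater than fromhn")
    fraction = (house_number - fromhn) / (assumed_max - fromhn)
    # Clamp so that numbers above the assumed maximum stay on the block.
    fraction = max(0.0, min(fraction, 1.0))
    return fraction * block_length_m


# Assumed values: range starts at 3200, real maximum 3250, block of 1000m.
end_house = rescaled_offset_m(3250, 3200, 3250, 1000.0)
mid_house = rescaled_offset_m(3225, 3200, 3250, 1000.0)
print(end_house, mid_house)
```

 With these numbers 3250 moves out to the full 1000m and 3225 lands at
 500m, instead of roughly 505m and 253m under interpolation against 3299.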

 I guess the trick here is still knowing the actual highest address number.
 What if an assumption was made that the "tohn" is generally higher than
 the real highest number?  Could scaling the offset produce *better* (even
 if not correct) results?
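
 One way to explore that question is to compare the worst-case placement
 error with and without scaling when the guessed maximum is itself a bit
 off.  The Python below is a sketch with made-up numbers: a range of
 3200-3299, a true maximum of 3250 and a guessed maximum of 3260 are all
 assumptions, and linear interpolation is assumed throughout.

```python
def fraction_along(house_number, fromhn, top):
    """Linear position along the block, clamped to [0, 1] (hypothetical)."""
    return max(0.0, min((house_number - fromhn) / (top - fromhn), 1.0))


fromhn, tohn = 3200, 3299   # assumed address range
true_max = 3250             # assumed real highest house number
guessed_max = 3260          # assumed imperfect guess at that number
houses = range(3200, 3251, 10)

# Worst error (as a fraction of the block) against the "true" position.
plain_err = max(
    abs(fraction_along(n, fromhn, tohn) - fraction_along(n, fromhn, true_max))
    for n in houses
)
scaled_err = max(
    abs(fraction_along(n, fromhn, guessed_max) - fraction_along(n, fromhn, true_max))
    for n in houses
)
print(round(plain_err, 3), round(scaled_err, 3))
```

 In this made-up case the worst error drops from about half the block to
 about a sixth of it, so an imperfect guess still helps.  A guess that
 is too low would instead pile the top houses up at the far end of the
 block, so this says nothing about how well it would work on real data.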

 Did any of that make sense?
 ;-)

-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/1052#comment:19>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.

