[postgis-devel] [PostGIS] #1052: Tiger Geocoder 2010 Geocode() "squishing" toward end of block

PostGIS trac at osgeo.org
Thu Oct 13 07:35:21 PDT 2011


#1052: Tiger Geocoder 2010 Geocode() "squishing" toward end of block
-----------------------------+----------------------------------------------
  Reporter:  mikepease       |       Owner:  robe         
      Type:  enhancement     |      Status:  reopened     
  Priority:  medium          |   Milestone:  PostGIS 2.0.0
 Component:  tiger geocoder  |     Version:  trunk        
Resolution:                  |    Keywords:               
-----------------------------+----------------------------------------------

Comment(by robe):

 Mike,

 I took another look at this.  I did find a minor error with my
 interpolate, but that is not the cause of the issue here since it would
 only affect line segments with more than 2 points and many of these are 2
 point line segments in tiger data.

 The other cause of the distance issue, I think, is that Google's offset is
 higher than mine.  I think I set it to 10 meters in the
 interpolate_from_address function.  You can just change the default to a
 higher value.  I'll probably raise the default later since I think 10
 meters is too low.

 The fundamental issue, I think, is the precision of the raw Tiger data
 versus Google's data.  Just to clarify the process: the Tiger data just has
 street ranges located in the addr table for the LEFT and RIGHT sides of
 streets. So in the simplest case, there are 2 records for each street
 segment -- one for the right and one for the left.  Let's take 284 Vincent
 Ave N as an example - I have a range of:

 {{{
 272 398  -- right side of street
 }}}



 So the logic interpolates between the 2 addresses to arrive at a fraction
 along the street segment.  The UTM zone I get from the utmzone function,
 which for this area is 32615.  The fraction and point are:

 {{{
 -- Tiger geocoder result for 284 Vincent Ave N:
 -- fraction 0.0952380952380952 and Point of
 -- POINT(-93.3154622528598 44.9791255620491)
 SELECT ST_Line_Locate_Point(
   ST_Transform(ST_GeomFromText(
     'LINESTRING(-93.315971 44.978958,-93.315956 44.98074)', 4269), 32615),
   ST_Transform(ST_GeomFromText(
     'POINT(-93.3154622528598 44.9791255620491)', 4326), 32615));

 -- same fraction from the address range alone;
 -- note that answer is 0.0952380952380952381000
 SELECT (284 - 272)*1.00/(398 - 272)*1.00;

 -- Google has:
 -- POINT(-93.315596 44.979312)
 }}}

 Just in case the street is long, I convert to UTM zone first before
 interpolating the point then transform back to NAD 83 long lat.
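
 As a rough sketch of that sequence (not the actual body of
 interpolate_from_address, and leaving out the side-of-street offset), the
 same steps can be written as one query.  The line, house number and range
 are the ones from this example; the column alias is made up for
 illustration.

 {{{
 -- Sketch only: range fraction -> point along the line in UTM -> back to NAD 83
 SELECT ST_AsText(
   ST_Transform(
     ST_Line_Interpolate_Point(
       -- street segment in UTM zone 15N so the fraction is measured in meters
       ST_Transform(ST_GeomFromText(
         'LINESTRING(-93.315971 44.978958,-93.315956 44.98074)', 4269), 32615),
       -- fraction of the way through the 272-398 range for house number 284
       (284 - 272)::float8 / (398 - 272)),
     4269)) AS interpolated_pt;  -- alias is illustrative
 }}}

 The result sits on the centerline, so it will not match the offset point
 above exactly.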

 To figure out what google thinks the fraction is I do this:


 {{{
 -- Answer: 0.199538688579411
 SELECT ST_Line_Locate_Point(
   ST_Transform(ST_GeomFromText(
     'LINESTRING(-93.315971 44.978958,-93.315956 44.98074)', 4269), 32615),
   ST_Transform(ST_GeomFromText(
     'POINT(-93.315596 44.979312)', 4326), 32615));

 -- which using tiger range would put it at: 297
 SELECT 272 + (398 - 272)*.19952;
 }}}
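
 To put the difference between the two fractions in meters along the
 street, something like the following should do.  This is only a sketch
 using the two fractions quoted in this comment; the column alias is made
 up.

 {{{
 -- Sketch only: along-street gap between Google's fraction and the Tiger one
 SELECT (0.199538688579411 - 0.0952380952380952)
        * ST_Length(
            -- length of the street segment in meters (UTM zone 15N)
            ST_Transform(ST_GeomFromText(
              'LINESTRING(-93.315971 44.978958,-93.315956 44.98074)', 4269), 32615)
          ) AS along_street_gap_m;  -- alias is illustrative
 }}}

 By back-of-envelope arithmetic (a segment of about 200 meters times a
 difference in fraction of about 0.10) this should come out on the order of
 20 meters, so it is a separate effect from the 10 meter side offset.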

 Now the questions are:
 1) Is Google using street ranges or actual parcel data?  If it is parcel
 data, its result is not an estimation.
 2) Is the Tiger range data for the streets in question in error?  Tiger
 sometimes doesn't have the right ranges -- perhaps because of data quality
 or, as some have mentioned, for privacy reasons.


 I'm planning to put some debugging logic into the interpolation and other
 functions so they spit out the street range etc. they are basing their
 guesses on when debugging is enabled.  The switch will be a global function
 that just returns true or false.  I'll use it in the other functions as
 well, so I don't have to explicitly set it in each function.
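
 Roughly what I have in mind is sketched below.  Both function names are
 placeholders I made up for this comment, not functions that exist in the
 geocoder, and the helper only covers the range fraction part.

 {{{
 -- Hypothetical global switch: every function checks this one flag
 CREATE OR REPLACE FUNCTION get_geocode_debug() RETURNS boolean AS
 $$
   SELECT false;  -- change to true to turn debugging on everywhere
 $$
 LANGUAGE sql STABLE;

 -- Hypothetical helper showing how a function would use the flag
 CREATE OR REPLACE FUNCTION debug_range_fraction(
   given_address integer, addr1 integer, addr2 integer)
 RETURNS float8 AS
 $$
 DECLARE
   var_frac float8;
 BEGIN
   -- NULLIF avoids division by zero when the range has a single number
   var_frac := (given_address - addr1)::float8 / NULLIF(addr2 - addr1, 0);
   IF get_geocode_debug() THEN
     RAISE NOTICE 'range % - %, address %, fraction %',
       addr1, addr2, given_address, var_frac;
   END IF;
   RETURN var_frac;
 END;
 $$
 LANGUAGE plpgsql;
 }}}

 With that in place, SELECT debug_range_fraction(284, 272, 398); returns
 about 0.0952 and, when the flag is true, also prints the range it used as
 a NOTICE.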

-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/1052#comment:12>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.

