[postgis-devel] [PostGIS] #1052: Tiger Geocoder 2010 Geocode() "squishing" toward end of block
PostGIS
trac at osgeo.org
Thu Oct 13 07:35:21 PDT 2011
#1052: Tiger Geocoder 2010 Geocode() "squishing" toward end of block
-----------------------------+----------------------------------------------
Reporter: mikepease | Owner: robe
Type: enhancement | Status: reopened
Priority: medium | Milestone: PostGIS 2.0.0
Component: tiger geocoder | Version: trunk
Resolution: | Keywords:
-----------------------------+----------------------------------------------
Comment(by robe):
Mike,
I took another look at this. I did find a minor error in my
interpolation, but that is not the cause of the issue here, since it would
only affect line segments with more than 2 points, and many of these are
2-point line segments in the Tiger data.
The other cause of the distance issue, I think, is that Google's offset is
higher than mine. I think I set it to 10 meters in the
interpolate_from_address function. You can just change the default to a
higher value. I'll probably raise the default later, since I think 10 meters
is too low.
The fundamental issue, I think, is the precision of the raw Tiger data versus
Google data. Just to clarify the process: the Tiger data just has street
ranges located in the addr table for the LEFT and RIGHT sides of streets. So in
the simplest case, there are 2 records for each street segment -- one for
the right and one for the left. Let's take 284 Vincent Ave N as an
example - I have a range of:
{{{
272 398 -- right side of street
}}}
So the logic interpolates between the 2 addresses to arrive at the fraction
shown below. The UTM zone comes from the utmzone function, which for this
area returns 32615:
{{{
-- 0.0952380952380952 and Point of
-- POINT(-93.3154622528598 44.9791255620491)
SELECT ST_Line_Locate_Point(
  ST_Transform(ST_GeomFromText('LINESTRING(-93.315971 44.978958,-93.315956 44.98074)', 4269), 32615),
  ST_Transform(ST_GeomFromText('POINT(-93.3154622528598 44.9791255620491)', 4326), 32615));
-- note that answer is 0.0952380952380952381000
SELECT (284 - 272)*1.00/(398 - 272)*1.00;
-- Google has:
-- POINT(-93.315596 44.979312)
}}}
Just in case the street is long, I convert to the UTM zone first, interpolate
the point, and then transform back to NAD 83 long/lat.
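As a rough sketch of that step (not the actual interpolate_from_address
code), the query below applies the address fraction to the same segment in
UTM and transforms the result back to NAD 83. It skips the side-of-street
offset, so the point stays on the centerline. ST_Line_Interpolate_Point is
the function name in this PostGIS generation; later releases call it
ST_LineInterpolatePoint.
{{{
-- Sketch only: centerline interpolation without the street-side offset.
SELECT ST_AsText(
  ST_Transform(
    ST_Line_Interpolate_Point(
      -- segment transformed to the local UTM zone (32615 for this area)
      ST_Transform(ST_GeomFromText('LINESTRING(-93.315971 44.978958,-93.315956 44.98074)', 4269), 32615),
      -- fraction of the way along the range 272..398 for house number 284
      (284 - 272)::float8 / (398 - 272)),
    4269));  -- back to NAD 83 long/lat
}}}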
To figure out what google thinks the fraction is I do this:
{{{
-- Answer: 0.199538688579411
SELECT ST_Line_Locate_Point(
  ST_Transform(ST_GeomFromText('LINESTRING(-93.315971 44.978958,-93.315956 44.98074)', 4269), 32615),
  ST_Transform(ST_GeomFromText('POINT(-93.315596 44.979312)', 4326), 32615));
-- which using tiger range would put it at: 297
SELECT 272 + (398 - 272)*.19952;
}}}
Now the questions are:
1) Is Google using street ranges or actual parcel data? If it is parcel
data, its result is not an estimation.
2) Is the Tiger range data for the streets in question in error? Tiger
sometimes doesn't have the right ranges -- perhaps because of data quality
or, as some have mentioned, for privacy reasons.
I'm planning to put some debugging logic into the interpolation and other
functions so they spit out the street range etc. that the guesses are based
on. Debugging would be switched on by a global function that just returns
true or false. I'll use this for the other functions as well, so I don't
have to explicitly set it in each function.
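A minimal sketch of what such a switch could look like, assuming a
hypothetical function name geocoder_debug_enabled (nothing by that name
exists in the geocoder yet) and a plain RAISE NOTICE for the output:
{{{
-- Hypothetical global switch: redefine it to return true to turn debugging on.
CREATE OR REPLACE FUNCTION geocoder_debug_enabled() RETURNS boolean AS
$$ SELECT false; $$
LANGUAGE sql STABLE;

-- Example of how a geocoder function might consult the switch.
DO $$
BEGIN
  IF geocoder_debug_enabled() THEN
    RAISE NOTICE 'range % - %, fraction %',
      272, 398, (284 - 272)::float8 / (398 - 272);
  END IF;
END
$$;
}}}
Keeping the switch in one function means each geocoder function only needs
the IF check, and debugging can be toggled in a single place.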
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/1052#comment:12>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.