[postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?
Evan Martin
postgresql at realityexists.net
Mon May 28 07:08:36 PDT 2012
Hi all,
I have a table of ~350 multi-polygons and ~185,000 points and I need to
find which points are inside which multi-polygons. Some polygons are
quite large and span the dateline, so I'm using geography ST_DWithin for
this (with a tolerance of 100m). My initial query looks like this
(simplified):
SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, real_area, 100)
This works, but it takes about 90 minutes. I'm trying to make it faster
by using ST_SimplifyPreserveTopology. That worked very nicely for my
"adjacent polygons" problem
[http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html],
because all polygons were modified in the same way, but this is
trickier. Since I'm modifying the polygon, but not the point, the
results are different. So I thought maybe I could do this in two phases:
if the point is well within or well outside the polygon then take the
result of the "simplified" check as correct, but if it's close to the
border then check it properly, ie.
SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, simple_area, 20000)
AND (ST_Distance(point, simple_border) > 20000 OR ST_DWithin(point,
real_area, 100))
simple_area = ST_SimplifyPreserveTopology(real_area::geometry,
0.01)::geography and simple_border = the exterior ring of simple_area.
This takes about 18 minutes (a 5x improvement) and gives very similar
results, but not quite the same. It falls down on polygons that have
rhumblines along parallels, because they get turned into great circle
lines. Eg. the original polyon may have a rhumbline approximated as (24
10,25 10,26 10,27 10), ST_SimplifyPreserveTopology does its job and
simplifies it to (24 10,27 10) and then ST_DWithin on geography treats
it as a great circle line, giving an incorrect result. I tried inserting
extra points to "unsimplify" the rhumblines, but that itself is very
slow and also quite a hack, because I can't really be sure which lines
were supposed be rhumblines when looking at the simplified polygon. I
feel like I'm so close and this is a very silly little problem - but it
is a problem.
Could anyone suggest a way to work around this? Or a different approach
to the whole thing?
Thanks,
Evan
More information about the postgis-users
mailing list