[postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?

Evan Martin postgresql at realityexists.net
Mon May 28 07:08:36 PDT 2012


Hi all,

I have a table of ~350 multi-polygons and ~185,000 points and I need to 
find which points are inside which multi-polygons. Some polygons are 
quite large and span the dateline, so I'm using geography ST_DWithin for 
this (with a tolerance of 100m). My initial query looks like this 
(simplified):

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, real_area, 100)

This works, but it takes about 90 minutes. I'm trying to make it faster 
by using ST_SimplifyPreserveTopology. That worked very nicely for my 
"adjacent polygons" problem 
[http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html], 
because all polygons were modified in the same way, but this is 
trickier. Since I'm modifying the polygon, but not the point, the 
results are different. So I thought maybe I could do this in two phases: 
if the point is well within or well outside the polygon then take the 
result of the "simplified" check as correct, but if it's close to the 
border then check it properly, ie.

SELECT ...
FROM points, polygons
WHERE ST_DWithin(point, simple_area, 20000)
     AND (ST_Distance(point, simple_border) > 20000 OR ST_DWithin(point, 
real_area, 100))

simple_area = ST_SimplifyPreserveTopology(real_area::geometry, 
0.01)::geography and simple_border = the exterior ring of simple_area.

This takes about 18 minutes (a 5x improvement) and gives very similar 
results, but not quite the same. It falls down on polygons that have 
rhumblines along parallels, because they get turned into great circle 
lines. Eg. the original polyon may have a rhumbline approximated as (24 
10,25 10,26 10,27 10), ST_SimplifyPreserveTopology does its job and 
simplifies it to (24 10,27 10) and then ST_DWithin on geography treats 
it as a great circle line, giving an incorrect result. I tried inserting 
extra points to "unsimplify" the rhumblines, but that itself is very 
slow and also quite a hack, because I can't really be sure which lines 
were supposed be rhumblines when looking at the simplified polygon. I 
feel like I'm so close and this is a very silly little problem - but it 
is a problem.

Could anyone suggest a way to work around this? Or a different approach 
to the whole thing?

Thanks,

Evan



More information about the postgis-users mailing list