[postgis-users] Speeding up point-in-polygon search using ST_SimplifyPreserveTopology?

Stephen Woodbridge woodbri at swoodbridge.com
Mon May 28 09:21:20 PDT 2012


On 5/28/2012 10:08 AM, Evan Martin wrote:
> Hi all,
>
> I have a table of ~350 multi-polygons and ~185,000 points and I need to
> find which points are inside which multi-polygons. Some polygons are
> quite large and span the dateline, so I'm using geography ST_DWithin for
> this (with a tolerance of 100m). My initial query looks like this
> (simplified):
>
> SELECT ...
> FROM points, polygons
> WHERE ST_DWithin(point, real_area, 100)
>
> This works, but it takes about 90 minutes. I'm trying to make it faster
> by using ST_SimplifyPreserveTopology. That worked very nicely for my
> "adjacent polygons" problem
> [http://postgis.refractions.net/pipermail/postgis-users/2012-January/031992.html],
> because all polygons were modified in the same way, but this is
> trickier. Since I'm modifying the polygon, but not the point, the
> results are different. So I thought maybe I could do this in two phases:
> if the point is well within or well outside the polygon then take the
> result of the "simplified" check as correct, but if it's close to the
> border then check it properly, ie.
>
> SELECT ...
> FROM points, polygons
> WHERE ST_DWithin(point, simple_area, 20000)
> AND (ST_Distance(point, simple_border) > 20000 OR ST_DWithin(point,
> real_area, 100))
>
> simple_area = ST_SimplifyPreserveTopology(real_area::geometry,
> 0.01)::geography and simple_border = the exterior ring of simple_area.
>
> This takes about 18 minutes (a 5x improvement) and gives very similar
> results, but not quite the same. It falls down on polygons that have
> rhumblines along parallels, because they get turned into great circle
> lines. Eg. the original polyon may have a rhumbline approximated as (24
> 10,25 10,26 10,27 10), ST_SimplifyPreserveTopology does its job and
> simplifies it to (24 10,27 10) and then ST_DWithin on geography treats
> it as a great circle line, giving an incorrect result. I tried inserting
> extra points to "unsimplify" the rhumblines, but that itself is very
> slow and also quite a hack, because I can't really be sure which lines
> were supposed be rhumblines when looking at the simplified polygon. I
> feel like I'm so close and this is a very silly little problem - but it
> is a problem.
>
> Could anyone suggest a way to work around this? Or a different approach
> to the whole thing?

I think the alternative approach that has been discussed on the list in 
the past, and should be in the archives, is to cut the multiple polygons 
into tiles with all the attributes of the original multipolygon and then 
to match the points to the tiles..

This works much faster for two basic reasons:

1. the number of points in the each tile is much less than the original 
because it only contains a fraction of the originals complex boundary 
but maintains all the detail.
2. since each tile is spatially smaller, you get better (faster) 
interactions between the tile index and the points.

-Steve



More information about the postgis-users mailing list