[postgis-users] Deleting many points falling outside a large polygon (but within the bbox)

Tim Keitt tkeitt at gmail.com
Thu Jul 22 13:28:25 PDT 2010


Thanks Dan. Looks like it is working:

keittlab=# delete from grid where grid.gid not in (select grid.gid
from grid, regionbound where st_intersects(grid.the_geom,
regionbound.the_geom));
DELETE 61183118

That ran in less than an hour.

My original version was

delete from grid where st_disjoint(the_geom, (select the_geom from
regionbound))  # regionbound has 1 row

which ran very slowly. st_intersects appears to be much faster than st_disjoint.

THK

On Thu, Jul 22, 2010 at 3:17 PM, Dan Putler <dan.putler at sauder.ubc.ca> wrote:
> Hi Tim,
>
> A point-in-polygon test approach would seem to make sense. I did a quick
> Google search and ran into posting on Paul Ramsey's Clever Elephant blog
> that seems to describe exactly your situation:
> http://blog.cleverelephant.ca/2008/09/point-in-polygon-shortcuts.html
>
> A few more comments on this from Paul about implementation would likely be
> helpful, at least for me.
>
> Dan
>
> On 07/22/2010 12:44 PM, Tim Keitt wrote:
>>
>> I have a very large table of points; basically I pushed a raster
>> extending over much of the western hemisphere at 1km resolution into
>> the db as xyz (actually 2d points + z in another column). Not a crazy
>> as it sounds as you can do a lot of interesting things intersecting
>> these grid points with other geometries, and I have a lot of RAM and
>> disk space available. The region of interest however is much smaller
>> than the original raster and is defined by a large polygon (certain
>> continent margins) composed of many vertices. Deleting the points
>> outside the bounding box of the roi is reasonably quick, however there
>> are many points that remain within the bbox, but outside the polygon.
>> These take forever to cull as it appears the entire polygon has to be
>> searched for each point. Its looking like this will at least take
>> days, perhaps much more on a fairly fast machine.
>>
>> I'm curious if anyone has a reasonable solution. I was thinking of
>> dumping the roi polygon as points and then recursively subdividing the
>> bounding box, building quad-polygons on the way down. Those quads that
>> contain roi points are split into 4 while those that contain no points
>> remain. After a few recursion levels, you figure out which
>> quad-polygons are disjoint from the roi polygon and delete any
>> enclosed grid points. Points intersecting quads that are within the
>> roi polygon are not touched. Grid points within the remaining quads
>> would have to be searched one-by-one, but that should be a small
>> fraction of the total. Basically the idea is to emulate a kind of
>> quad-tree index in pure sql. Or alternatively I'll just come back in a
>> few weeks and see if the brute force query is done...
>>
>> THK
>>
>>
>
>



-- 
Timothy H. Keitt
http://www.keittlab.org/



More information about the postgis-users mailing list