[postgis-users] Finding Islands

Brent Wood pcreso at pcreso.com
Wed Nov 20 00:26:35 PST 2013


I figure you have spatially indexed the polygons already?


Any way of pre-categorising your polygons - binning them in some way that allows a non spatial test in the where clause to replace the spatial test...

eg: a boolean attribute set to indicate if a feature has any part <= a particular x value or not.

run this against the 2m features once to populate it, and assuming you get a 50/50 split of T/F features, your 2m^2 query can instead include a "where a.bool = b.bool", as we already know that if the boolean flag is different, they cannot touch, so your query should involve a spatial test only over 1m^2 instead... if you do this on both x & y, the boolean filter will replace even more spatial calculations & drop them to 500,000^2 tests...

If you can pre-classify features so that non-spatial tests can reduce the spatial ones (esp for features with lots of vertices) in a where clause, such queries do run much faster, but you have the trade-off of the time taken to carry out the classification... which is only n*no_classes/2, generally much faster than n^n


Hope this makes sense...

Brent Wood


________________________________
 From: Lee Hachadoorian <Lee.Hachadoorian+L at gmail.com>
To: PostGIS Users <postgis-users at postgis.refractions.net> 
Sent: Wednesday, November 20, 2013 8:52 PM
Subject: [postgis-users] Finding Islands
 

I am trying to find "islands", polygons in a (multi)polygon layer which 
are not connected to any other polygons in the same layer. What I came 
up with runs in a couple of seconds on a layer with ~1000 geometries, 
and a couple of minutes on a layer with ~23,000 geometries, but I want 
to run it on a layer of 2 million+ and it's taking a L-O-O-O-NG time, 
presumably because it's making (2 million)^2 comparisons.

What I came up with is:

SELECT a.*
FROM table a LEFT JOIN table b
     ON (ST_Touches(a.geom, b.geom))
WHERE b.gid is null

or

SELECT *
FROM table
WHERE gid NOT IN (
     SELECT DISTINCT a.gid
     FROM table a JOIN table b
         ON (ST_Intersects(a.geom, b.geom) AND a.gid != b.gid)
     )

The first variant raises "NOTICE:  geometry_gist_joinsel called with 
incorrect join type". So I thought I could improve performance with an 
INNER JOIN instead of an OUTER JOIN, and came up with the second 
variant, and it does seem to perform somewhat better. Any suggestions 
for how to speed up either approach?

Best,
--Lee

-- 
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu
    

_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20131120/e17824a2/attachment-0001.html>


More information about the postgis-users mailing list