<html><body><div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt"><div><span>I figure you have spatially indexed the polygons already?<br></span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><span><br></span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><span>Any way of pre-categorising your polygons - binning them in some way that allows a non spatial test in the where clause to replace the spatial test...</span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><br><span></span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new
york,times,serif; background-color: transparent; font-style: normal;"><span>eg: a boolean attribute set to indicate if a feature has any part <= a particular x value or not.</span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><br><span></span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><span>run this against the 2m features once to populate it, and assuming you get a 50/50 split of T/F features, your 2m^2 query can instead include a "where a.bool = b.bool", as we already know that if the boolean flag is different, they cannot touch, so your query should involve a spatial test only over 1m^2 instead... if you do this on both x & y, the boolean filter will replace even more spatial calculations & drop them to 500,000^2
tests...</span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><br><span></span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><span>If you can pre-classify features so that non-spatial tests can reduce the spatial ones (esp for features with lots of vertices) in a where clause, such queries do run much faster, but you have the trade-off of the time taken to carry out the classification... which is only n*no_classes/2, generally much faster than n^n<br></span></div><div><span><br></span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><span>Hope this makes sense...</span></div><div style="color: rgb(0, 0, 0); font-size: 16px;
font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><br><span></span></div><div style="color: rgb(0, 0, 0); font-size: 16px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><span>Brent Wood<br></span></div> <div style="font-family: times new roman, new york, times, serif; font-size: 12pt;"> <div style="font-family: times new roman, new york, times, serif; font-size: 12pt;"> <div dir="ltr"> <hr size="1"> <font size="2" face="Arial"> <b><span style="font-weight:bold;">From:</span></b> Lee Hachadoorian <Lee.Hachadoorian+L@gmail.com><br> <b><span style="font-weight: bold;">To:</span></b> PostGIS Users <postgis-users@postgis.refractions.net> <br> <b><span style="font-weight: bold;">Sent:</span></b> Wednesday, November 20, 2013 8:52 PM<br> <b><span style="font-weight: bold;">Subject:</span></b> [postgis-users] Finding Islands<br> </font>
</div> <div class="y_msg_container"><br>I am trying to find "islands", polygons in a (multi)polygon layer which <br>are not connected to any other polygons in the same layer. What I came <br>up with runs in a couple of seconds on a layer with ~1000 geometries, <br>and a couple of minutes on a layer with ~23,000 geometries, but I want <br>to run it on a layer of 2 million+ and it's taking a L-O-O-O-NG time, <br>presumably because it's making (2 million)^2 comparisons.<br><br>What I came up with is:<br><br>SELECT a.*<br>FROM table a LEFT JOIN table b<br> ON (ST_Touches(a.geom, b.geom))<br>WHERE b.gid is null<br><br>or<br><br>SELECT *<br>FROM table<br>WHERE gid NOT IN (<br> SELECT DISTINCT a.gid<br> FROM table a JOIN table b<br> ON (ST_Intersects(a.geom, b.geom) AND a.gid != b.gid)<br> )<br><br>The first variant raises "NOTICE: geometry_gist_joinsel called with
<br>incorrect join type". So I thought I could improve performance with an <br>INNER JOIN instead of an OUTER JOIN, and came up with the second <br>variant, and it does seem to perform somewhat better. Any suggestions <br>for how to speed up either approach?<br><br>Best,<br>--Lee<br><br>-- <br>Lee Hachadoorian<br>Assistant Professor in Geography, Dartmouth College<br><a href="http://freecity.commons.gc.cuny.edu/" target="_blank">http://freecity.commons.gc.cuny.edu</a><br> <br><br>_______________________________________________<br>postgis-users mailing list<br><a ymailto="mailto:postgis-users@lists.osgeo.org" href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a><br><a href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users" target="_blank">http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users</a><br><br><br></div> </div> </div> </div></body></html>