[postgis-devel] Lame attempt at cascade union in sql

Obe, Regina robe.dnd at cityofboston.gov
Thu Aug 7 12:53:42 PDT 2008


Actually I realized the solution may be much simpler and not require recursion and would be much faster than what I sent you.  I'll try that probably and let you know how it goes.

-----Original Message-----
From: postgis-devel-bounces at postgis.refractions.net on behalf of Kevin Neufeld
Sent: Thu 8/7/2008 3:46 PM
To: PostGIS Development Discussion
Subject: Re: [postgis-devel] Lame attempt at cascade union in sql
 
Hi Regina,

So, I ran a few tests on a table with ~30000 overlapping polygons.

Test1:
SELECT ST_Union(the_geom) FROM sample_poly;
--  Total runtime: + 2 hours before I cancelled it.


Test2:
SELECT ST_CascadeUnion(the_geom) FROM sample_poly;
--  Total runtime: 19 minutes.


Test3:
SELECT ST_Union(the_geom) AS the_geom
FROM (
   SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
   FROM (
     SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
     FROM (
       SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
       FROM (
         SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
         FROM sample_poly
         GROUP BY round(id/10)
         ORDER BY id) AS tmp1
       GROUP BY round(id/100)
       ORDER BY id) AS tmp2
     GROUP BY round(id/1000)
     ORDER BY id) AS tmp3
   GROUP BY round(id/10000)
   ORDER BY id) AS tmp4
GROUP BY round(id/100000);
--  Total runtime: 1.5 minutes


Test4:
Reordered the tuples in sample_poly by "ORDER BY 
ST_SnapToGrid(ST_Centroid(the_geom), 0.01);"
Reran query in Test3:
--  Total runtime: 50 secs


So, your function is a definite improvement over ST_Union, but the 
timings could still be better.

Cheers,
Kevin

Obe, Regina wrote:
> Kevin,
> 
> Can you try the cascade approach I sent in my last email with your large
> dataset.
> 
> I tried it with my real data and here are the results 
> 
> --time 3328 ms, 3203 ms, 3188 ms
> SELECT ST_CascadeUnion(the_geom)
> FROM (SELECT * FROM landparcels
> WHERE pid lIKE '01%' and the_geom is not null
> LIMIT 1000) p
> 
> --time 23375 ms, 23594 ms, 24109 ms
> SELECT ST_Union(the_geom)
> FROM (SELECT * FROM landparcels
> WHERE pid lIKE '01%' and the_geom is not null
> LIMIT 1000) p
> -----------------------------------------
> The substance of this message, including any attachments, may be
> confidential, legally privileged and/or exempt from disclosure
> pursuant to Massachusetts law. It is intended
> solely for the addressee. If you received this in error, please
> contact the sender and delete the material from any computer.
> 
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
_______________________________________________
postgis-devel mailing list
postgis-devel at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-devel


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20080807/13e33add/attachment.html>


More information about the postgis-devel mailing list