[postgis-devel] More Cascade Union Adventures

Obe, Regina robe.dnd at cityofboston.gov
Wed Aug 13 05:06:42 PDT 2008


Now testing all including JTS 1.9.0 (OpenJump) on a Win XP runing
PostgreSQL 8.3.1, PostGIS 1.3.3, Geos 3.0.0.  It appears using array
trumps all, Cascade aggregate union is a vast improvement over ST_Union
(ST_Union I didn't bother testing of course because it owuld never
finish on this test), but evidentally the array accum calls give a major
penalty.

I found some things I found a bit possibly disturbing.  Maybe its just
the nature of unioning in different orders.  I compared all 3 outputs
and none of them are binary equal or even ST_Equals for that matter.
 
However all 3 give same basic stats using OpenJump - e.g
All result in = 32972 pts	
components = 10
lengths and areas are off by a bit
length = agg cascade union = 17.262684721407624, k nested union =
17.262684721407688,
	jts = 17.262684721406348

Should I be bothered by any of these or are they just rounding errors?
 
--Other odd thing is that the array approach is not as good as it was on
my other machine bu the aggregate union performs better.  I'll just
chuck this off to different postgresql version, and memory settings.

Thanks,
Regina
 
 
--   209,563 | 198,391 ms - SELECT 198391/1000.00/60 = 3.31 minutes
 
SELECT ST_CascadeUnion(the_geom)
FROM (SELECT the_geom FROM sample_poly) As foo;
 
-- 48,594 ms | 47,688 ms = 48 secs 
SELECT st_unitecascade_garray_sort(ARRAY(SELECT the_geom FROM
sample_poly));
 
-- 74,515 ms | 74,922 ms = SELECT 74515/1000.00/60 = 1.24 minutes
SELECT ST_Union(the_geom) AS the_geom, 'nested union'
FROM (
   SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
   FROM (
     SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
     FROM (
       SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
       FROM (
         SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
         FROM (SELECT the_geom, id FROM sample_poly) As foo
         GROUP BY round(id/10)
         ORDER BY id) AS tmp1
       GROUP BY round(id/100)
       ORDER BY id) AS tmp2
     GROUP BY round(id/1000)
     ORDER BY id) AS tmp3
   GROUP BY round(id/10000)
   ORDER BY id) AS tmp4
GROUP BY round(id/100000);
 

-- Open Jump JTS 1.9.0 Win XP
--After loading running union across the whole set
--1.14 minutes
--Loading database query 
SELECT ST_AsBinary(the_geom)
FROM sample_poly;

-----------------------------------------
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.




More information about the postgis-devel mailing list