[postgis-devel] More Cascade Union Adventures

Martin Davis mbdavis at refractions.net
Wed Aug 13 08:51:55 PDT 2008


I wouldn't expect to have the results be exactly equal - the union code 
is likely to be input-order dependent.

And the difference in areas doesn't seem surprising either - it's way 
down in the small decimal places, which would occur even with slight 
differences in the geometry. 

A more revealing test would be to compute the Hausdorff distance between 
the union boundaries - that would show if they differed by very much, 
and where.  PostGIS doesn't have this - I can't remember whether 
OpenJUMP does or not.  JEQL has this operation, too.

Obe, Regina wrote:
> Now testing all including JTS 1.9.0 (OpenJump) on a Win XP runing
> PostgreSQL 8.3.1, PostGIS 1.3.3, Geos 3.0.0.  It appears using array
> trumps all, Cascade aggregate union is a vast improvement over ST_Union
> (ST_Union I didn't bother testing of course because it owuld never
> finish on this test), but evidentally the array accum calls give a major
> penalty.
>
> I found some things I found a bit possibly disturbing.  Maybe its just
> the nature of unioning in different orders.  I compared all 3 outputs
> and none of them are binary equal or even ST_Equals for that matter.
>  
> However all 3 give same basic stats using OpenJump - e.g
> All result in = 32972 pts	
> components = 10
> lengths and areas are off by a bit
> length = agg cascade union = 17.262684721407624, k nested union =
> 17.262684721407688,
> 	jts = 17.262684721406348
>
> Should I be bothered by any of these or are they just rounding errors?
>  
> --Other odd thing is that the array approach is not as good as it was on
> my other machine bu the aggregate union performs better.  I'll just
> chuck this off to different postgresql version, and memory settings.
>
> Thanks,
> Regina
>  
>  
> --   209,563 | 198,391 ms - SELECT 198391/1000.00/60 = 3.31 minutes
>  
> SELECT ST_CascadeUnion(the_geom)
> FROM (SELECT the_geom FROM sample_poly) As foo;
>  
> -- 48,594 ms | 47,688 ms = 48 secs 
> SELECT st_unitecascade_garray_sort(ARRAY(SELECT the_geom FROM
> sample_poly));
>  
> -- 74,515 ms | 74,922 ms = SELECT 74515/1000.00/60 = 1.24 minutes
> SELECT ST_Union(the_geom) AS the_geom, 'nested union'
> FROM (
>    SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
>    FROM (
>      SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
>      FROM (
>        SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
>        FROM (
>          SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
>          FROM (SELECT the_geom, id FROM sample_poly) As foo
>          GROUP BY round(id/10)
>          ORDER BY id) AS tmp1
>        GROUP BY round(id/100)
>        ORDER BY id) AS tmp2
>      GROUP BY round(id/1000)
>      ORDER BY id) AS tmp3
>    GROUP BY round(id/10000)
>    ORDER BY id) AS tmp4
> GROUP BY round(id/100000);
>  
>
> -- Open Jump JTS 1.9.0 Win XP
> --After loading running union across the whole set
> --1.14 minutes
> --Loading database query 
> SELECT ST_AsBinary(the_geom)
> FROM sample_poly;
>
> -----------------------------------------
> The substance of this message, including any attachments, may be
> confidential, legally privileged and/or exempt from disclosure
> pursuant to Massachusetts law. It is intended
> solely for the addressee. If you received this in error, please
> contact the sender and delete the material from any computer.
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>
>   

-- 
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022




More information about the postgis-devel mailing list