[postgis-devel] More Cascade Union Adventures
Martin Davis
mbdavis at refractions.net
Wed Aug 13 08:51:55 PDT 2008
I wouldn't expect to have the results be exactly equal - the union code
is likely to be input-order dependent.
And the difference in areas doesn't seem surprising either - it's way
down in the small decimal places, which would occur even with slight
differences in the geometry.
A more revealing test would be to compute the Hausdorff distance between
the union boundaries - that would show if they differed by very much,
and where. PostGIS doesn't have this - I can't remember whether
OpenJUMP does or not. JEQL has this operation, too.
Obe, Regina wrote:
> Now testing all including JTS 1.9.0 (OpenJump) on a Win XP runing
> PostgreSQL 8.3.1, PostGIS 1.3.3, Geos 3.0.0. It appears using array
> trumps all, Cascade aggregate union is a vast improvement over ST_Union
> (ST_Union I didn't bother testing of course because it owuld never
> finish on this test), but evidentally the array accum calls give a major
> penalty.
>
> I found some things I found a bit possibly disturbing. Maybe its just
> the nature of unioning in different orders. I compared all 3 outputs
> and none of them are binary equal or even ST_Equals for that matter.
>
> However all 3 give same basic stats using OpenJump - e.g
> All result in = 32972 pts
> components = 10
> lengths and areas are off by a bit
> length = agg cascade union = 17.262684721407624, k nested union =
> 17.262684721407688,
> jts = 17.262684721406348
>
> Should I be bothered by any of these or are they just rounding errors?
>
> --Other odd thing is that the array approach is not as good as it was on
> my other machine bu the aggregate union performs better. I'll just
> chuck this off to different postgresql version, and memory settings.
>
> Thanks,
> Regina
>
>
> -- 209,563 | 198,391 ms - SELECT 198391/1000.00/60 = 3.31 minutes
>
> SELECT ST_CascadeUnion(the_geom)
> FROM (SELECT the_geom FROM sample_poly) As foo;
>
> -- 48,594 ms | 47,688 ms = 48 secs
> SELECT st_unitecascade_garray_sort(ARRAY(SELECT the_geom FROM
> sample_poly));
>
> -- 74,515 ms | 74,922 ms = SELECT 74515/1000.00/60 = 1.24 minutes
> SELECT ST_Union(the_geom) AS the_geom, 'nested union'
> FROM (
> SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
> FROM (
> SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
> FROM (
> SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
> FROM (
> SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
> FROM (SELECT the_geom, id FROM sample_poly) As foo
> GROUP BY round(id/10)
> ORDER BY id) AS tmp1
> GROUP BY round(id/100)
> ORDER BY id) AS tmp2
> GROUP BY round(id/1000)
> ORDER BY id) AS tmp3
> GROUP BY round(id/10000)
> ORDER BY id) AS tmp4
> GROUP BY round(id/100000);
>
>
> -- Open Jump JTS 1.9.0 Win XP
> --After loading running union across the whole set
> --1.14 minutes
> --Loading database query
> SELECT ST_AsBinary(the_geom)
> FROM sample_poly;
>
> -----------------------------------------
> The substance of this message, including any attachments, may be
> confidential, legally privileged and/or exempt from disclosure
> pursuant to Massachusetts law. It is intended
> solely for the addressee. If you received this in error, please
> contact the sender and delete the material from any computer.
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>
>
--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022
More information about the postgis-devel
mailing list