[postgis-devel] More Cascade Union Adventures
Obe, Regina
robe.dnd at cityofboston.gov
Wed Aug 13 16:26:37 PDT 2008
Martin,
Ok, I just tried it in OJ and it's 3 seconds for me too, versus 10 seconds with PostGIS ST_CascadeUnion (although PostGIS is on a separate server, so I'll have to do another test later).
ST_Union takes 99,703 ms. I'm not sure why the 30,000-polygon set is performing so poorly for me in OJ, then. Maybe it's my memory settings.
Wish I could find that other set I was trying, but I don't have a connection to that server at the moment.
At any rate, I do recall trying the US state bounds, and for some reason I ran out of heap space in OJ, but it did complete within a minute with ST_CascadeUnion. I'll have to find where I have that. I think that would be a good test.
So I guess it looks like we are back to the 3 to 1 mark. Drat. I bow to the Doctor.
Thanks,
Regina
-----Original Message-----
From: postgis-devel-bounces at postgis.refractions.net on behalf of Martin Davis
Sent: Wed 8/13/2008 7:10 PM
To: PostGIS Development Discussion
Subject: Re: [postgis-devel] More Cascade Union Adventures
Ok...
I just tried the dataset you sent in OJ and JTS. They were both
basically the same - 3 secs.
Obe, Regina wrote:
>
> Hmm that doesn't look like the right dataset. Now I have to figure
> out where I dug up this other towns layer. Running on this particular
> layer ST_CascadeUnion runs in
> 10 secs. Regular ST_Union takes 99,703 ms.
>
> I'll let you know when I find it. I have too many places where I have
> servers and all have slightly different datasets with the same name.
>
>
>
> -----Original Message-----
> From: postgis-devel-bounces at postgis.refractions.net on behalf of Obe,
> Regina
> Sent: Wed 8/13/2008 6:40 PM
> To: PostGIS Development Discussion
> Subject: RE: [postgis-devel] More Cascade Union Adventures
>
> Sure - it's just the one on the MassGIS site
>
> ftp://data.massgis.state.ma.us/pub/shape/state/towns.exe
>
> Shucks - I guess that means I don't win a prize huh :(
>
> Thanks,
> Regina
>
> -----Original Message-----
> From: postgis-devel-bounces at postgis.refractions.net on behalf of
> Martin Davis
> Sent: Wed 8/13/2008 6:40 PM
> To: PostGIS Development Discussion
> Subject: Re: [postgis-devel] More Cascade Union Adventures
>
> I'm pretty sure that OJ is *not* calling CascadeUnion, but is using a
> special-purpose, similar algorithm developed by Michael Michaud. I
> just looked at the OJ src code to confirm, and this looks to be the
> situation. I think he tested his code against the JTS CascadedUnion and
> decided that the JTS algorithm is still faster.
>
> I just tried the OJ Union as well as the JTS CascadedUnion on the
> 32278-feature dataset. Results were:
>
> OJ Union: 27 sec
> JTS CascUnion: 17.7 sec
>
> So JTS still wins, but they're not too far apart.
>
> Just for grins, can you send me your Mass towns dataset to try?
>
> Obe, Regina wrote:
> >
> > Martin,
> >
> > Just loaded that plugin (unless I put it in the wrong folder). It
> > just seemed to create an option called aggregation-options under
> > plugins, which seems to require two layers and does union and count
> > between the 2 layers. So no, that doesn't sound like cascaded union,
> > and it didn't seem to change anything with the union function.
> >
> > If the OpenJUMP union function isn't doing a cascaded union, then what
> > is that countdown thing for? Doing the 30,000 it displays something
> > like this
> >
> > Computing Union
> > 1/8 (32,278)
> > 2/8 (8070)
> > 3/8 ... /2018
> > 86/505
> > 1/127
> > 6/8 (2/32)
> > 8/8 (2/3)
> > 3/3 (8/8)
> >
> > I also thought its speed of unioning of Mass towns was pretty
> > impressive.
> >
> > Thanks,
> > Regina
> > -----Original Message-----
> > From: postgis-devel-bounces at postgis.refractions.net on behalf of Obe,
> > Regina
> > Sent: Wed 8/13/2008 5:40 PM
> > To: PostGIS Development Discussion
> > Subject: RE: [postgis-devel] More Cascade Union Adventures
> >
> > I assumed it was since it was counting down like in some sort of
> > upside down pyramid
> >
> > 500
> > 255
> > 10
> > :
> > :
> >
> > So it seemed like it was doing some sort of division of the
> > geometries. I'll give the below
> > a try. Maybe I misunderstood what that counting was for.
> >
> >
> >
> > -----Original Message-----
> > From: postgis-devel-bounces at postgis.refractions.net on behalf of
> > Martin Davis
> > Sent: Wed 8/13/2008 4:49 PM
> > To: PostGIS Development Discussion
> > Subject: Re: [postgis-devel] More Cascade Union Adventures
> >
> > Yep, I don't think Hausdorff distance is available in very many places. Pity,
> > because it is a very useful metric for comparing geometry. The full
> > Hausdorff distance is quite challenging to implement, but I have made a
> > VertexHausdorffDistance approximation which is just as useful in most
> > situations, and is a lot simpler to implement (and faster to run).
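
A minimal sketch of a vertex-based Hausdorff approximation of the kind Martin describes. This is not his VertexHausdorffDistance code; it assumes one plausible reading of the idea: for each vertex of one boundary, take the distance to the nearest point on the other boundary, keep the largest such distance, and do the same in the other direction. The function names and the two sample rings are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the closed segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def directed_vertex_hausdorff(ring_a, ring_b):
    """Largest distance from any vertex of ring_a to the boundary of ring_b."""
    segments = list(zip(ring_b, ring_b[1:]))
    return max(
        min(point_segment_distance(p, a, b) for a, b in segments)
        for p in ring_a
    )


def vertex_hausdorff(ring_a, ring_b):
    """Symmetric version: the larger of the two directed distances."""
    return max(
        directed_vertex_hausdorff(ring_a, ring_b),
        directed_vertex_hausdorff(ring_b, ring_a),
    )


# Hypothetical closed rings: a unit square, and the same square with one
# extra vertex pushed 0.25 above the top edge.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
bumped = [(0, 0), (1, 0), (1, 1), (0.5, 1.25), (0, 1), (0, 0)]

print(vertex_hausdorff(square, bumped))
```

Because only vertices are sampled, this can underestimate the true Hausdorff distance when the largest deviation falls in the middle of a long segment; for two unions of the same input that differ only by small perturbations it should still show how far apart the boundaries are.
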
> >
> > By the way, are you sure that OpenJUMP is using CascadedUnion? AFAIK it
> > didn't in the past... Michael Michaud has just released (today!) an
> > extension which I think does use CascadedUnion - so you might want to
> > try that.
> >
> > http://geo.michaelm.free.fr/OpenJUMP/resources/aggregation-0.1.jar
> >
> > When I did the original testing with the 30K polygon dataset that you're
> > using, I was getting times of around 20 sec using CascadedUnion...
> >
> >
> >
> > Obe, Regina wrote:
> > >
> > > Martin,
> > > Pardon my ignorance.
> > >
> > > I don't see a Hausdorff distance in OpenJUMP, or maybe it goes by a
> > > more verbose name, and the descriptions I read about Hausdorff
> > > distances are Greek to me.
> > >
> > > Well, the Mass town test seems to pass my trivial test exercises. It
> > > looks like Massachusetts with no visually apparent gaps, and has the
> > > same number of points in all cases and a similar area
> > >
> > > both num points - 476026
> > > both num geometries - 694
> > > area (ST_CascadeUnion - 2.094208570266725E10)
> > > area (JTS - 2.0942085702666965E10)
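
To put the two area figures above in perspective, here is a small sketch that computes their absolute and relative difference. The two values are copied from the list above; the tolerance of 1e-12 is an arbitrary threshold chosen for illustration, not something either library prescribes.

```python
import math

# Areas reported above for the Mass towns union.
area_cascade = 2.094208570266725e10   # ST_CascadeUnion
area_jts = 2.0942085702666965e10      # JTS

abs_diff = abs(area_cascade - area_jts)
rel_diff = abs_diff / max(area_cascade, area_jts)

print(f"absolute difference: {abs_diff:.3e}")
print(f"relative difference: {rel_diff:.3e}")

# Hypothetical tolerance: treat the results as equivalent if they agree
# to about 12 significant digits.
print(math.isclose(area_cascade, area_jts, rel_tol=1e-12))
```

A relative difference this far down in the significant digits is the kind of discrepancy that different operation orders in floating-point arithmetic can produce on their own.
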
> > >
> > >
> > > -----Original Message-----
> > > From: postgis-devel-bounces at postgis.refractions.net on behalf of
> > > Martin Davis
> > > Sent: Wed 8/13/2008 11:51 AM
> > > To: PostGIS Development Discussion
> > > Subject: Re: [postgis-devel] More Cascade Union Adventures
> > >
> > > > I wouldn't expect to have the results be exactly equal - the union
> > > > code is likely to be input-order dependent.
> > >
> > > And the difference in areas doesn't seem surprising either - it's way
> > > down in the small decimal places, which would occur even with slight
> > > differences in the geometry.
> > >
> > > > A more revealing test would be to compute the Hausdorff distance
> > > > between the union boundaries - that would show if they differed by
> > > > very much, and where. PostGIS doesn't have this - I can't remember
> > > > whether OpenJUMP does or not. JEQL has this operation, too.
> > >
> > > Obe, Regina wrote:
> > > > > Now testing all, including JTS 1.9.0 (OpenJUMP), on Win XP running
> > > > > PostgreSQL 8.3.1, PostGIS 1.3.3, GEOS 3.0.0. It appears using array
> > > > > trumps all. Cascade aggregate union is a vast improvement over
> > > > > ST_Union (which I didn't bother testing, of course, because it would
> > > > > never finish on this test), but evidently the array accum calls give
> > > > > a major penalty.
> > > >
> > > > > I found some things a bit disturbing. Maybe it's just
> > > > > the nature of unioning in different orders. I compared all 3 outputs
> > > > > and none of them are binary equal, or even ST_Equals for that matter.
> > > >
> > > > However all 3 give same basic stats using OpenJump - e.g
> > > > All result in = 32972 pts
> > > > components = 10
> > > > lengths and areas are off by a bit
> > > > length = agg cascade union = 17.262684721407624, k nested union =
> > > > 17.262684721407688,
> > > > jts = 17.262684721406348
> > > >
> > > > > Should I be bothered by any of these, or are they just rounding
> > > > > errors?
> > > >
> > > > > --Other odd thing is that the array approach is not as good as it
> > > > > was on my other machine, but the aggregate union performs better.
> > > > > I'll just chalk this up to a different PostgreSQL version and memory
> > > > > settings.
> > > >
> > > > Thanks,
> > > > Regina
> > > >
> > > >
> > > > -- 209,563 | 198,391 ms - SELECT 198391/1000.00/60 = 3.31 minutes
> > > >
> > > > SELECT ST_CascadeUnion(the_geom)
> > > > FROM (SELECT the_geom FROM sample_poly) As foo;
> > > >
> > > > -- 48,594 ms | 47,688 ms = 48 secs
> > > > SELECT st_unitecascade_garray_sort(ARRAY(SELECT the_geom FROM
> > > > sample_poly));
> > > >
> > > > -- 74,515 ms | 74,922 ms = SELECT 74515/1000.00/60 = 1.24 minutes
> > > > SELECT ST_Union(the_geom) AS the_geom, 'nested union'
> > > > FROM (
> > > > SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
> > > > FROM (
> > > > SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
> > > > FROM (
> > > > SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
> > > > FROM (
> > > > SELECT min(id) AS id, ST_Union(the_geom) AS the_geom
> > > > FROM (SELECT the_geom, id FROM sample_poly) As foo
> > > > GROUP BY round(id/10)
> > > > ORDER BY id) AS tmp1
> > > > GROUP BY round(id/100)
> > > > ORDER BY id) AS tmp2
> > > > GROUP BY round(id/1000)
> > > > ORDER BY id) AS tmp3
> > > > GROUP BY round(id/10000)
> > > > ORDER BY id) AS tmp4
> > > > GROUP BY round(id/100000);
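
The nested query above unions in groups of ten, then unions those results in groups of ten, and so on until one geometry is left. A minimal sketch of that level-by-level reduction, using Python frozensets as a stand-in for geometries (an assumption for illustration only; no real geometry library is involved, and `union_in_batches` is a hypothetical name):

```python
from functools import reduce


def union_in_batches(items, batch_size=10):
    """Union items level by level until a single result remains."""
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    level = list(items)
    if not level:
        return frozenset()
    while len(level) > 1:
        # Each pass unions consecutive batches, shrinking the list by
        # roughly a factor of batch_size, like one GROUP BY level above.
        level = [
            reduce(frozenset.union, level[i:i + batch_size])
            for i in range(0, len(level), batch_size)
        ]
    return level[0]


# Stand-in "polygons": each shares one element with its neighbor, the way
# adjacent polygons share an edge.
polygons = [frozenset({i, i + 1}) for i in range(32278)]
merged = union_in_batches(polygons)
print(len(merged))
```

The point of the batching is that most unions happen between small inputs; only the last few levels touch large results. Note that grouping by position (or by id, as in the SQL) only pays off for real geometries if neighboring ids are also spatially close, which this sketch does not model.
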
> > > >
> > > >
> > > > -- Open Jump JTS 1.9.0 Win XP
> > > > --After loading running union across the whole set
> > > > --1.14 minutes
> > > > --Loading database query
> > > > SELECT ST_AsBinary(the_geom)
> > > > FROM sample_poly;
> > > >
> > > > -----------------------------------------
> > > > The substance of this message, including any attachments, may be
> > > > confidential, legally privileged and/or exempt from disclosure
> > > > pursuant to Massachusetts law. It is intended
> > > > solely for the addressee. If you received this in error, please
> > > > contact the sender and delete the material from any computer.
> > > >
> > > > _______________________________________________
> > > > postgis-devel mailing list
> > > > postgis-devel at postgis.refractions.net
> > > > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> > > >
> > > >
> > >
> > > --
> > > Martin Davis
> > > Senior Technical Architect
> > > Refractions Research, Inc.
> > > (250) 383-3022
> > >
> > > _______________________________________________
> > > postgis-devel mailing list
> > > postgis-devel at postgis.refractions.net
> > > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> > >
> > >
> > >
> > >
> ------------------------------------------------------------------------
> > >
> > > _______________________________________________
> > > postgis-devel mailing list
> > > postgis-devel at postgis.refractions.net
> > > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> > >
> >
--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022