[postgis-devel] More work on Cascade Union

Martin Davis mbdavis at refractions.net
Tue Aug 19 17:12:34 PDT 2008


It looks like OJ *is* using the new JTS UnaryUnionOp (aka 
CascadedUnion).  This was a change made on July 26 by Michael Michaud.

I'll save looking at your pgsql code for a slow afternoon sometime - I'm 
sure it will be educational.  In the meantime, good work on 
porting/modifying/enhancing the CascadedUnion concept for PostGIS!

Martin

Obe, Regina wrote:
>
> Martin,
>
> Do you know if OpenJump finally fully integrated the cascade union?  I 
> downloaded the openjump nightly build.  I did timings and it was able 
> to complete usstatebounds in about 12 secs (whereas before remember it 
> gave a memory overflow) and the original samply_poly I think it did in 
> about 22 seconds.  I wasn't sure if this is because I am running on a 
> Vista pc (and probably more beefed up one than I ran last time).  I 
> also don't see the count down with the union thing anymore. 
>
>
> Attached are my latest efforts. This includes the envelop check.  The 
> bool_and aggregate was disappointing so I fell on good old EXISTS (in 
> this scenario especially when you can't really rely on indexes, exists 
> performs better than a LEFT JOIN). 
>
>  Unfortunately in PostgreSQL since the geometries get swept into an 
> array, I can't really count on indexes to help out so I limited the 
> max check to be 1000 records and also do iterated checks.  Which I 
> think is fairly different from the JTS approach.
>
>
>  This time I changed the core of the cascade union function into a 
> plpgsql function since I figured people would have an easier time 
> reading it.  Not that I couldn't do it all with sql functions, but sql 
> functions had the disadvantage of not allowing declaration of 
> variables or using pretty names for input variables.
>
> I have the st99_d00 set down to below 8 secs - but still sucks 
> compared to JTS cascade union (3-4 secs), but still much better than 
> the 48-55 second timings I get with the postgis union function.
>
> I fear I can't do much better than this without delving into GEOS 
> land, which I am not quite ready for.
>
> One thing I did discover is that the dump collect method is a very 
> important thing to have and would make the whole ST_Collect thing much 
> more useful for people if it actually existed since I think that is 
> what people actually expect from that function.  I think part of the 
> frustration that people have is that st_collect returns geometry 
> collections more often than not.  I created a helper function that 
> does it since I had to use that technique so many times in the code, 
> but I suspect this would be done much more efficiently in the 
> lwpostgis code since it would then require only one trip instead of 
> many.  I haven't calculated to see how much of the time is wasted in 
> there.  I know that most of the lost time is still in the 
> ST_Unite_Garray function and the non-intersect check did not help 
> quite as much as I had hoped it would.
>
>
>
>
> --7855 ms, 7874 ms
> SELECT ST_CascadeUnion(the_geom) FROM st99_d00;
>
> This syntax which escapes the aggregate penalty gives me these timings
> --7195 ms, 6804 ms, 6996 ms
> SELECT st_unitecascade_garray_sort(ARRAY(SELECT the_geom FROM st99_d00));
>
>
> --114282 ms -- SELECT 114282/1000.0/60 -- 1.90 minutes (I didn't 
> bother running this with ST_Union since I would be waiting till the 
> cows came home)
> SELECT st_unitecascade_garray_sort(ARRAY(SELECT the_geom FROM 
> usstatebounds));
>
> --152438 ms -- SELECT 152438/1000.0/60 -- 2.55 minutes
> SELECT ST_CascadeUnion(the_geom) FROM usstatebounds;
>
> My cursory glance of the stats looks like I am getting the right 
> results although the internal sorting may not be optimal.
>
> Any feedback would be appreciated.
>
>
>
>
>
>
>
>
>
>
>
>
>
> ------------------------------------------------------------------------
>
> *The substance of this message, including any attachments, may be 
> confidential, legally privileged and/or exempt from disclosure 
> pursuant to Massachusetts law. It is intended solely for the 
> addressee. If you received this in error, please contact the sender 
> and delete the material from any computer. *
>
> ------------------------------------------------------------------------
>
> * Help make the earth a greener place. If at all possible resist 
> printing this email and join us in saving paper. *
>
> * *
>
> * *
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>   

-- 
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022




More information about the postgis-devel mailing list