[postgis-devel] CascadedUnion Early Results

Tue Jan 20 14:37:24 PST 2009

There should still be an improvement even if the dataset consists of 
disjoint polygons.  The reason is that the total number of union 
operations carried out is fewer, since CascadedUnion ends up clustering 
geometries together and unioning them in batches, rather than adding a 
single geometry at a time to an ever-larger result.

But the performance boost won't be nearly as great as when the 
geometries overlap, when the intermediate unions ends up reducing the 
total number of points which are being handled.

I've experimented with having some up-front checks to detect polygons 
which don't overlap (or more generally disjoint clusters of overlapping 
polygons), but the testing didn't show any great improvement over simply 
running the CascadedUnion directly.  I was a bit disappointed by this, 
since it seems like it should be faster to test disjoint than actually 
carry out a union.  Some more testing might be in order - feel free to 
try and prove me wrong!

Kevin Neufeld wrote:
> Yup, already forwarded it to you.
>
> Yes, it looks much better.  It obviously depends greatly on the 
> dataset being used ... just how much the dataset overlaps.
>
> Paul, out of curiosity, how does this method compare when trying to 
> use a set of completely disjoint polygons?  Since there is nothing to 
> gain by using a cascaded methodology in such a case, is there an 
> overhead to using the function?
>
> Cheers,
> Kevin
>
> Obe, Regina wrote:
>>
>> Ah okay that looks good.  I think the numbers look like about what I 
>> got on JTS.
>>
>> I have to dig up that set Kevin had given me way back when.  Kevin 
>> you don't still happen to have that do you?  I think it was named 
>> sample_poly.zip
>>
>>
>>
>> -----Original Message-----
>> From: postgis-devel-bounces at postgis.refractions.net on behalf of Paul 
>> Ramsey
>> Sent: Tue 1/20/2009 5:08 PM
>> To: PostGIS Development Discussion
>> Subject: Re: [postgis-devel] CascadedUnion Early Results
>>
>> Here's your test, slightly modified. Shows about 30-times improvement
>> over traditional union.
>>
>>   SELECT state,
>>    SUM(ST_NPoints(the_geom)) As  numpointsbefore,
>>      ST_NPoints(ST_Union(the_geom)) As numpointsafter
>>    FROM usstatebounds
>>    GROUP BY state
>>    ORDER BY state;
>>
>> Time: 579121.791 ms
>>
>>   SELECT state,
>>    SUM(ST_NPoints(the_geom)) As  numpointsbefore,
>>      ST_NPoints(ST_Union_Fast(the_geom)) As numpointsafter
>>    FROM usstatebounds
>>    GROUP BY state
>>    ORDER BY state;
>>
>> Time: 21145.717 ms
>>
>>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>

-- 
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022