# [postgis-users] Re: Cascaded Union Aggregate function

Martin Davis mbdavis at refractions.net
Wed Oct 8 15:03:51 PDT 2008

```Jose,

Thanks for the very interesting paper.  As it happens, I came to a
similar realization a while back during the development of the Cascaded
Union approach.  If you have a dataset which contains a lot of disjoint
"groups" of intersecting geometries, it's actually a lot faster to
identify the groups and then union them individually.  As you point out,
a spatial index helps here.  Another thing which helps is the new
PreparedGeometry functionality, since it makes the intersection tests
more efficient.  (And it doesn't require a lot of groups to make the
grouping approach faster.  My test dataset was a map of the US states.
Alaska is obviously disjoint, but the CascadedUnion approach added it
into the union a lot sooner than it should have been, which slowed
things down a lot).

To compute the disjoint groups, essentially what you are doing is
computing the disjoint subgraphs in the graph of the intersection
relation over the set of input geometries.  As your paper points out, to
do this you need an iterative or recursive algorithm to walk to graph
marking the geometries with the group they belong to.  You didn't seem
to provide an implementation for that - but this could be a great use of
the upcoming Recursive CTE capability.  Alternatively a procedural
implementation could be used, of course.

If the input is actually (fully or almost) in one intersection group,
then I think the Cascaded Union algorithm is probably more efficient.
So the key is knowing what the structure of your input is.  I don't know
of any way to determine this without actually running the graph
traversal - but for large inputs this may be worth the cost.

I've considered implementing this as an alternative in JTS.  For PostGIS
it may be more efficient to implement it at the SQL level, since there
is a lot of overhead with managing many geometries in memory.

Martin

Jose Carlos Martinez Llario wrote:
> Hi,
>
> I want to tell you about the work i did a couple of years ago about
> the same topics with the hope it can help you.
> I tried to improve the union operator to be able to carry out the
> 'dissolve spatial operation'. Im a cartographer and I noticed that we
> can not use PostGIS in a direct way for dissolving a layer with 10.000
> geometries or more because it takes a long time and it needs a lot of
> resources for working.
>
> I wrote a paper (sorry about the english...) explaining the work i did
> (unfortunately i need to write papers because of my job...i hate that
> of course).
> In the introduction of the article  i did some test about cascade
> union. I got around 10 times faster results with the cascade union.
> In the second part i proposed an algorithm to dissolve just the
> adjacent polygons. For example in a cadastral dataset starting from
> parcels to get lots boundaries. I got around 100 times faster. In
> total around 1000 times faster than just using the union aggregate
> with the whole dataset.
>
> In the article a test removed the shared boundaries in a layer with
> 500 000 polygons. It took around 1000 secs in a kind of old computer.
>
>  Geometries                        The whole aggregate    Cascade (i
> called it 'fragmented')      Just adjacent polygons
> Real case
>   10 000                                    1 600
> s                                 around 100
> s                                     < 15 s
>   30 000                                 i couldnt test
> it                         around 1000 s
> around  50 s
>
> Cadastral simulation (lots with 10 parcels)
>  80 000 parcels                             i couldnt test
> it                             around 5000
> s                             around 100 s
> 500 000 parcels                             i couldnt test
> it                            i couldnt test
> it                              around 1000 s
>
>
> Look in the article because the datasets have special geometry
> configuration.
>
> I think the union operation can be really fast in a system with
> persistent topology. Im looking forward to try this with the current
> implementation of topology in PostGIS. I would like to know if someone
> is checking or implementing these kind of operations in PostGIs using
> topology.
>
> I hope it can be useful for the PostGIS community,
> The link of the article is:
> http://cartosig.upv.es/es/system/files/jomarlla/article-martinez-llario-2.pdf
>
>
> Cheers,
> José Carlos
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>

--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022

```

More information about the postgis-users mailing list