[postgis-devel] Ordering of geometries in multipolygons, preparedgeometries

Martin Davis mbdavis at refractions.net
Tue Aug 5 12:15:52 PDT 2008


Cascaded union uses a spatial index to order the geometries.  This 
provides better performance than using a single dimension (such as X) to 
order them on, since it improves the spatial (2D) locality of 
reference.  The performance boost provided by orering comes from 
linework being eliminated as the unioning proceeds - and cascaded union 
in general will eliminate more linework than using a 1-d sort.

Also, Cascaded union unions across a hierarchy of geometries.  This 
keeps the union arguments "as small as possible for as long as 
possible".  In constrast, MemUnion adds one geometry at a time into the 
union result, which means that at least one of the union arguments 
starts to get very big.

This is the post I wrote with some stats on performance of cascaded VS 
unordered union.

http://postgis.refractions.net/pipermail/postgis-users/2007-November/017694.html



Obe, Regina wrote:
>
> Kevin,
>
> I remember that thread too and was thinking if that was along the 
> lines of how the cascaded union thing would work.  In the cascaded 
> case I presume we would still want to order the geometries.  I also 
> wasn't quite clear how this is different from using
>
> ST_MemCollect or ST_MemUnion - except it probably bunches more than 2 
> at a time.
>
> Looking at the example I did where I was puzzled about why ordering 
> made things slightly worse, I discovered I had the table clustered on 
> the geometry so that's probably why - it was already naturally sorted 
> so sorting again added a bit more processing - granted it seemed to 
> add more than it should have.  I'll have to revisit that one again.
>
> Thanks,
> Regina
>
> -----Original Message-----
> From: postgis-devel-bounces at postgis.refractions.net on behalf of Kevin 
> Neufeld
> Sent: Tue 8/5/2008 2:39 PM
> To: PostGIS Development Discussion
> Subject: Re: [postgis-devel] Ordering of geometries in multipolygons, 
> preparedgeometries
>
> Yes, specifying a custom ordering for collect often makes a huge
> difference when performing certain operations, but PostGIS does not do
> anything special internally with the ordering of a collection ... yet.
>
> I recall back in Nov.2007 that Lee Keel was trying to union ~33000
> polygons together that was taking over 5 hours to compute.  (See the
> "geomunion revisited..." thread in postgis-users).
>
> Using a technique called cascaded union, Martin Davis and I were able to
> drop the computation to half a minute by simply ordering the geometries
> so that they were spatially close to eachother and then unioning small
> collections at a time.
> (http://postgis.refractions.net/pipermail/postgis-users/2007-November/017696.html)
>
> Martin, what's the status of your CascadedUnion class in JTS?  Do you
> know if it ever made its way into GEOS yet?
>
> Cheers,
> -- Kevin
>
>
> Obe, Regina wrote:
> > Do PostGIS functions take advantage of the internal ordering of
> > geometries in a Multi geometry?
> >
> > The reason I ask is that since ST_Collect just collects geometries in
> > the order they are fed in, I'm wandering if its better to advice to
> > first order the geometries before collecting.
> >
> > Something along the line of
> >
> > SELECT stusps,
> >          ST_Collect(f.the_geom) as singlegeom 
> >        FROM (SELECT stusps, (ST_Dump(the_geom)).geom As the_geom
> >                               FROM
> >                               somestatetable
> >                       ORDER BY stusps, somestatetable.the_geom ) As f
> > GROUP BY stusps
> >
> > or maybe it doesn't make a difference.
> >
> > Also I've noticed when using ST_Union if I order the geometries first
> > before passing them to ST_Union, it is anywhere from 20-30% faster.  On
> > some occasions though - its actually slower by about 10% (even when my
> > ordering doesn't add any significant time).  I'm still trying to figure
> > this out.
> >
> > Case in point - runs in 359-360 ms
> > SELECT ST_Union(the_geom)
> > from (SELECT * from neighborhoods ORDER BY the_geom)  as foo
> >
> > Vs - runs between 530 - 550 ms
> > SELECT ST_Union(the_geom)
> > from (SELECT * from neighborhoods ) as foo
> >
> > This is all running on - "POSTGIS="1.3.3" GEOS="3.0.0-CAPI-1.4.1"
> > PROJ="Rel. 4.6.0, 21 Dec 2007" USE_STATS"
> >
> > Thanks,> Regina
> >
> >
> > -----------------------------------------
> > The substance of this message, including any attachments, may be
> > confidential, legally privileged and/or exempt from disclosure
> > pursuant to Massachusetts law. It is intended
> > solely for the addressee. If you received this in error, please
> > contact the sender and delete the material from any computer.
> >
> > _______________________________________________
> > postgis-devel mailing list
> > postgis-devel at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>   

-- 
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022




More information about the postgis-devel mailing list