[postgis-devel] More work on Cascade Union

Obe, Regina robe.dnd at cityofboston.gov
Wed Aug 20 12:32:29 PDT 2008


Martin,
On further inspection into the depths of GEOS code.  I see you are right it is doing a Binary ops thing.  I'm not sure you assumption about effectiveness of binary ops is wrong for points since OpenJump works pretty fast.

So maybe its an issue with the unite_garray function in PostGIS lwgeom code that calls GEOS -. The penalty incurred for each and every  POSTGIS2GEOS(geom) it to a GEOS one and then unioning it with the result that I am seeing and not necessarily an issue with GEOS itself.

What confused me is that there are 2 versions of unite_garray - one that uses buffer collect and one that uses an iterate GEOS union approach.

The version of the trunk code I'm looking at has this remarked out 
/* #define UNITE_USING_BUFFER 1 */
But I was mistakingly reading the UNITE USING BUFFER version which apparently is not in use.


Thanks,
Regina

-----Original Message-----
From: postgis-devel-bounces at postgis.refractions.net on behalf of Martin Davis
Sent: Wed 8/20/2008 2:52 PM
To: PostGIS Development Discussion
Subject: Re: [postgis-devel] More work on Cascade Union
 


Obe, Regina wrote:
> So why does GEOS perform so poorly then? I suspect the cascade 
> algorithm I am using is gaining by the fact that it realizes fairly 
> quickly it can just collect all the points except the rare cases where 
> the points are duplicated then it unions.
>
>
>  I'll check on unioning tiger lines to see what kind of metrics I 
> get.  Looking at the GEOS code - I don't get a sense its doing the 
> same thing as JTS, but I haven't really delved to see what the Overlay 
> classes are actually doing in JTS or done more than a scant glance at 
> the GEOS functions so I could very well be wrong in my reading.
>
The GEOS overlay algorithms are basically identical to JTS. 

So you're saying that GEOS is slow for unioning lots of points?  Not 
sure why this would be.  It's possible that it's just the GEOS mem 
management which is slowing things down.  Or, perhaps the Point union 
algorithm is not as performant as I thought it was.  When I get time 
(after my 3 wk vacation...)  I'll try and have a look at this.
>
>  
> > Hmm...   what exactly is the data that causes this crash?  I'd like to
> > dig into this and find out what's going on in JTS.
>
> Simple - Create a table like this. Granted very unrealistic and then 
> try to union all the points in JTS (well in my case OpenJump latest 
> snapshot - reaches memory heap fault in 11 seconds) -
> 1,000,000 points in one geom is hard to swallow anyway so I wasn't too 
> concerned.
>
> CREATE TABLE tmp_points(the_geom geometry);
> INSERT INTO tmp_points
> SELECT ST_MakePoint(x,y) As the_geom FROM generate_series(1,1000) x,
> generate_series(1,1000) y
> order by random();
>
Sounds like the JVM memory limit is set too low to run a million 
points.  Crank it up to 1 GB and try again. 
>
>
>
> > Not sure what you mean by "internally sorted geometry"?  If you talking
> > about a linear sort of the geometries (by x or possibly x,y), I don't
> > think this is going to be as good as doing a true spatial sort (which is
> > what the Rtree index in CascadedUnion does, and what Kevin's grid
> > approach does)
> Doing a true spatial sort - basically order by the_geom which I assume 
> would use the Rtree operators for sorting or at least seems to.
>
Hmm...  AFAIK sorting geometry just does a conventional "x then y" 
ordering of the geometry.  This is better than nothing, but isn't as 
effective as a true spatial index ordering.  It's worth noting that 
using a spatial index actually doesn't give you a linear order of the 
geometries - the key is to walk the tree of objects in depth-first 
order, performing unions progressively up the tree.  I don't think a 
linear order provides the same kind of effect.
>
>
>
>
>
>
>
> ------------------------------------------------------------------------
>
> *The substance of this message, including any attachments, may be 
> confidential, legally privileged and/or exempt from disclosure 
> pursuant to Massachusetts law. It is intended solely for the 
> addressee. If you received this in error, please contact the sender 
> and delete the material from any computer. *
>
> ------------------------------------------------------------------------
>
> * Help make the earth a greener place. If at all possible resist 
> printing this email and join us in saving paper. *
>
> * *
>
> * *
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>   

-- 
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022

_______________________________________________
postgis-devel mailing list
postgis-devel at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-devel





-----------------------------------------
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20080820/ad3be590/attachment.html>


More information about the postgis-devel mailing list