[geos-devel] Gridded Union idea

Martin Davis mbdavis at refractions.net
Mon Jun 28 13:16:39 EDT 2010


Here's the idea of Gridded Union I mentioned as a workaround to memory 
limitations for CascadedUnion in JTS & GEOS.

Basically this mimics the operation of CascadedUnion, but using a grid 
index instead of a tree, since that's easier to code in a database 
environment.

1. Choose an appropriate grid size.  This will have to be determined by 
experimentation.  The goal is to use as large a cell size as possible, 
but small enought to allow the unions in step 4 to all execute.
2. First, extract the "lowest" coordinate from each geometry.  This will 
be used to assign the geometry to a grid cell.  The lowest coordinate 
can be defined as the coordinate with minimum X, and then minimum Y.  (I 
have to admit, I don't know how to do this in PostGIS - but I'm sure 
there's a way!)
3. Compute the grid cell for each geometry, by determining which grid 
cell the min point lies in
4. Union the geometries in each grid cell to a new table.
5. Finally, union all the geometries in the new table.

As long as there is a significant amount of overlap between the 
geometries, the intermediate union step should result in a large 
decrease in the number of points which need to be unioned in the final step

Note that step 4 parallelizes easily, so you might be able to get a 
significant performance increase as well.  This can be parallized on the 
DB server by issuing multiple union queries concurrently.

Note that I haven't actually tried this, so there may be some flaw I'm 
overlooking.  If anyone tries this, I'd like to hear about the results.

Martin

Martin Davis wrote:
> Giovanni,
>
> That's a good point. In both JTS and GEOS CascadedUnion is limited to 
> operating in main memory.  (However, it may be that GEOS running under 
> PostGIS is more memory-efficient, or is able to utilize a larger 
> memory space).
>
> There's not really an easy solution for working with JTS CascadedUnion 
> and very large datasets.  (This also applies to using CascadedUnion in 
> JEQL - although JEQL itself is not memory-bound, it can process data 
> in a streaming fashion.  But the CascadedUnion algorithm is not 
> stream-based).
>
> Some alternatives that come to mind are:
>
> - use 64-bit Java and lots of memory
>
> - implement your own "GriddedUnion" in PostGIS (and maybe involving 
> JTS).  I can provided details if you like.
>
> Martin
>
> G. Allegri wrote:
>> Thanks Martin.
>> I see that jts 1.11 works fine on a subset of my dataset. I have
>> hundreds of thousands od polygons to merge... I have to find the way
>> to read them to pass the collection to JTS without receiving
>> java.lang.OutOfMemoryError (just happened!).
>> What do you suggest to build the collection needed by CascadedUnion
>> without havind to create a huge Collection of geometries?
>>
>> I will take a look at JEQL. Thanks for the hint, I didn't know it...
>>
>> Giovanni
>>
>> 2010/6/22 G. Allegri <giohappy at gmail.com>:
>>  
>>> I'm using Postgis 1.5.1 to compute a Union on a quite large
>>> multipolygons dataset (some hundreds of thousands of features), but
>>> I'm facing the following error:
>>>
>>> NOTICE:  TopologyException: found non-noded intersection between
>>> LINESTRING (1.7318e+006 4.77959e+006, 1.7318e+006 4.77958e+006) and
>>> LINESTRING (1.7318e+006 4.77958e+006, 1.73174e+006 4.77954e+006) at
>>> 1.7318e+006 4.77958e+006
>>>
>>> ERROR:  GEOS union() threw an error!
>>>
>>> I know it's a known issue, but I haven't been able to find a good,
>>> replicable, solution. The input geometries result valid (in simple
>>> feature meaning).
>>> What is causing this? I have overlapping polygons, but I thought the
>>> cascaded union algorithm computed the necessary noding on the
>>> linestrings.
>>> I've supposed it depends on tolerances/precision model, but I don't if
>>> it makes sense and how, eventually, tune it.
>>>
>>> Thanks a lot,
>>> Giovanni
>>>
>>>     
>> _______________________________________________
>> geos-devel mailing list
>> geos-devel at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/geos-devel
>>
>>   
>

-- 
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022



More information about the geos-devel mailing list