[postgis-devel] Geometry clustering functions

Rémi Cura remi.cura at gmail.com
Thu Jan 15 05:37:26 PST 2015


Hey,
I also frequently use function 1; it would be great to have it as a native,
efficient C function.
Function 2 is especially interesting when coupled with pg_pointcloud.

I do it in SQL (with a temp table or a CTE, depending on size):
cluster = ST_Dump(ST_Union),
then for each geometry, test which cluster it falls in.
It can be sped up by putting the clusters in a temp table and indexing it.

As long as "interconnected" means spatially overlapping, this is fine,
but sadly it doesn't work when using multipolygons, for instance.
My workaround was to solve the interconnection problem with plpython and
networkx (in graph terms, this is called the connected components problem).
It can work with any relation (for instance, if you want your clusters to be
both spatial and tied to an attribute).
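
To make the "any relation" point concrete, here is a minimal sketch of
connected components over an arbitrary pairwise relation. It is not my
plpython code: it uses a small union-find in plain Python in place of
networkx, and the rows, the bounding-box overlap test, and all function names
are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def connected_components(items, related):
    """Group items into clusters: two items share a cluster when a chain
    of pairs with related(a, b) == True links them."""
    parent = list(range(len(items)))

    def find(i):
        # Walk up to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Naive O(n^2) pair scan; a spatial index would prune this in practice.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if related(items[i], items[j]):
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    groups = defaultdict(list)
    for i, item in enumerate(items):
        groups[find(i)].append(item)
    return list(groups.values())


# Hypothetical rows: (id, attribute, bounding box as xmin, ymin, xmax, ymax).
rows = [
    ("a", "road", (0, 0, 2, 2)),
    ("b", "road", (1, 1, 3, 3)),
    ("c", "river", (2, 2, 4, 4)),
    ("d", "road", (10, 10, 11, 11)),
]


def boxes_overlap(p, q):
    # Stand-in for a real geometry intersection test.
    return p[0] <= q[2] and q[0] <= p[2] and p[1] <= q[3] and q[1] <= p[3]


def same_kind_and_overlapping(r, s):
    # Relation combining a spatial condition with an attribute condition.
    return r[1] == s[1] and boxes_overlap(r[2], s[2])


clusters = connected_components(rows, same_kind_and_overlapping)
```

Swapping in a different `related` function changes the clustering without
touching the component logic, which is the whole appeal of treating this as a
graph problem.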

Function 2 is function 1 with either a buffer of X, or with ST_DWithin
instead of an intersection test to define interconnection, followed by the
same networkx solution.
It can be greatly accelerated (at some cost in precision) if you are willing
to quantize your coordinates (with ST_SnapToGrid, for instance).
First snap all points to the grid, then run a SELECT DISTINCT on the quantized
coordinates. This can greatly reduce the number of points.
Then it is fast to compute the clusters, with either the buffer or networkx.
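
A rough sketch of the quantize-then-cluster idea, in plain Python rather than
SQL. The rounding here only approximates what a snap-to-grid would do, the
clustering is a simple flood fill over the "within distance" relation instead
of networkx, and the function names, sample points, cell size, and distance
are all made up for illustration.

```python
import math


def snap_to_grid(points, cell):
    """Quantize coordinates to multiples of `cell` and drop duplicates,
    similar in spirit to a snap-to-grid followed by SELECT DISTINCT."""
    return sorted(
        {(round(x / cell) * cell, round(y / cell) * cell) for x, y in points}
    )


def cluster_within(points, dist):
    """Group points so that each point is within `dist` of some other
    point of its group (when the group has more than one point)."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        members = []
        while stack:
            i = stack.pop()
            members.append(points[i])
            # Naive neighbour scan; an index would replace this in practice.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= dist]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
        clusters.append(sorted(members))
    return clusters


# Hypothetical input: six raw points collapsing to three grid points.
raw = [(0.1, 0.1), (0.2, -0.1), (0.9, 0.1), (1.1, 0.0),
       (10.0, 10.0), (10.2, 9.9)]
snapped = snap_to_grid(raw, cell=1.0)
clusters = cluster_within(snapped, dist=1.5)
```

The trade-off is that points closer together than the cell size become
indistinguishable, so the cell should be small compared with the distance X.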

Cheers,
Rémi-C


2015-01-15 13:14 GMT+01:00 Daniel Baston <dbaston at gmail.com>:

> Hello,
>
> I'm working on a couple of C-language functions to solve some clustering
> problems I frequently face in PostGIS.  I'm wondering if there may be any
> level of interest in including such functions in the project?  I would be
> happy to flesh these out to include the error-handling and unit-testing
> that would be required.
>
> Function #1 takes a GeometryCollection and returns the same geometries,
> re-aggregated into groups of interconnected geometries (solving a question
> I posted about at http://gis.stackexchange.com/q/94203 )
>
> (Internally this builds an STRtree using GEOS and uses a union-find
> structure to identify connected components. As an aside, I am wondering
> if there is some way to take advantage of a database spatial index that may
> already exist.)
>
> Function #2 (not written) would solve a related problem - we want to take
> a set of points and return a set of MultiPoints, where each point in a
> MultiPoint is within distance X of some other point in the MultiPoint.
>
> Thanks,
> Dan