[postgis-users] Determining clusters of points
Sébastien Lorion
sl at thestrangefactory.com
Wed Dec 8 07:38:53 PST 2010
Hello Brent,
The points are not static, they will potentially be in a different position
each time the query is run. Similar to if I was taking a snapshot of all
cars on the roads at a given point in time. It is not what I am doing, but
it is a good enough comparison.
I am also very impressed by what PostGIS can do. Oracle/SDO or SDE offering
is really strong, but so is the price tag that comes along with it ...
Sébastien
On Wed, Dec 8, 2010 at 03:03, <pcreso at pcreso.com> wrote:
> Hi Kevin,
>
> I'm increasingly impressed with the analytical capability of Postgis, and
> with how fast it is increasing. I can do (& have done) pretty substantial
> analytical processing totally within the db, and have tools like QGIS to
> visualise the results & GMT for publication quality cartography. Postgis is
> much more than a data management & query tool.
>
> And the developments are coming from both the Postgis & Postgres camps, as
> you demo here with the RECURSIVE capability.
>
> It would be interesting to throw a recursive wrapper around your function
> to iterate through the 1m points until they are all assigned to a cluster,
> to see how long it takes... but once done, the clusters are there to be
> used, assuming it is a one off assignment :-)
>
> Cheers,
>
> Brent
>
> --- On *Wed, 12/8/10, Kevin Neufeld <kneufeld.ca at gmail.com>* wrote:
>
>
> From: Kevin Neufeld <kneufeld.ca at gmail.com>
>
> Subject: Re: [postgis-users] Determining clusters of points
> To: "PostGIS Users Discussion" <postgis-users at postgis.refractions.net>
> Date: Wednesday, December 8, 2010, 6:59 PM
>
>
> Here's the RECURSIVE script run over a dataset of 20000 random points. I
> overlayed the points with a polygonal layer of the points buffered and
> grouped by the group id for visual purposes. I must say, I like this
> RECURSIVE business (though it's not as performant as I hoped) ... this would
> have been real handy years ago when I was doing network analysis on a
> connected linear dataset.
>
> -- Kevin
>
>
> On 12/7/2010 9:20 PM, Kevin Neufeld wrote:
> > On 12/7/2010 5:15 PM, pcreso at pcreso.com<http://mc/compose?to=pcreso@pcreso.com>wrote:
> >> You might look at the newish (v8.4+ ?) RECURSIVE/WITH capabilities of
> Postgres.
> >>
> >> Given Postgis allows you to create arbitrary buffer zones around your
> points, the RECURSIVE query capability allows you to write an SQL that will
> recursively track points which can interact with each other
> >
> > So, just playing around a bit, I came up with this RECURSIVE query
> statement that will find all points in a particular "cluster". Every time I
> run the UPDATE statement, another cluster of points is identified and
> updated to have the same group id. I have no idea how this would perform
> over a table of a million points ...
> >
> >
> > CREATE TABLE pts (id serial, grp int, geom geometry);
> > CREATE SEQUENCE pts_grp_seq; -- to label cluster groups
> >
> > -- Sample data similiar to Sébastien's original post
> > INSERT INTO pts (geom)
> > SELECT (ST_Dump(
> > 'MULTIPOINT (
> > 6 129,
> > 21 200,
> > 30 66,
> > 33 109,
> > 41 137,
> > 48 191,
> > 76 119,
> > 90 197,
> > 175 212,
> > 178 230,
> > 182 53,
> > 196 209,
> > 259 124,
> > 259 148
> > )')).geom;
> >
> > -- Simple wrapper around an UPDATE statement that identifies a
> > -- point cluster everytime its run
> > CREATE OR REPLACE FUNCTION cluster_pts() RETURNS text AS
> > $body$
> > DECLARE
> > results int;
> > BEGIN
> > LOOP
> > -- Assign all records in the same cluster, the same group id
> > UPDATE pts
> > SET grp = f.grp
> > FROM (
> > WITH RECURSIVE t(id, geom, grp, path, cycle) AS (
> > -- Grab the first record that doesn't have a group id
> > (SELECT id, geom, nextval('pts_grp_seq'), ARRAY[id], false
> > FROM pts
> > WHERE grp IS NULL
> > LIMIT 1
> > )
> > UNION ALL
> > -- Find all the other records that are iteratively within
> some
> > -- distance to the starting point.
> > -- This includes A to B to C paths, but not A to B to A
> cyclic paths
> > SELECT a.id, a.geom, b.grp, path || ARRAY[a.id], a.id =
> ANY(path)
> > FROM pts a, t b
> > WHERE a.id != b.id
> > AND ST_DWithin(a.geom, b.geom, 80)
> > AND NOT cycle
> > )
> > SELECT DISTINCT id, grp FROM t
> > ) AS f
> > WHERE pts.id = f.id;
> >
> > -- repeat until no more updates are performed.
> > GET DIAGNOSTICS results = ROW_COUNT;
> > EXIT WHEN results = 0;
> > END LOOP;
> >
> > RETURN 'done.';
> > END;
> > $body$ LANGUAGE plpgsql;
> >
> > SELECT cluster_pts();
> >
> >
>
> -----Inline Attachment Follows-----
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net<http://mc/compose?to=postgis-users@postgis.refractions.net>
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20101208/5c5a47f6/attachment.html>
More information about the postgis-users
mailing list