[postgis-users] Re: Help with finding multi-variate clusters
Eli Dylan Lorimer
e.d.lorimer at sms.ed.ac.uk
Wed Jul 13 05:37:48 PDT 2005
Hello,
So I've been trying quasi unsuccessfully to find clusters in my data
using spatial sql. I think I"m pretty close but I need a tip or
something to get me past this little hurdle I'm hitting. Basically,
I've got a simple test table with a 2D Point field in it.
For testing purposes, I've placed the following data in it:
select AsText(point) from waypoints_test;
--------------
POINT(0 0)
POINT(4 4)
POINT(5 5)
POINT(6 6)
POINT(9 9)
POINT(10 10)
I want to run whatever magical SQL command I'm seeking to get back
the following clusters (based on a Euclidean distance of 2)
MULTIPOINT(4 4,5 5,5 5,6 6)
MULTIPOINT(9 9,10 10)
(2 rows)
What I've come up with is sort of close but not quite there. Here are
my attempts:
select distinct AsText(a.point) from waypoints_test a,
waypoints_test b where intersects(buffer(a.point,2), buffer(b.point,
2)) and not equals(a.point,b.point) group by a.point;
astext
--------------
POINT(10 10)
POINT(4 4)
POINT(5 5)
POINT(6 6)
POINT(9 9)
(5 rows)
select distinct AsText(a.point) from waypoints_test a,
waypoints_test b where distance(a.point,b.point)<2 and not equals
(a.point,b.point);
astext
--------------
POINT(10 10)
POINT(4 4)
POINT(5 5)
POINT(6 6)
POINT(9 9)
(5 rows)
select AsText(collect(a.point)) from waypoints_test a,
waypoints_test b where distance(a.point,b.point)<2 and not equals
(a.point,b.point);
---------------------------------------
MULTIPOINT(4 4,5 5,5 5,6 6,9 9,10 10)
I cant' seem to figure out how to get the GeomUnion or Collect rows
back with each row containing my cluster. The basic logistics behind
finding them is really simple, I just want to pass some tolerance
(perhaps a mean nearest neighbor value) and then:
For each point in my table
buffer it by the passed tolerance.
Find all points in the table that exist in that buffer.
Buffer them.
Continue until no points exist within the buffer.
Take Union (or collection) of all points and make that a cluster.
I would really appreciate a tip on how to group the rows being
returned based on the buffer they appeared in. Or, alternatively, I
could use a distance function in stead of buffering and find all
points that are within said distance(tolerance) of each other and
union all these points together.
Anything will do.
Thank you in advance.
Cheers,
./dylan
More information about the postgis-users
mailing list