[postgis-users] Re: Help with finding multi-variate clusters

Stephen Woodbridge woodbri at swoodbridge.com
Wed Jul 13 06:37:24 PDT 2005


Hi Eli,

In general, the algorithm you presented is an iterative operation and 
SQL operations tend to be set operations. This means that it is likely 
that there is not a simple solution to this problem. That said, there is 
no reason you can write a pgsql function that does the iterative solution.

I for one would love to see a function like this, as it is part and 
parcel of what needs to be done to generate thematic maps.

-Steve W.

Eli Dylan Lorimer wrote:
> Hello,
> 
> So I've been trying quasi unsuccessfully to find clusters in my data  
> using spatial sql. I think I"m pretty close but I need a tip or  
> something to get me past this little hurdle I'm hitting. Basically,  
> I've got a simple test table with a 2D Point field in it.
> 
> For testing purposes, I've placed the following data in it:
> 
> select AsText(point) from waypoints_test;
> 
> --------------
> POINT(0 0)
> POINT(4 4)
> POINT(5 5)
> POINT(6 6)
> POINT(9 9)
> POINT(10 10)
> 
> I want to run whatever magical SQL command I'm seeking to get back  the 
> following clusters (based on a Euclidean distance of 2)
> 
> MULTIPOINT(4 4,5 5,5 5,6 6)
> MULTIPOINT(9 9,10 10)
> (2 rows)
> 
> What I've come up with is sort of close but not quite there. Here are  
> my attempts:
> 
> 
> select distinct  AsText(a.point) from waypoints_test a,  waypoints_test 
> b where intersects(buffer(a.point,2), buffer(b.point, 2)) and not 
> equals(a.point,b.point) group by a.point;
>     astext
> --------------
> POINT(10 10)
> POINT(4 4)
> POINT(5 5)
> POINT(6 6)
> POINT(9 9)
> (5 rows)
> 
> select distinct AsText(a.point) from  waypoints_test a,  waypoints_test 
> b where distance(a.point,b.point)<2 and not equals (a.point,b.point);
>     astext
> --------------
> POINT(10 10)
> POINT(4 4)
> POINT(5 5)
> POINT(6 6)
> POINT(9 9)
> (5 rows)
> 
> select AsText(collect(a.point))  from waypoints_test a,  waypoints_test 
> b where distance(a.point,b.point)<2 and not equals (a.point,b.point);
> ---------------------------------------
> MULTIPOINT(4 4,5 5,5 5,6 6,9 9,10 10)
> 
> I cant' seem to figure out how to get the GeomUnion or Collect rows  
> back with each row containing my cluster. The basic logistics behind  
> finding them is really simple, I just want to pass some tolerance  
> (perhaps a mean nearest neighbor value) and then:
> 
> For each point in my table
>     buffer it by the passed tolerance.
>     Find all points in the table that exist in that buffer.
>     Buffer them.
>     Continue until no points exist within the buffer.
>     Take Union (or collection) of all points and make that a cluster.
> 
> I would really appreciate a tip on how to group the rows being  returned 
> based on the buffer they appeared in. Or, alternatively, I  could use a 
> distance function in stead of buffering and find all  points that are 
> within said distance(tolerance) of each other and  union all these 
> points together.
> 
> Anything will do.
> Thank you in advance.
> 
> Cheers,
> ./dylan
> 
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 




More information about the postgis-users mailing list