[postgis-users] Determining clusters of points

Sébastien Lorion sl at thestrangefactory.com
Tue Dec 7 11:03:49 PST 2010


I like your idea very much, especially since I did not mention that each
point has a range attribute which determines how far it can interact with
other points. So the circle buffer you describe would have a diameter
equal to the range of each point.
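
To make the per-point range idea concrete, here is a minimal sketch in plain Python (not PostGIS) of the buffer-and-union logic quoted below. It assumes each point is an (x, y, range) tuple in a planar coordinate system, and that two buffers intersect when the distance between their centres is at most the sum of their radii, i.e. (range_i + range_j) / 2. The function name cluster_by_range is hypothetical, and the pairwise loop is quadratic, so it only illustrates the grouping rule; with millions of points a spatial index would have to do the neighbour search.

```python
from collections import defaultdict
from math import hypot


def cluster_by_range(points):
    """Group points whose circular buffers overlap, directly or through a chain.

    points: list of (x, y, range_) tuples; each buffer has diameter range_.
    Returns a sorted list of clusters, each a list of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    # Naive O(n^2) pair scan, for illustration only.
    for i, (xi, yi, range_i) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj, range_j = points[j]
            # Radii are half the ranges; buffers touch or overlap when the
            # centre distance does not exceed the sum of the radii.
            if hypot(xi - xj, yi - yj) <= (range_i + range_j) / 2.0:
                union(i, j)

    clusters = defaultdict(list)
    for i in range(len(points)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())
```

For example, cluster_by_range([(0, 0, 2), (1.5, 0, 2), (10, 0, 2), (11, 0, 1), (50, 50, 1)]) groups the first two points, then the next two, and leaves the last one alone.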

What would be fastest for the last step: bounding box, minimum bounding
circle or convex hull? I am guessing the bounding box, but is the difference
significant enough that one should be chosen over another?
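
As a rough illustration of why the bounding box is likely the cheapest of the three, the sketch below computes a bounding box and a convex hull for one cluster in plain Python. It assumes points are (x, y) tuples; the function names are hypothetical, and the hull uses Andrew's monotone chain, which is one common method and not necessarily what PostGIS does internally. The box needs only a min/max pass over the points, while the hull needs a sort first. The minimum bounding circle is left out because it is more involved. None of this measures actual PostGIS timings.

```python
def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y); linear in the number of points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise.

    The sort makes this O(n log n) in the number of points.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]
```

With the cluster [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)], both functions ignore the interior point (1, 1): the box is (0, 0, 2, 2) and the hull is the four corners.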

Sébastien

On Tue, Dec 7, 2010 at 13:42, Emilie Laffray <emilie.laffray at gmail.com> wrote:

>
>
> On 7 December 2010 17:01, Sébastien Lorion <sl at thestrangefactory.com> wrote:
>
>> Hello,
>>
>> I am trying to find an efficient way to find clusters of points as shown
>> in the attached image. The only clustering criterion is the distance between
>> the points. The dataset can be very large (millions of points) and the point
>> distribution is mostly clustered, with some sparse points in the gaps.
>>
>> I searched the net and this mailing list and found two promising
>> approaches:
>>
>> - use a statistical tool such as R with a density function (
>> http://www.r-project.org)
>> - use a clustering algorithm like those explained here
>> http://www.med.nyu.edu/biostatistics/people/Ilana%20Belitskaya-Levy/Courses/MAS/Handouts/clustering.pdf (agnes
>> seems the most promising for my purposes)
>>
>> I would like your advice on which approach would be best suited
>> to PostGIS (maybe there is even something already made that I can use?).
>> Whatever solution I pick, it must be efficient, and it must be possible
>> to distribute the workload across a cluster of commodity hardware.
>>
>> I am new to GIS and this mailing list, so please excuse me if I am not
>> using the right vocabulary.
>>
>> Thank you very much!
>>
>>
> Hello,
>
> Some time ago I worked on something similar, except that I was using
> circles instead of boxes, which should not be a problem. I am just giving the
> logic as I don't have access to the code right now.
> You can start by creating a buffer around each of your points with the
> distance you want.
> The next step is to create a union of all the buffers that intersect.
> You then get the list of points included in each of the resulting polygons
> and create either a bounding box around them or a minimum bounding
> circle (PostGIS 1.5 and above).
>
> Emilie Laffray
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>