Hello,<div><br></div><div>I am looking for an efficient way to find clusters of points like those shown in the attached image. The only clustering criterion is the distance between points. The dataset can be very large (millions of points), and the points are mostly clustered, with some sparse points in the gaps.</div>
<div><br></div><div>I searched the net and this mailing list and found two promising approaches:</div><div><br></div><div>- use a statistical tool such as R with a density function (<a href="http://www.r-project.org/">http://www.r-project.org</a>)</div>
<div>- use a clustering algorithm such as those described in <a href="http://www.med.nyu.edu/biostatistics/people/Ilana%20Belitskaya-Levy/Courses/MAS/Handouts/clustering.pdf">http://www.med.nyu.edu/biostatistics/people/Ilana%20Belitskaya-Levy/Courses/MAS/Handouts/clustering.pdf</a> (agnes seems the most promising for my purposes)</div>
<div><br></div><div>I would like your advice on which approach is best suited to PostGIS (maybe there is even something ready-made that I can use?). Whatever solution I pick, it must be efficient, and it must be possible to distribute the workload across a cluster of commodity hardware.</div>
<div><br></div><div>I am new to GIS and this mailing list, so please excuse me if I am not using the right vocabulary.</div><div><br></div><div>Thank you very much!</div><div><br></div><div>Sébastien</div>