<div dir="ltr"><div><div><div>Hey,<br>I did the same thing using PLR and a minimal spanning tree.<br></div>It was fast, easy, and clustering method could be changed at will <br></div>(see all classical unsupervised learning).<br>
<br></div>Cheers,<br>Rémi-C<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-04-24 12:53 GMT+02:00 Paragon Corporation <span dir="ltr"><<a href="mailto:lr@pcorp.us" target="_blank">lr@pcorp.us</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><u></u>
<div>
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial">Quick glance at your plpgsql function. It won't be able
to use a spatial index. To utilize a spatial index, you'll need to change
your construct</font></span></div><div class="">
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial"></font></span> </div>
<div dir="ltr" align="left"><span>
ST_Distance(a.geom, b.geom) <= 4891.96981025 * 40<br></span></div>
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial"></font></span> </div>
</div><div dir="ltr" align="left"><span><font color="#0000ff" face="Arial">to:</font></span></div>
<div dir="ltr" align="left"><span> ST_DWithin(a.geom,
b.geom, 4891.96981025 * 40)</span></div>
<div><span></span> </div>
<div><span><font color="#0000ff" face="Arial">Also
make sure you have spatial indexes on your points table.</font></span></div>
<div><span><font color="#0000ff" face="Arial"></font></span> </div>
<div><span><font color="#0000ff" face="Arial">Hope
that helps,</font></span></div>
<div><span><font color="#0000ff" face="Arial">Regina</font></span></div>
<div><span><font color="#0000ff" face="Arial"><a href="http://www.postgis.us" target="_blank">http://www.postgis.us</a></font></span></div>
<div><span><font color="#0000ff" face="Arial"><a href="http://postgis.net" target="_blank">http://postgis.net</a></font></span></div>
<div><span><font color="#0000ff" face="Arial"></font> </span></div>
<div dir="ltr" align="left"><br></div><br>
<div dir="ltr" align="left" lang="en-us">
<hr>
<font face="Tahoma"><b>From:</b> <a href="mailto:postgis-users-bounces@lists.osgeo.org" target="_blank">postgis-users-bounces@lists.osgeo.org</a>
[mailto:<a href="mailto:postgis-users-bounces@lists.osgeo.org" target="_blank">postgis-users-bounces@lists.osgeo.org</a>] <b>On Behalf Of </b>Matthias
Ludwig<br><b>Sent:</b> Wednesday, April 23, 2014 10:53 AM<br><b>To:</b>
<a href="mailto:postgis-users@lists.osgeo.org" target="_blank">postgis-users@lists.osgeo.org</a><br><b>Subject:</b> [postgis-users] Distance based
clustering (like in Openlayers orLeaflet)<br></font><br></div><div><div class="h5">
<div></div>
<div style="FONT-SIZE:12px;FONT-FAMILY:Verdana">
<div>
<div>Hi,</div>
<div>I try to implement a simple distance based clustering in PostgreSQL.</div>
<div>It's working the following way:</div>
<div>1. loop as long as there are non clustered points</div>
<div> 2. took one point that is not clustered</div>
<div> 3. define this as a new cluster</div>
<div> 4. get all surrounding points in a defined distance and add
them into the cluster</div>
<div>5. calculate the centroid for each cluster and count cluster points</div>
<div> </div>
<div>The calculation speed depends mostly on the count of points and the scale
used. In the moment i'am using a python script and it took around 1sec for 1000
points for TMS zoomlevel 5 (distance = 4891.96981025m * 40 pixel) and around
3sec at zoomlevel 15 (distance = <span style="WHITE-SPACE:normal;TEXT-TRANSFORM:none;WORD-SPACING:0px;FLOAT:none;COLOR:rgb(42,42,42);FONT:13px/18px 'Segoe UI','Lucida Grande',Verdana,Arial,Helvetica,sans-serif;DISPLAY:inline!important;LETTER-SPACING:normal;TEXT-INDENT:0px">4.78m</span>
* 40 pixel). Implementing it with pgsql doesn't speeds it up.</div>
<div>Is it possible to do the work completely with SQL and maybe with just one
query? I'am not sure if I understood the WITH RECURSIVE concept completely...but
shouldn't it by possible to replace the iteration(2.-4.) with a WITH RECURSIVE
statement? Are there any other suggestion to speed it up or a better
solution?</div>
<div> </div>
<div>Greetings</div>
<div> </div>
<div> </div>
<div>CREATE TABLE points<br>(<br> id_point serial PRIMARY KEY,<br>
geom geometry(Point,3857),<br> cluster_nr integer<br>);<br>INSERT INTO
points (geom)<br>SELECT ST_SetSrid(ST_MakePoint(12.5 + random()*5, 51 +
random()*5), 3857)<br>FROM generate_series(0, 1000);</div>
<div> </div>
<div>CREATE TABLE pg_clusters<br>(<br> gid
integer,<br> cluster_feature_ids
integer[],<br> count integer,<br> geom
geometry<br>);</div>
<div> </div>
<div>-- select first point, which isn't clustered yet<br>SELECT
id_point<br>FROM points<br>WHERE cluster_nr IS
NULL<br>LIMIT 1;</div>
<div>-- use first point and get all surrounding points in defined distance
including the point itself</div>
<div>-- use them as first cluster<br>WITH first_cluster AS
(<br> SELECT a.id_point AS
cluster_nr,<br>
b.id_point,<br>
b.geom AS geom<br>
FROM points
a,<br>
points b<br> WHERE
ST_Distance(a.geom, b.geom) <= 4891.96981025 *
40<br>
AND a.id_point =
185<br>
AND a.cluster_nr IS NULL<br>)<br>UPDATE points AS
a<br>SET cluster_nr =
b.cluster_nr<br>FROM first_cluster AS b<br>WHERE
a.id_point = b.id_point;</div>
<div>--> choose next id<br>--> calculate next cluster<br>etc.</div>
<div>-- calculate the centroid of a cluster and count cluster points<br>DELETE
FROM pg_clusters;<br>WITH clusters AS
(<br> SELECT cluster_nr,
array_agg(id_point) AS feature_ids, count(id_point),
ST_Centroid(ST_Collect(geom)) AS
geom<br> FROM
points<br> GROUP BY
cluster_nr<br>)<br>INSERT INTO pg_clusters(gid, cluster_feature_ids, count,
geom)<br>SELECT * FROM clusters;</div></div></div></div></div></div>
<br>_______________________________________________<br>
postgis-users mailing list<br>
<a href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users" target="_blank">http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users</a><br></blockquote></div><br></div>