<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META content="text/html; charset=us-ascii" http-equiv=Content-Type>

<META name=GENERATOR content="MSHTML 10.00.9200.16866"></HEAD>

<BODY>

<DIV dir=ltr align=left><SPAN class=951254810-24042014><FONT color=#0000ff 

size=2 face=Arial>Quick glance at your plpgsql function.  It won't be able 

to use a spatial index.  To utilize a spatial index, you'll need to change 

your construct</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=951254810-24042014><FONT color=#0000ff 

size=2 face=Arial></FONT></SPAN> </DIV>

<DIV dir=ltr align=left><SPAN class=951254810-24042014>  

ST_Distance(a.geom, b.geom) <= 4891.96981025 * 40<BR></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=951254810-24042014><FONT color=#0000ff 

size=2 face=Arial></FONT></SPAN> </DIV>

<DIV dir=ltr align=left><SPAN class=951254810-24042014><FONT color=#0000ff 

size=2 face=Arial>to:</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=951254810-24042014>  ST_DWithin(a.geom, 

b.geom, 4891.96981025 * 40)</SPAN></DIV>

<DIV><SPAN class=951254810-24042014></SPAN> </DIV>

<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 face=Arial>Also 

make sure you have spatial indexes on your points table.</FONT></SPAN></DIV>

<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 

face=Arial></FONT></SPAN> </DIV>

<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 face=Arial>Hope 

that helps,</FONT></SPAN></DIV>

<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 

face=Arial>Regina</FONT></SPAN></DIV>

<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 face=Arial><A 

href="http://www.postgis.us">http://www.postgis.us</A></FONT></SPAN></DIV>

<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 face=Arial><A 

href="http://postgis.net">http://postgis.net</A></FONT></SPAN></DIV>

<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 

face=Arial></FONT> </DIV>

<DIV dir=ltr align=left><BR></DIV></SPAN><BR>

<DIV lang=en-us class=OutlookMessageHeader dir=ltr align=left>

<HR tabIndex=-1>

<FONT size=2 face=Tahoma><B>From:</B> postgis-users-bounces@lists.osgeo.org 

[mailto:postgis-users-bounces@lists.osgeo.org] <B>On Behalf Of </B>Matthias 

Ludwig<BR><B>Sent:</B> Wednesday, April 23, 2014 10:53 AM<BR><B>To:</B> 

postgis-users@lists.osgeo.org<BR><B>Subject:</B> [postgis-users] Distance based 

clustering (like in Openlayers orLeaflet)<BR></FONT><BR></DIV>

<DIV></DIV>

<DIV style="FONT-SIZE: 12px; FONT-FAMILY: Verdana">

<DIV>

<DIV>Hi,</DIV>

<DIV>I try to implement a simple distance based clustering in PostgreSQL.</DIV>

<DIV>It's working the following way:</DIV>

<DIV>1. loop as long as there are non clustered points</DIV>

<DIV>   2. took one point that is not clustered</DIV>

<DIV>   3. define this as a new cluster</DIV>

<DIV>   4. get all surrounding points in a defined distance and add 

them into the cluster</DIV>

<DIV>5. calculate the centroid for each cluster and count cluster points</DIV>

<DIV> </DIV>

<DIV>The calculation speed depends mostly on the count of points and the scale 

used. In the moment i'am using a python script and it took around 1sec for 1000 

points for TMS zoomlevel 5 (distance = 4891.96981025m * 40 pixel) and around 

3sec at zoomlevel 15 (distance = <SPAN 

style="WHITE-SPACE: normal; TEXT-TRANSFORM: none; WORD-SPACING: 0px; FLOAT: none; COLOR: rgb(42,42,42); FONT: 13px/18px 'Segoe UI', 'Lucida Grande', Verdana, Arial, Helvetica, sans-serif; DISPLAY: inline !important; LETTER-SPACING: normal; TEXT-INDENT: 0px; -webkit-text-stroke-width: 0px">4.78m</SPAN> 

* 40 pixel). Implementing it with pgsql doesn't speeds it up.</DIV>

<DIV>Is it possible to do the work completely with SQL and maybe with just one 

query? I'am not sure if I understood the WITH RECURSIVE concept completely...but 

shouldn't it by possible to replace the iteration(2.-4.) with a WITH RECURSIVE 

statement? Are there any other suggestion to speed it up or a better 

solution?</DIV>

<DIV> </DIV>

<DIV>Greetings</DIV>

<DIV> </DIV>

<DIV> </DIV>

<DIV>CREATE TABLE points<BR>(<BR>  id_point serial PRIMARY KEY,<BR>  

geom geometry(Point,3857),<BR>  cluster_nr integer<BR>);<BR>INSERT INTO 

points (geom)<BR>SELECT ST_SetSrid(ST_MakePoint(12.5 + random()*5, 51 + 

random()*5), 3857)<BR>FROM generate_series(0, 1000);</DIV>

<DIV> </DIV>

<DIV>CREATE TABLE pg_clusters<BR>(<BR>    gid 

integer,<BR>    cluster_feature_ids 

integer[],<BR>    count integer,<BR>    geom 

geometry<BR>);</DIV>

<DIV> </DIV>

<DIV>-- select first point, which isn't clustered yet<BR>SELECT  

id_point<BR>FROM    points<BR>WHERE   cluster_nr IS 

NULL<BR>LIMIT   1;</DIV>

<DIV>-- use first point and get all surrounding points in defined distance 

including the point itself</DIV>

<DIV>-- use them as first cluster<BR>WITH first_cluster AS 

(<BR>        SELECT  a.id_point AS 

cluster_nr,<BR>                

b.id_point,<BR>                

b.geom AS geom<BR>        

FROM    points 

a,<BR>                

points b<BR>        WHERE   

ST_Distance(a.geom, b.geom) <= 4891.96981025 * 

40<BR>                

AND a.id_point = 

185<BR>                

AND a.cluster_nr IS NULL<BR>)<BR>UPDATE  points AS 

a<BR>SET     cluster_nr = 

b.cluster_nr<BR>FROM    first_cluster AS b<BR>WHERE   

a.id_point = b.id_point;</DIV>

<DIV>--> choose next id<BR>--> calculate next cluster<BR>etc.</DIV>

<DIV>-- calculate the centroid of a cluster and count cluster points<BR>DELETE 

FROM pg_clusters;<BR>WITH clusters AS 

(<BR>        SELECT  cluster_nr, 

array_agg(id_point) AS feature_ids, count(id_point), 

ST_Centroid(ST_Collect(geom)) AS 

geom<BR>        FROM    

points<BR>        GROUP BY 

cluster_nr<BR>)<BR>INSERT INTO pg_clusters(gid, cluster_feature_ids, count, 

geom)<BR>SELECT * FROM clusters;</DIV></DIV></DIV></BODY></HTML>