<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="text/html; charset=us-ascii" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 10.00.9200.16866"></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=951254810-24042014><FONT color=#0000ff
size=2 face=Arial>Quick glance at your plpgsql function. It won't be able
to use a spatial index. To utilize a spatial index, you'll need to change
your construct</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=951254810-24042014><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=951254810-24042014>
ST_Distance(a.geom, b.geom) <= 4891.96981025 * 40<BR></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=951254810-24042014><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=951254810-24042014><FONT color=#0000ff
size=2 face=Arial>to:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=951254810-24042014> ST_DWithin(a.geom,
b.geom, 4891.96981025 * 40)</SPAN></DIV>
<DIV><SPAN class=951254810-24042014></SPAN> </DIV>
<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 face=Arial>Also
make sure you have spatial indexes on your points table.</FONT></SPAN></DIV>
<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2
face=Arial></FONT></SPAN> </DIV>
<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 face=Arial>Hope
that helps,</FONT></SPAN></DIV>
<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2
face=Arial>Regina</FONT></SPAN></DIV>
<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 face=Arial><A
href="http://www.postgis.us">http://www.postgis.us</A></FONT></SPAN></DIV>
<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2 face=Arial><A
href="http://postgis.net">http://postgis.net</A></FONT></SPAN></DIV>
<DIV><SPAN class=951254810-24042014><FONT color=#0000ff size=2
face=Arial></FONT> </DIV>
<DIV dir=ltr align=left><BR></DIV></SPAN><BR>
<DIV lang=en-us class=OutlookMessageHeader dir=ltr align=left>
<HR tabIndex=-1>
<FONT size=2 face=Tahoma><B>From:</B> postgis-users-bounces@lists.osgeo.org
[mailto:postgis-users-bounces@lists.osgeo.org] <B>On Behalf Of </B>Matthias
Ludwig<BR><B>Sent:</B> Wednesday, April 23, 2014 10:53 AM<BR><B>To:</B>
postgis-users@lists.osgeo.org<BR><B>Subject:</B> [postgis-users] Distance based
clustering (like in Openlayers orLeaflet)<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV style="FONT-SIZE: 12px; FONT-FAMILY: Verdana">
<DIV>
<DIV>Hi,</DIV>
<DIV>I try to implement a simple distance based clustering in PostgreSQL.</DIV>
<DIV>It's working the following way:</DIV>
<DIV>1. loop as long as there are non clustered points</DIV>
<DIV> 2. took one point that is not clustered</DIV>
<DIV> 3. define this as a new cluster</DIV>
<DIV> 4. get all surrounding points in a defined distance and add
them into the cluster</DIV>
<DIV>5. calculate the centroid for each cluster and count cluster points</DIV>
<DIV> </DIV>
<DIV>The calculation speed depends mostly on the count of points and the scale
used. In the moment i'am using a python script and it took around 1sec for 1000
points for TMS zoomlevel 5 (distance = 4891.96981025m * 40 pixel) and around
3sec at zoomlevel 15 (distance = <SPAN
style="WHITE-SPACE: normal; TEXT-TRANSFORM: none; WORD-SPACING: 0px; FLOAT: none; COLOR: rgb(42,42,42); FONT: 13px/18px 'Segoe UI', 'Lucida Grande', Verdana, Arial, Helvetica, sans-serif; DISPLAY: inline !important; LETTER-SPACING: normal; TEXT-INDENT: 0px; -webkit-text-stroke-width: 0px">4.78m</SPAN>
* 40 pixel). Implementing it with pgsql doesn't speeds it up.</DIV>
<DIV>Is it possible to do the work completely with SQL and maybe with just one
query? I'am not sure if I understood the WITH RECURSIVE concept completely...but
shouldn't it by possible to replace the iteration(2.-4.) with a WITH RECURSIVE
statement? Are there any other suggestion to speed it up or a better
solution?</DIV>
<DIV> </DIV>
<DIV>Greetings</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>CREATE TABLE points<BR>(<BR> id_point serial PRIMARY KEY,<BR>
geom geometry(Point,3857),<BR> cluster_nr integer<BR>);<BR>INSERT INTO
points (geom)<BR>SELECT ST_SetSrid(ST_MakePoint(12.5 + random()*5, 51 +
random()*5), 3857)<BR>FROM generate_series(0, 1000);</DIV>
<DIV> </DIV>
<DIV>CREATE TABLE pg_clusters<BR>(<BR> gid
integer,<BR> cluster_feature_ids
integer[],<BR> count integer,<BR> geom
geometry<BR>);</DIV>
<DIV> </DIV>
<DIV>-- select first point, which isn't clustered yet<BR>SELECT
id_point<BR>FROM points<BR>WHERE cluster_nr IS
NULL<BR>LIMIT 1;</DIV>
<DIV>-- use first point and get all surrounding points in defined distance
including the point itself</DIV>
<DIV>-- use them as first cluster<BR>WITH first_cluster AS
(<BR> SELECT a.id_point AS
cluster_nr,<BR>
b.id_point,<BR>
b.geom AS geom<BR>
FROM points
a,<BR>
points b<BR> WHERE
ST_Distance(a.geom, b.geom) <= 4891.96981025 *
40<BR>
AND a.id_point =
185<BR>
AND a.cluster_nr IS NULL<BR>)<BR>UPDATE points AS
a<BR>SET cluster_nr =
b.cluster_nr<BR>FROM first_cluster AS b<BR>WHERE
a.id_point = b.id_point;</DIV>
<DIV>--> choose next id<BR>--> calculate next cluster<BR>etc.</DIV>
<DIV>-- calculate the centroid of a cluster and count cluster points<BR>DELETE
FROM pg_clusters;<BR>WITH clusters AS
(<BR> SELECT cluster_nr,
array_agg(id_point) AS feature_ids, count(id_point),
ST_Centroid(ST_Collect(geom)) AS
geom<BR> FROM
points<BR> GROUP BY
cluster_nr<BR>)<BR>INSERT INTO pg_clusters(gid, cluster_feature_ids, count,
geom)<BR>SELECT * FROM clusters;</DIV></DIV></DIV></BODY></HTML>