[postgis-users] hi, kevin, I'll explain my initial motivation for clustering.

sunpeng bluevaley at gmail.com
Wed Feb 10 12:36:54 PST 2010


    hi,Kevin,thanks for your help.

    Now, I'll explain my initial motivation.

    Suppose we have a table with:
    create table houses  (
       NAME              VARCHAR(128)                     not null,
    );
    SELECT AddGeometryColumn('houses', 'location', 4214, 'POINT', 2);

    if we want to do clustering(in data mining environment) the houses on
location, that is, clustering those house near as a cluster(or a group), and
then calculate each cluster's shape, we can not use the following sql:

    select ST_Boundary(*)
    from houses
    group by location

    I would like to extend the postgresql or postgis to support the
following sql:

    select ST_Boundary(*)
    from houses
    clustering by location

    in which the following steps are considered:
    1. cluster the houses into several goups
    2. calculate each goup's shape

    I know the second step is usually related to aggregation functions like
sum,count,and so on which i don't care too much now. All what I want to do
is to add this sql semantics into postgresql or postgis and then add a
clustering algorithm like dbscan in the executor in postgresql(or postgis?).
    How could I do? Any detailed steps?Like I should modify kwlist.h to
support "CLUSTERING" keyword, and the following steps?

    Thanks!
    peng

---------------------------------------------------------------------------
>
> we all know
>
> I'm not sure I follow.  Can you explain what exactly you want to do?
>
> The following query will collect points into clusters (multipoints),
> clustered on a 100x100 grid.
>
> -- generate a sample random point dataset
> CREATE TABLE points AS
> SELECT ST_MakePoint(random()*1000, random()*1000) AS geom
> FROM generate_series(1, 100000);
>
> -- create point clusters
> SELECT st_collect(geom)
> FROM points
> GROUP BY
>   round(st_x(geom)/100)*100,
>   round(st_y(geom)/100)*100;
>
>
> Kevin
>
>
> On 2/4/2010 11:26 PM, sunpeng wrote:
> > I want to write "cluster by" instead of "group by" on geospatial point
> > data,should I write the code at postgresql or postgis ?
> > thanks
> >
> > peng
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20100210/00add9c0/attachment.html>


More information about the postgis-users mailing list