[postgis-tickets] [PostGIS] #4850: ST_ClusterKMeans with M seems to do nothing

PostGIS trac at osgeo.org
Sun Feb 14 17:42:44 PST 2021


#4850: ST_ClusterKMeans with M seems to do nothing
---------------------+---------------------------
 Reporter:  robe     |      Owner:  komzpa
     Type:  defect   |     Status:  assigned
 Priority:  medium   |  Milestone:  PostGIS 3.1.2
Component:  postgis  |    Version:  3.1.x
 Keywords:           |
---------------------+---------------------------
 According to the docs, ST_ClusterKMeans in PostGIS 3.1 should support
 weights, i.e. the M coordinate.

 I thought I could use this to handle things like clustering by population
 density, so that if I have a high-rise with, say, 300 people and town houses
 with, say, 1-4 people each, I should see my high-rise area clusters have
 fewer records.  It doesn't seem to make a difference whether I pass in M or
 not.  Passing Z, by contrast, does change the result.

 Here is a revised example I was going to put in the docs.


 {{{
 CREATE TABLE parcels AS
 SELECT lpad(g.ord::text, 3, '0') AS parcel_id, geom,
        ('{residential, commercial}'::text[])[1 + mod(g.ord, 2)] AS type,
        CASE WHEN g.ord < 3 THEN g.ord * 3000 ELSE 1 END AS population
 FROM
     ST_Subdivide(
         ST_Buffer('SRID=3857;LINESTRING(40 100, 98 100, 100 150, 60 90)'::geometry,
                   40, 'endcap=square'),
         12) WITH ORDINALITY AS g(geom, ord);
 }}}
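
 To confirm what the setup above produced before clustering, a small
 inspection query along these lines can help. It is only a sketch and not
 part of the proposed docs example; the column aliases gtype and npoints are
 made up for illustration.

 {{{
 -- list each parcel with its geometry type, vertex count and population
 SELECT parcel_id,
        GeometryType(geom) AS gtype,
        ST_NPoints(geom)   AS npoints,
        population
 FROM parcels
 ORDER BY parcel_id;
 }}}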


 {{{
 -- no weight
 SELECT ST_ClusterKMeans(geom, 5) OVER() AS cid, parcel_id, population
 FROM parcels
 ORDER BY cid, parcel_id;

 -- yields

  cid | parcel_id | population
 -----+-----------+------------
    0 | 002       |       6000
    0 | 003       |          1
    1 | 006       |          1
    1 | 007       |          1
    2 | 001       |       3000
    3 | 004       |          1
    4 | 005       |          1
 (7 rows)

 }}}


 {{{
 -- with weight by population

 SELECT ST_ClusterKMeans(ST_Force3DM(geom, population), 5) OVER() AS cid,
 parcel_id, population
 FROM parcels
 ORDER BY cid, parcel_id;

 -- yields
  cid | parcel_id | population
 -----+-----------+------------
    0 | 002       |       6000
    0 | 003       |          1
    1 | 006       |          1
    1 | 007       |          1
    2 | 001       |       3000
    3 | 004       |          1
    4 | 005       |          1
 (7 rows)
 }}}




 As you can see, the answers are the same.  I would have expected parcels
 002 and 001 to each have their own dedicated cluster because they have such
 a huge population.
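
 One way to narrow this down, offered as a sketch rather than a confirmed
 diagnosis: first check that the M dimension really ends up on the
 geometries being clustered, then repeat the clustering on weighted points
 instead of polygons.  The second query assumes the weight might only be
 honored for POINT inputs, which nothing in this report confirms; using
 ST_Centroid as the representative point is likewise just an illustrative
 choice.

 {{{
 -- ST_ZMFlag returns 1 when a geometry carries M but no Z
 SELECT parcel_id,
        ST_ZMFlag(ST_Force3DM(geom, population)) AS zmflag
 FROM parcels
 ORDER BY parcel_id;

 -- same clustering, but fed weighted centroid points
 -- (assumption: weights may only apply to POINT inputs)
 SELECT ST_ClusterKMeans(ST_Force3DM(ST_Centroid(geom), population), 5)
            OVER() AS cid,
        parcel_id, population
 FROM parcels
 ORDER BY cid, parcel_id;
 }}}

 If the cluster assignments change in the second query but not in the
 polygon version, that would suggest the M value is dropped somewhere when
 non-point inputs are reduced to a representative point.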

-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/4850>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.

