[postgis-tickets] [PostGIS] #4850: ST_ClusterKMeans with M seems to do nothing
PostGIS
trac at osgeo.org
Sun Feb 14 17:50:05 PST 2021
#4850: ST_ClusterKMeans with M seems to do nothing
----------------------+---------------------------
  Reporter:  robe     |      Owner:  komzpa
      Type:  defect   |     Status:  assigned
  Priority:  medium   |  Milestone:  PostGIS 3.1.2
 Component:  postgis  |    Version:  3.1.x
Resolution:           |   Keywords:
----------------------+---------------------------
Description changed by robe:
Old description:
> According to the docs, ST_ClusterKMeans in PostGIS 3.1 should support
> weights, passed as the M coordinate.
>
> I thought I could use this for things like clustering by population
> density: if I have a high-rise with, say, 300 people and town houses
> with, say, 1-4 people each, the clusters around the high-rise should
> contain fewer records. But it doesn't seem to make any difference
> whether I pass in M or not. Z does have an effect.
>
> Here is a revised example I was going to put in the docs.
>
> {{{
> CREATE TABLE parcels AS
> SELECT lpad(g.ord::text,3,'0') As parcel_id, geom,
> ('{residential, commercial}'::text[])[1 + mod(g.ord,2)] As type,
> CASE WHEN g.ord < 3 THEN g.ord*3000 ELSE 1 END AS population
>
> FROM
> ST_Subdivide(ST_Buffer('SRID=3857;LINESTRING(40 100, 98 100, 100 150,
> 60 90)'::geometry,
> 40, 'endcap=square'),12) WITH ORDINALITY AS g(geom,ord);
>
> }}}
>
> {{{
> -- no weight
> SELECT ST_ClusterKMeans(geom, 5) OVER() AS cid, parcel_id, population
> FROM parcels
> ORDER BY cid, parcel_id;
>
> -- yields
>
>  cid | parcel_id | population
> -----+-----------+------------
>    0 | 002       |       6000
>    0 | 003       |          1
>    1 | 006       |          1
>    1 | 007       |          1
>    2 | 001       |       3000
>    3 | 004       |          1
>    4 | 005       |          1
> (7 rows)
>
> }}}
>
> {{{
> -- with weight by population
>
> SELECT ST_ClusterKMeans(ST_Force3DM(geom, population), 5) OVER() AS cid,
> parcel_id, population
> FROM parcels
> ORDER BY cid, parcel_id;
>
> -- yields
>  cid | parcel_id | population
> -----+-----------+------------
>    0 | 002       |       6000
>    0 | 003       |          1
>    1 | 006       |          1
>    1 | 007       |          1
>    2 | 001       |       3000
>    3 | 004       |          1
>    4 | 005       |          1
> (7 rows)
> }}}
>
>
> See, the answers are the same. I would have expected parcels 002 and 001
> to have their own dedicated clusters because they have such a huge
> population.
New description:
According to the docs, ST_ClusterKMeans in PostGIS 3.1 should support
weights, passed as the M coordinate.
I thought I could use this for things like clustering by population
density: if I have a high-rise with, say, 300 people and town houses
with, say, 1-4 people each, the clusters around the high-rise should
contain fewer records. But it doesn't seem to make any difference whether
I pass in M or not. Z does have an effect.
Here is a revised example I was going to put in the docs.
{{{
CREATE TABLE parcels AS
SELECT lpad(g.ord::text, 3, '0') AS parcel_id, geom,
       ('{residential, commercial}'::text[])[1 + mod(g.ord, 2)] AS type,
       CASE WHEN g.ord < 3 THEN g.ord * 3000 ELSE 1 END AS population
FROM ST_Subdivide(
         ST_Buffer('SRID=3857;LINESTRING(40 100, 98 100, 100 150, 60 90)'::geometry,
                   40, 'endcap=square'),
         12) WITH ORDINALITY AS g(geom, ord);
}}}
{{{
-- no weight
SELECT ST_ClusterKMeans(ST_Centroid(geom), 5) OVER() AS cid, parcel_id,
population
FROM parcels
ORDER BY cid, parcel_id;
-- yields
 cid | parcel_id | population
-----+-----------+------------
   0 | 002       |       6000
   0 | 003       |          1
   1 | 006       |          1
   1 | 007       |          1
   2 | 001       |       3000
   3 | 004       |          1
   4 | 005       |          1
(7 rows)
}}}
{{{
-- with weight by population
SELECT ST_ClusterKMeans(ST_Force3DM(ST_Centroid(geom), population), 5)
OVER() AS cid, parcel_id, population
FROM parcels
ORDER BY cid, parcel_id;
-- yields
 cid | parcel_id | population
-----+-----------+------------
   0 | 002       |       6000
   0 | 003       |          1
   1 | 006       |          1
   1 | 007       |          1
   2 | 001       |       3000
   3 | 004       |          1
   4 | 005       |          1
(7 rows)
}}}
See, the answers are the same. I would have expected parcels 002 and 001
to have their own dedicated clusters because they have such a huge
population.
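A side-by-side query along the following lines might make the comparison
easier to read. This is only a sketch: it assumes the parcels table
created above and the two-argument ST_Force3DM(geom, mvalue) form used in
the weighted query. The m_value column is there just to confirm that the
population really ends up in the M coordinate. Cluster ids are arbitrary
labels, so identical groupings could in principle come back numbered
differently between the two calls.
{{{
-- Sketch: compare unweighted and weighted cluster ids per parcel.
-- Column aliases (cid_plain, cid_weighted, m_value) are illustrative.
SELECT parcel_id, population,
       ST_ClusterKMeans(ST_Centroid(geom), 5) OVER () AS cid_plain,
       ST_ClusterKMeans(ST_Force3DM(ST_Centroid(geom), population), 5)
           OVER () AS cid_weighted,
       -- Check that the weight is actually carried as M on the point.
       ST_M(ST_Force3DM(ST_Centroid(geom), population)) AS m_value
FROM parcels
ORDER BY parcel_id;
}}}
If m_value matches population while cid_plain and cid_weighted group the
parcels identically, the weight is reaching the function but is not
changing the clustering.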
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/4850#comment:1>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.