<div dir="ltr"><div><div><div><div><div><div>When running ST_ClusterKMeans with a large number (>100) of clusters, it becomes clear that the cluster sizes are unevenly distributed, even when the input points are evenly distributed. <br><br></div>Consider the following query:<br><span style="font-family:monospace,monospace">WITH <br>points AS (<br>    SELECT (ST_DumpPoints(ST_GeneratePoints(ST_MakeEnvelope(0,0,1000,1000),100000))).geom geom<br>)<br>SELECT ST_ClusterKMeans(geom,1000) OVER () AS cid, geom<br>FROM points;</span><br><br></div>This will generate the following clusters:<br><img src="cid:ii_1609db5fa260c796" alt="Inline image 1" width="477" height="478"><br><br></div>Clearly, clusters along the lower-left to upper-right diagonal are smaller than clusters farther from this diagonal. Could this be an issue with the initial (random?) seeding?<br></div>If people agree this is undesired behaviour (for me it is), I can file a bug report.<br><br></div>Best,<br></div> Tom<br></div>