Continuing on clustering queries
Max Pyziur
pyz at brama.com
Thu Mar 28 08:31:27 PDT 2024
Thank you again; results are starting to come in.
Apologies for starting a new thread; but the other one was getting messy.
Here are the queries that I am testing:
SELECT city, state, ST_ClusterDBScan( ST_Transform(geom,2163), eps :=
0.3048*5000, minpoints :=1) OVER () AS cluster_id
INTO TEMP aa_zz
FROM shapefiles.atp_filling_stations aa
WHERE aa.amenity = 'fuel'
AND aa.ctry_code = 'US' AND aa.geom IS NOT NULL;
SELECT cluster_id, COUNT(*) FROM aa_zz GROUP BY 1 ORDER BY 2 DESC LIMIT
10;
SELECT state, cluster_id, COUNT(*) FROM aa_zz GROUP BY 2, 1 ORDER BY 3
DESC LIMIT 20;
SELECT city, state, cluster_id, COUNT(*) FROM aa_zz GROUP BY 3, 2, 1
ORDER BY 4 DESC LIMIT 20;
I receive the following results:
SELECT 76023
cluster_id | count
------------+-------
8467 | 422
3888 | 371
5701 | 249
3351 | 177
1111 | 141
8352 | 120
1610 | 107
1812 | 90
1824 | 86
14232 | 81
(10 rows)
state | cluster_id | count
-------+------------+-------
MI | 8467 | 422
NY | 3888 | 371
IL | 5701 | 245
NY | 3351 | 177
CO | 8352 | 110
TX | 1610 | 101
NY | 1824 | 86
DC | 1111 | 82
IN | 14748 | 78
TX | 14297 | 75
MO | 3661 | 74
CA | 11272 | 74
MI | 462 | 71
TX | 14232 | 70
TX | 19326 | 69
TX | 1812 | 68
FL | 13693 | 68
TX | 16824 | 68
CA | 14006 | 66
IL | 2727 | 64
(20 rows)
city | state | cluster_id | count
--------------+-------+------------+-------
DETROIT | MI | 8467 | 147
CHICAGO | IL | 5701 | 104
Austin | TX | 1610 | 101
Detroit | MI | 8467 | 85
BROOKLYN | NY | 3888 | 77
Chicago | IL | 5701 | 75
Fresno | CA | 11272 | 73
Brooklyn | NY | 3888 | 71
WASHINGTON | DC | 1111 | 66
HOUSTON | TX | 14297 | 64
St. Louis | MO | 3661 | 63
HOUSTON | TX | 14232 | 63
DENVER | CO | 8352 | 54
BRONX | NY | 3351 | 53
LOS ANGELES | CA | 14006 | 50
Bakersfield | CA | 1777 | 49
INDIANAPOLIS | IN | 14748 | 44
Bronx | NY | 3351 | 42
Amarillo | TX | 622 | 41
Austin | TX | 1601 | 38
(20 rows)
Obviously, the case-based aspect of the city name can be cancelled with a
function such as UPPER(city).
I also am generous in the radius dimension setting it at 5k feet.
But my next question is how do position the cluster_id on a map w/o
referencing the city, and then tying that out to a geometry that I have
stored for the city?
Thank you again,
Max
More information about the postgis-users
mailing list