Continuing on clustering queries

Max Pyziur pyz at brama.com
Thu Mar 28 14:31:31 PDT 2024


On Thu, 28 Mar 2024, Regina Obe wrote:

[Deleted for the sake of brevity]
>>
>> Obviously, the case-based aspect of the city name can be cancelled with a
>> function such as UPPER(city).
>>
>> I also am generous in the radius dimension setting it at 5k feet.
>>
>> But my next question is how do position the cluster_id on a map w/o
>> referencing the city, and then tying that out to a geometry that I have
>> stored for the city?
>>
>> Thank you again,
>>
>> Max
>
> Depends how you want to show the cluster. If you want to show like a bubble
> to represent the cluster, you can probably using ST_MinimumBoundingCircle
> https://postgis.net/docs/en/ST_MinimumBoundingCircle.html
> First by keeping the point geometry
>
> CREATE TEMP TABLE tmp_aa_zz AS
> SELECT city, state, ST_ClusterDBScan( ST_Transform(geom,2163), eps :=
> 0.3048*5000, minpoints :=1) OVER () AS cluster_id, aa.geom
> FROM
> shapefiles.atp_filling_stations aa WHERE aa.amenity = 'fuel'
> AND aa.ctry_code = 'US' AND aa.geom IS NOT NULL;
>
>
> SELECT cluster_id , ST_MinimumBoundingCircle(ST_Collect(geom)) AS geom,
> COUNT(*)
> FROM tmp_aa_zz
> GROUP BY cluster_id;
>
> Or you might want to show the individual points in which case you can
> dispense with the ST_MinimumBoundingCircle call or have it in addition to
> like:
>
> SELECT cluster_id , ST_Collect(ST_MinimumBoundingCircle(ST_Collect(geom)),
> ST_Collect(geom)) AS geom, COUNT(*)
> FROM tmp_aa_zz
> GROUP BY cluster_id;
>
> Other alternative options you can try besides the minimum bounding circle
> might be ST_ConvexHull  https://postgis.net/docs/en/ST_ConvexHull.html
> , ST_ConcaveHull https://postgis.net/docs/en/ST_ConcaveHull.html, or maybe
> even ST_GeometricMedian https://postgis.net/docs/en/ST_GeometricMedian.html
> (if you want to show only one point to represent all the points or
> ST_Centroid ( I think ST_GeometricMedian would be a more accurate
> representation though, think of centroid as the average so would be highly
> affected by outlier points where as the median would be less so)

I've adapted some of your recommendations. One of the targets of this 
effort is to add a layer to an existing QGIS project: various energy 
infrastructure of the U.S./North America.

So, I've taken a portion of your recommendations, and have built the 
following SQL:
DROP TABLE shapefiles.filling_station_clusters;

CREATE TABLE shapefiles.filling_station_clusters AS
SELECT city, state, ST_ClusterDBScan( ST_Transform(geom,2163), eps := 
0.3048*5000, minpoints :=1) OVER () AS cluster_id, aa.geom
FROM shapefiles.atp_filling_stations aa WHERE aa.amenity = 'fuel'
AND aa.ctry_code = 'US' AND aa.geom IS NOT NULL;

The reason is that it doesn't seem that QGIS can work w TEMP tables.

>From this, I want to create a layer; I use the following SQL:

SELECT cluster_id AS cid1 , 
ST_Collect(ST_MinimumBoundingCircle(ST_Collect(geom)), ST_Collect(geom)) 
AS geom, COUNT(*) as ct FROM shapefiles.filling_station_clusters GROUP BY 
1;

(for some reason, it rejected the SQL if a result was returned with column 
name of cluster_id - possible reserved name?)

This SQL variant does not return errors, but it isn't visible in my 
QGIS project (it's an ongoing project with about twenty layers built with 
PostGIS).

Any advice?

Should I direct this inquiry to the QGIS list?

Thank you,

Max


More information about the postgis-users mailing list