Running clustering queries

Greg Troxel gdt at lexort.com
Thu Mar 28 03:52:38 PDT 2024


Gary Turner <g at elizr.com> writes:

>> I then issue these two commands:
>> ALTER TABLE shapefiles.atp_filling_stations ADD COLUMN geom
>> geometry(Point, 4326);
>>
>> UPDATE shapefiles.atp_filling_stations SET geom =
>> ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);
>>
>> So the SRS is 4326, right?
>
> Yes, (well, SRID, but who's being picky ;) which is the id of WGS84,
> which is based on degrees.

Well, you have set it.  You have not explained where you got the data
and what datum it was in, but it is highly likely that the data is so
fuzzy that caring about whether it is WGS84 vs NAD83 is not useful.

> My (limited) understanding is that you would need to have reprojected
> your points to a projected SRS with a suitable unit, rather than
> WGS84.

Either that, or accept a distance expressed in degrees, which will be
more or less an ellipse vs a circle.

> As Regina said,
>
>> I'm guessing the issue might be your units of measure.  If you want to do
>> 500 ft you'd be best using a meter or mile based spatial reference
>> system.
>
> Sorry, I'm not familiar with a suitable one to suggest for your country.
>
> Given she suggested miles, there might not be one in feet, so you will
> need to scale your 1000 feet to miles or metres, depending on the base
> unit of the SRS you use

First, I would suggest that you avoid Fred Flintone Units as much as
possible and try to do things in meters.  You can easily convert 500 ft
to meters of course.  Henceforth I will omit all non-metric things.

The basic issue is that you want a coordinate system that has the
property that sqrt(x^2+y^2) is distance.  Many others have the same
sorts of feelings, so there are such things.  But the math is such that
they cannot work over large areas.

In the US, there are a number of usable spatial reference systems.
More or less, this comes down to two choices, trying to simplify while
not being wrong.

  datum:
    WGS84 // ITRFxx (global datum)
    NAD83 (plate fixed datum)
      [with perhaps some complexity for non-CONUS]
    (everything else is either future, ancient, or not relevant)

  projection:
    geographic (lat/lon)
    Universal Transverse Mercator (meters easting and northing)
      [used by DoD and USGS, generally federal}
    State Plane Coordinate Systems
      Transverse Mercator
      Lambert Conformal Conic
      unusual things for Alaska
      maybe something else unusual for some other unusual case

If you were asking this about a state, I would say to use the state
plane system.  But those...  only cover a state or even part of a state.
UTM zones cover a lot, but one needs a dozen or so.

So I would suggest two options:

A)
  Convert 500' to degrees latitude

  Convert 500' to degrees longitude, at Meades Ranch, KS.

  Take the geometric mean.

  Run your query in 4326.

  Accept that you'll find things very roughly within 600' east-west and
  400' north-south and call that good enough

B)

  For each UTM zone in the US:
    create a buffer that's 5 km bigger than the zone
    intersect the data (projected to the zone) with the buffer
    run the query in that zone, with 152.4m distance
  union the results

C)

  Decide to use an XYZ coordinate system, as hinted at in
    https://postgis.net/docs/ST_ClusterKMeans.html

  This is cartesian, except in 3D.  So you'd have to either have
  elevations, or decide to just stick in 0.   However, the distortion of
  the XYZ distances with 0 assumed elevation from the true horizontal
  distance will be tiny at 150m.


If you're going to get picky about exact distances, you will have to
deal with

  gas stations are not points

  your input data is wrong

  do you mean 500' along a geodesic, or something else?

  do you mean at the average elevation of the two stations, or do you
  mean at the ellipsoid, from the points below?

  or do you mean the 3d distance from the points with elevation

and this is all of course comical.



It would be cool if postgis had a way to do B.

I would do C, if I had your problem.



More information about the postgis-users mailing list