[postgis-users] ST_ClusterDBSCAN for "geography" data type?

Marco Boeringa marco at boeringa.demon.nl
Tue Dec 22 03:15:15 PST 2020


As I understand it now, Darafei's suggestion essentially puts the 
geometries on a globe by transforming to EPSG:4978, and then doing 
straight line distance calculations between geometries. Since the 
cluster distance, depending on the data of course, is usually very small 
compared to the earth, the error of not accounting for the 
sphere/spheroid should be low. E.g., if you wanted to cluster buildings 
like in the example page of the PostGIS documentation, than 10m max eps 
already creates clusters of > 1k records in some areas. In addition, the 
whole principal of clustering does not care much about exact geodesic 
distances, or not at all, just that with Darafei's suggestion, the data 
can more easily span large surface areas without clustering results 
negatively being influenced by a projection's distance distortions 
(Correct me if I am wrong, Darafei).

An example query for creating a table with clusters, discarding IDs that 
do not form part of a cluster:

ST_ClusterDBSCAN(ST_Force3D(ST_Transform(<GEOMETRY_COLUMN>,4978)), eps 
:= 10 , minpoints := 5) over () AS cluster_id FROM <MY_TABLE>) sq WHERE 
cluster_id IS NOT NULL;


Op 21-12-2020 om 15:06 schreef Marco Boeringa:
> Hi Darafei,
> Thanks for the suggestion.
> I must admit I have a bit of difficulty visualizing what this exactly 
> does, and had to look up that EPSG:4978 projection you mention, but do 
> I understand it right that the solution you suggest should work for 
> global data sets, and isn't influenced or limited by the usual 
> projection distortions?
> I also understand this is not something I could easily run myself, 
> because this is experimental changes in underlying PostGIS code? Or is 
> there a set of PostGIS commands I need to run in specific order to get 
> such result, e.g. is it really as simple as maybe:
> "ST_ClusterDBSCAN(ST_Force3D(ST_Transform(<GEOMETRY_COLUMN>,4978)))" ??
> Marco
> Op 21-12-2020 om 12:24 schreef Darafei "Komяpa" Praliaskouski:
>> Hi,
>> My last exercise in KMeans showed that it's enough to add support for 
>> 3D distances instead of 2D distances in the code and transform your 
>> geometry into EPSG:4978 (after Force3D). That will cluster in a 3D 
>> XYZ coordinate system using straight lines in 3D, which is usually 
>> good enough.
>> On Mon, Dec 21, 2020 at 2:15 PM Giuseppe Broccolo 
>> <g.broccolo.7 at gmail.com <mailto:g.broccolo.7 at gmail.com>> wrote:
>>     Hi Marco,
>>     Il giorno dom 20 dic 2020 alle ore 10:03 Marco Boeringa
>>     <marco at boeringa.demon.nl <mailto:marco at boeringa.demon.nl>> ha
>>     scritto:
>>         Hi,
>>         Reading through the PostGIS documentation, I noticed the
>>         "ST_ClusterDBSCAN" function takes a distance as one of the
>>         inputs. Now
>>         the docs suggest the current algorithm only takes in
>>         "geometry" type
>>         data. Is that true? Based on the distance input variable, it
>>         would seem
>>         logical to have a "geography" variant as well, even if that is a
>>         considerably slower variant considering the more complex
>>         distance
>>         calculation.
>>     Yes, ST_ClusterDBSCAN takes just the geometry type in input. This
>>     is because most of the algorithm
>>     is implemented using utilities already existing in GEOS for
>>     planar geometries.
>>     Adding the support for the geography type would mean to
>>     re-implement the algorithm specifically for
>>     spherical objects, as you mentioned.
>>     Regards,
>>     Giuseppe.
>>     _______________________________________________
>>     postgis-users mailing list
>>     postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
>>     https://lists.osgeo.org/mailman/listinfo/postgis-users
>>     <https://lists.osgeo.org/mailman/listinfo/postgis-users>
>> -- 
>> Darafei "Komяpa" Praliaskouski
>> OSM BY Team - http://openstreetmap.by/ <http://openstreetmap.by/>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-users
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20201222/33726da0/attachment.html>

More information about the postgis-users mailing list