[postgis-users] ST_ClusterDBSCAN: is it deterministic?

Darafei "Komяpa" Praliaskouski me at komzpa.net
Fri Jan 22 07:25:50 PST 2021


Hello,

The clustering functions carry no guarantee of stable output across PostGIS
versions. For many production applications that amounts to being
non-deterministic.

While debugging KMeans I believe I've seen flaky tests under different
compiler flags: some optimizations can change the last bits of a distance
computation, and that can affect clustering, especially on grids.
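
To illustrate the last-bits point, here is a minimal sketch in plain Python
floats (IEEE 754 doubles). It is not PostGIS code: the helper within_eps, the
1-D grid and the eps value are made up for the example. It only shows that
two "equal" grid spacings can fall on opposite sides of an eps threshold,
and that reordering a sum changes the result.

```python
# Illustrative only: plain Python floats, not how PostGIS computes distances.

def within_eps(x1, x2, eps):
    """Return True if two points on a 1-D grid are within eps of each other."""
    return abs(x2 - x1) <= eps

EPS = 0.1  # hypothetical DBSCAN eps, equal to the nominal grid spacing

# Both pairs are "0.1 apart" on paper, but the computed differences
# land on opposite sides of eps in the last bits.
near_pair = within_eps(0.2, 0.3, EPS)  # 0.3 - 0.2 evaluates just below 0.1
far_pair = within_eps(0.3, 0.4, EPS)   # 0.4 - 0.3 evaluates just above 0.1

# Reordering a sum, which some compiler optimizations may do,
# changes the last bits of the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(near_pair, far_pair)
print(left_to_right == right_to_left)
```

If a distance sits this close to the threshold, any change in how it is
evaluated can decide whether two points count as neighbours.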

On Fri, 22 Jan 2021, 18:00, Giuseppe Broccolo <
g.broccolo.7 at gmail.com> wrote:

> Hello,
>
> I have a question about how the function ST_ClusterDBSCAN is implemented
> in PostGIS: basically, if I pass the same window of geometries as input
> to the function, will it return the same clusters?
>
> I tried to find an answer myself by looking at the code: see the link
> here
> <https://github.com/postgis/postgis/blob/962ef92215fdac72b317d7d99201931394174afa/postgis/lwgeom_window.c#L8>
>
> What I understood is that the geometries in the window are accessed
> exactly as they are stored in memory. The geometries are then processed
> sequentially in order to create the clusters.
>
> Is there a way to build the window so that passing exactly the same
> geometries yields exactly the same clusters (e.g. sorting the window
> partition with an ORDER BY or any similar trick)?
>
> Thank you very much,
> Giuseppe.