[postgis-users] ST_ClusterDBSCAN: is it deterministic?

Giuseppe Broccolo g.broccolo.7 at gmail.com
Mon Feb 8 11:14:40 PST 2021


Hi Daniel,

On Sun, 24 Jan 2021 at 18:16, Daniel Baston <dbaston at gmail.com>
wrote:

> Hi Giuseppe,
>
> You can order the inputs by anything you like; OVER(ORDER BY feature_id)
> would work just as well. If you have an example that is not deterministic
> despite ordered inputs, I'd be curious to see it if you can share.
>

Sorry for the delay in replying; I took the opportunity to run some more
tests on this. Basically, I found the origin of the issue on our side: we
use the DBSCAN algorithm as part of a data pipeline that creates PG
instances "on the fly" in order to build the geospatial data. The
non-deterministic behaviour does not appear if, for instance, we avoid
switching the PG instance off and on. Probably in that configuration it
always accesses the data in memory in the same order. We are currently
checking whether adding the ORDER BY clause makes the result deterministic
even when the PG instance is recreated on the fly.
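
For reference, a minimal sketch of the ordered call being discussed. The
table name (features), the columns (feature_id, geom) and the eps and
minpoints values are placeholders for illustration, not our actual schema
or settings:

```sql
-- Hypothetical table: features(feature_id, geom).
-- ORDER BY inside OVER() fixes the order in which rows are fed to the
-- window function, so it no longer depends on the physical row order of
-- a freshly created instance.
SELECT feature_id,
       ST_ClusterDBSCAN(geom, eps := 50, minpoints := 5)
           OVER (ORDER BY feature_id) AS cluster_id
FROM features;
```

This assumes feature_id is unique; if the ordering key has ties, the
relative order of the tied rows is still unspecified.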

Thanks for your help.
Giuseppe.