[postgis-users] ST_ClusterDBSCAN: is it deterministic?

Giuseppe Broccolo g.broccolo.7 at gmail.com
Sun Jan 24 09:33:05 PST 2021


Hi Daniel,

On Fri, 22 Jan 2021 at 18:07, Daniel Baston <dbaston at gmail.com>
wrote:

> It should be deterministic for most real data if the inputs are ordered
> consistently, using the OVER() clause as you suggest. It's possible that
> there may be a contrived situation involving duplicates in the input where
> a result would be different (as GEOS STRtree is using std::sort instead of
> std::stable_sort), but I'm not sure. Also, there are sometimes multiple
> possible clusterings that satisfy the DBSCAN algorithm, so it is expected
> that the results may differ from different implementations or different
> orderings of the same input.
>

Thank you for the answer. I think I'll try defining the window with an
ORDER BY geom clause to check whether that makes the results more
deterministic. If I understood correctly, the ORDER BY adds a further
step that presorts the geometries along a Hilbert curve. Of course, this
would increase the overall duration of the query.
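
To convince myself of the point about multiple valid clusterings, here is a
minimal sketch in Python. It is not the PostGIS/GEOS implementation: the
dbscan and cluster_mates functions and the 1-D sample data are made up for
illustration, and a neighbourhood is assumed to include the point itself.
The point 4 is a border point within eps of two different core points, so
the cluster it joins depends only on the order in which the input is visited.

```python
def dbscan(points, eps, min_points):
    """Toy 1-D DBSCAN; returns one label per point (None = noise).

    Points are visited in input order, which is what makes the
    assignment of border points order-dependent.
    """
    n = len(points)
    # Neighbourhood of each point (indices), including the point itself.
    neighbours = [
        [j for j in range(n) if abs(points[i] - points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_points for nb in neighbours]
    labels = [None] * n
    next_id = 0
    for i in range(n):
        # Only an unlabelled core point can start a new cluster.
        if labels[i] is not None or not is_core[i]:
            continue
        labels[i] = next_id
        stack = [i]
        while stack:
            p = stack.pop()
            for q in neighbours[p]:
                if labels[q] is None:
                    # First cluster to reach a point keeps it.
                    labels[q] = next_id
                    if is_core[q]:
                        stack.append(q)
        next_id += 1
    return labels


def cluster_mates(points, labels, value):
    """Sorted list of all points sharing a cluster with `value`."""
    target = labels[points.index(value)]
    return sorted(p for p, lab in zip(points, labels) if lab == target)


# Hypothetical data: cores are 2 and 6; 4 is a border point of both.
pts = [0, 1, 2, 4, 6, 7, 8]
rev = list(reversed(pts))

forward_mates = cluster_mates(pts, dbscan(pts, eps=2, min_points=4), 4)
reverse_mates = cluster_mates(rev, dbscan(rev, eps=2, min_points=4), 4)

print(forward_mates)  # border point 4 grouped with the low cluster
print(reverse_mates)  # border point 4 grouped with the high cluster
```

Both outputs are valid DBSCAN results for the same set of points, which is
why a fixed input order matters if the cluster ids have to be reproducible.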

Giuseppe.