[postgis-users] ST_ClusterDBSCAN: is it deterministic?

Daniel Baston dbaston at gmail.com
Fri Jan 22 09:07:11 PST 2021


It should be deterministic for most real data if the inputs are ordered
consistently, using the OVER() clause as you suggest. There may be a
contrived situation involving duplicates in the input where the result
would differ (because the GEOS STRtree uses std::sort instead of
std::stable_sort), but I'm not sure. Also, there are sometimes multiple
possible clusterings that satisfy the DBSCAN algorithm, so it is expected
that results may differ between implementations, or between different
orderings of the same input.
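
To illustrate the last point, here is a minimal sketch of a textbook DBSCAN
on 1-D points. It is not the PostGIS or GEOS implementation; the function
names, the sample points, and the eps/min_pts values are all made up for
illustration. The point 20 is a border point within eps of a core point in
each of two clusters, so the cluster it ends up in depends only on the
order in which the input is visited.

```python
def dbscan(points, eps, min_pts):
    """Textbook DBSCAN on 1-D points. Returns one label per point (-1 = noise)."""
    n = len(points)
    # Brute-force neighbour lists; each point counts as its own neighbour.
    neighbors = [
        [j for j in range(n) if abs(points[i] - points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue  # already claimed; border points keep their first cluster
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # only core points expand the cluster
    return labels


def members_with(points, labels, value):
    """Sorted list of the points that share a cluster with `value`."""
    label = labels[points.index(value)]
    return sorted(p for p, l in zip(points, labels) if l == label)


# Hypothetical data: two dense groups joined only by the border point 20.
forward = [0, 4, 7, 10, 20, 30, 33, 36, 40]
backward = list(reversed(forward))

print(members_with(forward, dbscan(forward, 10, 4), 20))
print(members_with(backward, dbscan(backward, 10, 4), 20))
```

With the forward ordering, 20 is grouped with the points 0 to 10; with the
reversed ordering it is grouped with the points 30 to 40. Both results are
valid DBSCAN clusterings of the same input.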

Dan

On Fri, Jan 22, 2021 at 11:47 AM Giuseppe Broccolo <g.broccolo.7 at gmail.com>
wrote:

> Hi Darafei,
>
> Thank you for your answer!
>
> On Fri, Jan 22, 2021 at 16:26, Darafei "Komяpa" Praliaskouski <
> me at komzpa.net> wrote:
>
>> Hello,
>>
>> Cluster functions don't have a cross-PostGIS-version stability guarantee.
>> For many production applications that is equivalent to being
>> non-deterministic.
>>
>> While debugging KMeans I believe I've seen flaky tests under different
>> compiler flags, as some optimizations may mean your distance computation
>> gets different last bits, and that may affect clustering, especially on
>> grids.
>>
>
> I see the problem here. In my company we use the DBSCAN algorithm to
> cluster some geometries and we are experiencing non-deterministic
> behaviour, even when running on the same datasets. Since the geometries
> are included in a specific window partition we define in the query, I was
> curious to know if there was any trick to get reproducible results under
> exactly the same boundary conditions: same underlying architecture, same
> PostgreSQL version, and of course the same input. But I see that's a bit
> too much to ask :)
>
> Thanks again,
> Giuseppe.