[postgis-users] st_intersects and st_disjoint inconsistent results
Paul Ramsey
pramsey at cleverelephant.ca
Thu Feb 10 17:57:30 PST 2022
> On Feb 10, 2022, at 4:55 PM, Emily Gouge <egouge at refractions.net> wrote:
>
> I have a linear dataset on which I was building a query to find edges that are “very close” but don’t touch. While working on this query I found some unexpected results with the st_intersects and st_disjoint functions. As outlined below, the query returned true for both st_instersects and st_disjoint for a few geometries comparisons, but ONLY when a where clause was used to filter the geometries spatially. When unique identifiers were used to filter geometries only st_disjoint returns true.
>
> Versions:
> Except where noted otherwise the results below reference testing on these versions:
> POSTGIS="3.2.0 3.2.0" [EXTENSION] PGSQL="140" GEOS="3.10.1-CAPI-1.16.0" PROJ="7.2.1" LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.5.0 (Internal)"
> PostgreSQL 14.1, compiled by Visual C++ build 1914, 64-bit
>
> When I run this query:
> select a.id, b.id,
> st_intersects(a.geometry, b.geometry),
> st_intersects(b.geometry, a.geometry),
> st_disjoint(a.geometry, b.geometry),
> st_disjoint(b.geometry, a.geometry)
> from test.eflowpath a, test.eflowpath b
> where a.id != b.id
> and st_dwithin(a.geometry, b.geometry, 0.00001)
> and st_disjoint(a.geometry, b.geometry);
>
> PostGIS 3.2: 50 rows were returned, but there are three rows that return true for both st_disjoint and st_intersects. Given the query this in itself is a bit odd as you'd expect reciprocal results for the pairs of the geometry (so at least 4 rows).
> -------
> 889105be-5782-43f1-b50c-5a5825c83875
> e5703673-a995-472e-b4b4-0280143eba0c
> true
> true
> true
> true
> -------
> 0d05aabb-9ff3-4d8f-b6c7-b2b44b0868c8
> 3a09b2af-5932-4e36-9e3e-d8109e5463fa
> true
> true
> true
> true
> -------
> 3a09b2af-5932-4e36-9e3e-d8109e5463fa
> 0d05aabb-9ff3-4d8f-b6c7-b2b44b0868c8
> true
> true
> true
> true
> -------
> ...
>
> Note: In PostGIS 3.1.2 50 rows were returned by only two rows returned true for both disjoint and intersects.
>
> HOWEVER,
> When I compare one pair of those edges specifically using the ids they are only disjoint (which is the result I would expect to see)
>
> select a.id, b.id,
> st_intersects(a.geometry, b.geometry),
> st_intersects(b.geometry, a.geometry),
> st_disjoint(a.geometry, b.geometry),
> st_disjoint(b.geometry, a.geometry)
> from test.eflowpath a, test.eflowpath b
> where b.id = '889105be-5782-43f1-b50c-5a5825c83875' and a.id = 'e5703673-a995-472e-b4b4-0280143eba0c'
>
> Results:
> -------
> e5703673-a995-472e-b4b4-0280143eba0c
> 889105be-5782-43f1-b50c-5a5825c83875
> false
> false
> true
> true
> -------
>
> I thought perhaps this had something to do with the indexes so I removed all geometry indexes from the table and re-ran the initial query. In postgis 3.2 this returned the same results as the indexed query.
> Note: In PostGIS 3.1.2 this also returned three rows with intersects and disjoint true. As noted above with indexes in 3.1.2 only 2 rows were returned where intersects and disjoint were true.
>
> Similar results occurred if st_dwithin from the where statement was increased to 0.01.
>
> Test Data: When I made a table with only the edges in question all queries returned expected results: st_intersects is false and st_disjoint is true. As a result providing a small test case for this issue doesn’t seem possible. But I am happy to provide all the data - there are 27,444 rows.
>
> While this isn’t a problem for me, I find it unexpected that the results from st_intersects and st_disjoint of two geometries would be different based on the where clause in the query (and the data in the table).
>
> Thoughts?
The fact that a single test case returns one result, but results in a larger set returns another says to me that likely the issue in different code lines because of different cache behaviour. When you do the single test case you get a brute force intersects. When you do several in a batch, you get prepared geometry (at least, for the cases that happen after caching).
If you can provide the HEXWKB of the two geometries that showed disagreement (where b.id = '889105be-5782-43f1-b50c-5a5825c83875' and a.id = 'e5703673-a995-472e-b4b4-0280143eba0c') we can set up a test case in GEOS that compares the normal and prepared geometry calls and see if that's the problem.
P.
>
> Thanks,
> Emily
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
More information about the postgis-users
mailing list