[postgis-users] problems with _st_distance() function on queries returning more than one row

Rémi Cura remi.cura at gmail.com
Fri Jan 9 01:48:38 PST 2015


Hey,
@Paul
Paul I'm afraid there is a true postgis problem here.
ST_Intersects for geography amount to
`( geom1 && geom2 AND _ST_Distance(geom1, geom2 , 0.0, false) < 0.00001 )  `
now manually checking, && gives correct result, *but  `_ST_Distance`
returns wrong results in this case*,

Yet `_st_distanceuncached(geom, multip, FALSE)` *returns correct answer
(0).*
I checked also by putting "0." before each coordinate to rule out a
too-big-value issue.
I also ruled out a mixed srid problem.
At the end I put the complete test script, I must stop here.

@Bob
So Bob for the moment, you have some work around :
 * use geometry type if you can afford
 * replace all "ST_Intersects(geom1,geom2)" by exactly this :
   ( (geom1&& geom2) AND ( _st_distanceuncached(geom1, geom2, FALSE) <
0.00001) )
 * replace all "ST_Intersects(geom1,geom2)" by exactly this :
    ST_IsEmpty(ST_Intersection(geom1,geom2)::geometry)=FALSE

I don't know which one of option 2 or 3 will be slowest. It depends on your
data.
Note that it will be slow any way if you have many geometry that have
intersecting bounding boxes.

Cheers,
Rémi-C


----------------------
CREATE SCHEMA errror_distance_geography;
SET search_path to errror_distance_geography, public ;
DROP TABLE IF EXISTS  multipolygon ;
create table multipolygon(
    mpid         SERIAL                      not null,
    multip       Geography(multipolygon)    not null
);
insert into multipolygon
values (1, ST_GeographyFromText('multipolygon(((0.20 0.10, 0.21 0.10, 0.22
0.10, 0.23 0.10, 0.24 0.10, 0.25 0.10, 0.26 0.10, 0.27 0.10, 0.28 0.10,
0.29 0.10, 0.30 0.10, 0.30 0.20, 0.29 0.20, 0.28 0.20, 0.27 0.20, 0.26
0.20, 0.25 0.20, 0.24 0.20, 0.23 0.20, 0.22 0.20, 0.21 0.20, 0.20 0.20,
0.20 0.10)))'));
insert into multipolygon
values (2, ST_GeographyFromText('multipolygon(((0.25 0.15, 0.26 0.15, 0.27
0.15, 0.28 0.15, 0.29 0.15, 0.30 0.15, 0.31 0.15, 0.32 0.15, 0.33 0.15,
0.34 0.15, 0.35 0.15, 0.35 0.25, 0.34 0.25, 0.33 0.25, 0.32 0.25, 0.31
0.25, 0.30 0.25, 0.29 0.25, 0.28 0.25, 0.27 0.25, 0.26 0.25, 0.25 0.25,
0.25 0.15)))'));
SELECT *
FROM multipolygon   ;
WITH the_geom_to_intersect AS (
    SELECT ST_GeographyFromText('MULTIPOLYGON(((0.27 0.17, 0.28 0.17, 0.28
0.19, 0.27 0.19, 0.27 0.17)))') AS geom
)
SELECT
    ( geom && multip AND _ST_Distance(geom, multip , 0.0, false) < 0.00001
) AS intersects_for_geography
    , _ST_Distance(geom,multip, 0.0, false) AS distance_for_geography
    ,_st_distanceuncached(geom, multip, FALSE) AS
uncached_distance_for_geography
    ,( geom && multip AND _st_distanceuncached(geom, multip, FALSE) <
0.00001 )  intersects_for_geography_uncached
    ,ST_IsEmpty(ST_Intersection(geom,multip)::geometry)
is_intersection_empty_trick
from the_geom_to_intersect, multipolygon m1
---------------------


2015-01-09 0:44 GMT+01:00 Paul Ramsey <pramsey at cleverelephant.ca>:

> Move up to the latest patch revision of postgis, the 2.1.0 release
> definitely had a few subtle issues with geography distance of the sort
> you are describing.
>
> P.
>
> On Thu, Jan 8, 2015 at 2:09 PM, Hartranft, Robert M.
> (GSFC-423.0)[RAYTHEON COMPANY] <robert.m.hartranft at nasa.gov> wrote:
> > Hi All,
> >
> > I am hoping someone here has experience with the
> > issue I am having…
> >
> >
> > I am using:
> > -rw-r--r--  1 postgres postgres 10241357 Oct  8 22:04 gdal-1.10.1.tar.gz
> > -rw-r--r--  1 postgres postgres  1813726 Oct  8 22:04 geos-3.4.2.tar.bz2
> > -rw-r--r--  1 postgres postgres   340953 Oct  8 22:04 json-c-0.9.tar.gz
> > -rw-r--r--  1 postgres postgres  5161069 Oct  8 22:04
> libxml2-2.9.0.tar.gz
> > -rw-r--r--  1 postgres postgres 16930885 Oct  8 22:04 perl-5.16.3.tar.gz
> > -rw-r--r--  1 postgres postgres   195442 Oct  8 22:04 pgtap-0.94.0.tar.gz
> > -rw-r--r--  1 postgres postgres  6518378 Oct  8 22:04
> postgis-2.1.0.tar.gz
> > -rw-r--r--  1 postgres postgres 21865589 Oct  8 22:04
> > postgresql-9.3.4.tar.gz
> > -rw-r--r--  1 postgres postgres   785279 Oct  8 22:04 proj-4.8.0.tar.gz
> >
> > I have a table that contains a geography column.
> > The column contains “multipolygon” values.
> >
> > When I do a query that uses the ST_Intersects() function it
> > seems to return the wrong results if the underlying plan uses
> > the _st_distance() function and it returns the correct results if
> > the plan uses the st_intersects() function.
> >
> > I am having the issue with a multipolygon column but it might
> > happen of other types as well.
> >
> >
> >
> > Here is an small example of the issue:
> >
> > =# create table multipolygon
> > (
> >     mpid         bigint                      not null,
> >     multip       Geography(multipolygon, 4326)    not null
> > );
> > CREATE TABLE
> >
> >
> > =# alter table multipolygon add constraint PK_POLYGON primary key (mpid);
> > ALTER TABLE
> >
> >
> > Now I will add two small “rectangles” into the table.  I don’t really
> need
> > to
> > use multiple polygons to reproduce the issue so I will use a single
> polygon
> > with
> > points to approximate lines of constant latitude (just trying to avoid
> the
> > great
> > circle arc).
> >
> > =# insert into multipolygon
> > values (1, ST_GeographyFromText('SRID=4326;multipolygon(((20 10, 21 10,
> 22
> > 10, 23 10, 24 10, 25 10, 26 10, 27 10, 28 10, 29 10, 30 10, 30 20, 29
> 20, 28
> > 20, 27 20, 26 20, 25 20, 24 20, 23 20, 22 20, 21 20, 20 20, 20 10)))'));
> > INSERT 0 1
> >
> > =# insert into multipolygon
> > values (2, ST_GeographyFromText('SRID=4326;multipolygon(((25 15, 26 15,
> 27
> > 15, 28 15, 29 15, 30 15, 31 15, 32 15, 33 15, 34 15, 35 15, 35 25, 34
> 25, 33
> > 25, 32 25, 31 25, 30 25, 29 25, 28 25, 27 25, 26 25, 25 25, 25 15)))'));
> > INSERT 0 1
> >
> >
> >
> > Now I run a query that should return both rows.
> >
> >
> > =# select m1.mpid from multipolygon m1 where st_intersects(m1.multip
> > ,'POLYGON((27 17, 28 17, 28 19, 27 19, 27 17))'::geography);
> >  mpid
> > ------
> >     1
> > (1 row)
> >
> > =#  select m1.mpid from multipolygon m1 where st_intersects((select
> multip
> > from multipolygon m2 where m2.mpid = m1.mpid ),'POLYGON((27 17, 28 17, 28
> > 19, 27 19, 27 17))'::geography );
> >  mpid
> > ------
> >     1
> >     2
> > (2 rows)
> >
> >
> > You can see the two queries seem to be logically the same but they return
> > different results
> > Next I will look at the plans for the two queries
> >
> >
> > =# explain select m1.mpid from multipolygon m1 where
> st_intersects(m1.multip
> > ,'POLYGON((27 17, 28 17, 28 19, 27 19, 27 17))'::geography);
> >
> > QUERY PLAN
> >
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >  Seq Scan on multipolygon m1  (cost=0.00..317.40 rows=77 width=8)
> >    Filter: ((multip &&
> >
> '0103000020E610000001000000050000000000000000003B4000000000000031400000000000003C4000000000000031400000000000003C4000000000000033400000000000003B4000000000000033400000000000003B400000000000003140'::geography)
> > AND (_st_distance(multip,
> >
> '0103000020E610000001000000050000000000000000003B4000000000000031400000000000003C4000000000000031400000000000003C4000000000000033400000000000003B4000000000000033400000000000003B400000000000003140'::geography,
> > 0::double precision, false) < 1e-05::double precision))
> > (2 rows)
> >
> >
> > =# explain select m1.mpid from multipolygon m1 where
> st_intersects((select
> > multip from multipolygon m2 where m2.mpid = m1.mpid ),'POLYGON((27 17, 28
> > 17, 28 19, 27 19, 27 17))'::geography );
> >
> > QUERY PLAN
> >
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >  Seq Scan on multipolygon m1  (cost=0.00..9788.80 rows=387 width=8)
> >    Filter: st_intersects((SubPlan 1),
> >
> '0103000020E610000001000000050000000000000000003B4000000000000031400000000000003C4000000000000031400000000000003C4000000000000033400000000000003B4000000000000033400000000000003B400000000000003140'::geography)
> >    SubPlan 1
> >      ->  Index Scan using pk_polygon on multipolygon m2  (cost=0.15..8.17
> > rows=1 width=32)
> >            Index Cond: (mpid = m1.mpid)
> > (5 rows)
> >
> > You can see that the query using _st_distance is the one that returns a
> bad
> > result.
> > However I can also show _st_distance() works if it is limited to finding
> > just one
> > result.  Here is the same query that didn’t work, however this time I
> add a
> > constraint
> > to limit the result set to one row.
> >
> >
> > =# select m1.mpid from multipolygon m1 where st_intersects(m1.multip
> > ,'POLYGON((27 17, 28 17, 28 19, 27 19, 27 17))'::geography) and m1.mpid
> = 1;
> >  mpid
> > ------
> >     1
> > (1 row)
> >
> > =# select m1.mpid from multipolygon m1 where st_intersects(m1.multip
> > ,'POLYGON((27 17, 28 17, 28 19, 27 19, 27 17))'::geography) and m1.mpid
> = 2;
> >  mpid
> > ------
> >     2
> > (1 row)
> >
> > =# explain select m1.mpid from multipolygon m1 where
> st_intersects(m1.multip
> > ,'POLYGON((27 17, 28 17, 28 19, 27 19, 27 17))'::geography) and m1.mpid
> = 2;
> >
> > QUERY PLAN
> >
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >  Index Scan using pk_polygon on multipolygon m1  (cost=0.15..8.43 rows=1
> > width=8)
> >    Index Cond: (mpid = 2)
> >    Filter: ((multip &&
> >
> '0103000020E610000001000000050000000000000000003B4000000000000031400000000000003C4000000000000031400000000000003C4000000000000033400000000000003B4000000000000033400000000000003B400000000000003140'::geography)
> > AND (_st_distance(multip,
> >
> '0103000020E610000001000000050000000000000000003B4000000000000031400000000000003C4000000000000031400000000000003C4000000000000033400000000000003B4000000000000033400000000000003B400000000000003140'::geography,
> > 0::double precision, false) < 1e-05::double precision))
> > (3 rows)
> >
> >
> > This might be the wrong conclusion but it appears that
> > _st_distance() doesn’t work if the result set contains more
> > than one row.
> >
> >
> >
> > Thanks in advance,
> > Bob
> > -----------------------------------------------------------------
> > Robert Hartranft
> > Mission Operations and Services
> > Intelligence & Information System
> > Raytheon Company
> > 301-851-8197
> > robert_m_hartranft at raytheon.com
> > robert.m.hartranft at nasa.gov
> > 5700 Rivertech Court
> > Riverdale, MD 20737
> > -----------------------------------------------------------------
> >
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at lists.osgeo.org
> > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150109/bf9c224c/attachment.html>


More information about the postgis-users mailing list