[postgis-users] Strange behavior with empty geometries GROUP BY

Regina Obe lr at pcorp.us
Mon Sep 11 14:53:05 PDT 2017


Yes good replacement and what I recommend or geom::text  I think works just
as well.

 

From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf
Of Pierre Racine
Sent: Monday, September 11, 2017 5:17 PM
To: PostGIS Users Discussion <postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

:) 

 

Well if you have some geometric aggregators you have to expect that at some
point some crazy people will want to GROUP BY or DISTINCT geometries. A
simple example (not crazy at all) is if I want to search for duplicates
geometries:

 

SELECT geom, count(*) cnt

FROM mytable

GROUP BY geom

HAVING count(*) > 1;

 

Now I know it doesn’t work well enough since two different geometries having
the same bounding box will look like duplicates. I should probably GROUP BY
ST_AsEWKB(geom) instead:

 

SELECT ST_AsEWKB(geom)::geometry, count(*) cnt

FROM mytable

GROUP BY ST_AsEWKB(geom)

HAVING count(*) > 1;

 

I guess that a good replacement?

 

Pierre

 

De : postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] De la part
de Regina Obe
Envoyé : 11 septembre 2017 16:32
À : 'PostGIS Users Discussion' <postgis-users at lists.osgeo.org
<mailto:postgis-users at lists.osgeo.org> >
Cc : 'PostGIS Development Discussion' <postgis-devel at lists.osgeo.org
<mailto:postgis-devel at lists.osgeo.org> >
Objet : Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

Pierre,

 

Jump on the boat of all those folks who want it changed.  You are in crowded
company.

Only reason I think  we haven't done it is it will require a lot of testing
to see how it breaks peoples code, any performance bottlenecks, and people
relying on that odd behavior for well over 15 years.

 

I personally don't care as why anyone would want to group by something as
largish as a geometry always puzzled me

And no definition of geometric equality ever satisfied me to the point where
I wouldn't be asked questions

 

How come ST_Equals says these are equal but   geom::text = geom::text says
they are not.

 

Seems to me like trading one set of problems for another.

 

Thanks,

Regina

From: postgis-users [ <mailto:postgis-users-bounces at lists.osgeo.org>
mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Pierre Racine
Sent: Monday, September 11, 2017 4:20 PM
To: PostGIS Users Discussion < <mailto:postgis-users at lists.osgeo.org>
postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

Any reason why it’s not something more precise then the equality of the
bounding boxes? Is it redefinable?

 

De : postgis-users [ <mailto:postgis-users-bounces at lists.osgeo.org>
mailto:postgis-users-bounces at lists.osgeo.org] De la part de Regina Obe
Envoyé : 11 septembre 2017 15:59
À : 'PostGIS Users Discussion' < <mailto:postgis-users at lists.osgeo.org>
postgis-users at lists.osgeo.org>
Objet : Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

BTree = operator

 

Which is bounding box equality

 

From: postgis-users [ <mailto:postgis-users-bounces at lists.osgeo.org>
mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Pierre Racine
Sent: Monday, September 11, 2017 3:54 PM
To: PostGIS Users Discussion < <mailto:postgis-users at lists.osgeo.org>
postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

What is the operator used by GROUP BY?

 

De : postgis-users [ <mailto:postgis-users-bounces at lists.osgeo.org>
mailto:postgis-users-bounces at lists.osgeo.org] De la part de Regina Obe
Envoyé : 11 septembre 2017 15:42
À : 'PostGIS Users Discussion' < <mailto:postgis-users at lists.osgeo.org>
postgis-users at lists.osgeo.org>
Objet : Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

Hmm interesting.  I was about to say they probably have the same bounding
box but they don't quite according to box2d.  Perhaps our box2d returns
float4 but our internal box is double precision?

 

I don't know.

----------------------------------------------------------

 

WITH polys AS (

  SELECT 1 id, ST_GeomFromText('MULTIPOLYGON(((251000 5243597.61702128,

                                               251038 5243558,

                                               251056 5243527,

                                               251057.227272727 5243500,

                                               251000 5243500,

                                               251000 5243597.61702128)),

                                             ((251021 5243690,

                                               251000 5243681.87096774,

                                               251000 5243750,

                                               251250 5243750,

                                               251250 5243749.86206897,

                                               251196 5243694,

                                               251168 5243687,

                                               251021 5243690)))') geom

  UNION ALL

  SELECT 2 id, ST_GeomFromText('POLYGON((251000 5243681.87096774,

                                         251021 5243690,

                                         251168 5243687,

                                         251196 5243694,

                                         251250 5243749.86206897,

                                         251250 5243598.15,

                                         251167.008064516 5243500,

                                         251057.227272727 5243500,

                                        251056 5243527,

                                         251038 5243558,

                                         251000 5243597.61702128,

                                         251000 5243681.87096774))') geom

) 

SELECT BOX2D(geom)

FROM polys;

 

 

BOX(251000 5243500,251250 5243750)

BOX(251000 5243500,251250 5243749.86206897)

 

 

 

 

From: postgis-users [ <mailto:postgis-users-bounces at lists.osgeo.org>
mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Pierre Racine
Sent: Monday, September 11, 2017 3:08 PM
To: PostGIS Users Discussion < <mailto:postgis-users at lists.osgeo.org>
postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

I get a similar bug but now with full pledged geometries. Those two
polygons, though very different, GROUP BY together. Which mysterious PostGIS
operator is used by GROUP BY and make those two polygons equivalent?

 

  WITH polys AS (

  SELECT 1 id, ST_GeomFromText('MULTIPOLYGON(((251000 5243597.61702128,

                                               251038 5243558,

                                               251056 5243527,

                                               251057.227272727 5243500,

                                               251000 5243500,

                                               251000 5243597.61702128)),

                                             ((251021 5243690,

                                               251000 5243681.87096774,

                                               251000 5243750,

                                               251250 5243750,

                                               251250 5243749.86206897,

                                               251196 5243694,

                                               251168 5243687,

                                               251021 5243690)))') geom

  UNION ALL

  SELECT 2 id, ST_GeomFromText('POLYGON((251000 5243681.87096774,

                                         251021 5243690,

                                         251168 5243687,

                                         251196 5243694,

                                         251250 5243749.86206897,

                                         251250 5243598.15,

                                         251167.008064516 5243500,

                                         251057.227272727 5243500,

                                         251056 5243527,

                                         251038 5243558,

                                         251000 5243597.61702128,

                                         251000 5243681.87096774))') geom

) SELECT ST_AsText(geom)

  FROM polys 

  GROUP BY geom;

 

Pierre

 

De : postgis-users [ <mailto:postgis-users-bounces at lists.osgeo.org>
mailto:postgis-users-bounces at lists.osgeo.org] De la part de Pierre Racine
Envoyé : 29 juin 2017 15:42
À : PostGIS Users Discussion < <mailto:postgis-users at lists.osgeo.org>
postgis-users at lists.osgeo.org>
Objet : Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

Much better


 

De : postgis-users [ <mailto:postgis-users-bounces at lists.osgeo.org>
mailto:postgis-users-bounces at lists.osgeo.org] De la part de Regina Obe
Envoyé : 23 juin 2017 23:26
À : 'PostGIS Users Discussion' < <mailto:postgis-users at lists.osgeo.org>
postgis-users at lists.osgeo.org>
Objet : Re: [postgis-users] Strange behavior with empty geometries GROUP BY

 

Pierre,

 

Thanks for pointing out the issue.  Sandro has fixed the problem

 

 <https://trac.osgeo.org/postgis/ticket/3777#comment:17>
https://trac.osgeo.org/postgis/ticket/3777#comment:17

 

 

Can you try with your larger workload.  If you happen to have a windows box,
64-bit  pre-compiled binaries here:

 

 <http://postgis.net/windows_downloads/>
http://postgis.net/windows_downloads/

 

(I have 32-bit building turned off at moment cause it errors out on the gui
tests and haven't had a chance to troubleshoot).

 

 

If you have a patched version:

 

SELECT postgis_full_version();

 

Your version should have  r15450 (or above) for PostGIS 2.4 and r15451 (or
above)for PostGIS 2.3.

 

POSTGIS="2.3.3dev r15451" GEOS="3.6.1-CAPI-1.10.1 r4317" PROJ="Rel. 4.9.1,
04 March 2015" GDAL="GDAL 2.1.3, released 2017/20/01" LIBXML="2.7.8"
LIBJSON="0.12" RASTER

 

Thanks,

Regina

 

From: postgis-users [ <mailto:postgis-users-bounces at lists.osgeo.org>
mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Pierre Racine
Sent: Tuesday, June 20, 2017 5:42 PM
To: 'PostGIS Users Discussion' < <mailto:postgis-users at lists.osgeo.org>
postgis-users at lists.osgeo.org>
Subject: [postgis-users] Strange behavior with empty geometries GROUP BY

 

Hi,

 

When I try to count the occurrence of a number of empty geometries like
this:

 

SELECT ST_AsText(geom) txt, count(*)

FROM (

SELECT 'POINT(0 0)'::geometry geom

UNION ALL

SELECT 'POINT(0 0)'::geometry geom

UNION ALL

SELECT 'POINT(0 0)'::geometry geom

UNION ALL

SELECT 'POINT(0 1)'::geometry geom

UNION ALL

SELECT 'LINESTRING(0 0,0 1)'::geometry geom

UNION ALL

SELECT 'GEOMETRYCOLLECTION EMPTY'::geometry geom

UNION ALL

SELECT 'POINT EMPTY'::geometry geom

) foo

GROUP BY geom;

 

The 'GEOMETRYCOLLECTION EMPTY' and the 'POINT EMPTY' do not aggregate and I
get:

 

txt
count

POINT(0 0)                                                         2

POINT EMPTY                                                   1

POINT(0 0)                                                         1

LINESTRING(0 0,0 1)                                      1

GEOMETRYCOLLECTION EMPTY                1

POINT(0 1)                                                         1

 

If I remove any geometry other than the two EMPTY ones like this (I removed
the first one 'POINT(0 0)'), the two EMPTY geometries DO aggregate:

 

SELECT ST_AsText(geom) txt, count(*)

FROM (

SELECT 'POINT(0 0)'::geometry geom

UNION ALL

SELECT 'POINT(0 0)'::geometry geom

UNION ALL

SELECT 'POINT(0 1)'::geometry geom

UNION ALL

SELECT 'LINESTRING(0 0,0 1)'::geometry geom

UNION ALL

SELECT 'GEOMETRYCOLLECTION EMPTY'::geometry geom

UNION ALL

SELECT 'POINT EMPTY'::geometry geom

) foo

GROUP BY geom;

 

Result:

 

txt                                          count

POINT(0 0)                          2

LINESTRING(0 0,0 1)       1

POINT EMPTY                    2

POINT(0 1)                          1

 

Any idea why empty geometries aggregation seem so dependent on what others
geoms are in the table?

 

Sorry I could not build a more simple example.

 

Thanks,

 

Pierre

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20170911/c0f1778d/attachment.html>


More information about the postgis-users mailing list