[postgis-devel] Btree operators

Mark Cave-Ayland mark.cave-ayland at ilande.co.uk
Mon Apr 11 07:17:46 PDT 2016


On 11/04/16 12:06, Komяpa wrote:

> Having counter-intuitive behaviour leads to realy sublte bugs that are
> nearly impossible to debug.
> SQL was meant as human-readable, things like this one really break the
> perceprion of it as of English text and break expectations.
> 
> Distinct is (meant? felt?) to deduplicate identical records, that
> appeared possibly because of some join elsewhere in the flow (or for any
> other reason). For floating point numbers there, I don't really expect
> "nearly equals" behaviour, so ::bytea/memcmp-like comparsion for
> geometries seems sane in the case. 
> 
> When I do SELECT DISTINCT geom, I want _distinct geometry_, not
> _geometry with distinct boxes_. For distinct boxes I'd write SELECT
> DISTINCT on (ST_Envelope(geom)) geom, and that's rather rare case. These
> two being swapped really require mind-twisting, and the more
> mind-twisting it requires the less people can use it.
> 
> How about sorting by zig-zag-encoded coordinates? 
> 
> Are there any showstoppers to implement this change, except of everyone
> has to REINDEX?

Bear in mind that DISTINCT isn't quite as simple as you may imagine -
for example would you consider two geometries with the same coordinates
but in reverse order the same? Or how about two geometries that are
exactly the same but one with a 3rd dimension coordinate added to some
vertices?

Also note that memcmp() can't be directly used on floating point numbers
since it is possible to have multiple binary representations of the same
number (a quick search for memcmp floating point numbers will point you
towards some of the issues).


HTH,

Mark.




More information about the postgis-devel mailing list