<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">Mon, 11 Apr 2016 at 17:36, Mark Cave-Ayland &lt;<a href="mailto:mark.cave-ayland@ilande.co.uk">mark.cave-ayland@ilande.co.uk</a>&gt;:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 11/04/16 12:06, Komяpa wrote:<br>
<br>
> Having counter-intuitive behaviour leads to really subtle bugs that are<br>
> nearly impossible to debug.<br>
> SQL was meant to be human-readable; things like this one really break the<br>
> perception of it as English text and break expectations.<br>
><br>
> Distinct is (meant? felt?) to deduplicate identical records, that<br>
> appeared possibly because of some join elsewhere in the flow (or for any<br>
> other reason). For floating point numbers there, I don't really expect<br>
> "nearly equals" behaviour, so ::bytea/memcmp-like comparison for<br>
> geometries seems sane in the case.<br>
><br>
> When I do SELECT DISTINCT geom, I want _distinct geometry_, not<br>
> _geometry with distinct boxes_. For distinct boxes I'd write SELECT<br>
> DISTINCT on (ST_Envelope(geom)) geom, and that's a rather rare case. Having<br>
> these two swapped really requires mind-twisting, and the more<br>
> mind-twisting it requires, the fewer people can use it.<br>
><br>
> How about sorting by zig-zag-encoded coordinates?<br>
><br>
> Are there any showstoppers to implementing this change, other than everyone<br>
> having to REINDEX?<br>
<br>
Bear in mind that DISTINCT isn't quite as simple as you may imagine -<br>
for example would you consider two geometries with the same coordinates<br>
but in reverse order the same?</blockquote><div> </div><div>They're different.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Or how about two geometries that are<br>
exactly the same but one with a 3rd dimension coordinate added to some<br>
vertices?<br></blockquote><div> </div><div>They're different. </div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Also note that memcmp() can't be directly used on floating point numbers<br>
since it is possible to have multiple binary representations of the same<br>
number (a quick search for memcmp floating point numbers will point you<br>
towards some of the issues).<br></blockquote><div> </div><div>That's fine.</div><div><br></div><div>It's no different from the currently recommended ::bytea or ::text casts, but keeps the SQL code more readable.</div><div><br></div><div>If someone needs more sophisticated filtering, they can always apply whatever fits them better, like ST_SnapToGrid or ST_CollectionHomogenize or ST_ForceRHR or ST_MakeValid or ST_Area(ST_Intersection(geom1, geom2))/(ST_Area(geom1)+ST_Area(geom2)) - whatever they like best.</div><div><br></div><div>The current problem is that with the current approach DISTINCT removes too many values that differ from one another.</div></div></div>
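<div>For reference, here is a minimal sketch of the memcmp point raised above, assuming IEEE 754 doubles. It is plain Python, not PostGIS code, and the helper names raw_bytes and bytes_equal are made up for illustration. It shows the cases where a byte-wise comparison and a numeric comparison disagree:</div>

```python
import struct


def raw_bytes(x: float) -> bytes:
    """Return the 8-byte IEEE 754 encoding of x (network byte order)."""
    return struct.pack("!d", x)


def bytes_equal(a: float, b: float) -> bool:
    """memcmp-style comparison: equal only if every byte matches."""
    return raw_bytes(a) == raw_bytes(b)


# Positive and negative zero are numerically equal but differ in the sign bit.
zero_numeric = (0.0 == -0.0)
zero_bytes = bytes_equal(0.0, -0.0)

# NaN never compares numerically equal to itself, yet its bytes match.
nan = float("nan")
nan_numeric = (nan == nan)
nan_bytes = bytes_equal(nan, nan)

# Ordinary values agree under both comparisons.
plain_numeric = (1.5 == 1.5)
plain_bytes = bytes_equal(1.5, 1.5)
```

<div>Under a byte-wise rule, 0.0 and -0.0 would count as distinct values, which is the same behaviour one already gets from comparing ::bytea casts.</div>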