[postgis-devel] ST_OrderingEquals vs ~=

Kevin Neufeld kneufeld at refractions.net
Sat Jun 20 21:56:35 PDT 2009


Yeah, I hear you.  I was asked just the other day ... again ..."Why oh 
why does = not mean = in PostGIS?  Is it for speed?  If so, that's what 
the && is for."  All I can do is smile and shrug my shoulders.  I really 
can't say what the original reason behind it was.

It does kinda make sense that a GROUP BY geometry would use the exact 
geometry.  If one was concerned about performance, they could explicitly 
GROUP BY the bounding box of the geometry.

But you're right, who knows what's in that Pandora's box once you open it.

-- Kevin

Paul Ramsey wrote:
> We have sort of a cleavage between the people who really understand
> the use cases and the people who really understand the code. If there
> was a suite or a write up of all the ways to exercise this stuff, I
> could alter the code and do some tests. But yes, the knowledge that
> it's been this way a long time sort of paralyzes me -- who knows what
> a change would break.
>
> P
>
> On Sat, Jun 20, 2009 at 8:25 PM, Paragon Corporation <lr at pcorp.us> wrote:
>   
>> You sure it doesn't.  If it doesn't then our documentation is wrong
>>
>> http://postgis.refractions.net/documentation/manual-svn/ST_Geometry_Same.htm
>> l
>>
>> Also I thought it did because when I do this
>>
>> SELECT * FROM
>> tbl_feature_data WHERE the_geom ~=
>> '0101000020E610000009D31947329B5DC059448510E3A84240'
>>
>> I see my spatial indexes being used.
>>
>> Also when I look in the definition for gist_geometry_ops, I see it listed
>>
>> CREATE OPERATOR CLASS gist_geometry_ops DEFAULT
>>   FOR TYPE geometry USING gist AS
>>   FUNCTION 1  lwgeom_gist_consistent(internal, geometry, integer),
>>   FUNCTION 2  lwgeom_gist_union(bytea, internal),
>>   FUNCTION 3  lwgeom_gist_compress(internal),
>>   FUNCTION 4  lwgeom_gist_decompress(internal),
>>   FUNCTION 5  lwgeom_gist_penalty(internal, internal, internal),
>>   FUNCTION 6  lwgeom_gist_picksplit(internal, internal),
>>   FUNCTION 7  lwgeom_gist_same(box2d, box2d, internal)
>>   STORAGE box2d;
>>
>>  Yah the ST_Equals things was another open question that that should use an
>> index but doesn't.
>>
>> The = behavior -- I think we had big long heated arguments about this though
>> can't quite remember what if anything we decided. I think it was a stand
>> still and everyone was either afraid to touch it because its such a long
>> standing gotcha and wasn't a big deal once you know abut it.  If we change
>> this, I suspect it may slow down a lot of things and I wasn't clear if the
>> reason for its existence was for performance or because it ironically
>> doesn't use a gist index and a non-gist based operator is needed for GROUP
>> BY, UNION since in theory it shouldn't be lossy though it is really lossy.
>> Do things like UNION, GROUP BY etc require btree based operators?  It is
>> very strange that the very thing you would think would use a gist index
>> (bbox = bbox) is not defined for gist but for btree. Or at least that always
>> kept me scratching my head.
>>
>> The other side effects of changing this behavior
>> Pro - Leo and I have to rip out 2 pages from our book describing how to not
>> be a victim of the = is not = paradox (it keeps people awake)
>> Con - Leo and I will have to rip out 2 pages from our book describing how to
>> not be a victim of the = is not = paradox. (it makes PostGIS more logical
>> sounding)
>>
>>
>>
>> -----Original Message-----
>> From: postgis-devel-bounces at postgis.refractions.net
>> [mailto:postgis-devel-bounces at postgis.refractions.net] On Behalf Of Kevin
>> Neufeld
>> Sent: Saturday, June 20, 2009 10:08 PM
>> To: PostGIS Development Discussion
>> Subject: Re: [postgis-devel] ST_OrderingEquals vs ~=
>>
>> Actually, looking at it, I think this is correct.
>>
>> ~= does not use indexes.  The operator invokes 'ST_geometry_same' which
>> exposes the C function 'LWGEOM_same' found in 'lwgeom_functions_basic.c'.
>>
>> Your other comment about ST_Equals could be looked at however.
>> It currently just calls geomequals, without making use of any indexes.
>> Shouldn't the function be a simply wrapper for 'SELECT $1 & $2 AND
>> _ST_Equals($1, $2)'?
>>
>> Also, why is the = operator bound to lwgeom_eq and not LWGEOM_same?  The
>> former uses bounding boxes for comparison and the later compares exact
>> geometry equality.  I know that this operator is used in GROUP BY
>> expressions .. but would it be beneficial to GROUP BY the exact geometries
>> instead of just their bounding boxes? This is such a big PostGIS gotcha.
>>
>> Anywho, just some thoughts.
>> -- Kevin
>>
>> Paragon Corporation wrote:
>>     
>>> I think Kevin mentioned this before and he wasn't sure why either but
>>> thought I would bring it up since I'm not clear what the reason for this
>>>       
>> is.
>>     
>>> Our ST_OrderingEquals function is a short-hand for SELECT $1 && $2 AND
>>> $1 ~= $2
>>>
>>> But ~= already uses a GIST index doesn't it?  So why this extra $1 &&
>>> $2 check?
>>>
>>> Is the && faster to compute? Or have higher selectivity?
>>>
>>> Thanks,
>>> Regina
>>>
>>>
>>> _______________________________________________
>>> postgis-devel mailing list
>>> postgis-devel at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>>>
>>>       
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>>
>>
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>>     
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>   



More information about the postgis-devel mailing list