[postgis-users] Re: Re bug with GIST indexes

Mark Cave-Ayland m.cave-ayland at webbased.co.uk
Sun Mar 9 12:09:55 PST 2003


Hi Michael, Dave, Chris & Listers,

Although I'm not making use of the @ and ~ operators, i've spent a day
successfully investigating and solving the problems Michael was
experiencing, mainly for 2 reasons:

1) We have only just upgraded to PG7.3 on our production systems and so
any problems with the GiST spatial indexes may have a big impact on
performance for us.

2) It seemed like a good excuse to learn about GiST, PostgreSQL &
PostGIS internals.

So here we go.... I decided to firstly concentrate on the @ operator as
it would be likely that the two problems would be related. And it seemed
a good thing to do would be to dump the numbers going through the
geometry_contain() function in postgis_ops.c for both indexed and
sequential scans and try and determine what was going on.

This is where I got the first inkling that something was not right at
all: what I found when I re-compiled PostGIS took me completely by
surprise. While the sequential scan threw lots of nice debugging info
into a file, the index scan made none at all! So basically the
geometry_contain() routine was not being called when GiST index scans
where enabled - so where on earth was the result of the containment
being calculated???

At this point it was necessary to go back to the PostgreSQL
documentation pages and spend a couple of hours reading up on GiST
indexing and operator support. Using this I discovered that the operator
class was used to specify the link between operators and the associated
function they used in PostGIS. More reading about GiST revealed that
GiST indexes have a series of methods that must be implemented to
realise a GiST index. It appeared that the most interesting one of these
was the Consistent() function that applied an operator between two items
and returned either true or false depending upon whether the operator
condition was satisfied or not.

This led me into ggeometry_consistent() (and eventually
rtree_internal_consistent()) which showed how each numbered operator was
used to call a different function to calculate the result. I inserted
some extra debugging checks here and was pleased when re-running the
index query to see 9 entries into rtree_internal_consistent() with a
strategy of 8 indicating the @ operator. Good, so this indicated I was
in the right place.

Strategy 8 was entitled 'RTContainedByStrategyNumber' and put in a call
to a box_overlap() function. Thinking this was a wrapper to the
ggeometry_contain() function, I did a search on box_overlap() to find
out what it was doing. To my surprise, the box_overlap() function was
not in PostGIS but in the PostgreSQL source tree under
/src/backend/utils/adt/geo_ops.c!!! Then I realised what was happening:
PostGIS was using the PostgreSQL internal geometry functions to compute
index results and not its own. It did this by converting a BOX3D into a
BOX, running the internal function, and then returning the result.

I then inserted some debugging information into
rtree_internal_consistent() and found out that box_overlap was always
returning true for all 9 cases of checking the overlap using a GiST
index. A quick comparison of box_ov() and geometry_contain() showed that
they were actually different. So I proceeded to create a simple function
mca_geomtest() which performed the same test as geometry_contain() and
wrote the results of both functions out to a file. The result was that
the calculation from geometry_contain() returned 6 true results whereas
the box_overlap() calculation function returned true for all 9 results.

So I simply replaced the box_overlap() function to my new function
containing the calculation from geometry_contain(). And the result?
Bingo!


sparkytest=# explain analyze select * from foo a, foo b where a.the_geom
@ b.the_geom;
                                             QUERY PLAN
------------------------------------------------------------------------
-----------------------------
 Nested Loop  (cost=0.00..4.23 rows=1 width=72) (actual time=0.26..0.78
rows=6 loops=1)
   Join Filter: ("outer".the_geom @ "inner".the_geom)
   ->  Seq Scan on foo a  (cost=0.00..1.03 rows=3 width=36) (actual
time=0.02..0.03 rows=3 loops=1)
   ->  Seq Scan on foo b  (cost=0.00..1.03 rows=3 width=36) (actual
time=0.00..0.01 rows=3 loops=3)  Total runtime: 0.84 msec (5 rows)

sparkytest=# select * from foo a, foo b where a.the_geom @ b.the_geom;
 id |              the_geom              | id |              the_geom
----+------------------------------------+----+-------------------------
----+------------------------------------+----+-----------
  1 | SRID=-1;POINT(1 1)                 |  1 | SRID=-1;POINT(1 1)
  1 | SRID=-1;POINT(1 1)                 |  1 | SRID=-1;LINESTRING(1 1,2
2)
  1 | SRID=-1;POINT(1 1)                 |  1 | SRID=-1;POLYGON((1 1,2
2,3 3,1 1))
  1 | SRID=-1;LINESTRING(1 1,2 2)        |  1 | SRID=-1;LINESTRING(1 1,2
2)
  1 | SRID=-1;LINESTRING(1 1,2 2)        |  1 | SRID=-1;POLYGON((1 1,2
2,3 3,1 1))
  1 | SRID=-1;POLYGON((1 1,2 2,3 3,1 1)) |  1 | SRID=-1;POLYGON((1 1,2
2,3 3,1 1)) (6 rows)

sparkytest=# set enable_seqscan='f';
SET
sparkytest=# explain analyze select * from foo a, foo2 b where
a.the_geom @ b.the_geom;
                                                        QUERY PLAN
------------------------------------------------------------------------
---------------------------------------------------
 Nested Loop  (cost=100000000.00..100000012.11 rows=1 width=72) (actual
time=0.16..0.59 rows=6 loops=1)
   ->  Seq Scan on foo a  (cost=100000000.00..100000001.03 rows=3
width=36) (actual time=0.02..0.03 rows=3 loops=1)
   ->  Index Scan using idx_foo2_geom on foo2 b  (cost=0.00..3.68 rows=1
width=36) (actual time=0.12..0.17 rows=2 loops=3)
         Index Cond: ("outer".the_geom @ b.the_geom)
 Total runtime: 0.72 msec
(5 rows)

sparkytest=# select * from foo a, foo b where a.the_geom @ b.the_geom;
 id |              the_geom              | id |              the_geom
----+------------------------------------+----+-------------------------
----+------------------------------------+----+-----------
  1 | SRID=-1;POINT(1 1)                 |  1 | SRID=-1;POINT(1 1)
  1 | SRID=-1;POINT(1 1)                 |  1 | SRID=-1;LINESTRING(1 1,2
2)
  1 | SRID=-1;POINT(1 1)                 |  1 | SRID=-1;POLYGON((1 1,2
2,3 3,1 1))
  1 | SRID=-1;LINESTRING(1 1,2 2)        |  1 | SRID=-1;LINESTRING(1 1,2
2)
  1 | SRID=-1;LINESTRING(1 1,2 2)        |  1 | SRID=-1;POLYGON((1 1,2
2,3 3,1 1))
  1 | SRID=-1;POLYGON((1 1,2 2,3 3,1 1)) |  1 | SRID=-1;POLYGON((1 1,2
2,3 3,1 1)) (6 rows)


So in summary: the problem occurs because the box_overlap() function
always returns true in postgresql. However this does give rise to a
number of points:


1) Having two separate functions box_overlap() and geometry_contain()
doing the same operator is bad. It appears that some of this is legacy
code which exists from modifying contrib/rtree_gist. It is highly
recommended that a single function is written which can do both, so a
change in one place fixes it for both index and non-index scans.

2) PostGIS seems reliant upon some PostgreSQL functions for its
operation. This should probably be changed because a change in the
PostgreSQL source tree may break things for some versions, and not other
versions (as seen here).

3) Even though 2) is the case, I cannot see any difference between the
box_overlap() function between 7.2.x and 7.3.x.... so I can only assume
that some underlying data type has changed which causes this to fail?

4) It appears indexed queries are not SRID aware; I think the solution
to this is that BOX3Ds must have a SRID attached to them as opposed to
the geometry; then rtree_internal_consistent() can throw an error if the
SRID of the BOX3D and the geometry do not match. As the GISTENTRY
structure seems to return a BOX3D (ie no SRID) I can only imagine that a
blind conversion is done by convert_box3d_to_box() unless this is
trapped somewhere else.


I have uploaded the debugging modifications (and the fix for the @
operator) I made to postgis_gist_72.c and postgis_ops.c to the following
URL: http://homepages.zen.co.uk/zen9662/postgis/. Simply replace the
current CVS versions from yesterday with the above files and have a look
at the file /tmp/out.pg to see what goes on for indexed and non-indexed
queries (remember I've only been looking at the @ operator).

The reason that I haven't written and submitted a proper patch is
firstly Dave & Chris will need to check through everything here, and
secondly, I would imagine a proper patch would involve rewriting
sections of postgis_ops.c and postgis_gist_72.c to take points 1-4 into
account. However, if just a simple fix is required (like I did for the @
operator), it should just be a matter a minutes to duplicate the
function as I have done, tidy up the additions I have made, and place
the new functions somewhere more appropriate in the source tree.


Cheers,

Mark.


---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446


This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.

> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On 
> Behalf Of David Blasby
> Sent: 07 March 2003 18:17
> To: Michael O'Sullivan; chris at refractions.net; 
> postgis-users at postgis.refractions.net
> Subject: [postgis-users] Re: Re bug with GIST indexes
> 
> 
> Michael,
> 
> We have noted the problem you found (see chris's message to
> the list). 
>  We havent been able to figure out why its doing what its doing.
> 
> Basically, if you're using the index, you're getting the incorrect
> results.  Everything we've looked at indicates it should be 
> working as 
> expected, so we're a bit stuck.
> 
> Both chris and I are very busy right now, so we dont have
> much time to 
> spend tracking it down.  Hopefully things will be less busy 
> next month.
> 
> I suggest you either (1) dont use the @ and ~ operators (2) only use
> them if you havent indexed the table (3)  re-write your 
> queries so they 
> use &&  ("overlap") instead of @/~ or (4) use postgresql 7.1 or 7.2.
> 
> Not many people actually use these operators (I think you're
> one of the 
> only people who do), so no one noticed them doing something wrong.  
> 
> Next month we hope to release postgis with GEOS support -
> this will give 
> access to robust implementations of the OGC relate() 
> functions.  You'll 
> be able to determine all sort of complex relationship between two 
> geometries (not just their bounding boxes).
> 
> dave
> 
> 
> _______________________________________________
> postgis-users mailing list postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 





More information about the postgis-users mailing list