[postgis-devel] Re: [postgis-users] GIST index speed

Mon Jun 16 03:59:56 PDT 2008

Tom Lane wrote:

> After looking around a bit, I thought probably the best place to do the
> deed is in ExecIndexEvalRuntimeKeys().  The attached patch is a bit
> larger than I'd anticipated because it seems best to look up the
> possible toastability of the datatype just once per query, instead of
> every time through ExecIndexEvalRuntimeKeys().

Great! My first question was going to be along the lines of "would someone
of my experience be able to handle this?", but obviously you beat me to
it. It looks like a relatively straightforward small and non-invasive
patch which is good.

> The patch passes core and contrib regression tests in CVS HEAD, but
> I've not tried it further back.  (I can say that the patch applies
> successfully to 8.3 and 8.2 branches, but not 8.1; I didn't look at
> what it might take to make it go in 8.1 or before.)

Okay. I'm personally not too worried about 8.1, but inclusion within 8.2
and 8.3 would definitely help out a lot of people with large geometries.

> I don't have any test cases that might show performance benefits.
> I'd be willing to apply the patch to HEAD on my own authority, but
> to get it into existing release branches, I think the PostGIS crowd
> will need to make a case to pgsql-hackers that there are significant
> performance benefits here.  Hence, please test ...

(goes away and plays for a bit)

Hmmm there is still something strange going on; I can see some performance
benefit to the patch, but nothing near to the level of the mcatest()
function I was using before.

Here are the results using vanilla PostgreSQL 8.3.3 for Steve's original
query without your included patch:

postgis=# explain analyze select id,name from geography where type='Z' and
centroid && (select the_geom from geography where id=69495);
                                                                    QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using geography_centroid_index on geography 
(cost=7092.00..7100.28 rows=1 width=14) (actual time=51.911..3761.878
rows=29687 loops=1)
   Index Cond: (centroid && $0)
   Filter: ((centroid && $0) AND ((type)::text = 'Z'::text))
   InitPlan
     ->  Seq Scan on geography  (cost=0.00..7092.00 rows=1 width=4503)
(actual time=22.314..42.087 rows=1 loops=1)
           Filter: (id = 69495::numeric)
 Total runtime: 3797.354 ms
(7 rows)

postgis=# explain analyze select id,name from geography where type='Z' and
centroid && (select mcatest(the_geom) from geography where id=69495);
                                                                    QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using geography_centroid_index on geography 
(cost=7092.00..7100.28 rows=1 width=14) (actual time=58.853..271.775
rows=29687 loops=1)
   Index Cond: (centroid && $0)
   Filter: ((centroid && $0) AND ((type)::text = 'Z'::text))
   InitPlan
     ->  Seq Scan on geography  (cost=0.00..7092.00 rows=1 width=4503)
(actual time=24.640..58.501 rows=1 loops=1)
           Filter: (id = 69495::numeric)
 Total runtime: 317.961 ms
(7 rows)

And here are the results with your attached patch applied (note: I did
receive some warnings about hunk offsets during application, but the patch
did appear to apply successfully):

postgis=# explain analyze select id,name from geography where type='Z' and
centroid && (select the_geom from geography where id=69495);
                                                                    QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using geography_centroid_index on geography 
(cost=7092.00..7100.28 rows=1 width=14) (actual time=51.433..1993.498
rows=29687 loops=1)
   Index Cond: (centroid && $0)
   Filter: ((centroid && $0) AND ((type)::text = 'Z'::text))
   InitPlan
     ->  Seq Scan on geography  (cost=0.00..7092.00 rows=1 width=4503)
(actual time=30.813..50.979 rows=1 loops=1)
           Filter: (id = 69495::numeric)
 Total runtime: 2031.378 ms
(7 rows)

postgis=# explain analyze select id,name from geography where type='Z' and
centroid && (select mcatest(the_geom) from geography where id=69495);
                                                                    QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using geography_centroid_index on geography 
(cost=7092.00..7100.28 rows=1 width=14) (actual time=57.051..266.321
rows=29687 loops=1)
   Index Cond: (centroid && $0)
   Filter: ((centroid && $0) AND ((type)::text = 'Z'::text))
   InitPlan
     ->  Seq Scan on geography  (cost=0.00..7092.00 rows=1 width=4503)
(actual time=23.738..56.669 rows=1 loops=1)
           Filter: (id = 69495::numeric)
 Total runtime: 318.374 ms
(7 rows)

So the good news is that the patch has shaved nearly 50% off the overall
query time. But in comparison to my simple wrapper function that de-TOASTs
the geometry, it is still 6 times slower for an identical query plan even
with your patch included.

This seems to indicate that something else must still be at play here: can
you think of any other ways in which these two queries are still
different? Given the simplicity of the patch, I can't see that much of
this would be influenced by the overhead of the patch itself.

Many thanks,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063