[postgis-users] Strange result about spatial indexes?

Fri Aug 14 00:35:37 PDT 2009

Kevin et al,

Interesting.  What is there to discover? That you can force choice of index?

 It seems to me with the expand case, you are basically forcing the planner
to use the spatial index on A because 

ST_Expand(b.the_geom,0) doesn't match an index.

In the && case the planner has a choice of using one index or the other and
possibly both and evidentally in this case uses the wrong index.

That's why we have that double ST_Expand call in ST_Dwithin for example.

So seems to be an issue of why the planner guessed wrong in the first place.

So this seems to be a mechanism of forcing hints on the planner.

Thanks,
Regina

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Kevin
Neufeld
Sent: Friday, August 14, 2009 1:23 AM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Strange result about spatial indexes?

I find it amazing that this wasn't discovered and exploited long ago.  I get
the same results testing two of my own linear tables:

postgis=# EXPLAIN ANALYZE select * from foo a, foo2 b where a.the_geom &&
b.the_geom;
                                                         QUERY 
PLAN                                                        
----------------------------------------------------------------------------
------------------------------------------------
 Nested Loop  (cost=0.00..1684.19 rows=27910 width=703) (actual
time=2.046..795.320 rows=870 loops=1)
   ->  Seq Scan on foo a  (cost=0.00..1211.13 rows=13113 width=634) (actual
time=0.015..45.444 rows=13113 loops=1)
   ->  Index Scan using foo2_idx on foo2 b  (cost=0.00..0.02 rows=1
width=69) (actual time=0.047..0.050 rows=0 loops=13113)
         Index Cond: (a.the_geom && b.the_geom)
         Filter: (a.the_geom && b.the_geom)  Total runtime: 798.109 ms
(6 rows)

postgis=# EXPLAIN ANALYZE select * from foo a, foo2 b where a.the_geom &&
expand(b.the_geom, 0);
NOTICE:  LWGEOM_gist_joinsel called with arguments that are not column
references
                                                       QUERY 
PLAN                                                       
----------------------------------------------------------------------------
---------------------------------------------
 Nested Loop  (cost=0.00..2894.97 rows=59 width=703) (actual
time=0.106..65.595 rows=870 loops=1)
   ->  Seq Scan on foo2 b  (cost=0.00..20.93 rows=893 width=69) (actual
time=0.018..3.120 rows=893 loops=1)
   ->  Index Scan using foo_idx on foo a  (cost=0.00..3.20 rows=1
width=634) (actual time=0.038..0.056 rows=1 loops=893)
         Index Cond: (a.the_geom && expand(b.the_geom, 0::double precision))
         Filter: (a.the_geom && expand(b.the_geom, 0::double precision))
Total runtime: 68.424 ms

I've verified that the results are the same in both queries.

Of course, in doing this, you need to be careful you don't perform a
sequential scan through the larger table as the planner chose to do in the
first query (this quick testing database is clearly in need of tuning).

postgis=# EXPLAIN ANALYZE select * from foo a, foo2 b where b.the_geom &&
expand(a.the_geom, 0);
NOTICE:  LWGEOM_gist_joinsel called with arguments that are not column
references
                                                         QUERY 
PLAN                                                        
----------------------------------------------------------------------------
------------------------------------------------
 Nested Loop  (cost=0.00..1756.76 rows=59 width=718) (actual
time=0.195..467.116 rows=870 loops=1)
   ->  Seq Scan on foo a  (cost=0.00..1218.13 rows=13113 width=645) (actual
time=0.023..45.412 rows=13113 loops=1)
   ->  Index Scan using foo2_idx on foo2 b  (cost=0.00..0.03 rows=1
width=73) (actual time=0.023..0.024 rows=0 loops=13113)
         Index Cond: (b.the_geom && expand(a.the_geom, 0::double precision))
         Filter: (b.the_geom && expand(a.the_geom, 0::double precision))
Total runtime: 469.915 ms
(6 rows)

-- Kevin

Jose Carlos wrote:
> Hi,
> I made the table smaller for sharing. Now the table has just 1600 rows.
> The effect still remains.  I tried with others tables and other 
> PostGIS versions, I tried even just with tables with POINT features 
> getting similar results.
>
> Thanx,
> Best
>
> You can download the table from here:
> http://cid-40b8dee7bf5b661f.skydrive.live.com/self.aspx/.Public/data.t
> gz
>
> test3=# explain analyze select count(*) from suelos1 s1, suelos1 s2 
> where (s1.geom && s2.geom);
>                                                                 QUERY
> PLAN                                  
> ----------------------------------------------------------------------
> ---------------------------------------------------------------------
>  Aggregate  (cost=1095.91..1095.92 rows=1 width=0) (actual
> time=2840.187..2840.189 rows=1 loops=1)
>    ->  Nested Loop  (cost=0.00..1079.87 rows=6416 width=0) (actual
> time=0.111..2820.408 rows=13808 loops=1)
>          ->  Seq Scan on suelos1 s1  (cost=0.00..130.04 rows=1604
> width=4597) (actual time=0.009..2.580 rows=1604 loops=1)
>          ->  Index Scan using suelos1_geom on suelos1 s2
> (cost=0.00..0.58 rows=1 width=4597) (actual time=0.778..1.729 rows=9
> loops=1604)
>                Index Cond: (s1.geom && s2.geom)  Total runtime: 
> 2840.270 ms
> (6 rows)
>
> test3=# explain analyze select count(*) from suelos1 s1, suelos1 s2 
> where (st_expand(s1.geom,00) && s2.geom);
>                                                                 QUERY
> PLAN                                  
> ----------------------------------------------------------------------
> ---------------------------------------------------------------------
>  Aggregate  (cost=1087.92..1087.93 rows=1 width=0) (actual 
> time=212.789..212.790 rows=1 loops=1)
>    ->  Nested Loop  (cost=0.00..1087.89 rows=13 width=0) (actual
> time=0.090..189.842 rows=13808 loops=1)
>          ->  Seq Scan on suelos1 s1  (cost=0.00..130.04 rows=1604
> width=4597) (actual time=0.012..2.632 rows=1604 loops=1)
>          ->  Index Scan using suelos1_geom on suelos1 s2
> (cost=0.00..0.58 rows=1 width=4597) (actual time=0.043..0.084 rows=9
> loops=1604)
>                Index Cond: (st_expand(s1.geom, 0::double precision) &&
> s2.geom)
>  Total runtime: 212.864 ms
>
>
> (6 rows)
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>   
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users