[postgis-users] Strange result about spatial indexes?

Fri Aug 14 06:51:08 PDT 2009

Hi Regina,
yes I understand and agree with your point (of course st_expand does not
use spatial indexes) but if you see the first query (in the thread) I
got surprised because both queries are using the same spatial index.

Below, there is another example, this one using two tables instead just
one (both cases are using just spatial index on ttmm table)

I think the problem could be related with Ramsey explanation.
Best,
Jose Carlos

test3=# explain analyze select count(*) from suelos p1, ttmm p2 where
(st_expand(p1.geom,0) && p2.geom);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=11703.74..11703.75 rows=1 width=0) (actual
time=1732.254..1732.255 rows=1 loops=1)
-> Nested Loop (cost=0.00..11703.11 rows=250 width=0) (actual
time=0.102..1627.714 rows=73047 loops=1)
-> Seq Scan on suelos p1 (cost=0.00..2491.87 rows=25487 width=6400)
(actual time=0.011..55.494 rows=25487 loops=1)
-> Index Scan using idx_ttmm_geom on ttmm p2 (cost=0.00..0.35 rows=1
width=19744) (actual time=0.031..0.045 rows=3 loops=25487)
Index Cond: (st_expand(p1.geom, 0::double precision) && p2.geom)
Total runtime: 1732.330 ms

test3=# explain analyze select count(*) from suelos p1, ttmm p2 where
(p1.geom && p2.geom);

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=11712.91..11712.92 rows=1 width=0) (actual
time=27317.423..27317.425 rows=1 loops=1)
-> Nested Loop (cost=0.00..11575.68 rows=54890 width=0) (actual
time=0.136..27212.283 rows=73047 loops=1)
-> Seq Scan on suelos p1 (cost=0.00..2491.87 rows=25487 width=6400)
(actual time=0.074..60.629 rows=25487 loops=1)
-> Index Scan using idx_ttmm_geom on ttmm p2 (cost=0.00..0.34 rows=1
width=19744) (actual time=0.655..1.054 rows=3 loops=25487)
Index Cond: (p1.geom && p2.geom)
Total runtime: 27317.501 ms
(6 rows)

Paragon Corporation wrote:
> Kevin et al,
>
> Interesting.  What is there to discover? That you can force choice of index?
>
>  It seems to me with the expand case, you are basically forcing the planner
> to use the spatial index on A because 
>
> ST_Expand(b.the_geom,0) doesn't match an index.
>
> In the && case the planner has a choice of using one index or the other and
> possibly both and evidentally in this case uses the wrong index.
>
> That's why we have that double ST_Expand call in ST_Dwithin for example.
>
> So seems to be an issue of why the planner guessed wrong in the first place.
>
> So this seems to be a mechanism of forcing hints on the planner.
>
> Thanks,
> Regina
>
>  
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Kevin
> Neufeld
> Sent: Friday, August 14, 2009 1:23 AM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] Strange result about spatial indexes?
>
> I find it amazing that this wasn't discovered and exploited long ago.  I get
> the same results testing two of my own linear tables:
>
> postgis=# EXPLAIN ANALYZE select * from foo a, foo2 b where a.the_geom &&
> b.the_geom;
>                                                          QUERY 
> PLAN                                                        
> ----------------------------------------------------------------------------
> ------------------------------------------------
>  Nested Loop  (cost=0.00..1684.19 rows=27910 width=703) (actual
> time=2.046..795.320 rows=870 loops=1)
>    ->  Seq Scan on foo a  (cost=0.00..1211.13 rows=13113 width=634) (actual
> time=0.015..45.444 rows=13113 loops=1)
>    ->  Index Scan using foo2_idx on foo2 b  (cost=0.00..0.02 rows=1
> width=69) (actual time=0.047..0.050 rows=0 loops=13113)
>          Index Cond: (a.the_geom && b.the_geom)
>          Filter: (a.the_geom && b.the_geom)  Total runtime: 798.109 ms
> (6 rows)
>
> postgis=# EXPLAIN ANALYZE select * from foo a, foo2 b where a.the_geom &&
> expand(b.the_geom, 0);
> NOTICE:  LWGEOM_gist_joinsel called with arguments that are not column
> references
>                                                        QUERY 
> PLAN                                                       
> ----------------------------------------------------------------------------
> ---------------------------------------------
>  Nested Loop  (cost=0.00..2894.97 rows=59 width=703) (actual
> time=0.106..65.595 rows=870 loops=1)
>    ->  Seq Scan on foo2 b  (cost=0.00..20.93 rows=893 width=69) (actual
> time=0.018..3.120 rows=893 loops=1)
>    ->  Index Scan using foo_idx on foo a  (cost=0.00..3.20 rows=1
> width=634) (actual time=0.038..0.056 rows=1 loops=893)
>          Index Cond: (a.the_geom && expand(b.the_geom, 0::double precision))
>          Filter: (a.the_geom && expand(b.the_geom, 0::double precision))
> Total runtime: 68.424 ms
>
>
> I've verified that the results are the same in both queries.
>
> Of course, in doing this, you need to be careful you don't perform a
> sequential scan through the larger table as the planner chose to do in the
> first query (this quick testing database is clearly in need of tuning).
>
> postgis=# EXPLAIN ANALYZE select * from foo a, foo2 b where b.the_geom &&
> expand(a.the_geom, 0);
> NOTICE:  LWGEOM_gist_joinsel called with arguments that are not column
> references
>                                                          QUERY 
> PLAN                                                        
> ----------------------------------------------------------------------------
> ------------------------------------------------
>  Nested Loop  (cost=0.00..1756.76 rows=59 width=718) (actual
> time=0.195..467.116 rows=870 loops=1)
>    ->  Seq Scan on foo a  (cost=0.00..1218.13 rows=13113 width=645) (actual
> time=0.023..45.412 rows=13113 loops=1)
>    ->  Index Scan using foo2_idx on foo2 b  (cost=0.00..0.03 rows=1
> width=73) (actual time=0.023..0.024 rows=0 loops=13113)
>          Index Cond: (b.the_geom && expand(a.the_geom, 0::double precision))
>          Filter: (b.the_geom && expand(a.the_geom, 0::double precision))
> Total runtime: 469.915 ms
> (6 rows)
>
>
> -- Kevin
>
> Jose Carlos wrote:
>   
>> Hi,
>> I made the table smaller for sharing. Now the table has just 1600 rows.
>> The effect still remains.  I tried with others tables and other 
>> PostGIS versions, I tried even just with tables with POINT features 
>> getting similar results.
>>
>> Thanx,
>> Best
>>
>> You can download the table from here:
>> http://cid-40b8dee7bf5b661f.skydrive.live.com/self.aspx/.Public/data.t
>> gz
>>
>> test3=# explain analyze select count(*) from suelos1 s1, suelos1 s2 
>> where (s1.geom && s2.geom);
>>                                                                 QUERY
>> PLAN                                  
>> ----------------------------------------------------------------------
>> ---------------------------------------------------------------------
>>  Aggregate  (cost=1095.91..1095.92 rows=1 width=0) (actual
>> time=2840.187..2840.189 rows=1 loops=1)
>>    ->  Nested Loop  (cost=0.00..1079.87 rows=6416 width=0) (actual
>> time=0.111..2820.408 rows=13808 loops=1)
>>          ->  Seq Scan on suelos1 s1  (cost=0.00..130.04 rows=1604
>> width=4597) (actual time=0.009..2.580 rows=1604 loops=1)
>>          ->  Index Scan using suelos1_geom on suelos1 s2
>> (cost=0.00..0.58 rows=1 width=4597) (actual time=0.778..1.729 rows=9
>> loops=1604)
>>                Index Cond: (s1.geom && s2.geom)  Total runtime: 
>> 2840.270 ms
>> (6 rows)
>>
>> test3=# explain analyze select count(*) from suelos1 s1, suelos1 s2 
>> where (st_expand(s1.geom,00) && s2.geom);
>>                                                                 QUERY
>> PLAN                                  
>> ----------------------------------------------------------------------
>> ---------------------------------------------------------------------
>>  Aggregate  (cost=1087.92..1087.93 rows=1 width=0) (actual 
>> time=212.789..212.790 rows=1 loops=1)
>>    ->  Nested Loop  (cost=0.00..1087.89 rows=13 width=0) (actual
>> time=0.090..189.842 rows=13808 loops=1)
>>          ->  Seq Scan on suelos1 s1  (cost=0.00..130.04 rows=1604
>> width=4597) (actual time=0.012..2.632 rows=1604 loops=1)
>>          ->  Index Scan using suelos1_geom on suelos1 s2
>> (cost=0.00..0.58 rows=1 width=4597) (actual time=0.043..0.084 rows=9
>> loops=1604)
>>                Index Cond: (st_expand(s1.geom, 0::double precision) &&
>> s2.geom)
>>  Total runtime: 212.864 ms
>>
>>
>> (6 rows)
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>   
>>     
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>