[postgis-users] Odd explain analyze estimates?

Ron Mayer rm_postgis at cheapcomplexdevices.com
Fri Mar 4 16:06:19 PST 2005


Short summary:

   Below is a simple standalone test case showing the problem.

strk at refractions.net wrote:
> Mark, read carefully the explain analyze outputs.
> It seems normal to me that using subqueries doesn't allow the
> optimizer to take good decisions, while the query with no subquery
> has *excellent* estimates, am I reading something wrong ?
> 
> I don't think disabling the JOINSEL will do any better...
> Anyway we should wait for next report.

I have a standalone test (the 9 sql statements
shown below) that demonstrates the problem.  This test
case should be easy for calculating estimates because
all the POINTs are in the exact same location.


This query:

   select * from t1 as a natural join t2 as b where g&&'BOX3D(0 0,100 100)'::box3d

generates perfect estimates as expected (10000 rows).  This almost identical query:

  select * from (
   select * from t1 as a natural join t2 as b where g&&'BOX3D(0 0,100 100)'::box3d
  ) as t3;

generates about as poor estimates as possible (expecting 1 row instead of 10000).
This results in the poor plan that takes an order of magnitude longer.

I believe the two SQL statements should be identical in all respects; and that
the inner query in the second select statement exactly matches the first select
statement.


======================================================================================
======================================================================================
======================================================================================

fli=# create table t1 (id int);
fli=# create table t2 (id int);
fli=# select addgeometrycolumn('','t1','g',-1,'MULTIPOINT',2);
fli=# insert into t1 select 1,'MULTIPOINT((1 1))' from generate_series(1,10000);
fli=# insert into t2 select generate_series(1,10);
fli=# analyze t1;
fli=# analyze t2;

fli=# explain analyze select * from t1 natural join t2 where g&&'BOX3D(0 0,100 100)'::box3d;
                                                                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=1.12..380.12 rows=10000 width=46) (actual time=0.277..164.177 rows=10000 loops=1)
    Hash Cond: ("outer".id = "inner".id)
    ->  Seq Scan on t1  (cost=0.00..229.00 rows=10000 width=46) (actual time=0.094..56.444 rows=10000 loops=1)
          Filter: (g && '010300000001000000050000000000000000000000000000000000000000000000000000000000000000005940000000000000594000000000000059400000000000005940000000000000000000000000000000000000000000000000'::geometry)
    ->  Hash  (cost=1.10..1.10 rows=10 width=4) (actual time=0.131..0.131 rows=0 loops=1)
          ->  Seq Scan on t2  (cost=0.00..1.10 rows=10 width=4) (actual time=0.018..0.068 rows=10 loops=1)
  Total runtime: 211.808 ms
(7 rows)

fli=# explain analyze select * from (select * from t1 natural join t2 where g&&'BOX3D(0 0,100 100)'::box3d) as a;
                                                                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=0.00..230.22 rows=1 width=46) (actual time=0.070..1324.751 rows=10000 loops=1)
    Join Filter: ("outer".id = "inner".id)
    ->  Seq Scan on t1  (cost=0.00..229.00 rows=1 width=46) (actual time=0.031..57.474 rows=10000 loops=1)
          Filter: (g && '010300000001000000050000000000000000000000000000000000000000000000000000000000000000005940000000000000594000000000000059400000000000005940000000000000000000000000000000000000000000000000'::geometry)
    ->  Seq Scan on t2  (cost=0.00..1.10 rows=10 width=4) (actual time=0.006..0.057 rows=10 loops=10000)
  Total runtime: 1371.737 ms
(6 rows)

======================================================================================
======================================================================================
======================================================================================


Sorry I'm not at a computer where I can recompile
easily, so I didn't get a chance to test the
"#define DEBUG_GEOMETRY_STATS 1" suggestion yet.

    Ron



More information about the postgis-users mailing list