[postgis-users] GIST index statistics

ken southerland at samsixedd.com
Wed Jul 28 11:37:09 PDT 2004


I just recently starting looking at postgis due to a failure in the
rtree indices of postgres to handle my situation.  Unfortunately, and I
guess not surprisingly since GIST indices are built on postgres's rtree
index, I experience the same problem with postgis.

I have the following table ...


test=# \d gistest
           Table "public.gistest"
    Column    |       Type       | Modifiers
--------------+------------------+-----------
 fid1         | numeric(64,0)    | not null
 fid2         | numeric(64,0)    | not null
 diagonalsize | double precision |
 geometry     | geometry         |
Indexes:
    "gistest_diagonalsize_idx" btree (diagonalsize)
    "gistest_geometry_idx" gist (geometry)
Check constraints:
    "$1" CHECK (srid(geometry) = -1)
    "$2" CHECK (geometrytype(geometry) = 'POLYGON'::text OR geometry IS
NULL)


... and it has 2898640 rows in it.  If I try the following query ...


explain analyze SELECT *
FROM gistest
WHERE geometry && GeometryFromText('BOX3D(825160.2564102565
685833.3333333333, 134839.74358974356 264166.6666666666)', -1 )
AND diagonalSize > 7638.888888888889;


... I get the following query plan ...

                                                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using gistest_geometry_idx on gistest  (cost=0.00..60.20
rows=1 width=278)
   Index Cond: (geometry && 'SRID=-1;BOX3D(134839.743589744
264166.666666667 0,825160.256410256 685833.333333333 0)'::geometry)
   Filter: ((diagonalsize > 7638.88888888889::double precision) AND
(geometry && 'SRID=-1;BOX3D(134839.743589744 264166.666666667
0,825160.256410256 685833.333333333 0)'::geometry))
(3 rows)


... despite the fact that condition on the geometry alone returns
2896058 rows or almost the entire table!  The statistics, despite a
vacuum analyze on the table, suggest that only 1 row will be returned!

The diagonalsize condition alone returns only 2490 rows and therefore
the query plan should use the index on this diagonalsize with a filter
based on the overlaps condition.  Yet due to the completely incorrect
statistics on the geometry column it doesn't.

Any ideas?

Ken






More information about the postgis-users mailing list