[postgis-users] GIST Index performance

ken southerland at samsixedd.com
Mon Nov 29 18:15:37 PST 2004


Hi all,

I've got a simple problem with indexing that I can't seem to overcome.

I've got a table that looks like the following (Note: I am not
displaying all of the columns, just the ones involved in this query for
simplification)...

nrgdb=# \d nrgfeature
                Table "public.nrgfeature"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 fid1           | bigint                      | not null
 fid2           | bigint                      | not null
 featuretypeid  | smallint                    | not null
 diagonalsize   | double precision            |
 boundinggeom   | geometry                    |
 layerid        | integer                     |
Indexes:
    "nrgfeature_fid_index" unique, btree (fid2, fid1)
    "nrgfeature_boundinggeom_idx" gist (boundinggeom)
    "nrgfeature_diag_idx" btree (diagonalsize)
    "nrgfeature_ft_index" btree (featuretypeid)
    "nrgfeature_layer" btree (layerid)


... which uses a gist index on the boundinggeom column.  The table has
almost 1.5 million rows (1453286 to be exact).

If I run a query on the boundinggeom that intersects all of the rows in
the table, postgres still chooses the gist index (even though all rows
match!) and not another index instead and the query is really, really
slow.  If it were instead to use another index the query would be fine. 
Here is the query I am doing ...

nrgdb=# explain
nrgdb-# analyze
nrgdb-# SELECT *
nrgdb-# FROM nrgfeature
nrgdb-# WHERE boundinggeom && GeometryFromText('BOX3D(759869.7916666667
646875.0, 200130.2083333333 303125.0)', -1)
nrgdb-# AND diagonalSize > 2291.666666666667
nrgdb-# ;
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using nrgfeature_boundinggeom_idx on nrgfeature 
(cost=0.00..31.18 rows=1 width=596) (actual time=1329.945..562438.980
rows=4422 loops=1)
   Index Cond: (boundinggeom && 'SRID=-1;BOX3D(200130.208333333 303125
0,759869.791666667 646875 0)'::geometry)
   Filter: ((diagonalsize > 2291.66666666667::double precision) AND
(boundinggeom && 'SRID=-1;BOX3D(200130.208333333 303125
0,759869.791666667 646875 0)'::geometry))
 Total runtime: 562476.459 ms
(4 rows)

... As you can see it thinks only one row would be returned and yet 4422
rows are returned.  All of these rows are satisfied by the diagonalsize
condition alone and if we take out the boundinggeom condition we get
exactly what we want in this case and we get it immediately.  Of course,
we can't take out the condition in the general case.

I have set the statistics to 100 for both the columns diagonalsize and
boundinggeom (does that even work for a gist index?), revacuumed full
analyze the table, and rerun SELECT UPDATE_GEOMETRY_STATS().  This has
not helped.

Thanks for any assistance you can provide.

Ken
-- 
------s----a----m----s----i----x----e----d----d------
--

Ken Southerland
Senior Consultant
Sam Six EDD
http://www.samsixedd.com





More information about the postgis-users mailing list