[postgis] Indexing....ack!

Dave Blasby dblasby at refractions.net
Wed Aug 15 09:23:26 PDT 2001


Paul Ramsey wrote:
> 
> And how did you *do* that, grasshopper :)
> (And are we sure we want to *always* use the index?)

I changed the definition of the '&&' (BBox overlap) operator;

CREATE OPERATOR && (
   LEFTARG = GEOMETRY, RIGHTARG = GEOMETRY, PROCEDURE =
geometry_overlap,
   COMMUTATOR = '&&',
   RESTRICT = postgis_gist_sel, JOIN = positionjoinsel
);

The change is in the RESTRICT function.  Its supposed to provide an
estimate of how many rows will be returned.  The old postgresql geometry
types used a function that estimated way too many rows. The planner
thought it would be more efficient to, therefore, not use the index.

The postgis_gist_sel() function estimates a much smaller percentage of
rows returned, so the planner is much more likely to use the index.

The default estimate was, I think, 0.5%.  The new one is, I think,
0.005%.  Its kinda of silly to use a constant number, but to actually
compute an actual estimate would probably take longer than to do the
actual query.

My bet is that an index scan will always outperform a sequence scan when
the resulting query returns less than 1/2 the rows in the table. 
For cases where more than 1/2 the rows are returned, the index may
actually slow things down because postgresql will have to read both the
index from the disk as well as the actual geometries.

But, if any of the geometries are large they could be compressed or
TOASTed.  If thats the case, the index will greatly improve speed
because it can pull several BBoxes with one disk page access while the
sequence scan will have to pull the entire geometry (several disk ops
and CPU time) to check its bounding box.

A slightly more intellegent RESTRICT function would look at the total
extent of all the geometries compared to the size of the query box. 
Unfortunately, i dont know how to tag more information into the actual
GiST index tree. It would take either a read of the entire geometry
column or the entire index to compute the total extent.

dave

To unsubscribe from this group, send an email to:
postgis-unsubscribe at yahoogroups.com

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 





More information about the postgis-users mailing list