[postgis-users] Optimizer guess makes query 1000x slower?

strk strk at keybit.net
Tue Jun 15 03:21:57 PDT 2004


What postgresql version are you using ?
Postgis built against PostgreSQL <= 74 require you to run
update_geometry_stats() to build statistics used by the
selectivity estimator.

--strk;

On Tue, Jun 15, 2004 at 03:09:23AM -0700, rm_postgis at cheapcomplexdevices.com wrote:
> 
> Short summary:
>   I have a query that is trying to find points that are
>     (inside Sacramento) and 
>     (contained in a list of interesting points).
> 
>   Poor optimizer guess makes query take 202079 ms instead of 
>   281 ms if I disable nested loops.  It seems to think the 
>   geom index is much more selective than it turns out to be 
>   in practice.
> 
>   I'm using mapserver to run the query so I don't have total flexability
>   in the SQL (i.e. I don't know how to "enable_nestloop=false" in mapserver).
>   Any suggestions appreciated!
> 
> Longer:
> 
>   I have a query that is trying to find points that are
>     (inside Sacramento) and 
>     (contained in a list of interesting points).
> 
>   I have 2 tables.  One has all the points, along with their geometries.
>   The other table has simply "pointID | user" pairs that contains
>   lists of points that specific users find interesting.
> 
>   The query I'm using looks like this:
> 
>  SELECT *
>    FROM cpal_events,            -- points with gemoetry
>         cpal_selected_events    -- points of interest
>   WHERE cpal_events.nextval=cpal_selected_events.eventid  -- join
>     and cpal_selected_events.owner='user.23469878.20252'  -- for a user
>     and the_geom &&                                       -- in the area
>         setSRID( 'BOX3D(-121.5868 38.4277,-121.3361 38.6954)'::BOX3D,find_srid('','cpal_events','the_geom'));
> 
>   If I read the "explain analyze" result below, the optimizer
>   estimates that only 1 row is returned in the geometry, so it
>   does a "Nested Loop" to join with the list of "interesting" events.
> 
>   Unfortunatelly the geometry index actually returns 9570 results, so
>   the query takes hundreds of times longer than it should have.  By
>   disabling nested loops, the query takes 281 ms instead of 202079 ms!
>   (explain analyze output below).   I do "vacuum analyze" regularly.
> 
> Is there: 
>  a) any way I can hint that the GIST index isn't as selective as it thinks it is?
>  b) any way I can tell mapserver to avoid Nested Loops
>  c) something even more obvious I'm missing?
>  d) any more info I could give that would be interesting to the list?
> 
>    Thanks,
>    Ron
> 
> ================================================================================
> 
> 
> ronmap=# set enable_nestloop = true;
> SET
> ronmap=# explain analyze 
>   SELECT *
>     FROM cpal_events,cpal_selected_events
>    WHERE cpal_events.nextval=cpal_selected_events.eventid
>      and cpal_selected_events.owner='user.23469878.20252'
>      and the_geom && setSRID( 'BOX3D(-121.586865184167 38.4277618656666,-121.336132412548 38.6954249341005)'::BOX3D,find_srid('','cpal_events','the_geom'));
> 
>                                                                     QUERY PLAN                                                                     
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..218.01 rows=1 width=546) (actual time=2555.501..202056.918 rows=6357 loops=1)
>    Join Filter: ("outer".nextval = "inner".eventid)
>    ->  Index Scan using cpal_event_summary__gist on cpal_events  (cost=0.00..6.01 rows=1 width=503) (actual time=0.100..261.886 rows=9570 loops=1)
>          Index Cond: (the_geom && 'SRID=-1;BOX3D(-121.586865184167 38.4277618656666 0,-121.336132412548 38.6954249341005 0)'::geometry)
>    ->  Seq Scan on cpal_selected_events  (cost=0.00..132.50 rows=6360 width=43) (actual time=0.005..12.020 rows=6357 loops=9570)
>          Filter: ("owner" = 'user.23469878.20252'::text)
>  Total runtime: 202079.387 ms
> (7 rows)
> 
> ronmap=# set enable_nestloop = false;
> SET
> ronmap=# explain analyze
>   SELECT *
>     FROM cpal_events,cpal_selected_events
>    WHERE cpal_events.nextval=cpal_selected_events.eventid
>      and cpal_selected_events.owner='user.23469878.20252'
>      and the_geom && setSRID( 'BOX3D(-121.586865184167 38.4277618656666,-121.336132412548 38.6954249341005)'::BOX3D,find_srid('','cpal_events','the_geom'));
>                                                                        QUERY PLAN                                                                       
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>  Merge Join  (cost=540.31..572.13 rows=1 width=546) (actual time=161.258..266.926 rows=6357 loops=1)
>    Merge Cond: ("outer".nextval = "inner".eventid)
>    ->  Sort  (cost=6.02..6.03 rows=1 width=503) (actual time=129.153..137.399 rows=9570 loops=1)
>          Sort Key: cpal_events.nextval
>          ->  Index Scan using cpal_event_summary__gist on cpal_events  (cost=0.00..6.01 rows=1 width=503) (actual time=0.103..78.088 rows=9570 loops=1)
>                Index Cond: (the_geom && 'SRID=-1;BOX3D(-121.586865184167 38.4277618656666 0,-121.336132412548 38.6954249341005 0)'::geometry)
>    ->  Sort  (cost=534.29..550.19 rows=6360 width=43) (actual time=32.047..37.490 rows=6357 loops=1)
>          Sort Key: cpal_selected_events.eventid
>          ->  Seq Scan on cpal_selected_events  (cost=0.00..132.50 rows=6360 width=43) (actual time=0.074..15.649 rows=6357 loops=1)
>                Filter: ("owner" = 'user.23469878.20252'::text)
>  Total runtime: 281.134 ms
> (11 rows)
> 
> ronmap=# 
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users



More information about the postgis-users mailing list