[postgis-users] Index Use Craziness

Burgholzer,Robert rwburgholzer at deq.virginia.gov
Thu Mar 19 08:45:24 PDT 2009


Mose,

You are quite correct.  If I use the WKT format of the query, the
planner selects my date field, otherwise, it prefers the GIST index.

 

explain select count(*) from precip_gridded where thisdate =
'2009-01-12' and the_geom && setsrid(geomfromtext(
'POLYGON((-83.6753387451172 36.5427703857422,-83.6753387451172
39.4660148620605, -75.2407989501953 39.4660148620605,-75.2407989501953
36.5427703857422,-83.6753387451172 36.5427703857422))'), 4326);

 

                                  QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
----------------

Aggregate  (cost=584.06..584.07 rows=1 width=0)

   ->  Index Scan using pg_tdix on precip_gridded  (cost=0.00..565.66
rows=7356

width=0)

         Index Cond: (thisdate = '2009-01-12 00:00:00'::timestamp
without time zone)

         Filter: (the_geom &&
'0103000020E61000000100000005000000010000C038EB54C

00200008079454240010000C038EB54C0F9FFFF5FA6BB4340FFFFFF3F69CF52C0F9FFFF5
FA6BB434

0FFFFFF3F69CF52C00200008079454240010000C038EB54C00200008079454240'::geom
etry)

(4 rows)

 

Robert W. Burgholzer

Surface Water Modeler

Office of Water Supply and Planning

Virginia Department of Environmental Quality

rwburgholzer at deq.virginia.gov

804-698-4405

Open Source Modeling Tools:

http://sourceforge.net/projects/npsource/

  _____  

From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Mose
Sent: Wednesday, March 18, 2009 3:32 PM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Index Use Craziness

 

Robert,

I have been haunted by similar issues but I haven't had time to figure
out some explanation I really believe in.

Does this post with a reproducible case seem similar?  It seems like it
to me.

http://postgis.refractions.net/pipermail/postgis-devel/2008-September/00
3547.html

For now when these situations aren't working (the planner favors the
spatial index due to a weird estimate when it is really going to be
lots) I am using a constant geometry [e.g. geometryfromwkt(wkt geometry
string)] for the join condition.  I realize the feasibility of this
depends on your application, or your desire to write a user-defined
function.

Why does the planner get different estimates for the same geometry being
used in a constraint based on it coming from a table, or from a string
in the query?  I think the reason this is happening might be apparent in
lwgeom_estimate.c, but I'm open to being totally wrong :).

-Mose






On Wed, Mar 18, 2009 at 11:32 AM, Paragon Corporation <lr at pcorp.us>
wrote:

Robert,

 Which version of PostgreSQL are you running?  I wander if you are being
bitten by one of these bugs

--the description doesn't quite match any of these 2 - but it could be
another side effect of these
http://www.postgresql.org/docs/8.3/static/release-8-3-7.html

Fix planner problem with sub-SELECT in the output list of a larger
subquery
(Tom)

Or
http://www.postgresql.org/docs/8.3/static/release-8-3-6.html
Fix planner misestimation of selectivity when transitive equality is
applied
to an outer-join clause (Tom)

This could result in bad plans for queries like ... from a left join b
on
a.a1 = b.b1 where a.a1 = 42 ...



Hope that helps,
Regina


-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of

Burgholzer,Robert
Sent: Wednesday, March 18, 2009 9:05 AM
To: PostGIS Users Discussion

Subject: [postgis-users] Index Use Craziness

I am having much trouble getting postgres to use the proper index.

I have a table with about 20 million daily rainfall observation points,
with
between 0 and 19,000 points on any given day.  In order to speed up
querying, I have created a GIST index on the geometry column, and an
index
on the date column.

What I want to do is to summarize the observed rainfall in another shape
table on any given day.  If I just want to query points by date, this
uses
the date index, and performs very quickly (see "CASE 1" below).

When I try to introduce a geometry operator, the GIST index takes over,
and
the index on the "thisdate" column is ignored.  The weird thing is, I
can
hear it churning the heck out of my hard drive checking on all the
points.
(See "CASE 2" below).

I even tried to sub-query the points for the given date, but the index
did
not seem to be fooled ("CASE 3").  It gave me the exact same query
plan!!  I
am unsure if I really understand the way a sub-query functions given
this
result.

Any help would be appreciated!

Robert

CASE 1:
wsp=# explain select count(*) from precip_gridded where thisdate =
'2009-01-12';

                             QUERY PLAN
------------------------------------------------------------------------
----
 Aggregate  (cost=479.43..479.44 rows=1 width=0)
  ->  Index Scan using pg_tdix on precip_gridded  (cost=0.00..451.92
rows=11004
 width=0)
        Index Cond: (thisdate = '2009-01-12 00:00:00'::timestamp without
time zone)
(3 rows)


CASE 2:
explain select count(*) from precip_gridded where thisdate =
'2009-01-12'
and the_geom && setsrid((select extent(the_geom) from proj_seggroups
where
gid = 448),4326);

                             QUERY PLAN
------------------------------------------------------------------------
----
 Aggregate  (cost=100.23..100.24 rows=1 width=0)
  InitPlan
    ->  Aggregate  (cost=5.87..5.88 rows=1 width=2807)
          ->  Index Scan using psg_giix on proj_seggroups
(cost=0.00..5.87 rows=1 width=2807)
                Index Cond: (gid = 448)
  ->  Index Scan using pg_gix on precip_gridded  (cost=0.00..94.35
rows=1 width=0)
        Index Cond: (the_geom && setsrid(($0)::geometry, 4326))
        Filter: ((thisdate = '2009-01-12 00:00:00'::timestamp without
time
zone) AND (the_geom && setsrid(($0)::geometry, 4326)))
(8 rows)


CASE 3: Sub-query by date to try and force the use of date index wsp=#
explain select count(a.*) from (select * from precip_gridded where
thisdate
= '2009-01-12') as a where a.the_geom && setsrid((select
extent(the_geom) from proj_seggroups where gid = 448),4326);

                             QUERY PLAN

------------------------------------------------------------------------
----
 Aggregate  (cost=100.23..100.24 rows=1 width=81)
  InitPlan
    ->  Aggregate  (cost=5.87..5.88 rows=1 width=2807)
          ->  Index Scan using psg_giix on proj_seggroups
(cost=0.00..5.87 rows=1 width=2807)
                Index Cond: (gid = 448)
  ->  Index Scan using pg_gix on precip_gridded  (cost=0.00..94.35
rows=1 width=81)
        Index Cond: (the_geom && setsrid(($0)::geometry, 4326))
        Filter: ((thisdate = '2009-01-12 00:00:00'::timestamp without
time
zone) AND (the_geom && setsrid(($0)::geometry, 4326)))
(8 rows)


Robert W. Burgholzer
Surface Water Modeler
Office of Water Supply and Planning
Virginia Department of Environmental Quality
rwburgholzer at deq.virginia.gov
804-698-4405
Open Source Modeling Tools:
http://sourceforge.net/projects/npsource/


_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users



_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20090319/c4fbdd11/attachment.html>


More information about the postgis-users mailing list