[postgis-users] Index Use Craziness
Burgholzer,Robert
rwburgholzer at deq.virginia.gov
Thu Mar 19 08:45:24 PDT 2009
Mose,
You are quite correct. If I use the WKT format of the query, the
planner selects my date field, otherwise, it prefers the GIST index.
explain select count(*) from precip_gridded where thisdate =
'2009-01-12' and the_geom && setsrid(geomfromtext(
'POLYGON((-83.6753387451172 36.5427703857422,-83.6753387451172
39.4660148620605, -75.2407989501953 39.4660148620605,-75.2407989501953
36.5427703857422,-83.6753387451172 36.5427703857422))'), 4326);
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----------------
Aggregate (cost=584.06..584.07 rows=1 width=0)
-> Index Scan using pg_tdix on precip_gridded (cost=0.00..565.66
rows=7356
width=0)
Index Cond: (thisdate = '2009-01-12 00:00:00'::timestamp
without time zone)
Filter: (the_geom &&
'0103000020E61000000100000005000000010000C038EB54C
00200008079454240010000C038EB54C0F9FFFF5FA6BB4340FFFFFF3F69CF52C0F9FFFF5
FA6BB434
0FFFFFF3F69CF52C00200008079454240010000C038EB54C00200008079454240'::geom
etry)
(4 rows)
Robert W. Burgholzer
Surface Water Modeler
Office of Water Supply and Planning
Virginia Department of Environmental Quality
rwburgholzer at deq.virginia.gov
804-698-4405
Open Source Modeling Tools:
http://sourceforge.net/projects/npsource/
_____
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Mose
Sent: Wednesday, March 18, 2009 3:32 PM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Index Use Craziness
Robert,
I have been haunted by similar issues but I haven't had time to figure
out some explanation I really believe in.
Does this post with a reproducible case seem similar? It seems like it
to me.
http://postgis.refractions.net/pipermail/postgis-devel/2008-September/00
3547.html
For now when these situations aren't working (the planner favors the
spatial index due to a weird estimate when it is really going to be
lots) I am using a constant geometry [e.g. geometryfromwkt(wkt geometry
string)] for the join condition. I realize the feasibility of this
depends on your application, or your desire to write a user-defined
function.
Why does the planner get different estimates for the same geometry being
used in a constraint based on it coming from a table, or from a string
in the query? I think the reason this is happening might be apparent in
lwgeom_estimate.c, but I'm open to being totally wrong :).
-Mose
On Wed, Mar 18, 2009 at 11:32 AM, Paragon Corporation <lr at pcorp.us>
wrote:
Robert,
Which version of PostgreSQL are you running? I wander if you are being
bitten by one of these bugs
--the description doesn't quite match any of these 2 - but it could be
another side effect of these
http://www.postgresql.org/docs/8.3/static/release-8-3-7.html
Fix planner problem with sub-SELECT in the output list of a larger
subquery
(Tom)
Or
http://www.postgresql.org/docs/8.3/static/release-8-3-6.html
Fix planner misestimation of selectivity when transitive equality is
applied
to an outer-join clause (Tom)
This could result in bad plans for queries like ... from a left join b
on
a.a1 = b.b1 where a.a1 = 42 ...
Hope that helps,
Regina
-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
Burgholzer,Robert
Sent: Wednesday, March 18, 2009 9:05 AM
To: PostGIS Users Discussion
Subject: [postgis-users] Index Use Craziness
I am having much trouble getting postgres to use the proper index.
I have a table with about 20 million daily rainfall observation points,
with
between 0 and 19,000 points on any given day. In order to speed up
querying, I have created a GIST index on the geometry column, and an
index
on the date column.
What I want to do is to summarize the observed rainfall in another shape
table on any given day. If I just want to query points by date, this
uses
the date index, and performs very quickly (see "CASE 1" below).
When I try to introduce a geometry operator, the GIST index takes over,
and
the index on the "thisdate" column is ignored. The weird thing is, I
can
hear it churning the heck out of my hard drive checking on all the
points.
(See "CASE 2" below).
I even tried to sub-query the points for the given date, but the index
did
not seem to be fooled ("CASE 3"). It gave me the exact same query
plan!! I
am unsure if I really understand the way a sub-query functions given
this
result.
Any help would be appreciated!
Robert
CASE 1:
wsp=# explain select count(*) from precip_gridded where thisdate =
'2009-01-12';
QUERY PLAN
------------------------------------------------------------------------
----
Aggregate (cost=479.43..479.44 rows=1 width=0)
-> Index Scan using pg_tdix on precip_gridded (cost=0.00..451.92
rows=11004
width=0)
Index Cond: (thisdate = '2009-01-12 00:00:00'::timestamp without
time zone)
(3 rows)
CASE 2:
explain select count(*) from precip_gridded where thisdate =
'2009-01-12'
and the_geom && setsrid((select extent(the_geom) from proj_seggroups
where
gid = 448),4326);
QUERY PLAN
------------------------------------------------------------------------
----
Aggregate (cost=100.23..100.24 rows=1 width=0)
InitPlan
-> Aggregate (cost=5.87..5.88 rows=1 width=2807)
-> Index Scan using psg_giix on proj_seggroups
(cost=0.00..5.87 rows=1 width=2807)
Index Cond: (gid = 448)
-> Index Scan using pg_gix on precip_gridded (cost=0.00..94.35
rows=1 width=0)
Index Cond: (the_geom && setsrid(($0)::geometry, 4326))
Filter: ((thisdate = '2009-01-12 00:00:00'::timestamp without
time
zone) AND (the_geom && setsrid(($0)::geometry, 4326)))
(8 rows)
CASE 3: Sub-query by date to try and force the use of date index wsp=#
explain select count(a.*) from (select * from precip_gridded where
thisdate
= '2009-01-12') as a where a.the_geom && setsrid((select
extent(the_geom) from proj_seggroups where gid = 448),4326);
QUERY PLAN
------------------------------------------------------------------------
----
Aggregate (cost=100.23..100.24 rows=1 width=81)
InitPlan
-> Aggregate (cost=5.87..5.88 rows=1 width=2807)
-> Index Scan using psg_giix on proj_seggroups
(cost=0.00..5.87 rows=1 width=2807)
Index Cond: (gid = 448)
-> Index Scan using pg_gix on precip_gridded (cost=0.00..94.35
rows=1 width=81)
Index Cond: (the_geom && setsrid(($0)::geometry, 4326))
Filter: ((thisdate = '2009-01-12 00:00:00'::timestamp without
time
zone) AND (the_geom && setsrid(($0)::geometry, 4326)))
(8 rows)
Robert W. Burgholzer
Surface Water Modeler
Office of Water Supply and Planning
Virginia Department of Environmental Quality
rwburgholzer at deq.virginia.gov
804-698-4405
Open Source Modeling Tools:
http://sourceforge.net/projects/npsource/
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20090319/c4fbdd11/attachment.html>
More information about the postgis-users
mailing list