[postgis-users] more query optimization
Robert W. Burgholzer
rburgholzer at maptech-inc.com
Tue Jun 1 06:41:27 PDT 2004
I am trying to do a join using views, and am having some serious
performance issues as a result.
I have a postgis table, with soil map shapes and ID's. I have three other
tables that when linked together in a view give the names and
characteristics of the soils in the soil map, and the variation of the
characteristics by soil layer, indexed by a key they share with the soil
shape table.
All of these layers can be queried fairly quickly, although my soil map
table has around a half million entries in it, with about 4,000 unique ID's
that map to the other two data tables. When I just pull the soil layer
shapes into my arcview view (using Armin Burger's avpgconn) without joining
on the property data, the process takes about 10-20 seconds depending on
the width of the view. When I pull it with one soil layer (the top layer
only) properties view joined, I get about 10 minutes as the result, and
when I join both the top soils layer view and the second soil layer view my
query time is about 15-20 minutes.
I surmise from the query explain output that the join on the key properties
between these layers is occurring before the soil shape records are weeded
out to the spatial extent (which is a very low cost selection since they
are indexed), thus, all half million records are being joined first, then
the ones in the spatial extent are being grabbed. What I would like to do
is reverse this order. Does anyone have any idea how to do this?
Thanks in advance.
The results of EXPLAIN commands on the two views, and on the query that
joins them are as follows:
*************************************
***** the soil shape query *******
Data_Soils=# explain select * from ssurgo_shp where "MUKEY" = '118512';
QUERY
PLAN
------------------------------------------------------------------------------------
Index Scan using ssmukey_ix on ssurgo_shp (cost=0.00..1074.82 rows=280
width=227)
Index Cond: ("MUKEY" = '118512'::bpchar)
(2 rows)
*************************************
**** the soil property query *****
Data_Soils=# explain select * from ssurgoprops_l1 where mukey = '118512';
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------
Subquery Scan ssurgoprops_l1 (cost=0.00..172.51 rows=1 width=87)
-> Aggregate (cost=0.00..172.51 rows=1 width=87)
-> Group (cost=0.00..172.39 rows=3 width=87)
-> Nested Loop (cost=0.00..172.38 rows=3 width=87)
Join Filter: ("outer".hydgrp = "inner".hydgrp)
-> Nested Loop (cost=0.00..109.91 rows=2 width=63)
-> Index Scan using compmukey_ix on
component (cost=0.00..20.79 rows=5 width=34)
Index Cond: (mukey = '118512'::character
varying)
-> Index Scan using ch_cokey_ix on
chorizon (cost=0.00..18.18 rows=1 width=29)
Index Cond: (chorizon.cokey = "outer".cokey)
Filter: (desgnvert = 1)
-> Seq Scan on hspfinfilt (cost=0.00..20.00
rows=1000 width=24)
*************************************
**** the combination query ******
Data_Soils=# explain SELECT
musym,mukey,infilt,depth,clay_pct,sand_pct,ksat,thetasat,thetafc,hydgrp,uzsn,lzsn,the_geom
FROM soilpropsdd_l1 WHERE the_geom && GeometryFromText('BOX3D(-77.765306
36.944388,-77.408085 37.306579)'::box3d,-1);
QUERY PLAN
--------------------------------------------------------------------------------
-------------------------------------------------
Nested Loop (cost=4843.38..5106.68 rows=2 width=303)
Join Filter: ("outer".mukey = "inner".mukey)
-> Merge Join (cost=2544.81..2547.60 rows=1 width=158)
Merge Cond: ("outer".mukey = "inner".mukey)
-> Sort (cost=11.78..11.79 rows=2 width=77)
Sort Key: ssurgo_shp.mukey
-> Index Scan using soildd_ix on
ssurgo_shp (cost=0.00..11.77 r
ows=2 width=77)
Index Cond: (wkb_geometry &&
'SRID=-1;BOX3D(-77.765306 36.9
44388 0,-77.408085 37.306579 0)'::geometry)
-> Sort (cost=2533.03..2534.41 rows=553 width=87)
Sort Key: ssurgoprops_l2.mukey
-> Subquery Scan ssurgoprops_l2 (cost=2273.01..2507.86
rows=553
width=87)
-> Aggregate (cost=2273.01..2507.86 rows=553 width=87)
-> Group (cost=2273.01..2300.64 rows=5526
width=87)
-> Sort (cost=2273.01..2286.82
rows=5526 widt
h=87)
Sort Key: component.mukey
-> Merge
Join (cost=1840.18..1929.52 ro
ws=5526 width=87)
Merge Cond: ("outer".hydgrp =
"inne
r".hydgrp)
-> Sort
(cost=1770.35..1777.99 ro
ws=3053 width=63)
Sort Key: component.hydgrp
-> Hash
Join (cost=847.30..
1593.65 rows=3053 width=63)
Hash Cond:
("outer".cok
ey = "inner".cokey)
-> Seq Scan on
chorizo
n (cost=0.00..685.29 rows=3053 width=29)
Filter:
(desgnver
t = 2)
-> Hash
(cost=473.64.
.473.64 rows=8364 width=34)
-> Seq
Scan on c
omponent (cost=0.00..473.64 rows=8364 width=34)
-> Sort (cost=69.83..72.33
rows=1
000 width=24)
Sort Key: hspfinfilt.hydgrp
-> Seq Scan on
hspfinfilt (
cost=0.00..20.00 rows=1000 width=24)
-> Subquery Scan ssurgoprops_l1 (cost=2298.57..2551.64 rows=595 width=87)
-> Aggregate (cost=2298.57..2551.64 rows=595 width=87)
-> Group (cost=2298.57..2328.34 rows=5955 width=87)
-> Sort (cost=2298.57..2313.45 rows=5955 width=87)
Sort Key: component.mukey
-> Merge Join (cost=1829.33..1925.21
rows=5955 widt
h=87)
Merge Cond: ("outer".hydgrp = "inner".hydgrp)
-> Sort (cost=1759.50..1767.72
rows=3290 widt
h=63)
Sort Key: component.hydgrp
-> Hash
Join (cost=816.22..1567.31 rows
=3290 width=63)
Hash Cond: ("outer".cokey =
"inner"
.cokey)
-> Seq Scan on
chorizon (cost=0.0
0..685.29 rows=3290 width=29)
Filter: (desgnvert = 1)
-> Hash
(cost=473.64..473.64 rows
=8364 width=34)
-> Seq Scan on
component (c
ost=0.00..473.64 rows=8364 width=34)
-> Sort (cost=69.83..72.33 rows=1000
width=24
)
Sort Key: hspfinfilt.hydgrp
-> Seq Scan on
hspfinfilt (cost=0.00..2
0.00 rows=1000 width=24)
(46 rows)
Robert Burgholzer
Environmental Engineer
MapTech Inc.
http://www.maptech-inc.com/
More information about the postgis-users
mailing list