[postgis-users] more query optimization

Robert W. Burgholzer rburgholzer at maptech-inc.com
Tue Jun 1 06:41:27 PDT 2004


I am trying to do a join using views, and am having some serious 
performance issues as a result.

I have a postgis table, with soil map shapes and ID's. I have three other 
tables that when linked together in a view give the names and 
characteristics of the soils in the soil map, and the variation of the 
characteristics by soil layer, indexed by a key they share with the soil 
shape table.

All of these layers can be queried fairly quickly, although my soil map 
table has around a half million entries in it, with about 4,000 unique ID's 
that map to the other two data tables. When I just pull the soil layer 
shapes into my arcview view (using Armin Burger's avpgconn) without joining 
on the property data, the process takes about 10-20 seconds depending on 
the width of the view. When I pull it with one soil layer (the top layer 
only) properties view joined, I get about 10 minutes as the result, and 
when I join both the top soils layer view and the second soil layer view my 
query time is about 15-20 minutes.

I surmise from the query explain output that the join on the key properties 
between these layers is occurring before the soil shape records are weeded 
out to the spatial extent (which is a very low cost selection since they 
are indexed), thus, all half million records are being joined first, then 
the ones in the spatial extent are being grabbed. What I would like to do 
is reverse this order. Does anyone have any idea how to do this?

Thanks in advance.

The results of EXPLAIN commands on the two views, and on the query that 
joins them are as follows:

*************************************
***** the soil shape query *******
Data_Soils=# explain select * from ssurgo_shp where "MUKEY" = '118512';
                                      QUERY 
PLAN
------------------------------------------------------------------------------------
  Index Scan using ssmukey_ix on ssurgo_shp  (cost=0.00..1074.82 rows=280 
width=227)
    Index Cond: ("MUKEY" = '118512'::bpchar)
(2 rows)


*************************************
**** the soil property query *****
Data_Soils=# explain select * from ssurgoprops_l1 where mukey = '118512';
                                                   QUERY 
PLAN
--------------------------------------------------------------------------------------------------------------
  Subquery Scan ssurgoprops_l1  (cost=0.00..172.51 rows=1 width=87)
    ->  Aggregate  (cost=0.00..172.51 rows=1 width=87)
          ->  Group  (cost=0.00..172.39 rows=3 width=87)
                ->  Nested Loop  (cost=0.00..172.38 rows=3 width=87)
                      Join Filter: ("outer".hydgrp = "inner".hydgrp)
                      ->  Nested Loop  (cost=0.00..109.91 rows=2 width=63)
                            ->  Index Scan using compmukey_ix on 
component  (cost=0.00..20.79 rows=5 width=34)
                                  Index Cond: (mukey = '118512'::character 
varying)
                            ->  Index Scan using ch_cokey_ix on 
chorizon  (cost=0.00..18.18 rows=1 width=29)
                                  Index Cond: (chorizon.cokey = "outer".cokey)
                                  Filter: (desgnvert = 1)
                      ->  Seq Scan on hspfinfilt  (cost=0.00..20.00 
rows=1000 width=24)


*************************************
**** the combination query ******
Data_Soils=# explain SELECT 
musym,mukey,infilt,depth,clay_pct,sand_pct,ksat,thetasat,thetafc,hydgrp,uzsn,lzsn,the_geom 
FROM soilpropsdd_l1 WHERE the_geom && GeometryFromText('BOX3D(-77.765306 
36.944388,-77.408085 37.306579)'::box3d,-1);
                                                            QUERY PLAN

--------------------------------------------------------------------------------
-------------------------------------------------
  Nested Loop  (cost=4843.38..5106.68 rows=2 width=303)
    Join Filter: ("outer".mukey = "inner".mukey)
    ->  Merge Join  (cost=2544.81..2547.60 rows=1 width=158)
          Merge Cond: ("outer".mukey = "inner".mukey)
          ->  Sort  (cost=11.78..11.79 rows=2 width=77)
                Sort Key: ssurgo_shp.mukey
                ->  Index Scan using soildd_ix on 
ssurgo_shp  (cost=0.00..11.77 r
ows=2 width=77)
                      Index Cond: (wkb_geometry && 
'SRID=-1;BOX3D(-77.765306 36.9
44388 0,-77.408085 37.306579 0)'::geometry)
          ->  Sort  (cost=2533.03..2534.41 rows=553 width=87)
                Sort Key: ssurgoprops_l2.mukey
                ->  Subquery Scan ssurgoprops_l2  (cost=2273.01..2507.86 
rows=553
  width=87)
                      ->  Aggregate  (cost=2273.01..2507.86 rows=553 width=87)
                            ->  Group  (cost=2273.01..2300.64 rows=5526 
width=87)
                                  ->  Sort  (cost=2273.01..2286.82 
rows=5526 widt
h=87)
                                        Sort Key: component.mukey
                                        ->  Merge 
Join  (cost=1840.18..1929.52 ro
ws=5526 width=87)
                                              Merge Cond: ("outer".hydgrp = 
"inne
r".hydgrp)
                                              ->  Sort 
(cost=1770.35..1777.99 ro
ws=3053 width=63)
                                                    Sort Key: component.hydgrp
                                                    ->  Hash 
Join  (cost=847.30..
1593.65 rows=3053 width=63)
                                                          Hash Cond: 
("outer".cok
ey = "inner".cokey)
                                                          ->  Seq Scan on 
chorizo
n  (cost=0.00..685.29 rows=3053 width=29)
                                                                Filter: 
(desgnver
t = 2)
                                                          ->  Hash 
(cost=473.64.
.473.64 rows=8364 width=34)
                                                                ->  Seq 
Scan on c
omponent  (cost=0.00..473.64 rows=8364 width=34)
                                              ->  Sort  (cost=69.83..72.33 
rows=1
000 width=24)
                                                    Sort Key: hspfinfilt.hydgrp
                                                    ->  Seq Scan on 
hspfinfilt  (
cost=0.00..20.00 rows=1000 width=24)
    ->  Subquery Scan ssurgoprops_l1  (cost=2298.57..2551.64 rows=595 width=87)
          ->  Aggregate  (cost=2298.57..2551.64 rows=595 width=87)
                ->  Group  (cost=2298.57..2328.34 rows=5955 width=87)
                      ->  Sort  (cost=2298.57..2313.45 rows=5955 width=87)
                            Sort Key: component.mukey
                            ->  Merge Join  (cost=1829.33..1925.21 
rows=5955 widt
h=87)
                                  Merge Cond: ("outer".hydgrp = "inner".hydgrp)
                                  ->  Sort  (cost=1759.50..1767.72 
rows=3290 widt
h=63)
                                        Sort Key: component.hydgrp
                                        ->  Hash 
Join  (cost=816.22..1567.31 rows
=3290 width=63)
                                              Hash Cond: ("outer".cokey = 
"inner"
.cokey)
                                              ->  Seq Scan on 
chorizon  (cost=0.0
0..685.29 rows=3290 width=29)
                                                    Filter: (desgnvert = 1)
                                              ->  Hash 
(cost=473.64..473.64 rows
=8364 width=34)
                                                    ->  Seq Scan on 
component  (c
ost=0.00..473.64 rows=8364 width=34)
                                  ->  Sort  (cost=69.83..72.33 rows=1000 
width=24
)
                                        Sort Key: hspfinfilt.hydgrp
                                        ->  Seq Scan on 
hspfinfilt  (cost=0.00..2
0.00 rows=1000 width=24)
(46 rows)




Robert Burgholzer
Environmental Engineer
MapTech Inc.
http://www.maptech-inc.com/ 




More information about the postgis-users mailing list