[postgis-users] pg7.4.1 performance problem

Tue Feb 17 22:29:26 PST 2004

Hiya PostGISers,

I have a performance problem on a recently migrated 7.3.2 database to 
7.4.1.  Fully patched RH 8.  PostGIS 0.8.1, GEOS 1

Here is my first query that looks good:

explain analyze 
  SELECT asbinary(force_collection(force_2d(geo)),'NDR'),oid::text from  
     (select 
         d.oid as oid, d.rainfall / 25.4 as rainfall, 
         h.the_geom as geo from 
         daily_rainfall_2003 d , hrap_polygons h 
      WHERE d.hrap_i = h.hrap_i and d.valid = '2003-11-25') as foo ;

QUERY PLAN                                                                                 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=2082.92..3259.71 rows=2627 width=252) (actual 
time=13.800..13.809 rows=1 loops=1)
   Merge Cond: ("outer".hrap_i = "inner".hrap_i)
   ->  Sort  (cost=2082.92..2101.70 rows=7510 width=6) (actual 
time=0.302..0.332 rows=66 loops=1)
         Sort Key: d.hrap_i
         ->  Index Scan using daily_rainfall_valid_idx on 
daily_rainfall_2003 d  (cost=0.00..1599.48 rows=7510 width=6) (actual 
time=0.047..0.210 rows=66 loops=1)
               Index Cond: ("valid" = '2003-11-25'::date)
   ->  Index Scan using hrap_polygons_hrap_i_idx on hrap_polygons h  
(cost=0.00..1073.16 rows=7988 width=252) (actual time=0.013..8.753 
rows=5102 loops=1)
 Total runtime: 13.931 ms
(8 rows)

-----------------------------
Now we add in the bounding box constraint that mapserver generates and we 
get
------------------------------

explain analyze
  SELECT asbinary(force_collection(force_2d(geo)),'NDR'),oid::text from  
     (select 
         d.oid as oid, d.rainfall / 25.4 as rainfall, 
         h.the_geom as geo from 
         daily_rainfall_2003 d , hrap_polygons h 
      WHERE d.hrap_i = h.hrap_i and d.valid = '2003-11-25') as foo 
   WHERE geo && 
setSRID('BOX3D(159999.999983024 4424687.49997101,
               760000.000004331 4875312.5)'::BOX3D, 26915 );

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------- 
Nested Loop  (cost=0.00..2786.49 rows=1 width=252) (actual 
time=1109.977..1778.253 rows=1 loops=1)
   Join Filter: ("inner".hrap_i = "outer".hrap_i)
   ->  Index Scan using hrap_polygons_hrap_i_idx on hrap_polygons h  
(cost=0.00..1093.13 rows=1 width=252) (actual time=0.053..23.555 rows=7988 
loops=1)
         Filter: (the_geom && 'SRID=26915;BOX3D(159999.999983024 
4424687.49997101 0,760000.000004331 4875312.5 0)'::geometry)
   ->  Index Scan using daily_rainfall_valid_idx on daily_rainfall_2003 d  
(cost=0.00..1599.48 rows=7510 width=6) (actual time=0.009..0.158 rows=66 
loops=7988)
         Index Cond: ("valid" = '2003-11-25'::date)
 Total runtime: 1778.341 ms

Whoaaa, this isn't good.  This is just a modest example.  Queries on dates 
with more data basically don't return (impatience on my part).  Well, 
before I ask the pgsql-performance folks, any thoughts from the PostGIS 
side?  

Thanks,
  Daryl

-- 
/**
 * Daryl Herzmann (akrherz at iastate.edu)
 * Program Assistant -- Iowa Environmental Mesonet
 * http://mesonet.agron.iastate.edu
 */