[postgis-users] Query performance help

Wed Mar 23 06:54:47 PST 2005

Hi all,

This is slightly OT, but I just migrated from MySQL to postgres/postgis 
and I'm having a problem with query performance. Any thoughts would be 
appreciated as I feel this is just something dumb because don't know the 
ins and outs of postgres yet.

I have three tables abbreviated but like (if you are familiar with it, 
these are from the census summary file info) and my goal is to generate 
thematic maps. In this case of zipcode tabulation areas (zcta):

table: sf3geo
   zcta5
   sumlev
   logrecno

table: dp3
   logrecno
   lots of attribute cols

table: zcta
   zcta5
   the_geom

So I need to select all the sf3geo.sumlev=871 to just get the zcta 
records and join it to dp3 based on logrecno and join it to zcta based 
on zcta5.

Just trying to do the first join is a killer!

explain select b.zcta5, round(a.dp30063/200001*8) as income from sf3geo 
b, dp3 a where b.sumlev=871 and b.logrecno=a.logrecno;

Nested Loop  (cost=0.00..460665.14 rows=132237 width=8)
   ->  Index Scan using sf3geo_sumlev_idx on sf3geo b 
(cost=0.00..29652.79 rows=7491 width=15)
         Index Cond: ((sumlev)::text = '871'::text)
   ->  Index Scan using dp3_logrecno_idx on dp3 a  (cost=0.00..57.14 
rows=18 width=15)
         Index Cond: (("outer".logrecno)::text = (a.logrecno)::text)

In fact, just counting the rows is slow:

explain select count(*) from sf3geo where sumlev=871;

Aggregate  (cost=29671.52..29671.52 rows=1 width=0)
   ->  Index Scan using sf3geo_sumlev_idx on sf3geo 
(cost=0.00..29652.79 rows=7491 width=0)
         Index Cond: ((sumlev)::text = '871'::text)

So I am wondering what am I missing that make this go faster? Can using 
view improve performance? For example by pre-filtering the sf3geo to 
only show the sumlev=871 records?

I did vaccuum analyze all the tables, but I changed some indexes so may 
be I need to do that again.

-Steve