[postgis-users] Query performance help
Paul Ramsey
pramsey at refractions.net
Wed Mar 23 08:10:11 PST 2005
Some potential gotchas:
- I have been suckered in the past when it turned out I had mismatched
types in the columns I was joining (an int8 on one side, an int4 on
the other). This caused PostgreSQL to choose a terrible query plan.
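One way to check for this (a sketch only; the table and column names here are hypothetical, not from Steve's schema, and ALTER COLUMN ... TYPE requires a reasonably recent PostgreSQL):

```sql
-- Compare the declared types of the two join columns.
SELECT table_name, column_name, data_type
  FROM information_schema.columns
 WHERE (table_name, column_name) IN (('orders', 'cust_id'),
                                     ('customers', 'id'));

-- If the types differ, converting one side lets the planner
-- join the columns without casting and use an index cleanly:
ALTER TABLE orders
  ALTER COLUMN cust_id TYPE int4 USING cust_id::int4;
```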
- select count(*) is special-cased (as are many things) in MySQL to make it
"fast": MyISAM keeps the exact row count cached, so counting is free.
PostgreSQL's MVCC design means a count has to actually visit rows, so
exact full-table counts are inherently slower.
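When an approximate count is good enough, the planner's own statistics can be read back instantly instead of scanning. A minimal sketch (the figure is only as fresh as the last VACUUM/ANALYZE):

```sql
-- Planner's row-count estimate for the table, returned
-- without touching the table itself.
SELECT reltuples FROM pg_class WHERE relname = 'sf3geo';
```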
Finally, useful testing:
- use "explain analyze <query>" to compare the estimated and actual
times for the various parts of your query. Where the planner is making
bad choices, it will either (a) show you a place you screwed up your
data load or (b) give you hints to optimize around.
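For example, run against the join from Steve's message:

```sql
-- EXPLAIN ANALYZE actually executes the query and reports, for each
-- plan node, the planner's estimated cost/rows alongside the real
-- time/rows, so mis-estimates stand out.
EXPLAIN ANALYZE
SELECT b.zcta5, round(a.dp30063 / 200001 * 8) AS income
  FROM sf3geo b, dp3 a
 WHERE b.sumlev = 871
   AND b.logrecno = a.logrecno;
```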
P
Stephen Woodbridge wrote:
> This is slightly OT, but I just migrated from MySQL to postgres/postgis
> and I'm having a problem with query performance. Any thoughts would be
> appreciated, as I feel this is just something dumb because I don't know
> the ins and outs of postgres yet.
>
> I have three tables, abbreviated below (if you are familiar with it,
> these are from the census summary file info), and my goal is to generate
> thematic maps, in this case of zipcode tabulation areas (zcta):
>
> table: sf3geo
> zcta5
> sumlev
> logrecno
>
> table: dp3
> logrecno
> lots of attribute cols
>
> table: zcta
> zcta5
> the_geom
>
> So I need to select all the sf3geo.sumlev=871 rows to get just the zcta
> records, join them to dp3 on logrecno, and join that to zcta on zcta5.
> Just trying to do the first join is a killer!
>
> explain select b.zcta5, round(a.dp30063/200001*8) as income from sf3geo
> b, dp3 a where b.sumlev=871 and b.logrecno=a.logrecno;
>
> Nested Loop (cost=0.00..460665.14 rows=132237 width=8)
> -> Index Scan using sf3geo_sumlev_idx on sf3geo b
> (cost=0.00..29652.79 rows=7491 width=15)
> Index Cond: ((sumlev)::text = '871'::text)
> -> Index Scan using dp3_logrecno_idx on dp3 a (cost=0.00..57.14
> rows=18 width=15)
> Index Cond: (("outer".logrecno)::text = (a.logrecno)::text)
>
> In fact, just counting the rows is slow:
>
> explain select count(*) from sf3geo where sumlev=871;
>
> Aggregate (cost=29671.52..29671.52 rows=1 width=0)
> -> Index Scan using sf3geo_sumlev_idx on sf3geo (cost=0.00..29652.79
> rows=7491 width=0)
> Index Cond: ((sumlev)::text = '871'::text)
>
> So I am wondering: what am I missing that would make this go faster? Can
> using a view improve performance? For example, by pre-filtering sf3geo to
> only show the sumlev=871 records?
>
> I did vacuum analyze all the tables, but I changed some indexes, so maybe
> I need to do that again.
>
> -Steve
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users