[postgis-users] caching computations ?
Andi Vajda
vajda at metaweb.com
Fri Sep 12 23:18:57 PDT 2008
On Sat, 13 Sep 2008, Paragon Corporation wrote:
> This is a bit of an annoying fact that was brought up way way way back.
>
> The issue is that by wrapping your call in a function, you are forcing the
> planner to materialize the table. If its not wrapped in a function, then
> the planner has the luxury of deciding whether to materialize or not.
I wish there was a way to force the materialization explicitely.
Doing it via functions is the next best thing, I guess.
Actually, functions provide a neat way here to cache away frequently used
transformed shapes. Your tip about STABLE should help a bunch too.
> Try this query and see if it comes closer to your function example.
> SELECT t.typeid, count(*)
> FROM fbtype t, fbgeom g,
> (SELECT ST_Simplify(shape, 1.0) AS shape
> FROM fbgeom
> WHERE topicid='9202a8c04000641f800000000000d0f6' ORDER BY
> ST_Simplify(shape, 1.0)) AS location
> WHERE t.topicid=g.topicid
> AND t.typeid='9202a8c04000641f8000000000000669'
> AND ST_Within(g.shape, location.shape)
> GROUP BY 1
> ORDER BY 2 DESC;
I'm afraid I don't understand this query. It invokes ST_Simplify() twice,
sorting on the second invocation ?
> Also by the way your SQL function is marked as VOLATILE (well that's the
> default if not explicitly stated) - if you plan to move this into the SELECT
> to run against all topicids. May help to mark it as STABLE.
Thanks for the tip !
Andi..
>
> Hope that helps,
> Regina
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Andi
> Vajda
> Sent: Friday, September 12, 2008 9:15 PM
> To: PostGIS Users
> Subject: [postgis-users] caching computations ?
>
>
> The following two queries are functionally equivalent in that they return
> the same result. They're also equivalent in structure. Except that the first
> one uses a subquery in the from clause and the second one a function
> executing the same query as that subquery. Both use the geo index,
> fbgeom_shape_idx.
>
> The first one runs in 246536ms, the second one in 1745ms.
>
> Is it because in the latter, the ST_Simplify() is computed only once per
> shape considered whereas in the former at every test in the join ?
>
> Thanks !
>
> Andi..
>
> EXPLAIN ANALYZE
> SELECT t.typeid, count(*)
> FROM fbtype t, fbgeom g,
> (SELECT ST_Simplify(shape, 1.0) AS shape
> FROM fbgeom
> WHERE topicid='9202a8c04000641f800000000000d0f6') AS location
> WHERE t.topicid=g.topicid
> AND t.typeid='9202a8c04000641f8000000000000669'
> AND ST_Within(g.shape, location.shape)
> GROUP BY 1
> ORDER BY 2 DESC;
>
> Sort (cost=32.41..32.41 rows=1 width=33) (actual
> time=246536.348..246536.349 rows=1 loops=1)
> Sort Key: (count(*))
> Sort Method: quicksort Memory: 25kB
> -> GroupAggregate (cost=0.00..32.40 rows=1 width=33) (actual
> time=246536.331..246536.332 rows=1 loops=1)
> -> Nested Loop (cost=0.00..32.38 rows=1 width=33) (actual
> time=72.844..246531.640 rows=5596 loops=1)
> -> Nested Loop (cost=0.00..21.07 rows=1 width=32) (actual
> time=0.426..245923.937 rows=40387 loops=1)
> Join Filter: _st_within(g.shape,
> st_simplify(fbgeom.shape, 1::double precision))
> -> Index Scan using fbgeom_topicid_idx on fbgeom
> (cost=0.00..8.53 rows=1 width=123) (actual time=0.074..0.112 rows=8 loops=1)
> Index Cond: (topicid =
> '9202a8c04000641f800000000000d0f6'::bpchar)
> -> Index Scan using fbgeom_shape_idx on fbgeom g
> (cost=0.00..12.50 rows=2 width=155) (actual time=44.315..15368.010 rows=6108
> loops=8)
> Index Cond: (g.shape &&
> st_simplify(fbgeom.shape, 1::double precision))
> Filter: (g.shape && st_simplify(fbgeom.shape,
> 1::double precision))
> -> Index Scan using fbtype_pkey on fbtype t
> (cost=0.00..11.27 rows=3 width=66) (actual time=0.014..0.014 rows=0
> loops=40387)
> Index Cond: ((t.topicid = g.topicid) AND (t.typeid =
> '9202a8c04000641f8000000000000669'::bpchar))
> Total runtime: 246536.530 ms
>
> versus:
>
> CREATE OR REPLACE FUNCTION gix_simplify(character(32), double precision)
> RETURNS SETOF geometry as $$
> SELECT ST_Simplify(shape, $2) FROM fbgeom WHERE topicid=$1; $$ LANGUAGE
> SQL;
>
> EXPLAIN ANALYZE
> SELECT t.typeid, count(*)
> FROM fbtype t, fbgeom g,
> gix_simplify('9202a8c04000641f800000000000d0f6', 1.0) AS location
> WHERE t.topicid=g.topicid
> AND t.typeid='9202a8c04000641f8000000000000669'
> AND ST_Within(g.shape, location)
> GROUP BY 1
> ORDER BY 2 DESC;
>
> Sort (cost=20304.25..20304.27 rows=6 width=33) (actual
> time=1757.072..1757.072 rows=1 loops=1)
> Sort Key: (count(*))
> Sort Method: quicksort Memory: 25kB
> -> GroupAggregate (cost=0.00..20304.18 rows=6 width=33) (actual
> time=1757.053..1757.053 rows=1 loops=1)
> -> Nested Loop (cost=0.00..20302.23 rows=374 width=33) (actual
> time=69.671..1753.826 rows=5596 loops=1)
> -> Nested Loop (cost=0.00..11726.96 rows=809 width=32)
> (actual time=3.663..1304.404 rows=40387 loops=1)
> Join Filter: _st_within(g.shape, location.location)
> -> Function Scan on gix_simplify location
> (cost=0.00..260.00 rows=1000 width=32) (actual time=3.311..3.325 rows=8
> loops=1)
> -> Index Scan using fbgeom_shape_idx on fbgeom g
> (cost=0.00..11.44 rows=2 width=155) (actual time=43.784..71.728 rows=6108
> loops=8)
> Index Cond: (g.shape && location.location)
> Filter: (g.shape && location.location)
> -> Index Scan using fbtype_pkey on fbtype t
> (cost=0.00..10.56 rows=3 width=66) (actual time=0.010..0.011 rows=0
> loops=40387)
> Index Cond: ((t.topicid = g.topicid) AND (t.typeid =
> '9202a8c04000641f8000000000000669'::bpchar))
> Total runtime: 1757.231 ms
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
More information about the postgis-users
mailing list