[postgis-users] caching computations ?

Paragon Corporation lr at pcorp.us
Sat Sep 13 07:04:24 PDT 2008



> Try this query and see if it comes closer to your function example.
>     SELECT t.typeid, count(*)
>     FROM fbtype t, fbgeom g,
>          (SELECT ST_Simplify(shape, 1.0) AS shape
>           FROM fbgeom
>           WHERE topicid='9202a8c04000641f800000000000d0f6' ORDER BY 
> ST_Simplify(shape, 1.0)) AS location
>     WHERE t.topicid=g.topicid
>     AND t.typeid='9202a8c04000641f8000000000000669'
>     AND ST_Within(g.shape, location.shape)
>     GROUP BY 1
>     ORDER BY 2 DESC;

> I'm afraid I don't understand this query. It invokes ST_Simplify() twice,
sorting on the second invocation ?

So did you try the above?  It seems counter intuitive, but the simplify
would only run once per invocation so not as bad looking as it appears.

The idea of the ORDER BY  is that it forces the planner into thinking
materializing is cheaper than not.  It seems to work for me about 7 out o
10.

Mark Cave-Ayland had mentioned this trick and explained the concept a long
long time ago.  Wish I could remember the post and when, and I thought -
that looks counterintuitive, but it works 7 out of 10 times I have tried it.

I suppose you could try ordering by a constant, but I think the planner is
smart enough to see thru that.  Ordering by the simplify I think may also
have the side effect of making the planner more likely pull records in page
order (maybe not though since it probably does that already by the index).

Hope that helps,
Regina

> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of 
> Andi Vajda
> Sent: Friday, September 12, 2008 9:15 PM
> To: PostGIS Users
> Subject: [postgis-users] caching computations ?
>
>
> The following two queries are functionally equivalent in that they 
> return the same result. They're also equivalent in structure. Except 
> that the first one uses a subquery in the from clause and the second 
> one a function executing the same query as that subquery. Both use the 
> geo index, fbgeom_shape_idx.
>
> The first one runs in 246536ms, the second one in 1745ms.
>
> Is it because in the latter, the ST_Simplify() is computed only once 
> per shape considered whereas in the former at every test in the join ?
>
> Thanks !
>
> Andi..
>
> EXPLAIN ANALYZE
>     SELECT t.typeid, count(*)
>     FROM fbtype t, fbgeom g,
>          (SELECT ST_Simplify(shape, 1.0) AS shape
>           FROM fbgeom
>           WHERE topicid='9202a8c04000641f800000000000d0f6') AS location
>     WHERE t.topicid=g.topicid
>     AND t.typeid='9202a8c04000641f8000000000000669'
>     AND ST_Within(g.shape, location.shape)
>     GROUP BY 1
>     ORDER BY 2 DESC;
>
> Sort  (cost=32.41..32.41 rows=1 width=33) (actual
> time=246536.348..246536.349 rows=1 loops=1)
>    Sort Key: (count(*))
>    Sort Method:  quicksort  Memory: 25kB
>    ->  GroupAggregate  (cost=0.00..32.40 rows=1 width=33) (actual
> time=246536.331..246536.332 rows=1 loops=1)
>          ->  Nested Loop  (cost=0.00..32.38 rows=1 width=33) (actual 
> time=72.844..246531.640 rows=5596 loops=1)
>                ->  Nested Loop  (cost=0.00..21.07 rows=1 width=32) 
> (actual
> time=0.426..245923.937 rows=40387 loops=1)
>                      Join Filter: _st_within(g.shape, 
> st_simplify(fbgeom.shape, 1::double precision))
>                      ->  Index Scan using fbgeom_topicid_idx on fbgeom
> (cost=0.00..8.53 rows=1 width=123) (actual time=0.074..0.112 rows=8
loops=1)
>                            Index Cond: (topicid =
> '9202a8c04000641f800000000000d0f6'::bpchar)
>                      ->  Index Scan using fbgeom_shape_idx on fbgeom g 
> (cost=0.00..12.50 rows=2 width=155) (actual time=44.315..15368.010 
> rows=6108
> loops=8)
>                            Index Cond: (g.shape && 
> st_simplify(fbgeom.shape, 1::double precision))
>                            Filter: (g.shape && 
> st_simplify(fbgeom.shape, 1::double precision))
>                ->  Index Scan using fbtype_pkey on fbtype t
> (cost=0.00..11.27 rows=3 width=66) (actual time=0.014..0.014 rows=0
> loops=40387)
>                      Index Cond: ((t.topicid = g.topicid) AND 
> (t.typeid =
> '9202a8c04000641f8000000000000669'::bpchar))
>  Total runtime: 246536.530 ms
>
> versus:
>
> CREATE OR REPLACE FUNCTION gix_simplify(character(32), double 
> precision) RETURNS SETOF geometry as $$
>     SELECT ST_Simplify(shape, $2) FROM fbgeom WHERE topicid=$1; $$ 
> LANGUAGE SQL;
>
> EXPLAIN ANALYZE
>     SELECT t.typeid, count(*)
>     FROM fbtype t, fbgeom g,
>          gix_simplify('9202a8c04000641f800000000000d0f6', 1.0) AS location
>     WHERE t.topicid=g.topicid
>     AND t.typeid='9202a8c04000641f8000000000000669'
>     AND ST_Within(g.shape, location)
>     GROUP BY 1
>     ORDER BY 2 DESC;
>
> Sort  (cost=20304.25..20304.27 rows=6 width=33) (actual
> time=1757.072..1757.072 rows=1 loops=1)
>    Sort Key: (count(*))
>    Sort Method:  quicksort  Memory: 25kB
>    ->  GroupAggregate  (cost=0.00..20304.18 rows=6 width=33) (actual
> time=1757.053..1757.053 rows=1 loops=1)
>          ->  Nested Loop  (cost=0.00..20302.23 rows=374 width=33) 
> (actual
> time=69.671..1753.826 rows=5596 loops=1)
>                ->  Nested Loop  (cost=0.00..11726.96 rows=809 
> width=32) (actual time=3.663..1304.404 rows=40387 loops=1)
>                      Join Filter: _st_within(g.shape, location.location)
>                      ->  Function Scan on gix_simplify location 
> (cost=0.00..260.00 rows=1000 width=32) (actual time=3.311..3.325 
> rows=8
> loops=1)
>                      ->  Index Scan using fbgeom_shape_idx on fbgeom g
> (cost=0.00..11.44 rows=2 width=155) (actual time=43.784..71.728 
> rows=6108
> loops=8)
>                            Index Cond: (g.shape && location.location)
>                            Filter: (g.shape && location.location)
>                ->  Index Scan using fbtype_pkey on fbtype t
> (cost=0.00..10.56 rows=3 width=66) (actual time=0.010..0.011 rows=0
> loops=40387)
>                      Index Cond: ((t.topicid = g.topicid) AND 
> (t.typeid =
> '9202a8c04000641f8000000000000669'::bpchar))
>  Total runtime: 1757.231 ms
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users





More information about the postgis-users mailing list