[postgis-users] caching computations ?

Andi Vajda vajda at metaweb.com
Sat Sep 13 12:08:55 PDT 2008


On Sat, 13 Sep 2008, Paragon Corporation wrote:

>> Try this query and see if it comes closer to your function example.
>>     SELECT t.typeid, count(*)
>>     FROM fbtype t, fbgeom g,
>>          (SELECT ST_Simplify(shape, 1.0) AS shape
>>           FROM fbgeom
>>           WHERE topicid='9202a8c04000641f800000000000d0f6' ORDER BY
>> ST_Simplify(shape, 1.0)) AS location
>>     WHERE t.topicid=g.topicid
>>     AND t.typeid='9202a8c04000641f8000000000000669'
>>     AND ST_Within(g.shape, location.shape)
>>     GROUP BY 1
>>     ORDER BY 2 DESC;
>
>> I'm afraid I don't understand this query. It invokes ST_Simplify() twice,
> sorting on the second invocation ?
>
> So did you try the above?  It seems counter intuitive, but the simplify
> would only run once per invocation so not as bad looking as it appears.

I tried your query and it works. The ORDER BY trick does indeed nudge the 
planner in the right direction. I replaced 'ORDER BY ST_Simplify(shape, 
1.0)' with 'ORDER BY shape' and it still works the same magic :)

Thanks !

Andi..

> The idea of the ORDER BY  is that it forces the planner into thinking
> materializing is cheaper than not.  It seems to work for me about 7 out o
> 10.
>
> Mark Cave-Ayland had mentioned this trick and explained the concept a long
> long time ago.  Wish I could remember the post and when, and I thought -
> that looks counterintuitive, but it works 7 out of 10 times I have tried it.
>
> I suppose you could try ordering by a constant, but I think the planner is
> smart enough to see thru that.  Ordering by the simplify I think may also
> have the side effect of making the planner more likely pull records in page
> order (maybe not though since it probably does that already by the index).
>
> Hope that helps,
> Regina
>
>> -----Original Message-----
>> From: postgis-users-bounces at postgis.refractions.net
>> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
>> Andi Vajda
>> Sent: Friday, September 12, 2008 9:15 PM
>> To: PostGIS Users
>> Subject: [postgis-users] caching computations ?
>>
>>
>> The following two queries are functionally equivalent in that they
>> return the same result. They're also equivalent in structure. Except
>> that the first one uses a subquery in the from clause and the second
>> one a function executing the same query as that subquery. Both use the
>> geo index, fbgeom_shape_idx.
>>
>> The first one runs in 246536ms, the second one in 1745ms.
>>
>> Is it because in the latter, the ST_Simplify() is computed only once
>> per shape considered whereas in the former at every test in the join ?
>>
>> Thanks !
>>
>> Andi..
>>
>> EXPLAIN ANALYZE
>>     SELECT t.typeid, count(*)
>>     FROM fbtype t, fbgeom g,
>>          (SELECT ST_Simplify(shape, 1.0) AS shape
>>           FROM fbgeom
>>           WHERE topicid='9202a8c04000641f800000000000d0f6') AS location
>>     WHERE t.topicid=g.topicid
>>     AND t.typeid='9202a8c04000641f8000000000000669'
>>     AND ST_Within(g.shape, location.shape)
>>     GROUP BY 1
>>     ORDER BY 2 DESC;
>>
>> Sort  (cost=32.41..32.41 rows=1 width=33) (actual
>> time=246536.348..246536.349 rows=1 loops=1)
>>    Sort Key: (count(*))
>>    Sort Method:  quicksort  Memory: 25kB
>>    ->  GroupAggregate  (cost=0.00..32.40 rows=1 width=33) (actual
>> time=246536.331..246536.332 rows=1 loops=1)
>>          ->  Nested Loop  (cost=0.00..32.38 rows=1 width=33) (actual
>> time=72.844..246531.640 rows=5596 loops=1)
>>                ->  Nested Loop  (cost=0.00..21.07 rows=1 width=32)
>> (actual
>> time=0.426..245923.937 rows=40387 loops=1)
>>                      Join Filter: _st_within(g.shape,
>> st_simplify(fbgeom.shape, 1::double precision))
>>                      ->  Index Scan using fbgeom_topicid_idx on fbgeom
>> (cost=0.00..8.53 rows=1 width=123) (actual time=0.074..0.112 rows=8
> loops=1)
>>                            Index Cond: (topicid =
>> '9202a8c04000641f800000000000d0f6'::bpchar)
>>                      ->  Index Scan using fbgeom_shape_idx on fbgeom g
>> (cost=0.00..12.50 rows=2 width=155) (actual time=44.315..15368.010
>> rows=6108
>> loops=8)
>>                            Index Cond: (g.shape &&
>> st_simplify(fbgeom.shape, 1::double precision))
>>                            Filter: (g.shape &&
>> st_simplify(fbgeom.shape, 1::double precision))
>>                ->  Index Scan using fbtype_pkey on fbtype t
>> (cost=0.00..11.27 rows=3 width=66) (actual time=0.014..0.014 rows=0
>> loops=40387)
>>                      Index Cond: ((t.topicid = g.topicid) AND
>> (t.typeid =
>> '9202a8c04000641f8000000000000669'::bpchar))
>>  Total runtime: 246536.530 ms
>>
>> versus:
>>
>> CREATE OR REPLACE FUNCTION gix_simplify(character(32), double
>> precision) RETURNS SETOF geometry as $$
>>     SELECT ST_Simplify(shape, $2) FROM fbgeom WHERE topicid=$1; $$
>> LANGUAGE SQL;
>>
>> EXPLAIN ANALYZE
>>     SELECT t.typeid, count(*)
>>     FROM fbtype t, fbgeom g,
>>          gix_simplify('9202a8c04000641f800000000000d0f6', 1.0) AS location
>>     WHERE t.topicid=g.topicid
>>     AND t.typeid='9202a8c04000641f8000000000000669'
>>     AND ST_Within(g.shape, location)
>>     GROUP BY 1
>>     ORDER BY 2 DESC;
>>
>> Sort  (cost=20304.25..20304.27 rows=6 width=33) (actual
>> time=1757.072..1757.072 rows=1 loops=1)
>>    Sort Key: (count(*))
>>    Sort Method:  quicksort  Memory: 25kB
>>    ->  GroupAggregate  (cost=0.00..20304.18 rows=6 width=33) (actual
>> time=1757.053..1757.053 rows=1 loops=1)
>>          ->  Nested Loop  (cost=0.00..20302.23 rows=374 width=33)
>> (actual
>> time=69.671..1753.826 rows=5596 loops=1)
>>                ->  Nested Loop  (cost=0.00..11726.96 rows=809
>> width=32) (actual time=3.663..1304.404 rows=40387 loops=1)
>>                      Join Filter: _st_within(g.shape, location.location)
>>                      ->  Function Scan on gix_simplify location
>> (cost=0.00..260.00 rows=1000 width=32) (actual time=3.311..3.325
>> rows=8
>> loops=1)
>>                      ->  Index Scan using fbgeom_shape_idx on fbgeom g
>> (cost=0.00..11.44 rows=2 width=155) (actual time=43.784..71.728
>> rows=6108
>> loops=8)
>>                            Index Cond: (g.shape && location.location)
>>                            Filter: (g.shape && location.location)
>>                ->  Index Scan using fbtype_pkey on fbtype t
>> (cost=0.00..10.56 rows=3 width=66) (actual time=0.010..0.011 rows=0
>> loops=40387)
>>                      Index Cond: ((t.topicid = g.topicid) AND
>> (t.typeid =
>> '9202a8c04000641f8000000000000669'::bpchar))
>>  Total runtime: 1757.231 ms
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>



More information about the postgis-users mailing list