[postgis-devel] cost of some functions (e.g. st_within) might be too high after 3ddc11483b

Tomas Vondra tomas.vondra at 2ndquadrant.com
Mon Sep 26 12:15:40 PDT 2022


Hi Paul!

On 9/26/22 18:29, Paul Ramsey wrote:
> Dynamic costing based on column size (main determinant of costs)
> would be interesting. However, in the meanwhile it's possible we just
> need to tweak the constants we have.
> 
> I need to check the dates, but one reason function costs got plumped
> waaaay up is that it was quite hard to get a parallel plan out of
> PgSQL without hitting it on the head with a hammer, and being
> CPU-bound a lot of the time, it was/is frequently good to get a
> parallel plan. The difficulty of getting a parallel plan went down
> with PgSQL 12, so maybe we need a switch in version, or just a
> general reduction in some of the more extreme costs.
> 

Yeah. Although this is on 12, and the commit was actually meant for 12, as
it says:

   #if POSTGIS_PGSQL_VERSION >= 120
   ...

but maybe I'm missing something.
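
FWIW, the cost currently assigned to a function is easy to check in
pg_proc (st_within here is just an example - the same query works for any
of the affected functions):

   SELECT oid::regprocedure, procost
     FROM pg_proc
    WHERE proname = 'st_within';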

> How far down did you need to drop costs to get your particular query
> flipping into a smarter plan again?
> 

That's a great question. I haven't tried that, and I don't have access
to the system at the moment - I'll check if I can get it again.

However, let me show the simplified plans (just the interesting nodes),
which should be good enough to make some educated guesses:


9.5 / default

 Limit  (cost=0.42..81859.79 rows=200 width=64)
        (actual time=1.237..36.674 rows=200 loops=1)
   ->  Index Scan using my_index on public.my_table sc
       (cost=0.42..168630.72 rows=412 width=64)
       (actual time=1.236..36.632 rows=200 loops=1)

9.5 / enable_indexscan=off

 Limit  (cost=136244.33..136244.83 rows=200 width=64)
        (actual time=1371.485..1371.562 rows=200 loops=1)
   ->  Sort  (cost=136244.33..136245.36 rows=412 width=64)
             (actual time=1371.482..1371.533 rows=200 loops=1)
         ->  Bitmap Heap Scan on public.my_table sc
             (cost=1527.10..136226.53 rows=412 width=64)
             (actual time=178.270..1369.562 rows=934 loops=1)
               ->  Bitmap Index Scan on my_index
                   (cost=0.00..1527.00 rows=40258 width=0)
                   (actual time=47.968..47.968 rows=10481 loops=1)

12 / default

 Limit  (cost=204639.22..204662.56 rows=200 width=72)
        (actual time=1282.065..1282.732 rows=200 loops=1)
   ->  Gather Merge  (cost=204639.22..204670.49 rows=268 width=72)
                     (actual time=1282.062..1282.700 rows=200 loops=1)
         ->  Sort  (cost=203639.20..203639.53 rows=134 width=72)
                   (actual time=1257.098..1257.141 rows=169 loops=3)
               ->  Parallel Bitmap Heap Scan on public.my_table sc
                   (cost=122331.92..203634.46 rows=134 width=72)
                   (actual time=1217.578..1256.377 rows=311 loops=3)

12 / no parallelism

 Limit  (cost=288022.28..288022.78 rows=200 width=72)
        (actual time=1115.497..1115.562 rows=200 loops=1)
   ->  Sort  (cost=288022.28..288023.08 rows=322 width=72)
             (actual time=1115.495..1115.527 rows=200 loops=1)
         ->  Bitmap Heap Scan on public.my_table sc
             (cost=122331.92..288008.86 rows=322 width=72)
             (actual time=1069.847..1114.376 rows=934 loops=1)
               ->  BitmapAnd
                   (cost=122331.92..122331.92 rows=5756 width=0)
                   (actual time=1065.825..1065.827 rows=0 loops=1)
                     ->  Bitmap Index Scan on my_index
                         (cost=0.00..1642.97 rows=40255 width=0)
                         (actual time=6.927..6.928 rows=10481 loops=1)
                     ->  Bitmap Index Scan on my_index_2
                         (cost=0.00..120688.54 rows=5755997 width=0)
                         (actual time=1056.243..1056.244 rows=...)

12 / no parallelism, enable_bitmapscan=off

 Limit  (cost=0.42..723882.31 rows=200 width=72)
        (actual time=4.007..21.422 rows=200 loops=1)
   ->  Index Scan using my_index on public.my_table sc
       (cost=0.42..1165450.26 rows=322 width=72)
       (actual time=4.005..21.312 rows=200 loops=1)

I think we're really comparing the first and last plans on pg12, so 204662
vs. 723882. Of the ~723k, about 600k is due to the increased function cost
(about 1M for the whole node, but we only need 200 of the 322 rows because
of the LIMIT clause). So it's about 120k + 600k, and we need to get below
~204k, which means cutting the 600k down to ~80k, i.e. to roughly 1/10 of
the current cost.
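
That'd be easy to verify by lowering the cost in a transaction and
re-running the query - roughly something like this (assuming
st_within(geometry, geometry) is the function the query actually hits,
and ignoring that it needs the function owner or a superuser):

   BEGIN;
   ALTER FUNCTION st_within(geometry, geometry) COST 1000;
   -- re-run EXPLAIN (ANALYZE) on the query here and check the plan
   ROLLBACK;   -- throw away the cost change after the test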

So I assume that with a function cost of 1000 this would still pick the
right plan. Of course, maybe they should tune random_page_cost a bit, in
which case the function cost could stay higher.
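
Lowering random_page_cost for the session would be a one-liner too (1.1 is
just an illustrative value for SSD-like storage, not a recommendation):

   SET random_page_cost = 1.1;
   -- then re-run EXPLAIN (ANALYZE) to see if the index scan wins
   -- even with the current function costs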

I'll see if I can get access to the system again and experiment a bit.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

