[postgis-devel] cost of some functions (e.g. st_within) might be too high after 3ddc11483b

Darafei "Komяpa" Praliaskouski me at komzpa.net
Sat Sep 24 12:40:24 PDT 2022


Hi,

Bitmap scans in the plans are usually a sign of a system's random_page_cost
being too high and/or work_mem being too low.

In general, I'd say it is more about incorrect costing of bitmap scans in
postgres itself - there is an assumption buried deep in it that a bitmap
scan with a per-page (lossy) bitmap and one with a per-tuple (exact) bitmap
will cost about the same, while the lossy case should actually cost
st_within_cost*n_tuples_per_page times more. This can sometimes be worked
around with a higher work_mem, but it would be nice if someone got this
addressed upstream.

For the costing of postgis functions, there was research that measured real
cost, unlike the rest of postgres, where costs are guesswork:
https://github.com/postgis/postgis/pull/104


On Sat, Sep 24, 2022, 23:13 Tomas Vondra <
tomas.vondra at 2ndquadrant.com> wrote:

> Hi,
>
> We've been investigating a customer issue, where queries got much slower
> due to a query plan change after upgrading from PG 9.5 with PostGIS 2.5
> to PG 12 with PostGIS 3.1.
>
> The exact query and query plan are not very complicated or important -
> it's a simple non-join query, with st_within() in a WHERE condition (and
> a LIMIT on top), and the cardinality estimates are pretty spot on.
>
> The interesting bit is that on 9.5 the GIST index is accessed like this:
>
>    ->  Index Scan using my_gist_index on public.my_table sc
>        (cost=0.42..168630.72 rows=412 width=64) ...
>
> while on PG12 the index scan is costed like this:
>
>    ->  Index Scan using my_gist_index on public.my_table sc
>        (cost=0.42..1165450.26 rows=322 width=72) ...
>
> Yup, that's a 1M difference in cost, because in cost_index we do this:
>
>     cpu_run_cost += cpu_per_tuple * tuples_fetched;
>
> with tuples_fetched being ~40k rows, and cpu_per_tuple jumping from
> 0.29 to more than 25.0. This happens because of st_within() in the WHERE
> condition, with its procost increased from 100 to 10000 in commit
>
>   commit 3ddc11483b2266d3a914ecfdeb3e2a343e0f6dd4
>   Author:     Paul Ramsey <pramsey at cleverelephant.ca> []
>   AuthorDate: Fri Mar 8 21:48:08 2019 +0000
>   CommitDate: Fri Mar 8 21:48:08 2019 +0000
>
>       Add costs to functions in the form of macros:
>       _COST_LOW, _COST_MEDIUM, _COST_HIGH
>       Macros are set in sqldefines.h.in and are switched
>       on PgSQL version. Prior to PG12, costs should be
>       small or nonexistent, to avoid SQL inlining issues.
>       After Pg12, costs can be higher to hopefully
>       force more parallelism in plans.
>       Closes #4344
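The arithmetic behind that jump can be reproduced with PostgreSQL's stock cost parameters. This is a sketch, assuming the defaults cpu_operator_cost = 0.0025 and cpu_tuple_cost = 0.01; the function below mirrors the cost_index arithmetic quoted earlier but is not the actual planner code.

```python
# Reproduce the cpu_per_tuple jump implied by the cost_index line
#     cpu_run_cost += cpu_per_tuple * tuples_fetched;
cpu_operator_cost = 0.0025   # PostgreSQL default
cpu_tuple_cost = 0.01        # PostgreSQL default
tuples_fetched = 40_000      # roughly the number of rows from the plan

def cpu_per_tuple(procost):
    # Per-tuple CPU cost: base tuple cost plus the qual cost, where
    # procost acts as a multiplier on cpu_operator_cost.
    return cpu_tuple_cost + procost * cpu_operator_cost

old = cpu_per_tuple(100)     # 0.26, close to the observed 0.29
new = cpu_per_tuple(10_000)  # 25.01, matching "more than 25.0"

print((new - old) * tuples_fetched)  # ~990,000 extra cost units
```

With ~40k tuples fetched, the procost change alone accounts for roughly the 1M cost difference between the two plans.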
>
> This however makes the index scan too expensive, and makes the optimizer
> pick a parallel query with a bitmap index scan instead, which runs in
> about 1000 ms, while the original plan (after setting the right enable_*
> GUCs) runs in ~10 ms.
>
> I agree the idea of commit 3ddc11483b might be right in principle, but I
> wonder if it went a bit too far - maybe the jump from 100 to 10000 is too
> much? I searched the postgis-devel archives for an explanation of why
> this is the right increase, but I haven't found anything.
>
> Of course, I can do
>
>     ALTER FUNCTION  st_within(geometry,geometry) COST 100
>
> to revert to the original cost, but doing that on every PostGIS install
> seems a bit impractical. Furthermore, there seem to be ~120 functions
> with this cost, and I'm not sure I want to tweak all of them.
>
> Opinions?
>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
