[postgis-devel] cost of some functions (e.g. st_within) might be too high after 3ddc11483b

Tomas Vondra tomas.vondra at 2ndquadrant.com
Mon Sep 26 09:10:16 PDT 2022


On 9/26/22 11:31, Raúl Marín wrote:
> Hi,
> 
> The main problem is that the cost of functions are a static value (same
> for all inputs) and that makes it impossible for Postgis to set it to a
> value that's ok for all inputs. For example, st_within could be a really
> fast operation (if the bounding box don't fit), it could be a point in
> polygon calculation inside Postgis or more complex operation with a
> multigeometry inside another multigeometry in GEOS. Add this to the fact
> that it might or might not hit several caches to speed up the process
> (detoasting tuples or preparing the geometries in GEOS) and the
> possibilities are endless.
> 

That's true - the function costing logic is limited, mostly because the
amount of information available during planning is limited. We don't
know which particular values we'll see, so we can't know how long the
values will be, if they intersect with the bounding box, etc.

But isn't that a very different argument from what the commit 3ddc11483b
talks about. That only mentions inlining vs. not inlining.

> To support something remotely close to this, Postgis would need to
> calculate and report costs to Postgresql based on statistics (geometry
> types, geometry sizes and so on) and, apart from being quite a bit of
> work it's not possible right now AFAIK. The last time I looked into this
> I came up with a guesstimation based on whether deserialization happened
> or not and how fast or slow the operation was against a baseline, but I
> only did it for unary functions and left variadic ones, like st_within,
> untouched. You can see it https://github.com/postgis/postgis/pull/556.
> 

Thanks.

> If somebody comes up with a way to improve this in a generic way, that'd
> be great.
> 

Not sure what improvement you have in mind. I see PostGIS is already
using the planner support function, and we have SupportRequestCost so I
guess it should be possible to tweak the costing based on e.g. average
column length, or take a closer look on the MCV values (but the really
large ones won't be there, so maybe that's futile).

I don't think we can do much more, because there's not much reliable
information available :-(

> 
> At some points I added some tweaks to functions costs in
> https://github.com/postgis/postgis/pull/556. The problem with
> 

I guess this got sent too early?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


More information about the postgis-devel mailing list