[postgis-devel] COSTS

Paul Ramsey pramsey at cleverelephant.ca
Mon Mar 11 10:54:06 PDT 2019


Hm, looks like we’ll end up hoist on our own petard: if we go with high costs we can force parallel plans for larger more complex objects but at the expense of simple ones and vice versa. Potentially end up having to go All The Way and also implement dynamic costing in conjunction with a stats gatherer that also looks up average object size.

P.

> On Mar 11, 2019, at 8:16 AM, Raúl Marín Rodríguez <rmrodriguez at carto.com> wrote:
> 
>> What led you to feel those costs were “too high”.?
> 
> In sort, benchmarks. The longer explanation below:
> 
> I followed a simple process to detect these cases. I have a benchmark
> that runs 70-80 cases (different queries, different nº of geometries,
> different geometry types) so I run it with the default config and then
> disabling parallelism, and then checked which cases were faster when
> parallelism was disabled.
> 
> Once I had those cases, I checked the plans and verified that they
> were using parallel workers too soon. In both cases the geometries
> were single points (simplify is a no-op and intersects a fast pip) so
> reducing the cost made sense and removed the parallelism for those
> plans. This change didn't have any impact in the benchmarks with lines
> and polygons but ideally we'd need to test with an even bigger amount
> of cases.
> 
> The queries and plans are:
> 
> * https://gist.github.com/Algunenano/ff13511295d0548f4ea6f705dc700d36
> 
> In this case the geometries are all points, and the amount of point
> inside the bbox are 5104. Sequential plan takes 12ms, the parallel one
> takes 33ms.
> 
> * https://gist.github.com/Algunenano/d8767f178f7bc3826f1fc8ddfd291e29
> 
> In this case, having a high cost ST_Intersects makes that part of the
> plan parallel and the query takes 48 ms vs 27 without parallelism.
> The geometries left after the filter are 4623 points.
> 
> 
> 
> 
> On Mon, Mar 11, 2019 at 3:32 PM Paul Ramsey <pramsey at cleverelephant.ca> wrote:
>> 
>> Yeah, I think we’re going to be on the horns of a dilemma… we practically have to trick the planner into doing what we want…
>> One thing I remain unable to trick the planner into doing is a spatial join. Takes a cost of a million on ST_Intersects() to get a spatial join, on a pretty large could relations. Other stuff works, but it still takes quite high costs, like to get ST_Area to kick a parallel plan in takes a cost of 100.
>> 
>> What led you to feel those costs were “too high”.?
>> 
>> P
>> 
>>> On Mar 11, 2019, at 3:45 AM, Raúl Marín Rodríguez <rmrodriguez at carto.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Recently I've been doing some tests with CARTO's costs in Postgis and
>>> found that some of the costs we had were too high in some cases (bit
>>> point datasets):
>>> - ST_Simplify: I reduced it from 512 to 64.
>>> - ST_Intersects: I reduced it from 512 to 128.
>>> 
>>> They are now setup with cost _COST_HIGH (10000) which seems really
>>> excesive, but I'll wait until I can test with PG12 + trunk to test it.
>>> 
>>> On Fri, Mar 8, 2019 at 10:54 PM Paul Ramsey <pramsey at cleverelephant.ca> wrote:
>>>> 
>>>> So, FYI for those working against PG12, I have added in costing, and upped the costs for functions in PG12 substantially. So, let’s see if (a) something breaks around planning and/or (b) parallelism improves with these new costs?
>>>> 
>>>> https://trac.osgeo.org/postgis/ticket/4344
>>>> 
>>>> The knobs for moving costs around globally are in
>>>> 
>>>> https://github.com/postgis/postgis/blob/239f2286989135b9a3156ce7789f45f09fcf0d3b/postgis/sqldefines.h.in#L24-L41
>>>> 
>>>> P.
>>>> _______________________________________________
>>>> postgis-devel mailing list
>>>> postgis-devel at lists.osgeo.org
>>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>> 
>>> 
>>> 
>>> --
>>> Raúl Marín Rodríguez
>>> carto.com
>>> _______________________________________________
>>> postgis-devel mailing list
>>> postgis-devel at lists.osgeo.org
>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>> 
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
> 
> 
> 
> --
> Raúl Marín Rodríguez
> carto.com
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel



More information about the postgis-devel mailing list