[PostGIS] #5984: PostGIS selectivity is screwing up queries and forcing it to choose a spatial index when it's inappropriate
PostGIS
trac at osgeo.org
Thu Sep 11 12:25:51 PDT 2025
#5984: PostGIS selectivity is screwing up queries and forcing it to choose a
spatial index when it's inappropriate
-----------------------+---------------------------
Reporter: robe | Owner: pramsey
Type: defect | Status: new
Priority: critical | Milestone: PostGIS 3.4.5
Component: postgis | Version: 3.5.x
Resolution: | Keywords:
-----------------------+---------------------------
Comment (by pramsey):
So I threw it into the debugger and watched, and two fact emerge.
* First, the estimator is exiting early and returning 0.0 because the
query box does not even intersect the bounds of the stats histogram. This
is because we are pruning the size of the histogram by removing outliers
from the stats sample.
* Second, if I force the function to return a non-zero estimate, the
planner kicks out the good plan. This does not work for arbitrarily small
non-zero estimates though. An estimate of 0.00001 works, but an estimate
of 0.0000001 does not.
So, I would say that the estimator is in fact "mostly right" and I am not
sure how much to put a thumb on its scale.
I have been thinking that in place of a histogram, with all the complex
tuning we use to build it, a kdtree or rtree might work better,
encapsulating the variability of density into a structure of finite size.
Hard call though. I suppose another hack would be to take all the samples
that are outliers, and put them in an "outlier count" field, and then
divide that by the sample size, and any query box that falls outside the
histogram gets that estimate. Or there both a histogram bounds and a
sample bounds, and if you're outside the sample bounds you get a zero, but
if you're outside the histogram bounds you get the outlier proportion.
More hacks on the hacks.
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5984#comment:6>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list