[PostGIS] #5984: PostGIS selectivity is screwing up queries and forcing it to choose a spatial index when it's inappropriate

PostGIS trac at osgeo.org
Thu Sep 11 12:25:51 PDT 2025


#5984: PostGIS selectivity is screwing up queries and forcing it to choose a
spatial index when it's inappropriate
-----------------------+---------------------------
  Reporter:  robe      |      Owner:  pramsey
      Type:  defect    |     Status:  new
  Priority:  critical  |  Milestone:  PostGIS 3.4.5
 Component:  postgis   |    Version:  3.5.x
Resolution:            |   Keywords:
-----------------------+---------------------------
Comment (by pramsey):

 So I threw it into the debugger and watched, and two fact emerge.
 * First, the estimator is exiting early and returning 0.0 because the
 query box does not even intersect the bounds of the stats histogram. This
 is because we are pruning the size of the histogram by removing outliers
 from the stats sample.
 * Second, if I force the function to return a non-zero estimate, the
 planner kicks out the good plan. This does not work for arbitrarily small
 non-zero estimates though. An estimate of 0.00001 works, but an estimate
 of 0.0000001 does not.
 So, I would say that the estimator is in fact "mostly right" and I am not
 sure how much to put a thumb on its scale.
 I have been thinking that in place of a histogram, with all the complex
 tuning we use to build it, a kdtree or rtree might work better,
 encapsulating the variability of density into a structure of finite size.
 Hard call though. I suppose another hack would be to take all the samples
 that are outliers, and put them in an "outlier count" field, and then
 divide that by the sample size, and any query box that falls outside the
 histogram gets that estimate. Or there both a histogram bounds and a
 sample bounds, and if you're outside the sample bounds you get a zero, but
 if you're outside the histogram bounds you get the outlier proportion.
 More hacks on the hacks.
-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5984#comment:6>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list