[PostGIS] #5984: PostGIS selectivity is screwing up queries and forcing it to choose a spatial index when it's inappropriate

PostGIS trac at osgeo.org
Wed Sep 17 18:48:28 PDT 2025


#5984: PostGIS selectivity is screwing up queries and forcing it to choose a
spatial index when it's inappropriate
-----------------------+---------------------------
  Reporter:  robe      |      Owner:  pramsey
      Type:  defect    |     Status:  new
  Priority:  critical  |  Milestone:  PostGIS 3.4.5
 Component:  postgis   |    Version:  3.5.x
Resolution:            |   Keywords:
-----------------------+---------------------------
Comment (by robe):

 Replying to [comment:9 Bas Couwenberg]:
 > Replying to [comment:7 robe]:
 > > and the explain plan for both ST_Contains and ST_Intersects are
 decent, don't use the spatial index
 >
 > Not sure why you're using ST_Intersects. ST_Contains is used to remove
 results of nodes outside the bounding box as I'm only interested in
 connected ways within the province/municipality/settlement being
 processed. Ways outside the bounding box find their way into the dataset
 because a part of their ways are within the bounding box and all nodes in
 those ways are downloaded to prevent incomplete ways.


 We were using ST_Intersects because the config of operator && has always
 used  gserialized_gist_sel_2d, gserialized_gist_joinsel_2d
 even in 3.3, so in theory you should have seen the same poor behavior in
 3.3 with ST_Intersects. If you didn't then it would suggest something
 wrong with our selectivity calcs in later versions.

 Anyway Bas can you run this query in your bad db


 {{{
 SELECT _postgis_stats('nodes', 'geom','2') ;
 }}}


 As mentioned I get this which as you can see has a very bizarre size
 element of "size":[-2147483648,-2147483648]


 {{{
 {"ndims":2,"size":[-2147483648,-2147483648],"extent":{"min":[3.29076,50.7371],"max":[7.2349,53.565]},"table_features":133392880,"sample_features":30000,
 "not_null_features":30000,"histogram_features":30000,"histogram_cells":0,"cells_covered":0}
 }}}

 I had given Paul the full dataset and he couldn't replicate the issue in
 his PostgreSQL 15/17 3.5.

 But I just noticed my full dataset in my 3.4 / PostgreSQL 16 I just
 reloaded doesn't  have the same bizarre number and the selectivity while
 not 0.7 is not 0 either.

 {{{
 SELECT postgis_full_version(), version();
 SELECT _postgis_stats('test_nodes_full', 'geom','2');
 SELECT _postgis_selectivity ('test_nodes_full', 'geom',
 ST_GeomFromEWKT('SRID=4326;POLYGON((6.11982 52.6121955,6.11982
 53.2038323,7.0927397 53.2038323,7.0927397 52.6121955,6.11982
 52.6121955))'), '2');
 }}}


 {{{
 POSTGIS="3.4.2 c19ce56" [EXTENSION] PGSQL="160" GEOS="3.15.0dev-
 CAPI-1.21.0" (compiled against GEOS 3.12.1) PROJ="9.6.0
 NETWORK_ENABLED=OFF URL_ENDPOINT=https://cdn.proj.org
 USER_WRITABLE_DIRECTORY=/tmp/proj DATABASE_PATH=/usr/share/proj/proj.db"
 LIBXML="2.9.14" LIBJSON="0.18" LIBPROTOBUF="1.5.1" WAGYU="0.5.0
 (Internal)" PostgreSQL 16.10 (Debian
 16.10-2.pgdg13+~20250911.0703.g93b6b46) on x86_64-pc-linux-gnu, compiled
 by gcc (Debian 14.2.0-19) 14.2.0, 64-bit

 {"ndims":2,"size":[1,1],"extent":{"min":[3.34909,50.7187],"max":[7.23163,53.7212]},"table_features":133455600,"sample_features":30000,"not_null_features":30000,"histogram_features":30000,"histogram_cells":1,"cells_covered":30000}

 0.04937831217447917
 }}}

 I dropped my other dbs so will need to reload them.
 The only thing really special about the 3.6 db I have is that I loaded it
 with your OSM script.  So I'm wondering if maybe the OSM is hitting on
 something that is corrupting the indexes in some way or screwing up the
 selectivity stats. Maybe the order of the data load perhaps

 Anyway I'm going to create a fresh 3.6 db and copy the data over to it and
 see if it exhibits the same bizarre behavior as my OSM loaded db.
-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5984#comment:10>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list