[postgis-tickets] [PostGIS] #3739: ST_Within Not using index

PostGIS trac at osgeo.org
Fri Apr 28 07:15:12 PDT 2017


#3739: ST_Within Not using index
--------------------------+---------------------------
  Reporter:  postgispaul  |      Owner:  pramsey
      Type:  defect       |     Status:  new
  Priority:  medium       |  Milestone:  PostGIS 2.3.3
 Component:  postgis      |    Version:  2.3.x
Resolution:               |   Keywords:
--------------------------+---------------------------

Comment (by postgispaul):

 Here are the results:
 {{{
 explain analyze
 SELECT  ap.tags->'ISO3166-2' iso,s.admin_level admlev, s.way, s.osm_id
 FROM planet_osm_polygon s
 JOIN planet_osm_polygon ap on ST_Within(s.way,ap.way)
 WHERE  ap.tags?'ISO3166-2' AND ap.tags->'ISO3166-2' in ('US-NJ')  AND
 s.admin_level ~ '^\d+$';

 Nested Loop  (cost=0.27..9.08 rows=1 width=128) (actual
 time=3.006..849.464 rows=488 loops=1)
   ->  Index Scan using polygon_iso3166_2_idx on planet_osm_polygon ap
 (cost=0.13..2.34 rows=1 width=64) (actual time=0.014..0.017 rows=1
 loops=1)
         Index Cond: ((tags -> 'ISO3166-2'::text) = 'US-NJ'::text)
   ->  Index Scan using planet_osm_polygon_index on planet_osm_polygon s
 (cost=0.15..6.73 rows=1 width=96) (actual time=2.864..839.118 rows=488
 loops=1)
         Index Cond: (ap.way ~ way)
         Filter: (((admin_level)::text ~ '^\d+$'::text) AND
 _st_contains(ap.way, way))
         Rows Removed by Filter: 3124
 Planning time: 0.317 ms
 Execution time: 851.518 ms
 }}}

 and
 {{{
 explain analyze
 SELECT  ap.tags->'ISO3166-2' iso,s.admin_level admlev, s.way, s.osm_id
 FROM planet_osm_polygon s
 JOIN planet_osm_polygon ap on ap.way ~ s.way and ST_Within(s.way,ap.way)
 WHERE  ap.tags?'ISO3166-2' AND ap.tags->'ISO3166-2' in ('US-NJ')  AND
 s.admin_level ~ '^\d+$';

 Nested Loop  (cost=0.27..4.98 rows=1 width=128) (actual
 time=2.895..842.255 rows=488 loops=1)
   ->  Index Scan using polygon_iso3166_2_idx on planet_osm_polygon ap
 (cost=0.13..2.34 rows=1 width=64) (actual time=0.013..0.016 rows=1
 loops=1)
         Index Cond: ((tags -> 'ISO3166-2'::text) = 'US-NJ'::text)
   ->  Index Scan using planet_osm_polygon_index on planet_osm_polygon s
 (cost=0.15..2.62 rows=1 width=96) (actual time=2.721..832.200 rows=488
 loops=1)
         Index Cond: ((ap.way ~ way) AND (ap.way ~ way))
         Filter: (((admin_level)::text ~ '^\d+$'::text) AND
 _st_contains(ap.way, way))
         Rows Removed by Filter: 3124
 Planning time: 0.333 ms
 Execution time: 844.151 ms
 }}}

 So results are different at my end, for this case there doesn't seem to be
 an issue.
 I did reduce the random_page_cost to 1 (all previous queries have been run
 with this cost).
 Changing the random_page_cost to the default 4 gave me this:
 {{{
 Nested Loop  (cost=4.30..28.67 rows=1 width=128) (actual
 time=4.821..873.925 rows=488 loops=1)
   ->  Index Scan using polygon_iso3166_2_idx on planet_osm_polygon ap
 (cost=0.13..8.14 rows=1 width=64) (actual time=0.075..0.079 rows=1
 loops=1)
         Index Cond: ((tags -> 'ISO3166-2'::text) = 'US-NJ'::text)
   ->  Bitmap Heap Scan on planet_osm_polygon s  (cost=4.18..20.51 rows=1
 width=96) (actual time=4.705..863.589 rows=488 loops=1)
         Recheck Cond: (ap.way ~ way)
         Filter: (((admin_level)::text ~ '^\d+$'::text) AND
 _st_contains(ap.way, way))
         Rows Removed by Filter: 3124
         Heap Blocks: exact=1098
         ->  Bitmap Index Scan on planet_osm_polygon_index
 (cost=0.00..4.18 rows=4 width=0) (actual time=1.424..1.424 rows=3612
 loops=1)
               Index Cond: (ap.way ~ way)
 Planning time: 0.332 ms
 Execution time: 876.048 ms
 }}}
 and
 {{{
 Nested Loop  (cost=0.27..16.58 rows=1 width=128) (actual
 time=2.963..854.142 rows=488 loops=1)
   ->  Index Scan using polygon_iso3166_2_idx on planet_osm_polygon ap
 (cost=0.13..8.14 rows=1 width=64) (actual time=0.015..0.017 rows=1
 loops=1)
         Index Cond: ((tags -> 'ISO3166-2'::text) = 'US-NJ'::text)
   ->  Index Scan using planet_osm_polygon_index on planet_osm_polygon s
 (cost=0.15..8.42 rows=1 width=96) (actual time=2.798..844.017 rows=488
 loops=1)
         Index Cond: ((ap.way ~ way) AND (ap.way ~ way))
         Filter: (((admin_level)::text ~ '^\d+$'::text) AND
 _st_contains(ap.way, way))
         Rows Removed by Filter: 3124
 Planning time: 0.307 ms
 Execution time: 856.015 ms
 }}}
 so apart from the random_page_cost: no differences between our servers.

 regarding your remarks

   If your bounding boxes are big enough (which is a especially big problem
 when using long lat projection and comparing countries), it could dismiss
 using an index entirely because it thinks its useless, never even bothers
 costing. The extra ~ would just add some extra incentive to make it
 consider the plan. pramsey can describe the innards better but I do think
 the histogram is used in some shape or form for the planner to decide if a
 plan is even worth considering.

 The bbox of south korea should be selective enough for the polygons of the
 whole world, I even set the analyze parameter (the one stating how many
 records to fetch for statistics) to the max.

 And as for the extra ~ : I thought it wasn't possible to influence the
 planner (other than disabling specific actions)?

 Note also that if the extra ~ is added, that condition is added twice to
 the index condition

--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/3739#comment:21>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list