[postgis-tickets] [PostGIS] #3739: ST_Within Not using index
PostGIS
trac at osgeo.org
Fri Apr 28 07:15:12 PDT 2017
#3739: ST_Within Not using index
--------------------------+---------------------------
Reporter: postgispaul | Owner: pramsey
Type: defect | Status: new
Priority: medium | Milestone: PostGIS 2.3.3
Component: postgis | Version: 2.3.x
Resolution: | Keywords:
--------------------------+---------------------------
Comment (by postgispaul):
Here are the results:
{{{
explain analyze
SELECT ap.tags->'ISO3166-2' iso,s.admin_level admlev, s.way, s.osm_id
FROM planet_osm_polygon s
JOIN planet_osm_polygon ap on ST_Within(s.way,ap.way)
WHERE ap.tags?'ISO3166-2' AND ap.tags->'ISO3166-2' in ('US-NJ') AND
s.admin_level ~ '^\d+$';
Nested Loop (cost=0.27..9.08 rows=1 width=128) (actual
time=3.006..849.464 rows=488 loops=1)
-> Index Scan using polygon_iso3166_2_idx on planet_osm_polygon ap
(cost=0.13..2.34 rows=1 width=64) (actual time=0.014..0.017 rows=1
loops=1)
Index Cond: ((tags -> 'ISO3166-2'::text) = 'US-NJ'::text)
-> Index Scan using planet_osm_polygon_index on planet_osm_polygon s
(cost=0.15..6.73 rows=1 width=96) (actual time=2.864..839.118 rows=488
loops=1)
Index Cond: (ap.way ~ way)
Filter: (((admin_level)::text ~ '^\d+$'::text) AND
_st_contains(ap.way, way))
Rows Removed by Filter: 3124
Planning time: 0.317 ms
Execution time: 851.518 ms
}}}
and
{{{
explain analyze
SELECT ap.tags->'ISO3166-2' iso,s.admin_level admlev, s.way, s.osm_id
FROM planet_osm_polygon s
JOIN planet_osm_polygon ap on ap.way ~ s.way and ST_Within(s.way,ap.way)
WHERE ap.tags?'ISO3166-2' AND ap.tags->'ISO3166-2' in ('US-NJ') AND
s.admin_level ~ '^\d+$';
Nested Loop (cost=0.27..4.98 rows=1 width=128) (actual
time=2.895..842.255 rows=488 loops=1)
-> Index Scan using polygon_iso3166_2_idx on planet_osm_polygon ap
(cost=0.13..2.34 rows=1 width=64) (actual time=0.013..0.016 rows=1
loops=1)
Index Cond: ((tags -> 'ISO3166-2'::text) = 'US-NJ'::text)
-> Index Scan using planet_osm_polygon_index on planet_osm_polygon s
(cost=0.15..2.62 rows=1 width=96) (actual time=2.721..832.200 rows=488
loops=1)
Index Cond: ((ap.way ~ way) AND (ap.way ~ way))
Filter: (((admin_level)::text ~ '^\d+$'::text) AND
_st_contains(ap.way, way))
Rows Removed by Filter: 3124
Planning time: 0.333 ms
Execution time: 844.151 ms
}}}
So results are different at my end, for this case there doesn't seem to be
an issue.
I did reduce the random_page_cost to 1 (all previous queries have been run
with this cost).
Changing the random_page_cost to the default 4 gave me this:
{{{
Nested Loop (cost=4.30..28.67 rows=1 width=128) (actual
time=4.821..873.925 rows=488 loops=1)
-> Index Scan using polygon_iso3166_2_idx on planet_osm_polygon ap
(cost=0.13..8.14 rows=1 width=64) (actual time=0.075..0.079 rows=1
loops=1)
Index Cond: ((tags -> 'ISO3166-2'::text) = 'US-NJ'::text)
-> Bitmap Heap Scan on planet_osm_polygon s (cost=4.18..20.51 rows=1
width=96) (actual time=4.705..863.589 rows=488 loops=1)
Recheck Cond: (ap.way ~ way)
Filter: (((admin_level)::text ~ '^\d+$'::text) AND
_st_contains(ap.way, way))
Rows Removed by Filter: 3124
Heap Blocks: exact=1098
-> Bitmap Index Scan on planet_osm_polygon_index
(cost=0.00..4.18 rows=4 width=0) (actual time=1.424..1.424 rows=3612
loops=1)
Index Cond: (ap.way ~ way)
Planning time: 0.332 ms
Execution time: 876.048 ms
}}}
and
{{{
Nested Loop (cost=0.27..16.58 rows=1 width=128) (actual
time=2.963..854.142 rows=488 loops=1)
-> Index Scan using polygon_iso3166_2_idx on planet_osm_polygon ap
(cost=0.13..8.14 rows=1 width=64) (actual time=0.015..0.017 rows=1
loops=1)
Index Cond: ((tags -> 'ISO3166-2'::text) = 'US-NJ'::text)
-> Index Scan using planet_osm_polygon_index on planet_osm_polygon s
(cost=0.15..8.42 rows=1 width=96) (actual time=2.798..844.017 rows=488
loops=1)
Index Cond: ((ap.way ~ way) AND (ap.way ~ way))
Filter: (((admin_level)::text ~ '^\d+$'::text) AND
_st_contains(ap.way, way))
Rows Removed by Filter: 3124
Planning time: 0.307 ms
Execution time: 856.015 ms
}}}
so apart from the random_page_cost: no differences between our servers.
regarding your remarks
If your bounding boxes are big enough (which is a especially big problem
when using long lat projection and comparing countries), it could dismiss
using an index entirely because it thinks its useless, never even bothers
costing. The extra ~ would just add some extra incentive to make it
consider the plan. pramsey can describe the innards better but I do think
the histogram is used in some shape or form for the planner to decide if a
plan is even worth considering.
The bbox of south korea should be selective enough for the polygons of the
whole world, I even set the analyze parameter (the one stating how many
records to fetch for statistics) to the max.
And as for the extra ~ : I thought it wasn't possible to influence the
planner (other than disabling specific actions)?
Note also that if the extra ~ is added, that condition is added twice to
the index condition
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/3739#comment:21>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list