[postgis-tickets] [PostGIS] #5186: PostGIS performance regressions on identical queries
PostGIS
trac at osgeo.org
Tue Jul 12 04:56:08 PDT 2022
#5186: PostGIS performance regressions on identical queries
---------------------+---------------------------
Reporter: dracos | Owner: pramsey
Type: defect | Status: new
Priority: medium | Milestone: PostGIS 3.2.2
Component: postgis | Version: 3.1.x
Keywords: |
---------------------+---------------------------
Hi, we have recently upgraded a server from PostgreSQL 9.6 and PostGIS
2.3.1 (debian stretch) to PostgreSQL 13 and PostGIS 3.1.1 (debian
bullseye). Everything went well (thanks :) ), but two queries we had both
started performing worse.
The first case involved a CTE that (since PostgreSQL 12, see
https://paquier.xyz/postgresql-2/postgres-12-with-materialize/ ) was being
'inlined' - adding MATERIALIZED restored the old fast behaviour. The
second did not involve a CTE, but rewriting the query so that it did use
one (and again, with MATERIALIZED) got the performance back to what it was
previously.
Below I provide query plans for that second non-CTE query on old and new
versions. The reason I'm opening this here rather than PostgreSQL might
well be wrong, apologies if so, but I wasn't sure what was involved in
deciding what the query planner did, and I wondered if there was something
PostGIS could do in this situation - in the last two query plans, the
difference appears to be (again, sorry if misunderstood) st_transform
isn't called once up front but is called every iteration. The cost is far
higher for the non-materialized version.
PostgreSQL 9.6 query plan:
{{{
mapit=# explain analyze
SELECT * FROM "mapit_postcode"
WHERE (
ST_CoveredBy(location, ST_Transform((select ST_Collect(polygon) from
mapit_geometry where area_id=2577 group by area_id), 4326))
AND
location && ST_Transform((select ST_Collect(polygon) from mapit_geometry
where area_id=2577 group by area_id), 4326)
)
LIMIT 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=25.19..38.88 rows=1 width=43) (actual time=32.082..32.082
rows=1 loops=1)
InitPlan 1 (returns $0)
-> GroupAggregate (cost=0.42..12.27 rows=2 width=36) (actual
time=0.570..0.570 rows=1 loops=1)
Group Key: mapit_geometry.area_id
-> Index Scan using mapit_geometry_area_id on mapit_geometry
(cost=0.42..12.23 rows=2 width=13783) (actual time=0.007..0.007 rows=1
loops=1)
Index Cond: (area_id = 2577)
InitPlan 2 (returns $1)
-> GroupAggregate (cost=0.42..12.27 rows=2 width=36) (actual
time=0.282..0.282 rows=1 loops=1)
Group Key: mapit_geometry_1.area_id
-> Index Scan using mapit_geometry_area_id on mapit_geometry
mapit_geometry_1 (cost=0.42..12.23 rows=2 width=13783) (actual
time=0.019..0.020 rows=1 loops=1)
Index Cond: (area_id = 2577)
-> Index Scan using postcodes_postcode_location_id on mapit_postcode
(cost=0.66..849.44 rows=62 width=43) (actual time=32.080..32.080 rows=1
loops=1)
Index Cond: (location && st_transform($1, 4326))
Filter: st_coveredby(location, st_transform($0, 4326))
Planning time: 0.428 ms
Execution time: 36.386 ms
(16 rows)
}}}
PostgreSQL 13 query plan on the same query:
{{{
mapit=# explain analyze
SELECT * FROM "mapit_postcode"
WHERE (
ST_CoveredBy(location, ST_Transform((select ST_Collect(polygon) from
mapit_geometry where area_id=2577 group by area_id), 4326))
AND
location && ST_Transform((select ST_Collect(polygon) from mapit_geometry
where area_id=2577 group by area_id), 4326)
)
LIMIT 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=90.23..148.25 rows=1 width=43) (actual
time=9843.067..9843.072 rows=1 loops=1)
InitPlan 1 (returns $0)
-> GroupAggregate (cost=0.42..19.91 rows=3 width=36) (actual
time=1.473..1.475 rows=1 loops=1)
Group Key: mapit_geometry.area_id
-> Index Scan using mapit_geometry_area_id on mapit_geometry
(cost=0.42..15.75 rows=3 width=13437) (actual time=0.012..0.016 rows=1
loops=1)
Index Cond: (area_id = 2577)
InitPlan 2 (returns $1)
-> GroupAggregate (cost=0.42..19.91 rows=3 width=36) (actual
time=1.354..1.355 rows=1 loops=1)
Group Key: mapit_geometry_1.area_id
-> Index Scan using mapit_geometry_area_id on mapit_geometry
mapit_geometry_1 (cost=0.42..15.75 rows=3 width=13437) (actual
time=0.026..0.029 rows=1 loops=1)
Index Cond: (area_id = 2577)
-> Index Scan using postcodes_postcode_location_id on mapit_postcode
(cost=50.41..108.43 rows=1 width=43) (actual time=9843.065..9843.065
rows=1 loops=1)
Index Cond: ((location @ st_transform($0, 4326)) AND (location &&
st_transform($1, 4326)))
Filter: st_coveredby(location, st_transform($0, 4326))
Rows Removed by Filter: 406
Planning Time: 0.294 ms
Execution Time: 9844.084 ms
(17 rows)
}}}
Here is PostgreSQL 13 on a non-materialized CTE version of the same query:
{{{
mapit=# explain analyze WITH target AS (
SELECT ST_Transform((select ST_Collect(polygon) from mapit_geometry
where area_id=2577 group by area_id), 4326) AS polygon )
SELECT "mapit_postcode"."id", "mapit_postcode"."postcode",
"mapit_postcode"."location"::bytea
FROM mapit_postcode, target WHERE ST_CoveredBy(location, target.polygon)
LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=45.32..583.80 rows=1 width=43) (actual
time=9297.474..9297.479 rows=1 loops=1)
InitPlan 1 (returns $0)
-> GroupAggregate (cost=0.42..19.91 rows=3 width=36) (actual
time=2.005..2.008 rows=1 loops=1)
Group Key: mapit_geometry.area_id
-> Index Scan using mapit_geometry_area_id on mapit_geometry
(cost=0.42..15.75 rows=3 width=13437) (actual time=0.026..0.029 rows=1
loops=1)
Index Cond: (area_id = 2577)
-> Index Scan using postcodes_postcode_location_id on mapit_postcode
(cost=25.41..105027.91 rows=195 width=43) (actual time=9297.473..9297.473
rows=1 loops=1)
Index Cond: (location @ st_transform($0, 4326))
Filter: st_coveredby(location, st_transform($0, 4326))
Rows Removed by Filter: 406
Planning Time: 0.330 ms
Execution Time: 9297.806 ms
(12 rows)
}}}
And here is the PostgreSQL 13 query plan on my rewritten query:
{{{
mapit=# explain analyze WITH target AS MATERIALIZED (
SELECT ST_Transform((select ST_Collect(polygon) from mapit_geometry
where area_id=2577 group by area_id), 4326) AS polygon )
SELECT "mapit_postcode"."id", "mapit_postcode"."postcode",
"mapit_postcode"."location"::bytea
FROM mapit_postcode, target WHERE ST_CoveredBy(location, target.polygon)
LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=120.01..148.00 rows=1 width=43) (actual time=48.101..48.107
rows=1 loops=1)
CTE target
-> Result (cost=19.91..44.92 rows=1 width=32) (actual
time=39.050..39.053 rows=1 loops=1)
InitPlan 1 (returns $0)
-> GroupAggregate (cost=0.42..19.91 rows=3 width=36)
(actual time=0.986..0.988 rows=1 loops=1)
Group Key: mapit_geometry.area_id
-> Index Scan using mapit_geometry_area_id on
mapit_geometry (cost=0.42..15.75 rows=3 width=13437) (actual
time=0.014..0.017 rows=1 loops=1)
Index Cond: (area_id = 2577)
-> Nested Loop (cost=75.09..54659.02 rows=1950 width=43) (actual
time=48.099..48.101 rows=1 loops=1)
-> CTE Scan on target (cost=0.00..0.02 rows=1 width=32) (actual
time=39.171..39.171 rows=1 loops=1)
-> Bitmap Heap Scan on mapit_postcode (cost=75.09..54413.30
rows=195 width=43) (actual time=8.909..8.910 rows=1 loops=1)
Filter: st_coveredby(location, target.polygon)
Heap Blocks: exact=1
-> Bitmap Index Scan on postcodes_postcode_location_id
(cost=0.00..75.04 rows=1950 width=0) (actual time=8.355..8.355 rows=29757
loops=1)
Index Cond: (location @ target.polygon)
Planning Time: 0.299 ms
Execution Time: 48.335 ms
(17 rows)
}}}
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5186>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list