[postgis-tickets] [PostGIS] #5186: PostGIS performance regressions on identical queries

PostGIS trac at osgeo.org
Tue Jul 12 04:56:08 PDT 2022


#5186: PostGIS performance regressions on identical queries
---------------------+---------------------------
 Reporter:  dracos   |      Owner:  pramsey
     Type:  defect   |     Status:  new
 Priority:  medium   |  Milestone:  PostGIS 3.2.2
Component:  postgis  |    Version:  3.1.x
 Keywords:           |
---------------------+---------------------------
 Hi, we have recently upgraded a server from PostgreSQL 9.6 and PostGIS
 2.3.1 (debian stretch) to PostgreSQL 13 and PostGIS 3.1.1 (debian
 bullseye). Everything went well (thanks :) ), but two queries we had both
 started performing worse.

 The first case involved a CTE that (since PostgreSQL 12, see
 https://paquier.xyz/postgresql-2/postgres-12-with-materialize/ ) was being
 'inlined' - adding MATERIALIZED restored the old fast behaviour. The
 second did not involve a CTE, but rewriting the query so that it did use
 one (and again, with MATERIALIZED) got the performance back to what it was
 previously.

 Below I provide query plans for that second non-CTE query on old and new
 versions. The reason I'm opening this here rather than PostgreSQL might
 well be wrong, apologies if so, but I wasn't sure what was involved in
 deciding what the query planner did, and I wondered if there was something
 PostGIS could do in this situation - in the last two query plans, the
 difference appears to be (again, sorry if misunderstood) st_transform
 isn't called once up front but is called every iteration. The cost is far
 higher for the non-materialized version.

 PostgreSQL 9.6 query plan:

 {{{
 mapit=# explain analyze
 SELECT * FROM "mapit_postcode"
 WHERE (
 ST_CoveredBy(location, ST_Transform((select ST_Collect(polygon) from
 mapit_geometry where area_id=2577 group by area_id), 4326))
 AND
 location && ST_Transform((select ST_Collect(polygon) from mapit_geometry
 where area_id=2577 group by area_id), 4326)
 )
 LIMIT 1;
 QUERY PLAN
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=25.19..38.88 rows=1 width=43) (actual time=32.082..32.082
 rows=1 loops=1)
    InitPlan 1 (returns $0)
      ->  GroupAggregate  (cost=0.42..12.27 rows=2 width=36) (actual
 time=0.570..0.570 rows=1 loops=1)
            Group Key: mapit_geometry.area_id
            ->  Index Scan using mapit_geometry_area_id on mapit_geometry
 (cost=0.42..12.23 rows=2 width=13783) (actual time=0.007..0.007 rows=1
 loops=1)
                  Index Cond: (area_id = 2577)
    InitPlan 2 (returns $1)
      ->  GroupAggregate  (cost=0.42..12.27 rows=2 width=36) (actual
 time=0.282..0.282 rows=1 loops=1)
            Group Key: mapit_geometry_1.area_id
            ->  Index Scan using mapit_geometry_area_id on mapit_geometry
 mapit_geometry_1  (cost=0.42..12.23 rows=2 width=13783) (actual
 time=0.019..0.020 rows=1 loops=1)
                  Index Cond: (area_id = 2577)
    ->  Index Scan using postcodes_postcode_location_id on mapit_postcode
 (cost=0.66..849.44 rows=62 width=43) (actual time=32.080..32.080 rows=1
 loops=1)
          Index Cond: (location && st_transform($1, 4326))
          Filter: st_coveredby(location, st_transform($0, 4326))
  Planning time: 0.428 ms
  Execution time: 36.386 ms
 (16 rows)

 }}}

 PostgreSQL 13 query plan on the same query:

 {{{
 mapit=# explain analyze
 SELECT * FROM "mapit_postcode"
 WHERE (
 ST_CoveredBy(location, ST_Transform((select ST_Collect(polygon) from
 mapit_geometry where area_id=2577 group by area_id), 4326))
 AND
 location && ST_Transform((select ST_Collect(polygon) from mapit_geometry
 where area_id=2577 group by area_id), 4326)
 )
 LIMIT 1;
 QUERY PLAN
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=90.23..148.25 rows=1 width=43) (actual
 time=9843.067..9843.072 rows=1 loops=1)
    InitPlan 1 (returns $0)
      ->  GroupAggregate  (cost=0.42..19.91 rows=3 width=36) (actual
 time=1.473..1.475 rows=1 loops=1)
            Group Key: mapit_geometry.area_id
            ->  Index Scan using mapit_geometry_area_id on mapit_geometry
 (cost=0.42..15.75 rows=3 width=13437) (actual time=0.012..0.016 rows=1
 loops=1)
                  Index Cond: (area_id = 2577)
    InitPlan 2 (returns $1)
      ->  GroupAggregate  (cost=0.42..19.91 rows=3 width=36) (actual
 time=1.354..1.355 rows=1 loops=1)
            Group Key: mapit_geometry_1.area_id
            ->  Index Scan using mapit_geometry_area_id on mapit_geometry
 mapit_geometry_1  (cost=0.42..15.75 rows=3 width=13437) (actual
 time=0.026..0.029 rows=1 loops=1)
                  Index Cond: (area_id = 2577)
    ->  Index Scan using postcodes_postcode_location_id on mapit_postcode
 (cost=50.41..108.43 rows=1 width=43) (actual time=9843.065..9843.065
 rows=1 loops=1)
          Index Cond: ((location @ st_transform($0, 4326)) AND (location &&
 st_transform($1, 4326)))
          Filter: st_coveredby(location, st_transform($0, 4326))
          Rows Removed by Filter: 406
  Planning Time: 0.294 ms
  Execution Time: 9844.084 ms
 (17 rows)
 }}}

 Here is PostgreSQL 13 on a non-materialized CTE version of the same query:

 {{{
 mapit=# explain analyze WITH target AS  (
     SELECT ST_Transform((select ST_Collect(polygon) from mapit_geometry
             where area_id=2577 group by area_id), 4326) AS polygon )
 SELECT "mapit_postcode"."id", "mapit_postcode"."postcode",
 "mapit_postcode"."location"::bytea
   FROM mapit_postcode, target WHERE ST_CoveredBy(location, target.polygon)
 LIMIT 1;
 QUERY PLAN
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=45.32..583.80 rows=1 width=43) (actual
 time=9297.474..9297.479 rows=1 loops=1)
    InitPlan 1 (returns $0)
      ->  GroupAggregate  (cost=0.42..19.91 rows=3 width=36) (actual
 time=2.005..2.008 rows=1 loops=1)
            Group Key: mapit_geometry.area_id
            ->  Index Scan using mapit_geometry_area_id on mapit_geometry
 (cost=0.42..15.75 rows=3 width=13437) (actual time=0.026..0.029 rows=1
 loops=1)
                  Index Cond: (area_id = 2577)
    ->  Index Scan using postcodes_postcode_location_id on mapit_postcode
 (cost=25.41..105027.91 rows=195 width=43) (actual time=9297.473..9297.473
 rows=1 loops=1)
          Index Cond: (location @ st_transform($0, 4326))
          Filter: st_coveredby(location, st_transform($0, 4326))
          Rows Removed by Filter: 406
  Planning Time: 0.330 ms
  Execution Time: 9297.806 ms
 (12 rows)
 }}}


 And here is the PostgreSQL 13 query plan on my rewritten query:

 {{{
 mapit=# explain analyze WITH target AS MATERIALIZED (
     SELECT ST_Transform((select ST_Collect(polygon) from mapit_geometry
             where area_id=2577 group by area_id), 4326) AS polygon )
 SELECT "mapit_postcode"."id", "mapit_postcode"."postcode",
 "mapit_postcode"."location"::bytea
   FROM mapit_postcode, target WHERE ST_CoveredBy(location, target.polygon)
 LIMIT 1;
 QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=120.01..148.00 rows=1 width=43) (actual time=48.101..48.107
 rows=1 loops=1)
    CTE target
      ->  Result  (cost=19.91..44.92 rows=1 width=32) (actual
 time=39.050..39.053 rows=1 loops=1)
            InitPlan 1 (returns $0)
              ->  GroupAggregate  (cost=0.42..19.91 rows=3 width=36)
 (actual time=0.986..0.988 rows=1 loops=1)
                    Group Key: mapit_geometry.area_id
                    ->  Index Scan using mapit_geometry_area_id on
 mapit_geometry  (cost=0.42..15.75 rows=3 width=13437) (actual
 time=0.014..0.017 rows=1 loops=1)
                          Index Cond: (area_id = 2577)
    ->  Nested Loop  (cost=75.09..54659.02 rows=1950 width=43) (actual
 time=48.099..48.101 rows=1 loops=1)
          ->  CTE Scan on target  (cost=0.00..0.02 rows=1 width=32) (actual
 time=39.171..39.171 rows=1 loops=1)
          ->  Bitmap Heap Scan on mapit_postcode  (cost=75.09..54413.30
 rows=195 width=43) (actual time=8.909..8.910 rows=1 loops=1)
                Filter: st_coveredby(location, target.polygon)
                Heap Blocks: exact=1
                ->  Bitmap Index Scan on postcodes_postcode_location_id
 (cost=0.00..75.04 rows=1950 width=0) (actual time=8.355..8.355 rows=29757
 loops=1)
                      Index Cond: (location @ target.polygon)
  Planning Time: 0.299 ms
  Execution Time: 48.335 ms
 (17 rows)
 }}}
-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5186>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list