[postgis-users] Problem with Nested Loop in a query with postgis
Regina Obe
lr at pcorp.us
Mon Jul 17 12:40:18 PDT 2023
I looked at our list of items between 3.1.7 and 3.1.8
https://git.osgeo.org/gitea/postgis/postgis/src/branch/stable-3.1/NEWS
and can’t find any that would cause what you describe below.
It’s possible whatever was changed we didn’t consider strong enough to put in the news.
I haven’t checked the git repo yet to see what changed.
Usually these switches in plans are often triggered by changes in costs of the functions or just data and it is possible we adjusted some of the costs on these functions.
I was thinking the 10 to 14 switch might have been caused by the function instrumentation changes.
But that wouldn’t explain your experiencing the same issue on 14 3.1.7 vs 14 3.1.8
If you still have both running, can you give us the output
SELECT postgis_full_version(), version();
So we can confirm we are truly comparing only a difference in PostGIS version and not the libraries used or PostgreSQL minor version.
Thanks,
Regina
From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Anderson Mallmann
Sent: Monday, July 17, 2023 10:23 AM
To: postgis-users at lists.osgeo.org; lviecelli199 at gmail.com; torres.glauco at gmail.com
Subject: [postgis-users] Problem with Nested Loop in a query with postgis
Hi,
Here at the company I work for, we upgraded a postgres database from version 10 to 14 and postgis 2.5.3 to 3.3.3 and in a specific query we noticed a huge performance decrease between these versions. This query was running less than 1s and now it takes more than 50s.
Now we are trying to figure out what the real problem is and in which version this started to happen. Some of our discoveries so far:
1) The query runs fine until postgis 3.1.7
2) In postgis 3.1.8 the query execution changed from Merge Semi Join to Nested Loop Semi Join.
3) Running this query on postgis >= 3.1.8 disabling nested loop with “set enable_nestloop = false” the query runs as before without any problem.
We don’t know why the plan changed in postgis >= 3.1.8 to use Nested Loop, but even in earlier versions if any query like this uses nested loop, the performance drops.
Query:
SELECT pols.pol_id, pols.type, kml, json_build_object('type', 'Feature', 'geometry', st_asgeojson(pols.geom)::jsonb) as geoJson, pols.status pol_status, pols.owner_id pol_owner, pols.details, pols.creation_date as pol_creation_date, pols.country as country, pols.state as state, pols.city as city, pols.label as label FROM polygons pols WHERE pols.owner_id is null AND pols.status = 'ACTIVE' AND pols.type in ('REGION') AND pols.details->>'state' in (SELECT s.state FROM states s WHERE ST_DWITHIN(Geography(ST_Transform(s.geom,4326)), ST_GeographyFromText('POINT(-43.113826 -22.9022)'),'15000')) AND (ST_DWithin(ST_GeographyFromText('POINT(-43.113826 -22.9022)'), Geography(pols.geom),'15000')) AND (ST_MaxDistance(ST_GeomFromText('POINT(-43.113826 -22.9022)', 4326), pols.geom) * 111195) <= '15000'
Here are the explains for this query:
pg 14 - postgis 3.1.7
Merge Semi Join (cost=15.83..16.04 rows=1 width=263) (actual time=529.041..586.241 rows=416 loops=1)
Output: pols.pol_id, pols.type, pols.kml, json_build_object('type', 'Feature', 'geometry', (st_asgeojson(pols.geom, 9, 8))::jsonb), pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label
Merge Cond: (((pols.details ->> 'state'::text)) = (s.state)::text)
Buffers: shared hit=3193
-> Sort (cost=4.49..4.50 rows=1 width=4656) (actual time=244.333..244.367 rows=416 loops=1)
Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label, ((pols.details ->> 'state'::text))
Sort Key: ((pols.details ->> 'state'::text))
Sort Method: quicksort Memory: 1314kB
Buffers: shared hit=2131
-> Index Scan using polygons_geography_state_idx on region.polygons pols (cost=0.28..4.48 rows=1 width=4656) (actual time=10.207..243.079 rows=416 loops=1)
Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label, (pols.details ->> 'state'::text)
Index Cond: (geography(pols.geom) && '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography)
Filter: (('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography && _st_expand(geography(pols.geom), '15000'::double precision)) AND _st_dwithin('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, geography(pols.geom), '15000'::double precision, true) AND ((_st_maxdistance('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geometry, st_convexhull(pols.geom)) * '111195'::double precision) <= '15000'::double precision))
Rows Removed by Filter: 395
Buffers: shared hit=2125
-> Sort (cost=11.34..11.36 rows=9 width=3) (actual time=284.570..284.570 rows=1 loops=1)
Output: s.state
Sort Key: s.state
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1062
-> Seq Scan on region.states s (cost=0.00..11.19 rows=9 width=3) (actual time=180.759..284.564 rows=1 loops=1)
Output: s.state
Filter: st_dwithin(geography(st_transform(s.geom, 4326)), '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, '15000'::double precision, true)
Rows Removed by Filter: 26
Buffers: shared hit=1062
Query Identifier: -6178717669455780324
Planning:
Buffers: shared hit=481
Planning Time: 31.139 ms
Execution Time: 586.869 ms
-----------
pg 14 - postgis 3.1.8
Nested Loop Semi Join (cost=0.67..111.50 rows=1 width=263) (actual time=179.622..59918.477 rows=416 loops=1)
Output: pols.pol_id, pols.type, pols.kml, json_build_object('type', 'Feature', 'geometry', (st_asgeojson(pols.geom, 9, 8))::jsonb), pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label
Join Filter: ((pols.details ->> 'state'::text) = (s.state)::text)
Buffers: shared hit=72116 read=1075
-> Index Scan using polygons_geography_state_idx on region.polygons pols (cost=0.41..55.77 rows=1 width=4656) (actual time=9.477..221.758 rows=416 loops=1)
Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label, pols.last_updated
Index Cond: (geography(pols.geom) && _st_expand('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, '15000'::double precision))
Filter: (st_dwithin('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, geography(pols.geom), '15000'::double precision, true) AND ((_st_maxdistance('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geometry, st_convexhull(pols.geom)) * '111195'::double precision) <= '15000'::double precision))
Rows Removed by Filter: 395
Buffers: shared hit=269 read=692
-> Index Scan using states_geography_idx on region.states s (cost=0.26..54.37 rows=1 width=3) (actual time=142.676..142.676 rows=1 loops=416)
Output: s.state_code, s.state, s.country, s.geom
Index Cond: (geography(st_transform(s.geom, 4326)) && _st_expand('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, '15000'::double precision))
Filter: st_dwithin(geography(st_transform(s.geom, 4326)), '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, '15000'::double precision, true)
Rows Removed by Filter: 1
Buffers: shared hit=70988 read=171
Query Identifier: -6178717669455780324
Planning:
Buffers: shared hit=401 read=45
Planning Time: 30.729 ms
Execution Time: 59919.343 ms
Regards,
Anderson M.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20230717/da260059/attachment.htm>
More information about the postgis-users
mailing list