[postgis-users] Problem with Nested Loop in a query with postgis

Anderson Mallmann a.a.mallmann at gmail.com
Thu Jul 20 14:13:21 PDT 2023


I restored a backup of this database in pg10 2.5.3 and updated postgis to
3.0.7, at first the query used the Merge Semi Join and the performance was
fine, so I force to use Nested Loop and perforce was dropped.
So i think our problem was that postgres changed the plan for this
query and st_dwithin(geography, geography) in loop didn't perform well, but
I still don't understand why postgres is choosing to run this query with
Nested Loop.

I did other tests with a subset of the data and with a similar query using
nested loop the performance was not good in 2.5 or 3.0.
I uploaded the subset to this repository
https://github.com/aamallmann/postgis-region-db if you want to test.

Em seg., 17 de jul. de 2023 às 16:40, Regina Obe <lr at pcorp.us> escreveu:

> I looked at our list of items between 3.1.7 and 3.1.8
>
>
>
> https://git.osgeo.org/gitea/postgis/postgis/src/branch/stable-3.1/NEWS
>
>
>
> and can’t find any that would cause what you describe below.
>
> It’s possible whatever was changed we didn’t consider strong enough to put
> in the news.
>
>
>
> I haven’t checked the git repo yet to see what changed.
>
>
>
> Usually these switches in plans are often triggered by changes in costs of
> the functions or just data and it is possible we adjusted some of the costs
> on these functions.
>
>
>
> I was thinking the 10 to 14 switch might have been caused by the function
> instrumentation changes.
>
> But that wouldn’t explain your experiencing the same issue on 14 3.1.7 vs
> 14 3.1.8
>
>
>
> If you still have both running, can you give us the output
>
>
>
> SELECT postgis_full_version(), version();
>
>
>
> So we can confirm we are truly comparing only a difference in PostGIS
> version and not the libraries used or PostgreSQL minor version.
>
>
>
> Thanks,
>
> Regina
>
>
>
>
>
> *From:* postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] *On
> Behalf Of *Anderson Mallmann
> *Sent:* Monday, July 17, 2023 10:23 AM
> *To:* postgis-users at lists.osgeo.org; lviecelli199 at gmail.com;
> torres.glauco at gmail.com
> *Subject:* [postgis-users] Problem with Nested Loop in a query with
> postgis
>
>
>
> Hi,
>
> Here at the company I work for, we upgraded a postgres database from
> version 10 to 14 and postgis 2.5.3 to 3.3.3 and in a specific query we
> noticed a huge performance decrease between these versions. This query was
> running less than 1s and now it takes more than 50s.
> Now we are trying to figure out what the real problem is and in which
> version this started to happen. Some of our discoveries so far:
>
> 1) The query runs fine until postgis 3.1.7
> 2) In postgis 3.1.8 the query execution changed from Merge Semi Join to
> Nested Loop Semi Join.
> 3) Running this query on postgis >= 3.1.8 disabling nested loop with “set
> enable_nestloop = false” the query runs as before without any problem.
>
> We don’t know why the plan changed in postgis >= 3.1.8 to use Nested Loop,
> but even in earlier versions if any query like this uses nested loop, the
> performance drops.
>
> Query:
> SELECT pols.pol_id, pols.type, kml, json_build_object('type', 'Feature',
> 'geometry', st_asgeojson(pols.geom)::jsonb) as geoJson, pols.status
> pol_status, pols.owner_id pol_owner, pols.details, pols.creation_date as
> pol_creation_date, pols.country as country, pols.state as state, pols.city
> as city, pols.label as label FROM polygons pols WHERE pols.owner_id is null
> AND pols.status = 'ACTIVE' AND pols.type in ('REGION') AND
> pols.details->>'state' in (SELECT s.state FROM states s WHERE
> ST_DWITHIN(Geography(ST_Transform(s.geom,4326)),
> ST_GeographyFromText('POINT(-43.113826 -22.9022)'),'15000')) AND
> (ST_DWithin(ST_GeographyFromText('POINT(-43.113826 -22.9022)'),
> Geography(pols.geom),'15000')) AND
> (ST_MaxDistance(ST_GeomFromText('POINT(-43.113826 -22.9022)', 4326),
> pols.geom) * 111195) <= '15000'
>
> Here are the explains for this query:
>
> pg 14 - postgis 3.1.7
>  Merge Semi Join  (cost=15.83..16.04 rows=1 width=263) (actual
> time=529.041..586.241 rows=416 loops=1)
>    Output: pols.pol_id, pols.type, pols.kml, json_build_object('type',
> 'Feature', 'geometry', (st_asgeojson(pols.geom, 9, 8))::jsonb),
> pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country,
> pols.state, pols.city, pols.label
>    Merge Cond: (((pols.details ->> 'state'::text)) = (s.state)::text)
>    Buffers: shared hit=3193
>    ->  Sort  (cost=4.49..4.50 rows=1 width=4656) (actual
> time=244.333..244.367 rows=416 loops=1)
>          Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status,
> pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state,
> pols.city, pols.label, ((pols.details ->> 'state'::text))
>          Sort Key: ((pols.details ->> 'state'::text))
>          Sort Method: quicksort  Memory: 1314kB
>          Buffers: shared hit=2131
>          ->  Index Scan using polygons_geography_state_idx on
> region.polygons pols  (cost=0.28..4.48 rows=1 width=4656) (actual
> time=10.207..243.079 rows=416 loops=1)
>                Output: pols.pol_id, pols.type, pols.kml, pols.geom,
> pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country,
> pols.state, pols.city, pols.label, (pols.details ->> 'state'::text)
>                Index Cond: (geography(pols.geom) &&
> '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography)
>                Filter:
> (('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography &&
> _st_expand(geography(pols.geom), '15000'::double precision)) AND
> _st_dwithin('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography,
> geography(pols.geom), '15000'::double precision, true) AND
> ((_st_maxdistance('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geometry,
> st_convexhull(pols.geom)) * '111195'::double precision) <= '15000'::double
> precision))
>                Rows Removed by Filter: 395
>                Buffers: shared hit=2125
>    ->  Sort  (cost=11.34..11.36 rows=9 width=3) (actual
> time=284.570..284.570 rows=1 loops=1)
>          Output: s.state
>          Sort Key: s.state
>          Sort Method: quicksort  Memory: 25kB
>          Buffers: shared hit=1062
>          ->  Seq Scan on region.states s  (cost=0.00..11.19 rows=9
> width=3) (actual time=180.759..284.564 rows=1 loops=1)
>                Output: s.state
>                Filter: st_dwithin(geography(st_transform(s.geom, 4326)),
> '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography,
> '15000'::double precision, true)
>                Rows Removed by Filter: 26
>                Buffers: shared hit=1062
>  Query Identifier: -6178717669455780324
>  Planning:
>    Buffers: shared hit=481
>  Planning Time: 31.139 ms
>  Execution Time: 586.869 ms
>
> -----------
>
> pg 14 - postgis 3.1.8
>  Nested Loop Semi Join  (cost=0.67..111.50 rows=1 width=263) (actual
> time=179.622..59918.477 rows=416 loops=1)
>    Output: pols.pol_id, pols.type, pols.kml, json_build_object('type',
> 'Feature', 'geometry', (st_asgeojson(pols.geom, 9, 8))::jsonb),
> pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country,
> pols.state, pols.city, pols.label
>    Join Filter: ((pols.details ->> 'state'::text) = (s.state)::text)
>    Buffers: shared hit=72116 read=1075
>    ->  Index Scan using polygons_geography_state_idx on region.polygons
> pols  (cost=0.41..55.77 rows=1 width=4656) (actual time=9.477..221.758
> rows=416 loops=1)
>          Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status,
> pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state,
> pols.city, pols.label, pols.last_updated
>          Index Cond: (geography(pols.geom) &&
> _st_expand('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography,
> '15000'::double precision))
>          Filter:
> (st_dwithin('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography,
> geography(pols.geom), '15000'::double precision, true) AND
> ((_st_maxdistance('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geometry,
> st_convexhull(pols.geom)) * '111195'::double precision) <= '15000'::double
> precision))
>          Rows Removed by Filter: 395
>          Buffers: shared hit=269 read=692
>    ->  Index Scan using states_geography_idx on region.states s
>  (cost=0.26..54.37 rows=1 width=3) (actual time=142.676..142.676 rows=1
> loops=416)
>          Output: s.state_code, s.state, s.country, s.geom
>          Index Cond: (geography(st_transform(s.geom, 4326)) &&
> _st_expand('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography,
> '15000'::double precision))
>          Filter: st_dwithin(geography(st_transform(s.geom, 4326)),
> '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography,
> '15000'::double precision, true)
>          Rows Removed by Filter: 1
>          Buffers: shared hit=70988 read=171
>  Query Identifier: -6178717669455780324
>  Planning:
>    Buffers: shared hit=401 read=45
>  Planning Time: 30.729 ms
>  Execution Time: 59919.343 ms
>
> Regards,
> Anderson M.
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20230720/8aa01985/attachment.htm>


More information about the postgis-users mailing list