[postgis-users] Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3 upgrade
Tom Kazimiers
tom at voodoo-arts.net
Tue Feb 4 13:01:45 PST 2020
>> Thanks for the report. I've ticketed as a bug -
>> https://trac.osgeo.org/postgis/ticket/4635
Thanks for filing the ticket Regina!
On Tue, Feb 04, 2020 at 11:49:44AM -0800, Paul Ramsey wrote:
>OK, cleaning this out a little.
>Ticket for reference https://trac.osgeo.org/postgis/ticket/4635#comment:2
"Good" to know that you seem to be able to reproduce this slow-down.
>First, drop all the rest of your testing SQL and just test the contents
>of the bb_edge CTE. That seems to be sufficient, and it’s the only
>spatial part.
Good point, I should have restricted it to that from beginning.
>Second, for testing 3.0, drop the use of the &&& operator and see what happens with just the ST_DWithin3D function call, which should be throwing an index op in there implicitly.
>
>SELECT Count(*) FROM (
> SELECT te.id, te.parent_id
> FROM treenode_edge te
> WHERE ST_3DDWithin(te.edge, ST_MakePolygon(ST_MakeLine(ARRAY[
> ST_MakePoint(471548.0, 290140.0, 160420.0),
> ST_MakePoint(542460.0, 290140.0, 160420.0),
> ST_MakePoint(542460.0, 330140.0, 160420.0),
> ST_MakePoint(471548.0, 330140.0, 160420.0),
> ST_MakePoint(471548.0, 290140.0, 160420.0)]::geometry[])),
> 20.0)
>) a;
>
>We are definitely seeing different plans in that CTE.
This is what I get in the test database with 12/3:
Aggregate (cost=506.38..506.41 rows=1 width=8) (actual time=1.530..1.530 rows=1 loops=1)
Buffers: shared hit=198
-> Index Scan using treenode_edge_gix on treenode_edge te (cost=0.53..506.29 rows=33 width=0) (actual time=0.168..1.512 rows=143 loops=1)
Index Cond: (edge &&& st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
Filter: ((project_id = 1) AND st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
Buffers: shared hit=198
Planning Time: 0.654 ms
Execution Time: 1.583 ms
Dropping the &&& operator still allows the use of the index it seems and
is noticeably faster. (In my original query, I used both &&& and
ST_3DDWithin, because I wanted to reduce the number of false positives
that I would get with only one of them in some densely populated areas.
The &&& would make sure every result object bounding box intersects with
the query bounding box and the ST_3DDWithin would ensure I don't get
edges that are actually only passing by somewhere close, but not
actually intersecting the field of view. The line/geometry used with
ST_3DDWithin is in fact a flat cut through the center of the query
bounding box. Therefore I believe I still need the &&& in production,
unfortunately.)
This is what I get for the above query with the test database in the
11/2.5 setup:
Finalize Aggregate (cost=10302.14..10302.17 rows=1 width=8) (actual time=43.245..43.245 rows=1 loops=1)
Buffers: shared hit=3768
-> Gather (cost=10301.91..10302.14 rows=2 width=8) (actual time=43.125..47.544 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=3768
-> Partial Aggregate (cost=9301.91..9301.94 rows=1 width=8) (actual time=27.190..27.190 rows=1 loops=3)
Buffers: shared hit=3768
-> Parallel Bitmap Heap Scan on treenode_edge te (cost=693.46..9299.66 rows=899 width=0) (actual time=4.058..27.175 rows=48 loops=3)
Recheck Cond: (edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
Filter: ((project_id = 1) AND ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
Rows Removed by Filter: 10800
Heap Blocks: exact=2171
Buffers: shared hit=3768
-> Bitmap Index Scan on treenode_edge_2d_gist (cost=0.00..692.92 rows=32352 width=0) (actual time=5.430..5.430 rows=32543 loops=1)
Index Cond: (edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
Buffers: shared hit=261
Planning Time: 22.836 ms
Execution Time: 47.885 ms
>We are definitely seeing different plans in that CTE.
Out of curiosity, what is the plan you are seeing?
>Testing for 12/3:
>
>12/3 is picking an index scan on the spatial index, which is turning out to be slower.
>You can adjust the COST of the ST_3DDWithin down: default cost is 10000.
>Where does the plan change as you adjust it down?
>ALTER FUNCTION ST_3DDWithin COST 9000;
Once I set it to 608 and lower, this is what I get:
Aggregate (cost=59.66..59.69 rows=1 width=8) (actual time=1.436..1.436 rows=1 loops=1)
Buffers: shared hit=183
-> Bitmap Heap Scan on treenode_edge te (cost=2.18..59.58 rows=33 width=0) (actual time=1.055..1.418 rows=143 loops=1)
Filter: ((project_id = 1) AND st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
Heap Blocks: exact=119
Buffers: shared hit=183
-> Bitmap Index Scan on treenode_edge_gix (cost=0.00..2.17 rows=19 width=0) (actual time=1.027..1.027 rows=143 loops=1)
Index Cond: (edge &&& st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
Buffers: shared hit=64
Planning Time: 0.703 ms
Execution Time: 1.516 ms
The planner doesn't come up with any other plan if I set it to anything
lower than 608.
>A big change between 2.5 and 3.0 was costing on spatial functions, so
>seeing different plans is not surprising, especially since the
>mechanisms for building the plan have changed so much (SQL wrapper
>functions vs Pg12 support functions). It’s possible that with the cost
>of the function set so high, and actually being seen by the planner
>now, we’re getting an index scan that, in this case, is less efficient
>than the alternative.
I see, thanks for the explanation. In my the 12/3 version of my
production database (50 million edges), I can't reproduce the cost based
plan change from above though. No matter how low I set the function cost
there, I always get the spatial scan in the plan.
Cheers,
Tom
More information about the postgis-users
mailing list