[postgis-users] Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3 upgrade

Tom Kazimiers tom at voodoo-arts.net
Tue Feb 4 13:01:45 PST 2020


>> Thanks for the report.  I've ticketed as a bug - 
>> https://trac.osgeo.org/postgis/ticket/4635

Thanks for filing the ticket Regina!

On Tue, Feb 04, 2020 at 11:49:44AM -0800, Paul Ramsey wrote:
>OK, cleaning this out a little.
>Ticket for reference https://trac.osgeo.org/postgis/ticket/4635#comment:2

"Good" to know that you seem to be able to reproduce this slow-down.

>First, drop all the rest of your testing SQL and just test the contents 
>of the bb_edge CTE. That seems to be sufficient, and it’s the only 
>spatial part.

Good point, I should have restricted it to that from beginning.

>Second, for testing 3.0, drop the use of the &&& operator and see what happens with just the ST_DWithin3D function call, which should be throwing an index op in there implicitly.
>
>SELECT Count(*) FROM (
> SELECT te.id, te.parent_id
>    FROM treenode_edge te
>    WHERE ST_3DDWithin(te.edge, ST_MakePolygon(ST_MakeLine(ARRAY[
>        ST_MakePoint(471548.0,  290140.0,    160420.0),
>        ST_MakePoint(542460.0, 290140.0,    160420.0),
>        ST_MakePoint(542460.0, 330140.0, 160420.0),
>        ST_MakePoint(471548.0,  330140.0, 160420.0),
>        ST_MakePoint(471548.0,  290140.0,    160420.0)]::geometry[])),
>        20.0)
>) a;
>
>We are definitely seeing different plans in that CTE.

This is what I get in the test database with 12/3:

  Aggregate  (cost=506.38..506.41 rows=1 width=8) (actual time=1.530..1.530 rows=1 loops=1)
    Buffers: shared hit=198
    ->  Index Scan using treenode_edge_gix on treenode_edge te  (cost=0.53..506.29 rows=33 width=0) (actual time=0.168..1.512 rows=143 loops=1)
          Index Cond: (edge &&& st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
          Filter: ((project_id = 1) AND st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
          Buffers: shared hit=198
  Planning Time: 0.654 ms
  Execution Time: 1.583 ms

Dropping the &&& operator still allows the use of the index it seems and 
is noticeably faster. (In my original query, I used both &&& and 
ST_3DDWithin, because I wanted to reduce the number of false positives 
that I would get with only one of them in some densely populated areas.  
The &&& would make sure every result object bounding box intersects with 
the query bounding box and the ST_3DDWithin would ensure I don't get 
edges that are actually only passing by somewhere close, but not 
actually intersecting the field of view. The line/geometry used with 
ST_3DDWithin is in fact a flat cut through the center of the query 
bounding box. Therefore I believe I still need the &&& in production, 
unfortunately.)

This is what I get for the above query with the test database in the 
11/2.5 setup:

  Finalize Aggregate  (cost=10302.14..10302.17 rows=1 width=8) (actual time=43.245..43.245 rows=1 loops=1)
    Buffers: shared hit=3768
    ->  Gather  (cost=10301.91..10302.14 rows=2 width=8) (actual time=43.125..47.544 rows=3 loops=1)
          Workers Planned: 2
          Workers Launched: 2
          Buffers: shared hit=3768
          ->  Partial Aggregate  (cost=9301.91..9301.94 rows=1 width=8) (actual time=27.190..27.190 rows=1 loops=3)
                Buffers: shared hit=3768
                ->  Parallel Bitmap Heap Scan on treenode_edge te  (cost=693.46..9299.66 rows=899 width=0) (actual time=4.058..27.175 rows=48 loops=3)
                      Recheck Cond: (edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
                      Filter: ((project_id = 1) AND ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
                      Rows Removed by Filter: 10800
                      Heap Blocks: exact=2171
                      Buffers: shared hit=3768
                      ->  Bitmap Index Scan on treenode_edge_2d_gist  (cost=0.00..692.92 rows=32352 width=0) (actual time=5.430..5.430 rows=32543 loops=1)
                            Index Cond: (edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
                            Buffers: shared hit=261
  Planning Time: 22.836 ms
  Execution Time: 47.885 ms

>We are definitely seeing different plans in that CTE.

Out of curiosity, what is the plan you are seeing?

>Testing for 12/3:
>
>12/3 is picking an index scan on the spatial index, which is turning out to be slower.
>You can adjust the COST of the ST_3DDWithin down: default cost is 10000.
>Where does the plan change as you adjust it down?
>ALTER FUNCTION ST_3DDWithin COST 9000;

Once I set it to 608 and lower, this is what I get:

  Aggregate  (cost=59.66..59.69 rows=1 width=8) (actual time=1.436..1.436 rows=1 loops=1)
    Buffers: shared hit=183
    ->  Bitmap Heap Scan on treenode_edge te  (cost=2.18..59.58 rows=33 width=0) (actual time=1.055..1.418 rows=143 loops=1)
          Filter: ((project_id = 1) AND st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
          Heap Blocks: exact=119
          Buffers: shared hit=183
          ->  Bitmap Index Scan on treenode_edge_gix  (cost=0.00..2.17 rows=19 width=0) (actual time=1.027..1.027 rows=143 loops=1)
                Index Cond: (edge &&& st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
                Buffers: shared hit=64
  Planning Time: 0.703 ms
  Execution Time: 1.516 ms

The planner doesn't come up with any other plan if I set it to anything 
lower than 608.

>A big change between 2.5 and 3.0 was costing on spatial functions, so 
>seeing different plans is not surprising, especially since the 
>mechanisms for building the plan have changed so much (SQL wrapper 
>functions vs Pg12 support functions). It’s possible that with the cost 
>of the function set so high, and actually being seen by the planner 
>now, we’re getting an index scan that, in this case, is less efficient 
>than the alternative.

I see, thanks for the explanation. In my the 12/3 version of my 
production database (50 million edges), I can't reproduce the cost based 
plan change from above though. No matter how low I set the function cost 
there, I always get the spatial scan in the plan.

Cheers,
Tom


More information about the postgis-users mailing list