[postgis-users] Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3 upgrade

Tom Kazimiers tom at voodoo-arts.net
Sun Feb 2 21:00:34 PST 2020


Hi Regina and Paul,

On Sun, Feb 02, 2020 at 07:42:06PM -0800, Paul Ramsey wrote:
>> On Feb 2, 2020, at 7:37 PM, Regina Obe <lr at pcorp.us> wrote:
>>
>> If it does, can you try changing the function to your old 11 definition and see if that makes the answers the same.  Then at least we'll know it's the change in definition and can work from there.

Thanks for your replies. As far as I can tell, the definition I have for 
ST_3DDWithin is the first version Regina posted (i.e. PostGIS 3.0):

# \df+ ST_3DDWithin
                                                                                                 List of functions
    Schema |     Name     | Result data type |               Argument data types                | Type | Volatility | Parallel |  Owner   | Security | Access privileges | Language |   Source code    | Description 
   --------+--------------+------------------+--------------------------------------------------+------+------------+----------+----------+----------+-------------------+----------+------------------+-------------
    public | st_3ddwithin | boolean          | geom1 geometry, geom2 geometry, double precision | func | immutable  | safe     | postgres | invoker  |                   | c        | LWGEOM_dwithin3d | 

For the sake of completeness, this is my exact PostGIS version in my Postgres 
12.1 environment:

# select PostGIS_full_version();
                                                                                         postgis_full_version
   -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    POSTGIS="3.0.0 r17983" [EXTENSION] PGSQL="120" GEOS="3.7.1-CAPI-1.11.1 27a5e771" PROJ="Rel. 4.9.2, 08 September 2015" LIBXML="2.9.3" LIBJSON="0.11.99" LIBPROTOBUF="1.2.1" WAGYU="0.4.3 (Internal)"

With this and the default ST_3DDWithin version, my query plan is:

  Limit  (cost=32.79..230.14 rows=54 width=54) (actual time=193.840..202.393 rows=1569 loops=1)
    Buffers: shared hit=14391
    CTE bb_edge
      ->  Index Scan using treenode_edge_gix on treenode_edge te  (cost=0.67..28.71 rows=1 width=16) (actual time=0.809..192.138 rows=1004 loops=1)
            Index Cond: ((edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry) AND (edge &&& st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision)))
            Filter: ((project_id = 1) AND st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
            Rows Removed by Filter: 5
            Buffers: shared hit=6546
    ->  Nested Loop  (cost=4.08..201.42 rows=54 width=54) (actual time=193.837..202.205 rows=1569 loops=1)
          Buffers: shared hit=14391
          ->  HashAggregate  (cost=3.51..5.13 rows=54 width=8) (actual time=193.796..194.101 rows=1570 loops=1)
                Group Key: bb_edge.id
                Buffers: shared hit=6546
                ->  Append  (cost=0.00..3.38 rows=54 width=8) (actual time=0.817..193.016 rows=2060 loops=1)
                      Buffers: shared hit=6546
                      ->  CTE Scan on bb_edge  (cost=0.00..0.06 rows=1 width=8) (actual time=0.816..192.664 rows=1004 loops=1)
                            Buffers: shared hit=6546
                      ->  CTE Scan on bb_edge bb_edge_1  (cost=0.00..0.06 rows=1 width=8) (actual time=0.002..0.130 rows=1004 loops=1)
                      ->  ProjectSet  (cost=0.00..0.83 rows=52 width=8) (actual time=0.008..0.015 rows=52 loops=1)
                            ->  Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
          ->  Index Scan using treenode_id_pkey on treenode t1  (cost=0.56..3.60 rows=1 width=54) (actual time=0.004..0.004 rows=1 loops=1570)
                Index Cond: (id = bb_edge.id)
                Buffers: shared hit=7845
  Planning Time: 1.211 ms
  Execution Time: 202.791 ms

If I run it again, the query time drops by ~15 ms. Once I apply the older
definition of ST_3DDWithin, I can't observe any further change in the timing.
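
For reference, the 2.5-style definition I swapped back in is essentially a
SQL wrapper around the bounding-box checks plus _ST_3DDWithin. This is a
sketch reconstructed from the conditions the planner inlines in the plans
below; the exact 2.5 source text (schema qualification, operator syntax)
may differ slightly:

```sql
-- Sketch of the PostGIS 2.5-era SQL wrapper for ST_3DDWithin,
-- reconstructed from the inlined filter conditions; check the 2.5
-- postgis.sql for the authoritative definition.
CREATE OR REPLACE FUNCTION ST_3DDWithin(geom1 geometry, geom2 geometry, double precision)
RETURNS boolean AS
  'SELECT $1 && ST_Expand($2, $3)
      AND $2 && ST_Expand($1, $3)
      AND _ST_3DDWithin($1, $2, $3)'
LANGUAGE sql IMMUTABLE PARALLEL SAFE;
```

Because it is LANGUAGE sql, the planner inlines the body into the query,
which is why the && and st_expand() conditions show up directly in the
filter of the resulting plan.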

The index condition does indeed change back to a version without the
st_expand(), but timing-wise it doesn't seem to have any real effect:

  Limit  (cost=4483.06..4680.40 rows=54 width=54) (actual time=186.089..194.138 rows=1569 loops=1)
    Buffers: shared hit=14391
    CTE bb_edge
      ->  Index Scan using treenode_edge_gix on treenode_edge te  (cost=0.42..4478.98 rows=1 width=16) (actual time=0.880..184.426 rows=1004 loops=1)
            Index Cond: (edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
            Filter: ((project_id = 1) AND (edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry) AND ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
            Rows Removed by Filter: 5
            Buffers: shared hit=6546
    ->  Nested Loop  (cost=4.08..201.42 rows=54 width=54) (actual time=186.087..193.954 rows=1569 loops=1)
          Buffers: shared hit=14391
          ->  HashAggregate  (cost=3.51..5.13 rows=54 width=8) (actual time=186.051..186.330 rows=1570 loops=1)
                Group Key: bb_edge.id
                Buffers: shared hit=6546
                ->  Append  (cost=0.00..3.38 rows=54 width=8) (actual time=0.888..185.322 rows=2060 loops=1)
                      Buffers: shared hit=6546
                      ->  CTE Scan on bb_edge  (cost=0.00..0.06 rows=1 width=8) (actual time=0.887..184.984 rows=1004 loops=1)
                            Buffers: shared hit=6546
                      ->  CTE Scan on bb_edge bb_edge_1  (cost=0.00..0.06 rows=1 width=8) (actual time=0.002..0.126 rows=1004 loops=1)
                      ->  ProjectSet  (cost=0.00..0.83 rows=52 width=8) (actual time=0.006..0.013 rows=52 loops=1)
                            ->  Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
          ->  Index Scan using treenode_id_pkey on treenode t1  (cost=0.56..3.60 rows=1 width=54) (actual time=0.004..0.004 rows=1 loops=1570)
                Index Cond: (id = bb_edge.id)
                Buffers: shared hit=7845
  Planning Time: 1.004 ms
  Execution Time: 194.332 ms

And for comparison, this is the index scan in the 11/2.5 plan:

      ->  Index Scan using treenode_edge_gix on treenode_edge te  (cost=0.42..344.70 rows=1 width=16) (actual time=0.569..26.662 rows=1004 loops=1)
            Index Cond: (edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
            Filter: ((edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry) AND (project_id = 1) AND ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
            Rows Removed by Filter: 5
            Buffers: shared hit=1616

To me it seems the index condition and the filter are now the same in both 
versions and aren't actually the main source of the change.
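
One thing that might help narrow it down: timing the &&& index scan on its
own, without the distance filter, to see whether the extra ~160 ms is spent
in the index scan itself or in the filter functions. A hypothetical probe
along these lines (substitute the actual 3D line literal from the plans
above for the placeholder):

```sql
-- Hypothetical probe: exercise only the n-D index scan, no distance filter.
-- :query_line stands in for the EWKB line literal used in the plans above.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM treenode_edge
WHERE edge &&& :query_line::geometry;
```

If this alone already shows the 2.5-vs-3.0 difference, the regression would
be in the &&& operator / n-D GiST support rather than in ST_3DDWithin.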

>If this is the case, can you share your data or some mangled version 
>that exhibits the same performance gradient change from 2.5 to 3? We’ll 
>need something to profile to figure out where the extra time is going…

I think this should be possible; I will try to find a small public example 
data set where I can recreate this problem. The one I am working with 
contains unpublished data and might be too big to transfer easily.
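
In case it helps in the meantime, a synthetic stand-in along these lines
might already reproduce the shape of the problem. All names, sizes, and
coordinate ranges here are made up to roughly mimic my data:

```sql
-- Hypothetical synthetic dataset: random short 3D segments plus an
-- n-D GiST index, analogous to the real treenode_edge table.
CREATE TABLE test_edge AS
SELECT i AS id,
       1 AS project_id,
       ST_MakeLine(
         ST_MakePoint(x, y, z),
         ST_MakePoint(x + 100 * random(), y + 100 * random(), z + 10 * random())
       ) AS edge
FROM (
  SELECT i,
         500000 * random() AS x,
         500000 * random() AS y,
         200000 * random() AS z
  FROM generate_series(1, 1000000) AS i
) sub;

CREATE INDEX test_edge_gix ON test_edge USING gist (edge gist_geometry_ops_nd);
ANALYZE test_edge;
```

Running the same ST_3DDWithin query against this table under 11/2.5 and
12/3.0 should show whether the slowdown depends on my particular data.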

If you have any other ideas, I am happy to test them in the meantime.

Cheers,
Tom
