[postgis-users] Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3 upgrade

Regina Obe lr at pcorp.us
Sun Feb 2 23:46:59 PST 2020


It could be the change in logic of _ST_3DDWithin. That was changed to support more 3D types, which might have caused a slowdown elsewhere.
If you can provide a dataset, that would be great. I'll try to do some benchmarks on some 3D sets I have lying around.

Thanks,
Regina

-----Original Message-----
From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Tom Kazimiers
Sent: Monday, February 3, 2020 12:01 AM
To: PostGIS Users Discussion <postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3 upgrade

Hi Regina and Paul,

On Sun, Feb 02, 2020 at 07:42:06PM -0800, Paul Ramsey wrote:
>> On Feb 2, 2020, at 7:37 PM, Regina Obe <lr at pcorp.us> wrote:
>>
>> If it does, can you try changing the function to your old 11 definition and see if that makes the answers the same.  Then at least we'll know it's the change in definition and can work from there.

Thanks for your replies. As far as I can tell, the definition I have for ST_3DDWithin is the first version Regina posted (i.e. PostGIS 3.0):

# \df+ ST_3DDWithin
                                                                                                 List of functions
    Schema |     Name     | Result data type |               Argument data types                | Type | Volatility | Parallel |  Owner   | Security | Access privileges | Language |   Source code    | Description 
   --------+--------------+------------------+--------------------------------------------------+------+------------+----------+----------+----------+-------------------+----------+------------------+-------------
    public | st_3ddwithin | boolean          | geom1 geometry, geom2 geometry, double precision | func | immutable  | safe     | postgres | invoker  |                   | c        | LWGEOM_dwithin3d | 
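
That matches the 3.0 extension script, which (going by the 3.0 sources, so the exact modifiers here are an approximation) installs ST_3DDWithin as a plain C function with a planner support function attached, instead of the old inlined SQL wrapper:

  CREATE OR REPLACE FUNCTION ST_3DDWithin(geom1 geometry, geom2 geometry, double precision)
    RETURNS boolean
    AS 'MODULE_PATHNAME', 'LWGEOM_dwithin3d'
    SUPPORT postgis_index_supportfn  -- rewrites the call into the "&&& st_expand()" index condition seen in the plan below
    LANGUAGE c IMMUTABLE STRICT PARALLEL SAFE;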

For the sake of completeness, this is my exact PostGIS version in my Postgres
12.1 environment:

# select PostGIS_full_version();
                                                                                         postgis_full_version
   -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    POSTGIS="3.0.0 r17983" [EXTENSION] PGSQL="120" GEOS="3.7.1-CAPI-1.11.1 27a5e771" PROJ="Rel. 4.9.2, 08 September 2015" LIBXML="2.9.3" LIBJSON="0.11.99" LIBPROTOBUF="1.2.1" WAGYU="0.4.3 (Internal)"

With this and the default ST_3DDWithin version, my query plan is:

  Limit  (cost=32.79..230.14 rows=54 width=54) (actual time=193.840..202.393 rows=1569 loops=1)
    Buffers: shared hit=14391
    CTE bb_edge
      ->  Index Scan using treenode_edge_gix on treenode_edge te  (cost=0.67..28.71 rows=1 width=16) (actual time=0.809..192.138 rows=1004 loops=1)
            Index Cond: ((edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry) AND (edge &&& st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision)))
            Filter: ((project_id = 1) AND st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
            Rows Removed by Filter: 5
            Buffers: shared hit=6546
    ->  Nested Loop  (cost=4.08..201.42 rows=54 width=54) (actual time=193.837..202.205 rows=1569 loops=1)
          Buffers: shared hit=14391
          ->  HashAggregate  (cost=3.51..5.13 rows=54 width=8) (actual time=193.796..194.101 rows=1570 loops=1)
                Group Key: bb_edge.id
                Buffers: shared hit=6546
                ->  Append  (cost=0.00..3.38 rows=54 width=8) (actual time=0.817..193.016 rows=2060 loops=1)
                      Buffers: shared hit=6546
                      ->  CTE Scan on bb_edge  (cost=0.00..0.06 rows=1 width=8) (actual time=0.816..192.664 rows=1004 loops=1)
                            Buffers: shared hit=6546
                      ->  CTE Scan on bb_edge bb_edge_1  (cost=0.00..0.06 rows=1 width=8) (actual time=0.002..0.130 rows=1004 loops=1)
                      ->  ProjectSet  (cost=0.00..0.83 rows=52 width=8) (actual time=0.008..0.015 rows=52 loops=1)
                            ->  Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
          ->  Index Scan using treenode_id_pkey on treenode t1  (cost=0.56..3.60 rows=1 width=54) (actual time=0.004..0.004 rows=1 loops=1570)
                Index Cond: (id = bb_edge.id)
                Buffers: shared hit=7845
  Planning Time: 1.211 ms
  Execution Time: 202.791 ms

If I run it again, the query time drops by ~15ms. Once I apply the older definition of ST_3DDWithin, I can't observe any further change in the timing.
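
For reference, the older definition I swapped back in is along these lines (a sketch of the 2.5-era wrapper; the packaged version also schema-qualifies the operators and functions):

  CREATE OR REPLACE FUNCTION ST_3DDWithin(geom1 geometry, geom2 geometry, double precision)
    RETURNS boolean
    AS 'SELECT $1 && ST_Expand($2, $3)      -- 2D bbox checks, inlined into the filter below
           AND $2 && ST_Expand($1, $3)
           AND _ST_3DDWithin($1, $2, $3)'   -- exact 3D distance test
    LANGUAGE sql IMMUTABLE PARALLEL SAFE;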

The index condition does indeed change back to a version without the st_expand(), but timing-wise it doesn't seem to have any real effect:

  Limit  (cost=4483.06..4680.40 rows=54 width=54) (actual time=186.089..194.138 rows=1569 loops=1)
    Buffers: shared hit=14391
    CTE bb_edge
      ->  Index Scan using treenode_edge_gix on treenode_edge te  (cost=0.42..4478.98 rows=1 width=16) (actual time=0.880..184.426 rows=1004 loops=1)
            Index Cond: (edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
            Filter: ((project_id = 1) AND (edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry) AND ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
            Rows Removed by Filter: 5
            Buffers: shared hit=6546
    ->  Nested Loop  (cost=4.08..201.42 rows=54 width=54) (actual time=186.087..193.954 rows=1569 loops=1)
          Buffers: shared hit=14391
          ->  HashAggregate  (cost=3.51..5.13 rows=54 width=8) (actual time=186.051..186.330 rows=1570 loops=1)
                Group Key: bb_edge.id
                Buffers: shared hit=6546
                ->  Append  (cost=0.00..3.38 rows=54 width=8) (actual time=0.888..185.322 rows=2060 loops=1)
                      Buffers: shared hit=6546
                      ->  CTE Scan on bb_edge  (cost=0.00..0.06 rows=1 width=8) (actual time=0.887..184.984 rows=1004 loops=1)
                            Buffers: shared hit=6546
                      ->  CTE Scan on bb_edge bb_edge_1  (cost=0.00..0.06 rows=1 width=8) (actual time=0.002..0.126 rows=1004 loops=1)
                      ->  ProjectSet  (cost=0.00..0.83 rows=52 width=8) (actual time=0.006..0.013 rows=52 loops=1)
                            ->  Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
          ->  Index Scan using treenode_id_pkey on treenode t1  (cost=0.56..3.60 rows=1 width=54) (actual time=0.004..0.004 rows=1 loops=1570)
                Index Cond: (id = bb_edge.id)
                Buffers: shared hit=7845
  Planning Time: 1.004 ms
  Execution Time: 194.332 ms

And for comparison, this is the index scan in the 11/2.5 plan:

      ->  Index Scan using treenode_edge_gix on treenode_edge te  (cost=0.42..344.70 rows=1 width=16) (actual time=0.569..26.662 rows=1004 loops=1)
            Index Cond: (edge &&& '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
            Filter: ((edge && '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry) AND (project_id = 1) AND ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, '20'::double precision))
            Rows Removed by Filter: 5
            Buffers: shared hit=1616

To me it seems the index condition and the filter are now the same in both versions, so they aren't actually the main source of the change.
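
One more thing I can try to narrow it down further (a sketch; :'line' and :'bbox' are just psql placeholders for the long geometry hex literals above, set via \set):

  -- Bare ND index probe:
  EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM treenode_edge
    WHERE edge &&& :'line'::geometry;

  -- Probe plus the exact distance test, to see how much time
  -- _ST_3DDWithin itself adds on 3.0 compared to 2.5:
  EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM treenode_edge
    WHERE edge &&& :'line'::geometry
      AND _ST_3DDWithin(edge, :'bbox'::geometry, 20);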

>If this is the case, can you share your data or some mangled version
>that exhibits the same performance gradient change from 2.5 to 3? We'll
>need something to profile to figure out where the extra time is going.

I think this should be possible; I'll try to find a small public example dataset where I can recreate this problem. The one I am working with contains unpublished data and might be too big to transfer easily.
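
As a stopgap, maybe a synthetic set already shows the difference? Something along these lines (a sketch; table name, row count, and extents are made up to roughly mirror the shape of my edge table):

  CREATE TABLE edge_bench AS
    SELECT id,
           ST_MakeLine(
             ST_MakePoint(random() * 100000, random() * 100000, random() * 50000),
             ST_MakePoint(random() * 100000, random() * 100000, random() * 50000)
           )::geometry(LineStringZ) AS edge
    FROM generate_series(1, 1000000) AS id;

  -- ND GiST index, analogous to treenode_edge_gix in the plans above:
  CREATE INDEX edge_bench_gix ON edge_bench USING gist (edge gist_geometry_ops_nd);

  EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*)
    FROM edge_bench
    WHERE ST_3DDWithin(edge, ST_MakePoint(50000, 50000, 25000), 100);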

If you have any other ideas, I am happy to test them in the meantime.

Cheers,
Tom
_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/postgis-users


