[postgis-tickets] [PostGIS] #4635: ST_3DDwithin speed slow down between 2.5 and 3.0

PostGIS trac at osgeo.org
Tue Feb 4 09:11:11 PST 2020


#4635: ST_3DDwithin speed slow down between 2.5 and 3.0
----------------------+---------------------------
  Reporter:  robe     |      Owner:  pramsey
      Type:  defect   |     Status:  new
  Priority:  high     |  Milestone:  PostGIS 3.0.1
 Component:  postgis  |    Version:  3.0.x
Resolution:           |   Keywords:
----------------------+---------------------------
Description changed by robe:

Old description:

> I haven't checked this out but as reported on postgis-users mailing list
> I suspect the culprit is postgis 3.0 and not the 11 -> 12 move because
> changing the signatures to not use the function instrumentation did not
> help.
> So I'm suspecting its the changes in the 3D functions we did to support
> additional types.
>
> https://lists.osgeo.org/pipermail/postgis-users/2020-February/043850.html
>
> Sample data set to exercise:
> https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-
> test.pgsql
>

>  Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3
> upgrade
>
> For completeness, news repeated
>

> ----
> I took a bit of published data (in my case neuron morphologies from [0])
> and created an isolated database that only contained the minimal
> versions of the tables referenced in this query. Both tables have 328738
> rows and I crated all the indices that are there in the original tables.
> With this, I believe, the effect is visible. The 12MB SQL dump (in
> Postgres custom format) can be found here:
>
> https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-
> test.pgsql
>
> If I run the query from my initial email in both Postgres 11 / PostGIS
> 2.5 and Postgres 12 / PostGIS 3, I get the following query plans on the
> machine that also ran the databases of the original query. First,
> Postgres 11/ PostGIS 2.5:
>
>   Limit  (cost=51.71..498.33 rows=102 width=16) (actual time=2.023..2.697
>   rows=215 loops=1)
>     Buffers: shared hit=1159
>     CTE bb_edge
>       ->  Bitmap Heap Scan on treenode_edge te  (cost=2.43..44.77 rows=1
> width=16) (actual time=1.348..1.804 rows=143 loops=1)
>             Recheck Cond: (edge &&&
> '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
>             Filter: ((edge &&
> '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
> AND (project_id = 1) AND
> ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry
> && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge,
> '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
> '20'::double precision))
>             Heap Blocks: exact=119
>             Buffers: shared hit=167
>             ->  Bitmap Index Scan on treenode_edge_gix  (cost=0.00..2.43
> rows=19 width=0) (actual time=1.321..1.321 rows=143 loops=1)
>                   Index Cond: (edge &&&
> '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
>                   Buffers: shared hit=48
>     ->  Nested Loop  (cost=6.93..453.55 rows=102 width=16) (actual
> time=2.022..2.674 rows=215 loops=1)
>           Buffers: shared hit=1159
>           ->  HashAggregate  (cost=6.51..9.57 rows=102 width=8) (actual
> time=2.004..2.045 rows=259 loops=1)
>                 Group Key: bb_edge.id
>                 Buffers: shared hit=167
>                 ->  Append  (cost=0.00..6.26 rows=102 width=8) (actual
> time=1.351..1.917 rows=338 loops=1)
>                       Buffers: shared hit=167
>                       ->  CTE Scan on bb_edge  (cost=0.00..0.06 rows=1
> width=8) (actual time=1.351..1.863 rows=143 loops=1)
>                             Buffers: shared hit=167
>                       ->  CTE Scan on bb_edge bb_edge_1  (cost=0.00..0.06
> rows=1 width=8) (actual time=0.000..0.017 rows=143 loops=1)
>                       ->  ProjectSet  (cost=0.00..1.55 rows=100 width=8)
> (actual time=0.003..0.008 rows=52 loops=1)
>                             ->  Result  (cost=0.00..0.03 rows=1 width=0)
> (actual time=0.000..0.000 rows=1 loops=1)
>           ->  Index Scan using treenode_pkey on treenode t1
> (cost=0.42..4.32 rows=1 width=16) (actual time=0.002..0.002 rows=1
> loops=259)
>                 Index Cond: (id = bb_edge.id)
>                 Buffers: shared hit=992
>   Planning Time: 0.755 ms
>   Execution Time: 2.765 ms
>
> And this is Postgres 12 / PostGIS 3:
>
>   Limit  (cost=32.51..220.67 rows=54 width=16) (actual time=2.934..3.876
> rows=215 loops=1)
>     Buffers: shared hit=1190
>     CTE bb_edge
>       ->  Index Scan using treenode_edge_gix on treenode_edge te
> (cost=0.53..28.58 rows=1 width=16) (actual time=0.276..2.641 rows=143
> loops=1)
>             Index Cond: ((edge &&&
> '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
> AND (edge &&&
> st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
> '20'::double precision)))
>             Filter: ((project_id = 1) AND st_3ddwithin(edge,
> '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
> '20'::double precision))
>             Buffers: shared hit=198
>     ->  Nested Loop  (cost=3.94..192.09 rows=54 width=16) (actual
> time=2.931..3.847 rows=215 loops=1)
>           Buffers: shared hit=1190
>           ->  HashAggregate  (cost=3.51..5.13 rows=54 width=8) (actual
> time=2.893..2.948 rows=259 loops=1)
>                 Group Key: bb_edge.id
>                 Buffers: shared hit=198
>                 ->  Append  (cost=0.00..3.38 rows=54 width=8) (actual
> time=0.285..2.768 rows=338 loops=1)
>                       Buffers: shared hit=198
>                       ->  CTE Scan on bb_edge  (cost=0.00..0.06 rows=1
> width=8) (actual time=0.284..2.704 rows=143 loops=1)
>                             Buffers: shared hit=198
>                       ->  CTE Scan on bb_edge bb_edge_1  (cost=0.00..0.06
> rows=1 width=8) (actual time=0.001..0.020 rows=143 loops=1)
>                       ->  ProjectSet  (cost=0.00..0.83 rows=52 width=8)
> (actual time=0.006..0.013 rows=52 loops=1)
>                             ->  Result  (cost=0.00..0.03 rows=1 width=0)
> (actual time=0.001..0.001 rows=1 loops=1)
>           ->  Index Scan using treenode_pkey on treenode t1
> (cost=0.42..3.43 rows=1 width=16) (actual time=0.003..0.003 rows=1
> loops=259)
>                 Index Cond: (id = bb_edge.id)
>                 Buffers: shared hit=992
>   Planning Time: 1.148 ms
>   Execution Time: 3.981 ms
>
> And just for easier context, this is the query I used (from the first
> mail):
>
>    EXPLAIN (ANALYZE, BUFFERS) WITH bb_edge AS (
>    SELECT te.id, te.parent_id
>     FROM treenode_edge te
>     WHERE te.edge &&& ST_MakeLine(ARRAY[
>         ST_MakePoint(471548.0, 330140.0, 160440.0),
>         ST_MakePoint(542460.0, 290140.0, 160400.0)] ::geometry[])
>     AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_MakeLine(ARRAY[
>         ST_MakePoint(471548.0,  290140.0,    160420.0),
>         ST_MakePoint(542460.0, 290140.0,    160420.0),
>         ST_MakePoint(542460.0, 330140.0, 160420.0),
>         ST_MakePoint(471548.0,  330140.0, 160420.0),
>         ST_MakePoint(471548.0,  290140.0,    160420.0)]::geometry[])),
>         20.0)
>     AND te.project_id = 1
>    )
>    SELECT
>      t1.id,
>      t1.parent_id
>    FROM (
>      SELECT id FROM bb_edge
>      UNION
>      SELECT parent_id FROM bb_edge
>      UNION
>      SELECT UNNEST(ARRAY[41851780,25932042,25932043,14274317,25932045,
>        25932050,25932052,41876633,25932070,42769832,25932073,25932076,
>        42769838,25932082,25932084,27641652,25220534,25932087,41421110,
>        26002235,36359611,25932094,36359614,41421118,41420481,41876677,
>        25932101,27641926,27641927,27642056,25932233,41420487,25932235,
>        25932108,27642059,40327244,25932111,25932243,25967062,25967066,
>        25967067,25967069,25967070,25932130,27642082,25932132,27642085,
>        25967080,25967082,25967084,27642099,28873207]::bigint[])
>    LIMIT 80000;
>
> Let me know if I can be of more help.
>
> Thanks,
> Tom
>
> [0] https://www.cell.com/cell/pdf/S0092-8674(18)30787-6.pdf
> On Mon, Feb 03, 2020 at 02:46:59AM -0500, Regina Obe wrote:
> >It could be the change in logic of _ST_3DDWithin. That was changed to
> support more 3d types, so that might have caused a slow down elsewhere.
> >If you can provide a dataset that would be great.  I'll try to do some
> benchmarks on some 3d sets I have lying around.
> >
> >Thanks,
> >Regina
> >
>
> ----

New description:

 I haven't checked this out but as reported on postgis-users mailing list
 I suspect the culprit is postgis 3.0 and not the 11 -> 12 move because
 changing the signatures to not use the function instrumentation did not
 help.
 So I'm suspecting its the changes in the 3D functions we did to support
 additional types.

 https://lists.osgeo.org/pipermail/postgis-users/2020-February/043850.html

 Sample data set to exercise:
 https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-
 test.pgsql


  Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3
 upgrade

 For completeness, news repeated


 ----
 I took a bit of published data (in my case neuron morphologies from [0])
 and created an isolated database that only contained the minimal
 versions of the tables referenced in this query. Both tables have 328738
 rows and I crated all the indices that are there in the original tables.
 With this, I believe, the effect is visible. The 12MB SQL dump (in
 Postgres custom format) can be found here:

 https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-
 test.pgsql

 {{{

 If I run the query from my initial email in both Postgres 11 / PostGIS
 2.5 and Postgres 12 / PostGIS 3, I get the following query plans on the
 machine that also ran the databases of the original query. First,
 Postgres 11/ PostGIS 2.5:

   Limit  (cost=51.71..498.33 rows=102 width=16) (actual time=2.023..2.697
   rows=215 loops=1)
     Buffers: shared hit=1159
     CTE bb_edge
       ->  Bitmap Heap Scan on treenode_edge te  (cost=2.43..44.77 rows=1
 width=16) (actual time=1.348..1.804 rows=143 loops=1)
             Recheck Cond: (edge &&&
 '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
             Filter: ((edge &&
 '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
 AND (project_id = 1) AND
 ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry
 && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge,
 '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
 '20'::double precision))
             Heap Blocks: exact=119
             Buffers: shared hit=167
             ->  Bitmap Index Scan on treenode_edge_gix  (cost=0.00..2.43
 rows=19 width=0) (actual time=1.321..1.321 rows=143 loops=1)
                   Index Cond: (edge &&&
 '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
                   Buffers: shared hit=48
     ->  Nested Loop  (cost=6.93..453.55 rows=102 width=16) (actual
 time=2.022..2.674 rows=215 loops=1)
           Buffers: shared hit=1159
           ->  HashAggregate  (cost=6.51..9.57 rows=102 width=8) (actual
 time=2.004..2.045 rows=259 loops=1)
                 Group Key: bb_edge.id
                 Buffers: shared hit=167
                 ->  Append  (cost=0.00..6.26 rows=102 width=8) (actual
 time=1.351..1.917 rows=338 loops=1)
                       Buffers: shared hit=167
                       ->  CTE Scan on bb_edge  (cost=0.00..0.06 rows=1
 width=8) (actual time=1.351..1.863 rows=143 loops=1)
                             Buffers: shared hit=167
                       ->  CTE Scan on bb_edge bb_edge_1  (cost=0.00..0.06
 rows=1 width=8) (actual time=0.000..0.017 rows=143 loops=1)
                       ->  ProjectSet  (cost=0.00..1.55 rows=100 width=8)
 (actual time=0.003..0.008 rows=52 loops=1)
                             ->  Result  (cost=0.00..0.03 rows=1 width=0)
 (actual time=0.000..0.000 rows=1 loops=1)
           ->  Index Scan using treenode_pkey on treenode t1
 (cost=0.42..4.32 rows=1 width=16) (actual time=0.002..0.002 rows=1
 loops=259)
                 Index Cond: (id = bb_edge.id)
                 Buffers: shared hit=992
   Planning Time: 0.755 ms
   Execution Time: 2.765 ms

 And this is Postgres 12 / PostGIS 3:

   Limit  (cost=32.51..220.67 rows=54 width=16) (actual time=2.934..3.876
 rows=215 loops=1)
     Buffers: shared hit=1190
     CTE bb_edge
       ->  Index Scan using treenode_edge_gix on treenode_edge te
 (cost=0.53..28.58 rows=1 width=16) (actual time=0.276..2.641 rows=143
 loops=1)
             Index Cond: ((edge &&&
 '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
 AND (edge &&&
 st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
 '20'::double precision)))
             Filter: ((project_id = 1) AND st_3ddwithin(edge,
 '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
 '20'::double precision))
             Buffers: shared hit=198
     ->  Nested Loop  (cost=3.94..192.09 rows=54 width=16) (actual
 time=2.931..3.847 rows=215 loops=1)
           Buffers: shared hit=1190
           ->  HashAggregate  (cost=3.51..5.13 rows=54 width=8) (actual
 time=2.893..2.948 rows=259 loops=1)
                 Group Key: bb_edge.id
                 Buffers: shared hit=198
                 ->  Append  (cost=0.00..3.38 rows=54 width=8) (actual
 time=0.285..2.768 rows=338 loops=1)
                       Buffers: shared hit=198
                       ->  CTE Scan on bb_edge  (cost=0.00..0.06 rows=1
 width=8) (actual time=0.284..2.704 rows=143 loops=1)
                             Buffers: shared hit=198
                       ->  CTE Scan on bb_edge bb_edge_1  (cost=0.00..0.06
 rows=1 width=8) (actual time=0.001..0.020 rows=143 loops=1)
                       ->  ProjectSet  (cost=0.00..0.83 rows=52 width=8)
 (actual time=0.006..0.013 rows=52 loops=1)
                             ->  Result  (cost=0.00..0.03 rows=1 width=0)
 (actual time=0.001..0.001 rows=1 loops=1)
           ->  Index Scan using treenode_pkey on treenode t1
 (cost=0.42..3.43 rows=1 width=16) (actual time=0.003..0.003 rows=1
 loops=259)
                 Index Cond: (id = bb_edge.id)
                 Buffers: shared hit=992
   Planning Time: 1.148 ms
   Execution Time: 3.981 ms

 And just for easier context, this is the query I used (from the first
 mail):

    EXPLAIN (ANALYZE, BUFFERS) WITH bb_edge AS (
    SELECT te.id, te.parent_id
     FROM treenode_edge te
     WHERE te.edge &&& ST_MakeLine(ARRAY[
         ST_MakePoint(471548.0, 330140.0, 160440.0),
         ST_MakePoint(542460.0, 290140.0, 160400.0)] ::geometry[])
     AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_MakeLine(ARRAY[
         ST_MakePoint(471548.0,  290140.0,    160420.0),
         ST_MakePoint(542460.0, 290140.0,    160420.0),
         ST_MakePoint(542460.0, 330140.0, 160420.0),
         ST_MakePoint(471548.0,  330140.0, 160420.0),
         ST_MakePoint(471548.0,  290140.0,    160420.0)]::geometry[])),
         20.0)
     AND te.project_id = 1
    )
    SELECT
      t1.id,
      t1.parent_id
    FROM (
      SELECT id FROM bb_edge
      UNION
      SELECT parent_id FROM bb_edge
      UNION
      SELECT UNNEST(ARRAY[41851780,25932042,25932043,14274317,25932045,
        25932050,25932052,41876633,25932070,42769832,25932073,25932076,
        42769838,25932082,25932084,27641652,25220534,25932087,41421110,
        26002235,36359611,25932094,36359614,41421118,41420481,41876677,
        25932101,27641926,27641927,27642056,25932233,41420487,25932235,
        25932108,27642059,40327244,25932111,25932243,25967062,25967066,
        25967067,25967069,25967070,25932130,27642082,25932132,27642085,
        25967080,25967082,25967084,27642099,28873207]::bigint[])
    LIMIT 80000;
 }}}


 Let me know if I can be of more help.

 Thanks,
 Tom

 [0] https://www.cell.com/cell/pdf/S0092-8674(18)30787-6.pdf
 On Mon, Feb 03, 2020 at 02:46:59AM -0500, Regina Obe wrote:
 >It could be the change in logic of _ST_3DDWithin. That was changed to
 support more 3d types, so that might have caused a slow down elsewhere.
 >If you can provide a dataset that would be great.  I'll try to do some
 benchmarks on some 3d sets I have lying around.
 >
 >Thanks,
 >Regina
 >

 ----

--

-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/4635#comment:1>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list