[postgis-tickets] [PostGIS] #4635: ST_3DDwithin speed slow down between 2.5 and 3.0
PostGIS
trac at osgeo.org
Tue Feb 4 09:11:11 PST 2020
#4635: ST_3DDwithin speed slow down between 2.5 and 3.0
----------------------+---------------------------
Reporter: robe | Owner: pramsey
Type: defect | Status: new
Priority: high | Milestone: PostGIS 3.0.1
Component: postgis | Version: 3.0.x
Resolution: | Keywords:
----------------------+---------------------------
Description changed by robe:
Old description:
> I haven't checked this out but as reported on postgis-users mailing list
> I suspect the culprit is postgis 3.0 and not the 11 -> 12 move because
> changing the signatures to not use the function instrumentation did not
> help.
> So I'm suspecting its the changes in the 3D functions we did to support
> additional types.
>
> https://lists.osgeo.org/pipermail/postgis-users/2020-February/043850.html
>
> Sample data set to exercise:
> https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-
> test.pgsql
>
> Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3
> upgrade
>
> For completeness, news repeated
>
> ----
> I took a bit of published data (in my case neuron morphologies from [0])
> and created an isolated database that only contained the minimal
> versions of the tables referenced in this query. Both tables have 328738
> rows and I crated all the indices that are there in the original tables.
> With this, I believe, the effect is visible. The 12MB SQL dump (in
> Postgres custom format) can be found here:
>
> https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-
> test.pgsql
>
> If I run the query from my initial email in both Postgres 11 / PostGIS
> 2.5 and Postgres 12 / PostGIS 3, I get the following query plans on the
> machine that also ran the databases of the original query. First,
> Postgres 11/ PostGIS 2.5:
>
> Limit (cost=51.71..498.33 rows=102 width=16) (actual time=2.023..2.697
> rows=215 loops=1)
> Buffers: shared hit=1159
> CTE bb_edge
> -> Bitmap Heap Scan on treenode_edge te (cost=2.43..44.77 rows=1
> width=16) (actual time=1.348..1.804 rows=143 loops=1)
> Recheck Cond: (edge &&&
> '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
> Filter: ((edge &&
> '0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
> AND (project_id = 1) AND
> ('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry
> && st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge,
> '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
> '20'::double precision))
> Heap Blocks: exact=119
> Buffers: shared hit=167
> -> Bitmap Index Scan on treenode_edge_gix (cost=0.00..2.43
> rows=19 width=0) (actual time=1.321..1.321 rows=143 loops=1)
> Index Cond: (edge &&&
> '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
> Buffers: shared hit=48
> -> Nested Loop (cost=6.93..453.55 rows=102 width=16) (actual
> time=2.022..2.674 rows=215 loops=1)
> Buffers: shared hit=1159
> -> HashAggregate (cost=6.51..9.57 rows=102 width=8) (actual
> time=2.004..2.045 rows=259 loops=1)
> Group Key: bb_edge.id
> Buffers: shared hit=167
> -> Append (cost=0.00..6.26 rows=102 width=8) (actual
> time=1.351..1.917 rows=338 loops=1)
> Buffers: shared hit=167
> -> CTE Scan on bb_edge (cost=0.00..0.06 rows=1
> width=8) (actual time=1.351..1.863 rows=143 loops=1)
> Buffers: shared hit=167
> -> CTE Scan on bb_edge bb_edge_1 (cost=0.00..0.06
> rows=1 width=8) (actual time=0.000..0.017 rows=143 loops=1)
> -> ProjectSet (cost=0.00..1.55 rows=100 width=8)
> (actual time=0.003..0.008 rows=52 loops=1)
> -> Result (cost=0.00..0.03 rows=1 width=0)
> (actual time=0.000..0.000 rows=1 loops=1)
> -> Index Scan using treenode_pkey on treenode t1
> (cost=0.42..4.32 rows=1 width=16) (actual time=0.002..0.002 rows=1
> loops=259)
> Index Cond: (id = bb_edge.id)
> Buffers: shared hit=992
> Planning Time: 0.755 ms
> Execution Time: 2.765 ms
>
> And this is Postgres 12 / PostGIS 3:
>
> Limit (cost=32.51..220.67 rows=54 width=16) (actual time=2.934..3.876
> rows=215 loops=1)
> Buffers: shared hit=1190
> CTE bb_edge
> -> Index Scan using treenode_edge_gix on treenode_edge te
> (cost=0.53..28.58 rows=1 width=16) (actual time=0.276..2.641 rows=143
> loops=1)
> Index Cond: ((edge &&&
> '01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
> AND (edge &&&
> st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
> '20'::double precision)))
> Filter: ((project_id = 1) AND st_3ddwithin(edge,
> '0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
> '20'::double precision))
> Buffers: shared hit=198
> -> Nested Loop (cost=3.94..192.09 rows=54 width=16) (actual
> time=2.931..3.847 rows=215 loops=1)
> Buffers: shared hit=1190
> -> HashAggregate (cost=3.51..5.13 rows=54 width=8) (actual
> time=2.893..2.948 rows=259 loops=1)
> Group Key: bb_edge.id
> Buffers: shared hit=198
> -> Append (cost=0.00..3.38 rows=54 width=8) (actual
> time=0.285..2.768 rows=338 loops=1)
> Buffers: shared hit=198
> -> CTE Scan on bb_edge (cost=0.00..0.06 rows=1
> width=8) (actual time=0.284..2.704 rows=143 loops=1)
> Buffers: shared hit=198
> -> CTE Scan on bb_edge bb_edge_1 (cost=0.00..0.06
> rows=1 width=8) (actual time=0.001..0.020 rows=143 loops=1)
> -> ProjectSet (cost=0.00..0.83 rows=52 width=8)
> (actual time=0.006..0.013 rows=52 loops=1)
> -> Result (cost=0.00..0.03 rows=1 width=0)
> (actual time=0.001..0.001 rows=1 loops=1)
> -> Index Scan using treenode_pkey on treenode t1
> (cost=0.42..3.43 rows=1 width=16) (actual time=0.003..0.003 rows=1
> loops=259)
> Index Cond: (id = bb_edge.id)
> Buffers: shared hit=992
> Planning Time: 1.148 ms
> Execution Time: 3.981 ms
>
> And just for easier context, this is the query I used (from the first
> mail):
>
> EXPLAIN (ANALYZE, BUFFERS) WITH bb_edge AS (
> SELECT te.id, te.parent_id
> FROM treenode_edge te
> WHERE te.edge &&& ST_MakeLine(ARRAY[
> ST_MakePoint(471548.0, 330140.0, 160440.0),
> ST_MakePoint(542460.0, 290140.0, 160400.0)] ::geometry[])
> AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_MakeLine(ARRAY[
> ST_MakePoint(471548.0, 290140.0, 160420.0),
> ST_MakePoint(542460.0, 290140.0, 160420.0),
> ST_MakePoint(542460.0, 330140.0, 160420.0),
> ST_MakePoint(471548.0, 330140.0, 160420.0),
> ST_MakePoint(471548.0, 290140.0, 160420.0)]::geometry[])),
> 20.0)
> AND te.project_id = 1
> )
> SELECT
> t1.id,
> t1.parent_id
> FROM (
> SELECT id FROM bb_edge
> UNION
> SELECT parent_id FROM bb_edge
> UNION
> SELECT UNNEST(ARRAY[41851780,25932042,25932043,14274317,25932045,
> 25932050,25932052,41876633,25932070,42769832,25932073,25932076,
> 42769838,25932082,25932084,27641652,25220534,25932087,41421110,
> 26002235,36359611,25932094,36359614,41421118,41420481,41876677,
> 25932101,27641926,27641927,27642056,25932233,41420487,25932235,
> 25932108,27642059,40327244,25932111,25932243,25967062,25967066,
> 25967067,25967069,25967070,25932130,27642082,25932132,27642085,
> 25967080,25967082,25967084,27642099,28873207]::bigint[])
> LIMIT 80000;
>
> Let me know if I can be of more help.
>
> Thanks,
> Tom
>
> [0] https://www.cell.com/cell/pdf/S0092-8674(18)30787-6.pdf
> On Mon, Feb 03, 2020 at 02:46:59AM -0500, Regina Obe wrote:
> >It could be the change in logic of _ST_3DDWithin. That was changed to
> support more 3d types, so that might have caused a slow down elsewhere.
> >If you can provide a dataset that would be great. I'll try to do some
> benchmarks on some 3d sets I have lying around.
> >
> >Thanks,
> >Regina
> >
>
> ----
New description:
I haven't checked this out but as reported on postgis-users mailing list
I suspect the culprit is postgis 3.0 and not the 11 -> 12 move because
changing the signatures to not use the function instrumentation did not
help.
So I'm suspecting its the changes in the 3D functions we did to support
additional types.
https://lists.osgeo.org/pipermail/postgis-users/2020-February/043850.html
Sample data set to exercise:
https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-
test.pgsql
Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3
upgrade
For completeness, news repeated
----
I took a bit of published data (in my case neuron morphologies from [0])
and created an isolated database that only contained the minimal
versions of the tables referenced in this query. Both tables have 328738
rows and I crated all the indices that are there in the original tables.
With this, I believe, the effect is visible. The 12MB SQL dump (in
Postgres custom format) can be found here:
https://github.com/tomka/tmp/blob/master/postgres/catmaid-postgis-
test.pgsql
{{{
If I run the query from my initial email in both Postgres 11 / PostGIS
2.5 and Postgres 12 / PostGIS 3, I get the following query plans on the
machine that also ran the databases of the original query. First,
Postgres 11/ PostGIS 2.5:
Limit (cost=51.71..498.33 rows=102 width=16) (actual time=2.023..2.697
rows=215 loops=1)
Buffers: shared hit=1159
CTE bb_edge
-> Bitmap Heap Scan on treenode_edge te (cost=2.43..44.77 rows=1
width=16) (actual time=1.348..1.804 rows=143 loops=1)
Recheck Cond: (edge &&&
'01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
Filter: ((edge &&
'0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C095034100000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
AND (project_id = 1) AND
('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry
&& st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge,
'0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
'20'::double precision))
Heap Blocks: exact=119
Buffers: shared hit=167
-> Bitmap Index Scan on treenode_edge_gix (cost=0.00..2.43
rows=19 width=0) (actual time=1.321..1.321 rows=143 loops=1)
Index Cond: (edge &&&
'01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
Buffers: shared hit=48
-> Nested Loop (cost=6.93..453.55 rows=102 width=16) (actual
time=2.022..2.674 rows=215 loops=1)
Buffers: shared hit=1159
-> HashAggregate (cost=6.51..9.57 rows=102 width=8) (actual
time=2.004..2.045 rows=259 loops=1)
Group Key: bb_edge.id
Buffers: shared hit=167
-> Append (cost=0.00..6.26 rows=102 width=8) (actual
time=1.351..1.917 rows=338 loops=1)
Buffers: shared hit=167
-> CTE Scan on bb_edge (cost=0.00..0.06 rows=1
width=8) (actual time=1.351..1.863 rows=143 loops=1)
Buffers: shared hit=167
-> CTE Scan on bb_edge bb_edge_1 (cost=0.00..0.06
rows=1 width=8) (actual time=0.000..0.017 rows=143 loops=1)
-> ProjectSet (cost=0.00..1.55 rows=100 width=8)
(actual time=0.003..0.008 rows=52 loops=1)
-> Result (cost=0.00..0.03 rows=1 width=0)
(actual time=0.000..0.000 rows=1 loops=1)
-> Index Scan using treenode_pkey on treenode t1
(cost=0.42..4.32 rows=1 width=16) (actual time=0.002..0.002 rows=1
loops=259)
Index Cond: (id = bb_edge.id)
Buffers: shared hit=992
Planning Time: 0.755 ms
Execution Time: 2.765 ms
And this is Postgres 12 / PostGIS 3:
Limit (cost=32.51..220.67 rows=54 width=16) (actual time=2.934..3.876
rows=215 loops=1)
Buffers: shared hit=1190
CTE bb_edge
-> Index Scan using treenode_edge_gix on treenode_edge te
(cost=0.53..28.58 rows=1 width=16) (actual time=0.276..2.641 rows=143
loops=1)
Index Cond: ((edge &&&
'01020000800200000000000000F0C71C41000000007026144100000000C095034100000000F88D20410000000070B511410000000080940341'::geometry)
AND (edge &&&
st_expand('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
'20'::double precision)))
Filter: ((project_id = 1) AND st_3ddwithin(edge,
'0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88D20410000000070B51141000000002095034100000000F88D20410000000070261441000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry,
'20'::double precision))
Buffers: shared hit=198
-> Nested Loop (cost=3.94..192.09 rows=54 width=16) (actual
time=2.931..3.847 rows=215 loops=1)
Buffers: shared hit=1190
-> HashAggregate (cost=3.51..5.13 rows=54 width=8) (actual
time=2.893..2.948 rows=259 loops=1)
Group Key: bb_edge.id
Buffers: shared hit=198
-> Append (cost=0.00..3.38 rows=54 width=8) (actual
time=0.285..2.768 rows=338 loops=1)
Buffers: shared hit=198
-> CTE Scan on bb_edge (cost=0.00..0.06 rows=1
width=8) (actual time=0.284..2.704 rows=143 loops=1)
Buffers: shared hit=198
-> CTE Scan on bb_edge bb_edge_1 (cost=0.00..0.06
rows=1 width=8) (actual time=0.001..0.020 rows=143 loops=1)
-> ProjectSet (cost=0.00..0.83 rows=52 width=8)
(actual time=0.006..0.013 rows=52 loops=1)
-> Result (cost=0.00..0.03 rows=1 width=0)
(actual time=0.001..0.001 rows=1 loops=1)
-> Index Scan using treenode_pkey on treenode t1
(cost=0.42..3.43 rows=1 width=16) (actual time=0.003..0.003 rows=1
loops=259)
Index Cond: (id = bb_edge.id)
Buffers: shared hit=992
Planning Time: 1.148 ms
Execution Time: 3.981 ms
And just for easier context, this is the query I used (from the first
mail):
EXPLAIN (ANALYZE, BUFFERS) WITH bb_edge AS (
SELECT te.id, te.parent_id
FROM treenode_edge te
WHERE te.edge &&& ST_MakeLine(ARRAY[
ST_MakePoint(471548.0, 330140.0, 160440.0),
ST_MakePoint(542460.0, 290140.0, 160400.0)] ::geometry[])
AND ST_3DDWithin(te.edge, ST_MakePolygon(ST_MakeLine(ARRAY[
ST_MakePoint(471548.0, 290140.0, 160420.0),
ST_MakePoint(542460.0, 290140.0, 160420.0),
ST_MakePoint(542460.0, 330140.0, 160420.0),
ST_MakePoint(471548.0, 330140.0, 160420.0),
ST_MakePoint(471548.0, 290140.0, 160420.0)]::geometry[])),
20.0)
AND te.project_id = 1
)
SELECT
t1.id,
t1.parent_id
FROM (
SELECT id FROM bb_edge
UNION
SELECT parent_id FROM bb_edge
UNION
SELECT UNNEST(ARRAY[41851780,25932042,25932043,14274317,25932045,
25932050,25932052,41876633,25932070,42769832,25932073,25932076,
42769838,25932082,25932084,27641652,25220534,25932087,41421110,
26002235,36359611,25932094,36359614,41421118,41420481,41876677,
25932101,27641926,27641927,27642056,25932233,41420487,25932235,
25932108,27642059,40327244,25932111,25932243,25967062,25967066,
25967067,25967069,25967070,25932130,27642082,25932132,27642085,
25967080,25967082,25967084,27642099,28873207]::bigint[])
LIMIT 80000;
}}}
Let me know if I can be of more help.
Thanks,
Tom
[0] https://www.cell.com/cell/pdf/S0092-8674(18)30787-6.pdf
On Mon, Feb 03, 2020 at 02:46:59AM -0500, Regina Obe wrote:
>It could be the change in logic of _ST_3DDWithin. That was changed to
support more 3d types, so that might have caused a slow down elsewhere.
>If you can provide a dataset that would be great. I'll try to do some
benchmarks on some 3d sets I have lying around.
>
>Thanks,
>Regina
>
----
--
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/4635#comment:1>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list