[postgis-users] Slow ST_InteriorRingN and ST_intersection (using ST_ DumpRings makes intersection 230 times faster)
Lars Aksel Opsahl
Lars.Opsahl at nibio.no
Wed Feb 24 00:00:32 PST 2016
Hei
I have big polygon with a total of 1471333 points and that contains 52079 interior rings.
(https://github.com/larsop/esri_union/blob/master/src/test/sql/regress/esri_union-pre-helling_data_d1.sql )
(its only one row here now because I had to remove the rest to get below 50 Mb size on gitlab)
The problem started with a very simple intersection on this polygon that takes 15 minutes.
So I wanted extract the interior rings and do some tests on them, so tried the following command :
create table sl_lop.test_a6 as
SELECT ST_InteriorRingN(a.geo, generate_series(1, 52079)) as geo from sl_lop.helling_data_d1 as a where gid = 9419961;
This fails after some minutes with the following in the log
2016-02-22 09:48:21 CET sl lop STATEMENT: create table sl_lop.test_a3 as
SELECT ST_InteriorRingN(a.geo, generate_series(1, 52079)) as geo from sl_lop.helling_data_d1 as a where gid = 9419961;
2016-02-22 09:56:06 CET LOG: server process (PID 39525) was terminated by signal 9: Killed
2016-02-22 09:56:06 CET DETAIL: Failed process was running: create table sl_lop.test_a3 as
SELECT ST_InteriorRingN(a.geo, generate_series(1, 52079)) as geo from sl_lop.helling_data_d1 as a where gid = 9419961;
2016-02-22 09:56:06 CET LOG: terminating any other active server processes
2016-02-22 09:56:06 CET sl lop FATAL: the database system is in recovery mode
2016-02-22 09:56:06 CET WARNING: terminating connection because of crash of another server process
2016-02-22 09:56:06 CET DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
I have added some test tests at https://github.com/larsop/esri_union/blob/master/src/test/sql/regress/esri_union.sql with some comments
Then I tested sql 17 below and it used 348.677 ms on my laptop
SELECT '17', sum(ST_Numpoints(ST_ExteriorRing(geom))) FROM (SELECT (ST_DumpRings(a.geo)).geom from sl_lop.helling_data_d1 as a where gid = 9419961) as test;
But test 18 below failed after 430 seconds on my laptop.
SELECT '18', SUM(num_points) FROM (SELECT ST_NumPoints(ST_InteriorRingN(a.geo, generate_series(1, 52079))) as num_points from sl_lop.helling_data_d1 as a where gid = 9419961) as
I am running this on lap top with 16 GB of memory with using POSTGIS="2.1.7 r13414" GEOS="3.4.2-CAPI-1.8.2 r3921" PROJ="Rel. 4.9.1, 04 March 2015" GDAL="GDAL 1.11.2, released 2015/02/10" LIBXML="2.9.2" LIBJSON="UNKNOWN" TOPOLOGY RASTER
and shared_buffers = 128MB
Then tried both sql on bigger database with more memory and I was able get both sqls to run
Test 17 with ST_Dump used 396.959 ms and and test 18 with used 474375.633 ms.
This means that ST_DumpRings is about 1200 time faster than using ST_InteriorRingN.
Question 1: Why is ST_InteriorRingN using so much memory and why is it taking so long time ?
And then I taught if ST_Intersection depends ST_InteriorRingN then this explains why ST_Intersection is so slow on big polygons like this.
Then back to the intersection query below takes 15 minutes
SELECT ST_area((ST_intersection(a.geo,b.geom))) as area from sl_lop.helling_data_d1 as a, sl_lop.grid_ar5_helling b WHERE b.id = 18 AND gid = 9419961;
area
---------------------
0.00138038452174049
(1 row)
Time: 933238.069 ms
The table grid_ar5_helling is found at (https://github.com/larsop/esri_union/blob/master/src/test/sql/regress/grid_ar5_helling.sql )
Then I rewrote ST_intersection to use ST_DumpRings and added a test on them and then it's more than 230 faster (from 933238.069 ms to 3912.089 ms)
SELECT ST_area((ST_intersection(a.simple_polygon,b.geom))) as area FROM
(
SELECT ST_MakePolygon(exterior.exterior_ring, interior.interior_ring) AS simple_polygon FROM
(
SELECT ST_ExteriorRing(a.geo) as exterior_ring FROM sl_lop.helling_data_d1 AS a WHERE a.gid = 9419961
) as exterior,
(
SELECT (array_agg(ST_ExteriorRing(ring))) AS interior_ring
FROM
( select (rec).geom as ring, (rec).path[1] as arrayid from (
SELECT ST_DumpRings(a.geo) as rec
from sl_lop.helling_data_d1 as a
WHERE a.gid = 9419961) as a) as a,
sl_lop.grid_ar5_helling b
WHERE b.id = 18 AND b.geom && a.ring AND a.arrayid > 0
-- AND ST_Intersects(ST_ExteriorRing(b.geom),a.ring) if add this test the time reduced by 25% more but then I have the missing holes later
) as interior
) a ,
sl_lop.grid_ar5_helling b
WHERE b.id = 18;
area
---------------------
0.00138038452174049
(1 row)
Time: 3912.089 ms
Question 2 : Is it possible to do something like this in the standard postgis intersection ?
(Or do I have do this in my own code ?)
Lars
More information about the postgis-users
mailing list