[postgis-tickets] [PostGIS] #4720: Memory leak when using Geography

PostGIS trac at osgeo.org
Sat Jul 11 14:46:12 PDT 2020


#4720: Memory leak when using Geography
------------------------------+---------------------
 Reporter:  kthujvu           |      Owner:  pramsey
     Type:  defect            |     Status:  new
 Priority:  medium            |  Milestone:
Component:  postgis           |    Version:  3.0.x
 Keywords:  geography memory  |
------------------------------+---------------------
 A query with a single-line subquery involving Geography for each row ends
 up OOMing.


 I loaded
 https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip
 using **Geography** using `shp2pgsql -c -D -i -I -G
 ne_110m_admin_0_countries.shp > ne110.sql`, then `psql ... -f ne110.sql`.
 There is an spatial index `ne_110m_admin_0_countries_geog_idx`.

 In the queries below I create a grid of points using WITH, the problem
 also occurs if I have that data in a static table.

 This query is eating all memory on my system:

 {{{
 WITH latitudes AS (
         SELECT generate_series AS latitude
         FROM generate_series(-90, 90, 0.1)
 ), longitudes AS (
         SELECT generate_series AS longitude
         FROM generate_series(-180, 180, 0.1)
 ), points AS (
         SELECT ST_SetSRID(ST_Point(longitude, latitude), 4326)::geography
 AS geog
         FROM latitudes
         CROSS JOIN longitudes
 )
 SELECT
         geog,
         (
                 SELECT name
                 FROM ne_110m_admin_0_countries AS ne
                 ORDER BY p.geog <-> ne.geog
                 LIMIT 1
         )
 FROM points AS p
 ;
 }}}

 After 10 minutes: >2G and growing about 8M per second

 This does **not** happen if I use Geometry for the points and cast the
 Geography data in the NE table to Geometry:

 {{{
 WITH latitudes AS (
         SELECT generate_series AS latitude
         FROM generate_series(-90, 90, 0.1)
 ), longitudes AS (
         SELECT generate_series AS longitude
         FROM generate_series(-180, 180, 0.1)
 ), points AS (
         SELECT ST_SetSRID(ST_Point(longitude, latitude), 4326) AS geom
         FROM latitudes
         CROSS JOIN longitudes
 )
 SELECT
         geom,
         (
                 SELECT name
                 FROM ne_110m_admin_0_countries AS ne
                 ORDER BY p.geom <-> ne.geog::geometry
                 LIMIT 1
         )
 FROM points AS p
 ;
 }}}

 After 10 minutes: stable 107M

 This **does happen** when I use the existing Geography data and Geometry
 for the point grid:

 {{{
 WITH latitudes AS (
         SELECT generate_series AS latitude
         FROM generate_series(-90, 90, 0.1)
 ), longitudes AS (
         SELECT generate_series AS longitude
         FROM generate_series(-180, 180, 0.1)
 ), points AS (
         SELECT ST_SetSRID(ST_Point(longitude, latitude), 4326) AS geom
         FROM latitudes
         CROSS JOIN longitudes
 )
 SELECT
         geom,
         (
                 SELECT name
                 FROM ne_110m_admin_0_countries AS ne
                 ORDER BY p.geom <-> ne.geog
                 LIMIT 1
         )
 FROM points AS p
 ;
 }}}

 After ~5 minutes: >500MB and growing



 It does **not** happen when I cast the existing Geography to Geometry and
 use Geography for the point data:

 {{{
 WITH latitudes AS (
         SELECT generate_series AS latitude
         FROM generate_series(-90, 90, 0.1)
 ), longitudes AS (
         SELECT generate_series AS longitude
         FROM generate_series(-180, 180, 0.1)
 ), points AS (
         SELECT ST_SetSRID(ST_Point(longitude, latitude), 4326)::geography
 AS geog
         FROM latitudes
         CROSS JOIN longitudes
 )
 SELECT
         geog,
         (
                 SELECT name
                 FROM ne_110m_admin_0_countries AS ne
                 ORDER BY p.geog <-> ne.geog::geometry
                 LIMIT 1
         )
 FROM points AS p
 ;
 }}}


 After 10 minutes: stable 103M


 It does **not** happen if I use the volatile data for both sides of the
 spatial query:

 {{{
 WITH latitudes AS (
         SELECT generate_series AS latitude
         FROM generate_series(-90, 90, 0.1)
 ), longitudes AS (
         SELECT generate_series AS longitude
         FROM generate_series(-180, 180, 0.1)
 ), points AS (
         SELECT ST_SetSRID(ST_Point(longitude, latitude), 4326)::geography
 AS geog
         FROM latitudes
         CROSS JOIN longitudes
 )
 SELECT
         geog,
         (
                 SELECT geog
                 FROM points AS p2
                 ORDER BY p.geog <-> p2.geog
                 LIMIT 1
         )
 FROM points AS p
 ;
 }}}

 After 10 minutes: stable 110M


 -----
 PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled
 by gc
 c (Debian 8.3.0-6) 8.3.0, 64-bit


 POSTGIS="3.0.1 ec2a9aa" [EXTENSION] PGSQL="120" GEOS="3.7.1-CAPI-1.11.1
 27a5e771
 " PROJ="Rel. 5.2.0, September 15th, 2018" GDAL="GDAL 2.4.0, released
 2018/12/14"
 LIBXML="2.9.4" LIBJSON="0.12.1" LIBPROTOBUF="1.3.1" WAGYU="0.4.3
 (Internal)" TOPO
 LOGY RASTER

-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/4720>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list