[postgis-users] ST_Intersection - Slow with raster, fast with polygonized raster?

Regina Obe lr at pcorp.us
Tue Feb 1 13:32:19 PST 2022


Try

 

select
(foo.geomval).val,
sum(st_area((geomval).geom)) as area_ft2
from (
    select (st_intersection( ST_Clip(a.rast, ST_Envelope(b.geom_2263))  , b.geom_2263)) as geomval
   from base_rasters.landcover6in_2017 a
   join scratch.smallsample b
   on st_intersects(a.rast, b.geom_2263)) as foo
group by val;

 

 

From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Mike Treglia
Sent: Monday, January 31, 2022 10:57 AM
To: Florian Nadler <florian.nadler at cybertec.at>
Cc: PostGIS Users Discussion <postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] ST_Intersection - Slow with raster, fast with polygonized raster?

 

Thanks, Florian,

 

Both of those alternatives take ~1 min 20 sec.

 

Below are the execution plans for the queries I previously shared. Let me know if there's anything else that would be helpful.

 

Best,

Mike

 

------Raster/Vector Overlay

explain select
(foo.geomval).val,
sum(st_area((geomval).geom)) as area_ft2
from (
    select (st_intersection(a.rast, b.geom_2263)) as geomval
   from base_rasters.landcover6in_2017 a
   join scratch.smallsample b
   on st_intersects(a.rast, b.geom_2263)) as foo
group by val;

 

HashAggregate  (cost=3904551.46..3904553.46 rows=200 width=16)
  Group Key: (foo.geomval).val
  ->  Subquery Scan on foo  (cost=0.41..456301.46 rows=26525000 width=40)
        ->  ProjectSet  (cost=0.41..191051.46 rows=26525000 width=32)
              ->  Nested Loop  (cost=0.41..51662.59 rows=26525 width=803)
                    ->  Seq Scan on smallsample b  (cost=0.00..1.28 rows=28 width=32)
                    ->  Index Scan using landcover6in_2017_st_convexhull_idx on landcover6in_2017 a  (cost=0.41..1844.10 rows=95 width=771)
                          Index Cond: ((rast)::geometry && b.geom_2263)
                          Filter: _st_intersects(b.geom_2263, rast, NULL::integer)

 

 

 

--Polygonized Raster/Vector Overlay

explain select lc_2017_value,
sum(st_area(st_intersection(a.geom_2263, b.geom_2263))) as area_ft2
from base_rasters.landcover6in_2017_polygons a
join scratch.smallsample b
on st_intersects(a.geom_2263, b.geom_2263)
group by lc_2017_value;

 

HashAggregate  (cost=12518962.15..12518962.23 rows=8 width=16)
  Group Key: a.lc_2017_value
  ->  Nested Loop  (cost=0.42..1295125.12 rows=446631 width=1388)
        ->  Seq Scan on smallsample b  (cost=0.00..1.28 rows=28 width=32)
        ->  Index Scan using landcover6in_2017_polygons_geom_idx on landcover6in_2017_polygons a  (cost=0.42..46238.47 rows=1595 width=1356)
              Index Cond: (geom_2263 && b.geom_2263)
              Filter: st_intersects(geom_2263, b.geom_2263)

 

On Mon, Jan 31, 2022 at 2:42 AM Florian Nadler <florian.nadler at cybertec.at <mailto:florian.nadler at cybertec.at> > wrote:

Hi,

can you provide execution plans for both queries?

In the meantime, please try the following variants:

select
(foo.geomval).val,
sum(st_area((geomval).geom)) as area_ft2
from (
    select (st_intersection(a.rast, b.geom_2263)) as geomval
   from base_rasters.landcover6in_2017 a
   join scratch.smallsample b
   on st_intersects(a.rast::geometry, b.geom_2263)) as foo
group by val;
select
(foo.geomval).val,
sum(st_area((geomval).geom)) as area_ft2
from (
    select (st_intersection(a.rast, b.geom_2263)) as geomval
   from base_rasters.landcover6in_2017 a
   join scratch.smallsample b
   on st_intersects(a.rast, b.geom_2263) where a.rast && b.geom_2263) as foo
group by val

Am 31.01.2022 um 04:32 schrieb Mike Treglia:

Hi all,

 

I was exploring raster/vector overlays, basically to calculate the area of each of several landcover classes (from raster data) within an area or sets of areas (delineated in polygon/multipolygon data). In the past, I had ended up polygonizing the raster, making sure the polygonized layer had a spatial index, and developing a query from there. 

 

But I wanted to come back to the problem and see if I could do it in what seemed like the more elegant way, using the raster directly without an intermediate step of polygonizing it.  While I'm able to do it, at least on a small sample area, it's at a huge performance cost. 

 

I'm wondering if I'm just missing something I need to do to make the raster-based version operate more efficiently. (For what it's worth, rasters like this, are also slow to load into QGIS and such [though once loaded, are fairly quick to render], so I wonder if that's related at all and fixable.)

 

My test case is in New York City - I'm working with a six-inch resolution landcover dataset for the entirety of NYC (https://data.cityofnewyork.us/Environment/Land-Cover-Raster-Data-2017-6in-Resolution/he6d-2qns), which I'm overlaying with a small subset of parcels in a small geographic area. In this example, below, I'm just adding up the area for each landcover class across the focal parcels for simplicity.  When using the polygonized landcover data, it takes about 170ms. When using the raster landcover data, it takes about 1 min 20 seconds. 

 

Below are the queries, as well as the information about the raster table in PostGIS, and the raster2pgsql code I used to load the layer into the database. Curious to hear any thoughts from folks on the list. Of course if there's anything else I can share that would be helpful, let me know

 

Thanks so much! Best,

Mike

 

Here are the queries I'm running:

1) Original Raster over Polygons- ~1 min 20 sec

---------

select 
(foo.geomval).val, 
sum(st_area((geomval).geom)) as area_ft2 
from (
    select (st_intersection(a.rast, b.geom_2263)) as geomval
   from base_rasters.landcover6in_2017 a 
   join scratch.smallsample b
   on st_intersects(a.rast, b.geom_2263)) as foo
group by val;

2) Polygonized Raster over Polygons ~0.170 ms

---------

select lc_2017_value, 
sum(st_area(st_intersection(a.geom_2263, b.geom_2263))) as area_ft2
from base_rasters.landcover6in_2017_polygons a
join scratch.smallsample b
on st_intersects(a.geom_2263, b.geom_2263)
group by lc_2017_value;

 

 

Here's the information for the raster table. (I had added a second index in hopes it might help but no, it didn't).

---------

                                Table "base_rasters.landcover6in_2017"
 Column |  Type   | Collation | Nullable |                           Default
--------+---------+-----------+----------+-------------------------------------------------------------
 rid    | integer |           | not null | nextval('base_rasters.landcover6in_2017_rid_seq'::regclass)
 rast   | raster  |           |          |
Indexes:
    "landcover6in_2017_pkey" PRIMARY KEY, btree (rid)
    "landcover6in_2017_envidx" gist (st_envelope(rast))
    "landcover6in_2017_st_convexhull_idx" gist (st_convexhull(rast))
Check constraints:
    "enforce_height_rast" CHECK (st_height(rast) = ANY (ARRAY[128, 46]))
    "enforce_max_extent_rast" CHECK (st_envelope(rast) @ '0103000020D70800000100000005000000C3F528DC3DD72B41CCCCCCCCB46BFC40C3F528DC3DD72B413333333349B31041E27A14EE9E4930413333333349B31041E2
7A14EE9E493041CCCCCCCCB46BFC40C3F528DC3DD72B41CCCCCCCCB46BFC40'::geometry) NOT VALID
    "enforce_nodata_values_rast" CHECK (_raster_constraint_nodata_values(rast) = '{0.0000000000}'::numeric[])
    "enforce_num_bands_rast" CHECK (st_numbands(rast) = 1)
    "enforce_out_db_rast" CHECK (_raster_constraint_out_db(rast) = '{f}'::boolean[])
    "enforce_pixel_types_rast" CHECK (_raster_constraint_pixel_types(rast) = '{8BUI}'::text[])
    "enforce_same_alignment_rast" CHECK (st_samealignment(rast, '0100000000000000000000E03F000000000000E0BFC3F528DCBDB92E413333333349B3104100000000000000000000000000000000D708000001000100'::
raster))
    "enforce_scalex_rast" CHECK (round(st_scalex(rast)::numeric, 10) = round(0.5, 10))
    "enforce_scaley_rast" CHECK (round(st_scaley(rast)::numeric, 10) = round(- 0.5, 10))
    "enforce_srid_rast" CHECK (st_srid(rast) = 2263)
    "enforce_width_rast" CHECK (st_width(rast) = 128)

 

Here's the raster2pgsql code I used to load the raster in the first place:

-------

raster2pgsql -s 2263 -d -C -t 128x128 -M -I -l 4,16 "NYC_2017_LiDAR_LandCover.tif" base_rasters.landcover6in_2017 | psql -h localhost -U postgres -d nycdb -v ON_ERROR_STOP=1

 

Lastly - here's my PostGIS info (Running on Windows 10, 64 bit): 

POSTGIS="3.1.1 3.1.1" [EXTENSION] PGSQL="130" GEOS="3.9.0-CAPI-1.14.1" SFCGAL="1.3.8" PROJ="7.1.1" GDAL="GDAL 3.2.1, released 2020/12/29" LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.5.0 (Internal)" TOPOLOGY RASTER

 

 

_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org> 
https://lists.osgeo.org/mailman/listinfo/postgis-users

-- 
CYBERTEC PostgreSQL International GmbH
Römerstraße 19, A-2752 Wöllersdorf
Web: https://www.cybertec-postgresql.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20220201/a617ed57/attachment.html>


More information about the postgis-users mailing list