[postgis-users] ST_Intersection - Slow with raster, fast with polygonized raster?
Mike Treglia
mtreglia at gmail.com
Mon Jan 31 07:56:43 PST 2022
Thanks, Florian,
Both of those alternatives take ~1 min 20 sec.
Below are the execution plans for the queries I previously shared. Let me
know if there's anything else that would be helpful.
Best,
Mike
------Raster/Vector Overlay
explain select
(foo.geomval).val,
sum(st_area((geomval).geom)) as area_ft2
from (
select (st_intersection(a.rast, b.geom_2263)) as geomval
from base_rasters.landcover6in_2017 a
join scratch.smallsample b
on st_intersects(a.rast, b.geom_2263)) as foo
group by val;
HashAggregate (cost=3904551.46..3904553.46 rows=200 width=16)
Group Key: (foo.geomval).val
-> Subquery Scan on foo (cost=0.41..456301.46 rows=26525000 width=40)
-> ProjectSet (cost=0.41..191051.46 rows=26525000 width=32)
-> Nested Loop (cost=0.41..51662.59 rows=26525 width=803)
-> Seq Scan on smallsample b (cost=0.00..1.28 rows=28
width=32)
-> Index Scan using
landcover6in_2017_st_convexhull_idx on landcover6in_2017 a
(cost=0.41..1844.10 rows=95 width=771)
Index Cond: ((rast)::geometry && b.geom_2263)
Filter: _st_intersects(b.geom_2263, rast,
NULL::integer)
--Polygonized Raster/Vector Overlay
explain select lc_2017_value,
sum(st_area(st_intersection(a.geom_2263, b.geom_2263))) as area_ft2
from base_rasters.landcover6in_2017_polygons a
join scratch.smallsample b
on st_intersects(a.geom_2263, b.geom_2263)
group by lc_2017_value;
HashAggregate (cost=12518962.15..12518962.23 rows=8 width=16)
Group Key: a.lc_2017_value
-> Nested Loop (cost=0.42..1295125.12 rows=446631 width=1388)
-> Seq Scan on smallsample b (cost=0.00..1.28 rows=28 width=32)
-> Index Scan using landcover6in_2017_polygons_geom_idx on
landcover6in_2017_polygons a (cost=0.42..46238.47 rows=1595 width=1356)
Index Cond: (geom_2263 && b.geom_2263)
Filter: st_intersects(geom_2263, b.geom_2263)
On Mon, Jan 31, 2022 at 2:42 AM Florian Nadler <florian.nadler at cybertec.at>
wrote:
> Hi,
>
> can you provide execution plans for both queries?
>
> In the meantime, please try the following variants:
>
> select(foo.geomval).val,sum(st_area((geomval).geom)) as area_ft2from (
> select (st_intersection(a.rast, b.geom_2263)) as geomval from base_rasters.landcover6in_2017 a join scratch.smallsample b on st_intersects(a.rast::geometry, b.geom_2263)) as foogroup by val;
>
> select(foo.geomval).val,sum(st_area((geomval).geom)) as area_ft2from (
> select (st_intersection(a.rast, b.geom_2263)) as geomval from base_rasters.landcover6in_2017 a join scratch.smallsample b on st_intersects(a.rast, b.geom_2263) where a.rast && b.geom_2263) as foogroup by val
>
> Am 31.01.2022 um 04:32 schrieb Mike Treglia:
>
> Hi all,
>
> I was exploring raster/vector overlays, basically to calculate the area of
> each of several landcover classes (from raster data) within an area or sets
> of areas (delineated in polygon/multipolygon data). In the past, I had
> ended up polygonizing the raster, making sure the polygonized layer had a
> spatial index, and developing a query from there.
>
> But I wanted to come back to the problem and see if I could do it in what
> seemed like the more elegant way, using the raster directly without an
> intermediate step of polygonizing it. While I'm able to do it, at least on
> a small sample area, it's at a huge performance cost.
>
> I'm wondering if I'm just missing something I need to do to make the
> raster-based version operate more efficiently. (For what it's worth,
> rasters like this, are also slow to load into QGIS and such [though once
> loaded, are fairly quick to render], so I wonder if that's related at all
> and fixable.)
>
> My test case is in New York City - I'm working with a six-inch resolution
> landcover dataset for the entirety of NYC (
> https://data.cityofnewyork.us/Environment/Land-Cover-Raster-Data-2017-6in-Resolution/he6d-2qns),
> which I'm overlaying with a small subset of parcels in a small geographic
> area. In this example, below, I'm just adding up the area for each
> landcover class across the focal parcels for simplicity. When using the
> polygonized landcover data, it takes about 170ms. When using the raster
> landcover data, it takes about 1 min 20 seconds.
>
> Below are the queries, as well as the information about the raster table
> in PostGIS, and the raster2pgsql code I used to load the layer into the
> database. Curious to hear any thoughts from folks on the list. Of course if
> there's anything else I can share that would be helpful, let me know
>
> Thanks so much! Best,
> Mike
>
> Here are the queries I'm running:
> 1) Original Raster over Polygons- ~1 min 20 sec
> ---------
> select
> (foo.geomval).val,
> sum(st_area((geomval).geom)) as area_ft2
> from (
> select (st_intersection(a.rast, b.geom_2263)) as geomval
> from base_rasters.landcover6in_2017 a
> join scratch.smallsample b
> on st_intersects(a.rast, b.geom_2263)) as foo
> group by val;
>
> 2) Polygonized Raster over Polygons ~0.170 ms
> ---------
> select lc_2017_value,
> sum(st_area(st_intersection(a.geom_2263, b.geom_2263))) as area_ft2
> from base_rasters.landcover6in_2017_polygons a
> join scratch.smallsample b
> on st_intersects(a.geom_2263, b.geom_2263)
> group by lc_2017_value;
>
>
> Here's the information for the raster table. (I had added a second index
> in hopes it might help but no, it didn't).
> ---------
> Table "base_rasters.landcover6in_2017"
> Column | Type | Collation | Nullable |
> Default
>
> --------+---------+-----------+----------+-------------------------------------------------------------
> rid | integer | | not null |
> nextval('base_rasters.landcover6in_2017_rid_seq'::regclass)
> rast | raster | | |
> Indexes:
> "landcover6in_2017_pkey" PRIMARY KEY, btree (rid)
> "landcover6in_2017_envidx" gist (st_envelope(rast))
> "landcover6in_2017_st_convexhull_idx" gist (st_convexhull(rast))
> Check constraints:
> "enforce_height_rast" CHECK (st_height(rast) = ANY (ARRAY[128, 46]))
> "enforce_max_extent_rast" CHECK (st_envelope(rast) @
> '0103000020D70800000100000005000000C3F528DC3DD72B41CCCCCCCCB46BFC40C3F528DC3DD72B413333333349B31041E27A14EE9E4930413333333349B31041E2
> 7A14EE9E493041CCCCCCCCB46BFC40C3F528DC3DD72B41CCCCCCCCB46BFC40'::geometry)
> NOT VALID
> "enforce_nodata_values_rast" CHECK
> (_raster_constraint_nodata_values(rast) = '{0.0000000000}'::numeric[])
> "enforce_num_bands_rast" CHECK (st_numbands(rast) = 1)
> "enforce_out_db_rast" CHECK (_raster_constraint_out_db(rast) =
> '{f}'::boolean[])
> "enforce_pixel_types_rast" CHECK (_raster_constraint_pixel_types(rast)
> = '{8BUI}'::text[])
> "enforce_same_alignment_rast" CHECK (st_samealignment(rast,
> '0100000000000000000000E03F000000000000E0BFC3F528DCBDB92E413333333349B3104100000000000000000000000000000000D708000001000100'::
> raster))
> "enforce_scalex_rast" CHECK (round(st_scalex(rast)::numeric, 10) =
> round(0.5, 10))
> "enforce_scaley_rast" CHECK (round(st_scaley(rast)::numeric, 10) =
> round(- 0.5, 10))
> "enforce_srid_rast" CHECK (st_srid(rast) = 2263)
> "enforce_width_rast" CHECK (st_width(rast) = 128)
>
> Here's the raster2pgsql code I used to load the raster in the first place:
> -------
> raster2pgsql -s 2263 -d -C -t 128x128 -M -I -l 4,16
> "NYC_2017_LiDAR_LandCover.tif" base_rasters.landcover6in_2017 | psql -h
> localhost -U postgres -d nycdb -v ON_ERROR_STOP=1
>
> Lastly - here's my PostGIS info (Running on Windows 10, 64 bit):
> POSTGIS="3.1.1 3.1.1" [EXTENSION] PGSQL="130" GEOS="3.9.0-CAPI-1.14.1"
> SFCGAL="1.3.8" PROJ="7.1.1" GDAL="GDAL 3.2.1, released 2020/12/29"
> LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.5.0 (Internal)"
> TOPOLOGY RASTER
>
>
> _______________________________________________
> postgis-users mailing listpostgis-users at lists.osgeo.orghttps://lists.osgeo.org/mailman/listinfo/postgis-users
>
> --
> CYBERTEC PostgreSQL International GmbH
> Römerstraße 19, A-2752 Wöllersdorf
> Web: https://www.cybertec-postgresql.com
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20220131/9748c0c7/attachment.html>
More information about the postgis-users
mailing list