[postgis-users] Incorrect Results for ST_ValueCount with large polygons

Michael Treglia mtreglia at gmail.com
Thu Jul 19 14:04:27 PDT 2018


Hi All,

As partial resolution to this, below are a couple of working examples of
SQL, thanks to an example from Regina Obe's book.

These work on smaller subsets of the data I've used for testing, though
unfortunately in running on the entire datasets, after running for hours, I
receive an error: Invalid memory allocation request size 1109458944.
Thoughts/suggestions welcome, but otherwise I hope these examples are
useful to tothers.

Mike


--SQL CODE
with
    cte as (
        select bbl,
            st_valuecount(
                st_clip(st_union(p.rast), geom_2263),1, true,
ARRAY[0,1,2,3,4,5,6,7]
                ) as pv
            FROM
                base_rasters.landcover6in as p
                inner join
                results_scratch.polys_test
            on st_intersects(p.rast, geom_2263)
            group by bbl, geom_2263
        )
SELECT bbl,
    (pv).value, sum((pv).count)
from cte
group by bbl, (pv).value
ORDER by bbl, (pv).value;
--End SQL

And to get the results with land cover by polygon in a 'wide' format (e.g.,
amounts of different land cover in columns):

--SQL CODE
with
    cte as (
        select bbl,geom_2263,
            st_valuecount(
                st_clip(st_union(p.rast), geom_2263),1, true,
ARRAY[0,1,2,3,4,5,6,7]
                ) as pv
            FROM
                base_rasters.landcover6in as p
                inner join
                results_scratch.polys_test
            on st_intersects(p.rast, geom_2263)
            group by bbl, geom_2263
        )
SELECT bbl, geom_2263,
    (sum((pv).count) filter (where (pv).value =
0)::numeric/sum((pv).count))::numeric(5,4) as prop_na,
    (sum((pv).count) filter (where (pv).value =
1)::numeric/sum((pv).count))::numeric(5,4) as prop_canopy,
    (sum((pv).count) filter (where (pv).value =
2)::numeric/sum((pv).count))::numeric(5,4)  as prop_grass,
    (sum((pv).count) filter (where (pv).value =
3)::numeric/sum((pv).count))::numeric(5,4)  as prop_bare,
    (sum((pv).count) filter (where (pv).value =
4)::numeric/sum((pv).count))::numeric(5,4)  as prop_water,
    (sum((pv).count) filter (where (pv).value =
5)::numeric/sum((pv).count))::numeric(5,4)  as prop_bldgs,
    (sum((pv).count) filter (where (pv).value =
6)::numeric/sum((pv).count))::numeric(5,4)  as prop_roads,
    (sum((pv).count) filter (where (pv).value =
7)::numeric/sum((pv).count))::numeric(5,4)  as prop_paved,
    ((sum((pv).count) filter (where (pv).value = 0)) * 0.25)::numeric(12,2)
as area_na,
    ((sum((pv).count) filter (where (pv).value = 1)) * 0.25)::numeric(12,2)
as area_canopy,
    ((sum((pv).count) filter (where (pv).value = 2)) * 0.25)::numeric(12,2)
as area_grass,
    ((sum((pv).count) filter (where (pv).value = 3)) * 0.25)::numeric(12,2)
as area_bare,
    ((sum((pv).count) filter (where (pv).value = 4)) * 0.25)::numeric(12,2)
as area_water,
    ((sum((pv).count) filter (where (pv).value = 5)) * 0.25)::numeric(12,2)
as area_bldgs,
    ((sum((pv).count) filter (where (pv).value = 6)) * 0.25)::numeric(12,2)
as area_roads,
    ((sum((pv).count) filter (where (pv).value = 7)) * 0.25)::numeric(12,2)
as area_paved
from cte
group by bbl, geom_2263
ORDER by bbl;
--End SQL

On Wed, Jul 11, 2018 at 4:59 PM, Michael Treglia <mtreglia at gmail.com> wrote:

> Hi everyone,
>
> Sorry for multiple emails on this, but wanted to ping this list again to
> see if there is anything obvious I'm missing or should check regarding this
> issue. I've reiterated context/the problem/sample sql code below, and if
> there's other info that would be helpful, please let me know.
>
> Thanks! Sincerely,
> Mike
>
> # To reiterate:
> ## The goal/issue:
> I'm trying to calculate # of pixels from a categorical raster, by value,
> within distinct polygons of a vector layer. I have code, below, that seems
> to work accurately for small polygons, though I'm getting inaccuracies in
> large polygons - the addition of some pixels, including some from a value
> that aren't actually contained within a focal polygon. (I'm comparing to
> available calculations for the same overlay, along with results from QGIS
> Zonal Histogram tool). I'd like to get this working for this specific
> overlay, but then be able to apply the SQL code to comparable situations.
>
> ##Data:
> The polygons are parcels, ranging quite widely in size;
> The raster dataset is a 6" land cover dataset loaded as an in-db raster
> via raster2pgsql;
> raster2pgsql -s 2263 -d -C -t 128x128 -M -I -l 4,16
> "landcover_2010_nyc_05ft.img" base_rasters.landcover6in | psql -h localhost
> -U postgres -d dbname -v ON_ERROR_STOP=1
>
> ## System Info:
> I'm running  PostGIS 2.4.4 with PostgreSQL v 10.3 on Windows 7 x64
>
> ## Two sets of sample code that give the same result:
> 1)
> SELECT  bbl, (pvc).VALUE, SUM((pvc).COUNT) AS tot_pix
>  FROM base_rasters.landcover6in
>   INNER join  results_scratch.polys_test
>   ON ST_Intersects(landcover6in.rast, polys_test.geom_2263),
>     ST_ValueCount(ST_Clip(landcover6in.rast,polys_test.geom_2263),1,
> TRUE, ARRAY[1, 2, 3, 4, 5, 6, 7]) AS pvc
>   GROUP BY (pvc).VALUE, bbl
>  ORDER BY bbl, (pvc).VALUE;
>
> 2)
>  SELECT
>    bbl, (value_count).value, SUM((value_count).count) AS count
>   FROM
>     (
>     select
>     bbl,
>       rid,
>       ST_ValueCount(
>         ST_Union(ST_Clip(rast, geom_2263, TRUE)), 1, TRUE, ARRAY[1, 2, 3,
> 4, 5, 6, 7]
>       ) value_count
>     FROM
>       (SELECT bbl, geom_2263 FROM results_scratch.polys_test) v,
>       (SELECT rid, rast FROM base_rasters.landcover6in) r
>     WHERE ST_Intersects(rast, geom_2263)
>     GROUP BY bbl, rid, geom_2263
>     ) i
>   GROUP BY bbl, value
>   ORDER BY bbl, value
>
>
>
> On Fri, Jun 22, 2018 at 5:12 PM, Michael Treglia <mtreglia at gmail.com>
> wrote:
>
>> Hi again all,
>>
>> Sorry for the multiple messages on this - realizing the query I posted
>> was somewhat off from what I intended - below are examples that are getting
>> at what I'm going for but exhibiting the issue I described (sometimes
>> including pixels in counts aren't actually be within polygons [i.e., the
>> categorical value isn't actually within the polygon based on visual
>> inspection & this same analysis with other tools]).
>>
>> Simpler/quicker:
>>
>>   SELECT  bbl, (pvc).VALUE, SUM((pvc).COUNT) AS tot_pix
>>  FROM base_rasters.landcover6in
>>   INNER join  results_scratch.polys_test
>>   ON ST_Intersects(landcover6in.rast, polys_test.geom_2263),
>>     ST_ValueCount(ST_Clip(landcover6in.rast,polys_test.geom_2263),1,
>> TRUE, ARRAY[1, 2, 3, 4, 5, 6, 7]) AS pvc
>>   GROUP BY (pvc).VALUE, bbl
>>  ORDER BY bbl, (pvc).VALUE;
>>
>>
>>
>> A longer, less efficient version that gives the same results (but unions
>> the clipped raster tiles):
>>
>>
>>  SELECT
>>    bbl, (value_count).value, SUM((value_count).count) AS count
>>   FROM
>>     (
>>     select
>>     bbl,
>>       rid,
>>       ST_ValueCount(
>>         ST_Union(ST_Clip(rast, geom_2263, TRUE)), 1, TRUE, ARRAY[1, 2, 3,
>> 4, 5, 6, 7]
>>       ) value_count
>>     FROM
>>       (SELECT bbl, geom_2263 FROM results_scratch.polys_test) v,
>>       (SELECT rid, rast FROM base_rasters.landcover6in) r
>>     WHERE ST_Intersects(rast, geom_2263)
>>     GROUP BY bbl, rid, geom_2263
>>     ) i
>>   GROUP BY bbl, value
>>   ORDER BY bbl, value
>>
>> On Thu, Jun 21, 2018 at 4:23 PM, Michael Treglia <mtreglia at gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I'm basically trying to compute # of pixels of land cover type by
>>> polygon, for lots of polygons.
>>>
>>> I'm working with a query a query like this:
>>> SELECT  bbl, (pvc).VALUE, SUM((pvc).COUNT) AS tot_pix
>>>  FROM base_rasters.landcover6in
>>>   INNER join  results_scratch.polys_test
>>>   ON ST_Intersects(landcover6in.rast, polys_test.geom_2263),
>>>     ST_ValueCount(ST_Clip(landcover6in.rast,polys_test.geom_2263),1) AS
>>> pvc
>>>   GROUP BY (pvc).VALUE, bbl
>>>  ORDER BY (pvc).VALUE;
>>>
>>> For smaller polygons, spot-checking, it looks like I'm generally getting
>>> the right results. However, for larger polygons, not always. For example,
>>> though a polygon might not have roads, I'm getting non-0 counts for road
>>> pixels within a polygon (polygons have unique identifiers as 'bbl').  The
>>> total area of pixels counted ends up being larger than the polygon (larger
>>> than would be attributable to differences between polygons edges & pixels)
>>>
>>> I'm thinking it's an issue when there are multiple raster tiles
>>> included... but not entirely sure. Any suggestions on whether there's
>>> something off in this query that might be causing an issue like I describe?
>>>
>>> Thanks in advance for any suggestions!
>>> Mike
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20180719/fd7dbcfa/attachment.html>


More information about the postgis-users mailing list