[postgis-devel] Another idea to speedup raster value editing

Mon Feb 6 13:23:46 PST 2012

> 	> This suggests a user visible API addition of
> 	>
> 	> ST_DumpRasterAsGeomval(rast raster, cell_pt geometry OUT,
> 	> values array OUT)
> 
> 
> 	How is this different from ST_DumpAsPolygons()? I agree that
> ST_DumpAsPolygonValues() would have been clearer...
> 
> Well, the geometry column would contain points (cell centers), for one. Values is
> also an array instead of a scalar, making this a true raster operation instead of a
> single band operation.

> With this call, you're working with pixels instead of the perimeter of aggregated
> clumps of pixels. That's vital when you start thinking multiband. Individual pixels
> can always have a unique combination of band values. Aggregations of
> pixels...well, in certain special cases where the bands are well correlated, I
> suppose it may work OK. But on real data, you'd end up with one polygon
> containing all the pixels having the value [12, 15] and another polygon for pixels
> having value [12, 22]...assuming you put in a lot of effort to implement a
> multiband DumpAsPolygons.

There is already an ST_PixelAsPolygons() that dumps a geomval for each pixel, without spatial aggregation.

I agree that we should make it multiband-aware by allowing an array of bands to be passed. The result would be a geomvalarray.

I also agree that we should have an ST_PixelAsPoints() doing basically the same thing as ST_PixelAsPolygons().
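
To make the per-pixel, multiband dump concrete, here is a minimal Python sketch of the idea, not PostGIS code. It assumes a north-up raster held as nested lists (band[row][col]); the function name pixel_as_points and its parameters are hypothetical, chosen only for illustration.

```python
from typing import Iterator, List, Tuple

Band = List[List[float]]  # band[row][col], all bands share the same shape


def pixel_as_points(
    bands: List[Band],
    upper_left_x: float,
    upper_left_y: float,
    pixel_width: float,
    pixel_height: float,
) -> Iterator[Tuple[Tuple[float, float], List[float]]]:
    """Yield ((x, y), [value per band]) for every pixel, with no aggregation.

    Hypothetical helper: (x, y) is the pixel center, assuming a north-up
    raster whose Y coordinate decreases as the row index increases.
    """
    if not bands:
        return
    rows = len(bands[0])
    cols = len(bands[0][0]) if rows else 0
    for row in range(rows):
        for col in range(cols):
            x = upper_left_x + (col + 0.5) * pixel_width
            y = upper_left_y - (row + 0.5) * pixel_height
            # One array of values per pixel: one entry for each band.
            yield (x, y), [band[row][col] for band in bands]


if __name__ == "__main__":
    two_bands = [
        [[12, 12], [15, 15]],  # band 1
        [[15, 22], [1, 2]],    # band 2
    ]
    for center, values in pixel_as_points(two_bands, 0.0, 2.0, 1.0, 1.0):
        print(center, values)
```

Each pixel keeps its own combination of band values, such as [12, 15] and [12, 22], which is the point made above about working with pixels rather than aggregated polygons.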

> 	> ST_LoadRasterFromGeomval(rast raster, t_name text,
> 	> pt_col text default "cell_pt", values array)
> 	>
> 	> t_name = table name
> 	> pt_col   = name of column containing points
> 	> values = array of column names containing values to load into raster.
> 
> 
> 	How is this different from the planned ST_UnionToRaster()? What do
> you do when two or more points fall into the same pixel?
> 
> First question: it appears that ST_UnionToRaster: a] is not planned to operate on
> an existing raster; 

Right. But ST_BurnToRaster() just above is.

> b] will apply the same value to every geometry; 

False: Read carefully. Both ST_BurnToRaster() and ST_UnionToRaster() have an optional 'value' parameter. This parameter could be an array, to support the creation of multiband rasters from multiple values associated with points, lines or polygons.

> c] does not
> specify if it will even take "points", or what the outcome might be if you
> provided points.

One point = one pixel. The real problem comes with multipoints. In this case the values have to be gathered from the Z or M, not from a table attribute. There should be a special parameter for that, saying "take the value from the geometry".

> Second question: do whatever is appropriate. If you're considering the pixels to
> be an area, then average the values within the pixel, or take the max, or the min,
> depending on what you're interested in. Or consider the array of geomvals to
> represent layers in "z order" (topmost layer wins). If two or more polygons fall
> within the same pixel, then weight the average by the area of each polygon. If
> you're considering the pixels to represent point samples, interpolate (krige, IDW,
> etc.). (And by "you", I don't mean the authors of the tool, I mean users.) In any
> case, these are all meant to outline the array of use cases which are likely to be
> common, support whatever subset is convenient.

This is the goal of p_expression, t_expression and f_expression. They are pixel, temporary and final expressions allowing the implementation of many options. They are accompanied by their respective nodata alternative expressions. The number of parameters might seem daunting, but they can be reduced to one keyword parameter such as 'MEAN', 'COUNT', 'MAX_VALUE', 'MIN_VALUE', 'MAX_LENGTH' or 'MAX_AREA', each implying a predefined set of pixel, temporary and final expressions, similar to what we did for ST_Union.
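
As a rough illustration of how one keyword can stand for a predefined set of expressions, here is a Python sketch, not the planned PostGIS implementation. It assumes each geometry touching a pixel contributes a value and the area it covers in that pixel; the names Contribution, RULES and resolve_pixel are hypothetical. Each rule is reduced to a starting state, an accumulation step and a final step, which is my own simplified reading of the pixel, temporary and final stages.

```python
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple


class Contribution(NamedTuple):
    value: float  # value carried by one geometry
    area: float   # area of that geometry inside the pixel (assumed given)


# keyword -> (initial state, accumulation step, final step).
# Hypothetical presets; the real expressions may differ.
RULES: Dict[str, Tuple[object, Callable, Callable]] = {
    "MEAN": (
        (0.0, 0),
        lambda s, c: (s[0] + c.value, s[1] + 1),
        lambda s: s[0] / s[1],
    ),
    "COUNT": (0, lambda s, c: s + 1, lambda s: s),
    "MAX_VALUE": (
        None,
        lambda s, c: c.value if s is None else max(s, c.value),
        lambda s: s,
    ),
    "MIN_VALUE": (
        None,
        lambda s, c: c.value if s is None else min(s, c.value),
        lambda s: s,
    ),
    # Value of the geometry covering the greatest area of the pixel.
    "MAX_AREA": (
        None,
        lambda s, c: c if s is None or c.area > s.area else s,
        lambda s: s.value,
    ),
}


def resolve_pixel(
    contributions: List[Contribution],
    keyword: str,
    nodata: Optional[float] = None,
) -> Optional[float]:
    """Combine all contributions to one pixel according to a keyword rule."""
    if not contributions:
        # Stand-in for the nodata alternative expressions.
        return nodata
    state, step, final = RULES[keyword]
    for contribution in contributions:
        state = step(state, contribution)
    return final(state)


if __name__ == "__main__":
    overlaps = [
        Contribution(12.0, 0.2),
        Contribution(22.0, 0.7),
        Contribution(15.0, 0.1),
    ]
    for keyword in RULES:
        print(keyword, resolve_pixel(overlaps, keyword))
```

Adding a rule such as 'MAX_LENGTH' would only mean adding one more entry with a length field in place of area.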

> But realize that by asking that second question, you have placed yourself
> squarely in "Coverage (ISO19123)" territory. Deciding what value to return at a
> point based on a set of geomvals is exactly what a coverage does. It may be
> worth your while to investigate their solution to this problem before rolling your
> own.

I would be surprised if ISO19123 offered so much flexibility. Last time I looked at it (2008), CV_CommonPointRule defined only average, low, high, all, start and end. All of these are easily implementable with p_expression, t_expression and f_expression. There is no way to say 'THE_VALUE_OF_THE_POLYGON_WITH_THE_GREATEST_AREA_COVERING_THE_AREA_OF_THE_PIXEL', which is 'MAX_AREA'.

Pierre