Warning: long. very long.<br><br>I spent some time this morning thinking about a statement of work for postgis raster fv.02 and 03. (BTW: The "planning" page is out of sync with the "Specifications" page, as fv.03 on the planning page is fv.04 on the specs page.) I want to make sure that proposed development on my part is consistent with the overarching structure established by you all, so the first thing I need to nail down is the big picture.<br>
<br>The thing which first attracted me to postgis raster was the desire to provide a seamless set of operations for vector and raster data. However, it seems that even my very first attempt to use the tool has exposed a fundamental philosophical difference between vector and raster which is difficult to handle uniformly. Basically, I wanted to reconstruct an entire raster from the tiles into which it had been segmented.<br>
<br>A simple case like this, where the tiles are regularly blocked and disjoint, seems amenable to treatment with a raster version of the existing ST_Collect() function (which is speedy because it doesn't try anything crafty to eliminate overlaps). The proposed implementations of ST_Union in svn assume a more general case where there are overlaps, and hence a need to select a destination raster value from among the candidates in the source rasters.<br>
<br>Right there is the difference between geometry and raster processing. Geometries carry shape information with no implied value (i.e., they are primitives). Rasters are closer to coverages or feature tables: each pixel location (geometry) is inseparably paired with a set of numeric values in the various bands. (CV_GeometryValuePair in ISO 19123 speak)<br>
<br>Looking at the current prototypes, I think there may be some benefit to identifying some concerns to separate before too much more work is done. <br><br>Firstly, I think we can and should separate purely geometric items from the need to generate a value for each resultant pixel.<br>
<br>Secondly, we need to separate single-band-value operations from raster-value operations. ENVI embodies this distinction as Band Math vs. Spectral Math. For instance, setting the color of a pixel would be a raster-value (all band) operation, whereas most of the MapAlgebra and statistics functions are single-band-value operations.<br>
<br>Thirdly, I think we've entered territory where aggregate functions need to be considered separately from the non-aggregate functions, at least in some cases.<br><br>Intuitively, I want the geometric operations to be as similar to their vector counterparts as possible/reasonable. They should be primitives upon which more complex functions can be based, and not the other way around. <br>
<br>For the simple case, the geometric behavior of these new raster-returning functions is well constrained. For more complex cases, geometric behavior is somewhat less well defined. Whether the geometric behavior is simple or not depends on the data itself (do the input grids have the same pixel size? are they related by a simple translation? are they rotated with respect to each other?) Geometrically speaking, raster aggregate functions do not need different treatment than their vector counterparts.<br>
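<br>To make the three questions above concrete, here is a rough sketch in Python (not PostGIS raster code). The Grid class, the classify_alignment function and the labels 'aligned', 'shifted' and 'resample' are all hypothetical names of mine, and I am assuming the usual affine convention x = ulx + scale_x*col + skew_x*row, y = uly + skew_y*col + scale_y*row.<br>

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Grid:
    """Georeference of a raster: upper-left corner plus per-pixel affine terms."""
    ulx: float
    uly: float
    scale_x: float
    scale_y: float
    skew_x: float = 0.0
    skew_y: float = 0.0


def classify_alignment(a: Grid, b: Grid, tol: float = 1e-9) -> str:
    """Return 'aligned', 'shifted' or 'resample' (hypothetical labels)."""
    terms = ("scale_x", "scale_y", "skew_x", "skew_y")
    # Same pixel size and same rotation means the cells have the same shape.
    same_cells = all(
        math.isclose(getattr(a, t), getattr(b, t), rel_tol=0.0, abs_tol=tol)
        for t in terms
    )
    if not same_cells:
        return "resample"

    # Express b's origin in a's pixel coordinates by inverting a's 2x2 matrix.
    det = a.scale_x * a.scale_y - a.skew_x * a.skew_y
    if det == 0:
        raise ValueError("degenerate grid")
    dx, dy = b.ulx - a.ulx, b.uly - a.uly
    col = (a.scale_y * dx - a.skew_x * dy) / det
    row = (a.scale_x * dy - a.skew_y * dx) / det

    # A whole-pixel offset means the two grids share cell boundaries.
    on_grid = abs(col - round(col)) <= tol and abs(row - round(row)) <= tol
    return "aligned" if on_grid else "shifted"


base = Grid(0.0, 0.0, 10.0, -10.0)
print(classify_alignment(base, Grid(100.0, -50.0, 10.0, -10.0)))
print(classify_alignment(base, Grid(105.0, -50.0, 10.0, -10.0)))
print(classify_alignment(base, Grid(0.0, 0.0, 5.0, -5.0)))
```

<br>Only the 'aligned' case is the simple one where tiles can be reassembled without touching pixel values; the other two would need some form of resampling before any value selection happens.<br>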
<br>Considered separately, the complexity of selecting a value for each destination raster cell depends on the spatial operation and not the data. If two rasters are participating: ST_Difference and ST_SymDifference have only one possible value; ST_Union could possibly force a choice; ST_Intersection is guaranteed to force a choice. Alternatively, each of these operations could be equally simple if they just returned a mask. (And returning a mask would be as close as possible to the semantics/behavior of the geometric operations.) Clearly, there is no ambiguity as to the resultant value if a raster and a geometry are the inputs.<br>
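<br>A toy illustration of that counting argument, in Python rather than SQL. This is only a sketch: I am modelling a single-band raster as a dict from (col, row) to value on a shared grid, and the candidates function is a hypothetical helper, not anything in the svn prototypes.<br>

```python
def candidates(op, a, b):
    """Map each destination cell to the list of source values competing for it."""
    if op == "difference":
        cells = a.keys() - b.keys()
    elif op == "symdifference":
        cells = a.keys() ^ b.keys()
    elif op == "union":
        cells = a.keys() | b.keys()
    elif op == "intersection":
        cells = a.keys() & b.keys()
    else:
        raise ValueError(f"unknown operation: {op}")
    # Collect one candidate per input raster that covers the cell.
    return {cell: [r[cell] for r in (a, b) if cell in r] for cell in sorted(cells)}


# Two one-row rasters that overlap only at cell (1, 0).
ras_a = {(0, 0): 1, (1, 0): 2}
ras_b = {(1, 0): 20, (2, 0): 30}

for op in ("difference", "symdifference", "union", "intersection"):
    result = candidates(op, ras_a, ras_b)
    counts = sorted({len(v) for v in result.values()})
    print(op, "candidates per cell:", counts)
```

<br>The keys of each result are the mask; the lists are where a selection rule would have to step in. Only union and intersection ever produce a list longer than one.<br>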
<br>Aggregate functions which set the value in a raster result must be defined and used extremely carefully. Functions which yield the same value regardless of evaluation order can be provided safely (count, sum, stdev, min, max, etc.). Other functions may be provided on an "at the user's risk" basis. "First" and "Last" will evaluate differently depending on how the query is evaluated by the server--assuming there are overlapping input rasters. However, for the ST_Collect or ST_Union call intended to assemble a larger raster out of many non-overlapping components, functions like "First" and "Last" are needed. <br>
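<br>The order-dependence point can be checked mechanically. Below is a small Python sketch (hypothetical helper names, nothing to do with the server's actual aggregate machinery) that feeds the candidate values for one destination cell to a reducer in every possible order and reports whether the answer ever changes. I use integers so that floating-point rounding in sum does not muddy the picture.<br>

```python
from itertools import permutations


def first(values):
    """Pick whichever input value happened to arrive first."""
    return values[0]


def last(values):
    """Pick whichever input value happened to arrive last."""
    return values[-1]


def is_order_independent(reducer, values):
    """True if the reducer gives one answer for every ordering of the inputs."""
    results = {reducer(list(p)) for p in permutations(values)}
    return len(results) == 1


overlapping = [3, 1, 2]   # three source rasters cover this cell
disjoint = [7]            # only one source raster covers this cell

for name, fn in [("count", len), ("sum", sum), ("min", min),
                 ("max", max), ("first", first), ("last", last)]:
    print(name,
          "overlapping:", is_order_independent(fn, overlapping),
          "disjoint:", is_order_independent(fn, disjoint))
```

<br>With overlapping inputs only first and last fail the check; with a single candidate per cell, which is the non-overlapping tile case, every reducer passes.<br>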
<br>Thank you for your patience as I grapple with the big picture in email. I'll let this brew for a little while before solidifying any statements of work. If you have observations, comments or corrections to my big picture, please do speak up. <br>
<br>Bryce<br>