[postgis-devel] Another idea to speedup raster value editing

Bryce L Nordgren bnordgren at gmail.com
Wed Feb 8 09:05:50 PST 2012


First, it appears that this thread has been hijacked by discussion of a
general solution when a simple "raster loader" was called for. I'd suggest
that Sandro's needs could be met by a simple loader which just iterates
over all the points in an {array, multipoint, table} and sets the value at
that point. Last value in a pixel wins.
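To make the "simple loader" idea concrete, here is a minimal sketch in Python. Everything here is illustrative (the function name, the nested-list grid, the georeferencing parameters are my own), not PostGIS API:

```python
# Hypothetical sketch of the simple point loader: iterate over
# (x, y, value) points and burn each into the pixel it falls in.
# The grid is a plain list-of-lists; y0 is the upper-left corner.

def burn_points(points, width, height, x0, y0, pixel_size, nodata=None):
    """Set grid[row][col] for each point; last value in a pixel wins."""
    grid = [[nodata] * width for _ in range(height)]
    for x, y, value in points:
        col = int((x - x0) / pixel_size)
        row = int((y0 - y) / pixel_size)
        if 0 <= col < width and 0 <= row < height:
            grid[row][col] = value  # later points overwrite earlier ones
    return grid
```

The same loop works unchanged whether the points come from an array, a multipoint, or a table row set.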


On Mon, Feb 6, 2012 at 2:23 PM, Pierre Racine
<Pierre.Racine at sbf.ulaval.ca>wrote:

> > b] will apply the same value to every geometry;
>
> False: Read carefully. Both ST_BurnToRaster() and ST_UnionToRaster() have
> an optional 'value' parameter. This parameter could be an array to support
> the creation of multiband raster from multiple values associated with
> points, lines or polygons.
>

Ok. But this array is a constant. This same constant array would be applied
to all geometries. (e.g., band 1 always has the same value; band 2 always
has the same value, etc.)



> I would be surprised ISO19123 offers so much flexibility. Last time I
> looked at it (2008) CV_CommonPointRule was defining only average, low,
> high, all, start and end. All very easily implementable with the
> p_expression, t_expression and f_expression. There is no way to say
> 'THE_VALUE_OF_THE_POLYGON_WITH_THE_GREATEST_AREA_COVERING_THE_AREA_OF_THE_PIXEL'
> which is 'MAX_AREA'.
>

19123 is an abstract specification, describing a conceptual framework
within which problems may be framed. That means it's a people thing, not a
computer thing. Here's a bit of the toolbox I gleaned from 19123:

1] You may always ask a coverage for a value, given a location.
2] The "internals" of how the value is generated are not specified.
3] Some coverages produce a value for a location by examining a collection
of (geometry object, value) relationships.
3a] Of the coverages which are backed by (geometry object, value)
relationships, some use their "backing values" directly.
3b] Others use their backing values as the inputs to a calculation, which
generates the value to be returned.
4] Some coverages produce a value for a location purely via an equation,
and need no collection of values.
5] A "grid" is a special type of structured collection of (geometry object,
value) relationships, one relationship per grid point.

So, predefined concepts in hand, we encounter the current situation. We
have some geometry objects, each of which should have its own value. We
want to make a grid out of these objects, but we feel we must account for
the unstructured nature of the original collection (the objects may not be
aligned to our desired grid, there may be more than one object per grid
cell, etc.). The essential question is: Does familiarity with the tools
defined by 19123 help me frame the current problem in a productive way so
that the impact of my solution is maximized? Let's see:

1] Our "inputs", regardless of their particular form (table, multipoint
w/value in Z or M, array, setof), represent a collection of (geometry
object, value) relationships.
2] Our "output" is another collection of (geometry object, value)
relationships...this time on a grid.
2a] we assume the "geometry objects" (e.g., the grid definition) can either
be deduced or specified by the user.
2b] we need to calculate a value for each grid point, based on the inputs.
3] Aha! Calculating values in 2b, based on the collection of (geometry
object, value) relationships in 1 means we're making a coverage. But what
we don't know is "how do we want to generate values?"
4] After we have the coverage in 3, we can just ask it for values at each
grid point and copy them into the grid.
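Step 4 above is mechanical once the coverage exists. A minimal sketch, assuming the coverage is any callable taking (x, y) and the grid points are pixel centers (both of these are my assumptions, not anything 19123 dictates):

```python
def coverage_to_grid(evaluate, width, height, x0, y0, pixel_size):
    """Ask the coverage for a value at each grid point (here, pixel
    centers) and copy it into the grid; y0 is the upper-left corner."""
    grid = []
    for row in range(height):
        y = y0 - (row + 0.5) * pixel_size
        grid.append([evaluate(x0 + (col + 0.5) * pixel_size, y)
                     for col in range(width)])
    return grid
```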

Note that so far, we've only applied concepts we already know. We're not
breaking any new ground or leaving footprints on the moon or anything. The
form of the solution has already taken shape, and the only thing left to do
is clarify what we want values from the coverage in #3 to represent.
Explicitly stated:

"The value for any point in the coverage should be representative of the
values of the backing (geometry object, value) relationships which are in
the vicinity of the query point."

And of course, for our current application, "vicinity" means "area of the
pixel in the output grid".

Boy is that open-ended. It implies that the many and varied methods of
calculating "representative values" produce a whole family of
"neighborhood sampling" coverages. At the same time, it becomes clear that
it is probably worthwhile to put some limits on the inputs, just to
preserve our sanity: for instance, all of the "backing" geometry objects
should be points, or all of them should be polygons, but we shouldn't try
to calculate a value from a "mixed" input dataset.

Ok, now we have a big family of coverages. We've already decided to divide
them up by the type of input geometry. We need another criterion. The most
obvious one is the division between "representative values" which depend on
the individual geometric properties of the input data and those which do
not. Any method which does not depend on the geometric properties of the
input data is a candidate for a rasterize-then-union-with-mapalgebra
strategy (though there may be some candidates for optimization within this
group). Conversely, those methods which do depend on geometric properties
of the inputs (e.g., interpolation between points needs the positions of
the points; weighting by polygon area needs the areas of the polygons)
require some other solution.
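To make the dividing line concrete, here are two hypothetical "representative value" rules for a single pixel. The `contributions` shape (a list of (value, overlap_area) pairs for the geometries touching the pixel) is my own framing for the sketch:

```python
def mean_value(contributions):
    """Depends only on the values, not on any geometric property --
    a candidate for the rasterize-then-union-with-mapalgebra strategy."""
    return sum(v for v, _ in contributions) / len(contributions)

def area_weighted_mean(contributions):
    """Depends on a geometric property (the overlap area of each
    polygon with the pixel), so it needs per-geometry information
    that a simple rasterize-then-union pass throws away."""
    total = sum(a for _, a in contributions)
    return sum(v * a for v, a in contributions) / total
```

The MAX_AREA rule Pierre mentions falls in the second group for the same reason.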

Wow is that progress! Now we can see the overall solution. We can see how
and where the "planned strategy" fits in, and we can see what cases require
a supplement to that strategy. Along the way, we defined a useful tool for
the vector world. Most of the solution was pre-formed and led us directly
to the one critical problem statement which lies at the center of this
solution space. Teasing out the complete problem using familiar concepts
was fast and easy. (Although writing it out longhand as a step-by-step
example of how to use a conceptual toolkit takes a long time.) No getting
caught up in trivia before encountering the main issue.

Now we can deal with things like "Which forms of (geometry object, value)
collections do we want to handle?", specifically which representative value
calculations should be supported, and other implementation details.

Bryce