[postgis-devel] gen2 raster iterator tutorial

Pierre Racine Pierre.Racine at sbf.ulaval.ca
Mon Sep 26 13:39:12 PDT 2011


Bryce,

Sorry for such a late answer. I was at FOSS4G and then home sick.

I'm not going to answer this email point by point. Too long. Having more modular MapAlgebra code is certainly a concern that you could address, but performance and simplicity are also concerns.

Since you are proposing to use this iterator to replace things which already work pretty well (this is one reason for "such resistance"), like ST_Intersection(geometry, raster) and the one-raster version of ST_MapAlgebra(), or to write functions which are planned to be written differently (ST_Intersection, ST_Difference and ST_Symdifference are planned as wrappers around the two-raster MapAlgebra), I would propose that you:

1) Write an alternative ST_Intersection(geometry, raster) using your iterator and benchmark it against the existing one by repeating the operations described in the tutorial. This is on tiled rasters. If you can get good performance intersecting with a big non-tiled raster, this certainly becomes very interesting.

2) Rewrite the one-raster version of ST_MapAlgebra() using your iterator (a C version already exists) and benchmark them against each other. Performance is key.

If you get the same results without loss of performance (ideally some gain) we can go further...

3) Implement the two-raster version of ST_MapAlgebra() as proposed in the specs.

4) Write some specifications for ST_Intersection(raster, raster) (what should it return?) and implement it, as a wrapper around the two-raster version of ST_MapAlgebra() or otherwise. I guess you plan it otherwise.

5) Same for ST_Difference() and ST_Symdifference()

6) Discuss the specs of ST_Union (which I planned as an aggregate).

You will have to negotiate with Bborie for 3) and 4) since those two functions are on his task list.

You will have to negotiate with the core PostGIS developers (strk, mark and Paul) to get part of your code included in liblwgeom. We have no jurisdiction over that.

Deal?

I would love to think that these 3000 lines of code would simplify our code and enhance our performance.

Pierre

> -----Original Message-----
> From: postgis-devel-bounces at postgis.refractions.net [mailto:postgis-devel-
> bounces at postgis.refractions.net] On Behalf Of Bryce L Nordgren
> Sent: Saturday, September 17, 2011 1:17 PM
> To: PostGIS Development Discussion
> Subject: Re: [postgis-devel] gen2 raster iterator tutorial
> 
> On Sat, Sep 17, 2011 at 12:01 PM, Pierre Racine <Pierre.Racine at sbf.ulaval.ca>
> wrote:
> >> I'm still writing code, so this isn't on a ticket yet.
> >
> > Normally we write a ticket first, then we discuss the strategic implementation
> and then we write code.
> 
> The "architecture" wiki page describes the approach at an abstract level, with
> pictures, in a great amount of detail. At that point you asked for code examples.
> This is it. The architecture document didn't even start until "generation 2".
> Generation 1 is still on a ticket (#1058).
> 
> For four months I've been dragging people into discussions on the list regarding
> things I see as real or potential problems, with varying outcomes. Some related
> to this, some not. Some inspired conversation and some didn't.
> 
> > I still haven't seen a clear argument demonstrating the flaws of this direction
> and the advantages in terms of simplicity/performance/flexibility/new
> functionality of a new approach and, still, this is far from being clear when
> reading this tutorial.
> 
> Well the tutorial is meant to provide an example of usage, not a justification for
> existence.
> 
> I don't know how many times I've said: this is mostly about porting the existing
> SQL into C in the most maintainable way with the least amount of redundant
> code. Specifically, it's about not concentrating all possible variants of
> functionality inside the main loop of MapAlgebra, then writing simple, special
> case user functions to select that particular branch. This is not about changing
> user-visible method signatures.
> 
> You can think of it as a transition from a bunch of "if" statements in the middle
> of the loop to a set of function callbacks. At times, I've called this providing for
> "extension points". It's the specifics of implementation, not the direction, which
> this architecture addresses.
> 
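The transition described above, from "if" statements in the middle of the loop to a set of function callbacks, can be sketched in C roughly as follows. This is only a minimal illustration and not the iterator code itself: pixel_op, map_pixels, op_sum and op_scaled_diff are hypothetical names invented for the example, and the rasters are reduced to plain arrays of doubles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: combines one pixel from each input.
 * user_data carries whatever state a given plugin needs. */
typedef double (*pixel_op)(double a, double b, void *user_data);

/* Example plugin: sum of the two inputs (needs no state). */
double op_sum(double a, double b, void *user_data) {
    (void)user_data;
    return a + b;
}

/* Example plugin: difference of the inputs times a caller-supplied scale. */
double op_scaled_diff(double a, double b, void *user_data) {
    double scale = *(const double *)user_data;
    return (a - b) * scale;
}

/* The loop knows nothing about individual operations. Adding a new
 * operation means writing a new callback, not editing this function. */
void map_pixels(const double *a, const double *b, double *out,
                size_t n, pixel_op op, void *user_data) {
    assert(op != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = op(a[i], b[i], user_data);
    }
}
```

A caller selects behavior by passing a different callback, for example map_pixels(a, b, out, n, op_sum, NULL), so an existing operation is never touched when a new one is added.
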
> To reiterate points from earlier in the summer: the current implementation can
> only accommodate new functionality by tacking yet another piece on,
> endangering all existing functions due to unplanned interactions. "Extension
> points" allow new functionality to be created by allowing the creation of a
> plugin which is separate from existing code and which cannot interact with it.
> (e.g., to use the new functionality, you provide the new plugin instead of the old
> one.) In short, the current implementation is good for a prototype which can
> demonstrate a small subset of the possible combinations on the tables on the
> architecture page, but cannot grow much more than it already has. Part of this
> has to do with the limited capabilities of pl/pgsql: I think you may have maxed
> out that platform. Plus you don't have access to GDAL from SQL.
> 
> And of course, at some point I stop providing justification for a well-designed
> architecture adhering to standard practices. Everything I've said so far can be
> summarized to a software engineer succinctly as: The current implementation
> has poor encapsulation, poor separation of concerns, and nonexistent
> extensibility. The same cannot be said of the
> architecture I proposed, offered for discussion, then implemented.
> 
> 
> > What are the problems encountered in the current approach (vectorize raster
> when operating in vector mode and rasterize when operating in raster mode)
> justifying a new one?
> 
> False statement: you have no current implementation of "raster mode"
> ST_Intersection. They all return geomvals. You only have ST_Intersection and
> none of the other operations. Thus far, my architecture only concerns "raster
> mode".
> 
> Complexity. The current implementation is scary long. And so far, it supports no
> spatial relationship functions.
> 
> Fragility: Adding new things endangers old ones.
> 
> > What benefit would we have with the new one? What do we lose (if we lose
> anything)?
> >
> > What impact would this have on the SQL API? What new functionality does this
> bring?
> 
> 1] Well, you'd gain a comprehensive set of "raster mode" operations
> (ST_Intersection, ST_Union, ST_Difference, ST_Symdifference: all of which
> return raster). There is nothing in the current implementation for these to
> replace (or mask); hence there really is no "raster mode" as of now.
> 
> 2] If my idea about adding a geometry/geomval iterator to the framework pans
> out ("future directions" on the tutorial), you gain a solid base on which to finish
> writing functions which return geomvals (ST_Union, ST_Difference, and
> ST_Symdifference). Again, you haven't written these.
> 
> 3] ST_Intersection returning geomval could be ported to C. The SQL interface
> should stay the same.
> 
> 4] My implementation resamples on the fly (eliminating the storage of
> the rasterized intermediate product), can handle arbitrary raster alignments,
> and can reproject to different coordinate systems. As far as I know, these
> features aren't even being discussed for the functions which don't yet exist.
> 
> 5] As the (EVALUATOR *) is an extension point, and is what provides for sampling
> or resampling, we can write one for any sampling method we choose: bilinear,
> bicubic, etc. Provided that GDAL exposes the required functionality, we can
> provide a thin adapter which defers the heavy lifting to GDAL.
> 
> 6] One- and two-input MapAlgebra would be improved by porting them to C
> using this framework. (Cleanly, I might add). Just write an (EVALUATOR *)
> which submits the expression to the SQL parser, like you have now.
> It can immediately benefit from everything in #4. Due to the nature of the
> framework, you also get (for free) MapAlgebra functions which can take
> (geomval, raster) as arguments. This was the main improvement of
> gen2 over gen1.
> 
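Point 5] above treats the (EVALUATOR *) as the place where the sampling method is chosen. A rough sketch of that idea in C follows. The actual EVALUATOR structure is not shown in this thread, so everything here is an assumption made for illustration: grid, evaluator, eval_nearest, eval_bilinear and resample_row are hypothetical names, coordinates are fractional pixel positions, and no GDAL calls are made.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a raster band: a row-major grid of doubles. */
typedef struct {
    const double *values;
    size_t width;
    size_t height;
} grid;

/* Hypothetical evaluator signature: value of the grid at a fractional
 * pixel position (x, y). */
typedef double (*evaluator)(const grid *g, double x, double y);

/* Truncate v to an index and clamp it into [0, size - 1]. */
static size_t clamp_index(double v, size_t size) {
    if (v <= 0.0) return 0;
    if (v >= (double)(size - 1)) return size - 1;
    return (size_t)v;
}

/* Nearest neighbor: round to the closest pixel center. */
double eval_nearest(const grid *g, double x, double y) {
    size_t col = clamp_index(x + 0.5, g->width);
    size_t row = clamp_index(y + 0.5, g->height);
    return g->values[row * g->width + col];
}

/* Bilinear: weight the four surrounding pixels by distance. */
double eval_bilinear(const grid *g, double x, double y) {
    size_t c0 = clamp_index(x, g->width);
    size_t r0 = clamp_index(y, g->height);
    size_t c1 = (c0 + 1 < g->width) ? c0 + 1 : c0;
    size_t r1 = (r0 + 1 < g->height) ? r0 + 1 : r0;
    double fx = x - (double)c0;
    double fy = y - (double)r0;
    if (fx < 0.0) fx = 0.0;
    if (fx > 1.0) fx = 1.0;
    if (fy < 0.0) fy = 0.0;
    if (fy > 1.0) fy = 1.0;
    double top = g->values[r0 * g->width + c0]
               + (g->values[r0 * g->width + c1] - g->values[r0 * g->width + c0]) * fx;
    double bottom = g->values[r1 * g->width + c0]
                  + (g->values[r1 * g->width + c1] - g->values[r1 * g->width + c0]) * fx;
    return top + (bottom - top) * fy;
}

/* Resample one output row on the fly; the sampling method is whatever
 * evaluator the caller plugs in. */
void resample_row(const grid *g, double y, double x0, double dx,
                  double *out, size_t n, evaluator eval) {
    assert(g != NULL && g->width > 0 && g->height > 0 && eval != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = eval(g, x0 + (double)i * dx, y);
    }
}
```

Switching from nearest neighbor to bilinear is then a matter of passing eval_bilinear instead of eval_nearest to resample_row. An evaluator that defers to GDAL would presumably plug in the same way, but that is not shown here.
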
> Mostly what you'd lose is a plan (not code) to base all of #1 on a very complex
> two raster MapAlgebra function which can't handle geomvals, can't reproject,
> and can't tolerate different raster alignments.
> 
> Why is this meeting such resistance?
> 
> > This is a basic 1) problem identification, 2) proposed solution, 3) pros and cons
> methodology. Up to now we've got only 2)...
> 
> False.