[postgis-users] Re: Re: Re: Proposed SQL interface for PGRaster

Patrick pvanlaake at users.sourceforge.net
Sat Dec 9 09:03:25 PST 2006


Hi Steve,

"Marshall, Steve" <smarshall at wsi.com> wrote in message 
news:8536F69C1FCC294B859D07B179F0694406D09F46 at EXCHANGE.ad.wsicorp.com...
> I think in looking at your design that we have quite divergent views
> about what a PGRaster should be.  You are proposing a multi-table
> approach where PGRaster is not a type, but a table, and where the
> relationship between rasters, bands, and tiles are modelled as table
> relationships.
>
> [...]
>
> However, this purely-relational design goes against my primary goal for
> PGRaster: to allow one to relate raster objects and geometry objects,
> including converting data from one representation to the other.  In this
> design PGRaster is no longer a type, so it is hard to write functions to
> operate on PGRasters.
>
> What I see in your design is a way to use an RDBMS to provide an
> application with access to pieces of rasters.  It is up to the RDB
> client application to operate on these raster primitives (tiles, bands,
> etc).  My vision is to put more of the raster processing logic on the
> server-side, i.e. executing in the process space of the postgres
> backend.

I don't agree with you that our visions are very divergent or incompatible; 
they just differ in where we want to end up. Both our proposals require all 
raster properties and data to reside in the DBMS, with the data partitioned 
in the most efficient way to optimize system performance and reduce overhead 
in access and I/O. I would be happy if we got that much working.

Your ideas go beyond the data structure to add specific processing 
functions. I think that is a worthwhile endeavour, but you cannot realize 
it until you have dealt with the first part. You may, of course, make data 
storage completely opaque to the client application, but you will have to 
provide for some mechanism to grant access to the raw data, i.e. all the 
data that covers a certain geometrical area. In many cases this will simply 
be the entire area covered by a raster (country, county, study area, you 
name it) without any qualifiers based on some property of the raster. There 
are simply so many operations possible on raster data that they cannot all 
be supported by any DBMS. So my question is then simply: If you need to 
supply raw data, then why not do it as raw as it gets? And I will reiterate: 
if the design works, the applications will follow. I promise that I will make 
OS software available that can access PGRaster the day it is released with 
PostGIS.

I still think, by the way, that your proposal makes a lot of sense, because 
there will always be uses for high-level operations on the server side. 
Operations like merging/mosaicing are the raster equivalent of an INSERT 
statement, and operations combining rasters with GEOMETRIES (give me all 
elevations in the state of Kansas) would also be very useful IMHO. I am 
still quite sceptical when it comes to server overload, though.
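
To make the "all elevations in the state of Kansas" kind of query concrete, 
here is a minimal Python sketch of the underlying operation: pick the raster 
cells whose centres fall inside a polygon. It is not PGRaster or PostGIS 
code. The function names, the 4x4 grid and the square "region" are all made 
up for illustration, and a server-side version would presumably use the 
geometry predicates PostGIS already has rather than a hand-written 
point-in-polygon test.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test. `ring` is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def values_within(grid, origin_x, origin_y, cell_size, ring):
    """Return the values of all cells whose centres lie inside the polygon.

    grid[row][col] holds the cell values, row 0 is the northernmost row and
    (origin_x, origin_y) is the upper-left corner of the raster.
    """
    selected = []
    for row, line in enumerate(grid):
        for col, value in enumerate(line):
            centre_x = origin_x + (col + 0.5) * cell_size
            centre_y = origin_y - (row + 0.5) * cell_size
            if point_in_polygon(centre_x, centre_y, ring):
                selected.append(value)
    return selected


# Hypothetical 4x4 elevation raster, upper-left corner at (0, 4), 1-unit cells.
elevations = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43],
]
# Hypothetical region of interest: a square covering the four central cells.
region = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)]

inside = values_within(elevations, 0.0, 4.0, 1.0, region)
print(inside)
```

Whether this loop runs in the backend or in the client, it touches the same 
cells. Only the amount of data shipped over the wire differs.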

> While this similarity to file-based access might seem like a strength, I
> think it is actually a critical weakness.  The reason is that file
> access is going to be much faster than RDB access.  Despite what Oracle
> or other RDB vendors may claim, I've found nothing that indicates that
> RDB tricks like indexing, etc. will produce better performance than
> access to well-formatted files.  Thus, the pure relational design just
> becomes a slower way to the same functionality as file-based access.
>
There is more to the use of an RDBMS than access speed. Sharing 20TB of 
imagery between 20 users requires more than a fast network. Concurrency 
control, access control, all the benefits of the standard DBMS also apply to 
rasters-in-a-db. This issue has been brought up several times during the 
last year and I have seen some interesting responses.

However you twist it though, you are in the same predicament. Where is the 
data in your scenario? It is also on the server, in the DB, and I am quite 
sure that a user will not like to duplicate his/her data in order to gain 
some processing functions.

> To pursue that goal, I'm going to continue down the road of trying to
> implement PGRaster so that GEOMETRY-raster operations can be performed
> on the database server.  I think this will liberate client applications
> from the need to provide a significant amount of functionality, and that
> this benefit will outweigh the inevitable performance cost of a
> relational implementation.  I hope that there will be some other
> believers in this vision who will want to join me.

I am with you. And my proposal is this: Provide and expose a simple tabular 
data structure that will hold all raster properties and data. 
Simultaneously, create a type that will encapsulate all of the above and 
which supports high-level processing, interfacing with the other objects in 
PostGIS.
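
As a rough illustration of the two layers, here is a small Python sketch that 
uses SQLite as a stand-in for PostgreSQL. The raster/band/tile tables, their 
columns and the RasterHandle class are assumptions made for this example and 
not an agreed PGRaster schema. A real implementation would define a proper 
PostgreSQL type and use a spatial index rather than plain bounding-box 
comparisons.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Layer 1: plain tables that expose all raster properties and data.
conn.executescript("""
CREATE TABLE raster (
    raster_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    srid      INTEGER,
    cell_size REAL NOT NULL
);
CREATE TABLE band (
    band_id    INTEGER PRIMARY KEY,
    raster_id  INTEGER NOT NULL REFERENCES raster(raster_id),
    band_no    INTEGER NOT NULL,
    pixel_type TEXT NOT NULL
);
CREATE TABLE tile (
    tile_id INTEGER PRIMARY KEY,
    band_id INTEGER NOT NULL REFERENCES band(band_id),
    min_x REAL, min_y REAL, max_x REAL, max_y REAL,
    data  BLOB NOT NULL
);
""")

# Hypothetical content: one single-band raster stored as two adjacent tiles.
conn.execute("INSERT INTO raster VALUES (1, 'dem', 4326, 1.0)")
conn.execute("INSERT INTO band VALUES (1, 1, 1, 'int16')")
conn.executemany(
    "INSERT INTO tile VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 0.0, 0.0, 10.0, 10.0, bytes([1, 2, 3, 4])),
        (2, 1, 10.0, 0.0, 20.0, 10.0, bytes([5, 6, 7, 8])),
    ],
)


class RasterHandle:
    """Layer 2: a thin wrapper standing in for an encapsulating raster type."""

    def __init__(self, conn, raster_id):
        self.conn = conn
        self.raster_id = raster_id

    def tiles_in_bbox(self, min_x, min_y, max_x, max_y, band_no=1):
        """Return (tile_id, data) for tiles whose extent overlaps the box."""
        return self.conn.execute(
            """
            SELECT t.tile_id, t.data
            FROM tile t JOIN band b ON b.band_id = t.band_id
            WHERE b.raster_id = ? AND b.band_no = ?
              AND t.max_x > ? AND t.min_x < ?
              AND t.max_y > ? AND t.min_y < ?
            ORDER BY t.tile_id
            """,
            (self.raster_id, band_no, min_x, max_x, min_y, max_y),
        ).fetchall()


dem = RasterHandle(conn, 1)
west_only = dem.tiles_in_bbox(2.0, 2.0, 8.0, 8.0)
both = dem.tiles_in_bbox(5.0, 5.0, 15.0, 8.0)
print([tile_id for tile_id, _ in west_only], [tile_id for tile_id, _ in both])
```

A client that wants the raw data can query the tile table directly, while 
higher-level functions go through the wrapper. Both read the same rows, so 
nothing has to be stored twice.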

Cheers,
Patrick 





