[postgis-users] Re: Re: Proposed SQL interface for PGRaster

Patrick pvanlaake at users.sourceforge.net
Wed Dec 6 11:48:41 PST 2006


"Stephen Marshall" <smarshall at wsi.com> wrote in message 
news:4576D1B4.9050201 at wsi.com...
>
> I have been gravitating towards using a known image format as the I/O 
> format to avoid having to reinvent so many of their features.  For 
> example, image formats tend to be well-compressed and accommodate most of 
> the metadata we need to have about a raster, endian-ness, etc.  If we 
> invent our own format for PGRaster, we will have to accommodate all those 
> features, and we will have to write our own input/output codecs.  Why do 
> that when we can use an existing format, with codecs that are already 
> written?
> Interchanging data in a well-known image format also feels more open to 
> me.  Clients can easily link against libraries like GDAL to decompress the 
> data.  If we invent a format, we would have to provide a client-side 
> codec, or require all clients to write their own.   Returning the data in 
> some internal format used by the server seems like foisting a server 
> implementation detail on all the clients.  As I see it, the PGRaster data 
> format should be the open interface for interchange between two systems: 
> the PostGIS server and a client application.
>
> Using a well-compressed image format will tend to optimize input and 
> output of PGRaster data.  Since the formatted raster data will be 
> transferred to and from the server in a well-compressed form, there will 
> be fewer bytes in the client-server communications.
>
> I can see the advantages of picking a single raster format to be *the* 
> PGRaster format, rather than being format agnostic.  In fact, when I 
> originally floated the PGRaster idea a month or so ago, I suggested using 
> JPEG2000 as the single raster format.  However Frank Warmerdam (of GDAL 
> fame) argued for an approach based on a more abstract interface into 
> different raster formats.  This is similar to how GDAL works, and he 
> suggested that GDAL could be used in the PGRaster implementation to hide 
> away details of the exact raster format.
>
> I'm still on the fence about the advantages of using a single format vs. 
> multiple formats supported through an abstract raster codec.  However, I 
> prefer the idea that PGRaster interchanges data in an established format, 
> rather than some format we invent.  I think a well-established format will 
> be more open, require fewer bytes to be transferred between client and 
> server, and avoid the daunting task of designing our own image format.
>

Hi Steve,

I agree with you that it makes no sense to develop a new and complex 
format. I also agree that I/O requirements include compression. We must 
guard, though, against overloading the server by converting between the 
internal format and the I/O format on every request. Also, PGRaster is not 
supposed to be an image "engine". And how difficult is a simple tiled 
layout anyway? As I indicated before, most applications have their own 
internal representation of data, and quite a few work internally with tiles 
of data. Several image formats are themselves tiled; JPG, TIFF and IMG come 
to mind. It would of course not be too difficult to support both mechanisms 
and please everybody.

I can imagine a format where a tile of data is a row in a PGRTile table with 
tile_index_x, tile_index_y and tile_data as fields. Selecting a bbox of data 
in "raw" format then yields a result with one row for every tile of data. 
The application decides what to do with it. Every tile can be individually 
gzipped before transmission. Alternatively, all data is converted to any 
image format available on the server and streamed to the client. That, of 
course, also has quite fundamental consequences, because it breaks the 
standard DBMS-type data interface.
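
A minimal sketch of the tile-per-row layout described above. The table and 
column names come from the text; everything else is an assumption made for 
illustration: SQLite stands in for PostgreSQL so the example runs anywhere, 
the helper names store_tile and select_tiles are hypothetical, and the bbox 
is expressed directly as an inclusive range of tile indexes.

```python
import gzip
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE PGRTile (
        tile_index_x INTEGER NOT NULL,
        tile_index_y INTEGER NOT NULL,
        tile_data    BLOB    NOT NULL,
        PRIMARY KEY (tile_index_x, tile_index_y)
    )
    """
)


def store_tile(ix, iy, raw):
    """Gzip one tile individually and store it as a single row."""
    conn.execute(
        "INSERT INTO PGRTile VALUES (?, ?, ?)",
        (ix, iy, gzip.compress(raw)),
    )


def select_tiles(x_min, x_max, y_min, y_max):
    """Return {(ix, iy): raw bytes} for an inclusive range of tile indexes."""
    rows = conn.execute(
        "SELECT tile_index_x, tile_index_y, tile_data FROM PGRTile "
        "WHERE tile_index_x BETWEEN ? AND ? "
        "AND tile_index_y BETWEEN ? AND ?",
        (x_min, x_max, y_min, y_max),
    )
    # The client decides what to do with each tile; here it just unzips it.
    return {(ix, iy): gzip.decompress(data) for ix, iy, data in rows}


# Hypothetical 4x4 grid of tiny tiles, four bytes each.
for ix in range(4):
    for iy in range(4):
        store_tile(ix, iy, bytes([ix, iy, ix, iy]))

tiles = select_tiles(1, 2, 0, 1)  # one row per tile in the requested range
```

The server never decodes or re-encodes pixel data in this arrangement; it 
only returns the stored rows, which is the point of the "raw" path.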

On the application side, I foresee a situation where code is optimized to 
request PostGIS tiles of data on the fly. Think Google Earth. Once an 
application knows how the data is organized (tie-point coordinate, x and y 
resolution), there is nothing easier than calculating which tile you need 
at a certain location, and that translates to a very simple query:

select tile_data from PGRTile where tile_index_x = 185 and tile_index_y = 27;

With the tile_index_x and tile_index_y columns indexed, what could be faster?
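
The client-side calculation could look roughly like the sketch below. It 
rests on assumptions that are not stated in the thread: the tie point is the 
upper-left corner of the raster, y decreases as the row number grows, tiles 
are square with tile_size pixels per side, and tile indexes start at 0. The 
function name tile_index and all of its parameters are hypothetical.

```python
import math


def tile_index(x, y, tie_x, tie_y, res_x, res_y, tile_size):
    """Map a map coordinate to the (tile_index_x, tile_index_y) that holds it.

    Assumes (tie_x, tie_y) is the upper-left corner of the raster and that
    res_x and res_y are positive pixel sizes in map units.
    """
    col = math.floor((x - tie_x) / res_x)   # pixel column, counted from the left
    row = math.floor((tie_y - y) / res_y)   # pixel row, counted from the top
    return col // tile_size, row // tile_size


# Hypothetical raster: upper-left corner at (0, 10000), 1-unit pixels,
# 256x256-pixel tiles.
params = tile_index(47370.0, 3083.0, 0.0, 10000.0, 1.0, 1.0, 256)
print(params)  # (185, 27)

# The indexes then go straight into a parameterized form of the query above.
query = (
    "select tile_data from PGRTile "
    "where tile_index_x = %s and tile_index_y = %s"
)
```

The %s placeholders follow the style used by common Python PostgreSQL 
drivers; the query text itself is the one given above.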

Cheers,
Patrick 





