[postgis-devel] [WKT Raster] Regular blocking in gdal2wktraster.py

Mateusz Loskot Mateusz.Loskot at cadcorp.com
Thu Mar 19 13:13:11 PDT 2009


Pierre Racine wrote:
> >> I can see two major use cases that gdal2wktraster.py needs to
> >> fulfill:
> >>
> >> 1) Some users want to store big rasters (overlapping or not, e.g.
> >> landsat images) as many tables. In this case 1 raster = 1 row in the
> >> raster_column table and hence many tile tables in the database. (100
> >> landsat = 100 files = 100 tables) Some people might want to store
> >> only one file, others might want to batch load many.
> >
> >Pierre,
> >
> >Are you suggesting that a single entry in RASTER_COLUMNS may describe a
> >raster scattered in N tables?
> 
> No. In the previous case 1 raster = 1 row in RASTER_COLUMNS = 1 table
> of tiles.
> 
> But some users might want to do this for a series of rasters.


I just misunderstood what "many tile tables in the database" meant.

> >> 	b) They might want to retile them (in batch) as smaller tiles
> >> (similar to case 1 but all in the same table).
> >
> >In this case, users are supposed to do intermediate processing using
> >already available tools, like gdal_retile or others.
> >Then, they are able to use gdal2wktraster.py as in case 2.a)
> >
> >I don't see any point to include retiling functionality in
> >gdal2wktraster.
> 
> Well, then we could also say that to load a big raster they should also
> use gdal_retile to first create many tile files and then import them in
> batch. But this might not ensure a tile ordering as you expect. This is
> why it is nice to have gdal2wktraster.py to do it.

Not really. For raster blocking we need to stick to tiling according to
the block size, so users should not be free to specify their own tile
size in this particular case (blocking).

> Let's put it like this:
> 
> -I name a "big raster" the conventional idea we have about an image. It
> is rectangular (or square) and you want to store all the tiles it is
> composed of (or that you wish to create). However reality is often more
> complex...

Clear.

> -A "coverage" might not be rectangular at all (because it is composed
> of many rasters covering a terrestrial surface not necessarily
> rectangular). Still, like big rasters users, you want to be able to
> import it all, tiled (or retiled) into a single table without having to
> write a batch.

Clear.

Now, does the regular blocking concept apply only to the first case or to both?
Perhaps there is a misunderstanding about what regular blocking really means here.

I understand regular blocking as a regular grid in which each cell is filled
with a tile of the raster. Later, it seems reasonable to support NULL tiles: a
grid cell that does not contain any tile.
Simply put, regular blocking seems to be an optimization for the more-or-less
common case of storing and accessing big raster data.
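
A minimal sketch of that reading of regular blocking, in Python since
gdal2wktraster.py is a Python script. The function name block_grid and the
idea that edge tiles would be padded to the full block size are assumptions
for illustration only; nothing here is taken from gdal2wktraster.py itself.

```python
def block_grid(raster_width, raster_height, block_width, block_height):
    """Return (col, row, x_off, y_off) for every cell of a regular block grid.

    Every cell has the full block size. Cells on the right and bottom edges
    may extend past the raster (assumption: such tiles would be padded).
    """
    # Ceiling division: number of blocks needed to cover each dimension.
    cols = (raster_width + block_width - 1) // block_width
    rows = (raster_height + block_height - 1) // block_height
    return [(col, row, col * block_width, row * block_height)
            for row in range(rows)
            for col in range(cols)]


# A 250x100 raster with 100x100 blocks needs a grid of 3 columns by 1 row.
cells = block_grid(250, 100, 100, 100)

# A NULL tile is simply a grid cell for which no tile is stored.
tiles = {(col, row): None for col, row, _, _ in cells}
```

Each cell would map to one row of the tile table, and a cell left as None
corresponds to the NULL-tile case mentioned above.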
 
> >> For all scenarios, it should be possible to store in-db or out-db.
> >> Out- db means the importer should copy the tile files in a
> >> determined
> >> folder (case 1 and b2). In case 2a the folder already exists.
> >
> >ATM, I'm focused on in-db only. IMHO, it's far too early to even plan
> >out-db as there is no support at all for out-db on the WKT Raster
> >side.
> 
> All the code to support off-db tiles is there and working very well.
> The only functions not supporting out-db tiles yet in the API are:
> rt_band_get_data(), rt_band_set_pixel() and rt_band_get_pixel() that we
> can implement with GDAL. However an application dealing with off-db
> tiles would normally use RT_GetPath() and load each tile from the file
> system as it was planned originally.

Understood. However, it's a long way to go before I start working
with off-db raster data myself.

> >I'm following principles of prototyping, and I'm not trying to draw a
> >big-ass plan and then start developing it. I'm making very small
> >steps, so I can change direction quickly.
> 
> I understand, but planning while keeping in mind that at some point we have to
> support off-db tiles (which is one of the main features of WKT Raster)
> is maybe better.

Possibly. However, there is plenty of work to get done for in-db rasters,
so I hope you don't mind if I stay focused on that part.

BTW, you have mentioned the use of GDAL in WKT Raster. This decision seems to
be crucial for further development. Perhaps this is what should be
planned sooner rather than later.

> >My plan about gdal2wktraster is as short as this summary:
> >
> >1) Load 1..N raster files into a single table, one raster file per
> >row. It's a user's decision, if he wants to load N
> >unrelated/different/heterogeneous raster files into single table, or
> >if he wants to keep things in some order. With the current gdal2wktraster
> >the user has the freedom and the tool to achieve both.
> >
> >2) Load 1 raster file according to principles of regular blocking we
> >have already drawn. So, 1 raster is tiled according to block size 
> >and tiles are loaded into a single table (one tile per row). 
> >So, the whole table makes a regular grid of non-overlapping tiles.
> 
> And what if I want to store many rasters as 2) but in the same table?
> This is not zillions of exceptions, this is just one more use case.
> And as a simple user this is the case I will be encountering the most.

Pierre, I'm confused.

You ask for: "I want to store many rasters in the same table"

Case 1) says: "Load 1..N raster files into a single table"

I can't see any difference, really.

First, you can store many rasters in case 1) as well (see 1..N).
Second, in both cases, 1) and 2), all rasters/tiles are stored in a single table.
The only difference in case 2) is that some additional restrictions apply.
These restrictions are listed in the RASTER_COLUMNS specification
for the regular blocking case:
- the size of all tiles is equal to the block size reported for the input raster
- all tiles have the same size
- all tiles fit a common grid
- ...
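
As a rough sketch, the restrictions listed above could be checked like this.
The function name check_regular_blocking and the tile representation (pixel
offset plus size) are hypothetical and chosen only for illustration; the
authoritative list of restrictions is the one in the RASTER_COLUMNS
specification, which this sketch does not cover completely.

```python
def check_regular_blocking(tiles, block_width, block_height):
    """Return a list of problems found in a set of tiles.

    tiles is an iterable of (x_off, y_off, width, height) tuples given in
    pixel coordinates relative to the grid origin (an assumed layout).
    """
    problems = []
    seen_cells = set()
    for x_off, y_off, width, height in tiles:
        # All tiles must have exactly the block size.
        if (width, height) != (block_width, block_height):
            problems.append("tile at (%d, %d) has size %dx%d"
                            % (x_off, y_off, width, height))
        # All tiles must fit the common grid.
        if x_off % block_width or y_off % block_height:
            problems.append("tile at (%d, %d) is off the grid"
                            % (x_off, y_off))
        # Same-sized, grid-aligned tiles overlap only if they share a cell.
        cell = (x_off // block_width, y_off // block_height)
        if cell in seen_cells:
            problems.append("more than one tile in cell %s" % (cell,))
        seen_cells.add(cell)
    return problems
```

An empty result means the tiles satisfy the checks above; images that overlap,
as in the relaxed coverage case, would be reported.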

Again, perhaps we understand regular blocking differently, don't we?


> >I believe that all other cases will over-complicate things at this
> >stage of development of WKT raster, and are of a low priority for me.
> 
> I understand that Cadcorp is interested in storing rasters as Oracle
> does and not in considering better use cases, and that what is simple is what
> Cadcorp has decided is simple and all the rest is overcomplicated.

I am sure it is not like that.

I am reading your specification document [1] and there is a section
called "Three ways to use a WKT raster table".

[1] http://www.cef-cfr.ca/uploads/Membres/WKTRasterSpecifications0.8.pdf

Let's compare this slide with the current version of gdal2wktraster.
What I see is that:

1. Image warehouse - it is (almost) perfectly supported now.
It's possible to load N miscellaneous images into a table.

2. A vector-like discrete coverage - it's also possible, isn't it?
Similar use of gdal2wktraster.py as in point 1.

3. A continuous tiled coverage - it is a relaxed version of the regular blocking
use case. However, with help from Frank and Martin, we have defined
RASTER_COLUMNS and applied a bunch of restrictions to make this use
case clearer and simpler to handle by client applications.
For instance, your slide says "images may overlap",
which is not allowed in regular blocking for well-known reasons.
And this is the use case I'm going to support now in gdal2wktraster.

Certainly, it's possible to list yet another use case which actually
is a simplification of 1) or 2):

4) A whole raster loaded into a single table, as a single row.


Am I missing or misunderstanding anything here?

> WKT Raster was not planned for stupidly storing hundreds of big rasters
> as hundreds of tables.

Pierre, where am I suggesting anything like that?
On the contrary, I'm emphasizing use cases in which a raster is loaded
into a single table (as a whole or tiled, but tiled in the regular blocking way).
Just to quote myself:

1) Load 1..N raster files into a single table.
2) Load 1 raster file (...) 1 raster is tiled according to block size 
and tiles are loaded into a single table (one tile per row).

> There has been a lot of discussion about this
> already. WKT raster is planned to store raster coverages as one huge
> table to profit from GiST indexing and to implement seamless
> raster/vector analysis functions.

Yes, that's the deal.

> This is a much more modern usage of a
> raster coverage. Storing many rasters representing the same variables
> as many tables does not make much sense and is not very useful from an
> analytical point of view.

Again, I've never ever suggested "many tables".

> I think we have very different views of the better way to store a raster
> coverage in the DB. You seem to want to stick to what was proposed in
> past proposals that everybody agreed were not very useful.

I do not do that.

> WKT Raster
> is precisely a better proposal because it deals with coverages, not big
> rasters.

I am for coverages and this is exactly what I'm working on now.
Regular blocking is a coverage and it represents one of the use cases you have
drawn in the specification (PDF), with some extra restrictions,
which actually do not change the nature of the use case at all.

> The out-db feature is also very important for web
> applications. This is what people want and think is useful so this
> project has to provide them with what they want. (Not just what
> Cadcorp wants...)

I absolutely agree.

However, I'd like to add:

- I haven't said I'm against off-db rasters. I have only said that
off-db is not the part of the work I can pick up, but raster blocking
support *is* the part I can. So, I'm focused on this particular
feature, which *is* included in the specification and is on the roadmap.

- To add (or remind): gdal2wktraster was started as a prototype,
a proof of concept, before development of a fully-featured
raster2pgsql utility is started. gdal2wktraster is more of a
complementary feature, isn't it? And I believe I am free
not to develop support for off-db rasters
in this very simple tool. If anyone wants to do it, great,
feel free to contribute it. But please, don't blame me
because I'm not particularly interested in adding off-db
support to gdal2wktraster.py.

- I believe our goals are very well aligned to the WKT Raster
specification. Please correct me if I'm wrong, and where.

Best regards,
--
Mateusz Loskot
Senior Programmer, Cadcorp
http://www.cadcorp.com





