[postgis-devel] [WKT Raster] Regular blocking in gdal2wktraster.py

Thu Mar 19 11:37:34 PDT 2009

>> I can see two major use cases that gdal2wktraster.py needs to fulfill:
>>
>> 1) Some users want to store big rasters (overlapping or not, e.g.
>> Landsat images) as many tables. In this case 1 raster = 1 row in the
>> RASTER_COLUMNS table and hence many tile tables in the database. (100
>> Landsat = 100 files = 100 tables) Some people might want to store only
>> one file, others might want to batch load many.
>
>Pierre,
>
>Are you suggesting that single entry in RASTER_COLUMNS may describe
>a raster scattered in N tables?

No. In the previous case 1 raster = 1 row in RASTER_COLUMNS = 1 table of tiles.

But some users might want to do this for a series of rasters.

>> 	b) They might want to retile them (in batch) as smaller tiles
>> (similar to case 1 but all in the same table).
>
>In this case, users are supposed to do intermediate processing
>using already available tools, like gdal_retile or others.
>Then, they are able to use gdal2wktraster.py as in case 2.a)
>
>I don't see any point to include retiling functionality in gdal2wktraster.

Well, then we could also say that to load a big raster they should first use gdal_retile to create many tile files and then import them in batch. But this might not guarantee the tile ordering you expect. This is why it is nice to have gdal2wktraster.py do it.

Let's put it like this:

- I call a "big raster" the conventional idea we have of an image. It is rectangular (or square) and you want to store all the tiles it is composed of (or that you wish to create). However, reality is often more complex...
- A "coverage" might not be rectangular at all (because it is composed of many rasters covering a terrestrial surface that is not necessarily rectangular). Still, like big-raster users, you want to be able to import it all, tiled (or retiled), into a single table without having to write a batch.
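
To make the difference concrete, here is a minimal sketch (not taken from gdal2wktraster.py) of which cells of a shared tile grid a coverage actually occupies. It assumes every raster is already expressed in pixel offsets on one common grid. The function name and the two example extents are hypothetical.

```python
from typing import Iterable, Set, Tuple

# (xoff, yoff, width, height) in pixels on a grid shared by all rasters (assumption)
Extent = Tuple[int, int, int, int]


def covered_tiles(extents: Iterable[Extent], block: int) -> Set[Tuple[int, int]]:
    """Return the (tile_col, tile_row) cells touched by at least one raster."""
    cells = set()
    for xoff, yoff, width, height in extents:
        first_col, last_col = xoff // block, (xoff + width - 1) // block
        first_row, last_row = yoff // block, (yoff + height - 1) // block
        for row in range(first_row, last_row + 1):
            for col in range(first_col, last_col + 1):
                cells.add((col, row))
    return cells


# Two hypothetical rasters arranged in an L shape
coverage = [(0, 0, 200, 100), (0, 100, 100, 100)]
cells = covered_tiles(coverage, block=100)
```

With these inputs the bounding grid is 2 x 2 tiles but only three cells are occupied, which is the non-rectangular case described above.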

>> For all scenarios, it should be possible to store in-db or out-db. Out-
>> db means the importer should copy the tile files in a determined folder
>> (cases 1 and 2b). In case 2a the folder already exists.
>
>ATM, I'm focused on in-db only. IMHO, it's far too early to even plan out-db
>as there is completely no support for out-db on the WKT Raster side.

All the code to support out-db tiles is there and working very well. The only API functions that do not yet support out-db tiles are rt_band_get_data(), rt_band_set_pixel() and rt_band_get_pixel(), which we can implement with GDAL. However, an application dealing with out-db tiles would normally use RT_GetPath() and load each tile from the file system, as was originally planned.

>I'm following principles of prototyping, and I'm not trying to draw a big-ass
>plan and then start developing it. I'm making very small steps,
>so I can change direction quickly.

I understand, but it is probably better to plan while keeping in mind that at some point we have to support out-db tiles (which is one of the main features of WKT Raster).

>> In a best practice documentation section (that I plan to write someday)
>> I would recommend to users scenario 2a or 2b since this is the storage
>> method that will benefit the most from storing raster in the database.
>
>Honestly, I'm starting to worry about too many use cases we are going to offer.
>A simple tool is supposed to be simple enough to use, so it doesn't need a
>brick-weight handbook, with zillions of recommendations and exceptions.
>
>My plan about gdal2wktraster is as short as this summary:
>
>1) Load 1..N raster files into a single table, one raster file per row.
>It's a user's decision, if he wants to load N unrelated/different/heterogeneous
>raster files into single table, or if he wants to keep things in some order.
>With the current gdal2wktraster, the user has the liberty and the tool to achieve both.
>
>2) Load 1 raster file according to principles of regular blocking we have already drawn.
>So, 1 raster is tiled according to block size and tiles are loaded into a single table
>(one tile per row). So, the whole table makes a regular grid of non-overlapping tiles.

And what if I want to store many rasters as in 2) but in the same table? This is not zillions of exceptions, it is just one more use case. And as a simple user this is the case I will encounter the most.
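
A minimal sketch of that use case, assuming pure pixel arithmetic and no GDAL calls: each raster is cut into regular blocks in row-major order, and the tiles of all rasters are appended as rows for one table. This is not how gdal2wktraster.py is implemented. The function names and file names are hypothetical, and clipping the edge tiles (rather than padding them) is an assumption.

```python
from typing import Iterable, Iterator, List, Tuple

# (source, xoff, yoff, xsize, ysize): one future row of the tile table
Tile = Tuple[str, int, int, int, int]


def regular_blocks(source: str, width: int, height: int,
                   block_w: int, block_h: int) -> Iterator[Tile]:
    """Yield non-overlapping tiles in row-major order (left to right, top to bottom)."""
    for yoff in range(0, height, block_h):
        for xoff in range(0, width, block_w):
            # Edge tiles are clipped to the raster extent (assumption)
            xsize = min(block_w, width - xoff)
            ysize = min(block_h, height - yoff)
            yield (source, xoff, yoff, xsize, ysize)


def rows_for_table(rasters: Iterable[Tuple[str, int, int]],
                   block_w: int, block_h: int) -> List[Tile]:
    """Tile every (source, width, height) raster and collect all tiles for one table."""
    rows: List[Tile] = []
    for source, width, height in rasters:
        rows.extend(regular_blocks(source, width, height, block_w, block_h))
    return rows


# Hypothetical input: two files of different sizes, 100 x 100 blocks
rasters = [("a.tif", 250, 100), ("b.tif", 100, 100)]
rows = rows_for_table(rasters, 100, 100)
```

Because the importer generates the windows itself, the tile order is deterministic, which is the point made earlier about not relying on a separate retiling step.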

>I believe that all other cases will over-complicate things at this stage of
>development of WKT raster, and are of a low priority for me.

I understand that Cadcorp is interested in storing rasters as Oracle does and not in considering better use cases, and that what is simple is what Cadcorp has decided is simple and all the rest is overcomplicated.

WKT Raster was not planned for stupidly storing hundreds of big rasters as hundreds of tables. There has been a lot of discussion about this already. WKT Raster is planned to store raster coverages as one huge table to profit from GiST indexing and to implement seamless raster/vector analysis functions. This is a much more modern usage of a raster coverage. Storing many rasters representing the same variables as many tables does not make much sense and is not very useful from an analytical point of view.

I think we have very different views of the best way to store a raster coverage in the DB. You seem to want to stick to what was proposed in past proposals that everybody agreed were not very useful. WKT Raster is precisely a better proposal because it deals with coverages, not big rasters. The out-db feature is also very important for web applications. This is what people want and think is useful, so this project has to provide them with what they want. (Not just what Cadcorp wants...)

Pierre