[postgis-devel] [WKT Raster] Regular blocking in gdal2wktraster.py

Fri Mar 27 07:37:16 PDT 2009

>> >> If an application treats each row of a raster table as a single
>> >> raster (as I have always tried to sell WKT Raster: "a tile is a raster
>> >> and a raster is a tile, there is no distinction") then the
>> >> application should not see any difference, just as for a vector layer.
>> >
>> >"Raster" can be a slightly ambiguous term sometimes.
>> >A raster is a single pixel (a sample) or it is a complete grid of
>> >pixels.
>> >Let's assume a raster is a grid of pixels.
>> >Since the beginning of my WKT raster adventure, I preferred to say:
>> >1 table = 1 raster (raster coverage, raster overlay, just raster).
>>
>> In my mind:
>>
>> 1 raster = 1 tile = 1 row
>>
>> and
>>
>> 1 coverage = 1 table
>>
>> 1 "big raster" = 1 table. It is similar to a small coverage but very
>> (too much) constrained (to regular blocking without missing tiles).
>> Let's put it like this: "big raster" is the quintessence of coverage
>> perfection :-) But we cannot add tiles (from a second file, for
>> example) to it without breaking its perfection. If we can (append tiles
>> from many source images) and we support missing tiles, then type 4
>> is identical to type 2 and the "big raster" type disappears.
>>
>> many "big raster"s would have to be stored as separate tables (but this
>> would not be true anymore if we support loading of many images at the
>> same time).
>
>I have to think more about it and the terminology we use.

Sure, "big raster" doesn't mean much... The main idea behind it is that such an image is treated independently of other images (even if together they are meant to form a coverage) and is hence stored in its own table. I think we could call it a "tiled image", as opposed to a "tiled coverage", which is generally the result of many images stored in a single table. In my mind, a set of "tiled images" stored as many tables is less useful from a geospatial analytical point of view than a "tiled coverage". A "tiled image" is also much more constrained than a "tiled coverage", for all the reasons we discussed.

>> >> I understand, however, that
>> >> applications prefer to load big, rectangular, complete tiled rasters,
>> >> but this is very limiting and not very practical from an
>> >> analytical point of view.
>> >
>> >AFAIU, regular blocking is supposed to help to operate on a part of
>> >big raster (a coverage), on a window of big raster.
>>
>> In my mind it is more the tiling than the regular blocking that helps
>> to operate on a part of a big raster. Tiles can be of any size (regular
>> blocking is optional/optimal). This is probably not true in a file
>> like a TIFF, but it certainly is in a database with GiST indexing, where
>> tiles are selected through a geometric SQL query, e.g. "SELECT rast FROM
>> table WHERE rast && geom" (not through address math). Another way to
>> say it would be that a database index is much more flexible than an
>> address-based file index; this is why it allows variable tile sizes.
>> This is why regular blocking is not as important for me as tiling. WKT
>> Raster does not care if tiles are not all the same size. Agreed?
>
>One correction, WKT Raster can work in two modes:
>1. By default, it does not care if tiles are of the same size or not.
>2. If regular blocking is enabled, WKT Raster does care.
>
>Agreed?

How does it care, other than by setting regular_blocking to true? WKT Raster functions do not have a clue about the global structure of a table of tiles. They deal with only one tile at a time, treating it as a single, complete raster. Remember: "a tile is a raster and a raster is a tile"... Every future SQL function will get a row as its argument, never a table.

From my point of view, regular_blocking is just a flag telling applications: "hey! this table is almost perfect and is structured that way (like a TIFF file!), so you can take advantage of it... if you can..." Otherwise, an application should load one tile at a time (using a proper SQL query), as it does for a vector feature table. It is still not clear to me how an application can really take advantage of regular blocking. The way you access tiles in a database is very different from the way you access tiles in a file.
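
To make the contrast between the two access styles concrete, here is a minimal Python sketch. It is not WKT Raster code: Tile, select_tiles and regular_tile_indices are hypothetical names, tiles are reduced to their bounding boxes, and y is assumed to grow upward from the grid origin. The first function imitates what a query like "WHERE rast && geom" does conceptually (a bounding-box overlap test that works for any tile layout); the second is the address math that only makes sense when every tile has the same size and none is missing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Bounding box of one tile (hypothetical stand-in for one table row)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a, b):
    """Bounding-box overlap test; boxes that only touch count as overlapping."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def select_tiles(tiles, window):
    """Database-style selection: scan tiles of any size and keep the overlapping ones.

    A real spatial index would avoid the full scan; the result is the same.
    """
    return [t for t in tiles if intersects(t, window)]


def regular_tile_indices(window, origin_x, origin_y, tile_w, tile_h, ncols, nrows):
    """File-style address math: valid only for a complete grid of equal-size tiles."""
    col0 = max(0, int((window.xmin - origin_x) // tile_w))
    col1 = min(ncols - 1, int((window.xmax - origin_x) // tile_w))
    row0 = max(0, int((window.ymin - origin_y) // tile_h))
    row1 = min(nrows - 1, int((window.ymax - origin_y) // tile_h))
    return [(r, c) for r in range(row0, row1 + 1) for c in range(col0, col1 + 1)]


# Irregular tiles: no address math is possible, but the overlap test still works.
irregular = [Tile(0, 0, 100, 100), Tile(100, 0, 250, 80), Tile(0, 100, 60, 300)]
window = Tile(90, 10, 120, 50)
selected = select_tiles(irregular, window)

# Regular 3x3 grid of 100x100 tiles: indices can be computed without looking at any tile.
indices = regular_tile_indices(window, 0, 0, 100, 100, ncols=3, nrows=3)
```

The sketch only shows that the overlap test needs no assumption about tile sizes, whereas the index computation depends entirely on the regular grid.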

>> Question: I did not try the -k option yet. It seems to load tiles as
>> they are already defined in the file, right? I can't specify a new
>> block size (this is what I call retiling)? e.g. -k 100x100
>
>Right, there is no way to specify size of tile.
>
>Simply, I'm a child of Unix, where I use a number of scattered but very
>specialized tools instead of a few big monolithic applications.
>Also, I don't like to reinvent the wheel :-)
>So, if I need to tile a dataset to a custom size, I just
>use GDAL utilities. Then I use gdal2wktraster to load these tiles.
>That's why I can't see any point to implement custom-size-tiling
>in gdal2wktraster.

Then, wasn't adding the -k option a bit like reinventing the wheel? You could have used gdal_retile.py to create a bunch of filesystem tiles and imported them with a wildcard. This is what I did when I first imported my 3600 100x100 tiles three weeks ago. I was almost happy with this solution until you added -k. Then I also wanted it to work on multiple files. To simplify everything, we could also remove the -k option and tell our users "if you want this tile structure when importing your raster(s), use gdal_retile.py"... Is it really "simplifying" if we tell people to use another tool?
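
For what it is worth, what I call retiling boils down to cutting an image into pixel windows of a chosen block size. The Python sketch below only illustrates that idea; it is not how gdal_retile.py or gdal2wktraster.py is implemented. tile_windows is a hypothetical helper, and 6000x6000 is just an assumed image size that happens to give 3600 tiles of 100x100.

```python
def tile_windows(width, height, block_w, block_h):
    """Yield (xoff, yoff, xsize, ysize) pixel windows covering a width x height image.

    Windows on the right and bottom edges are clipped, so they may be smaller
    than the requested block size.
    """
    for yoff in range(0, height, block_h):
        for xoff in range(0, width, block_w):
            yield (xoff, yoff,
                   min(block_w, width - xoff),
                   min(block_h, height - yoff))


# Assumed 6000x6000 image cut into 100x100 blocks.
windows = list(tile_windows(6000, 6000, 100, 100))
tile_count = len(windows)

# An image whose width is not a multiple of the block size gets a narrower last tile.
clipped = list(tile_windows(250, 100, 100, 100))
```

Each window could then be read from the source image and loaded as one row, whichever tool ends up doing the cutting.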

Pierre