[postgis-users] How to handle tiled rasters (ST_Value, grid_pt=>coord, coord=>grid_pt etc)?

Sat Dec 11 18:25:23 PST 2010

2010/12/11 Paragon Corporation <lr at pcorp.us> wrote:
...
> Its typical that you would have a regular raster coverage, but I'm still not
> clear why knowing the original grid column/row from the original raster is
> relevant for any use case.

I just getting acquainted by the fact that there are so many
characteristics allowed of different possible raster table
arrangements in PostGIS Raster. I am still looking for situations (and
solutions) where the PostGIS user doesn't have to keep in mind if it's
a tiled raster or not.

The main characteristics to me have been, 1. if a coverage overlaps or
not, and 2. if it is a tiled image. The fact (1) of non-overlapping
can significantly speed up queries and the second would help to
"reconstruct" the original file. Both consequently could to be stored
in metadata.

In other words, if it's a tiled image, would'nt it be good practice to
store that in a single table (without any other images)?

I say this because I realize that it's possible to load many raster
filed into the same raster table even with different blocksize_x and
blocksize_y (in which case I wonder what get's stored in
raster_columns).

...
> The assumption I am making of course is that your raster coverage is so big
> that you would never ask for the whole thing. And if you were planning to do
> it that way, then you would never chop up your raster file in the first
> place.
> So you would only have one record in your table where the columns and rows
> map directly to your original file.

As you assumed below, my other example 'relief_jpg' is a single raster
(originally only 158K). How can I speed up the following query? It's a
query on one row and it takes indefinitely long:
SELECT ST_Value(ST_Reclass(rast, 1, '1:2'), 1, 1) FROM relief_jpg;

...
>> So I have to revise e.g. my query 4 (and your solution):
>> Query 4: Given one real projected coordinate (e.g. 489500/102500) return
>> the raster objects (or tiles) and the ("closest") raster grid point(s) and
>> value(s) therein.
>>
>> Thanks for your solution proposal:
>>
>> SELECT rid, ST_Value(rast, ST_GeomFromText('POINT(489500 102500)', 21781))
>>  FROM ch1000_tif_tiled
>>  WHERE ST_Intersects(rast, ST_GeomFromText('POINT(489500 102500)', 21781)
>> );
>> 1: 181;
>> 2: 201;0
...
> Oh you are getting 2 records back.  This should only happen if your point
> happens to lie at the edge of 2 pixels.

Correct: I took it from the upper left corner (the example in ST_Value
which uses ST_UpperLeftX).
I would expect only one (depending on how the coverage is modelled
:->): What does WCS say?

...
>> -- Solution using metadata
>> SELECT num_rasters, blocksize_x as height, blocksize_y as width, srid,
>>   array_length(pixel_types, 1) AS num_bands, pixel_types as btypes
>>  FROM
>>    raster_columns as r,
>>    (SELECT count(*) AS num_rasters FROM relief_jpg) dummy
>>  WHERE r_table_name = 'relief_jpg';
>> -- 1;;;21781;3;"{8BUI,8BUI,8BUI}"
>
> Don't quite understand what you mean by mix of table name as element and as
> text string.

The subclause uses "... FROM relief_jpg) dummy" whereas in the where
clause there's "WHERE r_table_name = 'relief_jpg';".

I simply thought this is an indication that there's some more
information that could be stored in the metadata, like the fact that
the 240 tiles form a single raster coverage and are non-overlapping.
But I probably still have to think about all this.

>> * Obviously raster2pgsql did not set blocksize_x/blocksize_y
>> I think the raster2pgsql routine only sets the blocksize_x and y if you

Sorry: I mixed up blocksize with pixelsize.

Yours, S.

2010/12/11 Paragon Corporation <lr at pcorp.us>:
>
>
>> Ok; I realise now that I assumed the special case "rectangular regularly
> tiled raster coverage" (defined as arrangement d. in
>> http://trac.osgeo.org/postgis/wiki/WKTRaster/Documentation01 ).
>
>> But still given I have imported one single raster "rectangular regularly
> tiled raster coverage"
>> * Don't you think this and my related queries above could be a typical and
> use case?
>
> Its typical that you would have a regular raster coverage, but I'm still not
> clear why knowing the original grid column/row from the original raster is
> relevant for any use case.
> To me its like trying to remember the original row a geometry was in in the
> ESRI shape file I imported.
>
> If you can state a specific example that would be great :)
>
> I'm pretty green as far as raster analysis goes so please don't take my lack
> of understanding  as an insutl.
>
>> * This probably deserves simplified queries where the user does'nt have to
> think about speed and implementation?
>> * After doing a raster2pgsql (.py) with tiling option "-k 20x20" does the
> fact get lost, that this originally was a single raster file?
>
> The memory of it being a single raster is kept in the raster_columns table.
> You can use the additional file switchin import  which would include the
> original file name as a column in the raster table.  This of course is most
> Useful if you are importing a whole folder of rasters presumably
> hierarchically ordered.
>
>  Perhaps Pierre, Jorge, or Mat have more thoughts on the matter.
>
>> (this fact could certainly give hints to PostGIS to speed up queries).
> How does it give hints?  Right now each raster tile has a spatial bounding
> box around it.  So that is wha tit uses to speed up queries.  The only more
> in depth hint I can think of is when getting into the actual
> Pixels of a tile (by generating some histogram of the distribution of pixel
> ranges to make pixel value queries much much faster), then again -- knowing
> that a tile is part of a bigger set I still don't see the usefulness of as
> far as speed is concerned.
>
> The assumption I am making of course is that your raster coverage is so big
> that you would never ask for the whole thing. And if you were planning to do
> it that way, then you would never chop up your raster file  in the first
> place
> So you would only have one record in your table where the columns and rows
> map directly to your original file.
>
>
>
>> So I have to revise e.g. my query 4 (and your solution):
>> Query 4: Given one real projected coordinate (e.g. 489500/102500) return
> the raster objects (or tiles) and the ("closest") raster grid point(s) and
>> value(s) therein.
>
>> Thanks for your solution proposal:
>
>> SELECT rid, ST_Value(rast, ST_GeomFromText('POINT(489500 102500)', 21781))
>>  FROM ch1000_tif_tiled
>>  WHERE ST_Intersects(rast, ST_GeomFromText('POINT(489500 102500)', 21781)
> );
> 1: 181;
> 2: 201;0
>
>> The result would be ok for the general case (not knowing how many
> different rasters are in the table).
>
>> But the resuls seems weird (having two rows) given that I imported only
> and only one raster as tiles.
>
> Oh you are getting 2 records back.  This should only happen if your point
> happens to lie at the edge of 2 pixels.
>
> I see your point though that if your raster were in a single record, you
> would get one record back and the record would in fact be arbitrary of the
> 2.
> I think Pierre had discussed this a while back what to do in this scenario.
>
>
> Really in my mind, the correct thing to do would be to return an array.  But
> correctness is less useful in many cases than just arbitrarily picking one.
> Especially
> Since this SHOULD BE a rare occurrence that a point lies at the edge of two
> pixels.
>
> Though the fact they are sitting in two different rows means the point is a
> the edge of a tile and a pixel which would seem even rarer unless you
> specifically
> Picked up a point in that location.  Is that the case?
> If not then this is a bug.
>
>> Regarding metadata I played around with the examples in chapter 13 using
> an untiled raster 'relief_jpg' with 3 bands:
>
>> -- Listing 13.1 Getting general summary info about rasters in a kauai
> table SELECT count(*) As num_rasters, ST_Height(rast) As height,
>    ST_Width(rast) As width, ST_SRID(rast) As srid,
>    ST_NumBands(rast) As num_bands,
>    ST_BandPixelType(rast,1) As btype
> FROM relief_jpg
> GROUP BY ST_Height(rast) ,
>    ST_Width(rast), ST_SRID(rast),
>    ST_NumBands(rast),
>    ST_BandPixelType(rast,1);
> -- 1;584;891;21781;3;"8BUI"
>
>> -- Solution using metadata
> SELECT num_rasters, blocksize_x as height, blocksize_y as width, srid,
> array_length(pixel_types, 1) AS num_bands, pixel_types as btypes
>  FROM
>    raster_columns as r,
>    (SELECT count(*) AS num_rasters FROM relief_jpg) dummy
>  WHERE r_table_name = 'relief_jpg';
> -- 1;;;21781;3;"{8BUI,8BUI,8BUI}"
>
>>  So, two things are unclear to me:
>> * btypes should be an array in your solution since there are 3 bands,
> aren't they?
> Yes they are which the text representation of an array in PostgreSQL is
> {8BUI,8BUI,8BUI}
>
>> * in my solution there is a mix of table name as element and as text
> string.
>> * Obviously raster2pgsql did not set blocksize_x/blocksize_y
>
> Don't quite understand what you mean by mix of table name as element and as
> text string.
> From your query it looks like you only have one raster record in you
> relief_jpg.
> I think the raster2pgsql routine only sets the blocksize_x and y if you
> choose to tile the raster.  I suppose it might make
> Sense to just have it set to the width and height of the raster if the
> raster is not tiled.  Then again the fact its not set
> Gives you the clue that its tiled verses not tiled, which seems kind of
> useful.
>
> Thanks for reviewing the chapter,
> Regina
>
> http://www.postgis.us
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>