[postgis-devel] GSoC idea: in-situ access for rasters

Pierre Racine Pierre.Racine at sbf.ulaval.ca
Wed Apr 17 07:40:33 PDT 2013


Vladimir,

Could you say more about the pros and cons of the Foreign Data Wrapper approach versus the out-db storage that is already available for PostGIS rasters?

If there are not many pros, would you be willing to propose or work on another idea?

Pierre

> -----Original Message-----
> From: postgis-devel-bounces at lists.osgeo.org [mailto:postgis-devel-
> bounces at lists.osgeo.org] On Behalf Of Vladimir Kikhtenko
> Sent: Thursday, April 11, 2013 4:23 AM
> To: postgis-devel at lists.osgeo.org
> Subject: [postgis-devel] GSoC idea: in-situ access for rasters
> 
> Hello, list!
> 
> I want to present my idea for a GSoC'13 project. It is about interacting
> with raster data without preloading it into the database first.
> Using Foreign Data Wrappers (a fairly new Postgres 9 feature) we can
> create a foreign table for a raster file that reads data directly
> from the file when the query is issued. I suppose that this foreign
> table will contain a row for each pixel in the source file and a column
> for each dataset in it. Also, some columns will represent the
> geolocation of the data. For example, the table can look like:
> 
> CREATE FOREIGN TABLE foreign_raster (
>     xpos int4, -- these two columns describe the pixel's position in the raster
>     ypos int4,
>     footprint geometry OPTIONS (type 'footprint'), -- geolocation of a pixel
>     layer1 float8 OPTIONS (sds 'Atmospheric Optical Depth'),
>     layer2 int2 OPTIONS (sds 'Atmospheric Optical Depth Model', type 'byte')
> ) SERVER hvault_service
>   OPTIONS (filename '/some/path/some-file.hdf');
> 
> Moreover, we can create a catalog of files, so that the foreign table
> contains pixels from each of them. This approach can be useful when you
> have a large archive of raster data that you want to access as if it were
> in the database, but do not want to create a copy of it due to disk
> usage constraints. Another option is to store timestamps for the files in
> the catalog, so we can request a time series for some point or region of
> interest.
> 
> Also, the FDW API makes it possible to analyze the query tree before
> execution, which allows additional optimizations. We can examine the
> query quals and filter the catalog, so that we only read files that
> contribute to the query result. For example, if we store the footprint
> of the whole raster in the catalog, and the query contains a clause like
> ST_Contains(footprint, ST_GeometryFromText('POINT( 43.19 64.90)')), we
> can process only the files that contain that point in their footprint.
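
To make the pruning idea concrete, here is a sketch of a query whose first predicate a wrapper could, in principle, evaluate against the catalog before opening any file. The table name foreign_raster_catalog and the file_footprint column (the footprint of a whole source raster) are assumptions made for illustration, not names taken from the original post.

```sql
-- Hypothetical catalog-backed foreign table with an assumed file_footprint
-- column that stores the footprint of the whole source raster.
SELECT xpos, ypos, layer1
FROM foreign_raster_catalog
-- file-level predicate: could be checked against the catalog at planning
-- time, so files that do not cover the point are never read
WHERE ST_Contains(file_footprint, ST_GeometryFromText('POINT(43.19 64.90)'))
  -- pixel-level predicate: checked per row for the files that remain
  AND layer1 IS NOT NULL;
```

Whether a given predicate can actually be handled at the catalog level depends on how the wrapper inspects the quals; the split shown in the comments is the intended behaviour, not a guarantee.
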
> 
> I've already implemented this idea for HDF files as part of my MSc
> thesis, which I will defend in June. See
> https://github.com/kikht/fdb/tree/master/hvault (sorry, the code is
> not very well commented yet). We are using it in our lab to access
> a 100Tb archive of MODIS images. I think it is possible to implement
> access to files through GDAL, so many file formats would be supported
> at once.
> 
> Are you interested in such a project? Any comments or questions are
> highly appreciated.
> 
> --
> Vladimir Kikhtenko
> Novosibirsk State University, Russia


