[postgis-devel] GSoC idea: in-situ access for rasters
Pierre Racine
Pierre.Racine at sbf.ulaval.ca
Wed Apr 17 07:40:33 PDT 2013
Vladimir,
Could you say more about the pros and cons of the Foreign Data Wrapper approach versus the out-db storage that is already available for PostGIS rasters?
If there are not many pros, would you be willing to propose or work on another idea?
Pierre
> -----Original Message-----
> From: postgis-devel-bounces at lists.osgeo.org [mailto:postgis-devel-bounces at lists.osgeo.org] On Behalf Of Vladimir Kikhtenko
> Sent: Thursday, April 11, 2013 4:23 AM
> To: postgis-devel at lists.osgeo.org
> Subject: [postgis-devel] GSoC idea: in-situ access for rasters
>
> Hello, list!
>
> I would like to present my idea for a GSoC'13 project. It is about
> querying raster data without loading it into the database first.
> Using Foreign Data Wrappers (a fairly new PostgreSQL 9 feature), we can
> create a foreign table for a raster file that reads data directly
> from the file when the query is issued. I expect that this foreign
> table will contain one row for each pixel in the source file and one
> column for each dataset in it. Additional columns will represent the
> geolocation of the data. For example, the table could look like this:
>
> CREATE FOREIGN TABLE foreign_raster (
>     xpos int4, -- these two columns give the pixel's position in the raster
>     ypos int4,
>     footprint geometry OPTIONS (type 'footprint'), -- geolocation of the pixel
>     layer1 float8 OPTIONS (sds 'Atmospheric Optical Depth'),
>     layer2 int2 OPTIONS (sds 'Atmospheric Optical Depth Model', type 'byte')
> ) SERVER hvault_service
>   OPTIONS (filename '/some/path/some-file.hdf');
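
As a rough illustration of the row layout described above (one row per pixel, one column per dataset, plus a geolocation column), here is a minimal Python sketch. It is not the hvault code: the function name `pixel_rows`, the simplified four-value geotransform, and the WKT polygon used for the footprint are all assumptions made for this example.

```python
def pixel_rows(layers, geotransform):
    """Yield one dict per pixel: xpos, ypos, footprint (WKT) and one key per layer.

    layers: dict mapping a column name to a 2-D list of pixel values;
            all layers are assumed to have the same shape.
    geotransform: (origin_x, pixel_width, origin_y, pixel_height) for an
            assumed north-up grid without rotation (pixel_height is
            normally negative).
    """
    origin_x, pixel_w, origin_y, pixel_h = geotransform
    first = next(iter(layers.values()))
    for ypos, row in enumerate(first):
        for xpos in range(len(row)):
            # Corners of this pixel in georeferenced coordinates.
            x0 = origin_x + xpos * pixel_w
            y0 = origin_y + ypos * pixel_h
            x1, y1 = x0 + pixel_w, y0 + pixel_h
            footprint = (
                f"POLYGON(({x0} {y0}, {x1} {y0}, {x1} {y1}, {x0} {y1}, {x0} {y0}))"
            )
            record = {"xpos": xpos, "ypos": ypos, "footprint": footprint}
            # One column per dataset, as in the foreign table definition.
            for name, grid in layers.items():
                record[name] = grid[ypos][xpos]
            yield record


# Hypothetical 2x2 raster with two datasets.
rows = list(pixel_rows(
    {"layer1": [[0.1, 0.2], [0.3, 0.4]], "layer2": [[1, 2], [3, 4]]},
    geotransform=(10.0, 1.0, 50.0, -1.0),
))
```

Because the rows are produced by a generator, a reader built along these lines would not need to hold the whole raster in memory as rows at once.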
>
> Furthermore, we can create a catalog of files, so that the foreign
> table contains the pixels from all of them. This approach is useful
> when you have a large archive of raster data that you want to access
> as if it were in the database, but do not want to copy it because of
> disk space constraints. Another option is to store a timestamp for
> each file in the catalog, so that we can request a time series for a
> point or region of interest.
>
> The FDW API also makes it possible to analyze the query tree before
> execution, which allows additional optimizations. We can examine the
> query quals and filter the catalog, so that we read only the files that
> contribute to the query result. For example, if we store the footprint
> of each whole raster in the catalog and the query contains a clause like
> ST_Contains(footprint, ST_GeometryFromText('POINT( 43.19 64.90)')), we
> can process only the files whose footprint contains that point.
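
The catalog filtering described above can be sketched in a few lines of Python. This is only an illustration, not the hvault implementation: the `CatalogEntry` type, the bounding-box footprints, the timestamps and the file names are hypothetical. A real wrapper would extract the point from the query quals and test it against true footprint geometries rather than boxes.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CatalogEntry:
    """One raster file in the (hypothetical) catalog."""
    filename: str
    taken_at: datetime   # per-file timestamp, used for time-series ordering
    bbox: tuple          # (min_x, min_y, max_x, max_y), a stand-in for the footprint


def files_for_point(catalog, x, y):
    """Return only the entries whose bounding box contains (x, y), oldest first."""
    hits = [
        entry for entry in catalog
        if entry.bbox[0] <= x <= entry.bbox[2]
        and entry.bbox[1] <= y <= entry.bbox[3]
    ]
    return sorted(hits, key=lambda entry: entry.taken_at)


# Hypothetical catalog with three files.
catalog = [
    CatalogEntry("a.hdf", datetime(2013, 4, 2), (40.0, 60.0, 50.0, 70.0)),
    CatalogEntry("b.hdf", datetime(2013, 4, 1), (42.0, 63.0, 45.0, 66.0)),
    CatalogEntry("c.hdf", datetime(2013, 4, 3), (0.0, 0.0, 10.0, 10.0)),
]

# Point taken from the ST_Contains example; c.hdf is never opened.
selected = files_for_point(catalog, 43.19, 64.90)
```

Sorting the surviving entries by timestamp gives the order in which a time series for that point would be read.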
>
> I have already implemented this idea for HDF files as part of my MSc
> thesis, which I will defend in June. See
> https://github.com/kikht/fdb/tree/master/hvault (sorry, the code is
> not very well commented yet). We use it in our lab to access a
> 100 TB archive of MODIS images. I think it is possible to implement
> file access through GDAL, so that many file formats would be
> supported at once.
>
> Are you interested in such a project? Any comments or questions are
> highly appreciated.
>
> --
> Vladimir Kikhtenko
> Novosibirsk State University, Russia