[gdal-dev] GDAL Postgis Driver using SPI?

Tue Jan 26 02:31:17 PST 2016

Hi,

> 
> I am working on a project to manipulate huge rasters in a postgis database.
> In order to achieve best performance, we are implementing most of our code
> 'server-side', e.g. as a postgresql extension.
> 
> It occurs to me that another GDAL postgis driver using SPI
> (http://www.postgresql.org/docs/current/static/spi.html) instead of libpq
> would be entirely feasible and offer better performance as it executes
> directly against the database.

I'm not familiar with PG server-side programming, but the introduction of the 
SPI doc says: "Note that if a command invoked via SPI fails, then control will 
not be returned to your procedure." Would that mean that a C++ method in the 
driver could be interrupted? (Probably through a longjmp(); I imagine that is 
how it must be implemented.) That could potentially cause memory leaks, or 
worse, if no precaution is taken (*). Perhaps using palloc() / SPI_palloc() 
would be needed (but that wouldn't help for temporarily allocated C++ 
objects)?
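
To make the concern concrete, here is a minimal sketch of how a non-local jump 
skips cleanup code. It assumes the error path behaves like a plain longjmp(), 
which is my guess above and not something checked against the PostgreSQL 
sources. All names (tracked_malloc, failing_command, read_block, ...) are 
hypothetical stand-ins and are neither SPI nor GDAL functions.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdlib>

// Hypothetical stand-ins: none of these names come from PostgreSQL or GDAL.
static std::jmp_buf g_error_handler;  // plays the role of the server's error handler
static int g_live_buffers = 0;        // buffers allocated and not yet freed

static void *tracked_malloc(std::size_t n) {
    void *p = std::malloc(n);
    if (p != nullptr) ++g_live_buffers;
    return p;
}

static void tracked_free(void *p) {
    if (p == nullptr) return;
    assert(g_live_buffers > 0);
    --g_live_buffers;
    std::free(p);
}

// Stand-in for a failing command: control never returns to the caller.
[[noreturn]] static void failing_command() {
    std::longjmp(g_error_handler, 1);
}

// Stand-in for a driver method that allocates, runs a command, then cleans up.
// Only malloc'ed memory is used on purpose: jumping over an automatic C++
// object with a non-trivial destructor is undefined behaviour in C++.
static void read_block(bool command_fails) {
    void *tile = tracked_malloc(256);
    if (command_fails) failing_command();  // jumps straight past the cleanup below
    tracked_free(tile);
}

// Returns how many buffers are still allocated once control is back here.
int leaked_buffers_after_read(bool command_fails) {
    g_live_buffers = 0;
    if (setjmp(g_error_handler) == 0) {
        read_block(command_fails);
    }
    return g_live_buffers;
}
```

Calling leaked_buffers_after_read(false) leaves no buffer behind, whereas 
leaked_buffers_after_read(true) leaves one, because tracked_free() is never 
reached. Memory obtained from an allocator that the server releases itself on 
error would not have this problem, but destructors of C++ objects on the 
stack would still be skipped.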

> Making it perfect in situations where:
> 
> - The postgres database is local (so using libpq to setup a network
> connection is not necessarily the best method)
> - You want best read/write performance and are happy to only operate gdal
> 'server side'
> - You want to use gdal in order to write postgres/postgis extensions in C
> and expose them to clients as SQL (e.g. thin clients, webapps etc..)

So you would have an extension using a GDALDataset and offering processing on 
it?
I'm trying to guess use cases where this would be a clear win over using 
PostgisRaster ST_XXX functions, which also run server-side. Perhaps for 
operations on a large set of tiles seen as a consistent virtual dataset rather 
than operating on each raster tile individually as the PostgisRaster functions 
do, when you don't want to do an ST_Union first or hit the limits of a PostGIS 
raster object?

> 
> Using SPI would require building the driver against postgres and is
> somewhat similar to using libpq, such that much of the existing code in
> the postgis raster driver can be reused.

In your previous email you mention "a entirely new driver". Do you think it 
would be feasible to have common classes shared by the PGRaster client driver 
and the PGRaster server driver for most of the code, and abstract away the way 
of talking to the DB (i.e. libpq vs SPI)? That would ease long-term maintenance 
by avoiding the code duplication that would result from the initial copy&paste 
operation.
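
As a rough illustration of what I mean by sharing code, the DB access could 
sit behind a small interface, with one implementation per transport. This is 
only a sketch: SqlBackend, FakeBackend, RasterDataset and 
count_tiles_with_fake are hypothetical names, not classes of the existing 
driver, and the in-memory fake stands in for the real libpq and SPI 
implementations so that the example stays self-contained.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// All names here are hypothetical; none exist in GDAL or PostgreSQL.
using Row = std::vector<std::string>;

// The only part that would differ between the client and server drivers.
class SqlBackend {
public:
    virtual ~SqlBackend() = default;
    virtual std::vector<Row> Execute(const std::string &sql) = 0;
};

// A client build would implement Execute() with libpq, a server build with
// SPI. This in-memory fake returns canned rows for known statements.
class FakeBackend : public SqlBackend {
public:
    void AddResult(const std::string &sql, std::vector<Row> rows) {
        results_[sql] = std::move(rows);
    }
    std::vector<Row> Execute(const std::string &sql) override {
        auto it = results_.find(sql);
        return it == results_.end() ? std::vector<Row>{} : it->second;
    }
private:
    std::map<std::string, std::vector<Row>> results_;
};

// Shared dataset logic, written once against the interface.
class RasterDataset {
public:
    RasterDataset(std::unique_ptr<SqlBackend> backend, std::string table)
        : backend_(std::move(backend)), table_(std::move(table)) {
        assert(backend_ != nullptr);
    }
    // Counts tiles; returns -1 if the backend gives no usable answer.
    long CountTiles() {
        const auto rows = backend_->Execute("SELECT count(*) FROM " + table_);
        if (rows.empty() || rows[0].empty()) return -1;
        return std::stol(rows[0][0]);
    }
private:
    std::unique_ptr<SqlBackend> backend_;
    std::string table_;
};

// Small helper used to exercise the shared code with the fake backend.
long count_tiles_with_fake(const std::string &table, const std::string &count) {
    auto fake = std::make_unique<FakeBackend>();
    fake->AddResult("SELECT count(*) FROM " + table, std::vector<Row>{Row{count}});
    RasterDataset ds(std::move(fake), table);
    return ds.CountTiles();
}
```

With such a split, the dataset and band classes would only ever see 
SqlBackend, and the choice between libpq and SPI would be made once, when the 
dataset is opened.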

Even

(*) the doc mentions later "It is possible to recover control after an error 
by establishing your own subtransaction surrounding SPI calls that might fail. 
This is not currently documented because the mechanisms required are still in 
flux." ...

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com