[postgis-users] A PostGIS-Raster data proposal

Wed Oct 25 12:41:08 PDT 2006

>From my perspective the primary benefit of doing something like this would
be to put images into tiles.  E.g. if you want a mosaic of USGS quads for
the entire US, chop them into tiles, spatially index the MBRs, and then
query the database for  the extent of interest.  The database would
automatically pull in the tiles necessary to create the image and merge them
together.  A single image of the desired extent would be returned.

--Craig

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
Marshall, Steve
Sent: Wednesday, October 25, 2006 12:14 PM
To: postgis-users at postgis.refractions.net
Subject: Re: [postgis-users] A PostGIS-Raster data proposal

>From what I've been able to find re toasted columns; they're stored 
>somewhere in the database as a separate table, outside the referencing
row,
>and limited to 1GB.  The 1GB size may be significant in that some
raster
>formats can be larger.  Not a show-stopper, but it would require 
>repackaging of very large jpeg2000s, for example.

The 1GB limitation is a good point.  It is not a stopper for my purposes,
but could be a problem for some applications, e.g. very high resolution
satellite imagery, unless the data is reorganized into smaller tiles.

> Given that raster archives can be many TB in size, what are the 
> ramifications for backup and restore of a database containing toasted 
> rasters ?
> Since toasted values are in a separate table, can I backup and restore
the
> primary table rows sans toasted columns, and manage the toast in
slices ?
> What happens if my raster table gets, um, toasted  ?  Hopefully, a
full
> multiTB  restore/reorg would not be necessary before I'm running
again.

The TOAST implementation is hidden from the end user, so toasted columns go
through the backup/restore processes as though they were stored inline in
the table.  At least to my knowledge, you cannot dump the non-toasted
columns in a table without dumping the toasted columns.  The same
limitations apply to restore, which may require more inefficient restore
procedures if the image data becomes corrupted.  However, for what it's
worth, I've had no corruption problems in PostgreSQL that were not related
to hardware or power failures, things I now take more pains to avoid.

Both the TOAST limitations you cite might be better mitigated by using the
PostgreSQL large object interface instead.  LO size limitation is 2GB,
instead of 1GB for TOAST, and you can dump the LO data independently of the
tables that reference the LO data, which would address your backup concerns.
Also, LO has a more capable random access interface than TOAST, providing
seeks for both reads and writes.

Unfortunately, the LO implementation has some limitations.  I don't think
the seeking writes are guaranteed to follow transactional semantics, i.e.
they are not rolled back correctly (this at least used to be true).  But
more worrying to me is the fact that all LO's are stored in a single
relation (pg_largeobject) so this relation is likely to get very big.  For
your multi-TB image repository, all this data
would be in a single relation.   This would mean thousands of 1GB files
to hold the pg_largeobject relation, as well as a lot of space for the index
into this pg_largeobject.  Very large relations in PostgreSQL are difficult
to VACUUM, and tend to have poor performance characteristics, in part
because their indices cannot be loaded into memory all-at-once.

TOAST allows one to store image data in many relations because there is
one toast table for each relation that has toastable columns.   I think
the  multi-relation storage for image data will perform better and be more
maintainable, but your comments have been good food for thought on the
subject.

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users