[postgis-devel] [postgis-users] Large GeoTIFF ingestion by PostGIS

Antonio Rodriges antonio.rrz at gmail.com
Sun May 6 08:47:59 PDT 2018


Regina,

Thank you for answering and sharing the experience.

> Storing a large raster in the database as a single raster record is pretty much out of the question. You'd quickly exceed the max rowxcol pixel allowed. I forget exactly what that limit is. In addition, the spatial index on the rasters would be useless, so any join operation you do or filtering by spatial location will be painfully slow.

I would guess that "rowxcol" should be read as "row x
(multiply) col", i.e. the total number of pixels per raster object.

>
> What kind of operations are you planning to do with the rasters?  That will determine the tile size you should use.

The main workload will be reclassification, map algebra, resampling,
and similar operations.
However, while unary operations like interpolation are easy to
express (just run the operation on each table row), others are not.
For example, I would like to subtract one raster from another: A - B.
We import big rasters as tiles that are stored in a table with many rows.
If A and B cover different spatial areas and have different resolutions, should I:
1) find the common bounding box of A and B, say bbox
2) clip A and B to the bbox (A and B can be already in-db, so this
should be slower than out-db; this is actually a step of a more
complex query)
3) resample A or B to the resolution of the other (some common resolution)
4) finally subtract A - B
Which tiles should I subtract from each other?
How can I be sure that tiles with the same rid in A and B cover
exactly the same spatial extent?
Is there a better way to accomplish such a simple operation as a
difference?
Will spatial indexing or any other facility help in this situation?
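To make the steps concrete, here is a minimal sketch in plain Python of steps 1-4 on two tiny in-memory single-band grids, using nearest-neighbor resampling onto A's resolution. The grid representation and function names are mine for illustration only; in PostGIS these steps would correspond roughly to ST_Clip, ST_Resample, and ST_MapAlgebra.

```python
# Toy single-band raster: lower-left origin (x0, y0), square pixel size px,
# and values[row][col] with row 0 at the bottom. Illustration only --
# this is not the PostGIS raster type.

def extent(r):
    """(xmin, ymin, xmax, ymax) covered by the raster."""
    rows, cols = len(r["values"]), len(r["values"][0])
    return (r["x0"], r["y0"], r["x0"] + cols * r["px"], r["y0"] + rows * r["px"])

def common_bbox(a, b):
    """Step 1: intersection of the two extents."""
    ax0, ay0, ax1, ay1 = extent(a)
    bx0, by0, bx1, by1 = extent(b)
    return (max(ax0, bx0), max(ay0, by0), min(ax1, bx1), min(ay1, by1))

def sample(r, x, y):
    """Nearest-neighbor lookup of the pixel containing (x, y)."""
    i = int((x - r["x0"]) / r["px"])
    j = int((y - r["y0"]) / r["px"])
    return r["values"][j][i]

def subtract(a, b):
    """Steps 2-4: clip both rasters to the common bbox, resample B onto
    A's grid, and compute A - B per pixel."""
    x0, y0, x1, y1 = common_bbox(a, b)
    px = a["px"]                      # step 3: use A's resolution as the common one
    rows, cols = int((y1 - y0) / px), int((x1 - x0) / px)
    values = [[sample(a, x0 + (i + 0.5) * px, y0 + (j + 0.5) * px)
               - sample(b, x0 + (i + 0.5) * px, y0 + (j + 0.5) * px)
               for i in range(cols)] for j in range(rows)]
    return {"x0": x0, "y0": y0, "px": px, "values": values}

# A: 4x4 grid of 10s at origin (0, 0), 1-unit pixels.
A = {"x0": 0, "y0": 0, "px": 1, "values": [[10] * 4 for _ in range(4)]}
# B: 2x2 grid of 3s at origin (1, 1), 2-unit pixels -- different area and resolution.
B = {"x0": 1, "y0": 1, "px": 2, "values": [[3] * 2 for _ in range(2)]}
D = subtract(A, B)    # 3x3 grid of 7s covering the overlap (1, 1)-(4, 4)
```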

I suppose most difficulties arise from a raster being dissolved into
separate tiles within a single table, leading to complex SQL queries.
Am I right, or are there simpler ways to work with large rasters?
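On the "which tiles" question in particular: that is exactly what a spatial join answers. Pairing each A tile with the B tiles whose extents it overlaps (in PostGIS, a join on ST_Intersects over the two tile tables, accelerated by a GiST index) keeps each subtraction local to one pair of tiles. A toy version of the pairing predicate, with bounding boxes as (xmin, ymin, xmax, ymax) tuples (representation assumed for illustration):

```python
def overlaps(p, q):
    """True if two bboxes (xmin, ymin, xmax, ymax) share interior area --
    the same kind of test a spatial index performs on tile extents."""
    return p[0] < q[2] and q[0] < p[2] and p[1] < q[3] and q[1] < p[3]

# Two A tiles side by side; one B tile straddling them.
a_tiles = [(0, 0, 2, 2), (2, 0, 4, 2)]
b_tiles = [(1, 0, 3, 2)]

# The "spatial join": every (A tile, B tile) pair that needs a subtraction.
pairs = [(i, j)
         for i, ta in enumerate(a_tiles)
         for j, tb in enumerate(b_tiles)
         if overlaps(ta, tb)]
# pairs == [(0, 0), (1, 0)] -- both A tiles overlap the single B tile
```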

Thank you!
Antonio

> 128x128 is most suitable for intensive analysis of small areas. If you are doing less granular work, like just unioning or clipping rasters, then you'd probably want to go with larger tile sizes. I tend to use 256x256 and rarely exceed 1000x1000.
>
> I haven't benchmarked out-db operations recently, but some operations are faster out-db than in-db, and some are faster in-db.
>
> Generally, processes like converting to other raster formats, tiling, and clipping are, as I recall, faster out-db. Operations in the reclass / map algebra family are slower out-db.
>
> -----Original Message-----
> From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Antonio Rodriges
> Sent: Thursday, May 03, 2018 6:11 AM
> To: PostGIS Users Discussion <postgis-users at lists.osgeo.org>; postgis-devel at lists.osgeo.org
> Subject: [postgis-users] Large GeoTIFF ingestion by PostGIS
>
> Hello,
>
> I have a relatively large GeoTIFF file that I would like to process with PostGIS (other, much larger GeoTIFFs also exist): 24937 x 38673 pixels
>
> So far I am aware of 3 ways to ingest the file into PostGIS.
> However, I have concerns about all of these ways:
>
> 1: in-db, no tiling:
> raster2pgsql raster.tiff public.new_table |  psql -d postgis -U postgis
>
> In fact, we create a single large raster inside PostGIS. Would it be efficient to process?
>
> 2. in-db, with tiling:
> raster2pgsql -t 128x128 raster.tiff public.new_table |  psql -d postgis -U postgis
>
> This results in numerous rasters that are perceived by PostGIS as separate rasters, not as a single raster object. While this seems to be the most appropriate way in terms of storage, it is very inconvenient to process such a list of rasters. All queries must work with a table of rasters, not with a single raster.
>
> 3. out-db:
> raster2pgsql -R raster.tiff public.new_table |  psql -d postgis -U postgis
>
> We leave the raster outside the DB. I suppose only internal GeoTIFF tiling might help manage the I/O efficiently. However, does PostGIS take internal tiling into account during I/O? Is this ingestion option less efficient in terms of performance compared to the in-db approaches?
>
> The main questions are:
> A. What is the best way to ingest a large GeoTIFF into PostGIS -- in terms of both storage efficiency and query convenience?
> B. Are there any other ways (efficient and convenient) to put a large GeoTIFF into PostGIS?
>
> Thank you!
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel

