[GRASS-dev] Re: r.in.gdal: how to speed-up import with huge amount of bands?

Markus Neteler neteler at osgeo.org
Mon Mar 29 03:00:53 EDT 2010


On Mon, Mar 29, 2010 at 8:35 AM, Markus Neteler <neteler at osgeo.org> wrote:
> Hi,
>
> I have received temperature map time series of 50 years of daily data
> in single Geotiff files (Tmin, Tmean, Tmax). Each GeoTIFF has around
> 21550 bands, 4GB file size.
>
> Problem: the import takes "forever" despite using a superfast disk:
> about 120 seconds per band (each band is only 464 x 201 cells), which
> adds up to roughly 29 DAYS per file.
>
> The problem is probably that GDAL has to scan the entire 4GB file again
> for each band. I guess I want to do heavy caching - but how?
>
> Looking at ImportBand() in main.c of r.in.gdal, I see that GDALRasterIO()
> is used:
>  http://www.gdal.org/gdal_tutorial.html#gdal_tutorial_read
> but I don't see hints to tell the IO function to keep more in cache.

I found something in nearblack.c of GDAL and tried the following patch:

Index: raster/r.in.gdal/main.c
===================================================================
--- raster/r.in.gdal/main.c     (revision 41604)
+++ raster/r.in.gdal/main.c     (working copy)
@@ -666,6 +666,7 @@
     /*      Select a cell type for the new cell.                            */
     /* -------------------------------------------------------------------- */
     eRawGDT = GDALGetRasterDataType(hBand);
+    GDALSetCacheMax (2000000000); /* heavy caching */

     switch (eRawGDT) {
     case GDT_Float32:

This allocates far more RAM (2 GB), but the speed remains exactly
the same: 120 seconds per band.
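For the record, the same cache size can be set without patching r.in.gdal at
all, via GDAL's GDAL_CACHEMAX configuration option (an environment variable;
as far as I know the value is interpreted as megabytes in GDAL 1.x). The
input filename below is made up; untested sketch:

```shell
# Raise GDAL's raster block cache to ~2 GB via the environment instead of
# calling GDALSetCacheMax() in the source (value in MB, GDAL 1.x convention).
export GDAL_CACHEMAX=2000
echo "GDAL_CACHEMAX=$GDAL_CACHEMAX"
# then run the import as usual, e.g.:
# r.in.gdal input=tmin_50yr.tif output=tmin
```

If the band-by-band re-reading is the real bottleneck, though, this would
presumably hit the same 120-seconds-per-band wall as the patch did.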

> The original files are in netCDF format, I used gdalwarp to generate GeoTIFF.
> First I thought of writing it out as 21550 single-band files but didn't
> find an option to do so.
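One way to get those per-band files would be gdal_translate's -b flag, one
call per band. The filenames here are made up, and the loop only prints the
commands as a dry run (drop the "echo" to actually execute them):

```shell
# Hypothetical sketch: split one multiband GeoTIFF into 21550 single-band
# GeoTIFFs, one gdal_translate call per band (-b selects the input band).
IN=tmin_50yr.tif
NBANDS=21550
for b in $(seq 1 "$NBANDS"); do
    echo gdal_translate -b "$b" "$IN" "tmin_band_${b}.tif"
done
```

Whether 21550 separate gdal_translate runs over a 4GB pixel-interleaved file
are actually faster than the current import is another question, of course.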

Clueless,
Markus
