[gdal-dev] Performance Issue with VRT Pixel Function and Large Number of Source Rasters

Even Rouault even.rouault at spatialys.com
Mon Apr 14 06:45:20 PDT 2025


Abdul,

If you add <SkipNonContributingSources>true</SkipNonContributingSources> 
as a child element of the <VRTRasterBand> element, and apply the patch 
https://github.com/OSGeo/gdal/commit/3dbc60b334ee022f2993dca476b08d5fed01698c, 
"gdal_translate -of GTiff merged.vrt OUTPUT.tif" completes in a few 
minutes.

See 
https://gdal.org/en/stable/drivers/raster/vrt.html#using-derived-bands-with-pixel-functions-in-python 
for the documentation of SkipNonContributingSources.
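
For a VRT with hundreds of sources, editing the file by hand is tedious. The following is a minimal sketch, not part of GDAL, of one way to insert the element with Python's standard library. The function name add_skip_flag and the sample VRT are hypothetical, and the sketch assumes the bands that use the pixel function are declared with subClass="VRTDerivedRasterBand".

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed VRT used only to exercise the function below.
SAMPLE_VRT = """<VRTDataset rasterXSize="10" rasterYSize="10">
  <VRTRasterBand dataType="Byte" band="1" subClass="VRTDerivedRasterBand">
    <PixelFunctionType>max</PixelFunctionType>
    <SimpleSource>
      <SourceFilename relativeToVRT="1">a.tif</SourceFilename>
      <SourceBand>1</SourceBand>
    </SimpleSource>
  </VRTRasterBand>
</VRTDataset>"""


def add_skip_flag(vrt_xml: str) -> str:
    """Return the VRT XML with SkipNonContributingSources set on derived bands."""
    root = ET.fromstring(vrt_xml)
    for band in root.iter("VRTRasterBand"):
        # Assumption: only derived (pixel function) bands need the element.
        if band.get("subClass") != "VRTDerivedRasterBand":
            continue
        # Do not add the element twice if it is already present.
        if band.find("SkipNonContributingSources") is None:
            flag = ET.SubElement(band, "SkipNonContributingSources")
            flag.text = "true"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_skip_flag(SAMPLE_VRT))
```

To apply it to a real file, read merged.vrt into a string, pass it to add_skip_flag, and write the result to a new .vrt before running gdal_translate on it.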

Even

On 14/04/2025 at 06:52, Abdul Raheem Siddiqui via gdal-dev wrote:
> Dear GDAL Community,
>
> I am encountering a performance issue when using a VRT consisting of a 
> large number of source rasters and a built-in C++ pixel function 
> ("max"). I would appreciate any guidance on whether some GDAL config 
> option can improve this, whether I am doing something wrong, or whether 
> this is a potential optimization opportunity.
>
> I have a VRT file referencing ~750 individual rasters (Byte data type, 
> avg size ~1000x1000 pixels, untiled, and same CRS for all source 
> rasters). The VRT uses the built-in “max” pixel function.
>
> Running gdal_translate to convert the VRT to GTiff takes ~1.5 hours 
> and consumes ~4GB RAM.
> gdal_translate -of GTiff merged.vrt OUTPUT.tif
>
> When tripling the number of source rasters (to ~2250) by duplicating 
> entries in the VRT, processing time increases to ~4.5 hours, with RAM 
> usage rising to ~11.5GB.
> gdal_translate -of GTiff merged_3x.vrt OUTPUT.tif
>
> Extracting a small subset of the VRT via -projwin does not improve 
> performance (still slow, ~14GB RAM used).
> gdal_translate -projwin -146955.241 797044.4497 -138766.1444 
> 789656.0648 -of GTiff merged_3x.vrt OUTPUT.tif
>
> *Removing the pixel function makes processing instantaneous, even with 
> 2250 rasters.*
>
> Performance is unaffected by --config GDAL_CACHEMAX or -co 
> NUM_THREADS=ALL_CPUS.
>
> The issue persists across other datasets that I have. In fact, it gets 
> much worse when the source rasters are larger or of Float32 
> data type.
>
> Gdalinfo on one of the source rasters: https://pastebin.com/gjHDA2Wd
> *Data:* 
> https://drive.google.com/file/d/1LGlzGGZvkPyXvKKgkPVzQGbBw55p5dRd/view?
>
> System: Windows, 8 CPUs, 32GB RAM (GDAL 3.9.2 via OSGeo4W).
>
> Thank you for your time and insights. Please reply if you are aware of 
> how performance can be improved.
>
> Regards,
>
> *Abdul Siddiqui, PE*
>
> _abdul.siddiqui at ertcorp.com_
>
>
> *ERT  | *Earth Resources Technology, Inc.
>
> 14401 Sweitzer Ln. Ste 300
>
> Laurel, MD 20707
>
> https://www.ertcorp.com
>
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev

-- 
http://www.spatialys.com
My software is free, but my time generally not.