[gdal-dev] Performance Issue with VRT Pixel Function and Large Number of Source Rasters
Rahkonen Jukka
jukka.rahkonen at maanmittauslaitos.fi
Mon Apr 14 08:05:00 PDT 2025
Hi,
I got interested in trying what happens if all the images overlap completely. The timing from my test looks good to me: 28 seconds to create a max raster from 1000 single-band rasters of 1000x1000 pixels each. However, I am not entirely sure that my test is valid, so I present it here.
First, create 1000 images with gdal_create, each with a different pixel value:
for /L %n in (1,1,1000) do gdal_create -outsize 1000 1000 -of gtiff -ot int16 -burn %n -a_srs epsg:4326 -a_ullr 20 30 30 20 %n.tif
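The loop above uses Windows cmd syntax. As a rough, portable sketch of the same step, the Python below builds the identical gdal_create argument list for each image and runs it. The function names gdal_create_command and create_test_images are hypothetical helpers introduced only for this illustration; the flags are copied from the command above, and the sketch assumes gdal_create is available on PATH.

```python
import shutil
import subprocess


def gdal_create_command(n):
    """Return the gdal_create argument list for image number n.

    The flags mirror the cmd loop in the message: a 1000x1000 Int16
    GeoTIFF burned with the value n, all images sharing one extent.
    """
    return [
        "gdal_create",
        "-outsize", "1000", "1000",
        "-of", "gtiff",
        "-ot", "int16",
        "-burn", str(n),
        "-a_srs", "epsg:4326",
        "-a_ullr", "20", "30", "30", "20",
        f"{n}.tif",
    ]


def create_test_images(count=1000):
    """Create 1.tif .. count.tif in the current directory."""
    if shutil.which("gdal_create") is None:
        raise RuntimeError("gdal_create was not found on PATH")
    for n in range(1, count + 1):
        # check=True stops at the first failing gdal_create call
        subprocess.run(gdal_create_command(n), check=True)
```

Calling create_test_images() in an empty working directory should leave the same 1.tif to 1000.tif files as the cmd loop.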
Then create a prototype of a VRT:
gdalbuildvrt alloverlap_max.vrt *.tif
Edit the VRT in a few places:
- Change SimpleSource into ComplexSource everywhere
- Add subClass="VRTDerivedRasterBand" and the pixel function:
<VRTRasterBand dataType="Int16" band="1" subClass="VRTDerivedRasterBand">
<PixelFunctionType>max</PixelFunctionType>
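With 1000 sources, making these edits by hand is tedious. The following is a minimal sketch of the same edits using only the Python standard library. It assumes the prototype VRT has the usual gdalbuildvrt layout (VRTRasterBand elements containing SimpleSource children); make_derived_vrt is a hypothetical helper name, not part of GDAL.

```python
import xml.etree.ElementTree as ET


def make_derived_vrt(vrt_xml, pixel_function="max"):
    """Return VRT XML text with every band turned into a derived band.

    For each VRTRasterBand: set subClass="VRTDerivedRasterBand", add a
    PixelFunctionType child, and rename SimpleSource to ComplexSource.
    """
    root = ET.fromstring(vrt_xml)
    for band in list(root.iter("VRTRasterBand")):
        band.set("subClass", "VRTDerivedRasterBand")
        function_type = ET.Element("PixelFunctionType")
        function_type.text = pixel_function
        # Put the pixel function first, as in the fragment above
        band.insert(0, function_type)
        for source in list(band.iter("SimpleSource")):
            source.tag = "ComplexSource"
    return ET.tostring(root, encoding="unicode")
```

One way to use it is to read alloverlap_max.vrt as text, pass it through make_derived_vrt, and write the result back before running gdal_translate. Passing pixel_function="min" gives the variant used for the min test.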
Finally, convert the VRT into a GeoTIFF:
gdal_translate -of gtiff alloverlap_max.vrt max.tif
The result looks correct: the max value of max.tif is 1000, even though the last image listed in the VRT is 999.tif with pixel value 999. I also tried the min function and got 1, as expected.
-Jukka Rahkonen-
________________________________________
From: gdal-dev on behalf of Even Rouault via gdal-dev
Sent: Monday, 14 April 2025 16:45
To: Abdul Raheem Siddiqui; gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] Performance Issue with VRT Pixel Function and Large Number of Source Rasters
Abdul,
if you add <SkipNonContributingSources>true</SkipNonContributingSources> as a child element of the <VRTRasterBand> element, and apply patch https://github.com/OSGeo/gdal/commit/3dbc60b334ee022f2993dca476b08d5fed01698c , "gdal_translate -of GTiff merged.vrt OUTPUT.tif" completes in a few minutes.
Cf https://gdal.org/en/stable/drivers/raster/vrt.html#using-derived-bands-with-pixel-functions-in-python for the doc of SkipNonContributingSources
Even
On 14/04/2025 at 06:52, Abdul Raheem Siddiqui via gdal-dev wrote:
Dear GDAL Community,
I am encountering a performance issue when using a VRT consisting of a large number of source rasters and a built-in C++ pixel function ("max"). I would appreciate any guidance on whether some GDAL config option can improve this, whether I am doing something wrong, or whether this is a potential optimization opportunity.
I have a VRT file referencing ~750 individual rasters (Byte data type, avg size ~1000x1000 pixels, untiled, and same CRS for all source rasters). The VRT uses the built-in "max" pixel function.
Running gdal_translate to convert the VRT to GTiff takes ~1.5 hours and consumes ~4GB RAM.
gdal_translate -of GTiff merged.vrt OUTPUT.tif
When tripling the number of source rasters (to ~2250) by duplicating entries in the VRT, processing time increases to ~4.5 hours, with RAM usage rising to ~11.5GB.
gdal_translate -of GTiff merged_3x.vrt OUTPUT.tif
Extracting a small subset of the VRT via -projwin does not improve performance (still slow, ~14GB RAM used).
gdal_translate -projwin -146955.241 797044.4497 -138766.1444 789656.0648 -of GTiff merged_3x.vrt OUTPUT.tif
Removing the pixel function makes processing instantaneous, even with 2250 rasters.
Performance is unaffected by --config GDAL_CACHEMAX or -co NUM_THREADS=ALL_CPUS.
The issue persists across other datasets that I have. In fact, it gets much worse when the source rasters are larger or of Float32 data type.
Gdalinfo on one of the source rasters: https://pastebin.com/gjHDA2Wd
Data: https://drive.google.com/file/d/1LGlzGGZvkPyXvKKgkPVzQGbBw55p5dRd/view?
System: Windows, 8 CPUs, 32GB RAM (GDAL 3.9.2 via OSGeo4W).
Thank you for your time and insights. Please reply if you are aware of how performance can be improved.
Regards,
Abdul Siddiqui, PE
abdul.siddiqui at ertcorp.com
ERT | Earth Resources Technology, Inc.
14401 Sweitzer Ln. Ste 300
Laurel, MD 20707
https://www.ertcorp.com
_______________________________________________
gdal-dev mailing list
gdal-dev at lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
--
http://www.spatialys.com
My software is free, but my time generally not.