[gdal-dev] Fwd: Performance Variability with GDAL Caching and Multi-Threading for MODIS Data
Varisht Ghedia
varisht at tathya.earth
Tue Apr 1 22:34:33 PDT 2025
Hi Laurentiu,
I am using the pyModis library
(https://github.com/lucadelu/pyModis/tree/master) to extract the LST and QC
bands from a MODIS (Aqua/Terra) MOD11A1 product. From reading the code, the
library appears to make the following GDAL calls internally for the tasks I
run:
gdal.AutoCreateWarpedVRT
gdal.ReprojectImage
I execute the script like this:
modis_convert.py -s "( 1 0 0 0 0 0 0 0 0 0 0 0 )" -g 30 -o 2025-03-14 -e 32618 MOD11A1.A2025073.h10v10.061.2025074095514.hdf
Here:
-s : Bands to extract (LST in this case)
-g : Spatial resolution of the output file (30 m)
-o : Prefix of the output file
-e : EPSG code for the output (EPSG:32618)
MOD11A1.A2025073.h10v10.061.2025074095514.hdf : input MODIS Terra product
To test the effects of cache and multi-threading I set the config options
at the start of the program like this:
gdal.SetConfigOption("GDAL_NUM_THREADS", "ALL_CPUS")
gdal.SetConfigOption("GDAL_CACHEMAX", "2G")
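GDAL config options can also be supplied as environment variables, which makes it possible to try several cache sizes without editing the script. The following is only a sketch: `gdal_env` is a hypothetical helper name, and the default values are placeholders rather than recommendations. Since the quoted reply notes that GDAL_CACHEMAX is consulted the first time the cache size is requested, the options need to be in place before any raster I/O happens.

```python
import os


def gdal_env(threads="ALL_CPUS", cachemax="512M", base=None):
    """Return a copy of the environment with the two GDAL options set.

    `base` defaults to the current process environment; the input
    mapping is never modified.
    """
    env = dict(os.environ if base is None else base)
    env["GDAL_NUM_THREADS"] = str(threads)
    env["GDAL_CACHEMAX"] = str(cachemax)
    return env


# Environment for one benchmark run with a 2G block cache
env_2g = gdal_env(cachemax="2G")

# Hypothetical use with the command shown above:
# import subprocess
# subprocess.run(["modis_convert.py", "-g", "30", "..."], env=env_2g, check=True)
```

Running each configuration in a fresh process this way also avoids one run inheriting a cache that was already sized by a previous one.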
RAM usage is not much of a concern: for now I process a single product at a
time, so I can allocate more memory if that speeds things up.
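To compare the runs quoted further down, one option is to convert the `time` output to seconds and derive two numbers per run: the wall-clock speedup over the standard run, and the average number of busy cores, (user + sys) / real. This is a rough sketch; the function names are made up for illustration and the figures are copied from the QC_tile.tif timings in the quoted message.

```python
import re


def to_seconds(stamp):
    """Convert a bash `time` value such as '0m24.199s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(real, user, system, baseline_real):
    """Return wall-clock speedup and average busy cores for one run."""
    r, u, s = (to_seconds(v) for v in (real, user, system))
    return {
        "speedup": to_seconds(baseline_real) / r,
        "busy_cores": (u + s) / r,
    }


# QC_tile.tif: baseline is the standard run (real 0m32.133s)
qc_512m = summarize("0m13.830s", "0m51.083s", "0m1.911s", "0m32.133s")
qc_2g = summarize("0m24.199s", "0m53.352s", "0m9.998s", "0m32.133s")

for name, run in (("512M", qc_512m), ("2G", qc_2g)):
    print(f"{name}: speedup {run['speedup']:.2f}x, busy cores {run['busy_cores']:.2f}")
```

For the QC layer this gives roughly a 2.3x speedup with 512M against 1.3x with 2G. The user time is nearly the same in both runs, while the sys time rises from about 1.9 s to about 10 s.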
Thanks for your insights regarding NUM_THREADS and CACHEMAX. Is there a
dedicated option to enable multi-threading (i.e. -m) from Python, or does
ALL_CPUS enable multi-threading automatically? Is there a difference
between -m and ALL_CPUS?
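Going by the list in the quoted reply, gdalwarp's -multi (pipelining I/O and warping) and -wo NUM_THREADS (parallelizing the warp computation itself) look like two separate switches. In the Python bindings, gdal.Warp() exposes them as the `multithread` and `warpOptions` keyword arguments. The sketch below only builds the arguments; `warp_threading` is a hypothetical helper, and using it would mean changing pyModis to call gdal.Warp() instead of gdal.ReprojectImage(), which is an assumption and not something the library does today.

```python
def warp_threading(pipelined=True, threads="ALL_CPUS"):
    """Return (gdal.Warp keyword arguments, equivalent gdalwarp flags)."""
    warp_option = f"NUM_THREADS={threads}"
    kwargs = {"multithread": pipelined, "warpOptions": [warp_option]}
    flags = (["-multi"] if pipelined else []) + ["-wo", warp_option]
    return kwargs, flags


kwargs, flags = warp_threading()
print(kwargs)
print(" ".join(flags))

# Hypothetical use; needs the GDAL Python bindings and real file names:
# from osgeo import gdal
# gdal.Warp("out.tif", "in.tif", dstSRS="EPSG:32618", **kwargs)
```

Keeping the two settings as separate parameters makes it easy to time each one on its own before combining them.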
Thanks and Regards,
Varisht Ghedia
On Tue, 1 Apr 2025 at 22:15, <gdal-dev-request at lists.osgeo.org> wrote:
> ---------- Forwarded message ----------
> From: "Laurențiu Nicola" <lnicola at dend.ro>
> To: gdal-dev at lists.osgeo.org
> Date: Tue, 01 Apr 2025 10:40:43 +0300
> Subject: Re: [gdal-dev] Fwd: Performance Variability with GDAL Caching and
> Multi-Threading for MODIS Data
> Hi,
>
> Since it's not exactly clear from your description, what operations are
> you running, just the equivalent of gdal.Translate()? gdal.Warp()? GDAL
> can use threading in a couple of places:
>
> - to compress the output before writing it, e.g. the NUM_THREADS
> creation option of GTiff
> - to decompress the input when reading a region larger than one block
> or strip, e.g. the NUM_THREADS open option of GTiff
> - for pipelining the I/O and warping in gdalwarp (-multi)
> - to parallelize warping itself in gdalwarp (-wo NUM_THREADS)
>
> And of course, there might be others I'm not aware of.
>
> I'm not sure about the effects you see when setting the cache, but note
> that the default cache GDAL_CACHEMAX is "5% of the usable physical RAM,
> [...] consulted the first time the cache size is requested". To disable the
> cache you can use GDAL_CACHEMAX=0, which can reduce the memory usage and
> speed up the program in very specific cases (e.g. when processing one block
> at a time without reading parts of the input twice), but becomes a lot less
> useful when you do any kind of warping or resampling.
>
> Laurentiu
>
> On Tue, Apr 1, 2025, at 10:19, Varisht Ghedia via gdal-dev wrote:
>
> Dear GDAL Developers,
>
> I am working on optimizing the processing times for MODIS datasets
> (LST_1Km and QC Day tile) using pymodis with some modifications.
> Specifically, I have added flags for:
>
> - Running on all available CPU cores (ALL_CORES)
> - Adjusting GDAL cache size (GDAL_CACHEMAX)
>
> However, I am observing unexpected performance variations. In some cases,
> increasing the cache size degrades performance instead of improving it.
> Below are my test results for two different datasets from the same tile.
> Tile used: MOD11A1.A2025073.h10v10.061.2025074095514.hdf
>
> EPSG:32618, Resampled to 30m
> *QC_tile.tif*
>
> ALL_CORES + 2G
> real 0m24.199s
> user 0m53.352s
> sys 0m9.998s
>
> STANDARD RUN (No Cache, No Multi-Threading)
> real 0m32.133s
> user 0m30.581s
> sys 0m2.299s
>
> ALL_CORES + 512M
> real 0m13.830s
> user 0m51.083s
> sys 0m1.911s
>
> With 512M cache, performance improves significantly, but with larger
> caches (1G, 2G, 4G), execution time increases.
> *LST_Day_1km.tif*
>
> ALL_CORES + 512M
> real 0m42.863s
> user 0m44.105s
> sys 0m3.583s
>
> STANDARD RUN (No Cache, No Multi-Threading)
> real 0m45.121s
> user 0m26.477s
> sys 0m3.712s
>
> ALL_CORES + 2G
> real 0m37.548s
> user 0m48.302s
> sys 0m8.113s
>
> ALL_CORES + 4G
> real 0m51.845s
> user 0m48.213s
> sys 0m7.988s
>
> For this dataset, using a 2G cache improves performance, but increasing it
> to 4G makes processing slower.
> *Questions:*
>
> 1. How does GDAL’s caching mechanism impact performance in these
>    scenarios?
> 2. Why does increasing cache size sometimes degrade performance?
> 3. Is there a recommended way to tune cache settings for MODIS HDF
>    processing, considering that some layers (like QC) behave differently
>    from others (like LST_1Km)?
>
> Any insights into how GDAL handles multi-threading and caching internally
> would be greatly appreciated.
>
> Thanks in advance for your help!
>
> Best regards,
>
> Varisht Ghedia
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>
>
>
>
>
> ---------- Forwarded message ----------
> From: Even Rouault <even.rouault at spatialys.com>
> To: "gdal-dev at lists.osgeo.org" <gdal-dev at lists.osgeo.org>
> Date: Tue, 1 Apr 2025 13:09:27 +0200
> Subject: [gdal-dev] GDAL 3.10.3 release candidate available
> Hi,
>
> I have prepared a GDAL/OGR 3.10.3 release candidate.
>
> Pick up an archive among the following ones (by ascending size):
>
> https://download.osgeo.org/gdal/3.10.3/gdal-3.10.3rc1.tar.xz
> https://download.osgeo.org/gdal/3.10.3/gdal-3.10.3rc1.tar.gz
> https://download.osgeo.org/gdal/3.10.3/gdal3103rc1.zip
>
> A snapshot of the gdalautotest suite is also available:
>
> https://download.osgeo.org/gdal/3.10.3/gdalautotest-3.10.3rc1.tar.gz
> https://download.osgeo.org/gdal/3.10.3/gdalautotest-3.10.3rc1.zip
>
> The NEWS file is here:
>
> https://github.com/OSGeo/gdal/blob/v3.10.3RC1/NEWS.md
>
> Best regards,
>
> Even
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.
>
>
>
>
>
> ---------- Forwarded message ----------
> From: Adagale Yuvraj Bhagwan <Yuvraj.Adagale at eurac.edu>
> To: "gdal-dev at lists.osgeo.org" <gdal-dev at lists.osgeo.org>
> Date: Tue, 1 Apr 2025 16:45:48 +0000
> Subject: [gdal-dev] Proposal for GDAL Driver: EOPF Zarr (Earth Observation
> Product Format)
> Hello GDAL Community,
>
> We’re developing a GDAL driver for the Earth Observation Product Format
> (EOPF), a cloud-optimized Zarr-based format tailored for large-scale EO
> data.
> This driver aims to enable seamless access to EOPF datasets and their
> metadata through GDAL, supporting features such as chunked I/O and
> compatibility with STAC metadata.
>
> Key features:
> - Support for Zarr V2/V3 structures with EOPF-specific enhancements.
> - Integration with cloud storage (S3, GCS, etc.).
> - Alignment with ESA/Copernicus data standards.
>
> We’d appreciate your feedback on integration requirements and best
> practices. The code is available at EOPF-Sample-Service/GDAL-ZARR-EOPF
> <https://github.com/EOPF-Sample-Service/GDAL-ZARR-EOPF>, and we plan to
> submit a PR soon.
>
> Best regards,
>
> *Yuvraj Adagale*
>
> *Eurac Research*
>
>