[gdal-dev] gdalinfo on large vrt takes a long time

William Kyngesburye woklist at kyngchaos.com
Tue Jul 18 15:17:27 PDT 2023


(sorry, my email sorting rules missed your reply somehow, just found it)

The env var didn't help, and the vrt does have statistics.

Pathnames in the vrt are relative to the vrt, if that might be a problem in this situation.
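
For anyone wanting to check the same two points (statistics present, relative source paths) without involving GDAL at all, here is a minimal sketch that reads the VRT XML directly with Python's standard library. The function name summarize_vrt and the SAMPLE_VRT content are hypothetical, made up for illustration; the element and attribute names (VRTRasterBand, SourceFilename, relativeToVRT, the STATISTICS_* metadata items) are the usual VRT ones. It only reads what is in the vrt file itself and never opens a source.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-source VRT, only for illustration.
SAMPLE_VRT = """<VRTDataset rasterXSize="512" rasterYSize="256">
  <VRTRasterBand dataType="Byte" band="1">
    <Metadata>
      <MDI key="STATISTICS_MINIMUM">0</MDI>
      <MDI key="STATISTICS_MAXIMUM">255</MDI>
    </Metadata>
    <SimpleSource>
      <SourceFilename relativeToVRT="1">tiles/a.tif</SourceFilename>
    </SimpleSource>
    <SimpleSource>
      <SourceFilename relativeToVRT="0">/data/b.tif</SourceFilename>
    </SimpleSource>
  </VRTRasterBand>
</VRTDataset>"""


def summarize_vrt(vrt_xml):
    """Summarize a VRT from its XML text alone, without opening any source."""
    root = ET.fromstring(vrt_xml)
    bands = root.findall("VRTRasterBand")
    sources = root.findall(".//SourceFilename")

    # A band counts as having statistics if both min and max are stored.
    bands_with_stats = 0
    for band in bands:
        keys = {mdi.get("key") for mdi in band.findall("Metadata/MDI")}
        if {"STATISTICS_MINIMUM", "STATISTICS_MAXIMUM"} <= keys:
            bands_with_stats += 1

    return {
        "size": (int(root.get("rasterXSize")), int(root.get("rasterYSize"))),
        "bands": len(bands),
        "bands_with_stats": bands_with_stats,
        "sources": len(sources),
        "relative_sources": sum(
            1 for s in sources if s.get("relativeToVRT") == "1"
        ),
    }


if __name__ == "__main__":
    print(summarize_vrt(SAMPLE_VRT))
```

For a real file, pass the text of the vrt (for example the result of open(path).read()) instead of SAMPLE_VRT.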

I turned on CPL_DEBUG.  Whatever gdalinfo is spending its time on happens after the GDALDefaultOverviews::OverviewScan() message; the next output to appear, once the wait is over, is the list of files.  Neither the individual files nor the vrt have overviews.  (I disable the vrt overviews most of the time by renaming the .ovr file, because they're out of date.)  When I restore the vrt overviews, I get the overview scan message, then the list of overviews with no delay, then a second overview scan message followed by the same long delay, and finally the file list.
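
To pin down where the pause falls, rather than watching the terminal, the debug run can be timestamped line by line. The following is a rough sketch under some assumptions: the helper names (debug_env, timestamp_lines, longest_gap, run_gdalinfo_debug) are hypothetical, gdalinfo is assumed to be on the PATH, and stdout and stderr are merged into one pipe, so regular gdalinfo output may arrive in buffered chunks while debug messages arrive immediately. The ordering around the pause should still be visible.

```python
import os
import subprocess
import time


def debug_env(base=None):
    """Copy an environment and switch on GDAL debug messages."""
    env = dict(os.environ if base is None else base)
    env["CPL_DEBUG"] = "ON"
    return env


def timestamp_lines(lines, clock=time.monotonic):
    """Pair each line with the clock reading taken when it arrived."""
    return [(clock(), line.rstrip("\n")) for line in lines]


def longest_gap(stamped):
    """Return (seconds, line_before, line_after) for the longest pause,
    or None if there are fewer than two lines."""
    best = None
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


def run_gdalinfo_debug(vrt_path):
    """Run gdalinfo with CPL_DEBUG=ON and report the longest pause."""
    proc = subprocess.Popen(
        ["gdalinfo", vrt_path],
        env=debug_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # debug messages go to stderr; merge them
        text=True,
    )
    stamped = timestamp_lines(proc.stdout)
    # Sentinel so a pause before the process exits is also counted.
    stamped.append((time.monotonic(), "<end of output>"))
    proc.wait()
    return longest_gap(stamped)
```

Calling run_gdalinfo_debug("the.vrt") would return the length of the longest pause together with the lines printed just before and just after it.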

-----
William Kyngesburye
<kyngchaos*at*kyngchaos*dot*com>
<https://www.kyngchaos.com>

Don't Panic

> On Jun 8, 2023, at 6:18 PM, Even Rouault <even.rouault at spatialys.com> wrote:
> William,
> 
> it might be related to the GetMinimum() call done by gdalinfo. Cf https://trac.osgeo.org/gdal/ticket/5444
> 
> But normally it should only try to open the first source, and not all of them. At least that's what I could confirm in a quick test. But I do see that the CanUseSourcesMinMaxImplementations() method will stat() sources whose filename looks like a local file (obviously if that's a mounted file system / vpn thing, it will not realize it is remote).
> 
> Try setting the VRT_MIN_MAX_FROM_SOURCES=NO environment variable / configuration option to see if that makes a difference. If it does, the CanUseSourcesMinMaxImplementations() logic should be modified to avoid doing those stat() calls.
> 
> If that's confirmed to be linked to GetMinimum(), you may also work around the issue by doing a "gdalinfo -stats the.vrt" (from the server) to have statistics incorporated in the VRT; then GetMinimum() should be instant.
> 
> Even
> 
> On 09/06/2023 at 00:43, William Kyngesburye wrote:
>> I'm writing a script that needs some info from a vrt raster, and one vrt has thousands of files. When reading the vrt over the internet (vpn to our server) it takes a long time. I think it's looking at every file of the vrt.  What is gdal reading from the files that's not in the vrt itself? I used all the -no options and I'm not adding any other info options like checksums or stats. I just need the basic info that's in the vrt file itself.
>> 
>> -----
>> William Kyngesburye
>> <kyngchaos*at*kyngchaos*dot*com>
>> <https://www.kyngchaos.com>
>> 
>> Don't Panic
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
> 
> -- 
> http://www.spatialys.com
> My software is free, but my time generally not.

