[gdal-dev] Many files over the network

Kyle Ellison kellison at geocue.com
Tue Mar 15 17:48:16 EDT 2011


Even,

I tried out the patch.  I was unable to actually try it in the 1.9 source since my environment uses the C# bindings and gdal18.dll.  So, I applied the patch manually to the 1.8 version I had.  That was not terribly difficult... mainly just different line numbers in geotiff.cpp.  Anyway, we got the performance of yesteryear back.  I would very much like you to commit that change to the main trunk if that is possible.  Thanks for your help on this.

Best Regards,
Kyle

-----Original Message-----
From: Even Rouault [mailto:even.rouault at mines-paris.org] 
Sent: Monday, March 14, 2011 5:41 PM
To: gdal-dev at lists.osgeo.org
Cc: Kyle Ellison
Subject: Re: [gdal-dev] Many files over the network

Kyle,

You didn't mention which driver is in question. I guess this is GeoTIFF?
I've looked at the code of the driver and it appears that it has loaded the .rpb and .imd files at least since GDAL 1.6.0. The new thing in GDAL 1.8.0 is that it also tries to load the _rpc.txt file. Is it that small difference which causes the slowdown you observe?

In fact, setting GDAL_DISABLE_READDIR_ON_OPEN = TRUE might actually make things worse (w.r.t. loading rpb/rpc/imd), since the driver then has to test the filesystem directly for the existence of each file, whereas by default it would rely on the papszSiblingFiles list.

Anyway, I've attached a patch (against SVN trunk) that defers the loading of RPC and IMD until necessary (that is to say, when GetMetadata() or GetMetadataItem() is called with the "RPC" or "IMD" metadata domain, or when GetFileList() is called).
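To illustrate the idea of deferring the sidecar lookup, here is a minimal Python sketch. It is not the attached patch and not GDAL code: the class LazySidecarDataset, its probe_count counter and the returned SOURCE_FILE key are hypothetical names used only for illustration. Only the sidecar suffixes (.rpb, _rpc.txt, .imd) and the "RPC"/"IMD" domain names come from the thread.

```python
import os


class LazySidecarDataset:
    """Hypothetical stand-in for a dataset object (not a GDAL class)."""

    # Sidecar suffixes mentioned in the thread, grouped by metadata domain.
    _SUFFIXES = {"RPC": (".rpb", "_rpc.txt"), "IMD": (".imd",)}

    def __init__(self, path):
        # Opening does no sidecar lookup at all.
        self.path = path
        self._cache = {}      # domain -> metadata dict, filled on demand
        self.probe_count = 0  # number of filesystem existence checks made

    def get_metadata(self, domain=""):
        if domain not in self._SUFFIXES:
            return {}  # other domains never touch the sidecar files
        if domain not in self._cache:
            self._cache[domain] = self._load(domain)
        return self._cache[domain]

    def _load(self, domain):
        base = os.path.splitext(self.path)[0]
        for suffix in self._SUFFIXES[domain]:
            candidate = base + suffix
            self.probe_count += 1
            if os.path.exists(candidate):
                # A real driver would parse the file; the sketch only
                # records which sidecar was found.
                return {"SOURCE_FILE": candidate}
        return {}
```

With this shape, a caller that never asks for the "RPC" or "IMD" domain pays for no extra filesystem checks, and a caller that does ask pays only once per domain.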

Could you try it and report whether it makes things better for you?

Another idea to solve the performance problem would be to use an alternate
GDALOpen() where you could provide the papszSiblingFiles list. If you read several files in the same directory, you could build the list once and provide it multiple times afterwards.
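The "build the list once" idea can be sketched in plain Python, independent of GDAL. The helper names list_siblings and find_sidecars are hypothetical, and lower-casing the names is a simplifying assumption of this sketch, not a statement about how the driver compares file names.

```python
import os


def list_siblings(directory):
    """Read the directory a single time; return lower-cased file names."""
    return frozenset(name.lower() for name in os.listdir(directory))


def find_sidecars(raster_name, siblings):
    """Look up sidecar files in the cached listing, without touching disk."""
    base = os.path.splitext(raster_name)[0].lower()
    candidates = (base + ".rpb", base + "_rpc.txt", base + ".imd")
    return [name for name in candidates if name in siblings]
```

The listing is paid for once per directory; each later lookup is a set membership test rather than a round trip over the network.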

Best regards,

Even

> Often, we need to open many raster files over a network connection 
> with thousands of other files residing in the same directory.
> 
> 
> 
> Previously, we were using version 1.7.0 of GDAL (from FW Tools), and 
> we used SetConfigOption("GDAL_DISABLE_READDIR_ON_OPEN", "TRUE")
> 
> to suppress the automatic search for sibling files.  This approach 
> served us well.
> 
> 
> 
> We upgraded to 1.8.0.  It was quite a bit slower opening the raster files.
> I was able to get GDAL built in debug and stepped through the code and 
> discovered the following:
> 
> 
> 
> 1. It searches for several files containing RPC metadata.
> 
> 2. It searches for files for PAM as well.
> 
> 
> 
> I was able to use SetConfigOption("GDAL_PAM_ENABLED", "NO") to 
> suppress searching for PAM files.
> 
> 
> 
> However, I do not see a way to suppress searching for RPC metadata.  
> Does anyone know of a way to do this or other workarounds? If there is 
> no way to currently do this, is the community open to adding this option?
> 
> 
> 
> I apologize if this question has been posted previously, but I haven't 
> yet found a convenient way to search the archives.
> 
> 
> 
> Thanks,
> 
> Kyle
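
For reference, GDAL configuration options such as the two named in the quoted message can also be supplied as environment variables. A minimal Python sketch that prepares such an environment for a child process follows; the helper name gdal_env is hypothetical, and the file name in the usage comment is a placeholder.

```python
import os


def gdal_env(base_env=None):
    """Return a copy of the environment with the two options from the thread."""
    env = dict(os.environ if base_env is None else base_env)
    env["GDAL_DISABLE_READDIR_ON_OPEN"] = "TRUE"  # skip sibling directory scan
    env["GDAL_PAM_ENABLED"] = "NO"                # skip PAM sidecar lookup
    return env


# Example use (assumes the gdalinfo utility is on PATH):
#   import subprocess
#   subprocess.run(["gdalinfo", "image.tif"], env=gdal_env())
```

As noted in the reply above, disabling the directory scan is not always a win, so it is worth timing both settings.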



