[gdal-dev] VSIReadDir() and GDALOpenInfo

Frank Warmerdam warmerdam at pobox.com
Wed Jan 30 12:00:16 EST 2008


Folks,

There have been a couple of complaints so far about the new behavior of the
GDALOpenInfo class (used by GDALOpen), which now builds a list of the filenames
in a directory before calling each of the drivers to try to open something.

This was done in the hope of reducing the overhead of having many drivers
probe around for files using fstat and fopen, which I presume can be fairly
expensive.  This was especially perceived as an issue for the new identify
operation.
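
As a rough sketch of the idea (this is not the actual GDALOpenInfo code;
SiblingList and hasWorldFile are made-up names, and the world-file extensions
are only an example), a driver can answer "is there a sidecar file next to
this dataset?" from a list built once, instead of hitting the file system
for every guess:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a directory listing gathered once per open.
struct SiblingList {
    std::set<std::string> names;

    explicit SiblingList(const std::vector<std::string>& entries)
        : names(entries.begin(), entries.end()) {}

    // Answered from memory; no stat or fopen call is made here.
    // Note: the comparison is case-sensitive, which real file systems may not be.
    bool has(const std::string& filename) const {
        return names.count(filename) != 0;
    }
};

// Hypothetical driver probe: is there a world file next to the raster?
bool hasWorldFile(const SiblingList& siblings, const std::string& basename) {
    assert(!basename.empty());
    return siblings.has(basename + ".tfw") || siblings.has(basename + ".wld");
}
```

The trade-off is that the listing costs one pass over the whole directory
up front, whether or not any driver ends up needing it.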

But it seems that just scanning for a list of files in a directory can be
quite expensive in a directory with many thousands of entries - at least on
some operating systems, as documented in:

   http://trac.osgeo.org/gdal/ticket/2158

Two steps have been taken in an effort to reduce the impact of this issue.
First, Even has corrected VSIReadDir() to avoid an O(n^2) effect in how
the filename list was being built.  Second, I have added a configuration
option so that it is possible to disable the directory scanning for those
who are aware that this issue is impacting their performance.
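
To illustrate the kind of O(n^2) effect meant here, a minimal sketch follows.
It assumes the cost came from re-counting a NULL-terminated list on every
append, which is one common way this happens; it is not the actual
VSIReadDir() code, and both function names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Quadratic pattern: the list length is not stored, so every append walks
// to the terminator first.  Returns how many entries were visited in total.
std::size_t buildByRecounting(const std::vector<std::string>& names,
                              std::vector<const char*>& list) {
    std::size_t visited = 0;
    list.assign(1, nullptr);                  // empty list: terminator only
    for (const std::string& name : names) {
        std::size_t count = 0;
        while (list[count] != nullptr) {      // O(current size) per append
            ++count;
            ++visited;
        }
        list[count] = name.c_str();
        list.push_back(nullptr);
    }
    assert(list.back() == nullptr);           // list stays NULL-terminated
    return visited;
}

// Linear pattern: the container tracks its own size, so each append is
// O(1) amortized and no entry is ever re-visited.
void buildByTracking(const std::vector<std::string>& names,
                     std::vector<const char*>& list) {
    list.clear();
    list.reserve(names.size() + 1);
    for (const std::string& name : names) {
        list.push_back(name.c_str());
    }
    list.push_back(nullptr);
    assert(list.back() == nullptr);           // list stays NULL-terminated
}
```

With n names the first version visits n(n-1)/2 entries while counting, so a
directory of many thousands of files gets expensive quickly even before the
cost of the directory read itself.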

But I'm wondering if there is a fundamentally better approach to scanning
for the list of files in a directory, or if this is inherently quite an
expensive operation.  Thoughts?  Other experiences on this?  Is this more
of an issue with particular file systems?

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org


