[Gdal-dev] RFC 12: Improved File Management

Frank Warmerdam warmerdam at pobox.com
Mon May 7 12:27:54 EDT 2007


Ray Gardener wrote:
> Hmm... I'm partial to the opposite route: abstracting away details like 
> the structure of datasets, since they're driver-dependent. Do callers 
> really need to know how datasets are organized? 

Ray,

As Daniel points out, there are several use cases where it is important
to be able to manage the files associated with a dataset.  I'm initially
implementing GetFileList() primarily in support of the delete and
rename operations.
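
As a rough illustration of why a file list is enough to support delete and
rename, here is a minimal Python sketch.  It assumes the list of associated
files has already been obtained (for instance from GetFileList()) as a plain
list of paths.  The function names and the "shared base name" renaming rule
are hypothetical, chosen for the example; they are not taken from RFC 12.

```python
import os


def delete_dataset_files(file_list):
    """Delete every existing file in file_list; return the paths removed."""
    removed = []
    for path in file_list:
        if os.path.exists(path):
            os.remove(path)
            removed.append(path)
    return removed


def rename_dataset_files(file_list, old_base, new_base):
    """Rename each file by swapping a shared base name (assumed convention).

    Everything after old_base in the file name (extensions such as
    ".tif" or ".tif.aux.xml") is preserved.
    """
    # First pass: validate every name so nothing is renamed on failure.
    plan = []
    for path in file_list:
        directory, name = os.path.split(path)
        if not name.startswith(old_base):
            raise ValueError("%s does not start with %s" % (name, old_base))
        new_name = new_base + name[len(old_base):]
        plan.append((path, os.path.join(directory, new_name)))

    # Second pass: perform the renames.
    for old_path, new_path in plan:
        os.rename(old_path, new_path)
    return [new_path for _, new_path in plan]
```

The caller never needs to know which sidecar files a format uses; it only
passes along whatever list the dataset reports.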

But in the FDO library (used by MapGuide) there is a concept of being able
to "bundle up a dataset in a zip file and send to the server" and doing this
also requires a list of associated files.

In MapServer I would like to be able to write a dataset with GDAL and then
ask for all the files so I can zip them up and return them to a client.
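
The "zip them up" step might look something like the following Python
sketch.  It assumes the file list has already been collected from the
dataset; bundle_dataset_files is a hypothetical helper name, and storing
each file under its base name is an assumption made for the example.

```python
import os
import zipfile


def bundle_dataset_files(file_list, zip_path):
    """Write every file in file_list into a new zip archive at zip_path."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in file_list:
            # Store each file under its base name only (an assumption:
            # all files of the dataset have distinct names).
            archive.write(path, arcname=os.path.basename(path))
    return zip_path
```

The resulting archive can then be returned to the client as a single file.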

Clearly most applications using GDAL aren't going to need to know details
about which files are associated with a dataset.  We did get through
quite a few years without this info, though it often meant people building
a bit of format-specific logic into their applications.

> Is it even a warranted
> assumption that the data is on a filesystem?

There are clearly some datasets that are not in the file system, WCS
for instance.  There are also some datasets which are in memory
using the /vsimem/ virtual file system, but for our purposes we can
treat this as a filesystem as long as great care is taken to use the
VSI*L API to access it.

> The more we get away from a
> black box model, the more maintenance it's going to be; the API surface 
> area keeps going up and increases the combinatorial effects. No offence, 
> but an RFC like this seems more like solving a symptom instead of a root 
> cause.

I'm not sure I agree with this.

> GDAL should do all file and filesystem I/O as callbacks, and a utility 
> reference library built on top that implements the callbacks for common 
> cases like C stdlib I/O, GUI filesys browsers, etc. 

That is exactly the intention of the VSI*L API - a POSIX-like virtualization
of file IO so we can have files in memory, in the file system, and in the
future possibly files accessed out of zip files, etc.
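
To make the idea of a virtualized file layer concrete, here is a toy Python
sketch.  It is not GDAL's implementation and does not use the VSI*L
functions; every class and function name in it is hypothetical.  The only
detail taken from this discussion is the "/vsimem/" path prefix.  It shows
one open call dispatching to different backends based on the path.

```python
import io


class _MemoryWriter(io.BytesIO):
    """In-memory file that saves its bytes into a store when closed."""

    def __init__(self, store, path):
        super().__init__()
        self._store = store
        self._path = path

    def close(self):
        if not self.closed:
            self._store[self._path] = self.getvalue()
        super().close()


class MemoryFileSystem:
    """Toy backend keeping file contents in a dict keyed by path."""

    def __init__(self):
        self._files = {}

    def open(self, path, mode):
        if "w" in mode:
            return _MemoryWriter(self._files, path)
        if path not in self._files:
            raise FileNotFoundError(path)
        return io.BytesIO(self._files[path])


class DiskFileSystem:
    """Backend that simply forwards to the operating system."""

    def open(self, path, mode):
        return open(path, mode)


MEMORY_PREFIX = "/vsimem/"
_memory_fs = MemoryFileSystem()
_disk_fs = DiskFileSystem()


def virtual_open(path, mode="rb"):
    """Open path in binary mode, choosing a backend from its prefix."""
    if path.startswith(MEMORY_PREFIX):
        return _memory_fs.open(path, mode)
    return _disk_fs.open(path, mode)
```

Code written against virtual_open reads and writes the same way whether the
bytes live in memory or on disk, which is the property a driver relies on
when it sticks to the virtualized API.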

However, that doesn't address the need to manage these datasets, which is
what RFC 12 attempts to help with.

I'd add that not all drivers use the VSI*L API.  In some cases they just
need to be updated.  In other cases the driver is built on an external
library that doesn't give us the option of replacing the IO mechanism.

> The more GDAL 
> continues to know about files and filesys details, the more painful it's 
> going to get.

Well, I think you need to provide a few concrete examples of this so we
can better understand your point.  I certainly don't want routine GDAL
applications to need to know much about files.  They don't need to know
today, and I don't think that is changing.

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org



