[gdal-dev] FileGDB vs OpenFileGDB, a few questions

Even Rouault even.rouault at spatialys.com
Mon Feb 11 06:05:20 PST 2019


Andrea,

> Am I right in assuming one has to pretty much build from sources in order
> to try out the FileGDB driver?

In osgeo4w, you can install the "gdal-filegdb" package.

> 
> I've tried to open a relatively small file, 20MB, 200k lines, works and
> displays fine, the
> in memory spatial index (from the docs, "By default, it will also build on
> the fly a in-memory spatial index during the first sequential read of a
> layer")
> seems to be effective.
> 
> I've then tried a larger one, 10GB, 40 million lines, and with this one it
> does not seem like there is a spatial
> index going at all, even zooming in I see the disk reading madly at
> 100-200MB/s for several seconds in order
> to return a very small area (by small I mean it has 1000 lines tops, road
> network of a village).
> It's like there was no spatial index, but it could also be that the spatial
> index is memory limited and too shallow to be effective.
> Maybe I never really gave it a chance to complete the first sequential
> scan, but was wondering about others' experiences.

Looking at the code, the driver requires a very particular call sequence to 
complete the building of the index: basically, iterating over the whole 
dataset. It will invalidate the index as soon as you do something else, for 
example if you reset the iteration on features after having read at least one. 
It is possible that the way QGIS calls OGR prevents the driver from building 
the index. Another factor is that QGIS doesn't necessarily keep OGR 
connections open for very long.
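
As a rough illustration of that call sequence, here is a minimal Python sketch. 
The helper names (open_openfilegdb, read_whole_layer, features_in_bbox) are 
made up for this example; only the OGR calls inside them are real API. It 
assumes the index survives the later filter/reset calls once it has been 
completed, which I have not verified for every GDAL version.

```python
def open_openfilegdb(path):
    """Open `path` with the OpenFileGDB driver only (needs the GDAL bindings)."""
    from osgeo import gdal  # imported lazily so the helpers below work without GDAL

    ds = gdal.OpenEx(path, gdal.OF_VECTOR, allowed_drivers=["OpenFileGDB"])
    if ds is None:
        raise RuntimeError("could not open %s" % path)
    # Return the dataset too: the layer is only valid while the dataset is alive.
    return ds, ds.GetLayer(0)


def read_whole_layer(layer):
    """Read a freshly opened layer from start to end in one uninterrupted pass.

    No ResetReading() or filter change happens in the middle, since that is
    what the driver is described as treating as a reason to drop the index.
    """
    count = 0
    feature = layer.GetNextFeature()
    while feature is not None:
        count += 1
        feature = layer.GetNextFeature()
    return count


def features_in_bbox(layer, minx, miny, maxx, maxy):
    """Run a bounding-box query on the same, still open, layer handle."""
    layer.SetSpatialFilterRect(minx, miny, maxx, maxy)
    layer.ResetReading()
    found = []
    feature = layer.GetNextFeature()
    while feature is not None:
        found.append(feature)
        feature = layer.GetNextFeature()
    return found
```

The important part is that the same dataset and layer objects are reused for 
the full pass and for all later bounding-box queries; reopening the dataset 
for each query starts again from scratch.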

If you enable GDAL traces (CPL_DEBUG=ON environment variable), you'll see
OpenFileGDB: SPI_COMPLETED
when the spatial index is built.
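
A small sketch of how one could check for that trace from Python. The helper 
names are hypothetical; the only things taken from GDAL are the CPL_DEBUG 
variable and the message text quoted above.

```python
import os

# Debug line quoted above, emitted once the in-memory index is complete.
SPI_MARKER = "OpenFileGDB: SPI_COMPLETED"


def debug_environment(base_env=None):
    """Return a copy of the environment with GDAL debug traces switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["CPL_DEBUG"] = "ON"
    return env


def spatial_index_completed(debug_output):
    """Return True if the captured debug output contains the completion trace."""
    return any(SPI_MARKER in line for line in debug_output.splitlines())
```

GDAL writes debug messages to stderr by default, so the environment returned 
by debug_environment() can be passed to subprocess.run(..., env=..., 
stderr=subprocess.PIPE, text=True) and the captured stderr handed to 
spatial_index_completed().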

Anyway, if you have a very large dataset and the dataset handle is not 
persisted across spatial queries, this in-memory spatial index will not be 
useful.

> 
> I assume things would be faster with the FileGDB, but was wondering about
> this OpenFileGDB statement: "Robust against corrupted Geodatabase files."
> So I'm assuming the FileGDB one is not robust reading corrupted Geodatabase
> files. But what does that mean? Segfault/hard crash?

Yep, possibly. The FileGDB SDK is a binary blob, which has likely never gone 
through fuzz testing as the OpenFileGDB driver did. However, if you consume 
"normal" datasets, that shouldn't be an issue.

> Also, has the FileGDB driver been tested in a heavily multithreaded
> environment, does it work fine there?

The FileGDB SDK has multithreading issues (at least in the early versions 
that were used to develop the driver), so the FileGDB driver takes a big lock 
around all calls. That won't scale very well of course...
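
To make the scaling point concrete, here is a toy Python model of a single 
lock around every call. It is not the driver code (which is C++), and 
SerializedDriver and run_in_threads are invented for the illustration; it 
only shows that with one global lock at most one call is ever in progress, 
whatever the number of threads.

```python
import threading
import time

# One lock shared by every call: a stand-in for a driver-wide mutex.
_GLOBAL_LOCK = threading.Lock()


class SerializedDriver:
    """Toy wrapper that funnels every call through the same global lock."""

    def __init__(self):
        self._active = 0
        self.max_active = 0  # highest number of calls seen running at once

    def call(self, work):
        with _GLOBAL_LOCK:
            self._active += 1
            self.max_active = max(self.max_active, self._active)
            try:
                return work()
            finally:
                self._active -= 1


def run_in_threads(driver, n_threads, work):
    """Start n_threads threads that each make one call, and wait for them."""
    threads = [
        threading.Thread(target=driver.call, args=(work,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return driver.max_active
```

Calling run_in_threads(SerializedDriver(), 8, lambda: time.sleep(0.01)) takes 
about as long as eight sequential calls, because the threads only queue up on 
the lock.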

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

