[gdal-dev] OGR random read optimization?

Ray Gardener rayg at daylongraphics.com
Tue Dec 1 18:13:57 EST 2009


Hmmm... okay. Fwiw, I did the optimization in a wrapper layer class to 
help simplify the interface, i.e., folding sequential and random access 
into a single get_feature(int index) call. The optimization allows 
efficient access if the indices are sequential and best-effort if 
random, and the concept of SetNextByIndex goes away.

An STL-like approach could also be used, treat a layer as a collection 
with explicit iterator types, e.g., sequential_feature_iterator, 
random_feature_iterator, etc. This is also useful for thread safety 
since the iterator can hold the iteration state.

Another optimization is to cache the necessary filemark (and any other 
info) every so often along with the index number, into an array sorted 
by index. Then if an index which is less than the current index is 
given, a bsearch can be done into the cache for the nearest filemark 
behind the desired index, and we seek to there, then seek forward, 
avoiding having to reset and seek from the start.

Ray


Even Rouault wrote:
> Ray,
> 
> The idea is interesting but unfortunately, I don't see how this can be 
> implemented in a generic way in current OGRLayer. The root of the 
> problem is that OGRLayer doesn't track the current index because 
> GetNextFeature() is a virtual method overriden by each driver. The 
> nicest way your idea could be implemented would be to have instead a 
> virtual IGetNextFeature() method which would be the one overriden by 
> drivers (and do the job of current GetNextFeature()) but called by non 
> virtual OGRLayer::GetNextFeature(). The latter method could keep track 
> of the current index and store it as a member of OGRLayer. This would be 
> a bit similar to RasterIO() / IRasterIO() that exists on the GDAL side. 
> The approach would also have the advantage to avoid that each driver's 
> GetNextFeature() has to deal with testing spatial and attribute filters. 
> Unfortunately, this would require changes in all OGR drivers, which 
> makes it probably not appropriate for any release in the GDAL/OGR 1.X 
> series. Maybe something for a GDAL/OGR 2.0 ;-) ?
> 
> In fact, this could be in theory introduced in a backward compatible way 
> if we :
> 1) Add a OGRLayer::IGetNextFeature() virtual method : Default 
> implementation in OGRLayer class : issue a CPLError and return NULL
> 2) Add a OGRLayer::IResetReading() virtual method : Default 
> implementation in OGRLayer class : issue a CPLError
> 3) Add a OGRLayer::ImplementsIGetNextFeatureAndIResetReading() virtual 
> method. Default implementation in OGRLayer class : return FALSE.
> 4) Change OGRLayer declaration so that GetNextFeature() and 
> ResetReading() are no longer pure virtual, but just virtual. Their 
> default implementation would call IGetNextFeature() and IResetReading()
> 5) Change OGRLayer::SetNextByIndex() default implementation so that it 
> does the optimization you suggest when 
> ImplementsIGetNextFeatureAndIResetReading() returns TRUE. Otherwise it 
> defaults to current code.
> 
> That approach would not require any code change in existing drivers. And 
> drivers interested in the new capability would just need to rename their 
> GetNextFeature() -> IGetNextFeature(), ResetReading() -> IResetReading() 
> and override ImplementsIGetNextFeatureAndIResetReading() to make it 
> return TRUE.
> 
> I'm wondering if it's worth the change. For true random access patterns, 
> even with your optimization, SetNextByIndex() will always be slow for 
> drivers that don't have a specialized fast implementation of it (average 
> complexity of O(n) by call). It can only bring performance gains if you 
> request increasing indices.
> 
> I'd note that currently efficient implementations of SetNextByIndex() 
> exist in Shapefile, MEM and GRASS drivers. GDAL/OGR 1.7.0dev has also 
> added implementation of it for PostgreSQL and VRT (for the latter, only 
> if the underlying layer has OLCFastSetNextByIndex capability).
> 
> Best regards,
> 
> Even
> 
> Ray Gardener a écrit :
>> The OGRLayer::SetNextByIndex() function can be sped up by keeping 
>> track of the previously used index, and calling ResetReading() only if 
>> the new index is lesser than it. Otherwise, read forward and skip 
>> features by the difference of the new and old indices.
>>
>> For thread safety, the old index should be kept in a caller-level data 
>> structure which is passed in as an argument to SetNextByIndex.
>>
>> Ray
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>>
>>
> 
> 
> 
> 



More information about the gdal-dev mailing list