[gdal-dev] OT: OGR and Quadtree and R-tree

Frank Warmerdam warmerdam at p...
Tue Aug 21 09:22:09 EDT 2001


Paul Selormey wrote:
> Hello All,
> 
> This is really off-topic, but I could not find a good place to
> discuss the OGR and since it is "part" of the GDAL I decided to post
> this here.
> 
> Not a professional GIS person, a newbie. I have read about Quadtree
> and R-tree and how they make rendering fast.
> 
> 1. Which one do you use for Shape file and MID/MIF/TAB and why?
> 
> 2. I have seen the implementation of the quadtree in shapelib and the
> R-tree in GFC. However, no such algorithm is used in the OGR, why?
> 
> 3. I have used OpenEV, which is based on the OGR/GDAL. The rendering
> is not all that slow, so why is the need for the quadtree?
> 
> 4. To anyone building a OGRLayer object and concerned with the memory
> and speed, what is the most practical/best way to use the classes
> provided in the OGR?

Paul,

1) Shapefiles do not generally have a spatial index. ESRI programs do
sometimes generate .sbn/.sbx files which are some sort of spatial index but
they are not documented. The OGR driver for shapefiles does not include
any spatial indexing ... it doesn't even use my hacky quadtree stuff in
shapelib at this time.

2) OGR is generally used for one pass translation rather than high
performance random access, so indexing has not been a priority. Furthermore
indexing separately in each driver would complicate the code ... perhaps
significantly.

3) Note that OpenEV vector rendering loads all the shapes into memory making
the speed of OGR rather irrelevant. Rendering speed of large layers of
vectors in OpenEV isn't all that fast, especially if zoomed in on a subset
of a large layer because OpenEV always drawns all the vectors through OpenGL.
However; OpenEV does demonstrate the truth that spatial indexing of vectors
is generally only important if you want to deal with vector large vector
datasets (larger than can be conveniently held in RAM for instance).

4) OGR doesn't really provide any support functions to simplify implementing
a high performance driver; however, it does provide important hooks that a
driver can implement to provide better performance to the application. In
particular, if applications set a spatial or attribute query a given OGRLayer
implementation can choose to implement it internally in a smart way to provide
good performance. Make sure you override GetExtent() and GetFeatureCount()
so the default implementations (which result in reading the whole dataset
once) are not used. Also, if you want to provide high performance, make sure
that random reads are supported.

Finally, try to return meaningful results from your TestCapability() call.

While OGR formats can be implemented fairly efficiently; if you are
super-sensitive to performance, going through OGR is likely not the most
efficient mechanism. OGR adds a non-trivial overhead in terms of making
and destroying the OGRFeatures. That is why something like MapServer
which is very performance sensitive renders directly from a hacked/optimized
shapelib instead of going through a nice general library like OGR. Well, it
does have the option of going through OGR, but it's most optimized option
is still the direct from shape method.

Best regards,

-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam at p...
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | Geospatial Programmer for Rent





More information about the Gdal-dev mailing list