[gdal-dev] Re: Performance of reading large polygons with holes

Jukka Rahkonen jukka.rahkonen at mmmtike.fi
Sun Apr 22 04:53:22 EDT 2012


Rahkonen Jukka <Jukka.Rahkonen <at> mmmtike.fi> writes:


> Thus it is 4 seconds vs. 32 seconds measured by Martin. It is a considerable
> difference but perhaps Martin is not doing exactly the same thing.  Anyway,
> speed of OGR seems to be excellent.

I am acting as the man in the middle, and Martin now writes as follows:

"Since the claimed 4 s for OGR seems suspiciously fast, I played around
with some scenarios intended to ensure that OGR was actually reading and
constructing every geometry.  I eventually settled on computing the
maximum area of the polygons, since this was the simplest query I could
come up with that would guarantee building the polygons.  On the Java
side I used JEQL, since I could easily replicate this query, it's using
more or less the same shapefile code as OJ, and it was the source of the
32 s number I gave earlier.

The result surprised me:  OGR: ~40 s, JEQL: ~20 s.

The details, in case anyone wants to retry this:

OGR:

ogrinfo -sql "select max(OGR_GEOM_AREA) from tpi_1" tpi_1.shp

Result:    max_OGR_GEOM_AREA (Real) = 68476900073.166

JEQL:

ShapefileReader t file: "tpi_1.shp";
t = select max(Geom.area(GEOMETRY)) from t;
Print t;

Result:

col0:Double
68476900073.13647
Run completed in 19.015 s

(Good to see that the areas are within 0.03 m^2!)

So either the OGR area routine is slow, or else the reader is pretty
smart and only builds geometries when it really has to.  On the JEQL
side, the read time of 32 s I quoted before was actually based on a
query which was computing the maximum number of points in the geometries
(again, to force reading the geometries, since JEQL uses lazy
evaluation). So for some reason counting the max number of points is
slow, which is peculiar. More research required to track down why that
is.  But I think the area-based result is valid."
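
To illustrate Martin's point about lazy readers, here is a minimal Python
sketch. It is not the OGR or JEQL code, and every name in it (LazyPolygon,
ring_area, demo) is hypothetical. It only shows why a query has to touch
the geometry, for example by computing an area, before its timing says
anything about building polygons with holes.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def ring_area(ring: Sequence[Point]) -> float:
    """Unsigned area of a ring (shoelace formula); open or closed rings both work."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


class LazyPolygon:
    """Hypothetical lazy feature: keeps raw rings, builds geometry on first use."""

    builds = 0  # how many geometries were actually constructed

    def __init__(self, shell: Sequence[Point], holes: Sequence[Sequence[Point]] = ()):
        self._raw = (shell, tuple(holes))
        self._built = None

    def geometry(self):
        # Construction is deferred until something really needs the geometry.
        if self._built is None:
            LazyPolygon.builds += 1
            shell, holes = self._raw
            self._built = (list(shell), [list(h) for h in holes])
        return self._built

    def area(self) -> float:
        # Area of the shell minus the area of every hole.
        shell, holes = self.geometry()
        return ring_area(shell) - sum(ring_area(h) for h in holes)


def demo():
    """Run a cheap query and an area query; report how many geometries each built."""
    LazyPolygon.builds = 0
    features: List[LazyPolygon] = [
        # 10 x 10 square with a 2 x 2 hole: area 100 - 4 = 96
        LazyPolygon([(0, 0), (10, 0), (10, 10), (0, 10)],
                    holes=[[(4, 4), (6, 4), (6, 6), (4, 6)]]),
        # right triangle with legs 4 and 3: area 6
        LazyPolygon([(0, 0), (4, 0), (0, 3)]),
    ]

    count = len(features)               # "read" that never touches geometry
    builds_after_count = LazyPolygon.builds

    max_area = max(f.area() for f in features)  # forces every geometry
    builds_after_area = LazyPolygon.builds

    return count, builds_after_count, max_area, builds_after_area


if __name__ == "__main__":
    print(demo())
```

In this toy model the count query builds no geometries at all, while the
max-area query builds every one. That is the reason for timing an area
query rather than a plain read.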

-Jukka Rahkonen-






