[gdal-dev] Call for discussion on RFC 92 text: WKB Only geometries
Even Rouault
even.rouault at spatialys.com
Sat Feb 4 10:55:37 PST 2023
Hi Sean,
> but wouldn't it be possible for all OGRFeatures to carry WKB data by
> default and add a method to provide it to callers?
My understanding is that what you propose would involve massive code rewrites
in all drivers and wouldn't be desirable from a performance point of
view, because most drivers can't generate WKB easily (PostGIS and GPKG
are the exceptions rather than the norm). So either all other drivers would
have to be modified to compose WKB by hand (a massive coding effort, probably
several weeks of work, with a significant risk of regressions), or they would
get it from the exportToWkb() method of the OGRGeometry instance they currently
build, but then you pay the price in memory and CPU time to generate WKB
that might not be consumed by users.
> And only construct an OGRGeometry when it's asked for? Such as when
> GetGeometryRef is called?
Good point. We could make both GetGeometryRef() and GetGeomFieldRef()
virtual methods whose default implementation would be the same as
currently, i.e. return the value of the corresponding member variable in
the base OGRFeature class, as stored with
SetGeometry[Directly]()/SetGeomField[Directly]().
And add a new virtual method:
    virtual GByte* OGRFeature::GetWKBGeometry(int iGeomField, size_t* pnOutSize) const
whose default implementation would just use
GetGeomFieldRef(iGeomField)->exportToWkb().
The few drivers that can provide a more efficient implementation (GPKG
typically) would create a derived class OGRFeatureGPKG with a specific
implementation of those new virtual methods to avoid systematic
OGRGeometry instantiation. The only drawback I see is that making
GetGeometryRef() and GetGeomFieldRef() virtual would have a slight
performance impact, but probably small enough.
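To illustrate the idea, here is a minimal sketch using toy stand-in classes. Feature, Geometry, WkbBackedFeature and the simplified GetWKBGeometry() signature are all hypothetical and are not the real OGRFeature/OGRGeometry API; the point is only to show a derived feature that returns its stored WKB directly and builds the geometry object lazily.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the real GDAL classes.
using Wkb = std::vector<std::uint8_t>;

// Toy geometry that just keeps its WKB; a real geometry would parse it.
struct Geometry {
    static int nInstances;  // counts how many geometries were built
    Wkb wkb;
    explicit Geometry(Wkb w) : wkb(std::move(w)) { ++nInstances; }
    Wkb exportToWkb() const { return wkb; }
};
int Geometry::nInstances = 0;

// Base feature: default behaviour, the geometry object is stored directly.
class Feature {
  public:
    virtual ~Feature() = default;
    void SetGeometry(std::unique_ptr<Geometry> g) { geom_ = std::move(g); }
    virtual const Geometry* GetGeometryRef() const { return geom_.get(); }
    // Default implementation: go through the geometry object.
    virtual Wkb GetWKBGeometry() const {
        const Geometry* g = GetGeometryRef();
        return g ? g->exportToWkb() : Wkb();
    }

  protected:
    mutable std::unique_ptr<Geometry> geom_;
};

// Driver-specific feature (think GPKG) that already holds WKB from storage.
class WkbBackedFeature : public Feature {
  public:
    explicit WkbBackedFeature(Wkb raw) : raw_(std::move(raw)) {}
    // Build the geometry only on first request.
    const Geometry* GetGeometryRef() const override {
        if (!geom_) geom_ = std::make_unique<Geometry>(raw_);
        return geom_.get();
    }
    // Return the stored bytes unless a geometry object already exists
    // (lazily built or set by the caller), in which case export from it.
    Wkb GetWKBGeometry() const override {
        if (geom_) return geom_->exportToWkb();
        return raw_;
    }

  private:
    Wkb raw_;
};

// Usage: reading WKB from the driver-backed feature builds no Geometry.
inline int GeometriesBuiltWhenReadingWkb() {
    const int before = Geometry::nInstances;
    WkbBackedFeature f(Wkb{0x01, 0x02, 0x03});
    const Wkb out = f.GetWKBGeometry();
    assert(out.size() == 3);
    return Geometry::nInstances - before;
}
```

Callers that only need WKB never pay for geometry construction, while callers of GetGeometryRef() see the same behaviour as today, at the cost of a virtual call.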
But fundamentally I'm wondering whether RFC 92 hasn't been made mostly
outdated now that we have RFC 86. RFC 86 generally leads to a 2x speed-up
or more on real-world datasets compared to OGRFeature iteration (as
measured by the bench_ogr_c_api vs bench_ogr_batch utilities) on drivers
that have implemented it (currently Arrow, Parquet, FlatGeobuf, GPKG),
whereas RFC 92 only applies to GPKG & PostGIS and, in the best
(artificial) case, only leads to a 30% speed-up.
Of course, adopting RFC 86 requires significant effort from GDAL users,
but the benefit is really measurable whereas with RFC 92 it would be
marginal in most scenarios. As far as I can tell, the performance boost
of RFC 86 comes mostly from saving the creation & destruction of millions of
OGRFeature instances, with their array members, string attributes and geometry
objects, more than from the columnar organization of the ArrowArray data
structures. In the GeoPackage driver, I've also shown that it makes
efficient multi-threaded pre-fetching possible, totally transparently
for the user.
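The difference in object churn can be sketched with toy types. RowFeature, Batch and OwnedBuffers below are hypothetical illustrations and are not OGRFeature or the Arrow C data structures: row-wise iteration creates one feature per row, each owning its own string and geometry buffer, whereas a batch keeps a fixed number of contiguous buffers whatever the row count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy types for illustration only; not OGRFeature or ArrowArray.

// Row-wise: every feature owns its own string and geometry buffer.
struct RowFeature {
    std::int64_t fid;
    std::string name;
    std::vector<std::uint8_t> wkb;
};

// Batch: one contiguous buffer per column, with Arrow-style offsets
// (n + 1 entries) delimiting the variable-length values.
struct Batch {
    std::vector<std::int64_t> fids;
    std::vector<std::int32_t> nameOffsets{0};
    std::string nameData;
    std::vector<std::int32_t> wkbOffsets{0};
    std::vector<std::uint8_t> wkbData;

    void Append(std::int64_t fid, const std::string& name,
                const std::vector<std::uint8_t>& wkb) {
        fids.push_back(fid);
        nameData += name;
        nameOffsets.push_back(static_cast<std::int32_t>(nameData.size()));
        wkbData.insert(wkbData.end(), wkb.begin(), wkb.end());
        wkbOffsets.push_back(static_cast<std::int32_t>(wkbData.size()));
    }
    std::size_t size() const { return fids.size(); }
    // Slice the i-th name out of the shared character buffer.
    std::string Name(std::size_t i) const {
        assert(i < size());
        const auto begin = static_cast<std::size_t>(nameOffsets[i]);
        const auto end = static_cast<std::size_t>(nameOffsets[i + 1]);
        return nameData.substr(begin, end - begin);
    }
};

// Number of separately owned buffers in each layout.
inline std::size_t OwnedBuffers(const std::vector<RowFeature>& rows) {
    return 2 * rows.size();  // one string + one WKB vector per feature
}
inline std::size_t OwnedBuffers(const Batch&) {
    return 5;  // fids, two offset arrays, two data buffers
}
```

With a million rows the row-wise layout has two million small owned buffers to create and destroy, while the batch still has five.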
But to avoid selling false hopes, the benefit of RFC 86 in end-to-end
scenarios would probably drop significantly (at least when looking at the
performance gain as a percentage; the absolute savings on the
GDAL side would remain) if you need to recreate individual features
(QGIS' QgsFeature or MapServer's msShape objects) from the content of the
ArrowArray. So a complete shift of concepts would likely be
required.
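To make the percentage-versus-absolute point concrete, here is a small worked calculation. The timings are made-up round numbers chosen for illustration, not measurements, and EndToEndGainPercent is a hypothetical helper.

```cpp
#include <cassert>

// Hypothetical helper. All times share one arbitrary unit.
// gdalBefore / gdalAfter: time spent on the GDAL side with OGRFeature
// iteration and with batch reading; downstream: time the application
// spends rebuilding its own per-feature objects (assumed unchanged).
inline double EndToEndGainPercent(double gdalBefore, double gdalAfter,
                                  double downstream) {
    const double before = gdalBefore + downstream;
    const double after = gdalAfter + downstream;
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

// Made-up example: the GDAL side goes from 10 to 5 units (a 2x speed-up).
// With no downstream cost:   EndToEndGainPercent(10, 5, 0)  -> 50
// With 15 units downstream:  EndToEndGainPercent(10, 5, 15) -> 20
// The absolute saving is 5 units in both cases.
```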
Even
>
> On Tue, Jan 31, 2023 at 4:27 AM Even Rouault
> <even.rouault at spatialys.com> wrote:
>
> Hi,
>
> Please find for review "RFC 92 text: WKB Only geometries" at
> https://github.com/OSGeo/gdal/pull/7149
>
> This RFC provides shortcuts to avoid instantiation of full OGRGeometry
> instances in scenarios where only the WKB representation of geometries
> is needed. The hope is to save CPU time.
>
> This is something I wanted to at least experiment with. I have mixed
> feelings about whether it's something we actually want to adopt.
>
> Even
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>
>
>
> --
> Sean Gillies
--
http://www.spatialys.com
My software is free, but my time generally not.