[PROJ] Vector/SIMD acceleration
Even Rouault
even.rouault at spatialys.com
Fri Apr 17 04:01:04 PDT 2020
Thomas,
> That is - actually, I work on a proof-of-concept for an improved,
> next generation WKT, ironing out some of the geodetically
> unfortunate elements of WKT2019.
Hum. Was hoping we wouldn't need a WKT2021 :-)
>
> But incidentally this involves implementing support for the
> OGC/ISO19100 "Coordinate Set" (i.e. "sets of coordinate tuples")
> concept, since ISO metadata is attached at the set, rather than
> tuple, level.
For other readers, I suppose you speak about the classes described at:
http://docs.opengeospatial.org/as/18-005r4/18-005r4.html#18
That's indeed something I left aside during PROJ 6 implementation. CoordinateMetadata
could be interesting to implement, as it has a WKT:2019 representation (not implement
currently), and could be useful for transformations involving dynamic/time-dependent CRS.
> But until then I'll be very interested in discussing the form and
> contents of a parallel coordinate data structure (CoordinateSet
> class), so we can keep things compatible.
To give you some hints on what I prototyped, the generic vector-capable type is
template<typename T, int N> class VF{};
(VF stands Vector Floating point)
And it has specializations VF<double,1>, VF<double,2>, VF<double,4> etc
VF<double,1> once optimized is equivalent to a plain old double
VF<double,2> on SSE2 expands to a SSE 128bit register (or 2 double on non vector platforms,
but not necessarily runtime efficient)
VF<double,4> on AVX/AVX2 expands to a AVX 256bit register, or on SSE2 on 2 SSE 128 bit
registers
VF<double,8> on AVX/AVX2 expands to 2 AVX 256bit registers, or on SSE2 to 4 SSE 128 bit
registers (and possibly on AVX-512 to a AVX-512 512 bit register)
Then you can define a:
template<typename T, int N> struct PJ_VF_XY
{
VF<T,N> x;
VF<T,N> y;
};
etc etc
But I don't think we would want to expose that on PROJ API level, one of the good reason is
that it is C++ so cannot be used at the C level.
> My first thought was to make the CoordinateSet class simply a
> container for the material currently given as args to
> proj_transform_generic()
I concur with this. I'd also imagine CoordinateSet to be more or less similar to
proj_trans_generic() arguments, with pointers to X, Y, Z, T double* arrays and user provided
strides, to be accept all reasonable memory arrangements (typically separate/contiguous X,
Y, Z, T components, or interleaved XY, XYZ, XYZT patterns)
(I see that CoordinateSet refers to the DirectPosition type, which must be defined in some
other ISO standard, but I don't think we want & need possibly heavy weight objects there)
If the user chooses a in-memory arrangement of its data that directly matches what PROJ
would use under the hood (for a vector type, the separate/contiguous arrangement would be
the best), it can save a bit of time for the loading/unloading between memory and SSE/AVX
registers, but even if those moves between memory and registers aren't done in the most
efficient way, they are certainly neglectable regarding the cost of the math operations.
PROJ should be smart enough internally to figure out from the available components of a
pipeline if it must use a vector operation or not. Typically you would have a fwd2d_2points,
fwd2d_4points, fwd2d_8points function pointers for a PROJ operation, that would be set or
not, depending on available hardware capabilities at runtime and benchmarked efficiency of
using those variants.
Even
--
Spatialys - Geospatial professional services
http://www.spatialys.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/proj/attachments/20200417/ee3a9b7b/attachment.html>
More information about the PROJ
mailing list