[PROJ] Performance of proj_create_crs_to_crs()

Paul Ramsey pramsey at cleverelephant.ca
Sun Feb 17 08:31:20 PST 2019


On Sun, Feb 17, 2019 at 7:50 AM Kristian Evers <kreve at sdfe.dk> wrote:
>
> Sure, initialising PJ objects is a lot more expensive in PROJ 6 than
> it has been previously. All the new fancy features come at a price. It
> is to be expected. Before it was quite straightforward what would happen
> when creating a PJ object but now we accept CRS description in a number
> of different formats and do a lookup in the CRS database to find all the
> possible transformations between your two CRS’s. Doing all that costs
> CPU cycles. I am sure there’s a potential for optimisation but that has
> not been the primary concern for this release.
>
> Personally I don’t worry too much about the time it takes to instantiate
> PJ objects since usually you would only have to do it once. I can see how
> PostGIS might be different in that regard so your strategy of caching the
> objects seems sound.

Well, as things stand now, any SQL statement that includes
ST_Transform() in PostGIS has gone from a 17ms minimum to a 350ms
minimum run time. That's basically unacceptable (not in a moral sense,
just in the sense that if PostGIS users see things get 20x slower
when they "upgrade" proj, with no benefit as far as they can tell,
then they will not want to upgrade).

We cache projection objects per-call, so at least the penalty isn't
350ms * num_rows, but it's still a penalty per SQL call. This implies
a pretty big redesign of our caching system, which will be
unfortunately complex, as it's not just a matter of moving the cache
up one level to the backend lifespan. Our reprojection system is
supposed to use spatial_ref_sys table as the source of truth, so any
longer-lived caching system will also have to be very aware of changes
to spatial_ref_sys and poison when they happen. I'm not actually even
sure at the moment how to implement that: unlike system catalogue
tables, there's no good way to handle cache poisoning in user tables.
:/
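
To make the shape of the problem concrete, here is a minimal sketch of
a longer-lived cache keyed on the SRID pair, where each entry remembers
which "generation" of spatial_ref_sys it was built against. Everything
named here (XformCache, xform_cache_get, the generation counter, the
demo_* stubs) is hypothetical and is not PostGIS or PROJ code. It also
assumes that something outside the cache can supply a number that
changes whenever spatial_ref_sys changes, which is exactly the part I
don't know how to do yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define XFORM_CACHE_SLOTS 8

/* Hypothetical callbacks: build stands in for the expensive PJ creation. */
typedef void *(*xform_build_fn)(int from_srid, int to_srid);
typedef void (*xform_free_fn)(void *handle);

typedef struct {
    int from_srid;
    int to_srid;
    uint64_t generation; /* spatial_ref_sys generation the entry was built against */
    void *handle;        /* opaque transformation object; NULL marks an empty slot */
} XformEntry;

typedef struct {
    XformEntry slots[XFORM_CACHE_SLOTS];
    size_t next_victim;  /* round-robin eviction cursor */
    xform_build_fn build;
    xform_free_fn destroy;
} XformCache;

void xform_cache_init(XformCache *c, xform_build_fn build, xform_free_fn destroy)
{
    assert(c != NULL && build != NULL && destroy != NULL);
    for (size_t i = 0; i < XFORM_CACHE_SLOTS; i++)
        c->slots[i].handle = NULL;
    c->next_victim = 0;
    c->build = build;
    c->destroy = destroy;
}

/* Return a cached handle for (from_srid, to_srid), rebuilding it when the
 * entry is missing or was built against an older generation.
 * current_generation is ASSUMED to change whenever spatial_ref_sys does. */
void *xform_cache_get(XformCache *c, int from_srid, int to_srid,
                      uint64_t current_generation)
{
    XformEntry *free_slot = NULL;
    assert(c != NULL);

    for (size_t i = 0; i < XFORM_CACHE_SLOTS; i++) {
        XformEntry *e = &c->slots[i];
        /* Poison any entry built against an older generation. */
        if (e->handle != NULL && e->generation != current_generation) {
            c->destroy(e->handle);
            e->handle = NULL;
        }
        if (e->handle == NULL) {
            if (free_slot == NULL)
                free_slot = e;
            continue;
        }
        if (e->from_srid == from_srid && e->to_srid == to_srid)
            return e->handle; /* hit: skip the expensive build */
    }

    void *handle = c->build(from_srid, to_srid);
    if (handle == NULL)
        return NULL;

    if (free_slot == NULL) {
        /* Cache full: evict the next slot in round-robin order. */
        free_slot = &c->slots[c->next_victim];
        c->next_victim = (c->next_victim + 1) % XFORM_CACHE_SLOTS;
        c->destroy(free_slot->handle);
    }
    free_slot->from_srid = from_srid;
    free_slot->to_srid = to_srid;
    free_slot->generation = current_generation;
    free_slot->handle = handle;
    return handle;
}

/* Demo stubs (hypothetical): count builds instead of calling PROJ. */
int demo_build_calls = 0;

void *demo_build(int from_srid, int to_srid)
{
    int *pair = malloc(2 * sizeof *pair);
    if (pair == NULL)
        return NULL;
    pair[0] = from_srid;
    pair[1] = to_srid;
    demo_build_calls++;
    return pair;
}

void demo_destroy(void *handle)
{
    free(handle);
}
```

With the stubs, two lookups of (3005, 4267) at the same generation
build once, and a lookup after the generation changes builds again.
The lookup itself is the easy half; producing a trustworthy generation
number for a user table is the hard half.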

I wonder if running proj_create_crs_to_crs() with simple proj strings
in the from/to slots reduces the likelihood that we get expensive
lookups of auxiliary information during PJ initialization? Would be a
shame to basically dumb down our use of proj and get none of the
benefits of all the work, but it would be better than accepting a
350ms query time floor for any use of ST_Transform.

P.

>
> /Kristian
>
>
> > On 17 Feb 2019, at 16:24, Paul Ramsey <pramsey at cleverelephant.ca> wrote:
> >
> > So, having gotten all the axis swapping tap dancing working, I went to
> > run some of my favourite transforms around BC, finishing up with one
> > of my favourites...
> >
> >  st_transform('SRID=3005;POINT(1000000 0)',4267)
> >
> > This takes a point from a NAD83 projected system (EPSG:3005) to a
> > NAD27 geodetic system (EPSG:4267).
> >
> > Here's the crazy part: this transformation takes 400ms, and the time
> > is all spent in proj, getting the PJ.
> >
> > I ran 20-30 of them in a row and captured the workload in Instruments
> > in case these function calls ring any bells WRT overhead,
> >
> > https://pasteboard.co/I1AXN5b.png
> >
> > Fortunately for bulk conversion PostGIS already caches the projection
> > object, in fact most of my work this week was in renovating that part
> > of the code, but older versions of Proj are much much faster in
> > resolving projections from projection strings.
> >
> > Thoughts?
> > _______________________________________________
> > PROJ mailing list
> > PROJ at lists.osgeo.org
> > https://lists.osgeo.org/mailman/listinfo/proj
>

