[PROJ] Pitching the proj project to Google's Geo team

Greg Troxel gdt at lexort.com
Mon Aug 22 10:32:09 PDT 2022


Martin Desruisseaux <martin.desruisseaux at geomatys.com> writes:

> On 22/08/2022 at 17:05, Lesparre, Jochem via PROJ wrote:
>
>> I support:
>>
>>   *  (…snip…)
>>   * Greg’s suggestion to stop using WGS84 (EPSG:4326).
>>
> Just one nuance about above last point. I agree about deprecating
> EPSG:4326 for /data producers/ (i.e. encourage all data producers to
> describe their CRS as accurately as they can). But we still need
> EPSG:4326 as an ensemble (i.e. with its ~2 meters uncertainty) for
> /data users/. The reason is that when a software reads a file from
> unknown source, if the CRS has not been accurately defined by the
> producer, there is nothing the software or user can do about that.

Agreed; there is data where there is no information about which frame.

I think it's important to understand that data labeled EPSG:4326 might
be in some ITRF, because it came out of a GNSS
receiver with SBAS, or in NAD83(something) because it came out of a GPS
receiver in the US with corrections from a Coast Guard differential
system.  And probably a bunch of other CRSes.  And random other frames
transformed *to* 4326, whatever that means, done who knows how.  It
would be interesting to run down the reality for a number of datasets
labeled 4326 and see what the distribution is.

> In those cases, the worst thing would be to /pretend/ that we have a
> CRS of some specific realization while actually the software has no
> idea. We need a way to say "we don't know what is the realization
> because the producer did not tell us, expect a 2 meters uncertainty".

I don't think it's the worst thing.  Assuming NAD27 would be worse.
And: if somebody claims to have data in 4326 but when you ask them "but
really what is it" and they have no idea, would you believe them?

We need to be mindful of two accuracies:

  1) the intrinsic error of the data relative to the actual CRS it is in
  2) the error from that actual CRS to the labeled CRS

When data is simply labeled 4326, yes, there is a possible 2m error from
any of the frames it could properly be in to WGS84(G2139)==ITRF2014,
ignoring epoch as a separate issue.  But the intrinsic error of the data
might be far larger.  If the data really is in WGS84(TRANSIT), and it is
navigation solutions not from an authorized user, there will be ~100m of
intrinsic error.
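
The two error terms above can be kept apart in a small budget.  What
follows is only a minimal sketch: it assumes the two terms are simply
added (a conservative choice), the class and field names are
hypothetical, and the 0.10 m value is an illustrative figure for good
survey data.  The 2 m and ~100 m figures are the ones quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Two separate error terms, both in metres (hypothetical structure)."""
    intrinsic_m: float  # error of the data relative to the CRS it is really in
    labeling_m: float   # error from that actual CRS to the labeled CRS

    def total_m(self) -> float:
        # Assumption: plain addition, i.e. a conservative worst-case bound.
        return self.intrinsic_m + self.labeling_m


# 2 m is the ensemble uncertainty quoted for EPSG:4326.
ENSEMBLE_UNCERTAINTY_M = 2.0

# ~100 m: navigation solutions from a non-authorized user (from the text).
old_nav_fixes = ErrorBudget(intrinsic_m=100.0, labeling_m=ENSEMBLE_UNCERTAINTY_M)
# 0.10 m: illustrative value for good survey-grade data (an assumption).
rtk_survey = ErrorBudget(intrinsic_m=0.10, labeling_m=ENSEMBLE_UNCERTAINTY_M)

for name, budget in [("old navigation fixes", old_nav_fixes),
                     ("RTK survey", rtk_survey)]:
    share = budget.labeling_m / budget.total_m()
    print(f"{name}: total {budget.total_m():.2f} m, labeling share {share:.0%}")
```

For poor data the labeling term is lost in the noise; for good data it
dominates the budget, which is why it matters how it is handled.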

The concern is data labeled 4326 that actually

  1) has a small intrinsic error
  2) is in G2139 or really any of the last few realizations

For such data it is important not to introduce error in the transformation
process.  There is a difference between accurately carrying the error
term and creating additional errors.

If the data actually happens to be in TRANSIT, assuming it's in G2139
and adding 2m to the error budget is going to result in that 2m being
used up in a mistransformation, rather than just having an extra 2m of
formal error without any additional actual error.  But data that is
actually in TRANSIT is exceedingly rare.  I have some, somewhere, and
it's only worth looking at to understand how Selective Availability used
to be.  I really doubt anyone can tell the difference between TRANSIT
with SA and G730 with SA.

> So I would not be in favour of interpreting EPSG:4326 as the latest
> realization. I think it should continue to mean "we don't know the
> realization", and every coordinate transformation involving EPSG:4326
> should declare a 2 meters uncertainty. We need to keep the possibility
> to say "we don't know", even if we want to encourage data producers
> to be more accurate.

I didn't mean "interpret it exactly as if".  After some thought, I'd
like to restate it.

When dealing with source data in 4326:

  increase the formal error of the result by 2m, to account for the
  possibility of treating data that is actually TRANSIT as G2139

  treat the CRS as if it were the latest realization

When dealing with a target CRS of 4326:

  transform to the latest realization

  don't do anything about error, because it's totally legit to treat
  G2139 data as being of the ensemble, and everyone else will add 2m of
  error when they take it out of the ensemble.

This will capture what you care about, that there is 2m of additional
uncertainty on top of the (probably unknown) intrinsic error of the
data.  And, if the data happens to be decent (not TRANSIT, not G730), it
won't mistransform it.
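
The rule restated above can be sketched as a small function.  This is
only an illustration of the proposal, not how PROJ behaves today: the
function and type names are hypothetical, and the label used for the
latest realization is a placeholder string rather than a real registry
code.

```python
from dataclasses import dataclass

# Assumptions for illustration: the ensemble code and the 2 m figure come
# from the discussion; the realization label is a placeholder string.
ENSEMBLE = "EPSG:4326"
LATEST_REALIZATION = "WGS84(G2139)"
ENSEMBLE_UNCERTAINTY_M = 2.0


@dataclass(frozen=True)
class Plan:
    source: str                  # CRS actually used as transformation source
    target: str                  # CRS actually used as transformation target
    extra_formal_error_m: float  # added to the formal error of the result


def plan_transformation(source_crs: str, target_crs: str) -> Plan:
    """Resolve the ensemble to its latest realization on either side."""
    source, target, extra = source_crs, target_crs, 0.0
    if source_crs == ENSEMBLE:
        # Source side: treat as the latest realization, and widen the
        # formal error in case the data is really an old realization.
        source = LATEST_REALIZATION
        extra += ENSEMBLE_UNCERTAINTY_M
    if target_crs == ENSEMBLE:
        # Target side: transform to the latest realization; no extra
        # error, since whoever reads the result adds it on the way out.
        target = LATEST_REALIZATION
    return Plan(source, target, extra)


print(plan_transformation("EPSG:4326", "EPSG:6319"))
print(plan_transformation("EPSG:6319", "EPSG:4326"))
```

The point of the asymmetry is that the 2 m is counted once, on the way
out of the ensemble, and never turned into an actual coordinate shift.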

By mistransform, I mean the following process getting a bad answer:

  take data in 4326

  transform to 6319, NAD83(2011) epoch 2010.0

proj comes up with a null transform because of the ensemble error.  But
in the reasonably likely case that the path to 4326 was careful, or that
the data is in some recent realization and is labeled 4326 only because
of some container format (e.g. TMS, geojson), you get an incorrect 1
meter shift in New England.

This is not theoretical.  I have data in NAD83(2011) epoch 2010.0,
gathered with dual-frequency RTK (ardusimple) that I think has 5 or 10
cm standard error.  If I view that in qgis with 15cm imagery from
MassGIS (government) in jp2 natively in NAD83 (UTM zone 19, alignment
with USGS), things line up to a pixel.  MassGIS publishes the same
imagery as TMS which is 4326/webmercator, and they have transformed
NAD83->WGS84 to a recent realization.  That imagery in qgis is
misaligned.  If I set my project CRS to 4326 it's still misaligned
because 6319->4326 is null.  But if I set my project CRS to ITRF2014,
then everything lines up because the NAD83->ITRF2014 transform is not
null.  (This is all from memory; I'm really sure of the effect and
mostly I use jp2 tiles and 6319.)
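
To put those numbers in perspective, here is a rough back-of-the-envelope
sketch.  It uses only the figures quoted above (about 1 m of spurious
shift, 15 cm pixels, 5 to 10 cm standard error); the function names are
hypothetical and the result is an order-of-magnitude illustration, not a
measurement.

```python
# Figures quoted in the discussion; all in metres.
SPURIOUS_SHIFT_M = 1.0            # approximate shift from the null transform
PIXEL_SIZE_M = 0.15               # resolution of the imagery
DATA_STD_ERROR_M = (0.05, 0.10)   # estimated standard error of the RTK data


def shift_in_pixels(shift_m: float, pixel_m: float) -> float:
    """Express a ground shift as a number of image pixels."""
    return shift_m / pixel_m


def shift_in_sigmas(shift_m: float, sigma_m: float) -> float:
    """Express a ground shift as a multiple of the data's standard error."""
    return shift_m / sigma_m


pixels = shift_in_pixels(SPURIOUS_SHIFT_M, PIXEL_SIZE_M)
print(f"shift is about {pixels:.1f} pixels")
for sigma in DATA_STD_ERROR_M:
    ratio = shift_in_sigmas(SPURIOUS_SHIFT_M, sigma)
    print(f"shift is about {ratio:.0f} x a {sigma:.2f} m standard error")
```

A shift of several pixels, and many times the standard error of the
data, is easy to see on screen, which matches the misalignment described.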

Right now, this need to say "we don't know" is causing real, avoidable
errors for a significant amount of data.  I'm not opposed to increasing
the formal error - that doesn't hurt.