[PROJ] Is an unknown unit feasible?

Roger Bivand Roger.Bivand at nhh.no
Thu Dec 1 01:40:11 PST 2022


While running reverse dependency checks for 9.1.1 (all OK), and looking at 
https://github.com/dieghernan/tidyterra/issues/64 together with 
https://github.com/r-spatial/sf/issues/2049, I have a (probably silly) 
question.

The problem that has appeared is that converting a set of features to GPKG 
without a defined SRS gives "Undefined geographic SRS":

test.csv is:

Latitude,Longitude,Name
48.1,0.25,"First point"
49.2,1.1,"Second point"
47.5,0.75,"Third point"

taken from https://gdal.org/drivers/vector/csv.html#vector-csv. If the 
coordinates are outside the valid range for geographical coordinates, the 
same happens. Example:

$ ogr2ogr -f GPKG test0.gpkg test.csv -oo X_POSSIBLE_NAMES=Lon* \
  -oo Y_POSSIBLE_NAMES=Lat* -oo KEEP_GEOM_COLUMNS=NO
$ ogrinfo -ro -al test0.gpkg

In GDAL, -a_srs allows an "Undefined Cartesian SRS" to be inserted:

$ ogr2ogr -f GPKG -a_srs 'LOCAL_CS["Undefined Cartesian SRS"]' \
  test1.gpkg test.csv -oo X_POSSIBLE_NAMES=Lon* -oo Y_POSSIBLE_NAMES=Lat* \
  -oo KEEP_GEOM_COLUMNS=NO
$ ogrinfo -ro -al test1.gpkg

This might help (in R packages, we have always assumed that a missing SRS 
means Cartesian), but:

$ projinfo 'LOCAL_CS["Undefined Cartesian SRS"]'
WKT2:2019 string:
ENGCRS["Undefined Cartesian SRS",
     EDATUM[""],
     CS[Cartesian,2],
         AXIS["(E)",east,
             ORDER[1],
             LENGTHUNIT["metre",1,
                 ID["EPSG",9001]]],
         AXIS["(N)",north,
             ORDER[2],
             LENGTHUNIT["metre",1,
                 ID["EPSG",9001]]]]

where LENGTHUNIT is "metre". Trying to edit a PROJJSON version of the 
same, entering "" or "unknown" for "unit", or omitting "unit" altogether, 
always leads to trouble at some point. GPKG (and probably other OGR/GDAL 
drivers) expects the SRS to be defined: it falls back on "Undefined 
geographic SRS" if nothing is given (which is probably wrong much of the 
time, but for which degree units are at least valid), or it can be tricked 
into "Undefined Cartesian SRS", for which the units must still be known.

There are plenty of legacy data sets in spatial statistics that are planar 
but for which no units are known (typically raw digitizer coordinates, 
subsequently saved as Esri Shapefile without a *.prj file in ESRI WKT1). 
There are also cases where the exact positions must be shielded from 
immediate recognition (individual patient data in epidemiology, protected 
species, etc.), in which the coordinates are, say, rescaled to [0, 1] on 
the longest axis and other obfuscating transformations are applied. So the 
units really are "unknown" by design.
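
As an illustration of the rescaling case only (a minimal sketch, not a 
recipe taken from any particular data set): the awk call below maps the 
three test.csv points so that the longer axis spans [0, 1]. The output 
name scaled.csv and the column names X and Y are assumptions made for the 
example, and no further obfuscating transformation is applied.

```shell
# Recreate the three-row test.csv quoted earlier in this message
# (this overwrites any test.csv in the current directory).
cat > test.csv <<'EOF'
Latitude,Longitude,Name
48.1,0.25,"First point"
49.2,1.1,"Second point"
47.5,0.75,"Third point"
EOF

# Shift both axes to start at 0 and divide by the longer of the two
# extents, so the longer axis runs from 0 to 1 and the shape is kept.
# scaled.csv and the X/Y column names are hypothetical.
LC_ALL=C awk -F, '
NR == 1 { next }                          # skip the header row
{
    y[NR] = $1; x[NR] = $2; name[NR] = $3
    if (NR == 2 || $2 + 0 < xmin) xmin = $2 + 0
    if (NR == 2 || $2 + 0 > xmax) xmax = $2 + 0
    if (NR == 2 || $1 + 0 < ymin) ymin = $1 + 0
    if (NR == 2 || $1 + 0 > ymax) ymax = $1 + 0
}
END {
    span = (xmax - xmin > ymax - ymin) ? xmax - xmin : ymax - ymin
    print "X,Y,Name"
    for (i = 2; i <= NR; i++)
        printf "%.6f,%.6f,%s\n", (x[i] - xmin) / span, (y[i] - ymin) / span, name[i]
}' test.csv > scaled.csv
```

Without the original extent, the values in scaled.csv no longer correspond 
to any named length unit, which is the situation in which a declared 
"metre" would mislead.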

This problem arises as file formats expect data sets to be provided with 
valid SRS. We are trying to encourage users to follow this path, but there 
are real cases where declaring a unit as "metre" when it is unknown will 
lead to unforced errors of interpretation by subsequent users of a file 
with this assertion.

What technical issues might arise from a LENGTHUNIT[""] equivalent to 
EDATUM[""]? Could the conversion factor be, for example, NaN (see 7.4.2 in 
http://docs.opengeospatial.org/is/18-010r7/18-010r7.html)? That document 
also asserts "Where no implied unit can be inferred then in this document 
the default implied linear unit shall be metre" (last sentence in 7.4). 
Has the development of standards omitted to take into account situations 
in which the Cartesian length unit really should not be exposed, and where 
the imposition of "metre" is thus unwarranted?

Best wishes,

Roger

-- 
Roger Bivand
Emeritus Professor
Department of Economics, Norwegian School of Economics,
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway.
e-mail: Roger.Bivand at nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

