[PROJ] Is an unknown unit feasible?
Roger Bivand
Roger.Bivand at nhh.no
Thu Dec 1 01:40:11 PST 2022
In running reverse dependency checks for 9.1.1 (all OK), and looking at
https://github.com/dieghernan/tidyterra/issues/64 with
https://github.com/r-spatial/sf/issues/2049, I have a (probably silly)
question.
The problem that has appeared is that converting a set of features to GPKG
without a defined SRS gives "Undefined geographic SRS":
test.csv is:
Latitude,Longitude,Name
48.1,0.25,"First point"
49.2,1.1,"Second point"
47.5,0.75,"Third point"
from https://gdal.org/drivers/vector/csv.html#vector-csv. If the
coordinates are outside the valid range for geographical cooordinates, the
same happens. Example:
$ ogr2ogr -f GPKG test0.gpkg test.csv -oo X_POSSIBLE_NAMES=Lon* \
-oo Y_POSSIBLE_NAMES=Lat* -oo KEEP_GEOM_COLUMNS=NO
$ ogrinfo -ro -al test0.gpkg
In GDAL, -a_srs allows an "Undefined Cartesian SRS" to be inserted:
$ ogr2ogr -f GPKG -a_srs 'LOCAL_CS["Undefined Cartesian SRS"]' \
test1.gpkg test.csv -oo X_POSSIBLE_NAMES=Lon* -oo Y_POSSIBLE_NAMES=Lat* \
-oo KEEP_GEOM_COLUMNS=NO
$ ogrinfo -ro -al test1.gpkg
This might help (in R packages, we have always assumed that a missing SRS
means Cartesian), but:
$ projinfo 'LOCAL_CS["Undefined Cartesian SRS"]'
WKT2:2019 string:
ENGCRS["Undefined Cartesian SRS",
EDATUM[""],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
where LENGTHUNIT is "metre". Trying to edit a PROJJSON version of the
same, entering "", "unknown" for "unit" =, or omitting "unit" altogether
always leads to trouble at some point. Because GPKG (and probably other
OGR/GDAL drivers) expect that SRS is defined, falling back on "Undefined
geographic SRS" if nothing is given (which is probably wrong much of the
time but for which degree units are valid), or can be tricked into
"Undefined Cartesian SRS" with units which must be known.
There are plenty of legacy data sets in spatial statistics that are planar
but for which no units are known (typically raw digitizer coordinates,
subsequently saved as Esri Shapefile without a *.prj file in ESRI WKT1).
There are also cases where the exact positions must be shielded from
immediate recognition (individual patient data in epidemiology, protected
species, etc.) in which the units are say converted to [0, 1] on the
longest axis and other obfuscating transformations are used. So the units
really are "unknown" by design.
This problem arises as file formats expect data sets to be provided with
valid SRS. We are trying to encourage users to follow this path, but there
are real cases where declaring a unit as "metre" when it is unknown will
lead to unforced errors of interpretation by subsequent users of a file
with this assertion.
What might be the technical issues that could arise from an equivalent to
EDATUM[""] for LENGTHUNIT[""]? Could the conversion factor be for example
NaN, 7.4.2 in http://docs.opengeospatial.org/is/18-010r7/18-010r7.html ?
This document also asserts "Where no implied unit can be inferred then in
this document the default implied linear unit shall be metre" (last
sentence in 7.4). Has the development of standards omitted to take into
account situations in which the Cartesian length unit really should not be
exposed, and where the imposition of "metre" is thus unwarranted?
Best wishes,
Roger
--
Roger Bivand
Emeritus Professor
Department of Economics, Norwegian School of Economics,
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway.
e-mail: Roger.Bivand at nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
More information about the PROJ
mailing list