[PROJ] Is an unknown unit feasible?

Even Rouault even.rouault at spatialys.com
Thu Dec 1 02:36:02 PST 2022


Roger,

The GeoPackage spec mandates a SRS to be associated with feature tables 
(GDAL itself can manage layers without CRS, as the CSV driver proves). 
Cf http://www.geopackage.org/spec130/#_gpkg_geometry_columns where the 
gpkg_geometry_columns table where you register feature tables has a 
"srs_id INTEGER NOT NULL" column with a foreign key to the 
"gpkg_spatial_ref_sys" table.

Loud thinking, not sure if it is a good idea: one could potentially 
imagine that GDAL inserts a "INSERT INTO gpkg_spatial_ref_sys 
VALUES('NULL',-2,'GDAL',0,'undefined','NULL');" entry in the 
gpkg_spatial_ref_sys table when creating a layer with a geometry column 
but no CRS associated (currently it arbitrarily selects the "Unknown 
geographic CRS" provisioned by the GeoPackage spec with srs_id=0), and 
on reading, when finding the layer is associated with that dummy CRS, do 
not report it at all. But I can imagine that would cause 
interoperability issues with other software that wouldn't know what to 
do with that undefined CRS which is not one of the 2 defined in the 
spec, and it looks like such entry would violate the requirements of 
http://www.geopackage.org/spec130/#spatial_ref_sys where the 
"definition" column must be WKT, except for srs_id=0 or 1. Or one could 
insert a dummy 'LOCAL_CS["NULL"]' definition. But it could still be 
challenging for larger interoperability.

Regarding the unknown unit, you could potentially try using 
'ENGCRS["Undefined Cartesian SRS with unknown 
unit",EDATUM[""],CS[Cartesian,2],AXIS["(E)",east,ORDER[1],LENGTHUNIT["unknown",0]],AXIS["(N)",north,ORDER[2],LENGTHUNIT["unknown",0]]]' 
. I'm not completely sure the 0 conversion factor is legal. It looks 
like it passes the constraints of the WKT BNF grammar, but perhaps not 
its spirit or other underlying specs. However 0 or NaN will easily 
trigger "interesting" division-by-zero (I have fixed a number of issues 
reported by C/C++ sanitizers when PROJ attempts to do coordinate 
transformation when fed with fuzzed input CRS that contain such 
conversion_factor=0. The WKT CRS import code doesn't reject them 
upfront, to give some change to the user to access the CRS definition, 
but you can't definitely use them to do any sort of coordinate 
operation) or the usual issues with working with something that is not 
equal to itself (NaN) down the road though. 
https://lists.ogc.org/mailman/listinfo/coordtran.wg would probably be a 
better forum to discuss if unknown unit is an intentional or on-purpose 
omission.

Even

Le 01/12/2022 à 10:40, Roger Bivand a écrit :
> In running reverse dependency checks for 9.1.1 (all OK), and looking 
> at https://github.com/dieghernan/tidyterra/issues/64 with 
> https://github.com/r-spatial/sf/issues/2049, I have a (probably silly) 
> question.
>
> The problem that has appeared is that converting a set of features to 
> GPKG without a defined SRS gives "Undefined geographic SRS":
>
> test.csv is:
>
> Latitude,Longitude,Name
> 48.1,0.25,"First point"
> 49.2,1.1,"Second point"
> 47.5,0.75,"Third point"
>
> from https://gdal.org/drivers/vector/csv.html#vector-csv. If the 
> coordinates are outside the valid range for geographical cooordinates, 
> the same happens. Example:
>
> $ ogr2ogr -f GPKG test0.gpkg test.csv -oo X_POSSIBLE_NAMES=Lon* \
>  -oo Y_POSSIBLE_NAMES=Lat* -oo KEEP_GEOM_COLUMNS=NO
> $ ogrinfo -ro -al test0.gpkg
>
> In GDAL, -a_srs allows an "Undefined Cartesian SRS" to be inserted:
>
> $ ogr2ogr -f GPKG -a_srs 'LOCAL_CS["Undefined Cartesian SRS"]' \
>  test1.gpkg test.csv -oo X_POSSIBLE_NAMES=Lon* -oo 
> Y_POSSIBLE_NAMES=Lat* \
>  -oo KEEP_GEOM_COLUMNS=NO
> $ ogrinfo -ro -al test1.gpkg
>
> This might help (in R packages, we have always assumed that a missing 
> SRS means Cartesian), but:
>
> $ projinfo 'LOCAL_CS["Undefined Cartesian SRS"]'
> WKT2:2019 string:
> ENGCRS["Undefined Cartesian SRS",
>     EDATUM[""],
>     CS[Cartesian,2],
>         AXIS["(E)",east,
>             ORDER[1],
>             LENGTHUNIT["metre",1,
>                 ID["EPSG",9001]]],
>         AXIS["(N)",north,
>             ORDER[2],
>             LENGTHUNIT["metre",1,
>                 ID["EPSG",9001]]]]
>
> where LENGTHUNIT is "metre". Trying to edit a PROJJSON version of the 
> same, entering "", "unknown" for "unit" =, or omitting "unit" 
> altogether always leads to trouble at some point. Because GPKG (and 
> probably other OGR/GDAL drivers) expect that SRS is defined, falling 
> back on "Undefined geographic SRS" if nothing is given (which is 
> probably wrong much of the time but for which degree units are valid), 
> or can be tricked into "Undefined Cartesian SRS" with units which must 
> be known.
>
> There are plenty of legacy data sets in spatial statistics that are 
> planar but for which no units are known (typically raw digitizer 
> coordinates, subsequently saved as Esri Shapefile without a *.prj file 
> in ESRI WKT1). There are also cases where the exact positions must be 
> shielded from immediate recognition (individual patient data in 
> epidemiology, protected species, etc.) in which the units are say 
> converted to [0, 1] on the longest axis and other obfuscating 
> transformations are used. So the units really are "unknown" by design.
>
> This problem arises as file formats expect data sets to be provided 
> with valid SRS. We are trying to encourage users to follow this path, 
> but there are real cases where declaring a unit as "metre" when it is 
> unknown will lead to unforced errors of interpretation by subsequent 
> users of a file with this assertion.
>
> What might be the technical issues that could arise from an equivalent 
> to EDATUM[""] for LENGTHUNIT[""]? Could the conversion factor be for 
> example NaN, 7.4.2 in 
> http://docs.opengeospatial.org/is/18-010r7/18-010r7.html ? This 
> document also asserts "Where no implied unit can be inferred then in 
> this document the default implied linear unit shall be metre" (last 
> sentence in 7.4). Has the development of standards omitted to take 
> into account situations in which the Cartesian length unit really 
> should not be exposed, and where the imposition of "metre" is thus 
> unwarranted?
>
> Best wishes,
>
> Roger
>
-- 
http://www.spatialys.com
My software is free, but my time generally not.



More information about the PROJ mailing list