[MetaCRS] Standard (and simple) format for conversion tests.
norm.olsen at autodesk.com
Wed Nov 4 15:53:24 EST 2009
Hello All . . .
I too am interested in a general test format and universal test case database. I believe a set of standard test cases would be a great thing for the MetaCRS project. For legal and other reasons, I believe we should simply qualify the "target" values of all of our test cases as suggested results with some generous tolerance values; and issue a disclaimer as to the accuracy of the published results.
Having wrestled with this problem for many years, I have some comments:
1> I prefer a simple .CSV type of test file format. The test file would then be a totally portable, non-binary text file (limited to 8-bit characters for more portability?), easily parsed in any language or application, and easily maintained using something like Excel, MySQL, etc. (anything that can export a simple .CSV file).
2> I would like to see a "test type" field in the record format which will support testing things in addition to the basic "convert this coordinate" test. Thus, datum shift tests, geoid height tests, grid scale, convergence, vertical datum tests, etc. could all be included in a single database.
3> We should strive for a "Source of Test Data" field requirement in the database which indicates where the test case data came from. The source should always (?) be something outside of the MetaCRS project base.
4> Test cases derived from the various projects of MetaCRS could/should be included and classified as being regression tests only.
5> Some sort of environment field would be nice. That is, a bit map sort of thing that would enable a program to skip certain tests based on the environment (the presence of the Canadian NTv2 data file, for example).
6> Separate tolerances on the source and target would be a nice idea, enabling an automatic inverse test for each test case. A simpler database would result, however, if we required separate entries in the database to test the forward and inverse cases. I prefer the latter, as inverse testing is not always appropriate and it supports item 9 below.
7> Test data values should be entered in the same form as the source material (to the degree possible), implying (for example) that geographic coordinates may be entered as degrees, minutes, and seconds or as decimal degrees.
8> Tolerances in the test database should be based on the quality or nature of the "Source of Test Data". It could be a serious legal issue if we publish something suggesting that this is the correct result.
9> None of our projects will produce the exact same result, nor will any other library match any of ours precisely. At this level I do not think it appropriate for MetaCRS to make the call as to which is the correct one. Therefore I suggest that the format be designed such that any library (MetaCRS or otherwise) can simply publish a file with the results it produces, as opposed to a Boolean condition indicating whether or not it meets the MetaCRS standard. It is then up to the consumer of that information to decide which one is correct. This may be an important legal issue as well. (Notice that EPSG has never included test cases in their database.)
10> Coordinate system references should be by EPSG number wherever possible. I suggest a format of the "EPSG:3745" type. In cases where this won't work, the test database should include a namespace qualifier followed by the definition:
PROJ4:'+proj=utm +zone=11 +datum=WGS84'
Test applications would, of course, skip any test whose referenced CRSs they are incapable of deciphering.
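To illustrate item 1, a record with some of the fields suggested above could be parsed with almost no machinery. The column layout and values below are purely hypothetical, invented only to show how little code a .CSV test format requires:

```python
import csv
import io

# Hypothetical column layout (names and values invented for illustration):
# test type, source of the test data, source CRS, source x/y,
# target CRS, target x/y, and a tolerance.
sample = io.StringIO(
    "testType,dataSource,srcCrs,srcX,srcY,trgCrs,trgX,trgY,tolerance\n"
    "CRS2D,EPSG Guidance Note 7,EPSG:32611,500000,3000000,"
    "EPSG:4326,-117.0,27.1,0.0001\n"
)

rows = list(csv.DictReader(sample))
for rec in rows:
    print(rec["testType"], rec["srcCrs"], "->", rec["trgCrs"],
          "tol =", rec["tolerance"])
```

Any tool that can emit this shape of file (Excel, MySQL, a hand editor) can contribute test cases.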
The CS-MAP distribution includes a test data file named TEST.DAT which includes a couple thousand test cases. The comments in this file usually indicate the "Source of Test Data" to some degree. Many need to be commented out due to environmental reasons, thus item 5 above.
From: metacrs-bounces at lists.osgeo.org [mailto:metacrs-bounces at lists.osgeo.org] On Behalf Of Frank Warmerdam
Sent: Wednesday, November 04, 2009 11:50 AM
To: Landon Blake
Cc: metacrs at lists.osgeo.org
Subject: Re: [MetaCRS] Standard (and simple) format for conversion tests.
Landon Blake wrote:
> I will be helping Martin Davis on some testing and improvements to
> Proj4J. One of my tasks will be to test some of the improvements we are
> making to the coordinate conversion calculations. I think this testing
> is currently being done with Java unit tests. A while back on this list
> I remember we discussed a simple format for test data that could be
> provided to software tests. I think the goal would be to assemble a
> standard library of test data files that could be used by different
> coordinate conversion projects.
> Is there still an interest in this?
I am interested in such a thing existing. In my Python script for
testing PROJ.4 (through OGRCoordinateTransformation) I have:
# Table of transformations, inputs and expected results (with a threshold)
# Each entry in the list should have a tuple with:
# - src_srs: any form that SetFromUserInput() will take.
# - (src_x, src_y, src_z): location in src_srs.
# - src_error: threshold for error when srs_x/y is transformed into dst_srs and
# then back into srs_src.
# - dst_srs: destination srs.
# - (dst_x,dst_y,dst_z): point that src_x/y should transform to.
# - dst_error: acceptable error threshold for comparing to dst_x/y.
# - unit_name: the display name for this unit test.
# - options: eventually we will allow a list of special options here (like one
# way transformation). For now just put None.
# - min_proj_version: string with minimum proj version required or null if unknown
transform_list = [
    # Simple straightforward reprojection.
    ('+proj=utm +zone=11 +datum=WGS84', (398285.45, 2654587.59, 0.0), 0.02,
     'WGS84', (-118.0, 24.0, 0.0), 0.00001,
     'UTM_WGS84', None, None),
    # Ensure that prime meridian changes are applied.
    ('EPSG:27391', (20000, 40000, 0.0), 0.02,
     'EPSG:4273', (6.397933, 58.358709, 0.000000), 0.00001,
     'NGO_Oslo_zone1_NGO', None, None),
    # Verify that 26592 "pcs.override" is working well.
    ('EPSG:26591', (1550000, 10000, 0.0), 0.02,
     'EPSG:4265', (9.449316, 0.090469, 0.00), 0.00001,
     'MMRome1_MMGreenwich', None, None),
]
I think one important thing is to provide an acceptable error threshold with
each test in addition to the expected output value. I also think each test
should support a chunk of arbitrary text which could be used to explain
the purpose of the test (special issues being examined) and point off
to a ticket or other relevant document.
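As a minimal sketch of how such a per-test threshold might be applied (the function name and tuple layout here are assumptions for illustration, not part of any existing format):

```python
def within_tolerance(actual, expected, threshold):
    """Return True if every component of 'actual' is within
    'threshold' of the corresponding component of 'expected'."""
    return all(abs(a - e) <= threshold for a, e in zip(actual, expected))

# e.g. comparing a reprojected point against the expected output
print(within_tolerance((-118.000004, 24.000003, 0.0),
                       (-118.0, 24.0, 0.0), 0.00001))  # -> True
```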
Actually one more thing is a name for the test, hopefully slightly
self-documenting. I suppose if each test is a distinct file, we
could use meaningful filenames.
The other dilemma is how to define the coordinate systems. I feel that
limiting things to EPSG-defined coordinate systems is a problem, though of
course otherwise we have serious problems with defining the coordinate
systems in an interoperable fashion. So perhaps starting with EPSG codes
is reasonable, with the understanding that eventually some tests might need
to be defined another way - perhaps OGC WKT.
If you wanted to roll out something preliminary I would be interested in
writing a Python script that would run the tests against OGR/PROJ.4.
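A generic runner over entries shaped like transform_list might look like the sketch below. The 'transform' argument is a placeholder for whatever library is under test (e.g. a wrapper around OGR's coordinate transformation), not part of any existing API; here it is exercised with a trivial identity function so the sketch is self-contained:

```python
def run_transform_tests(transform_list, transform):
    """Run each test case and report (name, passed) pairs.

    'transform' is a caller-supplied function:
        transform(src_srs, dst_srs, (x, y, z)) -> (x, y, z)
    Each entry follows the tuple layout documented above.
    """
    results = []
    for (src_srs, src_pt, src_err,
         dst_srs, dst_pt, dst_err,
         name, options, min_version) in transform_list:
        got = transform(src_srs, dst_srs, src_pt)
        # Compare each coordinate component against the dst threshold.
        ok = all(abs(g - e) <= dst_err for g, e in zip(got, dst_pt))
        results.append((name, ok))
    return results

# Trivial usage: an identity "transform" against an identity test case.
cases = [('EPSG:4326', (10.0, 20.0, 0.0), 0.001,
          'EPSG:4326', (10.0, 20.0, 0.0), 0.001,
          'identity', None, None)]
print(run_transform_tests(cases, lambda s, d, p: p))  # -> [('identity', True)]
```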
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | Geospatial Programmer for Rent