[GRASS-dev] Test suite for GRASS - proposal, discussion welcome

Glynn Clements glynn at gclements.plus.com
Thu Jun 9 04:01:20 EDT 2011


Soeren Gebbert wrote:

> > Dependencies aren't really an issue. You build all of GRASS first,
> > then test. Any modules which are used for generating test maps or
> > analysing data are assumed to be correct (they will have test cases of
> > their own; the most that's required is that such modules are marked as
> > "critical" so that any failure will be presumed to invalidate the
> > results of all other tests).
> 
> I assume such critical modules are coded in the framework, not in the
> test scripts?

I was thinking about a directive (e.g. "@critical") in the test
script so that any failure during the test would generate a more
prominent message. If any such errors occurred as a result of "make
test", you would ignore all the other failures, in the same way that
if error.log has an error for e.g. lib/gis, you wouldn't bother with
all of the subsequent errors but would focus on what was wrong with
lib/gis.
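
For illustration, a critical module's test script might then look
something like this (only the "@critical" directive is proposed
syntax; the commands themselves are made up for the example):

    # @critical
    # Any failure in this script invalidates all other test results.
    g.region n=10 s=0 e=10 w=0 res=1
    r.mapcalc "const5 = 5"
    r.out.ascii input=const5 dp=3 > result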

> But this also means that the test scripts
> must be interpreted and executed line by line by the framework to
> identify critical modules used for data generation?

Test failures should not occur for critical modules. If they do, you
deal with the critical module, and ignore everything else until that
has been dealt with.

The test scripts would need to be processed a command at a time for
other reasons (assuming that the framework is going to be doing more
than simply executing the commands).
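
As a rough sketch, the framework could step through a script along
these lines (the directive handling and the "record_failure" helper
are hypothetical):

    while read -r line; do
        case "$line" in
            "# @critical"*) critical=1 ;;  # note the directive
            "#"*|"") ;;                    # skip comments/blank lines
            *) eval "$line" || record_failure "$line" ;;
        esac
    done < test.sh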

> Example of a synthetic r.series test using r.mapcalc for data
> generation. r.mapcalc is marked as critical in the framework:
> 
> In case r.mapcalc is marked as critical and the framework finds the
> keyword "r.mapcalc" in the script, appearing as the first word
> outside of a comment, it checks whether the r.mapcalc test(s)
> already ran correctly and stops the r.series test if they did not.

I wouldn't bother with this part. If the user runs "make test" from
the top level, r.mapcalc's tests will end up getting run. If they
fail, then the user will get an error message informing them that a
critical module failed and that they should ignore everything else
until that has been addressed.

If you're doing repeated tests on a specific module that you're
working on, you don't want to be re-running the r.mapcalc, r.out.ascii
etc tests every time.

> In case r.mapcalc
> tests are valid it starts the r.mapcalc commands and checks their
> return values. If the return values are correct, then the rest of the
> script is executed. After reaching the end of this script the
> framework looks for any generated data in the current mapset (raster,
> raster3d, vector, color, regions, ...) and looks for corresponding
> validation files in the test directory. In this case it will find the
> raster maps input1, input2 and result in the current mapset and
> validation.ref in the test directory. It will use r.out.ascii on
> the result map, choosing a low precision (dp=3??), and compare the
> output with result.ref, which was hopefully generated using the
> same precision.

Only result.ref would exist, so there's no need to export and validate
input1 and input2. In general, you don't need to traverse the entire
mapset directory, but only test for specific files for which a
corresponding ".ref" file exists.
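
I.e. drive the comparison from the reference files rather than from
the mapset contents; roughly (assuming raster maps for simplicity;
the comparison helper is hypothetical):

    for ref in *.ref; do
        map=${ref%.ref}
        r.out.ascii input="$map" dp=6 > "$map.txt"
        compare_almost_equal "$map.txt" "$ref"
    done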

I'd export as much precision as is likely to be meaningful, erring on
the side of slightly too much precision. The default comparison
tolerance should be just large enough that it won't produce noise for
the majority of modules. Modules which require more tolerance (e.g. 
due to numerical instability) should explicitly enlarge the tolerance
and/or set an "allowed failures" limit.
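
A minimal sketch of such a comparison, assuming both files were
exported with the same layout and using an arbitrary default
tolerance:

    r.out.ascii input=result dp=6 > result.txt
    paste result.txt result.ref | awk -v tol=1e-6 '
        { half = NF / 2
          for (i = 1; i <= half; i++)
              if (($i - $(i + half))^2 > tol^2) {
                  print "line " NR ": " $i " vs " $(i + half)
                  exit 1
              } }'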

> The test framework will be integrated in the source code of grass and
> will use the make system to execute tests.
> The make system should be used to:
> * run single module or library tests
> * run all module (raster|vector|general|db ...) tests
> * run all library tests
> * run all tests (library then modules)
> * in case of an all-modules-test it should run critical module tests
> automatically first

Any directory with a Makefile should support "make test" one way or
another, usually via appropriate rules in Lib.make, Module.make,
Dir.make, etc. Dir.make would just run "make test" recursively (see
the %-recursive pattern rule in Dir.make); the others would look for
a test script, then use the framework to execute it.

The top-level Makefile includes Dir.make, so "make test" would use the
recursive rule (a special rule for testing critical modules could be
added as a prerequisite). Testing the libraries would just be
"make -C lib test" (i.e. recursive in the "lib" directory); similarly
for raster, vector, etc.
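
In practice that gives the usual entry points (the module path is
just an example):

    make test                     # everything, via the recursive rule
    make -C lib test              # all library tests
    make -C raster test           # all raster module tests
    make -C raster/r.series test  # a single module's tests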

> Two test locations (LL and UTM?)

Possibly X-Y as well; even if we don't add any test data to them, a
test script can create a map more easily than it can create a
location.
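
E.g. in an X-Y location a script can synthesize its own input in two
commands (the region size and the expression are arbitrary):

    g.region n=10 s=0 e=10 w=0 res=1
    r.mapcalc "input1 = row() * col()"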

> Each module and library has its own test directory. The test
> directories contain the test cases, reference text files and data for
> import (for *.in.* modules).

I'm not sure we need a separate subdirectory for the test data.

> ** Equal and almost equal key value tests (g.region -g, r.univar, ...)
> of text files <-- I am not sure how to realize this

Equal is easy, but almost equal requires being able to isolate
numbers, and possibly determine their context so that context-specific
comparison parameters can be used.
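
As a sketch, key=value output such as that of "g.region -g" or
"r.univar -g" could be compared pairwise against a reference with a
single global tolerance (the file names and the tolerance are
assumptions; per-key parameters would need more machinery):

    r.univar -g map=result > result.txt
    paste -d' ' result.txt result.ref | tr '=' ' ' | awk -v tol=1e-6 '
        $1 != $3 { print "key mismatch: " $1 " vs " $3; exit 1 }
        ($2 - $4)^2 > tol^2 { print $1 ": " $2 " vs " $4; exit 1 }'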

> v.random output=random_points n=100
> v.info -t random_points > result

If the framework understands vector maps, it shouldn't be necessary to
use v.info; it should be able to compare the random_points map to
random_points.ref.

One thing that I hadn't thought about much until now is that maps can
have a lot of different components, different modules affect different
components, and the correct way to perform comparisons would vary
between components.

Having the framework auto-detect maps doesn't tell it which components
of the maps it should be comparing. But having the test script perform
export to text files doesn't tell the framework anything about how to
perform the comparison (unless the framework keeps track of which
commands generated which text file, and has specific rules for
specific commands).

The only "simple" solution (in terms of writing test scripts) is to
have the framework compare all components of the map against the
reference, which means that the export needs to be comprehensive (i.e. 
include all metadata).
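
For a vector map, that would mean one export per component, e.g.
(this particular set of commands is only a sketch):

    v.out.ascii input=random_points format=standard  # geometry
    v.info -t map=random_points       # topology counts
    v.info -c map=random_points       # attribute column definitions
    v.db.select map=random_points     # attribute table contents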

-- 
Glynn Clements <glynn at gclements.plus.com>

