[GRASS-dev] Test suite for GRASS - proposal, discussion welcome

Mon Jun 13 15:50:51 EDT 2011

Hello,

>> I assume such critical modules are coded in the framework, not in the
>> test scripts?
>
> I was thinking about a directive (e.g. "@critical") in the test script
> so that any failure during the test would generate a more prominent

Annotations/directives of this kind are a great idea. I was thinking
about a similar approach: adding such directives to test cases to
identify data preprocessing steps, test calls and critical modules.
I would like to use them as part of the test case documentation:

r.mapcalc_test.sh
{{{
# This test case is designed to test r.mapcalc,
# which is @critical for many other tests.

# We need to perform a @preprocess step
# with g.region to set up a specific LL test region
g.region s=0 w=0 n=90 e=180 res=1

# The first @test generates a CELL raster map with value 1
r.mapcalc expression="result1 = 1"
...
}}}

IMHO each test case should be well documented, so why not use
annotations as part of the documentation? Additionally, I would like to
add the tests to the bottom of the HTML manual pages automatically as
examples.
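
A minimal sketch of how a framework could read such annotations from the
comments of a test script. The directive names are the ones from the example
above; the function name and the rule that a directive applies to the next
command are only assumptions for illustration, not an agreed design.

```python
import re

# Directive names from the example test script; the parsing rules below
# are assumptions made for this sketch.
DIRECTIVE_RE = re.compile(r"@(critical|preprocess|test)\b")


def parse_test_script(text):
    """Return (critical, steps); steps is a list of (directives, command)."""
    critical = False   # file-level flag set by @critical
    steps = []
    pending = []       # directives seen since the last command
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            for name in DIRECTIVE_RE.findall(stripped):
                if name == "critical":
                    critical = True
                else:
                    pending.append(name)
        else:
            # A non-comment line is a command; attach pending directives.
            steps.append((pending, stripped))
            pending = []
    return critical, steps


SCRIPT = """\
# This test case is designed to test r.mapcalc,
# which is @critical for many other tests.

# We need to perform a @preprocess step
# with g.region to set up a specific LL test region
g.region s=0 w=0 n=90 e=180 res=1

# The first @test generates a CELL raster map with value 1
r.mapcalc expression="result1 = 1"
"""

critical, steps = parse_test_script(SCRIPT)
print("critical:", critical)
for directives, command in steps:
    print(directives, command)
```

The comment text itself stays untouched, so the same lines could still be
used as documentation.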

> message. If any such errors occurred as a result of "make test", you
> would ignore all the other failures. In the same way that if error.log
> has an error for e.g. lib/gis, you wouldn't bother about all of the
> subsequent errors and focus on what was wrong with lib/gis.

Is it possible to stop "make test" if a library test or a critical
module test fails?

>
>> But this also means that the test scripts
>> must be interpreted and executed line by line by the framework to
>> identify critical modules used for data generation?
>
> Test failures should not occur for critical modules. If they do, you
> deal with the critical module, and ignore everything else until that
> has been dealt with.

Indeed. I would suggest putting critical modules at the top of the
directory makefiles to ensure that they are executed first and that
recursive testing stops when one of them fails.
If any library test or critical module test fails, no further module
tests should be performed.
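
A rough sketch of that ordering and stop rule, written in Python rather than
make for brevity. The function name, the (name, is_critical) tuples and the
run callback are hypothetical; library tests are simply flagged as critical
here.

```python
def run_suite(tests, run):
    """Run critical tests first and stop as soon as one of them fails.

    tests: list of (name, is_critical) tuples
    run:   callable taking a test name and returning True on success
    """
    # sorted() is stable: critical tests move to the front, the original
    # order is kept otherwise.
    ordered = sorted(tests, key=lambda t: not t[1])
    results = {}
    for name, is_critical in ordered:
        ok = run(name)
        results[name] = ok
        if is_critical and not ok:
            break  # do not run any further tests
    return results


tests = [("r.series", False), ("lib/gis", True), ("r.mapcalc", True)]
# Pretend that only the r.mapcalc test fails.
results = run_suite(tests, run=lambda name: name != "r.mapcalc")
print(results)
```

In this run r.series is never executed, because the critical r.mapcalc test
failed before it was reached.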

>
> The test scripts would need to be processed a command at a time for
> other reasons (assuming that the framework is going to be doing more
> than simply executing the commands).

I had in mind that the return value of each command is checked and
stderr is logged for further analysis. The framework must be able to
identify which commands failed or succeeded, and for which commands
data validation was available and successful. This information should
be available in the detailed, test-case-specific HTML log files.
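
A minimal sketch of the per-command bookkeeping, assuming the commands are
available as argument lists. The function name and the log layout are
hypothetical, and the demo calls the Python interpreter instead of GRASS
modules so that it runs anywhere.

```python
import subprocess
import sys


def run_commands(commands):
    """Run each command, recording its return code and stderr output."""
    log = []
    for argv in commands:
        proc = subprocess.run(argv, capture_output=True, text=True)
        log.append({
            "command": argv,
            "returncode": proc.returncode,
            "stderr": proc.stderr,
            "ok": proc.returncode == 0,
        })
    return log


# Stand-ins for module calls: one succeeds, one fails with a message.
commands = [
    [sys.executable, "-c", "print('fine')"],
    [sys.executable, "-c",
     "import sys; sys.stderr.write('boom'); sys.exit(2)"],
]
log = run_commands(commands)
for entry in log:
    print(entry["ok"], entry["returncode"], repr(entry["stderr"]))
```

A list of entries like this could then be rendered into the HTML log.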

>
>> Example for a synthetic r.series test using r.mapcalc for data
>> generation. r.mapcalc is marked as critical in the framework:
>>
>> In case r.mapcalc is marked as critical and the framework finds the
>> keyword "r.mapcalc" in the script, appearing as first word outside
>> of a comment, it checks if the r.mapcalc test(s) have already run
>> correctly and stops the r.series test if they have not.
>
> I wouldn't bother with this part. If the user runs "make test" from
> the top level, r.mapcalc's tests will end up getting run. If they
> fail, then the user will get an error message informing them that a
> critical module failed and that they should ignore everything else
> until that has been addressed.
>
> If you're doing repeated tests on a specific module that you're
> working on, you don't want to be re-running the r.mapcalc, r.out.ascii
> etc tests every time.

I don't know if this can be avoided in an automated test system,
especially when a temporary mapset is created each time a test case is
executed. The exception is when the developer comments out the
preprocessing steps and executes the script manually in the test
location.

>
>> In case r.mapcalc
>> tests are valid it starts the r.mapcalc commands and checks their
>> return values. If the return values are correct, then the rest of the
>> script is executed. After reaching the end of this script the
>> framework looks for any generated data in the current mapset (raster,
>> raster3d, vector, color, regions, ...) and looks for corresponding
>> validation files in the test directory. In this case it will find the
>> raster maps input1, input2 and result in the current mapset and
>> validation.ref in the test directory. It will use r.out.ascii on
>> result map choosing a low precision (dp=3??) and compares the output
>> with result.ref which was hopefully generated using the same
>> precision.
>
> Only result.ref would exist, so there's no need to export and validate
> input1 and input2. In general, you don't need to traverse the entire
> mapset directory, but only test for specific files for which a
> corresponding ".ref" file exists.

That's indeed much more efficient.
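
A small sketch of that lookup: only names that have a ".ref" file in the
test directory are selected for validation. The function name is
hypothetical, and mapping "result.ref" to a map called "result" is an
assumption.

```python
import tempfile
from pathlib import Path


def maps_to_validate(test_dir):
    """Return the names for which a corresponding .ref file exists."""
    return sorted(p.stem for p in Path(test_dir).glob("*.ref"))


# Demo with a throwaway directory holding one reference file and a script.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("result.ref", "r.series_test.sh"):
        Path(tmp, name).write_text("")
    found = maps_to_validate(tmp)
print(found)
```

With this rule input1 and input2 are never exported, because no reference
files exist for them.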

>
> I'd export as much precision as is likely to be meaningful, erring on
> the side of slightly too much precision. The default comparison
> tolerance should be just large enough that it won't produce noise for
> the majority of modules. Modules which require more tolerance (e.g.
> due to numerical instability) should explicitly enlarge the tolerance
> and/or set an "allowed failures" limit.

Where should the precision be set in the test case? As a @precision
directive that applies to every test in the test case file, or as an
environment variable? The hierarchical Python class solution discussed
earlier would provide a specific function for this case.
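
To make the two options concrete, here is a sketch that supports both. The
lookup order (directive first, then environment variable, then a default),
the variable name GRASS_TEST_PRECISION and the default of 6 are all
assumptions for illustration, not existing settings.

```python
import os
import re

# Matches comment lines such as "# @precision=3" or "# @precision 3".
PRECISION_RE = re.compile(r"^\s*#.*@precision[=\s]+(\d+)")


def resolve_precision(script_text, environ=None, default=6):
    """Return the export precision for a test case file."""
    environ = os.environ if environ is None else environ
    # 1. A @precision directive in the test case file wins.
    for line in script_text.splitlines():
        match = PRECISION_RE.match(line)
        if match:
            return int(match.group(1))
    # 2. Otherwise fall back to the (hypothetical) environment variable.
    if "GRASS_TEST_PRECISION" in environ:
        return int(environ["GRASS_TEST_PRECISION"])
    # 3. Otherwise use the framework default.
    return default


print(resolve_precision("# @precision=3\nr.univar map=result", {}))
print(resolve_precision("r.univar map=result", {"GRASS_TEST_PRECISION": "5"}))
print(resolve_precision("r.univar map=result", {}))
```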

> Any directory with a Makefile should support "make test" one way or
> another. Usually via appropriate rules in Lib.make, Module.make,
> Dir.make, etc. Dir.make would just run "make test" recursively (see
> the %-recursive pattern rule in Dir.make); the others would look for a
> test script then use the framework to execute it.
>
> The top-level Makefile includes Dir.make, so "make test" would use the
> recursive rule (a special rule for testing critical modules could be
> added as a prerequisite). Testing the libraries would just be
> "make -C lib test" (i.e. recursive in the "lib" directory); similarly
> for raster, vector, etc.

Yes. Do we need special rules for critical modules, or is the order of
the module directories in the Makefiles combined with a critical
annotation sufficient?

>
>> Two test locations (LL and UTM?)
>
> Possibly X-Y as well; even if we don't add any test data to them, a
> test script can create a map easier than creating a location.
>
>> Each module and library has its own test directory. The test
>> directories contain the test cases, reference text files and data for
>> import (for *.in.* modules).
>
> I'm not sure we need a separate subdirectory for the test data.

I am sure we need several. I would suggest a separate test directory
for each test location: "test_UTM" and "test_LL". Several modules will
only work in UTM locations, others in both. Each directory may contain
several test case files for different modules (r.univar/r3.univar) and
several .ref files.

>
>> ** Equal and almost equal key value tests (g.region -g, r.univar, ...)
>> of text files <-- i am not sure how to realize this
>
> Equal is easy, but almost equal requires being able to isolate
> numbers, and possibly determine their context so that context-specific
> comparison parameters can be used.

The framework should be able to identify the shell-style key=value
output available in several modules and compare it for almost equality:

north=234532.45
south=5788374.45
...
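
A minimal sketch of such a comparison, assuming one key=value pair per line.
The function names and the relative tolerance of 1e-6 are hypothetical;
values that are not numbers are compared as plain strings.

```python
import math


def parse_key_values(text):
    """Parse shell-style key=value lines into a dictionary of strings."""
    pairs = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            pairs[key.strip()] = value.strip()
    return pairs


def almost_equal_key_values(result, reference, rel_tol=1e-6):
    """Compare two key=value texts, numbers within a relative tolerance."""
    res, ref = parse_key_values(result), parse_key_values(reference)
    if res.keys() != ref.keys():
        return False
    for key in ref:
        try:
            a, b = float(res[key]), float(ref[key])
        except ValueError:
            # At least one value is not numeric: require an exact match.
            if res[key] != ref[key]:
                return False
        else:
            if not math.isclose(a, b, rel_tol=rel_tol):
                return False
    return True


reference = "north=234532.45\nsouth=5788374.45\nproj=utm"
close = almost_equal_key_values(
    "north=234532.4500001\nsouth=5788374.45\nproj=utm", reference)
different = almost_equal_key_values(
    "north=234533.45\nsouth=5788374.45\nproj=utm", reference)
print(close, different)
```

Context-specific tolerances, as mentioned above, could be added by passing a
per-key tolerance instead of a single rel_tol.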

Almost-equal comparison for raster, voxel and vector maps must be
implemented using the precision option of the *.out.ascii modules. In
the case of database output, I am not sure how to implement
almost-equal comparison for floating point data.

>
>> v.random output=random_points n=100
>> v.info -t random_points > result
>
> If the framework understands vector maps, it shouldn't be necessary to
> use v.info; it should be able to compare the random_points map to
> random_points.ref.

v.info may not be necessary in case a seed option is available for v.random.

>
> One thing that I hadn't thought about much until now is that maps can
> have a lot of different components, different modules affect different
> components, and the correct way to perform comparisons would vary
> between components.

Do you refer to different feature types in vector maps? Point, line,
border, area, centroid and so on?

>
> Having the framework auto-detect maps doesn't tell it which components
> of the maps it should be comparing. But having the test script perform
> export to text files doesn't tell the framework anything about how to
> perform the comparison (unless the framework keeps track of which
> commands generated which text file, and has specific rule for specific
> commands).

IMHO it's not good framework design if the framework has to know
specific rules for specific commands.

>
> The only "simple" solution (in terms of writing test scripts) is to
> have the framework compare all components of the map against the
> reference, which means that the export needs to be comprehensive (i.e.
> include all metadata).

This is the solution I have in mind. But in the case of vector data
we may need to combine v.out.ascii format=standard + v.info +
db.select.

Best regards
Soren