[gdal-dev] GDAL testing

Kurt Schwehr schwehr at gmail.com
Fri Oct 16 21:14:37 PDT 2015


Mateusz,

I'll try to do my best to answer inline.  What we have with autotest is
super important.  I've been trying to figure out how we take the first 17
years of knowledge built up in GDAL and keep it going forward over the
next couple of decades.  What will help people contribute to GDAL in the
future?  I started autotest2 with the assumption that the GDAL community
might not be interested in including it in the main GDAL tree, might want
to restructure it before including it, or might only want some pieces.
I'm fine with any of that.  It's been really helpful for me so far.

Not sure these are the best answers, but here is what I've written off the
top of my head.  I think I'm being a bit redundant.

However, above everything else, we can't give up the coverage that we
currently have.  It's a lot and it's taken a huge effort to get to this
point, but I think it is not nearly enough.  I think that means that
autotest in its current form will be with us for the foreseeable future.
Nor is it enough to just get the Coverity issue count to 0.

-kurt

On Fri, Oct 16, 2015 at 6:01 AM, Mateusz Loskot <mateusz at loskot.net> wrote:

> Hi Kurt,
>
> I'm interested in this topic, so I'd like to pull some more details to
> better
> understand the idea, if you don't mind.
>
>
> Kurt Schwehr-2 wrote
> > For my production work, I'm not able to use the autotest python code
> > because of its non-unittest architecture.
>
> Could you elaborate more on what properties exactly are missing
> from the current autotest suite?
> IOW, what kind of gap autotest2 fills.
>
>
Things that are missing from autotest python code:

- Because it is not based on python unittest/unittest2/mock, a lot of
tools don't know how to discover the tests.  nose
<https://pypi.python.org/pypi/nose/> and pytest <http://pytest.org/>
exist, and lots of shops have their own test runners, e.g. Google
internally uses Blaze (Bazel is its open-source release).  I tried writing
a thin unittest wrapper and it turned into a mess.  I'm sure it can be
done, but the environment I work in is super constrained.  And the test
reporting with unittest is a lot stronger than what we have with autotest.
- It has a syntax that is unfamiliar to the average python coder.  The
JUnit style is super common, so there are more folks available who really
know unittest based tools.
- There is a lot of well-developed, robust utility in the
unittest/unittest2 infrastructure.
- Isolated tests.  A lot of the current tests cascade, which makes running
tests in parallel much more difficult.
- The python mock library is pretty great for testing things like database
interactions.  We could have standard mocks that would help folks test
their python code that uses gdal+{mysql,postgis,mssql,oracle,etc}.  That
could really accelerate development for many people.  (See the sketch
after this list.)
- unittest and other related tools have nice ways of running subsets of
tests.
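
To make that concrete, here is a minimal sketch, not actual autotest2
code, of what a unittest-style GDAL test can look like.  The class and
test names are made up; the point is that standard runners can discover
it, setUp/tearDown keeps each test isolated, and collaborators can be
replaced with standard mocks:

import unittest
from unittest import mock  # On Python 2, use the external mock package.

from osgeo import gdal


class MemDriverTest(unittest.TestCase):
    """Hypothetical example; discoverable by nose, pytest, unittest, etc."""

    def setUp(self):
        # Every test gets a fresh in-memory dataset; nothing is shared.
        self.ds = gdal.GetDriverByName('MEM').Create('', 50, 3)

    def tearDown(self):
        self.ds = None  # Close the dataset.

    def testSize(self):
        self.assertEqual(50, self.ds.RasterXSize)
        self.assertEqual(3, self.ds.RasterYSize)

    def testMockedCollaborator(self):
        # A stand-in for, say, a database connection: no live server needed.
        conn = mock.Mock()
        conn.execute.return_value = []
        self.assertEqual([], conn.execute('SELECT 1'))
        conn.execute.assert_called_once_with('SELECT 1')


if __name__ == '__main__':
    unittest.main()

Running a subset is then just e.g.
python -m unittest some_module.MemDriverTest.testSize.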

Python unittest can do all of the kinds of testing you describe.  It
doesn't handle UI testing or integration testing across wildly different
libraries well, but it is just as capable as what is currently in autotest.

It's worth taking a look at some examples.  e.g. Fiona's tests
<https://github.com/Toblerity/Fiona/tree/master/tests>



> Yes, autotest is not a unit testing suite.
> To me, it combines functional/integration/blackbox/system/regression
> testing.
>
>
> Kurt Schwehr-2 wrote
> > So... I started creating python
> > unittest and C++ gunit based tests.
>
> Since you mention unittest, then gmock too?
>
>
Yes, gunit and gmock are usually used together.


> Kurt Schwehr-2 wrote
> > The tests are more focused on test isolation than autotest.  This allows
> > for a lot more parallelism in testing.
>
> Isolation at what level?
> I suppose, you mean test case isolation, but not isolation of code/unit
> under test.
>

Both types of isolation would be nice.  GDAL isn't designed for isolated
component testing of the drivers.  I've mostly been able to do isolated
testing on ports/cpl_*.  The biggest isolation issue is between the test
functions/methods themselves: tests reuse the same filenames or share
global variables, so one test creates a file and follow-on tests depend on
it.  Some of these show up like this:

cd autotest
egrep 'gdaltest.[a-z]+_ds =' */*.py | grep -v None | head
gdrivers/fast.py:    gdaltest.fast_ds = gdal.Open( 'data/L71118038_03820020111_HPN.FST' )
gdrivers/georaster.py:    gdaltest.oci_ds = ogr.Open( os.environ.get('OCI_DSNAME') )
gdrivers/mem.py:    gdaltest.mem_ds = drv.Create( 'mem_1.mem', 50, 3 )
gdrivers/pcidsk.py:    gdaltest.pcidsk_ds = driver.Create( 'tmp/pcidsk_5.pix', 400, 600, 1,
gdrivers/pcidsk.py:    gdaltest.pcidsk_ds = gdal.Open( 'tmp/pcidsk_5.pix', gdal.GA_Update )
gdrivers/pcidsk.py:    gdaltest.pcidsk_ds = gdal.Open( 'tmp/pcidsk_5.pix', gdal.GA_Update )
gdrivers/pcidsk.py:    gdaltest.pcidsk_ds = gdal.Open( 'tmp/pcidsk_5.pix', gdal.GA_Update )
gdrivers/vrtrawlink.py:    gdaltest.rawlink_ds = gdal.Open( 'tmp/rawlink.vrt', gdal.GA_Update )
gdrivers/vrtrawlink.py:    gdaltest.rawlink_ds = gdal.Open( 'tmp/rawlink.vrt', gdal.GA_Update )
gdrivers/vrtwarp.py:    gdaltest.vrtwarp_ds = gdal.AutoCreateWarpedVRT( gcp_ds )

egrep 'gdaltest.[a-z]+_ds =' */*.py | grep -v None | tail
ogr/ogr_sqlite.py:    gdaltest.sl_ds = ogr.Open( 'tmp/sqlite_test.db', update = 1 )
ogr/ogr_sqlite.py:    gdaltest.sl_ds = ogr.Open( 'tmp/sqlite_test.db' )
ogr/ogr_sqlite.py:    gdaltest.sl_ds = ogr.Open( 'tmp/sqlite_test.db', update = 1 )
ogr/ogr_sqlite.py:    gdaltest.sl_ds = ogr.Open( 'tmp/sqlite_test.db', update = 1 )
ogr/ogr_svg.py:        gdaltest.svg_ds = ogr.Open( 'data/test.svg' )
ogr/ogr_sxf.py:        gdaltest.sxf_ds = ogr.Open( 'data/100_test.sxf' )
ogr/ogr_vfk.py:    gdaltest.vfk_ds = ogr.Open('data/bylany.vfk')
ogr/ogr_vfk.py:    gdaltest.vfk_ds = ogr.Open('data/bylany.vfk')
ogr/ogr_vrt.py:        gdaltest.vrt_ds = ogr.Open( 'data/vrt_test.vrt' )
ogr/ogr_wasp.py:    gdaltest.wasp_ds = wasp_drv.CreateDataSource( 'tmp.map' )
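
For contrast, here is a rough sketch of the same kind of driver test
written without module-level globals and without fixed tmp/ paths.  The
test name is hypothetical; the point is that each test owns its own
scratch directory, so ordering and parallelism stop mattering:

import os
import shutil
import tempfile
import unittest

from osgeo import gdal


class PcidskCreateTest(unittest.TestCase):

    def setUp(self):
        # A unique scratch directory per test instead of a shared tmp/.
        self.tmp_dir = tempfile.mkdtemp()
        self.addCleanup(shutil.rmtree, self.tmp_dir)

    def testCreateAndReopen(self):
        path = os.path.join(self.tmp_dir, 'test.pix')
        ds = gdal.GetDriverByName('PCIDSK').Create(path, 400, 600, 1)
        ds = None  # Flush and close before reopening.
        ds = gdal.Open(path, gdal.GA_Update)
        self.assertEqual(400, ds.RasterXSize)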




>
> So, the former means each test case runs in isolated environment,
> with exclusive access to files/databases/memory/resources required by
> particular test.
> For example, each run of ogr_mysql.py, also in parallel, exclusively
> targets
> a 'unique' database.
>
> Correct?
>
>
Correct.  Or, at least, that it is possible to enable that isolation.


>
> Kurt Schwehr-2 wrote
> > Here are some samples:
> >
> > C++ tests use  C++11, gunit, google logging, gflags:  (Hoping for C++14
> > soon.. e.g. make_unique)
> > - autotest2/cpp/port/cpl_conv_test.cc
> > <https://gist.github.com/schwehr/13137d826763763fb031> (Yes, this
> is
> > massively boring code)
>
> AFAIS, architecture of tests won't change much, typically, test cases will
> target functional modules (drivers).
> Or, do you plan for proper unit tests targeting individual
> interfaces/classes
> in internal implementation of drivers, etc.?
>

Looking at drivers, I think the level and type of testing that is
appropriate varies quite a bit.  Some drivers are super simple and some are
very complex.  In general, extensive testing of internals isn't super
important.  What we need a ton more of is testing of the failure paths.
For example, I took AFL and ran it against ogr2ogr and a custom little
loader program that reads geojson.  GDAL code should never crash or hang,
but it most definitely does.  We need to turn those input files into test
cases.
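
As a sketch of what that could look like, assume the inputs AFL found are
checked in under a (hypothetical) fuzz_corpus/ directory; the test then
just requires that every one of them fails cleanly instead of crashing or
hanging:

import glob
import unittest

from osgeo import ogr


class GeojsonFuzzCorpusTest(unittest.TestCase):

    def testCorpusDoesNotCrash(self):
        # ogr.Open returning None (a clean failure) is fine; this test
        # exists to catch crashes and hangs on known-bad inputs.
        for path in sorted(glob.glob('fuzz_corpus/*.json')):
            ds = ogr.Open(path)
            if ds is not None:
                ds.GetLayerCount()  # Poke the dataset a little.
            ds = None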

We can also use AFL to generate test cases that cover more of the failure
modes.  I did that with minixml: r30854
<https://trac.osgeo.org/gdal/changeset/30854/>




>
>
> Kurt Schwehr-2 wrote
> > I'm (mostly) following Google's style guides.
> > All C++ should be formatted with "clang-format --style=Google"
>
> Unless it is not planned to include autotest2 into GDAL source code,
> shouldn't it follow GDAL style?
>
>
Not necessarily.  I'm working for Google and starting from scratch with
Google engineers doing the reviews, so I'm using the Google Style Guides.
GDAL's style isn't what users of the library are likely to write
themselves.  Part of what I'm trying to do with autotest2 is also give
examples of what I think are good ways to use GDAL.  People are likely
going to use C++11/14, and Hungarian-ish notation is not something I've
seen in any other project I've worked on.  The Google style guide is
pretty well fleshed out, whereas RFC 8 for GDAL is very minimalistic.  I
prefer to work in a space where I can just follow an established guide.




> Kurt Schwehr-2 wrote
> > Would like to eventually do (unsorted):
> > - Fuzz testing, ASAN/MSAN/TSAN/Valgrind/Heap checks  (I've done some MSAN
> > &
> > heap checkers by hand)
> > - Performance testing - time and memory usage
>
> IMHO, those belong to separate suites.
>
>
Why separate the memory and thread tests?  If you write really solid
unittests in C++, you can flip those modes on or off.  As long as your
unittests clean up properly, they are great code to run under
ASAN/MSAN/etc.  For fuzzing, it is definitely helpful to isolate things
like the geojson driver from all the mechanics of choosing which driver to
use.  We'd like to be able to pass anything to the driver directly to
trigger its code paths.  Performance testing really is a challenge, and it
is likely going to be hard to figure out what is worth doing.
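
As a small sketch of the fuzzing point, here is what "pass anything to the
driver directly" can look like from Python.  The helper name is made up,
but it bypasses the format-probing machinery so the input exercises only
the geojson code paths:

from osgeo import ogr


def open_with_geojson_only(path):
    # Skip driver probing: hand the file straight to one driver.
    driver = ogr.GetDriverByName('GeoJSON')
    return driver.Open(path)  # None if the driver rejects the input.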



>
> - Test the C API at the C level
> </quote>
>

> If plain unittests are planned, mocking might be possible.
>

What do you mean by this?


>
> Best regards,
> Mateusz
>
>