[gdal-dev] The Not Rocket Science Rule Of Software Engineering

Even Rouault even.rouault at spatialys.com
Tue Mar 29 06:36:26 PDT 2022


On 29/03/2022 at 14:45, Andrew C Aitchison wrote:
>
> I have been reading The Not Rocket Science Rule Of Software Engineering
> https://graydon2.dreamwidth.org/1597.html
> which is:
>        automatically maintain a repository of code
>        that always passes all the tests
>
> Does https://github.com/OSGeo/gdal have a branch with this property?
All branches should have that property most of the time, at least to the
extent our CI is representative, and around the time the commit was made
(our CI depends on external resources that can change or break
independently, so something that used to pass might fail if replayed
later). Just look at the badges at https://github.com/OSGeo/gdal#readme,
or navigate through the branches in the GitHub UI and check the status of
the head commit. That's the purpose of using pull requests: to ensure we
don't push broken stuff (or at least that we don't push stuff *known* to
be broken, i.e. something for which a test exists to catch it).
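
As an illustration, checking the status of a branch's head commit can be
done programmatically. The sketch below is not a script that exists in
the GDAL repository; it uses the standard GitHub REST API "check-runs"
endpoint, only looks at the first page of results, and the treatment of
skipped/neutral runs is an arbitrary choice:

    import requests

    def branch_is_green(owner="OSGeo", repo="gdal", branch="master"):
        # Fetch the CI check runs attached to the head commit of the branch.
        url = (f"https://api.github.com/repos/{owner}/{repo}"
               f"/commits/{branch}/check-runs")
        resp = requests.get(url,
                            headers={"Accept": "application/vnd.github+json"},
                            params={"per_page": 100})
        resp.raise_for_status()
        runs = resp.json()["check_runs"]
        # The branch is considered green when every completed run ended
        # successfully (skipped/neutral conclusions are treated as
        # non-failures here).
        return all(run["conclusion"] in ("success", "skipped", "neutral")
                   for run in runs if run["status"] == "completed")

    if __name__ == "__main__":
        print("all checks green" if branch_is_green()
              else "failing or pending checks")
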
>
> If we do not yet have one, it should in principle be easy:
> whenever HEAD of master passes all tests, pull that to a
> current-working branch
> (in practice we have 16 top-level tests, so we would need a wrapper
> which says whether they all passed or not).
>
>
> A comment on the above blog post is:
>   Aegis has a delightful wrinkle on this: each changeset has to not just
>   pass all existing tests, but also come with a new test, independently
>   tracked and associated with the change, that must fail before the
>   change, and pass after it.
>
> That would be a useful addition too.

We try to follow that policy too as much as practical (a sketch of what
such a test typically looks like follows the list below). But it's
definitely not 100% of the time, as creating a reliable test can
sometimes be really difficult and time consuming (a 5-minute fix can
require hours, days or weeks of effort to create a test case). A few
typical exceptions I have in mind:

- a commit depends on a dataset that is too large to reasonably push
into the repository, and creating a smaller version is not practical,
especially for binary formats (it could require writing dedicated and
complex code to generate a smaller dataset)

- a fix depends on a fuzzed dataset. We could incorporate fuzzed
datasets into the repository, but, due to the nature of fuzzing, they
are usually broken in many ways and consequently don't exercise only the
code path that is fixed. So, if incorporated, a later change in the
driver could cause them to be rejected for another reason, and they
would no longer test the code path they initially exercised.

- changes related to performance improvements. Reliably measuring
execution speed is already tricky locally, and even more so in the cloud
environments used by CI. But that's definitely something we could try to
put in place. One potential solution is to build, in the same worker, a
version from before the change (or the last released version) and one
with the change, and compare them (a rough sketch of such a comparison
also follows below).

Even

-- 
http://www.spatialys.com
My software is free, but my time generally not.
