<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 15/10/2023 at 13:34, Javier Jimenez
Shaw via gdal-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CADRrdKsPBb+V1H8iDbfxgqHUYNS_o=TJt2bXPo2CK=3oEhwSUQ@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="auto">Hi Even. Thanks, it sounds good.
<div dir="auto">However, I see a potential problem: I see that
you use "SetCacheMax" only once. We should not forget about it
in the future for cache-sensitive tests. GDAL's cache is
usually a percentage of total memory, which may vary
across environments and over time.<br>
</div>
</div>
</div>
</blockquote>
<p>Javier,</p>
<p>What is certain is that the timings obtained in one session of the perf
tests in CI are comparable to nothing but timings obtained in
the same session (and that's already challenging!). So the amount
of RAM available on the CI worker may affect the speed of
the tests, but it will affect it in the same way for both the
reference run and the tested run (as long as the GDAL_CACHEMAX=5%
setting remains the same and the general behaviour of the block
cache remains similar). I anticipate that at some point changes
in GDAL will make the perf test suite no longer comparable to the
current reference version, and that we will have to upgrade the
commit of the reference version when that happens. Actually, if
the perf test suite is extended, it might be useful to upgrade the
commit of the reference version at each feature release. For
example, when GDAL 3.8.0 is released, it would become
the reference point for 3.9.0 development, and so on (otherwise we
wouldn't get perf regression testing of newly added tests). The
downside is that this wouldn't catch progressive slowdowns spread
over several release cycles. But given that I had to raise the
failure threshold to > 30% regression to avoid false
positives, the perf test suite (at least when run in CI, with all
its unpredictability) can only catch major "instant" regressions.</p>
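<p>On Javier's SetCacheMax point: one way to handle it once for the whole
suite is to pin GDAL_CACHEMAX to an absolute size through the environment,
so the block cache no longer depends on the RAM of the CI worker. The
helper below is an illustrative sketch, not existing GDAL test code; only
the GDAL_CACHEMAX configuration option itself is real, and the 256 MB
default is an arbitrary choice:</p>

```python
import os
from contextlib import contextmanager

@contextmanager
def fixed_gdal_cache(megabytes=256):
    """Pin GDAL's block cache to an absolute size for the duration of a
    benchmark run, instead of the default percentage of total RAM.

    GDAL_CACHEMAX is the real configuration option; the helper name and
    the 256 MB default are illustrative choices.
    """
    old = os.environ.get("GDAL_CACHEMAX")
    # A small plain integer is interpreted by GDAL as megabytes, while a
    # value such as "5%" is a fraction of total RAM.
    os.environ["GDAL_CACHEMAX"] = str(megabytes)
    try:
        yield
    finally:
        if old is None:
            del os.environ["GDAL_CACHEMAX"]
        else:
            os.environ["GDAL_CACHEMAX"] = old
```

<p>Since GDAL reads the cache size lazily at first raster I/O, the variable
has to be set before any test touches a dataset; in a pytest setup that
would typically mean a session-scoped fixture in conftest.py.</p>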
<p>Even<br>
</p>
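<p>For reference, the failure criterion discussed in the quoted message
below (--benchmark-compare-fail="mean:5%") boils down to a comparison of
mean runtimes. A minimal stand-alone sketch of that check, as illustrative
code rather than pytest-benchmark's actual implementation:</p>

```python
def mean_regression(ref_timings, new_timings, max_slowdown=0.05):
    """Return (regressed, ratio). regressed is True when the mean of the
    new timings exceeds the mean of the reference timings by more than
    max_slowdown (0.05 mirrors --benchmark-compare-fail="mean:5%"; the
    CI setup described above ended up needing 0.30, i.e. 30%).
    """
    ref_mean = sum(ref_timings) / len(ref_timings)
    new_mean = sum(new_timings) / len(new_timings)
    ratio = new_mean / ref_mean
    return ratio > 1.0 + max_slowdown, ratio
```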
<blockquote type="cite"
cite="mid:CADRrdKsPBb+V1H8iDbfxgqHUYNS_o=TJt2bXPo2CK=3oEhwSUQ@mail.gmail.com"><br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, 11 Oct 2023, 07:53
Laurențiu Nicola via gdal-dev, <<a
href="mailto:gdal-dev@lists.osgeo.org" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">gdal-dev@lists.osgeo.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<br>
No experience with pytest-benchmark, but I maintain an
unrelated project that runs some benchmarks on CI, and here
are some things worth mentioning:<br>
<br>
- we store the results as a newline-delimited JSON file in a
different GitHub repository (<a
href="https://raw.githubusercontent.com/rust-analyzer/metrics/master/metrics.json"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://raw.githubusercontent.com/rust-analyzer/metrics/master/metrics.json</a>,
warning, it's a 5.5 MB unformatted JSON)<br>
- we have an in-browser dashboard that retrieves the whole
file and displays the results: <a
href="https://rust-analyzer.github.io/metrics/"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://rust-analyzer.github.io/metrics/</a><br>
- we do track build time and overall run time, but we're more
interested in correctness<br>
- the display is a bit of a mess (partly due to trying to
keep the setup as simple as possible), but you can look for
the "total time", "total memory" and "build" to get an idea<br>
- we store the runner CPU type and memory in that JSON;
they're almost all Intel, but they do upgrade from time to
time<br>
- we even have two AMD EPYC runs; note that boost is disabled
differently there (we don't try to disable it, though)<br>
- we also try to measure the CPU instruction count (the perf
counter), but that doesn't work on GitHub, and probably not in
most VMs<br>
- the runners have been very reliable, but not really
consistent in performance<br>
- a bigger problem for us was that somebody actually needs to
look at the dashboard to spot any regressions and investigate
them (some are caused by external changes)<br>
- in 3-5 years we'll probably have to trim down the JSON or
switch to a different storage<br>
<br>
Laurentiu<br>
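<p>The append-only, newline-delimited JSON storage described above can be
sketched as follows (the field names are illustrative, not the actual
metrics.json schema):</p>

```python
import json
import platform
import time

def append_run(path, results):
    """Append one CI run as a single line of newline-delimited JSON, so
    the history file only ever grows at the end and existing lines are
    never rewritten."""
    record = {
        "timestamp": int(time.time()),
        # Runner hardware, recorded because CI machines change over time.
        "host": {"cpu": platform.processor(), "arch": platform.machine()},
        "results": results,  # e.g. {"total time": 12.3, "total memory": 456}
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def load_runs(path):
    """Read the whole history back, one JSON object per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```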
<br>
On Tue, Oct 10, 2023, at 21:08, Even Rouault via gdal-dev
wrote:<br>
> Hi,<br>
><br>
> I'm experimenting with adding performance regression
testing in our CI. <br>
> Currently our CI has quite extensive functional coverage,
but totally <br>
> lacks performance testing. Given that we use pytest, I've
spotted <br>
> pytest-benchmark (<a
href="https://pytest-benchmark.readthedocs.io/en/latest/"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://pytest-benchmark.readthedocs.io/en/latest/</a>)
as <br>
> a likely good candidate framework.<br>
><br>
> I've prototyped things in <a
href="https://github.com/OSGeo/gdal/pull/8538"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/OSGeo/gdal/pull/8538</a><br>
><br>
> Basically, we now have an autotest/benchmark directory
where performance <br>
> tests can be written.<br>
><br>
> Then in the CI, we checkout a reference commit, build it
and run the <br>
> performance test suite in --benchmark-save mode<br>
><br>
> And then we run the performance test suite on the PR in <br>
> --benchmark-compare mode with a
--benchmark-compare-fail="mean:5%" <br>
> criterion (which means that a test fails if its mean
runtime is more than 5% <br>
> slower than the reference one)<br>
><br>
> From what I can see, pytest-benchmark behaves correctly
if tests are <br>
> removed or added (that is, not failing, just skipping them
during <br>
> comparison). The only thing one should not do is modify
an existing test <br>
> w.r.t. the reference branch.<br>
><br>
> Does someone have practical experience with
pytest-benchmark, in particular <br>
> in CI setups? With virtualization, it is hard to
guarantee that other <br>
> things happening on the host running the VM do not
interfere. Even <br>
> locally on my own machine, I initially saw strong
variations in timings, <br>
> which can be reduced to an acceptable deviation by
disabling the Intel <br>
> Turbo Boost feature (echo 1 | sudo tee <br>
> /sys/devices/system/cpu/intel_pstate/no_turbo)<br>
><br>
> Even<br>
><br>
> -- <br>
> <a href="http://www.spatialys.com"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">http://www.spatialys.com</a><br>
> My software is free, but my time generally not.<br>
><br>
> _______________________________________________<br>
> gdal-dev mailing list<br>
> <a href="mailto:gdal-dev@lists.osgeo.org"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">gdal-dev@lists.osgeo.org</a><br>
> <a
href="https://lists.osgeo.org/mailman/listinfo/gdal-dev"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://lists.osgeo.org/mailman/listinfo/gdal-dev</a><br>
</blockquote>
</div>
<br>
</blockquote>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-freetext" href="http://www.spatialys.com">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
</body>
</html>