<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 15/10/2023 at 13:34, Javier Jimenez
Shaw via gdal-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CADRrdKsPBb+V1H8iDbfxgqHUYNS_o=TJt2bXPo2CK=3oEhwSUQ@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="auto">Hi Even. Thanks, it sounds good.
<div dir="auto">However, I see a potential problem: I see that
you use "SetCacheMax" only once. We should not forget about it
in the future for cache-sensitive tests. GDAL's cache is
usually a percentage of total memory, which may vary
across environments and over time.<br>
</div>
</div>
</div>
</blockquote>
<p>Javier,</p>
<p>What is certain is that the timings obtained in one session of the perf
tests in CI are comparable to nothing but timings obtained in
the same session (and that's already challenging!). So the amount
of RAM available on the CI worker may affect the speed of
the tests, but it will affect it in the same way for both the
reference run and the tested run (as long as the GDAL_CACHEMAX=5%
setting remains the same and the general behaviour of the block
cache remains similar). I anticipate that at some point changes
in GDAL will make the perf test suite no longer comparable to the
current reference version, and that we will have to upgrade the
commit of the reference version when that happens. Actually, if
the perf test suite is extended, it might be useful to upgrade the
commit of the reference version at each feature release. For
example, when GDAL 3.8.0 is released, it would become
the reference point for 3.9.0 development, and so on (otherwise we
wouldn't get perf regression testing of newly added tests). The
downside is that this wouldn't catch progressive slowdowns spread
over several release cycles. But given that I had to raise the
failure threshold to > 30% regression to avoid false
positives, the perf test suite (at least when run in CI, with all
its unpredictability) can only catch major "instant" regressions.</p>
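<p>On Javier's SetCacheMax point: one way to handle it once for the whole
suite is to pin GDAL_CACHEMAX to an absolute size through the environment,
so the block cache no longer depends on the RAM of the CI worker. The
helper below is an illustrative sketch, not existing GDAL test code; only
the GDAL_CACHEMAX configuration option itself is real, and the 256 MB
default is an arbitrary choice:</p>

```python
import os
from contextlib import contextmanager

@contextmanager
def fixed_gdal_cache(megabytes=256):
    """Pin GDAL's block cache to an absolute size for the duration of a
    benchmark run, instead of the default percentage of total RAM.

    GDAL_CACHEMAX is the real configuration option; the helper name and
    the 256 MB default are illustrative choices.
    """
    old = os.environ.get("GDAL_CACHEMAX")
    # A small plain integer is interpreted by GDAL as megabytes, while a
    # value such as "5%" is a fraction of total RAM.
    os.environ["GDAL_CACHEMAX"] = str(megabytes)
    try:
        yield
    finally:
        if old is None:
            del os.environ["GDAL_CACHEMAX"]
        else:
            os.environ["GDAL_CACHEMAX"] = old
```

<p>Since GDAL reads the cache size lazily at first raster I/O, the variable
has to be set before any test touches a dataset; in a pytest setup that
would typically mean a session-scoped fixture in conftest.py.</p>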
<p>Even<br>
</p>
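<p>For reference, the failure criterion discussed in the quoted message
below (--benchmark-compare-fail="mean:5%") boils down to a comparison of
mean runtimes. A minimal stand-alone sketch of that check, as illustrative
code rather than pytest-benchmark's actual implementation:</p>

```python
def mean_regression(ref_timings, new_timings, max_slowdown=0.05):
    """Return (regressed, ratio). regressed is True when the mean of the
    new timings exceeds the mean of the reference timings by more than
    max_slowdown (0.05 mirrors --benchmark-compare-fail="mean:5%"; the
    CI setup described above ended up needing 0.30, i.e. 30%).
    """
    ref_mean = sum(ref_timings) / len(ref_timings)
    new_mean = sum(new_timings) / len(new_timings)
    ratio = new_mean / ref_mean
    return ratio > 1.0 + max_slowdown, ratio
```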
<blockquote type="cite"
cite="mid:CADRrdKsPBb+V1H8iDbfxgqHUYNS_o=TJt2bXPo2CK=3oEhwSUQ@mail.gmail.com"><br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, 11 Oct 2023, 07:53
Laurențiu Nicola via gdal-dev, <<a
href="mailto:gdal-dev@lists.osgeo.org" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">gdal-dev@lists.osgeo.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<br>
No experience with pytest-benchmark, but I maintain an
unrelated project that runs some benchmarks on CI, and here
are some things worth mentioning:<br>
<br>
- we store the results as a newline-delimited JSON file in a
different GitHub repository (<a
href="https://raw.githubusercontent.com/rust-analyzer/metrics/master/metrics.json"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://raw.githubusercontent.com/rust-analyzer/metrics/master/metrics.json</a>,
warning, it's a 5.5 MB unformatted JSON)<br>
- we have an in-browser dashboard that retrieves the whole
file and displays the results: <a
href="https://rust-analyzer.github.io/metrics/"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://rust-analyzer.github.io/metrics/</a><br>
- we do track build time and overall run time, but we're more
interested in correctness<br>
- the display is a bit of a mess (partly due to trying to
keep the setup as simple as possible), but you can look for
the "total time", "total memory" and "build" to get an idea<br>
- we store the runner CPU type and memory in that JSON;
they're almost all Intel, but they do upgrade from time to
time<br>
- we even have two AMD EPYC runs; note that boost is disabled
differently there (we don't try to disable it, though)<br>
- we also try to measure the CPU instruction count (the perf
counter), but that doesn't work on GitHub, and probably not in
most VMs<br>
- the runners have been very reliable, but not really
consistent in performance<br>
- a bigger problem for us was that somebody actually needs to
look at the dashboard to spot any regressions and investigate
them (some are caused by external changes)<br>
- in 3-5 years we'll probably have to trim down the JSON or
switch to a different storage<br>
<br>
Laurentiu<br>
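<p>The append-only, newline-delimited JSON storage described above can be
sketched as follows (the field names are illustrative, not the actual
metrics.json schema):</p>

```python
import json
import platform
import time

def append_run(path, results):
    """Append one CI run as a single line of newline-delimited JSON, so
    the history file only ever grows at the end and existing lines are
    never rewritten."""
    record = {
        "timestamp": int(time.time()),
        # Runner hardware, recorded because CI machines change over time.
        "host": {"cpu": platform.processor(), "arch": platform.machine()},
        "results": results,  # e.g. {"total time": 12.3, "total memory": 456}
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def load_runs(path):
    """Read the whole history back, one JSON object per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```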
<br>
On Tue, Oct 10, 2023, at 21:08, Even Rouault via gdal-dev
wrote:<br>
> Hi,<br>
><br>
> I'm experimenting with adding performance regression
testing in our CI. <br>
> Currently our CI has quite extensive functional coverage,
but totally <br>
> lacks performance testing. Given that we use pytest, I've
spotted <br>
> pytest-benchmark (<a
href="https://pytest-benchmark.readthedocs.io/en/latest/"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://pytest-benchmark.readthedocs.io/en/latest/</a>)
as <br>
> a likely good candidate framework.<br>
><br>
> I've prototyped things in <a
href="https://github.com/OSGeo/gdal/pull/8538"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/OSGeo/gdal/pull/8538</a><br>
><br>
> Basically, we now have an autotest/benchmark directory
where performance <br>
> tests can be written.<br>
><br>
> Then in the CI, we checkout a reference commit, build it
and run the <br>
> performance test suite in --benchmark-save mode<br>
><br>
> And then we run the performance test suite on the PR in <br>
> --benchmark-compare mode with a
--benchmark-compare-fail="mean:5%" <br>
> criterion (which means that a test fails if its mean
runtime is more than 5% <br>
> slower than the reference one)<br>
><br>
> From what I can see, pytest-benchmark behaves correctly
if tests are <br>
> removed or added (that is, not failing, just skipping them
during <br>
> comparison). The only thing one should not do is modify
an existing test <br>
> w.r.t. the reference branch.<br>
><br>
> Does someone have practical experience with
pytest-benchmark, in particular <br>
> in CI setups? With virtualization, it is hard to
guarantee that other <br>
> things happening on the host running the VM do not
interfere. Even <br>
> locally on my own machine, I initially saw strong
variations in timings, <br>
> which can be reduced to an acceptable deviation by
disabling the Intel <br>
> Turbo Boost feature (echo 1 | sudo tee <br>
> /sys/devices/system/cpu/intel_pstate/no_turbo)<br>
><br>
> Even<br>
><br>
> -- <br>
> <a href="http://www.spatialys.com"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">http://www.spatialys.com</a><br>
> My software is free, but my time generally not.<br>
><br>
> _______________________________________________<br>
> gdal-dev mailing list<br>
> <a href="mailto:gdal-dev@lists.osgeo.org"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">gdal-dev@lists.osgeo.org</a><br>
> <a
href="https://lists.osgeo.org/mailman/listinfo/gdal-dev"
rel="noreferrer noreferrer" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://lists.osgeo.org/mailman/listinfo/gdal-dev</a><br>
</blockquote>
</div>
<br>
</blockquote>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-freetext" href="http://www.spatialys.com">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
</body>
</html>