[QGIS-Developer] QGIS repository management

Tue Oct 17 12:18:46 PDT 2023

>
> IMHO, the main issue here is that the CI is too often broken for no
> reasons related to the PR content. And most of the time recently, it's
> because of one of the mingw jobs.

I suggest we move this conversation into more practical areas and
focus instead on the actual CI issues.

In my experience there's a couple of main issues here:

1. The mingw build fails often with the error:

cd /__w/QGIS/QGIS/build_mingw64/resources && /usr/bin/cmake -E copy
/__w/QGIS/QGIS/resources/srs6.db
/__w/QGIS/QGIS/build_mingw64/resources/srs.db
make[2]: write error: stdout

It always succeeds if you restart the job though. I don't know what's
causing this... a disk space issue? (but then why would it succeed on
the following attempt?)

2. The msys2 build can take HOURS (5+ in some cases). The qt 5/ qt6
builds also see occasional extremely slow builds.

I think the root cause of this is that we've exhausted all the
available space allowed for caching objects on github. All the builds
are trying to cache things to speed up the workflows, but the
different caches are randomly getting deleted because together they've
well exceeded the total allowed size. The solutions I could see here
are:

- approaching github to ask nicely for more space
- dropping one or more workflows. Maybe we could combine the msys2 and
mingw workflow into one to cut out one workflow. But that's blocked
because the msys2 workflow currently has a lot of things disabled
(including python), so it's currently not a suitable replacement for
the mingw builds which give end users an easy way to test PRs on
windows.

3. Random test failures. Main culprits are:
- ept provider test (but I **think** I've fixed this with
improvements/fixes made over the 3.34 bug fixing period). At least, I
don't recall seeing this in the last week, when it used to be a daily
random fail.
- raster file writer test: randomly fails for an unknown reason. The
test passes valgrind/asan fine, so I suspect the random failures are
caused by some interplay with other tests modifying data in the test
data folder. More work is needed.
- raster analysis utils test: same as above.
- network access manager test: depends on the badssl external website,
which goes down occasionally. We could potentially run this site
ourselves as part of the CI setup, but that's trading the dependence
on an external service for added CI maintenance burden on ourselves.

4. Docker/ubuntu repository fails. This happens occasionally -- the
external repos will have transient connectivity issues. Not much we
can do here except manually re-running the jobs.

Is there anything I've missed?

Nyall

>
> Like proposed by Matthias, I would be very much in favor in modifying
> our CI jobs to rely on non-rolling release distribution.
>
> Regards,
> Julien
>
>
> > In my experience, the peer reviews have proven to be an effective tool to improve the code quality.
> > I think it can be explained with a four eyes principle. The first two eyes are involved in writing the code, the second pair of eyes that validates
> > needs to be a trusted pair.
> > For a code base of the maturity of QGIS, this seems to me to be justified and reasonable.
> >
> > As for who can do a review, a potential conflict of interest has been raised. This arises most noticeable within a single company, but can also
> > very well be the case whenever a collaboration occurs between multiple individuals or if a well-connected client is involved. In my experience, in
> > most cases people will first ask the person that they trust to be most qualified to review a specific area of the code, regardless the organisation
> > they work for. In some cases, this may be your colleagues. What I and others have done in the past in such cases, is to leave an additional review
> > window of a couple of days before merging to give everyone the opportunity to jump in if they wanted to, just to be sure to play it fair. In my
> > experience, this worked quite nicely, but I am also open to other approaches.
> >
> > As for failing tests, I think one possibly low hanging fruit would be to switch all CI workflows to be based on non-rolling distributions. And we
> > could potentially also be a bit more strict with disabling flaky tests or making them optional.
> >
> > Best wishes
> > Matthias
> >
> > On Tue, Oct 17, 2023 at 9:47 AM Sandro Santilli via QGIS-Developer <qgis-developer at lists.osgeo.org> wrote:
> >
> >  On Tue, Oct 17, 2023 at 03:08:34PM +1300, Nyall Dawson via QGIS-Developer wrote:
> >  > On Tue, 17 Oct 2023 at 03:41, Sandro Santilli <strk at kbt.io> wrote:
> >  > >
> >  > > On Mon, Oct 16, 2023 at 09:59:35AM +1300, Nyall Dawson via QGIS-Developer wrote:
> >  > >
> >  > > > If you flip the situation, you'll see that yes, you do have trust!
> >  > > >
> >  > > > - a complete stranger CANNOT approve their own changes
> >  > > > - a complete stranger CANNOT approve other stranger's changes
> >  > > > - a complete stranger CANNOT approve an approved member's changes
> >  > > >
> >  > > > vs
> >  > > >
> >  > > > - an approved member CANNOT approve their own changes
> >  > > > - an approved member CAN approve a complete stranger's changes
> >  > > > - an approved member CAN approve a another approved member's changes
> >  > >
> >  > > That's partial trust. I'm trusted to be able to judge someone else's
> >  > > work but not to judge my own work !
> >  >
> >  > Ok, it's partial trust. What's wrong with that?
> >
> >  I feel I'm not explaining myself correctly.
> >  I do believe in peer review, and I'm happy for others to look at the
> >  changes I propose. I'd love everyone to look at them!
> >
> >  The word "partial" was not appropriate, probably, it's more about
> >  "inconsistency".
> >
> >  > In short: I'm more than OK with only partial trusting everyone. We ALL
> >  > make mistakes, so let's stick with the processes we have which reduce
> >  > the risk of these mistakes hitting end users..
> >
> >  I know I make mistakes, but I'm able to merge changes proposed by a
> >  random contributor, where my own review is enough.
> >
> >  I'm FULLY trusted in that case, and this is conflicting with not
> >  being so when it's me proposing the changes.
> >
> >  Why should I be able to merge in that case then ? Is it because
> >  2 people review the code ? Me and the proponent ? Then why the
> >  random contributor review would not count on MY prs ?
> >
> >  I hope you see the discrepancy here.
> >
> >  --strk;
> >
> >    Libre GIS consultant/developer
> >    https://strk.kbt.io/services.html
> >  _______________________________________________
> >  QGIS-Developer mailing list
> >  QGIS-Developer at lists.osgeo.org
> >  List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> >  Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> >
> > _______________________________________________
> > QGIS-Developer mailing list
> > QGIS-Developer at lists.osgeo.org
> > List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> > Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
>
>
> --
>
> Julien Cabieces
> Senior Developer at Oslandia
> julien.cabieces at oslandia.com
> _______________________________________________
> QGIS-Developer mailing list
> QGIS-Developer at lists.osgeo.org
> List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer