<div dir="ltr"><div dir="ltr"></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Oct 17, 2023 at 9:19 PM Nyall Dawson via QGIS-Developer <<a href="mailto:qgis-developer@lists.osgeo.org">qgis-developer@lists.osgeo.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">><br>

> IMHO, the main issue here is that the CI is too often broken for no<br>

> reasons related to the PR content. And most of the time recently, it's<br>

> because of one of the mingw jobs.<br>

<br>

I suggest we move this conversation into more practical areas and<br>

focus instead on the actual CI issues.<br>

<br>

In my experience there's a couple of main issues here:<br>

<br>

1. The mingw build fails often with the error:<br>

<br>

cd /__w/QGIS/QGIS/build_mingw64/resources && /usr/bin/cmake -E copy<br>

/__w/QGIS/QGIS/resources/srs6.db<br>

/__w/QGIS/QGIS/build_mingw64/resources/srs.db<br>

make[2]: write error: stdout<br>

<br>

It always succeeds if you restart the job though. I don't know what's<br>

causing this... a disk space issue? (but then why would it succeed on<br>

the following attempt?)<br></blockquote><div><br></div><div>mingw runs on ubuntu-latest within a fedora container. ubuntu images come with plenty of bloat preinstalled that can be easily deleted.</div><div>cf. <a href="https://github.com/easimon/maximize-build-space/blob/master/action.yml#L114-L130">https://github.com/easimon/maximize-build-space/blob/master/action.yml#L114-L130</a></div><div>I just don't know how to do that if a "container" is specified as the delete action will need to happen on the base system.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

2. The msys2 build can take HOURS (5+ in some cases). The qt 5/ qt6<br>

builds also see occasional extremely slow builds.<br>

<br>

I think the root cause of this is that we've exhausted all the<br>

available space allowed for caching objects on github. All the builds<br>

are trying to cache things to speed up the workflows, but the<br>

different caches are randomly getting deleted because together they've<br>

well exceeded the total allowed size. The solutions I could see here<br>

are:<br>

<br>

- approaching github to ask nicely for more space<br>

- dropping one or more workflows. Maybe we could combine the msys2 and<br>

mingw workflow into one to cut out one workflow. But that's blocked<br>

because the msys2 workflow currently has a lot of things disabled<br>

(including python), so it's currently not a suitable replacement for<br>

the mingw builds which give end users an easy way to test PRs on<br>

windows.<br></blockquote><div><br></div><div>What is the advantage of the msys2 workflow?</div><div>It does "some build on windows" but IIUC does not replicate the actual windows build using OSGEO4W. Is there something else that it brings to the mix?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

3. Random test failures. Main culprits are:<br>

- ept provider test (but I **think** I've fixed this with<br>

improvements/fixes made over the 3.34 bug fixing period). At least, I<br>

don't recall seeing this in the last week, when it used to be a daily<br>

random fail.<br>

- raster file writer test: randomly fails for an unknown reason. The<br>

test passes valgrind/asan fine, so I suspect the random failures are<br>

caused by some interplay with other tests modifying data in the test<br>

data folder. More work is needed.<br>

- raster analysis utils test: same as above.<br>

- network access manager test: depends on the badssl external website,<br>

which goes down occasionally. We could potentially run this site<br>

ourselves as part of the CI setup, but that's trading the dependence<br>

on an external service for added CI maintenance burden on ourselves.<br>

<br>

4. Docker/ubuntu repository fails. This happens occasionally -- the<br>

external repos will have transient connectivity issues. Not much we<br>

can do here except manually re-running the jobs.<br>

<br>

Is there anything I've missed?<br></blockquote><div><br></div><div>There is also the occasional "QGIS is not compatible with the bleeding edge [sip, name your dependency] version that [rolling distro] has decided to start shipping today".</div><div><br></div><div>Matthias</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Nyall<br>

<br>

<br>

<br>

<br>

<br>

<br>

<br>

<br>

<br>

><br>

> Like proposed by Matthias, I would be very much in favor in modifying<br>

> our CI jobs to rely on non-rolling release distribution.<br>

><br>

> Regards,<br>

> Julien<br>

><br>

><br>

> > In my experience, the peer reviews have proven to be an effective tool to improve the code quality.<br>

> > I think it can be explained with a four eyes principle. The first two eyes are involved in writing the code, the second pair of eyes that validates<br>

> > needs to be a trusted pair.<br>

> > For a code base of the maturity of QGIS, this seems to me to be justified and reasonable.<br>

> ><br>

> > As for who can do a review, a potential conflict of interest has been raised. This arises most noticeable within a single company, but can also<br>

> > very well be the case whenever a collaboration occurs between multiple individuals or if a well-connected client is involved. In my experience, in<br>

> > most cases people will first ask the person that they trust to be most qualified to review a specific area of the code, regardless the organisation<br>

> > they work for. In some cases, this may be your colleagues. What I and others have done in the past in such cases, is to leave an additional review<br>

> > window of a couple of days before merging to give everyone the opportunity to jump in if they wanted to, just to be sure to play it fair. In my<br>

> > experience, this worked quite nicely, but I am also open to other approaches.<br>

> ><br>

> > As for failing tests, I think one possibly low hanging fruit would be to switch all CI workflows to be based on non-rolling distributions. And we<br>

> > could potentially also be a bit more strict with disabling flaky tests or making them optional.<br>

> ><br>

> > Best wishes<br>

> > Matthias<br>

> ><br>

> > On Tue, Oct 17, 2023 at 9:47 AM Sandro Santilli via QGIS-Developer <<a href="mailto:qgis-developer@lists.osgeo.org" target="_blank">qgis-developer@lists.osgeo.org</a>> wrote:<br>

> ><br>

> >  On Tue, Oct 17, 2023 at 03:08:34PM +1300, Nyall Dawson via QGIS-Developer wrote:<br>

> >  > On Tue, 17 Oct 2023 at 03:41, Sandro Santilli <<a href="mailto:strk@kbt.io" target="_blank">strk@kbt.io</a>> wrote:<br>

> >  > ><br>

> >  > > On Mon, Oct 16, 2023 at 09:59:35AM +1300, Nyall Dawson via QGIS-Developer wrote:<br>

> >  > ><br>

> >  > > > If you flip the situation, you'll see that yes, you do have trust!<br>

> >  > > ><br>

> >  > > > - a complete stranger CANNOT approve their own changes<br>

> >  > > > - a complete stranger CANNOT approve other stranger's changes<br>

> >  > > > - a complete stranger CANNOT approve an approved member's changes<br>

> >  > > ><br>

> >  > > > vs<br>

> >  > > ><br>

> >  > > > - an approved member CANNOT approve their own changes<br>

> >  > > > - an approved member CAN approve a complete stranger's changes<br>

> >  > > > - an approved member CAN approve a another approved member's changes<br>

> >  > ><br>

> >  > > That's partial trust. I'm trusted to be able to judge someone else's<br>

> >  > > work but not to judge my own work !<br>

> >  ><br>

> >  > Ok, it's partial trust. What's wrong with that?<br>

> ><br>

> >  I feel I'm not explaining myself correctly.<br>

> >  I do believe in peer review, and I'm happy for others to look at the<br>

> >  changes I propose. I'd love everyone to look at them!<br>

> ><br>

> >  The word "partial" was not appropriate, probably, it's more about<br>

> >  "inconsistency".<br>

> ><br>

> >  > In short: I'm more than OK with only partial trusting everyone. We ALL<br>

> >  > make mistakes, so let's stick with the processes we have which reduce<br>

> >  > the risk of these mistakes hitting end users..<br>

> ><br>

> >  I know I make mistakes, but I'm able to merge changes proposed by a<br>

> >  random contributor, where my own review is enough.<br>

> ><br>

> >  I'm FULLY trusted in that case, and this is conflicting with not<br>

> >  being so when it's me proposing the changes.<br>

> ><br>

> >  Why should I be able to merge in that case then ? Is it because<br>

> >  2 people review the code ? Me and the proponent ? Then why the<br>

> >  random contributor review would not count on MY prs ?<br>

> ><br>

> >  I hope you see the discrepancy here.<br>

> ><br>

> >  --strk;<br>

> ><br>

> >    Libre GIS consultant/developer<br>

> >    <a href="https://strk.kbt.io/services.html" rel="noreferrer" target="_blank">https://strk.kbt.io/services.html</a><br>

> >  _______________________________________________<br>

> >  QGIS-Developer mailing list<br>

> >  <a href="mailto:QGIS-Developer@lists.osgeo.org" target="_blank">QGIS-Developer@lists.osgeo.org</a><br>

> >  List info: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-developer" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>

> >  Unsubscribe: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-developer" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>

> ><br>

> > _______________________________________________<br>

> > QGIS-Developer mailing list<br>

> > <a href="mailto:QGIS-Developer@lists.osgeo.org" target="_blank">QGIS-Developer@lists.osgeo.org</a><br>

> > List info: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-developer" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>

> > Unsubscribe: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-developer" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>

><br>

><br>

> --<br>

><br>

> Julien Cabieces<br>

> Senior Developer at Oslandia<br>

> <a href="mailto:julien.cabieces@oslandia.com" target="_blank">julien.cabieces@oslandia.com</a><br>

> _______________________________________________<br>

> QGIS-Developer mailing list<br>

> <a href="mailto:QGIS-Developer@lists.osgeo.org" target="_blank">QGIS-Developer@lists.osgeo.org</a><br>

> List info: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-developer" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>

> Unsubscribe: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-developer" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>

_______________________________________________<br>

QGIS-Developer mailing list<br>

<a href="mailto:QGIS-Developer@lists.osgeo.org" target="_blank">QGIS-Developer@lists.osgeo.org</a><br>

List info: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-developer" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>

Unsubscribe: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-developer" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>

</blockquote></div></div>