[QGIS-Developer] QGIS repository management
Julien Cabieces
julien.cabieces at oslandia.com
Wed Oct 18 00:01:38 PDT 2023
> Is there anything I've missed?
A missing external ressource (OTB, oracle binaries...) that we get with curl. This one for
instance recently :
https://github.com/qgis/QGIS/actions/runs/6549790513/job/17793674746?pr=54963
Maybe we could store those ressources somewhere (a github repository for
instance?) to avoid those errors.
> network access manager test: depends on the badssl external website,
> which goes down occasionally. We could potentially run this site
> ourselves as part of the CI setup, but that's trading the dependence
> on an external service for added CI maintenance burden on ourselves.
>
That's what we do for webdav and minio tests :
https://github.com/qgis/QGIS/blob/master/.docker/docker-compose-testing.yml#L14
I don't know much about bad ssl returned response but is it that much
complicated to configure ?
Regards,
Julien
> On Tue, Oct 17, 2023 at 9:19 PM Nyall Dawson via QGIS-Developer <qgis-developer at lists.osgeo.org> wrote:
>
> >
> > IMHO, the main issue here is that the CI is too often broken for no
> > reasons related to the PR content. And most of the time recently, it's
> > because of one of the mingw jobs.
>
> I suggest we move this conversation into more practical areas and
> focus instead on the actual CI issues.
>
> In my experience there's a couple of main issues here:
>
> 1. The mingw build fails often with the error:
>
> cd /__w/QGIS/QGIS/build_mingw64/resources && /usr/bin/cmake -E copy
> /__w/QGIS/QGIS/resources/srs6.db
> /__w/QGIS/QGIS/build_mingw64/resources/srs.db
> make[2]: write error: stdout
>
> It always succeeds if you restart the job though. I don't know what's
> causing this... a disk space issue? (but then why would it succeed on
> the following attempt?)
>
> mingw runs on ubuntu-latest within a fedora container. ubuntu images come with plenty of bloat preinstalled that can be easily deleted.
> cf. https://github.com/easimon/maximize-build-space/blob/master/action.yml#L114-L130
> I just don't know how to do that if a "container" is specified as the delete action will need to happen on the base system.
>
>
> 2. The msys2 build can take HOURS (5+ in some cases). The qt 5/ qt6
> builds also see occasional extremely slow builds.
>
> I think the root cause of this is that we've exhausted all the
> available space allowed for caching objects on github. All the builds
> are trying to cache things to speed up the workflows, but the
> different caches are randomly getting deleted because together they've
> well exceeded the total allowed size. The solutions I could see here
> are:
>
> - approaching github to ask nicely for more space
> - dropping one or more workflows. Maybe we could combine the msys2 and
> mingw workflow into one to cut out one workflow. But that's blocked
> because the msys2 workflow currently has a lot of things disabled
> (including python), so it's currently not a suitable replacement for
> the mingw builds which give end users an easy way to test PRs on
> windows.
>
> What is the advantage of the msys2 workflow?
> It does "some build on windows" but IIUC does not replicate the actual windows build using OSGEO4W. Is there something else that it brings to
> the mix?
>
>
> 3. Random test failures. Main culprits are:
> - ept provider test (but I **think** I've fixed this with
> improvements/fixes made over the 3.34 bug fixing period). At least, I
> don't recall seeing this in the last week, when it used to be a daily
> random fail.
> - raster file writer test: randomly fails for an unknown reason. The
> test passes valgrind/asan fine, so I suspect the random failures are
> caused by some interplay with other tests modifying data in the test
> data folder. More work is needed.
> - raster analysis utils test: same as above.
> - network access manager test: depends on the badssl external website,
> which goes down occasionally. We could potentially run this site
> ourselves as part of the CI setup, but that's trading the dependence
> on an external service for added CI maintenance burden on ourselves.
>
> 4. Docker/ubuntu repository fails. This happens occasionally -- the
> external repos will have transient connectivity issues. Not much we
> can do here except manually re-running the jobs.
>
> Is there anything I've missed?
>
> There is also the occasional "QGIS is not compatible with the bleeding edge [sip, name your dependency] version that [rolling distro] has
> decided to start shipping today".
>
> Matthias
>
>
> Nyall
>
> >
> > Like proposed by Matthias, I would be very much in favor in modifying
> > our CI jobs to rely on non-rolling release distribution.
> >
> > Regards,
> > Julien
> >
> >
> > > In my experience, the peer reviews have proven to be an effective tool to improve the code quality.
> > > I think it can be explained with a four eyes principle. The first two eyes are involved in writing the code, the second pair of eyes that
> validates
> > > needs to be a trusted pair.
> > > For a code base of the maturity of QGIS, this seems to me to be justified and reasonable.
> > >
> > > As for who can do a review, a potential conflict of interest has been raised. This arises most noticeable within a single company, but can
> also
> > > very well be the case whenever a collaboration occurs between multiple individuals or if a well-connected client is involved. In my
> experience, in
> > > most cases people will first ask the person that they trust to be most qualified to review a specific area of the code, regardless the
> organisation
> > > they work for. In some cases, this may be your colleagues. What I and others have done in the past in such cases, is to leave an
> additional review
> > > window of a couple of days before merging to give everyone the opportunity to jump in if they wanted to, just to be sure to play it fair.
> In my
> > > experience, this worked quite nicely, but I am also open to other approaches.
> > >
> > > As for failing tests, I think one possibly low hanging fruit would be to switch all CI workflows to be based on non-rolling distributions.
> And we
> > > could potentially also be a bit more strict with disabling flaky tests or making them optional.
> > >
> > > Best wishes
> > > Matthias
> > >
> > > On Tue, Oct 17, 2023 at 9:47 AM Sandro Santilli via QGIS-Developer <qgis-developer at lists.osgeo.org> wrote:
> > >
> > > On Tue, Oct 17, 2023 at 03:08:34PM +1300, Nyall Dawson via QGIS-Developer wrote:
> > > > On Tue, 17 Oct 2023 at 03:41, Sandro Santilli <strk at kbt.io> wrote:
> > > > >
> > > > > On Mon, Oct 16, 2023 at 09:59:35AM +1300, Nyall Dawson via QGIS-Developer wrote:
> > > > >
> > > > > > If you flip the situation, you'll see that yes, you do have trust!
> > > > > >
> > > > > > - a complete stranger CANNOT approve their own changes
> > > > > > - a complete stranger CANNOT approve other stranger's changes
> > > > > > - a complete stranger CANNOT approve an approved member's changes
> > > > > >
> > > > > > vs
> > > > > >
> > > > > > - an approved member CANNOT approve their own changes
> > > > > > - an approved member CAN approve a complete stranger's changes
> > > > > > - an approved member CAN approve a another approved member's changes
> > > > >
> > > > > That's partial trust. I'm trusted to be able to judge someone else's
> > > > > work but not to judge my own work !
> > > >
> > > > Ok, it's partial trust. What's wrong with that?
> > >
> > > I feel I'm not explaining myself correctly.
> > > I do believe in peer review, and I'm happy for others to look at the
> > > changes I propose. I'd love everyone to look at them!
> > >
> > > The word "partial" was not appropriate, probably, it's more about
> > > "inconsistency".
> > >
> > > > In short: I'm more than OK with only partial trusting everyone. We ALL
> > > > make mistakes, so let's stick with the processes we have which reduce
> > > > the risk of these mistakes hitting end users..
> > >
> > > I know I make mistakes, but I'm able to merge changes proposed by a
> > > random contributor, where my own review is enough.
> > >
> > > I'm FULLY trusted in that case, and this is conflicting with not
> > > being so when it's me proposing the changes.
> > >
> > > Why should I be able to merge in that case then ? Is it because
> > > 2 people review the code ? Me and the proponent ? Then why the
> > > random contributor review would not count on MY prs ?
> > >
> > > I hope you see the discrepancy here.
> > >
> > > --strk;
> > >
> > > Libre GIS consultant/developer
> > > https://strk.kbt.io/services.html
> > > _______________________________________________
> > > QGIS-Developer mailing list
> > > QGIS-Developer at lists.osgeo.org
> > > List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> > > Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> > >
> > > _______________________________________________
> > > QGIS-Developer mailing list
> > > QGIS-Developer at lists.osgeo.org
> > > List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> > > Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> >
> >
> > --
> >
> > Julien Cabieces
> > Senior Developer at Oslandia
> > julien.cabieces at oslandia.com
> > _______________________________________________
> > QGIS-Developer mailing list
> > QGIS-Developer at lists.osgeo.org
> > List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> > Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> _______________________________________________
> QGIS-Developer mailing list
> QGIS-Developer at lists.osgeo.org
> List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
--
Julien Cabieces
Senior Developer at Oslandia
julien.cabieces at oslandia.com
More information about the QGIS-Developer
mailing list