[gdal-dev] git(hub) migration ?
Even Rouault
even.rouault at spatialys.com
Wed Sep 6 06:14:26 PDT 2017
Hi,
I've heard a few voices speaking/asking/begging for a git/github migration. At some point
we'll certainly have to do it, as SVN vs git is beginning to feel more and more like CVS vs SVN
15 years ago.
I can see different options :
1) migrate to git, and remain within the OSGeo infrastructure. This is for example the case of
GEOS, which uses the Trac git plugin and the GOGS (or is it gitea?) git hosting
(https://git.osgeo.org/gogs/geos/geos.git). gogs/gitea tries to replicate most github functionalities,
but feature parity is still not there (e.g. you cannot comment on a commit). We could still
offer github as a mirror, which would ease contributions a bit compared to the current
git->svn situation, but the "Merge" button from github couldn't (shouldn't) be used.
2) migrate code to github, accept pull requests (PR) directly there. Tickets still managed in
Trac. But then we have no automated link between Trac and github (unless there's a Trac
plugin for that). A few other OSGeo projects are in a similar situation: QGIS with Redmine for
tickets and github for PR&code, GeoServer with JIRA for tickets and github for PR&code. I've
some experience with the QGIS situation: Redmine can include github commit references if
the git commit message includes the ticket number. But, of course, the other way doesn't
work: from github UI, the #XXXX link doesn't work (or would point to an unrelated PR with the
same number). So this is only mildly satisfactory, and a regression from our current situation.
3) migrate code and tickets to github. I guess this would match most (especially occasional)
contributor wishes regarding the "social" aspect. What would be needed is Trac -> github
ticket migration. Thomas Bonfort did it for MapServer at some point, but he lost the script if I
remember correctly (https://stackoverflow.com/questions/6671584/how-to-export-trac-to-github-issues
lists a number of possibilities). Another issue is that some numbers are already
taken by existing github pull requests, so there would be collisions on import. We could
decide either to sacrifice the colliding Trac tickets, as they are really old (currently the collisions
appear for tickets older than 2003), or to move them to an available github ticket number, or to
sacrifice the existing PRs, but there are a few pending ones.
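To make the "move them to an available github ticket number" variant a bit more concrete, here is a minimal sketch in Python. It only computes a renumbering plan; the function name and the policy of placing moved tickets above the highest number in use are my own assumptions, and whether an importer can actually choose target numbers depends on the import method.

```python
def plan_ticket_numbers(trac_ids, taken_github_numbers):
    """Map each Trac ticket id to a target github issue number.

    Hypothetical policy: a ticket keeps its number unless that number is
    already taken by an existing pull request, in which case it is moved
    to the first free number above every number currently in use.
    """
    taken = set(taken_github_numbers)
    trac = sorted(set(trac_ids))
    next_free = max(taken | set(trac), default=0) + 1
    mapping = {}
    for tid in trac:
        if tid in taken:
            # Collision with an existing PR: relocate the Trac ticket.
            mapping[tid] = next_free
            next_free += 1
        else:
            mapping[tid] = tid
    return mapping


def colliding_tickets(mapping):
    """Return the Trac ids that had to be renumbered, in ascending order."""
    return sorted(tid for tid, target in mapping.items() if tid != target)
```

The moved tickets would then need a note (or a redirect comment) so that old #XXXX references in commit messages can still be resolved.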
There's also the valid concern about being tied to github.com regarding tickets. Recently I
found https://github.com/josegonzalez/python-github-backup
which can back up code, issues, pull requests, etc. using the github API.
I quickly tested it on its own repo. It seems to work (**), although it is a bit slow (it requires 2 GET
requests per issue / pull request to retrieve extra details that are not returned by the global
listing request). It has an incremental mode though, which should make it efficient.
My synthetic view of the situation:
1) is a pure free software & free hosting approach. It relies on SAC being appropriately man- and
machine-powered (same as with SVN / Trac currently)
2) mixed solution. We still have ownership of all our data (code & tickets), but the separation
between a ticket system and code+PR isn't ideal from a usability point of view
3) offers probably the best contributor experience. We lose a bit of control, but a backup(*)
strategy exists (at least for now). I'd tend to favor this approach.
I'm not sure if the current git mirror shouldn't be re-done from scratch. Its main drawback is
that svn tags are reported as git branches, instead of git tags. I probably mis-configured
things when I initiated the mirror a few years ago. Not completely sure this is worth the
effort though. We can probably live with those existing mis-created tags, and use proper git
tags for the future.
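If someone did want to clean up the mis-created tags rather than live with them, a rough sketch of what that could look like follows. It only builds the git commands, it does not run them. The "tags/" branch prefix is an assumption for illustration; I haven't checked how the branches are actually named in the current mirror.

```python
def tag_commands(branch_names, prefix="tags/"):
    """Build git commands turning tag-like branches into real git tags.

    `prefix` is a hypothetical naming convention for branches that were
    created from svn tags. For each such branch, two commands are emitted:
    create a tag pointing at the branch tip, then delete the branch.
    """
    cmds = []
    for branch in branch_names:
        if not branch.startswith(prefix):
            continue  # a regular branch, leave it alone
        tag = branch[len(prefix):]
        cmds.append(["git", "tag", tag, branch])
        cmds.append(["git", "branch", "-D", branch])
    return cmds
```

The resulting lists can be reviewed by hand first and then passed one by one to subprocess.run() in a local clone.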
The release procedure / script would also have to be updated. Probably other things too.
There's also the question of the Trac Wiki, although this one might be deferred to a later
stage.
So this email is mostly to say I'm open to the idea, but I'd appreciate it if someone else could
take the lead on this. I'd be happy to help. An RFC to formalize the move would be needed.
Even
(*) Backup might not be the appropriate term, since this implies that you can easily restore
things. If github closes or requires (insane) fees for open source projects, those saved tickets
will have to be re-injected in some to-be-defined alternative, but at least their content is
readable (json)
(**) steps:
1) pip install github-backup
2) in github UI, create a personal access token, so as to be able to use authenticated
requests to github API, to bypass the rate limit of unauthenticated requests (you can use it
even for repositories that you don't own)
3) github-backup -i -R python-github-backup josegonzalez --issues --issue-comments --issue-events --pulls --pull-comments --pull-commits -t ${your_github_token}
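For running step 3 regularly (e.g. from cron), the same command line could be assembled from a small Python wrapper. This is only a sketch: the function name and the GITHUB_TOKEN environment variable are assumptions of mine, while the flags are exactly the ones used above.

```python
import os


def build_backup_command(user, repo, token):
    """Return the github-backup argument list used in step 3.

    The flags are copied from the command line above; `-i` enables the
    incremental mode so that repeated runs stay reasonably fast.
    """
    return [
        "github-backup", "-i", "-R", repo, user,
        "--issues", "--issue-comments", "--issue-events",
        "--pulls", "--pull-comments", "--pull-commits",
        "-t", token,
    ]


def token_from_env(name="GITHUB_TOKEN"):
    """Read the personal access token from a (hypothetical) env variable."""
    token = os.environ.get(name)
    if not token:
        raise RuntimeError("environment variable %s is not set" % name)
    return token
```

Usage would be something like subprocess.run(build_backup_command("josegonzalez", "python-github-backup", token_from_env()), check=True).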
--
Spatialys - Geospatial professional services
http://www.spatialys.com