[gdal-dev] git(hub) migration ?

Even Rouault even.rouault at spatialys.com
Wed Sep 6 06:14:26 PDT 2017


Hi,

I've heard a few voices speaking/asking/begging for a git/github migration. At some point 
we'll certainly have to do it, as SVN vs git is beginning to feel more and more like CVS vs SVN 
15 years ago.

I can see different options:

1) migrate to git, and remain within the OSGeo infrastructure. This is for example the case of
GEOS, which uses the Trac git plugin and the GOGS (or is it gitea?) git hosting
(https://git.osgeo.org/gogs/geos/geos.git). gogs/gitea tries to replicate most github
functionalities, but feature parity is still not there (e.g. you cannot comment on a commit).
We could still offer github as a mirror, which would ease contributions a bit compared to the
current git->svn situation, but the "Merge" button from github couldn't (shouldn't) be used.

2) migrate code to github, accept pull requests (PR) directly there. Tickets still managed in
Trac. But then we have no automated link between Trac and github (unless there's a Trac
plugin for that). A few other OSGeo projects are in a similar situation: QGIS with Redmine for
tickets and github for PR&code, GeoServer with JIRA for tickets and github for PR&code. I have
some experience with the QGIS situation: Redmine can include github commit references if
the git commit message includes the ticket number. But, of course, the other way doesn't
work: from the github UI, the #XXXX link doesn't work (or would point to an unrelated PR with
the same number). So this is only mildly satisfactory, and a regression from our current situation.
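
To illustrate the one-directional linking described in option 2, here is a minimal Python
sketch of how a ticket system could pick up ticket numbers from commit messages. The "#1234"
convention, the sample message and the function name are assumptions made for illustration;
this is not Redmine's or Trac's actual implementation.

```python
import re

# Assumption: commit messages reference tickets as "#" followed by digits.
TICKET_REF = re.compile(r"#(\d+)\b")


def ticket_refs(message):
    """Return the ticket numbers referenced in a commit message.

    Numbers are returned in order of first appearance, without duplicates.
    """
    seen = []
    for match in TICKET_REF.finditer(message):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


if __name__ == "__main__":
    # Hypothetical commit message, for illustration only.
    print(ticket_refs("GTiff: fix crash on corrupted file (fixes #7001, refs #6999)"))
```

The ticket system can resolve these numbers because it knows they are its own. The github UI
would resolve the same "#7001" against its own issue/PR numbering, which is why the reverse
link fails.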

3) migrate code and tickets to github. I guess this would match most (especially occasional)
contributor wishes regarding the "social" aspect. What would be needed is a Trac -> github
ticket migration. Thomas Bonfort did it for MapServer at some point, but he lost the script if I
remember well (I can see a number of possibilities listed in
https://stackoverflow.com/questions/6671584/how-to-export-trac-to-github-issues).
Another issue is that some numbers are already taken by existing github pull requests, so
there would be collisions on import. We could decide either to sacrifice the colliding Trac
tickets, as they are really old (currently the collisions appear for tickets older than 2003),
or to move them to an available github ticket number, or to sacrifice the existing PRs, but
there are a few pending ones.
There's also the valid concern about being tied to github.com regarding tickets. Recently I
found https://github.com/josegonzalez/python-github-backup
which can back up code, issues, pull requests, etc. using the github API.
I quickly tested it on its own repo. It seems to work (**), although it is a bit slow (it
requires 2 GETs per issue / pull request to retrieve extra details that are not retrieved by
the global request you showed above). It has an incremental mode though, which should make it
efficient.
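
The numbering collisions mentioned in option 3 can be made concrete with a small Python
sketch. It splits Trac ticket numbers into those that could keep their number and those that
clash with numbers already taken on github, then assigns the clashing ones the next free
numbers. The function names and the sample numbers are hypothetical; this is only one
possible policy (the "move them to an available number" variant), not a migration script.

```python
def split_collisions(trac_ids, taken_github_numbers):
    """Return (keep, collide): Trac ids that are free on github, and those that clash."""
    taken = set(taken_github_numbers)
    keep = sorted(t for t in trac_ids if t not in taken)
    collide = sorted(t for t in trac_ids if t in taken)
    return keep, collide


def renumber(collide, trac_ids, taken_github_numbers):
    """Map each colliding Trac id to a number used by neither Trac nor github."""
    used = set(trac_ids) | set(taken_github_numbers)
    next_free = max(used) + 1
    mapping = {}
    for ticket in collide:
        mapping[ticket] = next_free
        next_free += 1
    return mapping


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    trac_ids = list(range(1, 11))
    github_prs = [3, 7, 12]
    keep, collide = split_collisions(trac_ids, github_prs)
    print(keep, collide, renumber(collide, trac_ids, github_prs))
```

Since github numbers issues and pull requests from one shared sequence, any real import
would also have to create the issues in increasing order for the kept numbers to line up.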

My summary of the situation:
1) is a pure free software & free hosting approach. It relies on SAC being appropriately man
and machine powered (same as with SVN / Trac currently).
2) is a mixed solution. We still have ownership of all our data (code & tickets), but the
separation between a ticket system and code+PR isn't ideal from a usability point of view.
3) probably offers the best contributor experience. We lose a bit of control, but a backup (*)
strategy exists (at least for now). I'd tend to favor this approach.

I'm not sure if the current git mirror shouldn't be re-done from scratch. Its main drawback is 
that svn tags are reported as git branches, instead of git tags. I probably mis-configured 
things when I initiated the mirror a few years ago. Not completely sure this is worth the 
effort though. We can probably live with those existing mis-created tags, and use proper git 
tags for the future.
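
If the mis-created tags were to be fixed rather than lived with, one possible approach is to
create a proper git tag at the tip of each such branch. The Python sketch below only builds
the corresponding "git tag" command lines; it does not run them. The "tags/" branch prefix,
the "origin" remote name and the sample branch names are assumptions about how the mirror is
laid out, not something checked against the actual repository.

```python
def tag_commands(branches, prefix="tags/", remote="origin"):
    """Build one 'git tag <name> <remote>/<branch>' command per branch under prefix.

    Branches that do not start with the prefix are left alone.
    """
    commands = []
    for branch in branches:
        if not branch.startswith(prefix):
            continue
        tag_name = branch[len(prefix):]
        commands.append(["git", "tag", tag_name, f"{remote}/{branch}"])
    return commands


if __name__ == "__main__":
    # Hypothetical branch names, for illustration only.
    for command in tag_commands(["trunk", "1.11", "tags/1.11.0", "tags/2.2.1"]):
        print(" ".join(command))
```

Printing the commands first makes it possible to review them before running anything against
the repository.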

The release procedure / script would also have to be updated. Probably other things too.

There's also the question of the Trac Wiki, although this one might be deferred to a later
stage.

So this email is mostly to say I'm open to the idea, but I'd appreciate it if someone else could
take the lead on this. I'd be happy to help. An RFC to formalize the move would be needed.

Even

(*) Backup might not be the appropriate term, since this implies that you can easily restore
things. If github closes or requires (insane) fees for open source projects, those saved tickets
will have to be re-injected into some to-be-defined alternative, but at least their content is
readable (JSON).

(**) steps:
1) pip install github-backup
2) in the github UI, create a personal access token, so as to be able to use authenticated
requests to the github API, to bypass the rate limit of unauthenticated requests (you can use it
even for repositories that you don't own)
3) github-backup -i -R python-github-backup josegonzalez --issues --issue-comments --issue-events --pulls --pull-comments --pull-commits -t ${your_github_token}


-- 
Spatialys - Geospatial professional services
http://www.spatialys.com