[GRASS-PSC] DataCite: Persistent identifiers for GRASS GIS - and OSGeo

"Peter Löwe" peter.loewe at gmx.de
Tue Jan 3 07:59:30 PST 2017


Dear GRASS PSC, Allen, and Helmut, (cc: Britta Dreyer, DataCite)

A happy new year to all of you!

Thanks for your feedback on the use of persistent identifiers (PIDs) such as DOIs and handles, and for sharing the experiences of CoMSES/ASU.

Sorry for taking so long to reply!

As Michael has pointed out, GRASS is indeed a dynamic, living digital object, consisting of different branches of code and documentation, augmented with add-on modules.
The use of persistent identifiers like Digital Object Identifiers (DOIs) or handles for all aspects of the research cycle (including software like GRASS GIS) is still evolving. Currently, it is easy to assign a PID to relatively static content, like a software release or a data set, while a dynamic digital object (or cluster of objects?) like GRASS is still a challenge. On the positive side, options for handling this already exist (recommendations by FORCE11, depending on the use case).

A dialogue on how to address and overcome such challenges would be welcomed by DataCite, the organisation that drives the citation of scientific data via DOIs. DataCite is an NGO headquartered in Germany and is itself non-commercial (https://en.wikipedia.org/wiki/DataCite). However, DataCite members _can_ elect to operate commercially, charging their customers for DOI minting, since they provide the infrastructure (landing pages, servers, archiving/backup).

I suggest starting a discussion within the GRASS community to clarify the potential benefits of assigning PIDs to aspects of the GRASS project, such as GRASS releases, development snapshots, add-on modules and the like (-> a topic for FOSS4G?).

As a lot of potential GRASS add-ons are created in various fields of academia, I assume that a workflow which assigns a PID to add-on module code uploaded into the repository (provided it meets basic quality criteria) would be beneficial in at least three ways: 1) the researchers receive a reward in the "coin of their realm" for their scientific code, which can be cited in publications; 2) the GRASS community benefits as additional code and functionality becomes visible and usable for all of us; 3) GRASS as a whole receives more visibility within academia through PID search portals.

Example: The DataCite search engine already lists a significant amount of GRASS-related content (http://search.datacite.org/works?query=GRASS+GIS), dating back to 1987 (https://doi.org/10.5446/12963). BTW, it is also worthwhile to query it for the names of PSC members...
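
For anyone who wants to run such queries in bulk (for example, one per PSC member name), here is a minimal Python sketch that builds search links of the same form as the one above. The helper name is made up for illustration, and the sketch only constructs the URL; it makes no assumption about the format in which the search service answers.

```python
from urllib.parse import urlencode

# Base URL copied from the search link in this mail.
SEARCH_BASE = "http://search.datacite.org/works"


def datacite_search_url(terms):
    """Build a search link for free-text terms (hypothetical helper)."""
    # urlencode() escapes spaces as '+', matching the link format above.
    return SEARCH_BASE + "?" + urlencode({"query": terms})


print(datacite_search_url("GRASS GIS"))
# prints http://search.datacite.org/works?query=GRASS+GIS
```

Looping over a list of names and opening the resulting links in a browser would be enough for a first survey of what is already registered.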

Should the GRASS community, or some other OSGeo software project, decide to start using PIDs, especially DOIs, some strategic opportunities arise:

- Using the services of an existing DataCite member would require an annual fee of "x" USD/EUR to mint up to "y" DOIs (ca. 150 USD for 1000 DOIs). Currently this is only viable for git-based repositories. As GRASS is still using SVN, this is a problem.

- Alternatively, GRASS, or OSGeo(!), could elect to become a DataCite member itself. This also comes with an annual fee, but would enable GRASS/OSGeo to mint as many DOIs as needed and would solve the SVN issue. With the existing OSGeo infrastructure, meeting the DataCite infrastructure requirements (landing pages, archiving/backup, best practices) will not be difficult.

As already mentioned, DataCite would highly value a dialogue with GRASS / OSGeo on how to better address software communities. They suggest giving us access to their API via a test account, to investigate mappings between the metadata standards of GRASS / OSGeo and DataCite. This would allow us on the GRASS/OSGeo side to mint DOIs with an expiration date: these "short-lived DOIs" expire after 3 months, as they are only meant for testing and evaluation.
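
To give an idea of what such a metadata mapping could look like, here is a rough Python sketch. It is only an illustration under assumptions: the add-on fields (name, description, authors, year) are a made-up minimal description of an add-on module, the publisher string and the DOI are placeholders, and the target keys simply follow the names of the mandatory properties of the DataCite metadata schema (identifier, creator, title, publisher, publication year, resource type). It does not call the DataCite API.

```python
# Mandatory DataCite properties, used here as plain dictionary keys.
MANDATORY = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceTypeGeneral",
)


def addon_to_datacite(addon, doi):
    """Map a (hypothetical) add-on description to a DataCite-style record.

    Returns the record and the list of mandatory properties still missing.
    """
    title = addon.get("name", "")
    if title and addon.get("description"):
        title = title + ": " + addon["description"]
    record = {
        "identifier": doi,
        "creators": list(addon.get("authors", [])),
        "titles": [title] if title else [],
        # Placeholder publisher name, an assumption for this sketch.
        "publisher": addon.get("publisher", "GRASS GIS add-ons repository"),
        "publicationYear": addon.get("year"),
        "resourceTypeGeneral": "Software",
    }
    missing = [key for key in MANDATORY if not record.get(key)]
    return record, missing


example = {
    "name": "r.example",  # made-up module name
    "description": "Demonstration raster module",
    "authors": ["Jane Doe"],
    "year": 2017,
}
record, missing = addon_to_datacite(example, "10.5072/example-addon")
print(missing)
```

The list of missing properties is the interesting part: it would show which information the add-on repository has to collect at upload time before a DOI could be requested.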

This could be investigated as part of a Google Summer of Code (GSoC) project, like the ones we had in past years (example: https://grasswiki.osgeo.org/wiki/GRASS_GSoC_2012_Image_Segmentation).

Please let me know what you think of this.
 

best,
Peter


<peter.loewe at gmx.de>


> Sent: Monday, 21 November 2016 at 18:28
> From: "Michael Barton" <Michael.Barton at asu.edu>
> To: "Helmut Kudrnovsky" <hellik at web.de>
> Cc: "grass-psc at lists.osgeo.org" <grass-psc at lists.osgeo.org>
> Subject: Re: [GRASS-PSC] Introducing DOI for software, documentation and data in the GRASS project
>
> In fact, our plan is to mint DOI's via Zenodo to replace handles minted by ASU Libraries in the future.
> 
> Michael
> ____________________
> C. Michael Barton
> Director, Center for Social Dynamics & Complexity 
> Professor of Anthropology, School of Human Evolution & Social Change
> Head, Graduate Faculty in Complex Adaptive Systems Science
> Arizona State University
> 
> voice:  480-965-6262 (SHESC), 480-965-8130/727-9746 (CSDC)
> fax: 480-965-7671 (SHESC),  480-727-0709 (CSDC)
> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
> 
> > On Nov 21, 2016, at 7:20 AM, Helmut Kudrnovsky <hellik at web.de> wrote:
> > 
> > Michael Barton wrote
> >> Markus and Co.
> >> 
> >> This is something CoMSES Net (Network for Computational Modeling in Social
> >> and Ecological Sciences: http://www.comses.net) has been working with for
> >> some years now. We maintain a software code library, where researchers can
> >> publish model code. We also provide for the option of code peer review,
> >> which can happen when code is submitted to the library for review along
> >> with a paper sent to a journal, or independent of any paper review. Code
> >> that has passed peer review is currently assigned a “handle” from
> >> handle.net. Handle.net is the organization that oversees the digital
> >> identifier ecosystem. DOI’s are commercial instances and handles are
> >> open source instances, but both are ultimately under the purview of
> >> handle.net.
> >> With a new grant from NSF, CoMSES Net is now part of a new national data
> >> infrastructure network in the US. One of our plans is to transition from
> >> handles to DOI’s because these are more widely recognized.
> >> 
> >> Given all this, we’ve had to think quite a bit about how to ‘publish’
> >> model code and assign identifiers. As Vaclav points out there are
> >> significant issues with versioning. What happens with a new version? We’ve
> >> adopted a conceptual position that we are not a versioning repository
> >> primarily, but a place where authors can publish ‘finished’ code used in a
> >> research project or product. We are trying to treat this like a library
> >> and journal environment in that sense. We allow for minor revisions to
> >> correct errors (including as a response to reviews). But if a new product
> >> (e.g., a research paper) uses a new version of model code, we consider
> >> that a new digital object published, which could get a new handle/DOI
> >> distinct from a version of a model used for an earlier product. This
> >> remains something that is complicated to implement in practice. But the
> >> concept involves the reason for giving out the handle/DOI in the first
> >> place.
> >> 
> >> Currently, only about 10% of published model based science makes code
> >> available for review or reuse. We think it is increasingly important that
> >> researchers share the code that is an important component to scientific
> >> practice in the same way they share research protocols and results—and are
> >> increasingly encouraged to share data. But sharing code takes effort, and
> >> even researchers with the best intentions may find it difficult to find
> >> the time or energy to make code available. So we are trying to create
> >> incentives that will have some value in the academic/research world,
> >> including citable products. All models published in the CoMSES Net library
> >> have automatically generated citations. Those that have passed peer
> >> review, verifying some degree of software quality, are also given
> >> permanent identifiers (handles/DOIs), with the idea that researchers can
> >> put them on their CVs where they at least have the possibility of gaining
> >> them some recognition for the work carried out. That is, we consider a DOI
> >> as an incentive for sharing code and a bit of a lever to get others to
> >> cite that code if they use it.
> >> 
> >> We are still trying to work out how best to handle improvements (bug
> >> fixes) to a model vs. new models. We are moving our library to a Git
> >> environment, but are still working out how to implement our concept of
> >> “published” snapshots of code in a library/journal in versions and
> >> releases in Git. We do have a roadmap and are working on it, but we don’t
> >> yet have a solution in place.
> >> 
> >> Where is all this leading? We need to ask what is the value to assigning
> >> DOIs to GRASS code, how might they benefit GRASS developers, and how might
> >> they be used by GRASS software users? I don’t see that they provide the
> >> kind of incentives that CoMSES Net is envisioning for computational model
> >> developers. Most DOIs are assigned to finished products as digital
> >> objects. From that perspective, GRASS could get a DOI, but not its
> >> component modules. But what about each version of GRASS?  GRASS has formal
> >> releases, but not its components. Some code is in the released code base
> >> and other is in addons. There is ongoing development in the SVN. GRASS is
> >> a digital object of course, as are its component code modules, but it is a
> >> dynamic, living one and not a static one. Perhaps there are other benefits
> >> to working out the complications of where and when to assign DOIs in the
> >> GRASS ecosystem. But it will be good to start with a discussion of why and
> >> for whom we would do it.
> >> 
> >> (I’m copying Allen Lee from the CoMSES Net leadership team as he has
> >> thought a lot about this and might have other things to add.)
> >> 
> >> Cheers
> >> Michael
> > 
> > some kind of related:
> > 
> > http://ivory.idyll.org/blog/2016-using-zenodo-to-archive-github.html
> > https://zenodo.org/
> > 
> > -----
> > best regards
> > Helmut
> > --
> > View this message in context: http://osgeo-org.1560.x6.nabble.com/Introducing-DOI-for-software-documentation-and-data-in-the-GRASS-project-tp5296235p5296759.html
> > Sent from the GRASS-PSC mailing list archive at Nabble.com.
> > _______________________________________________
> > grass-psc mailing list
> > grass-psc at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/grass-psc

