[GRASS-SVN] r72875 - sandbox/wenzeslaus/g.citation

svn_grass at osgeo.org svn_grass at osgeo.org
Thu Jun 21 20:18:43 PDT 2018


Author: wenzeslaus
Date: 2018-06-21 20:18:43 -0700 (Thu, 21 Jun 2018)
New Revision: 72875

Modified:
   sandbox/wenzeslaus/g.citation/g.citation.html
   sandbox/wenzeslaus/g.citation/g.citation.py
Log:
g.citation: add plain/basic JSON outputs

Modified: sandbox/wenzeslaus/g.citation/g.citation.html
===================================================================
--- sandbox/wenzeslaus/g.citation/g.citation.html	2018-06-22 02:02:54 UTC (rev 72874)
+++ sandbox/wenzeslaus/g.citation/g.citation.html	2018-06-22 03:18:43 UTC (rev 72875)
@@ -8,10 +8,33 @@
 <h4>Citation File Format</h4>
 
 <a href="https://citation-file-format.github.io/">Citation File Format</a>
-(CFF)
+(CFF) is a YAML-based format for citations, specifically CITATION files
+to be included with software or code as <tt>CITATION.cff</tt>.
 
+<h4>JSON</h4>
+
+Currently, the keys and the overall structure are subject to change,
+but the plan is to stabilize them or to provide an existing metadata
+format in JSON. The pretty-printed version is good, e.g., for saving
+into files, while the compact version is good for further processing.
+
+<h4>Pretty printed Python dictionary</h4>
+
+This format is essentially a dump of the internal data structure holding
+the citation entry. It should not be used in scripts, i.e., parsed
+further; other formats such as JSON exist for that purpose. It is
+useful for exploring what information the module was able to
+acquire for the citation.
+
 <h2>NOTES</h2>
 
+<ul>
+    <li>Don't use the <tt>format=dict</tt> for further processing.
+        It is meant for exploration of what information the module
+        acquired.
+    <li>The structure of the JSON output is not yet guaranteed.
+</ul>
+
 <h2>EXAMPLES</h2>
 
 <div class="code"><pre>
@@ -23,7 +46,10 @@
 <ul>
     <li>More output formats or styles are needed.
         The following formats were suggested so far:
-        <tt>csl,datacite,dublincore,json,json-ld,narcxml</tt>
+        <tt>csl,datacite,dublincore,json-ld,narcxml</tt>
+    <li>The structure of the JSON output is not guaranteed. It reflects
+        the internal structure (only the empty entries are removed).
+    <li>Version and date in CFF output are incomplete.
 </ul>
 
 <h2>SEE ALSO</h2>

Modified: sandbox/wenzeslaus/g.citation/g.citation.py
===================================================================
--- sandbox/wenzeslaus/g.citation/g.citation.py	2018-06-22 02:02:54 UTC (rev 72874)
+++ sandbox/wenzeslaus/g.citation/g.citation.py	2018-06-22 03:18:43 UTC (rev 72875)
@@ -36,8 +36,8 @@
 #% key: format
 #% type: string
 #% description: Citation format or style
-#% options: bibtex,cff,dict,plain
-#% descriptions: bibtex;BibTeX;cff;Citation File Format;dict;Pretty printed Python dictionary;plain;Plain text
+#% options: bibtex,cff,json,pretty-json,dict,plain
+#% descriptions: bibtex;BibTeX;cff;Citation File Format;json;JSON;pretty-json;Pretty printed JSON;dict;Pretty printed Python dictionary;plain;Plain text
 #% answer: bibtex
 #% required: yes
 #%end
@@ -80,11 +80,33 @@
 import os
 import re
 from collections import defaultdict
+import json
 from pprint import pprint
 
 import grass.script as gs
 
 
+def remove_empty_values_from_dict(d):
+    """Removes empty entries from a nested dictionary
+
+    Iterates and recurses over instances of dict or list and removes
+    all empty entries. The emptiness is evaluated by conversion to bool
+    in an if-statement. Values which are instances of bool are passed
+    as is.
+
+    Note that plain dict and list are returned, not the original types.
+    Anything that is not an instance of dict or list is left
+    untouched.
+    """
+    if isinstance(d, dict):
+        return {k: remove_empty_values_from_dict(v)
+                for k, v in d.items() if v or isinstance(v, bool)}
+    elif isinstance(d, list):
+        return [remove_empty_values_from_dict(i)
+                for i in d if i or isinstance(i, bool)]
+    else:
+        return d
+
 # TODO: copied from g.manual, possibly move to library
 # (lib has also online ones)
 def documentation_filename(entry):
@@ -298,6 +320,25 @@
     print("}")
 
 
+def print_json(citation):
+    """Create JSON dump from the citation dictionary"""
+    cleaned = remove_empty_values_from_dict(citation)
+    # since the format is already compact, let's make it even more
+    # compact by omitting the spaces after separators
+    print(json.dumps(cleaned, separators=(',', ':')))
+
+
+def print_pretty_json(citation):
+    """Create pretty-printed JSON dump from the citation dictionary"""
+    cleaned = remove_empty_values_from_dict(citation)
+    # the default separator for list items would leave space at the end
+    # of each line, so providing a custom one
+    # only small indent needed, so using 2
+    # sorting keys because only that can provide consistent output
+    print(json.dumps(cleaned, separators=(',', ': '), indent=2,
+                     sort_keys=True))
+
+
 def print_plain(citation):
     """Create citation from dictionary as plain text
 
@@ -327,8 +368,10 @@
 _FORMAT_FUNCTION = {
     'bibtex': print_bibtex,
     'cff': print_cff,
+    'json': print_json,
+    'pretty-json': print_pretty_json,
     'plain': print_plain,
-    'dict': pprint,
+    'dict': lambda d: pprint(dict(d)),  # only plain dict pretty prints
 }
 
 
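The pruning described in the docstring of remove_empty_values_from_dict can be tried in isolation. The following is a minimal standalone sketch, not the module's code: the function is renamed remove_empty_values, its list filter tests the loop variable i, and the sample citation dictionary is made up for illustration (it is not actual g.citation output).

```python
def remove_empty_values(d):
    """Recursively drop falsy entries, passing bool values through as is."""
    if isinstance(d, dict):
        return {k: remove_empty_values(v)
                for k, v in d.items() if v or isinstance(v, bool)}
    elif isinstance(d, list):
        return [remove_empty_values(i)
                for i in d if i or isinstance(i, bool)]
    return d


# hypothetical citation data, for illustration only
citation = {
    'module': 'g.example',
    'authors': [{'name': 'A. Author', 'email': ''}, {}],
    'keywords': [],
    'year': None,
    'deprecated': False,
}

cleaned = remove_empty_values(citation)
print(cleaned)

# the emptiness test runs before the recursion, so a container that
# only becomes empty after cleaning is kept
nested = remove_empty_values({'a': {'b': ''}})
print(nested)
```

The empty string, the empty list, the empty dictionary and None are dropped, while False is kept. Since emptiness is checked before recursing, a nested container whose entries are all removed remains in the result as an empty container.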


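The difference between the json and pretty-json outputs comes down to the arguments passed to json.dumps. A small sketch using the same arguments as print_json and print_pretty_json, assuming an already cleaned, made-up citation dictionary:

```python
import json

# hypothetical, already cleaned citation data
cleaned = {'module': 'g.example', 'authors': [{'name': 'A. Author'}]}

# compact: no whitespace after the item and key separators
compact = json.dumps(cleaned, separators=(',', ':'))

# pretty: indented, no trailing spaces, keys sorted for stable output
pretty = json.dumps(cleaned, separators=(',', ': '), indent=2,
                    sort_keys=True)

print(compact)
print(pretty)
```

Both strings parse back to the same data; the compact one fits on a single line, and the pretty one is stable across runs because the keys are sorted.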
