[GRASS-dev] [GRASS GIS] #2579: Specify command to be executed as parameter of grass command

Tue Feb 3 20:28:25 PST 2015

#2579: Specify command to be executed as parameter of grass command
----------------------------------------------+-----------------------------
 Reporter:  wenzeslaus                        |       Owner:  grass-dev@…              
     Type:  enhancement                       |      Status:  new                      
 Priority:  normal                            |   Milestone:  7.1.0                    
Component:  Startup                           |     Version:  svn-trunk                
 Keywords:  batch job, GRASS_BATCH_JOB, init  |    Platform:  All                      
      Cpu:  Unspecified                       |  
----------------------------------------------+-----------------------------
 To run modules from outside of GRASS, you currently have to either set up
 the
 [http://grasswiki.osgeo.org/wiki/Working_with_GRASS_without_starting_it_explicitly
 environment yourself], which is hard and error prone (you likely won't get
 it right anyway), or use the `grass` command in batch mode. For the
 latter, you set the `GRASS_BATCH_JOB` environment variable and then call
 GRASS GIS:

 {{{
 export GRASS_BATCH_JOB=.../test_script.sh
 grass7 ~/grassdata/location/mapset
 }}}
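
 To pass arguments to a job with the existing mechanism, one workaround is
 to generate a one-off job script. The following is only a minimal sketch
 under stated assumptions: the names `run_in_grass` and `GRASS_CMD` are
 made up for illustration, and it assumes that `GRASS_BATCH_JOB` must point
 to an executable file.

```shell
# run_in_grass MAPSET COMMAND [ARGS...]
# Hypothetical helper: wraps COMMAND and its arguments in a temporary job
# script, because GRASS_BATCH_JOB is assumed to accept only a script path.
run_in_grass() {
    mapset=$1
    shift
    job=$(mktemp) || return 1
    {
        printf '#!/bin/sh\n'
        printf 'exec'
        for arg in "$@"; do
            # Single-quote each argument, escaping embedded single quotes
            printf " '%s'" "$(printf '%s' "$arg" | sed "s/'/'\\\\''/g")"
        done
        printf '\n'
    } > "$job"
    chmod +x "$job"
    # GRASS_CMD is an assumption of this sketch; it defaults to grass7
    GRASS_BATCH_JOB=$job "${GRASS_CMD:-grass7}" "$mapset"
    status=$?
    rm -f "$job"
    return $status
}
```

 It would be called as `run_in_grass ~/grassdata/location/mapset r.info
 map=elevation`.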

 Although this works, it can be quite cumbersome, especially in some
 languages. Python, for example, has a much smoother interface where you
 just specify the script and its arguments:

 {{{
 python .../test_script.py arg1 arg2 ...
 }}}

 The attached patch introduces an additional interface for the `grass`
 command that lets you call scripts like this:

 {{{
 grass7 --mapset ~/grassdata/location/mapset --batch .../test_script.sh
 }}}

 It also accepts script parameters, GRASS modules, and generally any
 command:

 {{{
 grass7 --mapset ~/grassdata/location/mapset --batch .../test_script.sh some parameters
 grass7 --mapset ~/grassdata/location/mapset --batch r.info map=elevation
 }}}

 If you are fine with what is in the rc file, you can use just:

 {{{
 grass7 --batch r.info map=elevation
 }}}

 But I'm not sure if it is a best practice.

 I wrote the patch so that you get no additional output, only the output
 of the module, unless something unusual happens (e.g., a new location is
 created):

 {{{
 $ grass71 --mapset ~/grassdata/location/mapset --batch r.info map="elevation" -g
 north=228500
 south=215000
 east=645000
 west=630000
 nsres=10
 ewres=10
 rows=1350
 cols=1500
 cells=2025000
 datatype=FCELL
 ncats=255
 }}}

 I tried to preserve the functionality of `GRASS_BATCH_JOB` including the
 GRASS textual output and sanity checks.

 When both `GRASS_BATCH_JOB` and `--batch` are provided, `--batch` is used
 and `GRASS_BATCH_JOB` is ignored. As the Python documentation says, ''it
 is customary that command-line switches override environmental variables
 where there is a conflict'' (e.g. `gcc` follows the same practice).
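
 The precedence rule can be sketched in a few lines of shell. This is not
 the code from the patch; `resolve_batch_job` is a hypothetical name, and
 the sketch only reports the first word following `--batch`.

```shell
# resolve_batch_job [OPTIONS...]
# Prints the job to run: the word after --batch if the switch is present,
# otherwise the value of the GRASS_BATCH_JOB environment variable.
resolve_batch_job() {
    job=${GRASS_BATCH_JOB:-}
    while [ $# -gt 0 ]; do
        case $1 in
            --batch)
                # The command-line switch overrides the environment variable
                shift
                job=${1:-}
                break
                ;;
        esac
        shift
    done
    printf '%s\n' "$job"
}
```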

 The names `--mapset` and `--batch` seemed to me the best choice, although
 there are other good options too, such as `--run`.

 To test, try something like:

 {{{
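 # EOF is quoted so that $(date), $0 and $@ are expanded when the script
 # runs, not when the file is written
 cat > test_script.sh <<'EOF'
 #!/bin/bash
 echo "Hello from GRASS GIS ($(date))"
 echo "This is what was called: $0 $@"
 EOF
 # the script most likely has to be executable to be run as a command
 chmod +x test_script.sh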
 }}}

 {{{
 grass7 --mapset ~/grassdata/location/mapset --batch test_script.sh some parameters
 }}}

 {{{
 grass7 --mapset ~/grassdata/location/mapset --batch r.mapcalc "aaa = 5"
 grass7 --mapset ~/grassdata/location/mapset --batch r.info aaa
 }}}

 The GUI works too (a module called without parameters opens its dialog),
 although I'm not sure it is useful; it could even be inconvenient for
 scripting:

 {{{
 grass7 --mapset ~/grassdata/location/mapset --batch r.info
 }}}

 From what I see now, the only issue with calling individual modules is
 that you cannot (or should not) run several `grass` commands in parallel
 in the same mapset.
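
 Scripts that may start at the same time could serialize their calls with
 a simple lock. The following is a sketch only: `with_mapset_lock` and the
 lock directory are hypothetical and not part of GRASS GIS or the patch.

```shell
# with_mapset_lock LOCK_DIR COMMAND [ARGS...]
# Runs COMMAND while holding a lock, so that concurrent callers using the
# same LOCK_DIR run one after another. The parent of LOCK_DIR must exist.
with_mapset_lock() {
    lock_dir=$1
    shift
    # mkdir is atomic: it succeeds for exactly one caller at a time
    until mkdir "$lock_dir" 2>/dev/null; do
        sleep 1
    done
    "$@"
    status=$?
    rmdir "$lock_dir"
    return $status
}
```

 For example: `with_mapset_lock /tmp/mapset.lock grass7 --mapset
 ~/grassdata/location/mapset --batch r.info aaa`.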

 == Additional ideas ==

 This is out of scope for this ticket, but there is potential to create an
 even more powerful interface, similar to that of, say, `git`.

 {{{
 #!sh
 mkdir some_project
 cd some_project
 # init connects to an existing database, location and mapset, or creates a new one
 # and creates a .grassrc (.rc or .gisrc) file in the current directory
 grass7 init ~/grassdata/location/mapset [-c | -c geofile | -c EPSG:code[:datum_trans]]
 grass7 import .../some_image.tiff
 grass7 run r.info some_image
 grass7 run r.mapcalc "improved_image = 5 * some_image"
 grass7 export improved_image .../improved_image.tiff
 # next time you can cd into the some_project directory and commands will work right away
 # because the .grassrc file will already be there
 }}}

 Some commands such as `grass7 link` or `grass7 external` might be quite
 useful, although, like `grass7 import` and `grass7 export`, they would be
 just wrappers around the appropriate `r.in.gdal`, `r.in.proj`, etc. calls.
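
 As an illustration of how thin such subcommands could be, the sketch
 below only prints the module call that each subcommand would stand for.
 The function name `translate_subcommand` is hypothetical, and using
 `r.out.gdal` for export and the file name without its extension as the
 map name are assumptions made here, not part of the proposal.

```shell
# translate_subcommand SUBCOMMAND [ARGS...]
# Prints the module call a subcommand would map to (dry run only).
translate_subcommand() {
    case $1 in
        import)
            # Assumption: name the map after the file, minus its extension
            name=$(basename "$2")
            printf 'r.in.gdal input=%s output=%s\n' "$2" "${name%.*}"
            ;;
        export)
            # Assumption: r.out.gdal is the counterpart used for export
            printf 'r.out.gdal input=%s output=%s\n' "$2" "$3"
            ;;
        run)
            # Pass the module and its parameters through unchanged
            shift
            printf '%s\n' "$*"
            ;;
        *)
            echo "unknown subcommand: $1" >&2
            return 1
            ;;
    esac
}
```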

 It would be even more interesting to have:

 {{{
 grass7 run r.slope.aspect elevation=file://.../elevation.tiff aspect=file://...aspect.tiff
 }}}

 The `grass` command would have to parse the command line, find the files
 that should be maps, and link them. And if the subcommand were not
 `grass7 run` but something different, such as `grass7 runonly`, we could
 even skip the `.grassrc` file, create a location on the fly in `/tmp`, and
 delete it after execution. If the data were only linked, not imported and
 exported, it could be pretty fast. (But obviously we could hit issues
 with projection and topology here, so it is a bit tricky.)
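
 The parsing step could start out as simple as the sketch below, which
 only finds the parameters whose values use the `file://` prefix. The
 function name `list_file_args` is hypothetical, and the linking itself is
 left out.

```shell
# list_file_args KEY=VALUE...
# Prints "KEY PATH" for every argument whose value starts with file://
list_file_args() {
    for arg in "$@"; do
        case $arg in
            *=file://*)
                key=${arg%%=*}          # text before the first "="
                path=${arg#*=file://}   # text after the file:// prefix
                printf '%s %s\n' "$key" "$path"
                ;;
        esac
    done
}
```

 Parameters without the prefix, such as `format=degrees`, are skipped and
 would be passed to the module unchanged.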

-- 
Ticket URL: <http://trac.osgeo.org/grass/ticket/2579>
GRASS GIS <http://grass.osgeo.org>