On 10/21/21 11:39 PM, B H wrote:
> I am trying to parallelize some scripts that currently run only one 
> way. (Currently the scripts just create a new temporary mapset and run 
> some commands using the --exec option of grass78.)
> My understanding is that I should follow this, however I am unable to 
> create a new mapset using this.
> https://grasswiki.osgeo.org/wiki/GRASS_and_Shell#Automated_batch_jobs:_Setting_the_GRASS_environmental_variables 
> Here is what I have tried, without success:
> 1) The grass78 executable cannot be run in parallel from bash, even just 
> to create the mapset
> 2) g.mapset -c
> 3) g.proj -c
> If there is a different way to create a new mapset, please let me know 
> (I am not sure if it's as simple as copying some files from a template 
> location...)

This seems to work for me, using GNU parallel.

I created a list of shapefiles:

micha@RMS:tmp$ head -3 list_of_shapefiles.txt

to use as input. Then I prepared a script to run GRASS on each, in its 
own separate Location.

micha@RMS:tmp$ cat grass_process.sh
#!/bin/bash
if [ $# -eq 0 ]; then
   echo "Input geospatial vector file is required."
   echo "Syntax: grass_process.sh <input_vector>"
   exit 1
fi
input=$1
output=`basename "$input" .shp`

# Prepare temporary name for Location (under the /tmp directory)
tmp_location=`mktemp -u`

# Create temporary GRASS location in that directory,
# using the shapefile as georeference, and run a command
grass78 -c "$input" "${tmp_location}" --exec \
   v.import input="$input" output="$output" --overwrite
sleep 10

The sleep at the end is just to give me time to check with ps -ef that 
all processes are running.

I made the script executable, of course.

Then here's the call to parallel:

micha@RMS:tmp$ parallel -j 10 ./grass_process.sh < list_of_shapefiles.txt
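
If GNU parallel is not installed, xargs can do a similar fan-out. This is only a minimal sketch, assuming an xargs that supports -P (the GNU and BSD versions do, but it is not in POSIX). The name run_jobs is a hypothetical helper, not something that ships with GRASS or parallel:

```shell
# run_jobs LIST [NJOBS]  (hypothetical helper, not part of GRASS or parallel)
# Runs ./grass_process.sh once per line of LIST, at most NJOBS at a time.
run_jobs() {
    local list=$1 njobs=${2:-10}
    # -I{} makes xargs take one input line per invocation
    xargs -P "$njobs" -I{} ./grass_process.sh {} < "$list"
}

# Usage: run_jobs list_of_shapefiles.txt 10
```

Unlike parallel, xargs does not group the output of each job, so messages from different GRASS runs may interleave on the terminal.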

and after I fire it off, in a separate terminal I see:

micha@RMS:tmp$ ps -ef | grep grass
micha      45460   12434  0 10:04 pts/2    00:00:00 perl /usr/bin/parallel -j 10 ./grass_process.sh
micha      45481   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/cellular_antennas.shp
micha      45483   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/cities.shp
micha      45485   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/contour_20m.shp
micha      45488   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/contour_50m.shp
micha      45492   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/il_seas.shp
micha      45496   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/mideast_cities.shp
micha      45500   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/reshut_nikuz.shp
micha      45504   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/roads_ITM.shp
micha      45508   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/roads_negev.shp
micha      45512   45460  0 10:04 pts/2    00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/roads.shp

all merrily going on together.

This will leave you with separate Locations/mapsets for each input file. 
I don't think you can get around that. GRASS itself is NOT parallelized, 
so you cannot run more than one GRASS process in the same mapset.
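
If separate mapsets inside one shared Location are enough for your case, the same -c flag can also create a new mapset when it is given a path below an existing Location. The following is only a sketch under that assumption: it presumes the Location already exists, and the function name and the job_ prefix are made up for illustration.

```shell
# Hypothetical helper: run one import in its own mapset of an existing Location.
# $1 = path to an existing Location (assumption: created beforehand)
# $2 = input vector file
import_in_own_mapset() {
    local location=$1 input=$2
    local output mapset
    output=$(basename "$input" .shp)
    mapset="job_${output}_$$"      # unique per input file and per process
    # -c with a Location/mapset path creates the mapset if it does not exist
    grass78 -c "$location/$mapset" --exec \
        v.import input="$input" output="$output" --overwrite
}
```

Each job still gets a mapset of its own, so no two GRASS sessions share one. File names containing characters that are not valid in mapset names would need to be sanitized first.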

If you need to do a more complicated procedure, then wrap all the GRASS 
commands into another shell script and pass that script to the --exec 
parameter instead of individual commands.
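
As a rough sketch of that approach: the snippet below writes a small job script and shows, in a comment, how grass_process.sh could hand it to --exec. The file name grass_job.sh and the choice of a second module (v.info) are assumptions for illustration only.

```shell
#!/bin/bash
# Sketch: write a multi-step GRASS job script to pass to --exec.
# grass_job.sh and the module sequence are illustrative assumptions.

cat > grass_job.sh <<'EOF'
#!/bin/bash
# Runs inside the GRASS session started by --exec.
# $1 = input vector file, $2 = output map name
set -e
v.import input="$1" output="$2" --overwrite
v.info -t map="$2"     # print a topology summary of the imported map
EOF
chmod +x grass_job.sh

# Then, in grass_process.sh, replace the single v.import call with:
#   grass78 -c "$input" "$tmp_location" --exec "$PWD/grass_job.sh" "$input" "$output"
```

Using the full path to the job script avoids depending on the current directory being in PATH when GRASS launches it.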


> #current grass scripts....
> #Step 1: Create a new location with permanent mapset
> grass78 -c  /path/location/PERMANENT -e
> #Step 2: run some commands using the --exec option
> grass78 --exec v.in.ogr input='$inputfile' output=$vecoutname 
> location=path/location/PERMANENT
