[GRASS-user] scripting efficiency

Pietro peter.zamb at gmail.com
Fri Oct 18 08:27:21 PDT 2013


Hi Wiley,

On Wed, Oct 16, 2013 at 6:59 AM, Wiley Bogren <wiley.bogren at gmail.com> wrote:
> hi GRASS community!
>
> I'm amazed how well this software handles vector operations - especially the
> overlay operation seems unparalleled in open source software.  Thank you
> very much to everyone who has been involved in the development process!
>
> I would like to script a workflow where I apply the same set of operations
> on a few hundred sets of shapefiles, consisting of v.in.ogr, several sets of
> v.overlay, some database operations and v.out.ogr.  The shapefiles are
> 20-30MB apiece, containing many polygons, each with many vertices.
>
> Is there a difference in speed or processor efficiency between the different
> scripting approaches?  By which I mean python vs bash shell, and within the
> GRASS environment vs calling the functions from outside the environment
> (like via python grass.script).

Sorry for the late response.
I imported a batch of shapefiles in parallel using Python's multiprocessing
module, with one worker process per CPU core by default:

{{{
from multiprocessing import Queue, Process, cpu_count
from os.path import split
from subprocess import Popen

from grass.pygrass.functions import findfiles


def spawn(func):
    def fun(q_in, q_out):
        while True:
            path, cmdstr = q_in.get()
            if path is None:
                break
            q_out.put(func(path, cmdstr))
    return fun


def mltp_importer(dirpath, match, cmdstr, func, nprocs=cpu_count()):
    q_in = Queue(1)
    q_out = Queue()
    procs = [Process(target=spawn(func), args=(q_in, q_out))
             for _ in range(nprocs)]
    for proc in procs:
        proc.daemon = True
        proc.start()

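    # feed one (path, cmdstr) job per file to the workers
    njobs = 0
    for path in findfiles(dirpath, match):
        q_in.put((path, cmdstr))
        njobs += 1
    # send one end-of-work marker per worker
    for _ in procs:
        q_in.put((None, None))
    # collect the results before joining: a worker cannot exit while
    # its results are still buffered in q_out, so join() first may hang
    results = [q_out.get() for _ in range(njobs)]
    for proc in procs:
        proc.join()
    return results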


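def importer(path, cmdstr):
    # map name = file name without its extension (e.g. '.shp')
    name = split(path)[-1].rsplit('.', 1)[0]
    # shell=True: the shell parses the command, so paths containing
    # spaces would need quoting
    popen = Popen(cmdstr.format(path=path, name=name), shell=True)
    popen.wait()
    # the last item is True when the command exited with status 0
    return path, name, popen.returncode == 0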

DIR = '/data/gis/data/Aviemore/shp'
CMD = 'v.in.ogr dsn={path} layer={name} output={name} -o --o'

processed = mltp_importer(DIR, '*.shp', CMD, importer)
# check for errors
errors = [p for p in processed if not p[2]]
if errors:
    # do something
    pass
}}}
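
Since each job above only launches an external command and waits for it, a pool of threads may be enough to get the same fan-out. What follows is a minimal sketch of that alternative, not the code I actually ran: it assumes Python 3 (concurrent.futures from the standard library), the names run_job and run_all are made up for illustration, and the command template uses the same {path} and {name} placeholders as CMD. For GRASS commands it would still have to be started from inside a GRASS session.

```python
import os
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_job(path, cmd_template):
    """Run one external command for one file; return (path, name, ok)."""
    # hypothetical helper: name = file name without its extension
    name = os.path.splitext(os.path.basename(path))[0]
    # split the template first, then fill in the placeholders, so that a
    # path containing spaces stays a single argument (no shell involved)
    args = [part.format(path=path, name=name)
            for part in shlex.split(cmd_template)]
    returncode = subprocess.call(args)
    return path, name, returncode == 0


def run_all(paths, cmd_template, nworkers=4):
    """Run run_job for every path using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # map() yields the results in the same order as the input paths
        return list(pool.map(lambda p: run_job(p, cmd_template), paths))
```

With a list of shapefile paths, run_all(paths, CMD) should return the same (path, name, ok) tuples as mltp_importer, so the error check at the end of the script could stay as it is. Threads are a reasonable choice here only because the heavy work happens in the child processes; for CPU-bound work done in Python itself, separate processes as in the script above would be the safer option.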

I hope this helps.
The code is loosely based on: http://stackoverflow.com/a/16071616

