[GRASS-user] multiprocessing in python

Moritz Lennert mlennert at club.worldonline.be
Wed Feb 7 00:59:24 PST 2018


[Please always keep the list in CC.]

On 06/02/18 22:57, Leonardo Hardtke wrote:
> Hi, thanks Moritz.
> I tried with your suggestion but I get the same error out...
> 
> As a side note, if the process does not read any data in, it works as 
> expected (i.e. with the for loop commented out).

Can you identify which specific call in the loop triggers the error?

Have you tried launching with

pool.map(tile_process, [1, 2]) ?
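
If the traceback printed by the pool does not show where the failure
happens (depending on the Python version, the worker's traceback can
get lost on the way back to the parent), a small wrapper can catch the
exception inside the worker and return the traceback as text. A minimal
sketch, not tested against your module: tile_process below is only a
stand-in for your real function, and traced is a hypothetical helper
name.

```python
import multiprocessing
import traceback


def tile_process(tile_index):
    # Stand-in for the real per-tile function (assumption for this sketch):
    # it fails on tile 2 so that there is something to catch.
    if tile_index == 2:
        raise ValueError("simulated failure in tile %d" % tile_index)
    return tile_index * 10


def traced(tile_index):
    # Hypothetical helper: run tile_process inside the worker and send the
    # formatted traceback back to the parent instead of raising.
    try:
        return tile_index, tile_process(tile_index), None
    except Exception:
        return tile_index, None, traceback.format_exc()


if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=2)
    try:
        for index, result, tb in pool.map(traced, [1, 2]):
            if tb is None:
                print("tile %d -> %r" % (index, result))
            else:
                print("tile %d failed:\n%s" % (index, tb))
    finally:
        pool.close()
        pool.join()
```

With your real tile_process in place of the stand-in, the returned
traceback should point at the line inside the loop that raises.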

> 
> I have a similar approach working OK with plain gdal 
> (https://gist.github.com/leohardtke/b54e79ed93546c0db840c7b5e951a6ce).
> 
> There must be something with the grass raster python module, but I can't 
> figure it out.

Not sure if it is raster, or rather temporal dataset handling. I don't 
have time to look at this in detail now, so I'm putting grass-dev in CC 
so you might get some answers from people more knowledgeable in temporal 
data processing than me.
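
For context, the message itself comes from Python's multiprocessing
module, not from GRASS: Process.is_alive() asserts that it is called
from the process that started that Process object. One possible
explanation is that an object created in the parent (dbif, for
instance) holds a reference to a helper process which the pool workers
then try to test. This is a guess, not verified against the temporal
framework code. A minimal sketch, independent of GRASS, that reproduces
the message; it assumes Python 3 on a Unix-like system, and sleeper,
probe and reproduce are made-up names.

```python
import multiprocessing
import time


def sleeper():
    # Lives just long enough to be probed from the sibling process.
    time.sleep(1)


def probe(proc, queue):
    # Runs in a second child. proc was started by the parent, so this
    # process is not allowed to test it.
    try:
        proc.is_alive()
        queue.put("no error")
    except AssertionError as exc:
        queue.put(str(exc))


def reproduce():
    # Assumption: the "fork" start method is available, so the second
    # child inherits the Process object without pickling.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    helper = ctx.Process(target=sleeper)
    helper.start()
    prober = ctx.Process(target=probe, args=(helper, queue))
    prober.start()
    message = queue.get(timeout=10)
    prober.join()
    helper.join()
    return message


if __name__ == "__main__":
    print(reproduce())
```

If something like this is what happens here, creating such objects
inside the worker function instead of sharing them as globals might be
worth a try.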

A bit more info (e.g. more details of the code, such as the definition 
of your pool, but also OS, versions, etc) might be helpful.
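
Something along these lines would collect the Python side of that
information in one go. It is only a sketch: environment_report is a
hypothetical helper name, and the GRASS version (e.g. the output of
g.version) would have to be added separately.

```python
import multiprocessing
import platform
import sys


def environment_report():
    # Hypothetical helper: gather the details useful in a bug report.
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "cpu_count": multiprocessing.cpu_count(),
    }
    # get_start_method() only exists in Python 3.4 and later.
    getter = getattr(multiprocessing, "get_start_method", None)
    info["start_method"] = getter() if getter else "n/a (Python 2)"
    return info


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print("%s: %s" % (key, value))
```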

Moritz




> 
> Cheers
> 
> On 7 February 2018 at 00:47, Moritz Lennert 
> <mlennert at club.worldonline.be> wrote:
> 
>     On 06/02/18 12:09, Leonardo Hardtke wrote:
> 
>         Dear all,
>         I am working on a module to extract the phenological parameters
>         (like timesat) from a time series implemented in python/cython
>         and making use of gscript and other grass stuff.
>         It works great on a 256x256 tile and, as the plan is to apply it over
>         Australia at 250m over 17 years, I need to split the process into
>         small tiles. The idea is to run these processes in parallel and I
>         am having issues implementing it.
> 
>         This would be the first part of the process that runs on each tile:
> 
>         def tile_process(tile_index):
>               '''
>               Function for every worker:
>               Applies any function to the sub_region corresponding to
>               the tile_index.
>               '''
>               global Rows
>               global Cols
>               global RowBlockSize
>               global ColBlockSize
>               global full_region
>               global dates
>               global years
>               global indices
>               global data_serie
>               global yr_limits_extra
>               global yr_limits
>               global dbif
> 
>               sub_name='block'
>               TileRow, TileCol, sr = sub_region(
>                   tile_index, full_region, RowBlockSize, ColBlockSize)
>               # # Define a temporary region based on the parameters
>               # calculated with the
>               start_row = TileRow * RowBlockSize
>               start_col = TileCol * ColBlockSize
>               n_rows = sr['rows']
>               n_cols = sr['cols']
> 
>               strds = tgis.SpaceTimeRasterDataset(data_serie)
>               strds.select(dbif=dbif)
>               maps = strds.get_registered_maps_as_objects(dbif=dbif)
> 
>               # Number of time steps
>               steps = len(maps)
>               # Make an empty array
>               #print(steps)
>               EVI = np.empty([steps,n_rows,n_cols])
>               # fill the array
>               for step, map in enumerate(maps):
>                    map.select(dbif=dbif)
>                    image_name = map.get_name()+'@'+data_serie.split('@')[1]
>                    #print("reading: {}".format(image_name))
>                    EVI[step,:] = raster2numpy_sub(
>                        image_name, start_row, n_rows, start_col, n_cols)
>               mean = EVI.mean()
>               print(mean)
>               ....
>               ....
>               ....
> 
> 
>         and this is how I start the multiprocess pool.
> 
>               pool.map(tile_process, xrange(RowBlockN*ColBlockN))
>               pool.close()
>               pool.join()
> 
>         and it gives me:
> 
>         AssertionError: can only test a child process
> 
> 
>         Of course, if I call tile_process(0) or tile_process(1) etc., the
>         right result comes out.
> 
>         Do any of you have experience with this? Any suggestion would
>         be welcome!
>         Sorry for the messy code. It is still at an early stage.
> 
> 
>     Just a wild guess: have you tried with range (which returns a list)
>     instead of xrange (which returns an xrange object) ?
> 
>     Moritz
> 
> 
> 
> 
> -- 
> Dr. Leonardo A. Hardtke
> C3 UTS, Scientific Officer
> CB04.06.315.06
> Email: leonardoandres.hardtke at uts.edu.au or leohardtke at gmail.com



