[GRASS-dev] [SoC] Parallelization of Raster and Vector modules

Glynn Clements glynn at gclements.plus.com
Sun Apr 4 19:59:38 EDT 2010


Jordan Neumeyer wrote:

> > Most GRASS raster modules process data row-by-row, rather than reading
> > entire maps into memory. Reading maps into memory is frowned upon, as
> > GRASS is regularly used with maps which are too large to fit into
> > memory. Where the algorithm cannot operate row-by-row, use of a tile
> > cache is the next best alternative; see e.g. r.proj.seg (renamed to
> > r.proj in 7.0).
> >
> 
> That makes more sense. So a row is like chunk from the map data? Kind of
> like the first row of pixels from an image. So from the first pixel to width
> of image is one row, then width plus one starts the next, and so on and so
> forth. How large are the rows generally?

The width is determined by the current region. Input data is cropped,
padded with nulls and/or (nearest-neighbour) resampled to match the
current region.

The primary functions for reading raster data are:

6.x (and earlier):
	int G_get_raster_row(int fd, void *buf, int row, RASTER_MAP_TYPE type);
	int G_get_c_raster_row(int fd, CELL  *buf, int row);
	int G_get_f_raster_row(int fd, FCELL *buf, int row);
	int G_get_d_raster_row(int fd, DCELL *buf, int row);

7.0:
	void Rast_get_row(int fd, void *buf, int row, RASTER_MAP_TYPE type);
	void Rast_get_c_row(int fd, CELL  *buf, int row);
	void Rast_get_f_row(int fd, FCELL *buf, int row);
	void Rast_get_d_row(int fd, DCELL *buf, int row);

The buffer must contain space for G_window_cols() (Rast_window_cols() in
7.0) values. "row" must be non-negative and less than G_window_rows()
(Rast_window_rows() in 7.0); rows can be read in any order.

For writing, the functions are:

6.x (and earlier):
	int G_put_raster_row(int fd, const void *buf, RASTER_MAP_TYPE type);
	int G_put_c_raster_row(int fd, const CELL *buf);
	int G_put_f_raster_row(int fd, const FCELL *buf);
	int G_put_d_raster_row(int fd, const DCELL *buf);

7.0:
	void Rast_put_row(int fd, const void *buf, RASTER_MAP_TYPE type);
	void Rast_put_c_row(int fd, const CELL *buf);
	void Rast_put_f_row(int fd, const FCELL *buf);
	void Rast_put_d_row(int fd, const DCELL *buf);

Exactly G_window_rows()/Rast_window_rows() rows must be written,
sequentially and top-to-bottom, hence the lack of a "row" parameter.

Where possible, modules keep as few rows in memory as possible. E.g. 
r.series keeps a single row from each input map and a single row of
output data. r.resamp.interp keeps 1 (nearest), 2 (bilinear), or 4
(bicubic) rows of input and one row of output. r.neighbors keeps as
many map rows as there are rows in the neighbourhood window (i.e. a
sliding window).

The core of a simple raster module might look something like:

	rows = G_window_rows();
	cols = G_window_cols();

	for (row = 0; row < rows; row++ {
		G_get_d_raster_row(infd, inbuf, row);
		...
		for (col = 0; col < cols; col++ {
			...
		}
		...
		G_put_d_raster_row(outfd, outbuf);
	}

> > Holding an entire map in memory is only considered acceptable if the
> > algorithm is inherently so slow that processing a gigabyte-sized map
> > simply wouldn't be feasible, or the access pattern is such that even a
> > tile-cache approach isn't feasible.
> >
> > In general, GRASS should be able to process multi-gigabyte maps even
> > on 32-bit systems, and work on multi-user systems where a process
> > cannot assume that it can use a significant proportion of the system's
> > total physical memory.
> 
> Which is good. I didn't realize how big the data set could be. What's
> biggest map you've seen?

I'm a programmer who works on GRASS mainly out of interest, so I don't
have to actually deal with large datasets. But 10000x10000 doesn't
seem to be that uncommon here. The 2GiB limit on the (compressed) data
used to be a common problem until we added large file support. Using
r.series with 365 input maps isn't uncommon.

> > When I refer to I/O, I'm referring not just to read() and write(), but
> > also the (de)compression, conversion and resampling, i.e. everything
> > performed by the get/put-row functions. For many GRASS modules, this
> > takes more time than the actual processing.
> 
> I can see why, especially for big maps since it's doing that row-by-row.
> So when a GRASS module loads a map the basic algorithm looks something like:
> 1) Read row
> 2) get-row function does necessary preprocessing
> 3) row is cached or held in memory. Does the caching take place after
> 4) row is processed
> 5) Display/write process ? (Or is this after a couple iterations, all of
> them?)
> 5) repeat (1)
> 
> Would it be beneficial/practical to parallelize some of the preprocessing
> like conversion and resampling before the caching occurs?

Reading involves:

1. read()
2. Decompress (RLE or zlib)
3. Convert from portable representation to internal representation
(e.g. byte swapping, sign-bit to two's complement).
4. Crop, pad, resample
5. Embed nulls
6. Embed mask
7. Reclass

The result of step 3 is cached; a subsequent read of the same underlying
row will repeat steps 4-7 (when upsampling, consecutive region rows will
often correspond to the same underlying row in the input file).

Writing is simpler, and only involves (the reverse of) steps 1-3 and
5.

> Would I be working on GRASS 6.x or 7.x?

7.x.

> Is there a minimum compiler version
> when using GCC/MingW? Just curious because openMP tasks are only supported
> on GCC >= 4.2. Which may or not be useful, but can be a valuable tool when
> you don't know how much data or how many "tasks" you have. Like processing a
> linked-list or binary trees.

We try to support any C89-conformant compiler; extensions shouldn't be
used without a fallback (e.g. we use __attribute__ but define it to an
empty macro if the compiler isn't gcc). Code should compile and run on
all common platforms (Linux, Windows, Mac OSX, and any reasonably
modern Unix) wherever possible.

-- 
Glynn Clements <glynn at gclements.plus.com>


More information about the grass-dev mailing list