[GRASS-dev] Interested in parallelization of GRASS
Glynn Clements
glynn at gclements.plus.com
Sun Mar 27 14:17:07 EDT 2011
Maris Nartiss wrote:
> > Supporting concurrent reads on a single raster map would make the code
> > significantly more complex. Concurrent writes would be even worse
> > unless compression was disabled.
>
> And this brings us back to the question: is the current GRASS raster
> storage the best one? Could GRASS benefit from moving to some
> quad-tree-like storage? The current gislib doesn't expose internal data
> structures to most analysis modules, and thus it shouldn't require any
> module changes.
> I know that in worst-case scenarios such an approach could provide no
> speedup, or even be slower than the current implementation. Still, I
> would love to see solid numbers on different storage approaches before
> saying "YES/NO" to alternatives.
> Does anybody have a CS student interested in algorithms and data storage problems?
The biggest advantage of the current format is that skipping entire
rows (when the region's vertical resolution is coarser than the map's)
is trivial.
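
To illustrate the row-skipping point, here is a minimal sketch. It assumes a row-oriented file with a table giving the byte offset of each (possibly compressed) row, and a region whose north-south resolution is an integer multiple of the map's. The names row_index, map_row_for and offset_for_region_row are hypothetical and are not the gislib API; the real library maps region rows to map rows with floating-point arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical index for a row-oriented raster: row_offset[i] is the
 * byte offset at which row i starts in the data file. */
struct row_index {
    size_t nrows;
    const long *row_offset;
};

/* Map a region row to the map row it samples, assuming the region's
 * vertical resolution is exactly `step` times coarser than the map's.
 * The map row nearest the centre of the region row is chosen. */
static size_t map_row_for(size_t region_row, size_t step)
{
    return region_row * step + step / 2;
}

/* File offset to seek to for a given region row. The map rows in
 * between are never read or decompressed. */
static long offset_for_region_row(const struct row_index *idx,
                                  size_t region_row, size_t step)
{
    size_t row = map_row_for(region_row, step);

    assert(row < idx->nrows);
    return idx->row_offset[row];
}
```

With step = 4, only one map row in four is ever touched; the other three cost nothing beyond the seek.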
If I were going to change anything about the raster format, it would be
to allow the data to be partitioned horizontally, so that we can avoid
reading and decompressing entire rows when the region's east-west
bounds are only a small portion of the map's.
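
As a rough sketch of what horizontal partitioning could buy: assume each row is stored as independently compressed blocks of tile_width columns (a hypothetical layout, not an existing GRASS format). The reader then only needs the blocks which overlap the region's column range. The names tile_range and tiles_for_columns are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A run of consecutive column blocks within one row. */
struct tile_range {
    size_t first;   /* index of the first block to read */
    size_t count;   /* number of blocks to read */
};

/* Blocks overlapping the half-open column range [col_lo, col_hi),
 * where every block holds tile_width columns. */
static struct tile_range tiles_for_columns(size_t col_lo, size_t col_hi,
                                           size_t tile_width)
{
    struct tile_range r = {0, 0};

    assert(tile_width > 0);
    if (col_hi <= col_lo)
        return r;               /* empty range: nothing to read */

    r.first = col_lo / tile_width;
    r.count = (col_hi - 1) / tile_width - r.first + 1;
    return r;
}
```

For a map 10000 columns wide with 256-column blocks, a region covering columns 1000 to 1299 needs 3 blocks per row instead of all 40.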
For modules which need non-sequential access, I'd suggest replacing
the segment library with something more efficient, e.g. something like
the code from r.proj. Or even just use a flat file which can be mmap()ed,
and require a 64-bit platform if you want to run such
modules on more than ~3 GiB of data.
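
A minimal sketch of the flat-file approach, using only POSIX calls (open, ftruncate, mmap, munmap). It assumes uncompressed int cells in row-major order; the flat_raster structure and the flat_open, flat_cell and flat_close helpers are hypothetical names, not part of any GRASS library. The whole file has to fit in the process's address space, which is where the 64-bit requirement for large data comes from.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle for a raster held in a memory-mapped flat file. */
struct flat_raster {
    int *cells;             /* start of the mapping */
    size_t nrows, ncols;
    size_t bytes;           /* length of the mapping */
};

/* Create (or open) the file, size it to nrows * ncols cells and map it.
 * Returns 0 on success, -1 on failure. */
static int flat_open(struct flat_raster *fr, const char *path,
                     size_t nrows, size_t ncols)
{
    int fd;
    void *p;

    fr->nrows = nrows;
    fr->ncols = ncols;
    fr->bytes = nrows * ncols * sizeof(int);

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) fr->bytes) != 0) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, fr->bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);              /* the mapping remains valid after close() */
    if (p == MAP_FAILED)
        return -1;

    fr->cells = p;
    return 0;
}

/* Random access to a single cell; the kernel pages data in on demand. */
static int *flat_cell(struct flat_raster *fr, size_t row, size_t col)
{
    assert(row < fr->nrows && col < fr->ncols);
    return &fr->cells[row * fr->ncols + col];
}

/* Unmap the file. Returns 0 on success, -1 on failure. */
static int flat_close(struct flat_raster *fr)
{
    return munmap(fr->cells, fr->bytes);
}
```

A module would call flat_open() once, read and write cells through flat_cell() in any order, and leave caching to the kernel's page cache rather than managing segments itself.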
--
Glynn Clements <glynn at gclements.plus.com>