[GRASS-dev] Interested in parallelization of GRASS

Markus Metz markus.metz.giswork at googlemail.com
Tue Mar 29 02:51:13 EDT 2011


Hamish wrote:
> Glynn:
>> The biggest advantage of the current format is that skipping entire
>> rows (when the region's vertical resolution is coarser than the map's)
>> is trivial.
>
> Note that this is also a great feature when the map only takes up a
> small part of the working region, e.g. a postage-stamp-sized map in the
> corner of a big r.patch or display operation. It flies past the rows
> which are all NULLs. (But it doesn't help much if the map is tall and
> thin or diagonal, and your region is square.) ...
>
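
To make the row-skipping point concrete, here is a minimal sketch of
which map rows a coarser (or larger) region would actually need. It
assumes nearest-neighbour lookup at the centre of each region row; the
function name and parameters are hypothetical and this is not the GRASS
raster library code.

```python
import math

def map_rows_for_region(map_north, map_nsres, map_rows,
                        reg_north, reg_nsres, reg_rows):
    """Return, for each region row, the map row to read (or None).

    Assumption: each region row samples the map row that contains the
    region row's centre line. None means the region row lies outside
    the map and can be filled with NULLs without touching the file.
    """
    wanted = []
    for r in range(reg_rows):
        centre = reg_north - (r + 0.5) * reg_nsres   # northing of row centre
        mrow = math.floor((map_north - centre) / map_nsres)
        wanted.append(mrow if 0 <= mrow < map_rows else None)
    return wanted

# A 100-row map at 1 m resolution read through a 10 m region:
# only 10 of the 100 map rows are ever decompressed.
coarse = map_rows_for_region(100.0, 1.0, 100, 100.0, 10.0, 10)

# A region extending 20 m north of the map: the first rows are all NULL.
offset = map_rows_for_region(100.0, 1.0, 100, 120.0, 10.0, 3)
```

With a row-oriented format, every map row that does not appear in the
returned list is never read or decompressed, which is why both the
coarse-region case and the small-map-in-a-big-region case are cheap.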
>> If I was going to change anything about the raster format, it would be
>> to allow the data to be partitioned horizontally, so that we can avoid
>> reading and decompressing entire rows when the region's east-west
>> bounds are only a small portion of the map's.
>
> unrelated aside: Since the old-old bug tracker days there was a wish
> to replace the i.rectify code with the same-origin GDALwarp API library,
> which is much faster. Last Summer (of Code) Seth's main project was
> to add OpenCL capability to gdalwarp (which also allows you to do multi-
> core if you don't have a capable GPU). The OpenCL r.sun work was tacked
> on to the end of that project.
>
Note that r.proj, i.rectify, and i.ortho.rectify now use very similar
code, and all three use the caching code of r.proj. i.rectify is now
quite a bit faster. In other words, improvements to any one of these
three modules can be applied to the other two.

>
>> For modules which need non-sequential access, I'd suggest replacing
>> the segment library with something more efficient, e.g. something like
>> the code from r.proj. Or even just a flat file which can be mmap()ed,
>> and require the use of a 64-bit platform if you want to run such
>> modules on more than ~3 GiB of data.
>
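
A minimal sketch of the "flat file which can be mmap()ed" idea for
random access, using Python's standard mmap module. The float32 cell
type, the class name, and the helper are assumptions made for
illustration; this is not code from any GRASS module.

```python
import mmap
import os
import struct
import tempfile

CELL = struct.Struct("<f")  # assumed cell type: little-endian float32

def create_flat_raster(path, rows, cols):
    """Create a zero-filled flat file holding rows * cols cells."""
    with open(path, "wb") as f:
        f.truncate(rows * cols * CELL.size)

class MmapRaster:
    """Random-access raster backed by a memory-mapped flat file."""

    def __init__(self, path, rows, cols):
        self.rows, self.cols = rows, cols
        self.f = open(path, "r+b")
        # Length 0 maps the whole file; the OS pages data in on demand.
        self.mm = mmap.mmap(self.f.fileno(), 0)

    def _offset(self, row, col):
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise IndexError((row, col))
        return (row * self.cols + col) * CELL.size

    def get(self, row, col):
        return CELL.unpack_from(self.mm, self._offset(row, col))[0]

    def put(self, row, col, value):
        CELL.pack_into(self.mm, self._offset(row, col), value)

    def close(self):
        self.mm.flush()
        self.mm.close()
        self.f.close()
```

The mapping consumes address space equal to the file size, which is
where the 64-bit requirement for large data comes from: a 32-bit
process cannot map more than a few GiB at once.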
> Although it's an obvious target, I didn't mention the segment library
> in the parallelization suggestions as I seemed to recall some
> dissatisfaction with the current implementation.
>
The segment lib implementation in trunk is considerably faster than in
6.x, but not as efficient as the caching method of r.proj. Some
reasons why the segment library is slower than the caching code in
r.proj are 1) write support in the segment lib, 2) flexible data
storage size in the segment lib, 3) flexible tile size in the segment
lib, 4) misuse of the segment library by some modules, e.g. r.los.
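
To illustrate why dropping write support and fixing the cell and tile
sizes makes a cache simpler, here is a rough sketch of a read-only tile
cache. It is not the r.proj code; the class name, the 4x4 tile size,
the float32 flat-file layout, and the LRU eviction policy are all
assumptions chosen for the example.

```python
import os
import struct
import tempfile
from collections import OrderedDict

TILE = 4       # fixed tile edge in cells (assumed, not configurable)
CELL_SIZE = 4  # fixed cell size: float32 (assumed)

class ReadOnlyTileCache:
    """LRU cache of fixed-size tiles over a flat float32 raster file."""

    def __init__(self, path, rows, cols, max_tiles=8):
        self.f = open(path, "rb")
        self.rows, self.cols = rows, cols
        self.max_tiles = max_tiles
        self.tiles = OrderedDict()  # (tile_row, tile_col) -> tuple rows
        self.loads = 0              # number of tiles read from disk

    def _load(self, trow, tcol):
        self.loads += 1
        r0, c0 = trow * TILE, tcol * TILE
        ncols = min(TILE, self.cols - c0)  # edge tiles may be narrower
        tile = []
        for r in range(r0, min(r0 + TILE, self.rows)):
            self.f.seek((r * self.cols + c0) * CELL_SIZE)
            data = self.f.read(ncols * CELL_SIZE)
            tile.append(struct.unpack("<%df" % ncols, data))
        return tile

    def get(self, row, col):
        key = (row // TILE, col // TILE)
        tile = self.tiles.get(key)
        if tile is None:
            tile = self._load(*key)
            self.tiles[key] = tile
            if len(self.tiles) > self.max_tiles:
                # Read-only: eviction just drops the tile, no write-back.
                self.tiles.popitem(last=False)
        else:
            self.tiles.move_to_end(key)  # mark as most recently used
        return tile[row % TILE][col % TILE]

    def close(self):
        self.f.close()

def write_demo_raster(path, rows, cols):
    """Write a flat test raster where cell (r, c) holds r * cols + c."""
    with open(path, "wb") as f:
        for r in range(rows):
            vals = [float(r * cols + c) for c in range(cols)]
            f.write(struct.pack("<%df" % cols, *vals))
```

Because nothing is ever written, eviction needs no dirty flags and no
write-back, and the fixed tile and cell sizes reduce the address
arithmetic to a division and a remainder.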

Markus M


> It would be cool if a replacement for the segment library could be
> (re)built from the ground up with parallelization in mind.
>
>
> Hamish:
>> > So I try to think of modules which are CPU bound.. the first
>> > task is to replace inefficient algorithms with better ones (e.g.
>> > Glynn's r.cost work,
>>
>> Huh? I haven't done anything related to r.cost.
>
> Fixing the off-by-one bug in the segment library sped up r.cost by
> over 50x in my tests (YMMV). (Sometime before 6.4.0RC1.) Not technically
> a fix to r.cost, but it sure is nice to have it go heaps faster.
> I misspoke a bit; no inefficient algorithm was replaced there.

Was this speed-up the same for regions of different sizes? In theory,
it should be larger for small regions and barely noticeable for
larger regions.

Markus M

>
>
>> If you're thinking of r.grow.distance, that can't be used as a
>> substitute for r.cost except in the case of constant cost, as it
>> relies upon "distance" being monotonic with respect to x and y.
>
> I still have to play with that, as it could be added as an optional
> flag in v.surf.icw (from addons), where r.cost was always the slowest
> step. That is a heavy user of r.mapcalc, so will be a good test for
> pthreads too.
>
>
>
> thanks,
> Hamish

