[GRASS-dev] Re: [GRASS-user] Benchmarking Grass 4.3, 5.4, 6.0, 6.2 raster commands

Helena Mitasova hmitaso at unity.ncsu.edu
Mon Apr 30 16:14:39 EDT 2007


Glynn,

does the null implementation also affect runs on rasters that have no
nulls, with no MASK present? As I wrote recently, I have noticed what
seem to be totally unrelated changes in performance in v.surf.rst
compared to 4.3, mostly associated with the change from an internal
lineq solver to G_ludcmp, although the cause may be somewhere else.

10x faster is a huge difference - it may be worthwhile to find out
whether it is true for integer maps without nulls and whether it
is really nulls slowing it down so badly.

There were many discussions about the null implementation and, as
Glynn correctly points out, the main driver for the current design was
to preserve backwards compatibility, even at the cost of performance.
The wishes of old users (many of whom contributed funds to GRASS
development) were given very high priority.

Helena



Helena Mitasova
Dept. of Marine, Earth and Atm. Sciences
1125 Jordan Hall, NCSU Box 8208,
Raleigh NC 27695
http://skagit.meas.ncsu.edu/~helena/



On Apr 30, 2007, at 2:17 PM, Glynn Clements wrote:

>
> [CC to grass-dev]
>
> Roy Sanderson wrote:
>
>> I've been trying to persuade our users to stop working with Grass 4.3
>> and Grass 5.4 for some time now, and as I have to upgrade the OS on our
>> applications server have told them that they now have no choice.
>>
>> However, a couple of users stated that they preferred to use Grass 4.3
>> as it was faster, and for large tasks, more stable than the newer
>> versions.  I checked this on a map of 52,000 rows by 28,000 columns and
>> commands like r.mapcalc, r.clump, r.volume operated about 10x faster in
>> Grass 4.3 than the more recent versions.
>>
>> This might simply arise from the age of the applications server OS
>> (still running RH7.3), or because I've mis-configured the newer
>> versions of Grass.  For example, I did not compile Grass 5 or 6 with
>> large-file support enabled, although the file sizes are only around
>> 180Mb, but the speedy performance of 4.3 vs 5.4, 6.0 and 6.2.1
>> surprised me.  Perhaps there's an additional overhead associated with
>> the introduction of nulls and floating-points, which were major changes
>> from 4.3 to 5.4.  However, the performance difference is still present
>> when working with integer maps.  As I haven't benchmarked versions, and
>> also because personally I only work with Grass version 6, I hadn't
>> spotted the differences until now.
>
> I strongly suspect that the support for nulls is to blame. The
> implementation is really quite inefficient in several ways.
>
> It doesn't help that the null file is repeatedly opened and closed
> (the null bitmap is read in chunks of 8 rows at a time, with the file
> being opened anew for each read). Depending upon the speed of
> filesystem calls (open(), access() etc) relative to actual I/O, that
> could be a significant factor.
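
One rough way to see how much the repeated open/close could matter on a
given filesystem is to time it in isolation. The following is only a
minimal sketch, not GRASS code: the file is a dummy bitmap, the layout of
one bit per cell is an assumption, and all function names are
hypothetical. Only the chunk size of 8 rows comes from the description
above.

```python
import os
import tempfile
import time

ROWS_PER_CHUNK = 8  # chunk size mentioned above; everything else is illustrative


def read_reopening(path, chunk_bytes, n_chunks):
    """Open, seek, read and close the file once per chunk."""
    total = 0
    for i in range(n_chunks):
        with open(path, "rb") as f:
            f.seek(i * chunk_bytes)
            total += len(f.read(chunk_bytes))
    return total


def read_keeping_open(path, chunk_bytes, n_chunks):
    """Open the file once and read the chunks sequentially."""
    total = 0
    with open(path, "rb") as f:
        for _ in range(n_chunks):
            total += len(f.read(chunk_bytes))
    return total


def make_bitmap(rows, cols):
    """Write a dummy bitmap (assumed: one bit per cell) to a temporary file."""
    row_bytes = (cols + 7) // 8
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(row_bytes * rows))
    return path, row_bytes


def compare(rows=8000, cols=2800):
    """Time both strategies on the same file; returns {name: (seconds, bytes)}."""
    path, row_bytes = make_bitmap(rows, cols)
    chunk_bytes = row_bytes * ROWS_PER_CHUNK
    n_chunks = rows // ROWS_PER_CHUNK
    try:
        timings = {}
        for fn in (read_reopening, read_keeping_open):
            start = time.perf_counter()
            nbytes = fn(path, chunk_bytes, n_chunks)
            timings[fn.__name__] = (time.perf_counter() - start, nbytes)
        return timings
    finally:
        os.remove(path)


if __name__ == "__main__":
    for name, (seconds, nbytes) in compare().items():
        print(f"{name}: {nbytes} bytes in {seconds:.4f} s")
```

The absolute numbers will depend heavily on the OS and on file caching,
so this can only indicate whether the open/close cost is noticeable
relative to the reads, not reproduce the 10x difference reported.
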
>
> Keeping the null file open would eliminate that part of the overhead,
> but would double the number of descriptors used. On older versions,
> that would halve the maximum number of open maps, although that limit
> has been eliminated in recent 6.3 CVS versions.
>
> Also, that would only eliminate part of the overhead. Actually
> decoding and embedding the null data is also non-trivial.
>
> Embedding the nulls in the data file, eliminating the null bitmap
> altogether, would eliminate all of the null overhead, but would also
> either enlarge the files significantly or break compatibility.
>
> The existing format is optimised for small, non-negative integers.
> Each row is stored using only as many bytes as are required for the
> largest value, where all values are treated as unsigned (i.e. negative
> values always require 4 bytes). The integer value used for nulls is
> 0x80000000 (i.e. INT_MIN, -2^31); embedding this value directly would
> cause many files to always use 4 bytes per cell when 1 byte would
> otherwise be enough.
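
To make the size argument concrete, here is a small sketch of the
per-row byte width as I read the description above. It is not taken from
the GRASS library; the function name bytes_per_cell is hypothetical, and
the only values borrowed from the text are the unsigned treatment of
cells and the null pattern 0x80000000.

```python
NULL_CELL = 0x80000000  # bit pattern given above for integer nulls


def bytes_per_cell(row):
    """Smallest byte width (1 to 4) that holds every value in the row
    when each 32-bit cell is viewed as unsigned."""
    largest = max((v & 0xFFFFFFFF for v in row), default=0)
    width = 1
    while largest >> (8 * width):
        width += 1
    return width


if __name__ == "__main__":
    print(bytes_per_cell([0, 12, 255]))             # small non-negative values
    print(bytes_per_cell([0, 12, 256]))             # one value above 255
    print(bytes_per_cell([0, -1, 12]))              # any negative value
    print(bytes_per_cell([0, 12, 255, NULL_CELL]))  # one embedded null
```

Under these assumptions a row of byte-sized categories needs 1 byte per
cell, while the same row with a single embedded null needs 4, which is
the enlargement described above.
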
>
> We could change the encoding to be more friendly to embedded nulls,
> but that would break compatibility with earlier versions. AFAICT, a
> 6.3 integer raster can still be read by 4.3 (assuming that it uses RLE
> rather than zlib compression), with any nulls being read as zeroes.
>
> -- 
> Glynn Clements <glynn at gclements.plus.com>
>
> _______________________________________________
> grass-dev mailing list
> grass-dev at grass.itc.it
> http://grass.itc.it/mailman/listinfo/grass-dev



