[GRASS-dev] GRASS GIS raster files: LZW compression?

Markus Neteler neteler at osgeo.org
Tue Dec 5 01:08:00 PST 2017


On Mon, Dec 4, 2017 at 7:32 PM, Markus Metz
<markus.metz.giswork at gmail.com> wrote:
> ZSTD compression has been added to trunk with r71889-92.

Great!

> ZSTD compression can be added with configure --with-zstd=yes or simpler
> configure --with-zstd
>
> In order to get some wider testing, ZSTD is now the default compression
> method in trunk if ZSTD is available.
>
> ZSTD is an improvement over ZLIB: generally it compresses faster and higher
> than ZLIB. Decompression is consistently much faster than with ZLIB. For
> more information on ZSTD (Zstandard), see
> http://facebook.github.io/zstd/
> https://github.com/facebook/zstd
>
> Please test!

Here is an initial test with a WorldClim DCELL map:


GRASS 7.5.svn (latlong_wgs84): >

g.region raster=tmean_12 -p
projection: 3 (Latitude-Longitude)
zone:       0
datum:      wgs84
ellipsoid:  wgs84
north:      90N
south:      60S
west:       180W
east:       180E
nsres:      0:00:30
ewres:      0:00:30
rows:       18000
cols:       43200
cells:      777600000

r.compress -p tmean_12
<tmean_12> is compressed (method 2: ZLIB). Data type: DCELL
<tmean_12> has a compressed NULL file


# create a ZSTD compressed copy:
export GRASS_COMPRESS_NULLS=1
export GRASS_COMPRESSOR=ZSTD

r.mapcalc "tmean_12_ZSTD = tmean_12"

r.compress -p tmean_12_ZSTD
<tmean_12_ZSTD> is compressed (method 5: ZSTD). Data type: DCELL
<tmean_12_ZSTD> has a compressed NULL file

# run r.univar
time -p r.univar -g tmean_12
n=222262507
null_cells=555337493
cells=777600000
min=-51.9
max=33.3
range=85.2
mean=-0.434087630455532
mean_of_abs=18.7414294225023
stddev=21.0312119498642
variance=442.31187608011
coeff_var=-4844.92311559167
sum=-96481405.002736
real 20.43
user 20.21
sys 0.11

# comparison (results & timing)
time -p r.univar -g tmean_12_ZSTD
n=222262507
null_cells=555337493
cells=777600000
min=-51.9
max=33.3
range=85.2
mean=-0.434087630455532
mean_of_abs=18.7414294225023
stddev=21.0312119498642
variance=442.31187608011
coeff_var=-4844.92311559167
sum=-96481405.002736
real 15.43
user 15.31
sys 0.07

# as expected, ZSTD is faster: r.univar needs only ~75% of the time
# it takes with ZLIB:
> 15.43/20.43
[1] 0.7552619
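
For repeated runs, the same ratio can be computed from saved `time -p`
output instead of by hand. This is only a minimal sketch: the function
names and the log file names are hypothetical, and it assumes each log
contains exactly one "real" line as printed by `time -p`.

```shell
# real_seconds LOG: print the "real" value from saved `time -p` output
real_seconds() {
    awk '$1 == "real" { print $2 }' "$1"
}

# time_ratio LOG_A LOG_B: print real(A) / real(B)
time_ratio() {
    awk -v a="$(real_seconds "$1")" -v b="$(real_seconds "$2")" \
        'BEGIN { printf "%.4f\n", a / b }'
}

# Hypothetical usage; `time -p` reports on stderr, hence the 2>&1:
# { time -p r.univar -g tmean_12; }      > zlib.log 2>&1
# { time -p r.univar -g tmean_12_ZSTD; } > zstd.log 2>&1
# time_ratio zstd.log zlib.log
```

A single run per map is noisy (disk cache, CPU frequency scaling), so
repeating the measurement a few times before comparing is advisable.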

####
# file sizes comparison:
find . -name tmean_12 | sort | xargs du -h
4.0K    ./cats/tmean_12
4.0K    ./cellhd/tmean_12
2.7M    ./cell_misc/tmean_12
0    ./cell/tmean_12
12K    ./colr/tmean_12
256M    ./fcell/tmean_12
4.0K    ./hist/tmean_12

find . -name tmean_12_ZSTD | sort | xargs du -h
4.0K    ./cats/tmean_12_ZSTD
4.0K    ./cellhd/tmean_12_ZSTD
2.7M    ./cell_misc/tmean_12_ZSTD
0    ./cell/tmean_12_ZSTD
12K    ./colr/tmean_12_ZSTD
195M    ./fcell/tmean_12_ZSTD
4.0K    ./hist/tmean_12_ZSTD

# fcell size comparison in detail: here the ZSTD file is ~76% of the
# size of the ZLIB file (i.e. ~24% smaller):
> 203521319/268155000
[1] 0.758969
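
The byte counts used in this ratio are not visible in the `du -h`
listings, which only show rounded sizes. A minimal sketch of how exact
sizes could be obtained and divided, assuming the current directory is
the mapset directory (as in the `find` calls); the function name
`size_ratio` is hypothetical:

```shell
# size_ratio A B: print size(A) / size(B) using exact byte counts
size_ratio() {
    a=$(wc -c < "$1")   # wc -c reports the file length in bytes
    b=$(wc -c < "$2")
    awk -v a="$a" -v b="$b" 'BEGIN { printf "%.6f\n", a / b }'
}

# Hypothetical usage with the two maps from this test:
# size_ratio fcell/tmean_12_ZSTD fcell/tmean_12
```

Note that `wc -c` gives the apparent file length, while `du` reports
allocated disk blocks, so the two can differ slightly.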


# my system: Fedora 26, 64bit, SSD disk, Intel(R) Core(TM) i5-6300U
CPU @ 2.40GHz

Great!

markusN

