[GRASS-dev] [GRASS GIS] #2750: LZ4 when writing raster rows; better than double I/O bound r.mapcalc speed
GRASS GIS
trac at osgeo.org
Fri Oct 9 20:55:34 PDT 2015
#2750: LZ4 when writing raster rows; better than double I/O bound r.mapcalc speed
--------------------------+---------------------------
Reporter: sprice | Owner: grass-dev@…
Type: enhancement | Status: new
Priority: normal | Milestone: 7.1.0
Component: Raster | Version: svn-trunk
Resolution: | Keywords: ZLIB LZ4 ZSTD
CPU: OSX/Intel | Platform: MacOSX
--------------------------+---------------------------
Comment (by wenzeslaus):
The tests also run well for me with `GRASS_INT_LZ4HC=1`,
`GRASS_INT_ZSTD=1` and `GRASS_INT_ZLIB=0` (RLE). I haven't run the tests
with `GRASS_COMPRESS_NULLS=1`.
Here is a (polished) report created by the attached benchmark script, which
is based on what you posted and uses completely random data. I used
`GRASS_COMPRESS_NULLS=1` and a region with 30,000,000 cells. The disk was
an SSD and the OS was Linux.
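For readers without the attachment, the following is a minimal sketch of how one write benchmark could be assembled. It is not the attached script: the helper name `build_write_benchmark` and the mapping from output names to variables are assumptions for illustration. Only the environment variables quoted above are used, so the plain LZ4 case is left out because its variable is not named in this comment.

```python
import os

# Hypothetical mapping from output-map suffix to the environment variables
# quoted in this comment. "orig" assumes the default (DEFLATE) needs no variable.
COMPRESSION_ENV = {
    "orig": {},
    "rle": {"GRASS_INT_ZLIB": "0"},
    "lz4hc": {"GRASS_INT_LZ4HC": "1"},
    "zstd": {"GRASS_INT_ZSTD": "1"},
}


def build_write_benchmark(label, source_map="test_rast_z_base", runs=10):
    """Return (command, env) for one write benchmark; nothing is executed."""
    env = dict(os.environ)
    env.update(COMPRESSION_ENV[label])
    # Null-file compression was enabled for every run in the report.
    env["GRASS_COMPRESS_NULLS"] = "1"
    expression = f"test_rast_{label}=double({source_map})"
    # `perf stat -r N` repeats the command N times and reports mean and spread.
    command = ["perf", "stat", "-r", str(runs),
               "r.mapcalc", f"expression={expression}"]
    return command, env


if __name__ == "__main__":
    cmd, _ = build_write_benchmark("zstd")
    print(" ".join(cmd))
```

Inside a GRASS session, the returned pair could be passed to `subprocess.run(command, env=env, check=True)`.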
{{{
#!rst
ZLIB compression writing
========================
Performance counter stats for 'r.mapcalc expression=test_rast_orig=double(test_rast_z_base)' (10 runs):
::
10415.903368 task-clock (msec) # 0.993 CPUs utilized ( +- 0.36% )
2798 context-switches # 0.269 K/sec ( +- 2.11% )
23 cpu-migrations # 0.002 K/sec ( +- 4.20% )
325702 page-faults # 0.031 M/sec ( +- 0.00% )
30804357778 cycles # 2.957 GHz ( +- 0.42% )
9055572140 stalled-cycles-frontend # 29.40% frontend cycles idle ( +- 1.81% )
47982328290 instructions # 1.56 insns per cycle
# 0.19 stalled cycles per insn ( +- 0.02% )
7087070642 branches # 680.409 M/sec ( +- 0.02% )
341325584 branch-misses # 4.82% of all branches ( +- 0.05% )
10.489354952 seconds time elapsed ( +- 0.41% )
RLE compression writing
=======================
Performance counter stats for 'r.mapcalc expression=test_rast_rle=double(test_rast_z_base)' (10 runs):
::
10367.674362 task-clock (msec) # 0.999 CPUs utilized ( +- 0.53% )
1642 context-switches # 0.158 K/sec ( +- 18.72% )
22 cpu-migrations # 0.002 K/sec ( +- 5.32% )
325702 page-faults # 0.031 M/sec ( +- 0.00% )
30666690391 cycles # 2.958 GHz ( +- 0.38% )
8921313281 stalled-cycles-frontend # 29.09% frontend cycles idle ( +- 1.80% )
47975696799 instructions # 1.56 insns per cycle
# 0.19 stalled cycles per insn ( +- 0.02% )
7085878436 branches # 683.459 M/sec ( +- 0.02% )
340649966 branch-misses # 4.81% of all branches ( +- 0.04% )
10.382500561 seconds time elapsed ( +- 0.53% )
LZ4 compression writing
=======================
Performance counter stats for 'r.mapcalc expression=test_rast_lz4=double(test_rast_z_base)' (10 runs):
::
2490.815692 task-clock (msec) # 0.999 CPUs utilized ( +- 0.23% )
321 context-switches # 0.129 K/sec ( +- 13.63% )
20 cpu-migrations # 0.008 K/sec ( +- 5.02% )
684 page-faults # 0.274 K/sec ( +- 0.12% )
7259170408 cycles # 2.914 GHz ( +- 0.12% )
2305705372 stalled-cycles-frontend # 31.76% frontend cycles idle ( +- 0.20% )
13796117271 instructions # 1.90 insns per cycle
# 0.17 stalled cycles per insn ( +- 0.06% )
2790495244 branches # 1120.314 M/sec ( +- 0.05% )
33371582 branch-misses # 1.20% of all branches ( +- 0.41% )
2.492994675 seconds time elapsed ( +- 0.23% )
LZ4HC compression writing
=========================
Performance counter stats for 'r.mapcalc expression=test_rast_lz4hc=double(test_rast_z_base)' (10 runs):
::
6867.635439 task-clock (msec) # 0.999 CPUs utilized ( +- 0.25% )
648 context-switches # 0.094 K/sec ( +- 0.29% )
21 cpu-migrations # 0.003 K/sec ( +- 5.28% )
745 page-faults # 0.108 K/sec ( +- 0.18% )
20199681252 cycles # 2.941 GHz ( +- 0.28% )
6449729534 stalled-cycles-frontend # 31.93% frontend cycles idle ( +- 0.62% )
31860120047 instructions # 1.58 insns per cycle
# 0.20 stalled cycles per insn ( +- 0.03% )
5196919230 branches # 756.726 M/sec ( +- 0.03% )
184132785 branch-misses # 3.54% of all branches ( +- 0.04% )
6.873512386 seconds time elapsed ( +- 0.25% )
ZSTD compression writing
========================
Performance counter stats for 'r.mapcalc expression=test_rast_zstd=double(test_rast_z_base)' (10 runs):
::
3540.287381 task-clock (msec) # 0.999 CPUs utilized ( +- 0.20% )
382 context-switches # 0.108 K/sec ( +- 3.67% )
24 cpu-migrations # 0.007 K/sec ( +- 5.61% )
776 page-faults # 0.219 K/sec ( +- 0.13% )
10367186950 cycles # 2.928 GHz ( +- 0.05% )
3160263203 stalled-cycles-frontend # 30.48% frontend cycles idle ( +- 0.10% )
19098247069 instructions # 1.84 insns per cycle
# 0.17 stalled cycles per insn ( +- 0.04% )
3831842251 branches # 1082.353 M/sec ( +- 0.04% )
35124859 branch-misses # 0.92% of all branches ( +- 0.16% )
3.543262199 seconds time elapsed ( +- 0.20% )
Original raster map test
========================
Performance counter stats for 'r.univar test_rast_z_base' (10 runs):
::
2024.195978 task-clock (msec) # 0.998 CPUs utilized ( +- 0.29% )
646 context-switches # 0.319 K/sec ( +- 0.64% )
0 cpu-migrations # 0.000 K/sec
457 page-faults # 0.226 K/sec ( +- 0.04% )
5934175598 cycles # 2.932 GHz ( +- 0.05% )
1712911175 stalled-cycles-frontend # 28.87% frontend cycles idle ( +- 0.14% )
11404604123 instructions # 1.92 insns per cycle
# 0.15 stalled cycles per insn ( +- 0.00% )
2280049632 branches # 1126.398 M/sec ( +- 0.00% )
32906874 branch-misses # 1.44% of all branches ( +- 0.37% )
2.029035083 seconds time elapsed ( +- 0.28% )
ZLIB compression reading
========================
Performance counter stats for 'r.univar test_rast_orig' (10 runs):
::
2000.246389 task-clock (msec) # 0.998 CPUs utilized ( +- 0.42% )
640 context-switches # 0.320 K/sec ( +- 1.09% )
0 cpu-migrations # 0.000 K/sec
458 page-faults # 0.229 K/sec ( +- 0.02% )
5930779846 cycles # 2.965 GHz ( +- 0.08% )
1716273412 stalled-cycles-frontend # 28.94% frontend cycles idle ( +- 0.18% )
11406021691 instructions # 1.92 insns per cycle
# 0.15 stalled cycles per insn ( +- 0.01% )
2280208665 branches # 1139.964 M/sec ( +- 0.01% )
32553520 branch-misses # 1.43% of all branches ( +- 0.24% )
2.005018871 seconds time elapsed ( +- 0.42% )
RLE compression reading
=======================
Performance counter stats for 'r.univar test_rast_rle' (10 runs):
::
2016.279711 task-clock (msec) # 0.998 CPUs utilized ( +- 0.34% )
653 context-switches # 0.324 K/sec ( +- 1.34% )
0 cpu-migrations # 0.000 K/sec ( +- 50.92% )
458 page-faults # 0.227 K/sec ( +- 0.04% )
5931202618 cycles # 2.942 GHz ( +- 0.07% )
1711592367 stalled-cycles-frontend # 28.86% frontend cycles idle ( +- 0.13% )
11406103365 instructions # 1.92 insns per cycle
# 0.15 stalled cycles per insn ( +- 0.01% )
2280223560 branches # 1130.906 M/sec ( +- 0.01% )
32763877 branch-misses # 1.44% of all branches ( +- 0.41% )
2.021075900 seconds time elapsed ( +- 0.34% )
LZ4 compression reading
=======================
Performance counter stats for 'r.univar test_rast_lz4' (10 runs):
::
690.267191 task-clock (msec) # 0.998 CPUs utilized ( +- 0.37% )
235 context-switches # 0.341 K/sec ( +- 1.55% )
0 cpu-migrations # 0.000 K/sec
449 page-faults # 0.650 K/sec ( +- 0.04% )
2003905382 cycles # 2.903 GHz ( +- 0.11% )
586090598 stalled-cycles-frontend # 29.25% frontend cycles idle ( +- 0.29% )
3982189156 instructions # 1.99 insns per cycle
# 0.15 stalled cycles per insn ( +- 0.04% )
971667430 branches # 1407.669 M/sec ( +- 0.02% )
64052 branch-misses # 0.01% of all branches ( +- 2.47% )
0.691904075 seconds time elapsed ( +- 0.37% )
LZ4HC compression reading
=========================
Performance counter stats for 'r.univar test_rast_lz4hc' (10 runs):
::
692.453563 task-clock (msec) # 0.998 CPUs utilized ( +- 0.18% )
243 context-switches # 0.351 K/sec ( +- 0.96% )
0 cpu-migrations # 0.000 K/sec
449 page-faults # 0.649 K/sec ( +- 0.03% )
1999520778 cycles # 2.888 GHz ( +- 0.09% )
581415687 stalled-cycles-frontend # 29.08% frontend cycles idle ( +- 0.23% )
3982233099 instructions # 1.99 insns per cycle
# 0.15 stalled cycles per insn ( +- 0.04% )
971675561 branches # 1403.236 M/sec ( +- 0.02% )
63306 branch-misses # 0.01% of all branches ( +- 1.98% )
0.694124867 seconds time elapsed ( +- 0.18% )
ZSTD compression reading
========================
Performance counter stats for 'r.univar test_rast_zstd' (10 runs):
::
1168.682507 task-clock (msec) # 0.998 CPUs utilized ( +- 0.41% )
377 context-switches # 0.323 K/sec ( +- 1.35% )
0 cpu-migrations # 0.000 K/sec
460 page-faults # 0.394 K/sec ( +- 0.06% )
3397563517 cycles # 2.907 GHz ( +- 0.06% )
780090112 stalled-cycles-frontend # 22.96% frontend cycles idle ( +- 0.28% )
8084726269 instructions # 2.38 insns per cycle
# 0.10 stalled cycles per insn ( +- 0.02% )
1325103816 branches # 1133.844 M/sec ( +- 0.02% )
426732 branch-misses # 0.03% of all branches ( +- 1.51% )
1.171226998 seconds time elapsed ( +- 0.41% )
Check of types and compression
==============================
::
<test_rast_z_base> is compressed (level 2: DEFLATE). Data type: <DCELL>
<test_rast_orig> is compressed (level 2: DEFLATE). Data type: <DCELL>
<test_rast_rle> is compressed (level 1: RLE). Data type: <DCELL>
<test_rast_lz4> is compressed (level 3: LZ4). Data type: <DCELL>
<test_rast_lz4hc> is compressed (level 4: LZ4HC). Data type: <DCELL>
<test_rast_zstd> is compressed (level 5: ZSTD). Data type: <DCELL>
File sizes
==========
::
240045009 Oct 9 23:26 fcell/test_rast_lz4
240045009 Oct 9 23:27 fcell/test_rast_lz4hc
229517654 Oct 9 23:24 fcell/test_rast_orig
229517654 Oct 9 23:26 fcell/test_rast_rle
229517654 Oct 9 23:22 fcell/test_rast_z_base
227636390 Oct 9 23:28 fcell/test_rast_zstd
105009 Oct 9 23:26 cell_misc/test_rast_lz4/null2
105009 Oct 9 23:27 cell_misc/test_rast_lz4hc/null2
125009 Oct 9 23:24 cell_misc/test_rast_orig/null2
125009 Oct 9 23:26 cell_misc/test_rast_rle/null2
125009 Oct 9 23:22 cell_misc/test_rast_z_base/null2
175009 Oct 9 23:28 cell_misc/test_rast_zstd/null2
}}}
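To make the report easier to compare at a glance, here is a small sketch that turns the elapsed times and file sizes above into ratios relative to ZLIB. The numbers are copied from the report. The uncompressed size is an assumption: 30,000,000 cells times 8 bytes per DCELL value, ignoring any per-row overhead in the file format.

```python
# Mean elapsed seconds (10 runs each), copied from the perf reports above.
WRITE_SECONDS = {
    "ZLIB": 10.489354952,
    "RLE": 10.382500561,
    "LZ4": 2.492994675,
    "LZ4HC": 6.873512386,
    "ZSTD": 3.543262199,
}
READ_SECONDS = {
    "ZLIB": 2.005018871,
    "RLE": 2.021075900,
    "LZ4": 0.691904075,
    "LZ4HC": 0.694124867,
    "ZSTD": 1.171226998,
}
# Sizes in bytes of the fcell/ files listed above.
FILE_BYTES = {
    "ZLIB": 229517654,
    "RLE": 229517654,
    "LZ4": 240045009,
    "LZ4HC": 240045009,
    "ZSTD": 227636390,
}
CELLS = 30_000_000
BYTES_PER_DCELL = 8  # DCELL is a C double


def speedup_vs_zlib(seconds):
    """Return how many times faster each method is than ZLIB."""
    base = seconds["ZLIB"]
    return {name: base / t for name, t in seconds.items()}


def size_ratio(file_bytes):
    """Return file size divided by the assumed uncompressed size."""
    raw = CELLS * BYTES_PER_DCELL
    return {name: size / raw for name, size in file_bytes.items()}


if __name__ == "__main__":
    write = speedup_vs_zlib(WRITE_SECONDS)
    read = speedup_vs_zlib(READ_SECONDS)
    ratio = size_ratio(FILE_BYTES)
    print(f"{'method':<6} {'write x':>8} {'read x':>8} {'size/raw':>9}")
    for name in WRITE_SECONDS:
        print(f"{name:<6} {write[name]:>8.2f} {read[name]:>8.2f} {ratio[name]:>9.4f}")
```

With these numbers, LZ4 writes roughly four times faster and reads almost three times faster than ZLIB, while its file is slightly larger than the assumed raw size, as expected for random data that barely compresses.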
--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2750#comment:10>
GRASS GIS <https://grass.osgeo.org>