[gdal-dev] [GRASS-user] Slow import of GHSL
Nikos Alexandris
nik at nikosalexandris.net
Fri Mar 24 02:25:52 PDT 2017
(Sorry for the silence, I was without my personal computer for a week.)
* Markus Metz <markus.metz.giswork at gmail.com> [2017-03-22 22:11:01 +0100]:
>On Wed, Mar 22, 2017 at 9:52 PM, Markus Neteler <neteler at osgeo.org> wrote:
>>
>> On Wed, Mar 22, 2017 at 9:28 PM, Markus Metz
>> <markus.metz.giswork at gmail.com> wrote:
>> > On Wed, Mar 22, 2017 at 8:12 PM, Markus Neteler <neteler at osgeo.org> wrote:
>> ...
>> >> Nikos, for an even bigger map try
>> >>
>> >> Global Surface Water (2000-2012, 30 m, Data coverage is from 80° north
>> >> to 60° south):
>> >> http://landcover.usgs.gov/glc/WaterDescriptionAndDownloads.php
>> >> by USGS. 1.6GB in size.
This is interesting. See also https://global-surface-water.appspot.com/,
which is Landsat-based at 30 m as well.
>> >> Using gdalbuildvrt I created a VRT from the 504 GeoTIFF files.
>> >>
>> >> After import into GRASS GIS, here the timings:
>> >>
>> >> # final map size:
>> >> g.region -p
>> >> ...
>> >> rows: 493200
>> >> cols: 1296001
>> >> cells: 639187693200
>> >>
>> >> (handling only works in GRASS GIS 7.3.svn since Markus Metz's recent
>> >> improvements on global data import are needed).
>> >
>> > (my changes were bug fixes, not improvements)
>> >
>> >>
>> >> Benchmarks:
>> >> - Import took 2h while reading the data from a CIFS mounted storage
>> >> box (slow) and writing on SSD.
Markus N, I am interested: did you use the "memory" option?
>> >> - Displaying the entire map (639 giga-pixel) in GRASS GIS' display
>> >> (d.mon) took ~15 sec over a ssh tunnel from my laptop to the server,
>> >> since I am at a conference.
>> >>
>> >> Fair deal I would say :-)
>> >
>> > A bit more information would help to compare:
>> > - what is your GDAL version?
>>
>> GDAL 2.1.2
>>
>> > - are 504 GeoTIFF files compressed? If yes, which method?
>>
>> Yes, COMPRESSION=LZW
>>
>> > - what are the block dimensions of the input GeoTIFFs?
>>
>> Size is 36001, 36001 - Block=36001x1
Now that's important too. What about GHSL's block size of 4K^2?
My understanding is that, for GRASS, it would make a difference if I
re-created the GHSL layers with a row-shaped "block". Does that make sense?
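To make the question concrete, here is a rough back-of-the-envelope sketch in Python. It only counts block decompressions for a reader that goes row by row. I am assuming that "4K^2" means 4096x4096 square blocks, and I am using the 36001x36001 tile size quoted above for both layouts just for comparison (the real GHSL dimensions differ). The function names and the two cache scenarios are my own illustration, not anything taken from GDAL or r.in.gdal.

```python
import math


def blocks_touched_per_row(cols, block_x):
    """Number of blocks that a single raster row intersects."""
    return math.ceil(cols / block_x)


def block_decodes_for_full_read(rows, cols, block_x, block_y, cached=True):
    """Block decompressions needed to read the whole raster row by row.

    cached=True  : best case, every block in the current band of rows stays
                   in a block cache until the reader has moved past it.
    cached=False : worst case, nothing is cached, so each row decodes every
                   block it touches again.
    """
    per_row = blocks_touched_per_row(cols, block_x)
    if cached:
        return per_row * math.ceil(rows / block_y)
    return per_row * rows


# Tile size taken from the thread; the 4096 block size is an assumption.
ROWS = COLS = 36001

# Row-shaped blocks (Block=36001x1): one block per row, cache irrelevant.
strip_decodes = block_decodes_for_full_read(ROWS, COLS, COLS, 1)

# Square 4096x4096 blocks: best and worst case.
tile_best = block_decodes_for_full_read(ROWS, COLS, 4096, 4096, cached=True)
tile_worst = block_decodes_for_full_read(ROWS, COLS, 4096, 4096, cached=False)

# Bytes of decoded Byte data that must stay cached to reach the best case.
tile_cache_bytes = blocks_touched_per_row(COLS, 4096) * 4096 * 4096

print(strip_decodes, tile_best, tile_worst, tile_cache_bytes)
```

With square blocks the best case needs few decodes, but only if a whole band of tiles (about 151 MB of Byte data per tile row here) fits in the block cache; otherwise each row may decode the same tiles again. With row-shaped blocks every row is exactly one block, whatever the cache size.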
>This is row by row compression as in GRASS. That could help import with
>r.in.gdal which also reads and writes row by row.
>
>> Type=Byte
>>
>> > - what kind of GRASS compression did you use?
>>
>> Default raster + NULL compression enabled. I.e.,
>>
>> r.compress -p watermask2010
>> <watermask2010> is compressed (method 2: ZLIB). Data type: CELL
>
>You might save disk space at the cost of longer reading times with BZIP2.
>
>> <watermask2010> has a compressed NULL file
>>
>> Again, the fact that I had to read from an attached storage box likely
>> slowed down the import.
>> Just thought to post these numbers here.
>
>Impressive that such a large raster can be imported at all, and relatively
>fast!
Indeed, impressive.
Nikos
>Reading about 1.6 GB (also from an attached storage box) should not take 2
>hours, therefore I think the limit is software input decompression and
>output compression.
>
>Markus M
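The numbers quoted in this thread support Markus M's point. A small Python sketch of the arithmetic follows; I am assuming "1.6GB" means 1.6e9 bytes of compressed input and that the import took exactly 2 hours, so the rates are only approximate.

```python
# Figures quoted in the thread.
rows = 493200
cols = 1296001
compressed_input_bytes = 1.6e9   # assumption: "1.6GB" read as 1.6e9 bytes
import_seconds = 2 * 3600        # assumption: "2h" taken as exactly 2 hours

# Total cell count, as reported by g.region -p.
cells = rows * cols

# Rate at which compressed bytes had to arrive from the storage box.
input_bytes_per_second = compressed_input_bytes / import_seconds

# Rate at which cells were decoded, processed and written back compressed.
cells_per_second = cells / import_seconds

# Input is Type=Byte, so one cell is one byte once decompressed.
expansion_factor = cells / compressed_input_bytes

print(cells)
print(round(input_bytes_per_second / 1e3), "kB/s of compressed input")
print(round(cells_per_second / 1e6, 1), "million cells/s")
print(round(expansion_factor), "x expansion on decompression")
```

The storage box only had to deliver roughly 0.2 MB/s of compressed data, which is very little even for a slow CIFS mount, while the software handled close to 89 million cells per second. That is consistent with decompression and recompression, rather than I/O, being the limit.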