[gdal-dev] [GRASS-user] Slow import of GHSL

Nikos Alexandris nik at nikosalexandris.net
Tue Mar 14 02:01:17 PDT 2017


Nikos Alexandris:

>>>> Why does importing a 38m pixel resolution GHSL [0] GeoTIFF layer,
>>>> i.e. GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, into GRASS'
>>>> database progress so slowly?

Markus M

>because it is a very large raster map: Size is 507904, 647168
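
To put the quoted dimensions in perspective, here is a rough sketch of the uncompressed size they imply. The one-byte cell size is an assumption (it is not stated in this thread; gdalinfo on the file would report the actual data type), and the function name is only illustrative.

```python
# Raster dimensions quoted above for the p1 GeoTIFF (columns, rows).
COLS, ROWS = 507904, 647168

# Assumption: 8-bit cells (1 byte each). Not confirmed in this thread.
BYTES_PER_CELL = 1


def uncompressed_bytes(cols, rows, bytes_per_cell):
    """Size of the raster if every cell is stored without compression."""
    return cols * rows * bytes_per_cell


total = uncompressed_bytes(COLS, ROWS, BYTES_PER_CELL)
gib = total / 2**30

print(f"{COLS * ROWS} cells")
print(f"{total} bytes uncompressed, about {gib:.1f} GiB")
```

Under that assumption a single part comes to roughly 300 GiB uncompressed, which is in line with the "hundreds of GBs" seen when writing output without compression.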

>> (Apologies for cross-posting to gdal-dev)

Markus Neteler:

>>> Can you elaborate a bit more? I have downloaded and checked:
>>>
>>> That is 9835059101 bytes in 19885 files, or I downloaded the wrong one
>>> (please post an URL).
>>
>> For example <http://ghsl.jrc.ec.europa.eu/ghs_bu.php>,
>>
>> see
>>
>> GHS_BUILT_LDS1975_GLOBE_R2016A_3857_38 (768MB)
>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38 (854MB)
>> GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38 (892MB)
>> GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38 (900MB)
>>
>> "3857" is the EPSG code.  They are split in two GeoTIFFs (p1, p2) and
>> there is a VRT along with overviews for it.  No overviews for the TIFFs.
>>
>> For example:
>>
>> GHSL_data_access_v1.3.pdf
>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.clr
>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt
>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt.ovr
>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif
>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif
>>
>>
>> Even trying to clip, with gdal_translate, might create file(s) of
>> hundreds of GBs. This might be due to missing compression.

>Then use compression. The source TIFFs use LZW with blocks of 4096x4096
>cells.
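
A minimal sketch of what "use compression" could look like when clipping with gdal_translate. It only assembles the command line. The output name clip.tif and the window coordinates are hypothetical placeholders, and the creation options shown (COMPRESS, TILED, BIGTIFF) are one reasonable choice for the GeoTIFF driver, not necessarily what was used in this thread.

```python
import shlex


def clip_command(src, dst, ulx, uly, lrx, lry):
    """Build a gdal_translate call that clips a window and writes a
    compressed, tiled GeoTIFF."""
    return [
        "gdal_translate",
        # Window in georeferenced coordinates: upper-left, then lower-right.
        "-projwin", str(ulx), str(uly), str(lrx), str(lry),
        # GeoTIFF creation options: LZW compression, tiled layout, and
        # BigTIFF when the output might exceed the 4 GB classic TIFF limit.
        "-co", "COMPRESS=LZW",
        "-co", "TILED=YES",
        "-co", "BIGTIFF=IF_SAFER",
        src,
        dst,
    ]


# Hypothetical window (EPSG:3857 metres) and output name, for illustration.
cmd = clip_command(
    "GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt",
    "clip.tif",
    2000000, 5000000, 3000000, 4000000,
)

# Print a copy-pasteable shell command; to execute it from Python instead,
# pass the list to subprocess.run(cmd, check=True).
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Without a COMPRESS creation option the GeoTIFF driver writes uncompressed data, which is one likely reason the clipped outputs grow so large.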


>> The import of p1 or p2 or of the VRT file in GRASS' data base, via
>> r.in.gdal/r.import, does not progress at all.

>Importing GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif with r.in.gdal
>took 1:31 hours on a laptop with SSD. The resultant cell file was 1.5 GB.
>
>Recompressing with BZIP2 took 2:20 hours and the size of the cell file was
>reduced to a mere 143 MB.

Some messy rough timings:

1) i7, 8 cores, 32GB RAM, Base OS: CentOS -> Three r.in.gdal processes
for "p2.tif", each stuck at 3% for almost 14h

2) Xeon, 24 Cores, 32GB RAM, Base OS: Windows -> Three gdal_translate
processes with -projwin, the VRT file as an input and GeoTIFF as output,
at 40% since yesterday afternoon

3) Xeon, 12 Cores, ? RAM, Base OS: CentOS -> Same processes as in
1), stuck at 0% of progress for more than 16h.

An SSD can be seen as a "necessity".

Nikos

[rest deleted]

