[gdal-dev] gdalwarp running very slow
Clive Swan
cliveswan at gmail.com
Wed Dec 14 08:33:44 PST 2022
I want to *APPEND* the UK data into the international.tif
The updated international size should also be: 450000, 225000
*I first tried:*
gdalbuildvrt -o /data/coastal-2020.vrt /vsis3/summer/3/coastal-2020.tif \
    /vsis3/summer/5/coastal-2020.tif
gdal_translate /data/coastal-2020.vrt /data/3/coastal-2020.tif \
    /data/5/coastal-2020.tif -n -9999 -co BIGTIFF=YES -co COMPRESS=LZW \
    -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co NUM_THREADS=ALL_CPUS \
    --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES --config
*The output was rubbish*
The UK image size is: 18376, 17086
The international size is: 450000, 225000
I tried:
/data/3/coastal-2020-test.tif = 7GB
/data/5/coastal-2020.tif = 700MB
gdalwarp -r near -overwrite /data/3/coastal-2020.tif \
    /data/3/coastal-2020-test1.tif -co BIGTIFF=YES -co COMPRESS=LZW \
    -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co NUM_THREADS=ALL_CPUS \
    -co PREDICTOR=3 \
    --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES & disown -h
The AWS instance, with over 60 vCPUs, ran for over 8 hours.
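
For a rough sense of scale, here is a back-of-the-envelope sketch (arithmetic only, not a measurement) of how much pixel data a full rewrite of the international raster involves. The size, band count, data type and block size come from the gdalinfo report quoted further down. The function and variable names are only illustrative.

```python
import math

# Values taken from the gdalinfo report quoted later in this message.
WIDTH, HEIGHT = 450_000, 225_000   # raster size in pixels
BANDS = 9                          # nine Float32 bands
BYTES_PER_SAMPLE = 4               # Float32
BLOCK = 128                        # 128x128 blocks


def uncompressed_bytes(width, height, bands, bytes_per_sample):
    """Size of the raw pixel data, ignoring compression and TIFF overhead."""
    return width * height * bands * bytes_per_sample


def block_count(width, height, bands, block):
    """Number of blocks over all bands (INTERLEAVE=BAND: one block set per band)."""
    return math.ceil(width / block) * math.ceil(height / block) * bands


total = uncompressed_bytes(WIDTH, HEIGHT, BANDS, BYTES_PER_SAMPLE)
blocks = block_count(WIDTH, HEIGHT, BANDS, BLOCK)
print(f"uncompressed: {total / 1e12:.3f} TB in {blocks:,} blocks")
```

So although the file is only about 7 GB on disk, a warp that rewrites the whole grid has to decompress and recompress several terabytes of samples. That may be part of why the run takes hours.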
I tried:
/data/5/coastal-2020.tif = 700MB
/data/3/coastal-2020-test.tif = 7GB
gdalwarp -r near -overwrite /data/5/coastal-2020.tif \
    /data/3/coastal-2020-test.tif -co BIGTIFF=YES -co COMPRESS=LZW \
    -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co NUM_THREADS=ALL_CPUS \
    -co PREDICTOR=3 --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES
The output is: 18376, 17086 *not* 450000, 225000
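
To make the goal concrete, here is a sketch of what placing the UK image inside the international grid means in pixel terms. The grid origin (-180, 90) and pixel size 0.0008 come from the gdalinfo report quoted below. The UK upper-left corner (-8.7, 61.0) is a made-up placeholder, because the real geotransform of /data/5/coastal-2020.tif is not shown in this message. The function name is also only illustrative.

```python
# International grid, from the gdalinfo report quoted below.
ORIGIN_X, ORIGIN_Y = -180.0, 90.0
PIXEL = 0.0008                      # degrees per pixel (square pixels)
GRID_W, GRID_H = 450_000, 225_000


def pixel_window(ulx, uly, width, height):
    """Return (xoff, yoff, xsize, ysize) of a sub-image inside the global grid.

    Assumes the sub-image shares the grid's CRS and pixel size and is
    aligned to it; round() absorbs floating-point noise in the division.
    """
    xoff = round((ulx - ORIGIN_X) / PIXEL)
    yoff = round((ORIGIN_Y - uly) / PIXEL)
    if xoff < 0 or yoff < 0 or xoff + width > GRID_W or yoff + height > GRID_H:
        raise ValueError("sub-image does not fit inside the global grid")
    return xoff, yoff, width, height


# HYPOTHETICAL upper-left corner for the 18376 x 17086 UK image.
window = pixel_window(-8.7, 61.0, 18_376, 17_086)
share = (window[2] * window[3]) / (GRID_W * GRID_H)
print(window, f"{share:.2%} of the grid")
```

Only a small window of the 450000 x 225000 grid would need to change. As far as I understand the gdalwarp documentation, -overwrite deletes and recreates the target, so the new file takes the extent of the source. That would explain the 18376 x 17086 result above.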
Any assistance appreciated
Thanks
Clive
On Wed, 14 Dec 2022 at 09:23, Rahkonen Jukka <
jukka.rahkonen at maanmittauslaitos.fi> wrote:
> Hi,
>
>
>
> I do not mean that you should try this and that blindly. Rather, please
> describe what data you have in your hands and what you plan to do with it,
> so that other GDAL users can consider what reasonable alternatives you
> have. I have never done anything even close to your use case, but based on
> other experience I can see potential issues in a few places:
>
> - You are trying to update image A, which is 450000 by 225000 pixels,
> with image B, which has the same size. If all pixels in B are valid, the
> result would be A overwritten with a full copy of B.
> - However, image B probably contains a lot of NoData (we do not know
> because you have not told us). If GDAL handles NoData correctly, the
> result would be A updated with only the valid pixels from B, which is
> probably what you want.
> - However, we do not know how efficiently GDAL skips the NoData pixels
> of B. It may be fast or not. If most of the world is NoData, it might be
> worth cropping image B to just the area that contains data, which is
> perhaps the UK in your case. If skipping NoData is already fast, cropping
> will not give a speedup, but it is cheap to test.
> - Your images are compressed. LZW compresses some data more effectively
> than other data. If you expect that a chunk of LZW-compressed data inside
> a TIFF file can be replaced in place with another chunk of LZW-compressed
> data, you are wrong: the new chunk may be larger, and then it simply
> cannot fit into the same space. The assumption that updating a 6 GB image
> with 600 MB of new data yields a 6 GB image does not hold for compressed
> data.
> - I can imagine that there are other technical reasons to write the
> replacement data at the end of the existing TIFF and update the image
> directories. If the file size is critical, it may be necessary to rewrite
> the updated TIFF into a new TIFF file. A complete rewrite can be done in
> the most optimal way. See this wiki page:
> https://trac.osgeo.org/gdal/wiki/UserDocs/GdalWarp#GeoTIFFoutput-coCOMPRESSisbroken
> - If the images are in AWS, the process may need to be somewhat
> different than with local images. I have no experience with AWS yet.
> - A 450000 by 225000 image is rather big. It may be faster to split the
> image into smaller parts, update the parts that need updating, and combine
> the parts back into one big image. Or keep the parts and combine them
> virtually into a VRT with gdalbuildvrt.
>
>
>
> Your use case is unusual and rather heavy, but there are certainly
> several ways to do what you want. What should be avoided is selecting an
> inefficient method and then trying to optimize it.
>
>
>
> Good luck with your experiments,
>
>
>
> -Jukka-
>
>
> *From:* Clive Swan <cliveswan at gmail.com>
> *Sent:* Wednesday, 14 December 2022 10:29
> *To:* Rahkonen Jukka <jukka.rahkonen at maanmittauslaitos.fi>
> *Subject:* Re: [gdal-dev] gdalwarp running very slow
>
>
>
> Hi Jukka,
>
>
>
> Thanks for that, was really stressed.
>
> I will export the UK extent, and rerun the script.
>
>
>
> Thanks
>
> Clive
>
>
>
> ------------------------------
>
> *From:* Rahkonen Jukka <jukka.rahkonen at maanmittauslaitos.fi>
> *Sent:* Wednesday, December 14, 2022 7:18:50 AM
> *To:* Clive Swan <cliveswan at gmail.com>; gdal-dev at lists.osgeo.org <
> gdal-dev at lists.osgeo.org>
> *Subject:* Re: [gdal-dev] gdalwarp running very slow
>
>
>
> Hi,
>
>
>
> Thank you for the information about the source files. I do not yet
> understand what you are trying to do and why. Both images have the same
> size, 450000 by 225000, and they cover the same area. Is the image
> “5_UK_coastal-2020.tif” just NoData with pixel value -9999 everywhere
> outside the UK? The name of the image makes me think so.
>
>
>
> -Jukka Rahkonen-
>
>
>
>
>
> *From:* Clive Swan <cliveswan at gmail.com>
> *Sent:* Tuesday, 13 December 2022 19:22
> *To:* gdal-dev at lists.osgeo.org
> *Cc:* Rahkonen Jukka <jukka.rahkonen at maanmittauslaitos.fi>
> *Subject:* [gdal-dev] gdalwarp running very slow
>
>
>
> Greetings,
>
> I am using the same files; I copied them from an AWS bucket to a local AWS
> instance.
>
> I tried gdal_merge, but it tries to create a 300 GB file.
>
> I tried gdal_translate; it ran but created a 2.5 GB file, not a 6.9 GB one.
>
> Now I am trying gdalwarp.
>
>
>
> the gdalinfo is the same in both datasets:
>
> coastal-2020.tif (6.9GB)
>
> Driver: GTiff/GeoTIFF
> Size is 450000, 225000
> Coordinate System is:
> GEOGCRS["WGS 84",
> DATUM["World Geodetic System 1984",
> ELLIPSOID["WGS 84",6378137,298.257223563,
> LENGTHUNIT["metre",1]]],
> PRIMEM["Greenwich",0,
> ANGLEUNIT["degree",0.0174532925199433]],
> CS[ellipsoidal,2],
> AXIS["geodetic latitude (Lat)",north,
> ORDER[1],
> ANGLEUNIT["degree",0.0174532925199433]],
> AXIS["geodetic longitude (Lon)",east,
> ORDER[2],
> ANGLEUNIT["degree",0.0174532925199433]],
> ID["EPSG",4326]]
> Data axis to CRS axis mapping: 2,1
> Origin = (-180.000000000000000,90.000000000000000)
> Pixel Size = (0.000800000000000,-0.000800000000000)
> Metadata:
> AREA_OR_POINT=Area
> datetime_created=2022-11-14 18:05:14.053301
> Image Structure Metadata:
> COMPRESSION=LZW
> INTERLEAVE=BAND
> PREDICTOR=3
> Corner Coordinates:
> Upper Left (-180.0000000, 90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"N)
> Lower Left (-180.0000000, -90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"S)
> Upper Right ( 180.0000000, 90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"N)
> Lower Right ( 180.0000000, -90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"S)
> Center ( 0.0000000, 0.0000000) ( 0d 0' 0.01"E, 0d 0' 0.01"N)
> Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
> Description = score
> NoData Value=-9999
> Band 2 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = severity_value
> NoData Value=-9999
> Band 3 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = severity_min
> NoData Value=-9999
> Band 4 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = severity_max
> NoData Value=-9999
> Band 5 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = likelihood
> NoData Value=-9999
> Band 6 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = return_time
> NoData Value=-9999
> Band 7 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = likelihood_confidence
> NoData Value=-9999
> Band 8 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = climate_reliability
> NoData Value=-9999
> Band 9 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = hazard_reliability
> NoData Value=-9999
>
>
>
> 5_UK_coastal-2020.tif (600MB)
>
> Driver: GTiff/GeoTIFF
> Size is 450000, 225000
> Coordinate System is:
> GEOGCRS["WGS 84",
> DATUM["World Geodetic System 1984",
> ELLIPSOID["WGS 84",6378137,298.257223563,
> LENGTHUNIT["metre",1]]],
> PRIMEM["Greenwich",0,
> ANGLEUNIT["degree",0.0174532925199433]],
> CS[ellipsoidal,2],
> AXIS["geodetic latitude (Lat)",north,
> ORDER[1],
> ANGLEUNIT["degree",0.0174532925199433]],
> AXIS["geodetic longitude (Lon)",east,
> ORDER[2],
> ANGLEUNIT["degree",0.0174532925199433]],
> ID["EPSG",4326]]
> Data axis to CRS axis mapping: 2,1
> Origin = (-180.000000000000000,90.000000000000000)
> Pixel Size = (0.000800000000000,-0.000800000000000)
> Metadata:
> AREA_OR_POINT=Area
> datetime_created=2022-11-14 18:05:14.053301
> hostname=posix.uname_result(sysname='Linux',
> nodename='ip-172-31-12-125', release='5.15.0-1022-aws',
> version='#26~20.04.1-Ubuntu SMP Sat Oct 15 03:22:07 UTC 2022',
> machine='x86_64')
> Image Structure Metadata:
> COMPRESSION=LZW
> INTERLEAVE=BAND
> PREDICTOR=3
> Corner Coordinates:
> Upper Left (-180.0000000, 90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"N)
> Lower Left (-180.0000000, -90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"S)
> Upper Right ( 180.0000000, 90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"N)
> Lower Right ( 180.0000000, -90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"S)
> Center ( 0.0000000, 0.0000000) ( 0d 0' 0.01"E, 0d 0' 0.01"N)
> Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
> Description = score
> NoData Value=-9999
> Band 2 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = severity_value
> NoData Value=-9999
> Band 3 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = severity_min
> NoData Value=-9999
> Band 4 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = severity_max
> NoData Value=-9999
> Band 5 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = likelihood
> NoData Value=-9999
> Band 6 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = return_time
> NoData Value=-9999
> Band 7 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = likelihood_confidence
> NoData Value=-9999
> Band 8 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = climate_reliability
> NoData Value=-9999
> Band 9 Block=128x128 Type=Float32, ColorInterp=Undefined
> Description = hazard_reliability
> NoData Value=-9999
>
> --
>
> Regards,
>
>
>
> Clive Swan
>
> --
>
> Hi,
>
>
>
> If you are still struggling with the same old problem could you please finally send the gdalinfo reports of your two input files which are this time:
>
> coastal-2020.tif
>
> 5_UK_coastal-2020.tif
>
>
>
> -Jukka Rahkonen-
>
>
>
>
>
> From: gdal-dev <gdal-dev-bounces at lists.osgeo.org> On behalf of Clive Swan
>
> Sent: Tuesday, 13 December 2022 17:23
>
> To: gdal-dev at lists.osgeo.org
>
> Subject: [gdal-dev] gdalwarp running very slow
>
>
>
> Greetings,
>
> I am running gdalwarp on a 6 GB (output) and a 600 MB (input) TIFF image. The AWS instance has approximately 60 vCPUs.
>
> It has taken over 6 hours so far and is still running. Is it possible to optimise this and speed it up?
>
>
>
> gdalwarp -r near -overwrite coastal-2020.tif 5_UK_coastal-2020.tif -co BIGTIFF=YES -co COMPRESS=LZW -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co NUM_THREADS=ALL_CPUS --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES
>
>
--
Regards,
Clive Swan
--
M: +44 7766 452665