[gdal-dev] slow netCDF read times

Pablo Rozas Larraondo pablo.larraondo at anu.edu.au
Tue Nov 22 03:14:43 PST 2016


Thank you Julien,

Your comment is very helpful, and I think the issue you're pointing out is
closely related to my problem. I experimented with different chunking
patterns and saw that chunking per line pretty much solved the performance
issue, but I didn't understand why.
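
One way to picture why the chunk layout could matter so much is the rough model below. It is only a sketch under stated assumptions, not a description of what GDAL or the netCDF library actually does: it assumes the reader requests one full-width row at a time, and that decompressed chunks sit in a simple LRU cache with a fixed number of slots. The function name, the 7200-column grid width and the cache sizes are hypothetical; the 250 x 900 chunk shape is the lat/lon part of the chunking used for this file, and the time dimension is ignored.

```python
from collections import OrderedDict


def count_chunk_loads(n_rows, n_cols, chunk_rows, chunk_cols, cache_slots):
    """Count chunk decompressions when a grid is read one full-width row at a time.

    The cache is modelled as a plain LRU holding `cache_slots` decompressed chunks.
    """
    cache = OrderedDict()  # chunk id -> None, ordered from least to most recently used
    loads = 0
    chunks_per_row = -(-n_cols // chunk_cols)  # ceiling division
    for row in range(n_rows):
        chunk_row = row // chunk_rows
        for chunk_col in range(chunks_per_row):
            key = (chunk_row, chunk_col)
            if key in cache:
                cache.move_to_end(key)  # cache hit: mark as most recently used
                continue
            loads += 1  # cache miss: the chunk must be read and inflated again
            cache[key] = None
            if len(cache) > cache_slots:
                cache.popitem(last=False)  # evict the least recently used chunk
    return loads


# Hypothetical numbers: 500 rows of a 7200-column grid.
thrashing = count_chunk_loads(500, 7200, 250, 900, cache_slots=1)
cached = count_chunk_loads(500, 7200, 250, 900, cache_slots=8)
per_line = count_chunk_loads(500, 7200, 1, 7200, cache_slots=1)

print(thrashing, thrashing * 250 * 900)  # 4000 loads, 900000000 values inflated
print(cached, cached * 250 * 900)        # 16 loads, 3600000 values inflated
print(per_line, per_line * 1 * 7200)     # 500 loads, 3600000 values inflated
```

In this model, a cache that cannot hold a full row of chunks inflates the same chunks again for every row, while per-line chunks never need to be inflated twice. That would be consistent with per-line chunking removing the slowdown, provided the assumptions hold.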

Also, ticket 5291 discusses BottomUp NetCDF files causing trouble for GDAL.
I checked, and my file is BottomUp. I flipped it with:
gdal_translate -of netcdf -co "WRITE_BOTTOMUP=NO" chirps-v2.0.1981.dekads.classic.nc out.nc
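
For reference, a rough way to check a file's orientation is sketched below. It assumes that "bottom-up" means the latitude coordinate variable increases with its index, so that the first stored row is the southernmost one. The function name and the sample values are made up for illustration; in practice the values would come from the file's latitude variable (for example as printed by ncdump -v lat, if the variable is named lat).

```python
def is_bottom_up(lat_values):
    """Return True when latitude increases with row index (south-to-north storage)."""
    if len(lat_values) < 2:
        raise ValueError("need at least two latitude values to decide orientation")
    return lat_values[0] < lat_values[-1]


# Hypothetical coordinate values on a 0.05 degree grid.
south_first = [-49.975, -49.925, -49.875]
north_first = [49.975, 49.925, 49.875]

print(is_bottom_up(south_first))  # True
print(is_bottom_up(north_first))  # False
```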

And transformed it again into NetCDF4 with its original chunking pattern
and deflate level:
nccopy -7 -c time/4,lat/250,lon/900 out.nc out2.nc
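
As a back-of-the-envelope check on how large one of these chunks is once decompressed, the sketch below multiplies out the chunk shape passed to nccopy. The 4 bytes per value and the 32 MiB cache size are assumptions chosen for illustration, not values read from the file or from the netCDF library, and the helper names are hypothetical.

```python
from math import prod


def chunk_bytes(chunk_shape, bytes_per_value):
    """Uncompressed size in bytes of one chunk with the given shape."""
    return prod(chunk_shape) * bytes_per_value


def chunks_that_fit(cache_bytes, chunk_shape, bytes_per_value):
    """How many whole uncompressed chunks a cache of `cache_bytes` can hold."""
    return cache_bytes // chunk_bytes(chunk_shape, bytes_per_value)


shape = (4, 250, 900)  # time, lat, lon, as passed to nccopy -c
size = chunk_bytes(shape, 4)  # 4 bytes per value is an assumption

print(size)  # 3600000
print(chunks_that_fit(32 * 1024 * 1024, shape, 4))  # 9, for a hypothetical 32 MiB cache
```

If the cache is smaller than the set of chunks touched by one read request, chunks may be evicted and inflated again on the next request.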

In this case GDAL reads the file almost as fast as the native netCDF
library does. From now on I'll make sure I don't produce netCDF files that
are bottom-up. Does anyone know where this bottom-up convention in netCDF
files comes from, or why it exists?

Cheers,
Pablo




On Tue, Nov 22, 2016 at 8:07 PM, Julien Demaria <Julien.Demaria at acri-st.fr>
wrote:

> Hi,
>
>
>
> Maybe the problem is that your NetCDF internal chunk cache is too small,
> which would relate to this ticket: https://trac.osgeo.org/gdal/ticket/5291
>
> You can change this per-variable cache using this C function:
> http://www.unidata.ucar.edu/software/netcdf/docs/group__variables.html#ga2788cbfc6880ec70c304292af2bc7546
>
> Otherwise, a workaround may be to rechunk your data using nccopy so that
> the chunks are the same size as your reading window.
>
> Another solution is to recompile your NetCDF library with a larger chunk
> cache.
>
>
>
> Regards,
>
>
>
> Julien
>
>
>
> *From:* gdal-dev [mailto:gdal-dev-bounces at lists.osgeo.org] *On Behalf Of*
> Pablo Rozas Larraondo
> *Sent:* Tuesday, 22 November 2016 08:53
> *To:* gdal-dev at lists.osgeo.org
> *Subject:* [gdal-dev] slow netCDF read times
>
>
>
> Hello,
>
>
> I've come across some NetCDF4 files where GDAL takes a surprisingly long
> time to read data. For example, here is a public file containing
> precipitation data:
>
>
>
> ftp://ftp.chg.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/global_dekad/netcdf/chirps-v2.0.2015.dekads.nc
>
>
>
> If I use GDAL to read a small top left block (500x500) from one of its
> time bands, it takes approximately 1 minute on my computer. Source code is
> available here:
>
>
>
> https://gist.github.com/monkeybutter/769a24bcf87682171eb87ac05c9347c5
>
>
>
> The equivalent operation is completed in less than a second using the
> NetCDF library and even reading the whole file takes around 6 seconds with
> the same library.
>
>
>
> I've tried to profile the GDAL program to get more insight into what's
> causing the overhead, without much success. All I know is that the
> deflate function is using 96% of the resources. I also suspect that the
> way this file is chunked has something to do with its performance. Can
> anyone suggest how to better understand what's happening here?
>
>
>
> Thank you for your help,
>
> Pablo
>