<div dir="ltr">Thank you Julien,<div><br></div><div>Your comment is very helpful, and I think the issue you're pointing out is closely related to my problem. I did experiment with different chunking patterns and saw that chunking per line pretty much solved the performance issue, but I didn't understand why.</div><div><br></div><div>Also, ticket 5291 talks about bottom-up NetCDF files causing grief to GDAL. I've checked and my file is bottom-up. I've flipped it by doing:</div><div>gdal_translate -of netcdf -co "WRITE_BOTTOMUP=NO" chirps-v2.0.1981.dekads.classic.nc out.nc</div><div><br></div><div>And converted it back into NetCDF4 with its original chunking pattern and deflate level:</div><div>nccopy -7 -c time/4,lat/250,lon/900 out.nc out2.nc<br></div><div><br></div><div>In this case GDAL reads the file almost as fast as the native NetCDF library does. From now on I'll make sure I don't produce NetCDF files that are bottom-up. Does anyone know where this bottom-up convention in NetCDF files comes from, or why it exists?</div><div><br></div><div>Cheers,</div><div>Pablo</div><div><br></div><div><br></div><div> </div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Nov 22, 2016 at 8:07 PM, Julien Demaria <span dir="ltr">&lt;<a href="mailto:Julien.Demaria@acri-st.fr" target="_blank">Julien.Demaria@acri-st.fr</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="FR" link="blue" vlink="purple">
<div class="m_-6974236268276877649WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Hi,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">This may be caused by the NetCDF library's internal chunk cache being too small, as described in this ticket:
<a href="https://trac.osgeo.org/gdal/ticket/5291" target="_blank">https://trac.osgeo.org/gdal/<wbr>ticket/5291</a><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">You can change this per-variable cache using this C function:
<a href="http://www.unidata.ucar.edu/software/netcdf/docs/group__variables.html#ga2788cbfc6880ec70c304292af2bc7546" target="_blank">
http://www.unidata.ucar.edu/<wbr>software/netcdf/docs/group__<wbr>variables.html#<wbr>ga2788cbfc6880ec70c304292af2bc<wbr>7546</a><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Alternatively, a workaround may be to rechunk your data with nccopy so that the chunks are the same size as your reading window.<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Another option is to recompile your NetCDF library with a larger chunk cache.<u></u><u></u></span></p>
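<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">As a rough illustration of how the cache size and the chunk shape interact, here is a minimal Python sketch. It is a simplified model, not GDAL or NetCDF code: it assumes the reader fetches one full-width scanline at a time and that the chunk cache behaves like a plain LRU holding a fixed number of chunks. The function name count_chunk_loads, the 7200-column grid width and the 250x900 (lat x lon) chunk shape are assumptions chosen for illustration, and the time dimension is ignored.</span></p>
<pre>
```python
from collections import OrderedDict


def count_chunk_loads(n_rows, row_width, chunk_rows, chunk_cols, cache_slots):
    """Count chunk decompressions for a scanline-by-scanline read.

    Simplified model: every scanline spans the full row width, and the
    cache is a plain LRU that holds at most `cache_slots` chunks.
    """
    cache = OrderedDict()  # keys are (chunk_row, chunk_col), ordered by last use
    loads = 0
    chunks_per_row = -(-row_width // chunk_cols)  # ceiling division
    for row in range(n_rows):
        chunk_row = row // chunk_rows
        for chunk_col in range(chunks_per_row):
            key = (chunk_row, chunk_col)
            if key in cache:
                cache.move_to_end(key)  # cache hit: mark as most recently used
                continue
            loads += 1  # cache miss: the chunk must be decompressed again
            cache[key] = True
            if len(cache) > cache_slots:
                cache.popitem(last=False)  # evict the least recently used chunk
    return loads


# Hypothetical numbers: 500 scanlines of an assumed 7200-column grid.
small_cache = count_chunk_loads(500, 7200, 250, 900, cache_slots=1)
large_cache = count_chunk_loads(500, 7200, 250, 900, cache_slots=8)
per_line = count_chunk_loads(500, 7200, 1, 7200, cache_slots=1)
print(small_cache, large_cache, per_line)
```
</pre>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">In this model, a cache that holds a single chunk decompresses 4000 chunks for 500 scanlines, while a cache that holds one full row of chunks (8) needs only 16. With one chunk per scanline there are 500 loads, but each chunk is far smaller (7200 values instead of 225000), which is why matching the chunks to the reading window can help even with a small cache.</span></p>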
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Regards,<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Julien<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> gdal-dev [mailto:<a href="mailto:gdal-dev-bounces@lists.osgeo.org" target="_blank">gdal-dev-bounces@<wbr>lists.osgeo.org</a>]
<b>On behalf of</b> Pablo Rozas Larraondo<br>
<b>Sent:</b> Tuesday, 22 November 2016 08:53<br>
<b>To:</b> <a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a><br>
<b>Subject:</b> [gdal-dev] slow netCDF read times<u></u><u></u></span></p><div><div class="h5">
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">Hello,<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><br>
I've come across some NetCDF4 files that GDAL takes a surprisingly long time to read. Here is a publicly available example file containing precipitation data:<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><a href="ftp://ftp.chg.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/global_dekad/netcdf/chirps-v2.0.2015.dekads.nc" target="_blank">ftp://ftp.chg.ucsb.edu/pub/<wbr>org/chg/products/CHIRPS-2.0/<wbr>global_dekad/netcdf/chirps-v2.<wbr>0.2015.dekads.nc</a><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">If I use GDAL to read a small top-left block (500x500) from one of its time bands, it takes approximately 1 minute on my computer. The source code is available here:<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><a href="https://gist.github.com/monkeybutter/769a24bcf87682171eb87ac05c9347c5" target="_blank">https://gist.github.com/<wbr>monkeybutter/<wbr>769a24bcf87682171eb87ac05c9347<wbr>c5</a><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">The equivalent operation completes in less than a second using the NetCDF library, and even reading the whole file takes only around 6 seconds with that library.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">I've tried profiling the GDAL program to understand what's causing the overhead, without much success. All I know is that the deflate function is using 96% of the resources. I also suspect
that the way this file is chunked has something to do with its performance. Can anyone suggest how to better understand what's happening here?<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">Thank you for your help,<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">Pablo<u></u><u></u></span></p>
</div>
</div>
</div></div></div>
</div>
</blockquote></div><br></div>