<div dir="ltr">To partially appease the crowd, the data provider has since acknowledged the issue on their end and is working on a fix - thankfully they are not one of those providers that take a month to respond with a shrug.<div><div><br></div><div><div>Cheers,</div><div>Daniel<br><div><br></div><div><br></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 10 Sept 2024 at 16:11, thomas bonfort <<a href="mailto:thomas.bonfort@gmail.com">thomas.bonfort@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">I'm not sure that providing a fix to work around this very broken behavior is the best course of action to make them fix their server... <br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Sep 10, 2024 at 5:07 PM Even Rouault via gdal-dev <<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>

  
    
  
  <div>
    <p><br>
    </p>
    <div>On 10/09/2024 at 16:10, Rahkonen Jukka
      via gdal-dev wrote:<br>
    </div>
    <blockquote type="cite">
      
      
      
      <div>
        <p class="MsoNormal"><span>Hi,<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US">Have you tried with configuration option
            “CPL_VSIL_CURL_USE_HEAD=[YES/NO]: Defaults to YES. Controls
            whether to use a HEAD request when opening a remote URL.”</span></p>
      </div>
    </blockquote>
    <p>I was just going to suggest that too. It "works", but not really.
      It just postpones the core issue: the server doesn't support GET
      Range requests, so can't be used with /vsicurl/</p>
    <p>As it has a COG organization with overview data first in the
      file, if you want to read the smallest overview(s), you can use
      /vsicurl_streaming/ instead, but that won't be efficient for reading
      the bottom-right-most tile at full resolution, which will
      require reading the whole file...</p>
    <p>Nothing GDAL can do about that.</p>
    <p>Actually... digging further... it does somehow support Range
      requests, but in what I believe is a non-compliant way. It
      returns the expected content, but with HTTP 200 instead of HTTP 206
      (Partial Content). And it never returns the Content-Length header.</p>
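    <p>To make the distinction concrete, here is a minimal Python sketch
      of how a reply to a Range request could be classified.
      <code>classify_range_response</code> is a hypothetical helper, not part of
      GDAL, and the labels it returns are made up for illustration. It only
      looks at a status code, headers and body length that the caller has
      already obtained, and it assumes a request of the form
      <code>Range: bytes=start-end</code> with inclusive bounds.</p>

```python
from typing import Mapping


def classify_range_response(
    status: int,
    headers: Mapping[str, str],
    body_len: int,
    range_start: int,
    range_end: int,
) -> str:
    """Classify a reply to 'Range: bytes=<range_start>-<range_end>' (inclusive)."""
    requested = range_end - range_start + 1
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    has_length = "content-length" in lowered

    if status == 206:
        # What a server honouring the range is expected to send.
        return "compliant partial content"
    if status == 200 and body_len == requested:
        # Right number of bytes, wrong status code. (A complete file that
        # happens to be exactly this long would look the same.)
        suffix = "" if has_length else ", no Content-Length"
        return "partial body with HTTP 200" + suffix
    if status == 200:
        # The Range header was ignored and the whole resource came back.
        return "range ignored, full body returned"
    return "unexpected status"
```

    <p>Under these assumptions, the behaviour described above (expected
      bytes, HTTP 200, no Content-Length) falls into the second branch.</p>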
    <p>Well, I've implemented a workaround in
      <a href="https://github.com/OSGeo/gdal/pull/10760" target="_blank">https://github.com/OSGeo/gdal/pull/10760</a> that might be useful in
      other similar cases too.</p>
    <p>With that, the following works:</p>
    <pre><code>gdal_translate "/vsicurl?file_size=unlimited&url=<a href="https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif" target="_blank">https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif</a>" --config GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR out.tif -srcwin 5000 5000 50 50</code></pre>
    <p></p>
    <p>file_size=unlimited works here since the GTiff driver doesn't
      really need to know the right file size; it only checks at some
      points that we don't try to read beyond the end of the file, so
      unlimited is OK. In other situations/drivers, the exact value could be needed.<br>
    </p>
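    <p>For anyone scripting this, the command above could be assembled
      along these lines in Python. This is only a sketch:
      <code>vsicurl_unlimited</code> and <code>translate_args</code> are hypothetical
      helper names, and the result is only useful with a GDAL build that
      includes the workaround from the pull request above. The URL is
      inserted as-is, as in the command above, so a URL containing
      characters such as <code>&amp;</code> would need extra care.</p>

```python
from typing import List, Tuple


def vsicurl_unlimited(url: str) -> str:
    """Build a /vsicurl path that carries the file_size=unlimited option."""
    # The URL is appended unchanged, matching the command in this thread.
    return f"/vsicurl?file_size=unlimited&url={url}"


def translate_args(
    url: str, out_path: str, srcwin: Tuple[int, int, int, int]
) -> List[str]:
    """Return the gdal_translate argument list for a small window read."""
    xoff, yoff, xsize, ysize = srcwin
    return [
        "gdal_translate",
        vsicurl_unlimited(url),
        # Avoid listing sibling files on the remote server when opening.
        "--config", "GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR",
        out_path,
        "-srcwin", str(xoff), str(yoff), str(xsize), str(ysize),
    ]
```

    <p>The returned list can be handed to <code>subprocess.run</code> on a
      machine where a suitable <code>gdal_translate</code> is on the PATH.</p>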
    <p>But they should really fix their servers.<br>
    </p>
    <p>Even<br>
    </p>
    <blockquote type="cite">
      <div>
        <p class="MsoNormal"><span lang="EN-US"><u></u><u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US">-Jukka Rahkonen-<u></u><u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p>
        <div style="border-width:1pt medium medium;border-style:solid none none;border-color:rgb(225,225,225) currentcolor currentcolor;padding:3pt 0cm 0cm">
          <p class="MsoNormal"><b>From:</b> gdal-dev
            <a href="mailto:gdal-dev-bounces@lists.osgeo.org" target="_blank"><gdal-dev-bounces@lists.osgeo.org></a>
            <b>On Behalf Of </b>Daniel Evans via gdal-dev<br>
            <b>Sent:</b> Tuesday, 10 September 2024 16:57<br>
            <b>To:</b> '<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a>'
            (<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a>) <a href="mailto:gdal-dev@lists.osgeo.org" target="_blank"><gdal-dev@lists.osgeo.org></a><br>
            <b>Subject:</b> [gdal-dev] Ignore content-length in vsicurl?<u></u><u></u></p>
        </div>
        <p class="MsoNormal"><u></u> <u></u></p>
        <div>
          <p class="MsoNormal">Hi all,<u></u><u></u></p>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">I am attempting to read a dataset via
              /vsicurl/ where I believe the server is incorrectly
              returning `content-length: 0` in response to HEAD
              requests. This causes GDAL to believe it's a zero-length
              file, and it therefore can't be read.<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">If I download the file via HTTP GET,
              it's valid, and GDAL can read it locally. I've also
              confirmed I can use /vsicurl/ on some test datasets in the
              GDAL repo.<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">Is it possible to force GDAL to work
              around the faulty content-length header, or is it too
              fundamental a problem to ignore?<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">I've separately got in touch with the
              data provider to see if they are able to fix the issue at
              their end.<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">Cheers,<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal">Daniel<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">URL of the troublesome dataset:<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><a href="https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif" target="_blank">https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif</a><u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">Example HTTP header responses I'm
              seeing:<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">GET<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">HTTP/2 200<br>
              date: Tue, 10 Sep 2024 13:47:54 GMT<br>
              content-type: binary/octet-stream<br>
              content-length: 278198294<br>
              vary: Origin, Access-Control-Request-Method,
              Access-Control-Request-Headers<br>
              etag: "a79f3f685281d6681e4d362536c5b3eb-34"<br>
              last-modified: Thu, 25 Jul 2024 13:16:08 GMT<br>
              x-version: 0.0.16<br>
              access-control-allow-credentials: true<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">HEAD<u></u><u></u></p>
          </div>
          <div>
            <p class="MsoNormal"><u></u> <u></u></p>
          </div>
          <div>
            <p class="MsoNormal">HTTP/2 200<br>
              date: Tue, 10 Sep 2024 13:48:08 GMT<br>
              content-type: binary/octet-stream<br>
              content-length: 0<br>
              x-version: 0.0.16<br>
              access-control-allow-credentials: true<br>
              etag: "a79f3f685281d6681e4d362536c5b3eb-34"<br>
              last-modified: Thu, 25 Jul 2024 13:16:08 GMT<br>
              vary: Origin, Access-Control-Request-Method,
              Access-Control-Request-Headers<u></u><u></u></p>
          </div>
        </div>
      </div>
      <br>
      <fieldset></fieldset>
      <pre>_______________________________________________
gdal-dev mailing list
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a>
<a href="https://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank">https://lists.osgeo.org/mailman/listinfo/gdal-dev</a>
</pre>
    </blockquote>
    <pre cols="72">-- 
<a href="http://www.spatialys.com" target="_blank">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
  </div>

</blockquote></div></div>
</blockquote></div>