<div dir="ltr"><div dir="ltr"><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt">Even,</p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><br></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt">I pulled the updated PR and it is working well in a Linux container:</p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><br></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="monospace">$ ./get_gdal_memory<br>GDAL version is 3.7.0dev<br>GDAL thinks it has 2097152000 bytes of physical memory<br>GDAL thinks it has 2097152000 bytes of usable physical memory<br>sysinfo() thinks it has 811526475776 bytes of physical memory</font></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><br></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt">Thank you for this change, it should make GDAL in LXCs much more stable. Maybe someday<font face="arial, sans-serif"> </font><font face="monospace"><a href="https://man7.org/linux/man-pages/man2/sysinfo.2.html">sysinfo()</a> </font><font face="arial, sans-serif">will be updated to understand cgroup v2 or a new library function will be created to make this easier for you.</font></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><br></font></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif">Angus</font></p></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jan 26, 2023 at 7:52 AM Even Rouault <<a href="mailto:even.rouault@spatialys.com">even.rouault@spatialys.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <p>Angus,</p>
    <p>I've just edited the pull request to take into account MemTotal
      of /proc/meminfo. I've only tested it on my Linux host, but hopefully
      it should also work for your setup given the details you've
      mentioned.</p>
    <p>Laurențiu,</p>
    <p>are you 100% positive you've tested the updated version of the
      pull request? I've just tried running gdallimits under
      Docker on an Ubuntu 22.04 host and it successfully takes into
      account the <span style="font-family:menlo,consolas,monospace,sans-serif">/sys/fs/cgroup/memory.max</span>
      limit.</p>
    <p>Even<span style="font-family:menlo,consolas,monospace,sans-serif"><br>
      </span></p>
    <div>On 26/01/2023 at 02:13, Angus Dickey
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">Even,
        <div><br>
        </div>
        <div>Thanks, that is a quick turnaround! I imagine <a href="https://www.proxmox.com/en/" target="_blank">Proxmox</a> or
          <a href="https://linuxcontainers.org/lxd/introduction/" target="_blank">LXD</a> are pretty much what everyone
          uses to create Linux containers. LXC is the underlying
          technology but also has a set of command-line tools that can
          be used to create containers. In your case it sounds like LXD
          can't choose a subnet for your Linux bridge, which is
          mysterious, and I don't know how to fix that.</div>
        <div><br>
        </div>
        <div>I tried your update inside a container and am still seeing
          the problem where GDAL thinks it has the full host memory:</div>
        <div><br>
          <font face="monospace">$ gdalinfo --version<br>
            GDAL 3.7.0dev, released 2023/99/99 (debug build)<br>
            $ ./get_gdal_memory<br>
          </font></div>
        <font face="monospace">GDAL version is 3.7.0dev<br>
          GDAL thinks it has 135083474944 bytes of physical memory<br>
          GDAL thinks it has 135083474944 bytes of usable physical
          memory<br>
          sysinfo() thinks it has 135083474944 bytes of physical memory</font>
        <div><font face="monospace">$ free -h<br>
                           total        used        free      shared
             buff/cache   available<br>
            Mem:           2.0Gi       152Mi       1.1Gi       0.0Ki    
              755Mi       1.8Gi<br>
            Swap:          256Mi          0B       256Mi</font><br>
        </div>
        <div><font face="monospace">$ cat /proc/meminfo | grep MemTotal<br>
            MemTotal:        2048000 kB</font><br>
        </div>
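        <div>The gap between the two numbers above can be illustrated with a minimal sketch. It parses the <font face="monospace">MemTotal</font> line of <font face="monospace">/proc/meminfo</font> and keeps the smaller of that value and another estimate such as the one from <font face="monospace">sysinfo()</font>. This is not GDAL's actual implementation; the helper names <font face="monospace">meminfo_total_bytes</font>, <font face="monospace">read_meminfo_total_bytes</font> and <font face="monospace">smaller_nonzero</font> are hypothetical, and the sketch assumes a Linux-style <font face="monospace">/proc/meminfo</font> that reports values in kB.</div>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return MemTotal in bytes from /proc/meminfo-style
   text, or 0 when no MemTotal line is present. */
unsigned long long meminfo_total_bytes(const char *text)
{
    assert(text != NULL);
    const char *line = text;
    while (*line != '\0') {
        unsigned long long kb = 0;
        /* /proc/meminfo reports the value in kB (1024-byte units). */
        if (sscanf(line, "MemTotal: %llu kB", &kb) == 1)
            return kb * 1024ULL;
        const char *next = strchr(line, '\n');
        if (next == NULL)
            break;
        line = next + 1;
    }
    return 0;
}

/* Hypothetical helper: read the file at path (normally "/proc/meminfo").
   MemTotal is expected near the top, so a small buffer is assumed to do. */
unsigned long long read_meminfo_total_bytes(const char *path)
{
    char buf[4096];
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_total_bytes(buf);
}

/* Pick the smaller of two estimates, treating 0 as "unknown". */
unsigned long long smaller_nonzero(unsigned long long a, unsigned long long b)
{
    if (a == 0) return b;
    if (b == 0) return a;
    return a < b ? a : b;
}
```

        <div>With the values from this thread, a <font face="monospace">MemTotal</font> of 2048000 kB is 2097152000 bytes, which is the figure one would want to win over the much larger host total.</div>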
        <div><br>
          <div>I wanted to dig a bit but am no expert in
            containerization and cgroup v2. It seems that some tools
            show the memory the container has (<font face="monospace"><a href="https://man7.org/linux/man-pages/man1/free.1.html" target="_blank">free </a></font>& <font face="monospace"><a href="https://man7.org/linux/man-pages/man5/proc.5.html" target="_blank">/proc/meminfo</a></font>) and
            others (<span style="font-family:monospace"><a href="https://man7.org/linux/man-pages/man2/sysinfo.2.html" target="_blank">sysinfo</a></span>) show the host
            memory. For cgroup v2, I see your code tries to read the
            memory limit from a specific <font face="monospace">memory.max</font>
            file in <font face="monospace">/sys/fs/cgroup/</font><font face="arial, sans-serif">. In my <i>containers</i>, that
              file (in fact, every </font><font face="monospace">memory.max</font><font face="arial, sans-serif"> file) contains the default value
              "max".</font></div>
          <div><font face="arial, sans-serif"><br>
            </font></div>
          <div><font face="monospace">$ find /sys/fs/cgroup -type f
              -name memory.max -exec sh -c "cat '{}'" \;<br>
              max<br>
              max<br>
              max<br>
              ... all max ...<br>
              max</font><br>
          </div>
          <div><font face="monospace"><br>
            </font></div>
          <div><font face="arial, sans-serif">If I try the same thing on
              the <i>host </i>I actually find it is set to the
              expected value.</font></div>
          <div><font face="arial, sans-serif"><br>
            </font></div>
          <div><font face="monospace">$ cat
              /sys/fs/cgroup/lxc/901/memory.max<br>
              2097152000</font><font face="arial, sans-serif"><br>
            </font></div>
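          <div><font face="arial, sans-serif">Because a <font face="monospace">memory.max</font> file can hold either the literal "max" or a byte count, any reader has to handle both cases. The following is a rough sketch of that parsing step only, not the code from the pull request; the function name <font face="monospace">parse_memory_max</font> is hypothetical, and it assumes the file content has already been read into a string.</font></div>

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: interpret the content of a cgroup v2 memory.max file.
   Returns 1 and stores the byte limit in *limit when the content is numeric.
   Returns 0 when the content is "max" (no limit at this level) or cannot be
   parsed, in which case *limit is left untouched. */
int parse_memory_max(const char *text, unsigned long long *limit)
{
    assert(text != NULL && limit != NULL);
    if (strncmp(text, "max", 3) == 0)
        return 0;
    if (!isdigit((unsigned char)text[0]))
        return 0;
    errno = 0;
    unsigned long long value = strtoull(text, NULL, 10);
    if (errno == ERANGE)
        return 0;
    *limit = value;
    return 1;
}
```

          <div><font face="arial, sans-serif">A return value of 0 is the case seen inside the container above, so a caller would need to fall back to some other source for the limit.</font></div>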
          <div><font face="monospace"><br>
            </font></div>
          <div><font face="arial, sans-serif">The cgroup values on the
              host appear to be what limits the container memory.
              More rules can be added inside the container, but they are
              still bound by the host rules. I am not sure how </font><font face="monospace">free </font>and <font face="monospace">/proc/meminfo</font><font face="arial, sans-serif"> are getting the correct
              available memory, but maybe I will ask the Proxmox or LXD
              people.</font></div>
          <div><font face="arial, sans-serif"><br>
            </font></div>
          <div><font face="arial, sans-serif">Thanks again,</font></div>
          <div><font face="arial, sans-serif"><br>
            </font></div>
          <div><font face="arial, sans-serif">Angus</font></div>
          <div><br>
          </div>
        </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Wed, Jan 25, 2023 at 4:49
          AM Even Rouault <<a href="mailto:even.rouault@spatialys.com" target="_blank">even.rouault@spatialys.com</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div>
            <p>Angus,</p>
            <p>I'm not familiar with LXC. I tried to setup LXD with <a href="https://linuxcontainers.org/lxd/introduction/" target="_blank">https://linuxcontainers.org/lxd/introduction/</a>
              but it fails with a mysterious "Error: Failed to create
              local member network "lxdbr0" in project "default": Failed
              generating auto config: Failed to automatically find an
              unused IPv4 subnet, manual configuration required"</p>
            <p>Anyway, I've attempted in <a href="https://github.com/OSGeo/gdal/pull/7124" target="_blank">https://github.com/OSGeo/gdal/pull/7124</a>
              to better take into account cgroup to get memory
              limitation. Could you give this a try?</p>
            <p>Even<br>
            </p>
            <div>On 25/01/2023 at 06:24, Angus Dickey wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div dir="ltr">
                  <div>
                    <div dir="ltr">
                      <div dir="ltr"><span>
                          <p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt">Even,</p>
                          <p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><br>
                          </p>
                          <p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt">Thanks
                            for the reply. I went ahead and compiled the
                            latest GDAL 3.6.2 on Ubuntu 22.04.
                            Unfortunately I ended up with a similar
                            result: GDAL thinks it has 755GB of RAM to
                            work with when it only has 2GB:</p>
                          <p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><br>
                          </p>
                          <p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="monospace">$ gdalinfo --version<br>
                              GDAL 3.6.2, released 2023/01/02 (debug
                              build)<br>
                              <br>
                              $ ./get_gdal_memory<br>
                              GDAL version is 3.6.2<br>
                              GDAL thinks is has 811526475776 bytes of
                              physical memory<br>
                              GDAL thinks it has 811526475776 bytes of
                              usable physical memory<br>
                              <br>
                              $ free -h<br>
                                             total        used      
                               free      shared  buff/cache   available<br>
                              Mem:           2.0Gi       148Mi      
                              1.2Gi       0.0Ki       639Mi       1.8Gi<br>
                              Swap:          256Mi          0B      
                              256Mi<br>
                            </font></p>
                        </span></div>
                    </div>
                  </div>
                </div>
                <div><br>
                </div>
                My knowledge of the subject is limited, but I think Linux
                containers (LXC) use cgroups and not setrlimit to limit
                resources, so maybe that is why the new changes had no
                effect. To reproduce this issue you can create a
                container using LXC, LXD, or a hypervisor like Proxmox
                (what I am using) and call CPLGetUsablePhysicalRAM().
                <div><br>
                </div>
                <div>If there is any other info that might be helpful,
                  let me know. I might try a Docker container, which also
                  uses cgroups and is more popular than LXC, although
                  it fulfills a different function.
                  <div><br>
                  </div>
                  <div>thanks,</div>
                  <div><br>
                  </div>
                  <div>Angus<br>
                    <div><br>
                    </div>
                    <div><br>
                      <div class="gmail_quote">
                        <div dir="ltr" class="gmail_attr">On Tue, Jan
                          24, 2023 at 5:50 PM Even Rouault <<a href="mailto:even.rouault@spatialys.com" target="_blank">even.rouault@spatialys.com</a>>
                          wrote:<br>
                        </div>
                        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                          <div>
                            <p>Angus,</p>
                            <p>there has been a recent extra fix that
                              landed in GDAL 3.6.2 that might possibly
                              help: <a href="https://github.com/OSGeo/gdal/pull/6926" target="_blank">https://github.com/OSGeo/gdal/pull/6926</a></p>
                            <p>Even</p>
                            <div>On 25/01/2023 at 01:36, Angus Dickey
                              wrote:<br>
                            </div>
                            <blockquote type="cite">
                              <div dir="ltr">Hi all,
                                <div><br>
                                </div>
                                <div>I am running into an issue where
                                  GDAL overestimates the amount of
                                  physical memory it has, which leads to it
                                  locking up the OS by taking 100% of
                                  the memory. Here is an example program
                                  that illustrates the issue:<br>
                                  <br>
                                  #include <stdio.h><br>
                                  #include "gdal.h"<br>
                                  <br>
                                  int main(void) {<br>
                                     printf("GDAL version is %s\n",
                                  GDALVersionInfo("RELEASE_NAME"));<br>
                                     printf("GDAL thinks is has %lld
                                  bytes of physical memory\n",
                                  CPLGetPhysicalRAM());<br>
                                     printf("GDAL thinks it has %lld
                                  bytes of usable physical memory\n",
                                  CPLGetUsablePhysicalRAM());<br>
                                     return 0;<br>
                                  }<br>
                                </div>
                                <div><br>
                                </div>
                                <div>When this is compiled with GDAL
                                  3.5.1 on Ubuntu 22.04 we get:<br>
                                </div>
                                <div><br>
                                </div>
                                <div>$ ./get_gdal_memory <br>
                                  GDAL version is 3.5.1<br>
                                  GDAL thinks is has 811526475776 bytes
                                  of physical memory<br>
                                  GDAL thinks it has 811526475776 bytes
                                  of usable physical memory<br>
                                  <br>
                                  Which is not consistent with the
                                  actual available memory:</div>
                                <div><br>
                                  $ free -h<br>
                                                 total        used      
                                   free      shared  buff/cache  
                                  available<br>
                                  Mem:           2.0Gi       148Mi      
                                  1.2Gi       0.0Ki       639Mi      
                                  1.8Gi<br>
                                  Swap:          256Mi          0B      
                                  256Mi<br>
                                </div>
                                <div><br>
                                </div>
                                <div>So GDAL thinks it has 755GB of
                                  memory but it only has 2GB. This
                                  causes issues with the raster read
                                  cache and maybe elsewhere. I suspect
                                  this is happening because it is
                                  running in a <a href="https://linuxcontainers.org/" target="_blank">Linux
                                    container</a> and GDAL is getting
                                  the total physical memory of the host,
                                  not the container. The strange thing
                                  is that Linux containers use cgroups for
                                  memory restrictions and the API docs <a href="https://gdal.org/api/cpl.html#_CPPv417CPLGetPhysicalRAMv" target="_blank">mention it
                                    was fixed in GDAL 2.4.0</a>, but I am
                                  still seeing the issue in 3.5.1.</div>
                                <div><br>
                                </div>
                                <div>Any help or insight would be
                                  appreciated; I am happy to provide any
                                  additional information or testing.</div>
                                <div><br>
                                </div>
                                <div>Thanks,</div>
                                <div><br>
                                </div>
                                <div>Angus</div>
                              </div>
                              <br>
                              <fieldset></fieldset>
                              <pre>_______________________________________________
gdal-dev mailing list
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a>
<a href="https://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank">https://lists.osgeo.org/mailman/listinfo/gdal-dev</a>
</pre>
                            </blockquote>
                            <pre cols="72">-- 
<a href="http://www.spatialys.com" target="_blank">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
                          </div>
                        </blockquote>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </blockquote>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </div>

</blockquote></div></div>