<div dir="ltr"><div>For tcmalloc, do you need master? This recent release seems to have CMake support:<br></div><div><a href="https://github.com/gperftools/gperftools/releases/tag/gperftools-2.15">https://github.com/gperftools/gperftools/releases/tag/gperftools-2.15</a></div><div><br></div><div>Of course, I do not mean to force its usage. But it could be a suggestion in case we do not find anything better and a user has problems, or a way to inspire later research.</div><div><br></div><div>For us it is definitely helping.</div><div><br></div><div>Cheers,</div><div>Javier<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 21 Mar 2024 at 14:59, Even Rouault via gdal-dev <<a href="mailto:gdal-dev@lists.osgeo.org">gdal-dev@lists.osgeo.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>
<div>
<p>I've played with VirtualAlloc(NULL, SINGLE_ALLOC_SIZE, MEM_COMMIT
| MEM_RESERVE, PAGE_READWRITE), and it does avoid the performance
issue. However, I see that VirtualAlloc() allocates in chunks of 64
kB, so depending on the size of a block, it might waste a
significant amount of RAM, and therefore can't be used as a direct
replacement for malloc().<br>
</p>
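<p>To make the potential waste concrete, here is a small Python sketch of the arithmetic. It assumes, as observed above, that every allocation ends up occupying a multiple of 64 kB; the function name is made up for illustration, and on a real system the granularity would be read from GetSystemInfo() (dwAllocationGranularity) rather than hard-coded.</p>

```python
GRANULARITY = 64 * 1024  # assumed allocation granularity in bytes (the 64 kB observed above)


def virtualalloc_footprint(block_size, granularity=GRANULARITY):
    """Return (occupied_bytes, wasted_bytes) if block_size is rounded up to the granularity."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    chunks = -(-block_size // granularity)  # ceiling division
    occupied = chunks * granularity
    return occupied, occupied - block_size
```

<p>Under that assumption, a block of 4096 * 4 + 1 bytes would occupy a whole 64 kB chunk and leave about three quarters of it unused, while a block of exactly 64 kB would waste nothing.</p>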
<p>My inclination would be to perhaps have an optional config option
like GDAL_BLOCK_CACHE_USE_PRIVATE_HEAP that could be set, and when
doing so it would use HeapCreate(0, 0, GDAL_CACHEMAX) to create a
heap only used by the block cache. Not ideal, since that would
reserve the whole GDAL_CACHEMAX (but for a large enough
processing, you'll end up consuming it), but it has the advantage
of not being extremely intrusive either... and could be easily
ditched/replaced by something better in the future.<br>
</p>
<p>Regarding tcmalloc, I've had to use it on Linux too, but only in
scenarios involving multithreading, where it helps reduce RAM
fragmentation: cf
<a href="https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading" target="_blank">https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading</a>
. I've just tried quickly to use it on Windows to test it on the
scenario, but didn't really manage to make it work. Even building
it was challenging. Actually I tried
<a href="https://github.com/gperftools/gperftools" target="_blank">https://github.com/gperftools/gperftools</a> and I had to build from
master since the latest tagged version doesn't build with CMake on
Windows. But then nothing happens when linking
tcmalloc_minimal.lib against my toy app. I probably missed
something.<br>
</p>
<p>Anyway, I don't really think we can force tcmalloc to be used in
GDAL, as a library, unless there is a way to have its
allocator optionally used at places that we control (i.e.
explicitly call tc_malloc / tc_free) without replacing the default
malloc / free etc., which might be undesirable when GDAL is just a
component of a larger application.<br>
</p>
<p>Disabling the block cache entirely (or setting it to a minimum
value) is only a workable option for uncompressed formats, or if
you use per-band blocks (INTERLEAVE=BAND in GTiff language) and
not one block for all bands (INTERLEAVE=PIXEL); otherwise you'll
pay for the decompression multiple times.<br>
</p>
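<p>A toy Python model of that last point may help. It assumes every band of every block is read once, band by band, and simply counts how often a compressed block would have to be decompressed. The function name and the unbounded cache are hypothetical simplifications, not GDAL API.</p>

```python
def count_decompressions(n_blocks, n_bands, interleave, cache_enabled):
    """Count block decompressions when every band of every block is read once, band by band.

    interleave: "PIXEL" (one compressed block holds all bands) or
                "BAND" (each band has its own compressed block).
    """
    if interleave not in ("PIXEL", "BAND"):
        raise ValueError("interleave must be 'PIXEL' or 'BAND'")
    cached = set()  # keys of decompressed blocks kept by the (toy, unbounded) cache
    decompressions = 0
    for band in range(n_bands):
        for block in range(n_blocks):
            # With PIXEL interleaving the compressed unit is shared by all bands.
            key = block if interleave == "PIXEL" else (block, band)
            if cache_enabled and key in cached:
                continue  # served from the cache, no decompression
            decompressions += 1
            if cache_enabled:
                cached.add(key)
    return decompressions
```

<p>In this model, switching the cache off changes nothing for INTERLEAVE=BAND, since each block is only ever needed once, whereas for INTERLEAVE=PIXEL the same block is decompressed once per band.</p>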
<div>On 21/03/2024 at 14:38, Meyer, Jesse R.
(GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev
wrote:<br>
</div>
<blockquote type="cite">
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Aptos",sans-serif">+1.
We use a variety of hand-rolled VirtualAlloc based (for
basic tasks, a simple pointer bump, and for more elaborate
needs, a ‘buddy’) allocators, some of which try to be smart
about memory usage via de-committing regions. In our work,
we tend to disable the GDAL cache entirely and rely on the
file system’s file cache instead, which is a simplification
we can make but is surely untenable in general here.<u></u><u></u></span></p>
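<p>For readers unfamiliar with the term, a pointer-bump allocator can be sketched in a few lines. This is only a Python illustration over a preallocated bytearray, not the VirtualAlloc-based allocators mentioned above; the class name and the 16-byte alignment are assumptions chosen for the example.</p>

```python
class BumpArena:
    """Toy pointer-bump allocator: one upfront buffer, allocations are slices of it."""

    def __init__(self, capacity, alignment=16):
        self._buffer = bytearray(capacity)  # single allocation made once
        self._view = memoryview(self._buffer)
        self._offset = 0  # the "bump pointer"
        self._alignment = alignment

    def alloc(self, size):
        """Return a writable view of `size` bytes, or raise MemoryError when full."""
        start = -(-self._offset // self._alignment) * self._alignment  # round up
        end = start + size
        if end > len(self._buffer):
            raise MemoryError("arena exhausted")
        self._offset = end
        return self._view[start:end]

    def reset(self):
        """Release every allocation at once by rewinding the bump pointer."""
        self._offset = 0

    @property
    def used(self):
        return self._offset
```

<p>Freeing is a single reset() rather than one free() per allocation, which is what sidesteps the per-allocation deallocation cost discussed in this thread.</p>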
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Aptos",sans-serif"><u></u> <u></u></span></p>
<div style="border-width:1pt medium medium;border-style:solid none none;border-color:rgb(181,196,223) currentcolor currentcolor;padding:3pt 0in 0in">
<p class="MsoNormal"><b><span style="font-family:"Calibri",sans-serif;color:black" lang="CA">From:
</span></b><span style="font-family:"Calibri",sans-serif;color:black" lang="CA">gdal-dev
<a href="mailto:gdal-dev-bounces@lists.osgeo.org" target="_blank"><gdal-dev-bounces@lists.osgeo.org></a> on behalf of Abel
Pau via gdal-dev <a href="mailto:gdal-dev@lists.osgeo.org" target="_blank"><gdal-dev@lists.osgeo.org></a><br>
<b>Reply-To: </b>Abel Pau <a href="mailto:a.pau@creaf.uab.cat" target="_blank"><a.pau@creaf.uab.cat></a><br>
<b>Date: </b>Thursday, March 21, 2024 at 4:51 AM<br>
<b>To: </b><a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">"gdal-dev@lists.osgeo.org"</a>
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank"><gdal-dev@lists.osgeo.org></a><br>
<b>Subject: </b>[EXTERNAL] [BULK] Re: [gdal-dev]
Experience with slowness of free() on Windows with lots of
allocations?<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Aptos",sans-serif" lang="CA"><u></u> <u></u></span></p>
</div>
<p class="MsoNormal" style="margin-bottom:12pt"><span style="font-family:"Aptos",sans-serif" lang="CA"><br>
<br>
<u></u><u></u></span></p>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)" lang="ES">Hi Even,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)" lang="ES"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)" lang="ES">You’re right. We also know that.
</span><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)">When
programming the driver I took it into consideration. Our
solution is not to rely on Windows to do a good job with
memory; we try to reuse as much memory as possible instead
of using calloc/free freely.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)">For
instance, in the driver, I have to read or write the
coordinates of each feature. I could allocate every time I need to,
which is very often: allocate memory for reading, put
the coordinates on the feature, then free, over and over. What
do I do instead? When opening the layer I create some memory blocks of
250 Mb (due to the format itself) and I use that
memory to manage whatever I need. When closing, I free
it.<u></u><u></u></span></p>
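<p>The pattern described above (allocate once when the layer is opened, reuse for every feature, release on close) could look roughly like the following Python sketch. The class name, the record layout of two little-endian doubles per point, and the stream interface are all assumptions made for the example; this is not the driver's actual code.</p>

```python
import io
import struct


class LayerReader:
    """Hypothetical layer reader that reuses one scratch buffer for every feature."""

    POINT_SIZE = 16  # two little-endian doubles (x, y); an assumed record layout

    def __init__(self, stream, max_points):
        self._stream = stream
        # Allocated once when the layer is opened, released when the reader goes away.
        self._scratch = bytearray(max_points * self.POINT_SIZE)
        self._view = memoryview(self._scratch)

    def read_points(self, n_points):
        """Read n_points coordinates into the shared buffer and return them as tuples."""
        n_bytes = n_points * self.POINT_SIZE
        if n_bytes > len(self._scratch):
            raise ValueError("feature larger than the scratch buffer")
        # The raw bytes land in the preallocated buffer, not in a new one per feature.
        got = self._stream.readinto(self._view[:n_bytes])
        if got != n_bytes:
            raise EOFError("truncated feature")
        return list(struct.iter_unpack("<2d", self._view[:n_bytes]))
```

<p>The raw read buffer is never reallocated between features; only the decoded coordinate tuples are created per call.</p>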
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)">While
doing that I observed that sometimes I have to use GDAL
code that doesn’t take this into consideration (</span><span style="font-size:9.5pt;font-family:Consolas;color:rgb(111,0,138)" lang="CA">CPLRecode()</span><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)">
for instance). Perhaps it could be improved as well.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)">Thanks
for noticing that.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:"Calibri",sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:11pt;font-family:"Calibri",sans-serif" lang="ES">From:</span></b><span style="font-size:11pt;font-family:"Calibri",sans-serif" lang="ES"> gdal-dev
<a href="mailto:gdal-dev-bounces@lists.osgeo.org" target="_blank"><gdal-dev-bounces@lists.osgeo.org></a>
<b>On behalf of </b>Javier Jimenez Shaw via gdal-dev<br>
<b>Sent:</b> Thursday, 21 March 2024 8:27<br>
<b>To:</b> Even Rouault
<a href="mailto:even.rouault@spatialys.com" target="_blank"><even.rouault@spatialys.com></a><br>
<b>CC:</b> gdal dev <a href="mailto:gdal-dev@lists.osgeo.org" target="_blank"><gdal-dev@lists.osgeo.org></a><br>
<b>Subject:</b> Re: [gdal-dev] Experience with slowness of
free() on Windows with lots of allocations?<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="CA"><u></u> <u></u></span></p>
<div>
<p class="MsoNormal"><span lang="CA">In my company we
confirmed that "Windows heap allocation mechanism
sucks."<u></u><u></u></span></p>
<div>
<p class="MsoNormal"><span lang="CA">Closing the
application after using the GTiff driver can take many
seconds due to memory deallocations.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="CA"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="CA">One workaround was to
use tcmalloc. I will ask my colleagues for more details
next week.<u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal"><span lang="CA"><u></u> <u></u></span></p>
<div>
<div>
<p class="MsoNormal"><span lang="CA">On Thu, 21 Mar 2024,
01:55 Even Rouault via gdal-dev, <<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a>>
wrote:<u></u><u></u></span></p>
</div>
<blockquote style="border-width:medium medium medium 1pt;border-style:none none none solid;border-color:currentcolor currentcolor currentcolor rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<p class="MsoNormal"><span lang="CA">Hi,<br>
<br>
while investigating <br>
<a href="https://github.com/OSGeo/gdal/issues/9510#issuecomment-2010950408" target="_blank">https://github.com/OSGeo/gdal/issues/9510#issuecomment-2010950408</a>,
I've
<br>
come to the conclusion that the Windows heap
allocation mechanism sucks. <br>
Basically, if you allocate a lot of heap regions of
modest size with <br>
malloc()/new[], freeing them all
with the corresponding <br>
free()/delete[] is excruciatingly slow (like ~ 10
seconds for ~ 80,000 <br>
allocations). The slowness is clearly quadratic in
the number of <br>
allocations. You only start noticing it with ~ 30,000
allocations. And <br>
interestingly, another condition for that slowness is
that each <br>
individual allocation must be strictly greater than
4096 * 4 bytes. At <br>
exactly that value, perf is acceptable, but add one
extra byte, and it <br>
suddenly drops. I suspect that there must be a
threshold from which <br>
malloc() starts using VirtualAlloc() instead of the
heap, which must <br>
involve slow system calls, instead of a user-land
allocation mechanism.<br>
<br>
Has anyone already hit that and found solutions? The
only potential idea <br>
I have found so far would be to use a private heap with
HeapCreate() with <br>
a fixed maximum size, which is a bit problematic to
adopt by default: <br>
basically that would mean that the size of
GDAL_CACHEMAX would be <br>
consumed as soon as one uses the block cache.<br>
<br>
Even<br>
<br>
-- <br>
<a href="http://www.spatialys.com/" target="_blank">http://www.spatialys.com</a><br>
My software is free, but my time generally not.<br>
<br>
_______________________________________________<br>
gdal-dev mailing list<br>
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank">https://lists.osgeo.org/mailman/listinfo/gdal-dev</a><u></u><u></u></span></p>
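<p>The allocation pattern described in the message above can be approximated with a short Python timing helper. This is a sketch only: it relies on the assumption that CPython hands large bytearray buffers to the C runtime allocator, CPython adds its own per-buffer overhead so the exact byte threshold may not line up, and whether it reproduces the slowdown on a given Windows version is untested here. The function name is made up.</p>

```python
import time


def time_alloc_free(n_allocs, alloc_size):
    """Allocate n_allocs buffers of alloc_size bytes, then free them; return (alloc_s, free_s)."""
    t0 = time.perf_counter()
    buffers = [bytearray(alloc_size) for _ in range(n_allocs)]
    t1 = time.perf_counter()
    buffers.clear()  # drops the only references, so CPython frees each buffer here
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1
```

<p>Calling it with 80,000 allocations, once at 4096 * 4 bytes and once at 4096 * 4 + 1 bytes, would mirror the numbers quoted above; note that this needs roughly 1.3 GB of RAM.</p>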
</blockquote>
</div>
</div>
</div>
<br>
</blockquote>
<pre cols="72">--
<a href="http://www.spatialys.com" target="_blank">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
</div>
</blockquote></div>