<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>I've played with VirtualAlloc(NULL, SINGLE_ALLOC_SIZE, MEM_COMMIT
| MEM_RESERVE, PAGE_READWRITE), and it does avoid the performance
issue. However, I see that VirtualAlloc() allocates in chunks of 64
kB, so depending on the size of a block, it might cause
significant waste of RAM, and so it can't be used as a direct
replacement for malloc().<br>
</p>
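<p>To put numbers on that rounding, here is a minimal C++ sketch (not
GDAL code). It assumes the 64 kB granularity observed above as a
hard-coded constant, and the names kAllocGranularity,
RoundUpToGranularity and WastedFraction are made up for the
illustration. It only models the address range taken per allocation,
not how many pages end up committed.</p>

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 64 kB allocation granularity, as observed in the message.
// On Windows the actual value can be queried with GetSystemInfo()
// (SYSTEM_INFO::dwAllocationGranularity).
constexpr std::size_t kAllocGranularity = 64 * 1024;

// Size of the address range taken by one request of `size` bytes when
// every request is rounded up to a multiple of the granularity.
constexpr std::size_t RoundUpToGranularity(std::size_t size)
{
    return (size + kAllocGranularity - 1) / kAllocGranularity *
           kAllocGranularity;
}

// Fraction of that range which is not used by the block itself.
inline double WastedFraction(std::size_t size)
{
    assert(size > 0);
    const std::size_t reserved = RoundUpToGranularity(size);
    return static_cast<double>(reserved - size) /
           static_cast<double>(reserved);
}
```

<p>Under these assumptions, a 16385-byte block takes a full 64 kB
range, and a 65537-byte block takes 128 kB.</p>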
<p>My inclination would be to perhaps have an optional config option
like GDAL_BLOCK_CACHE_USE_PRIVATE_HEAP that could be set, and when
doing so it would use HeapCreate(0, 0, GDAL_CACHEMAX) to create a
heap only used by the block cache. Not ideal, since that would
reserve the whole GDAL_CACHEMAX up front (although a large enough
processing job will end up consuming it anyway), but it has the
advantage of not being extremely intrusive... and could easily be
ditched/replaced by something better in the future.<br>
</p>
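<p>For what it's worth, the private heap idea could look roughly like
the C++ sketch below. PrivateHeap is a hypothetical wrapper, not an
existing GDAL class, and GDAL_BLOCK_CACHE_USE_PRIVATE_HEAP is only the
option name floated above. The Windows branch uses HeapCreate(),
HeapAlloc(), HeapFree() and HeapDestroy(); the malloc()/free() fallback
on other platforms is only there so that the sketch compiles
everywhere.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper around a heap used only by the block cache.
class PrivateHeap
{
  public:
    // maxSize would be the value of GDAL_CACHEMAX in bytes.
    explicit PrivateHeap(std::size_t maxSize)
    {
        // A maximum size of 0 would request a growable heap instead.
        assert(maxSize > 0);
#ifdef _WIN32
        // Non-zero maximum size: fixed-size heap, reserved up front.
        m_heap = HeapCreate(0, 0, maxSize);
#endif
    }

    ~PrivateHeap()
    {
#ifdef _WIN32
        // Releases the whole heap in one call.
        if (m_heap)
            HeapDestroy(m_heap);
#endif
    }

    PrivateHeap(const PrivateHeap &) = delete;
    PrivateHeap &operator=(const PrivateHeap &) = delete;

    // Returns nullptr on failure, like malloc().
    void *Alloc(std::size_t size)
    {
#ifdef _WIN32
        return m_heap ? HeapAlloc(m_heap, 0, size) : nullptr;
#else
        return std::malloc(size);  // portable stand-in for the sketch
#endif
    }

    void Free(void *p)
    {
#ifdef _WIN32
        if (p)
            HeapFree(m_heap, 0, p);
#else
        std::free(p);
#endif
    }

#ifdef _WIN32
  private:
    HANDLE m_heap = nullptr;
#endif
};
```

<p>One caveat, as far as I understand Microsoft's HeapCreate()
documentation: a heap created with a non-zero maximum size cannot grow,
and it also caps the size of a single allocation (slightly under 512 kB
in a 32-bit process and 1024 kB in a 64-bit one), so bigger blocks
would need another path.</p>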
<p>Regarding tcmalloc, I've had to use it on Linux too, but only in
scenarios involving multithreading, where it helps reduce RAM
fragmentation: cf
<a class="moz-txt-link-freetext" href="https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading">https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading</a>
. I've just quickly tried to use it on Windows to test it on this
scenario, but didn't really manage to make it work. Even building
it was challenging. Actually I tried
<a class="moz-txt-link-freetext" href="https://github.com/gperftools/gperftools">https://github.com/gperftools/gperftools</a> and I had to build from
master since the latest tagged version doesn't build with CMake on
Windows. But then nothing happens when linking
tcmalloc_minimal.lib against my toy app. I probably missed
something.<br>
</p>
<p>Anyway, I don't really think we can force tcmalloc to be used in
GDAL, as a library, unless there is a way to have its
allocator used optionally at places that we control (i.e.
explicitly calling tc_malloc / tc_free) rather than replacing the default
malloc / free etc., which might be undesirable when GDAL is just a
component of a larger application.<br>
</p>
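<p>A rough sketch of what "used at places that we control" could mean:
a pair of function pointers that defaults to malloc()/free() and that an
application could point at tc_malloc()/tc_free() from gperftools.
BlockAllocator, SetBlockAllocator, BlockAlloc and BlockFree are
hypothetical names, nothing like this exists in GDAL today, and the
sketch itself does not link against tcmalloc.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical hook: the allocation functions used for cache blocks only.
struct BlockAllocator
{
    void *(*alloc)(std::size_t);
    void (*release)(void *);
};

static void *DefaultAlloc(std::size_t n) { return std::malloc(n); }
static void DefaultRelease(void *p) { std::free(p); }

// Default: the C runtime allocator, so nothing changes unless asked for.
static BlockAllocator g_blockAllocator = {DefaultAlloc, DefaultRelease};

// An application linking gperftools could pass {tc_malloc, tc_free} here.
// Must be called before the first BlockAlloc(): a block has to be released
// by the same allocator that produced it. Not thread-safe as written.
void SetBlockAllocator(BlockAllocator allocator)
{
    assert(allocator.alloc != nullptr && allocator.release != nullptr);
    g_blockAllocator = allocator;
}

// Only these two entry points go through the hook; the global
// malloc()/free() of the host application are left untouched.
void *BlockAlloc(std::size_t n) { return g_blockAllocator.alloc(n); }
void BlockFree(void *p) { g_blockAllocator.release(p); }
```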
<p>Disabling the block cache entirely (or setting it to a minimum
value) is only a workable option for uncompressed formats, or if
you use per-band blocks (INTERLEAVE=BAND in GTiff language) and
not one block for all bands (INTERLEAVE=PIXEL); otherwise you'll
pay for the decompression multiple times.<br>
</p>
<div class="moz-cite-prefix">On 21/03/2024 at 14:38, Meyer, Jesse R.
(GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:003F39B4-75EE-4AAB-B3E6-B67E493A3806@ndc.nasa.gov">
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Aptos",sans-serif">+1.
We use a variety of hand-rolled VirtualAlloc-based allocators
(a simple pointer bump for basic tasks, and a ‘buddy’
allocator for more elaborate needs), some of which try to be
smart about memory usage by de-committing regions. In our work,
we tend to disable the GDAL cache entirely and rely on the
file system’s file cache instead, which is a simplification
we can make but is surely untenable in general here.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Aptos",sans-serif"><o:p> </o:p></span></p>
<div
style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="font-family:"Calibri",sans-serif;color:black" lang="CA">From:
</span></b><span
style="font-family:"Calibri",sans-serif;color:black" lang="CA">gdal-dev
<a class="moz-txt-link-rfc2396E" href="mailto:gdal-dev-bounces@lists.osgeo.org"><gdal-dev-bounces@lists.osgeo.org></a> on behalf of Abel
Pau via gdal-dev <a class="moz-txt-link-rfc2396E" href="mailto:gdal-dev@lists.osgeo.org"><gdal-dev@lists.osgeo.org></a><br>
<b>Reply-To: </b>Abel Pau <a class="moz-txt-link-rfc2396E" href="mailto:a.pau@creaf.uab.cat"><a.pau@creaf.uab.cat></a><br>
<b>Date: </b>Thursday, March 21, 2024 at 4:51 AM<br>
<b>To: </b><a class="moz-txt-link-rfc2396E" href="mailto:gdal-dev@lists.osgeo.org">"gdal-dev@lists.osgeo.org"</a>
<a class="moz-txt-link-rfc2396E" href="mailto:gdal-dev@lists.osgeo.org"><gdal-dev@lists.osgeo.org></a><br>
<b>Subject: </b>[EXTERNAL] [BULK] Re: [gdal-dev]
Experience with slowness of free() on Windows with lots of
allocations?<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span
style="font-family:"Aptos",sans-serif" lang="CA"><o:p> </o:p></span></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-family:"Aptos",sans-serif" lang="CA"><br>
<br>
<o:p></o:p></span></p>
<div>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"
lang="ES">Hi Even,<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"
lang="ES"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"
lang="ES">you’re right. We also know that.
</span><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">When
programming the driver, I took it into consideration. Our
solution is not to rely on Windows to do a good job with
memory; we try to reuse as much memory as possible instead
of calling calloc/free freely.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">For
instance, in the driver, for each feature I have to read or
write the coordinates. I could allocate every time I need to,
so lots of times: allocate memory for reading, put the
coordinates on the feature, and then free it... many times over.
What do I do instead? When opening the layer I create some memory
blocks of 250 MB (due to the format itself) and I use that
memory to manage whatever I need. When closing, I free
it.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">While
doing that, I observed that sometimes I have to use GDAL
code that doesn’t take this into consideration (</span><span
style="font-size:9.5pt;font-family:Consolas;color:#6F008A"
lang="CA">CPLRecode()</span><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">
for instance). Perhaps it could be improved as well.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Thanks
for noticing that.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif"
lang="ES">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif"
lang="ES"> gdal-dev
<a class="moz-txt-link-rfc2396E" href="mailto:gdal-dev-bounces@lists.osgeo.org"><gdal-dev-bounces@lists.osgeo.org></a>
<b>On behalf of </b>Javier Jimenez Shaw via gdal-dev<br>
<b>Sent:</b> Thursday, 21 March 2024 8:27<br>
<b>To:</b> Even Rouault
<a class="moz-txt-link-rfc2396E" href="mailto:even.rouault@spatialys.com"><even.rouault@spatialys.com></a><br>
<b>CC:</b> gdal dev <a class="moz-txt-link-rfc2396E" href="mailto:gdal-dev@lists.osgeo.org"><gdal-dev@lists.osgeo.org></a><br>
<b>Subject:</b> Re: [gdal-dev] Experience with slowness of
free() on Windows with lots of allocations?<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="CA"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal"><span lang="CA">In my company we
confirmed that "Windows heap allocation mechanism
sucks."<o:p></o:p></span></p>
<div>
<p class="MsoNormal"><span lang="CA">Closing the
application after using the GTiff driver can take many
seconds due to memory deallocations.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="CA"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="CA">One workaround was to
use tcmalloc. I will ask my colleagues for more details
next week.<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><span lang="CA"><o:p> </o:p></span></p>
<div>
<div>
<p class="MsoNormal"><span lang="CA">On Thu, 21 Mar 2024,
01:55 Even Rouault via gdal-dev, <<a
href="mailto:gdal-dev@lists.osgeo.org"
moz-do-not-send="true" class="moz-txt-link-freetext">gdal-dev@lists.osgeo.org</a>>
wrote:<o:p></o:p></span></p>
</div>
<blockquote
style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<p class="MsoNormal"><span lang="CA">Hi,<br>
<br>
while investigating <br>
<a
href="https://github.com/OSGeo/gdal/issues/9510#issuecomment-2010950408"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://github.com/OSGeo/gdal/issues/9510#issuecomment-2010950408</a>,
I've
<br>
come to the conclusion that the Windows heap
allocation mechanism sucks. <br>
Basically if you allocate a lot of heap regions of
modest size with <br>
malloc()/new[], freeing them all
with the corresponding <br>
free()/delete[] is excruciatingly slow (like ~ 10
seconds for ~ 80,000 <br>
allocations). The slowness is clearly quadratic with
the number of <br>
allocations. You only start noticing it with ~ 30,000
allocations. And <br>
interestingly, another condition for that slowness is
that each <br>
individual allocation must be strictly greater than
4096 * 4 bytes. At <br>
exactly that value, perf is acceptable, but add one
extra byte, and it <br>
suddenly drops. I suspect that there must be a
threshold from which <br>
malloc() starts using VirtualAlloc() instead of the
heap, which must <br>
involve slow system calls, instead of a user-land
allocation mechanism.<br>
<br>
Has anyone already hit that and found solutions? The
only potential idea <br>
I found until now would be to use a private heap with
HeapCreate() with <br>
a fixed maximum size, which is a bit problematic to
adopt by default, <br>
basically that would mean that the size of
GDAL_CACHEMAX would be <br>
consumed as soon as one uses the block cache.<br>
<br>
Even<br>
<br>
-- <br>
<a href="http://www.spatialys.com/" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">http://www.spatialys.com</a><br>
My software is free, but my time generally not.<br>
<br>
_______________________________________________<br>
gdal-dev mailing list<br>
<a href="mailto:gdal-dev@lists.osgeo.org"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">gdal-dev@lists.osgeo.org</a><br>
<a
href="https://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://lists.osgeo.org/mailman/listinfo/gdal-dev</a><o:p></o:p></span></p>
</blockquote>
</div>
</div>
</div>
<br>
</blockquote>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-freetext" href="http://www.spatialys.com">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
</body>
</html>