[gdal-dev] [EXTERNAL] [BULK] Re: Experience with slowness of free() on Windows with lots of allocations?
Javier Jimenez Shaw
j1 at jimenezshaw.com
Fri Mar 29 03:37:45 PDT 2024
For tcmalloc, do you need master? This recent release seems to have CMake support:
https://github.com/gperftools/gperftools/releases/tag/gperftools-2.15
Of course, I do not mean to force its usage. But it could be a
suggestion in case we do not find anything better and a user has problems,
or a way to inspire later research.
For us it is definitely helping.
Cheers,
Javier
On Thu, 21 Mar 2024 at 14:59, Even Rouault via gdal-dev <
gdal-dev at lists.osgeo.org> wrote:
> I've played with VirtualAlloc(NULL, SINGLE_ALLOC_SIZE, MEM_COMMIT |
> MEM_RESERVE, PAGE_READWRITE), and it does avoid the performance issue.
> However I see that VirtualAlloc() allocates by chunks of 64 kB, so depending
> on the size of a block, it might cause significant waste of RAM, so that
> can't be used as a direct replacement of malloc().
>
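
To put a number on the 64 kB remark, here is a small arithmetic sketch. It takes the 64 kB chunk size from the message at face value (it is not queried from the system), and the names chunkedBytes and wastedFraction are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// 64 kB chunk size as quoted in the message above; treated as an assumption
// here rather than queried from the operating system.
const std::size_t kChunkBytes = 64 * 1024;

// Bytes taken when a request is rounded up to whole 64 kB chunks.
inline std::size_t chunkedBytes(std::size_t request) {
    return (request + kChunkBytes - 1) / kChunkBytes * kChunkBytes;
}

// Fraction of those bytes that the request itself does not use.
inline double wastedFraction(std::size_t request) {
    assert(request > 0);
    return 1.0 - static_cast<double>(request) /
                     static_cast<double>(chunkedBytes(request));
}
```

For a block of 4096 * 4 + 1 bytes, chunkedBytes() gives 65536, so roughly three quarters of the chunk would go unused under this assumption.
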
> My inclination would be to perhaps have an optional config option like
> GDAL_BLOCK_CACHE_USE_PRIVATE_HEAP that could be set, and when doing so it
> would use HeapCreate(0, 0, GDAL_CACHEMAX) to create a heap only used by the
> block cache. Not ideal, since that would reserve the whole GDAL_CACHEMAX
> (but for a large enough processing, you'll end up consuming it), but it has
> the advantage of not being extremely intrusive either... and could be
> easily ditched/replaced by something better in the future.
>
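
A minimal sketch of what such a private heap could look like, assuming a thin wrapper around HeapCreate() / HeapAlloc() / HeapFree() / HeapDestroy(). The class name PrivateHeap is hypothetical and this is not GDAL code. Off Windows it simply falls back to malloc()/free() so that it compiles everywhere. Whether this really avoids the slow free() path is what the thread is trying to find out, not something the sketch demonstrates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper, for illustration only.
class PrivateHeap {
public:
    // maxBytes plays the role of GDAL_CACHEMAX in the message above.
    explicit PrivateHeap(std::size_t maxBytes) {
        assert(maxBytes > 0);  // a maximum of 0 would mean a growable heap
#ifdef _WIN32
        heap_ = HeapCreate(0, 0, maxBytes);  // fixed maximum size
#else
        (void)maxBytes;  // fallback has no dedicated heap
#endif
    }
    ~PrivateHeap() {
#ifdef _WIN32
        if (heap_) HeapDestroy(heap_);  // releases the whole heap in one call
#endif
    }
    PrivateHeap(const PrivateHeap&) = delete;
    PrivateHeap& operator=(const PrivateHeap&) = delete;

    // Returns nullptr on failure.
    void* alloc(std::size_t n) {
#ifdef _WIN32
        return heap_ ? HeapAlloc(heap_, 0, n) : nullptr;
#else
        return std::malloc(n);
#endif
    }
    void release(void* p) {
#ifdef _WIN32
        if (heap_ && p) HeapFree(heap_, 0, p);
#else
        std::free(p);  // in the fallback, blocks must be released explicitly
#endif
    }

private:
#ifdef _WIN32
    HANDLE heap_ = nullptr;
#endif
};
```

The block cache would then call alloc() and release() on one such object instead of malloc() and free().
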
> Regarding tcmalloc, I've had to use it on Linux too, but only in scenarios
> involving multithreading, where it helps reduce RAM fragmentation (cf
> https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading
> ). I've just quickly tried to use it on Windows to test it on this scenario,
> but didn't really manage to make it work. Even building it was challenging.
> I tried https://github.com/gperftools/gperftools and had to
> build from master since the latest tagged version doesn't build with CMake
> on Windows. But then nothing happens when linking tcmalloc_minimal.lib
> against my toy app. I probably missed something.
>
> Anyway, I don't really think we can force tcmalloc to be used in GDAL, as a
> library, unless there is a way to have its allocator optionally
> used at places that we control (i.e. explicitly calling tc_malloc / tc_free)
> without replacing the default malloc / free etc., which might be undesirable
> when GDAL is just a component of a larger application.
>
> Disabling the block cache entirely (or setting it to a minimum value) is
> only a workable option for uncompressed formats, or if you use per-band
> blocks (INTERLEAVE=BAND in GTiff language) and not one block for all bands
> (INTERLEAVE=PIXEL); otherwise you'll pay for the decompression multiple times.
> On 21/03/2024 at 14:38, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND
> APPLICATIONS INC] via gdal-dev wrote:
>
> +1. We use a variety of hand-rolled VirtualAlloc based (for basic tasks,
> a simple pointer bump, and for more elaborate needs, a ‘buddy’) allocators,
> some of which try to be smart about memory usage via de-committing
> regions. In our work, we tend to disable the GDAL cache entirely and rely
> on the file system’s file cache instead, which is a simplification we can
> make but is surely untenable in general here.
>
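
For readers unfamiliar with the "simple pointer bump" approach mentioned above, here is a minimal sketch. It is not the code described in the message: the backing memory is a std::vector rather than a VirtualAlloc region so that it runs anywhere, there is no de-committing, and the name BumpArena is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pointer-bump arena: allocation only advances an offset, and
// everything is released at once by reset() or by destroying the arena.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns nullptr when the arena is exhausted. align must be a power of two.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::uintptr_t base =
            reinterpret_cast<std::uintptr_t>(buffer_.data());
        // Round the current position up to the requested alignment.
        const std::uintptr_t aligned =
            (base + offset_ + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t start = static_cast<std::size_t>(aligned - base);
        if (start > buffer_.size() || size > buffer_.size() - start) {
            return nullptr;
        }
        offset_ = start + size;
        return buffer_.data() + start;
    }

    // Forgets all allocations without returning memory to the system.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

There is no per-block free at all, which is the point: the cost of releasing many blocks collapses into a single reset() or destructor call.
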
>
>
> *From: *gdal-dev <gdal-dev-bounces at lists.osgeo.org> on behalf of
> Abel Pau via gdal-dev <gdal-dev at lists.osgeo.org>
> *Reply-To: *Abel Pau <a.pau at creaf.uab.cat>
> *Date: *Thursday, March 21, 2024 at 4:51 AM
> *To: *"gdal-dev at lists.osgeo.org" <gdal-dev at lists.osgeo.org>
> *Subject: *[EXTERNAL] [BULK] Re: [gdal-dev] Experience with slowness of
> free() on Windows with lots of allocations?
>
>
>
> Hi Even,
>
>
>
> you’re right. We also know that. When programming the driver I took it into
> consideration. Our solution is not to rely on Windows to do a good job with
> memory, and we try to reuse as much memory as possible instead of using
> calloc/free freely.
>
>
>
> For instance, in the driver, for each feature I have to read or write the
> coordinates. I could allocate every time I need to, which is a lot of times:
> allocate memory for reading, put the coordinates on the feature, and then
> free... so many times. What do I do instead? When opening the layer I create
> some memory blocks of 250 MB (due to the format itself) and I use that memory
> to manage whatever I need. And when closing, I free it.
>
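
The reuse pattern described above could be sketched roughly as follows. This is only an illustration under assumptions: the names Point and LayerScratch are invented, the buffer is sized in points rather than as fixed 250 MB blocks, and it is not the driver's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical coordinate type, for illustration only.
struct Point {
    double x;
    double y;
};

// Scratch buffer created when the layer is opened and destroyed when it is
// closed, so that reading each feature does not allocate and free memory.
class LayerScratch {
public:
    explicit LayerScratch(std::size_t initialPoints) {
        assert(initialPoints > 0);
        coords_.reserve(initialPoints);  // one allocation up front
    }

    // Returns room for n points. No allocation happens as long as n fits in
    // the capacity reserved so far; larger requests grow the buffer.
    Point* acquire(std::size_t n) {
        coords_.resize(n);
        return coords_.data();
    }

    std::size_t capacity() const { return coords_.capacity(); }

private:
    std::vector<Point> coords_;
};
```

Each feature read calls acquire() with its own point count and overwrites the same storage, and the memory is released once when the LayerScratch object goes away.
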
>
>
> While doing that I observed that sometimes I have to use GDAL code that
> doesn’t take this into consideration (CPLRecode() for instance). Perhaps it
> could be improved as well.
>
>
>
> Thanks for noticing that.
>
>
>
> *From:* gdal-dev <gdal-dev-bounces at lists.osgeo.org> *On behalf of *Javier
> Jimenez Shaw via gdal-dev
> *Sent:* Thursday, 21 March 2024 8:27
> *To:* Even Rouault <even.rouault at spatialys.com>
> *CC:* gdal dev <gdal-dev at lists.osgeo.org>
> *Subject:* Re: [gdal-dev] Experience with slowness of free() on Windows
> with lots of allocations?
>
>
>
> In my company we confirmed that "Windows heap allocation mechanism sucks."
>
> Closing the application after using the GTiff driver can take many seconds
> due to memory deallocations.
>
>
>
> One workaround was to use tcmalloc. I will ask my colleagues for more
> details next week.
>
>
>
> On Thu, 21 Mar 2024, 01:55 Even Rouault via gdal-dev, <
> gdal-dev at lists.osgeo.org> wrote:
>
> Hi,
>
> while investigating
> https://github.com/OSGeo/gdal/issues/9510#issuecomment-2010950408, I've
> come to the conclusion that the Windows heap allocation mechanism sucks.
> Basically if you allocate a lot of heap regions of modest size with
> malloc()/new[], freeing them all with the corresponding
> free()/delete[] is excruciatingly slow (like ~ 10 seconds for ~ 80,000
> allocations). The slowness is clearly quadratic in the number of
> allocations. You only start noticing it with ~ 30,000 allocations. And
> interestingly, another condition for that slowness is that each
> individual allocation must be strictly greater than 4096 * 4 bytes. At
> exactly that value, perf is acceptable, but add one extra byte, and it
> suddenly drops. I suspect that there must be a threshold from which
> malloc() starts using VirtualAlloc() instead of the heap, which must
> involve slow system calls, instead of a user-land allocation mechanism.
>
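
A possible way to reproduce the measurement described above, as a hedged sketch: the function name timeFreeSeconds is made up, and the counts and sizes to try are the ones quoted in the message. It only times the free() loop; it says nothing about the cause.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Allocates `count` blocks of `blockSize` bytes with malloc(), then returns
// the number of seconds spent in the free() loop alone.
double timeFreeSeconds(std::size_t count, std::size_t blockSize) {
    assert(blockSize > 0);
    std::vector<void*> blocks;
    blocks.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        void* p = std::malloc(blockSize);
        if (p == nullptr) {
            break;  // out of memory: time whatever was allocated so far
        }
        static_cast<char*>(p)[0] = 1;  // touch the block so it is really used
        blocks.push_back(p);
    }
    const auto start = std::chrono::steady_clock::now();
    for (void* p : blocks) {
        std::free(p);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Comparing timeFreeSeconds(80000, 4096 * 4) with timeFreeSeconds(80000, 4096 * 4 + 1) on Windows would be the experiment suggested by the description. Note that the second call needs about 1.3 GB of memory.
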
> Has anyone already hit that and found solutions? The only potential idea
> I have found so far would be to use a private heap with HeapCreate() with
> a fixed maximum size, which is a bit problematic to adopt by default:
> basically that would mean that the size of GDAL_CACHEMAX would be
> consumed as soon as one uses the block cache.
>
> Even
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev