[gdal-dev] Experience with slowness of free() on Windows with lots of allocations?

Uhrig, Stefan stefan.uhrig at sap.com
Thu Mar 21 06:17:14 PDT 2024


I was curious and gave it a try. I also saw the bad performance on deallocation, but surprisingly, using a std::vector for the outer list (instead of new[]/delete[]) speeds things up considerably.

I could still see a peak memory usage of 1.8 GiB, so it does not look as if the compiler optimized anything out.

#include <windows.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <vector>

int SINGLE_ALLOC_SIZE = 21200;
int NUMBER_OF_ALLOCS = 21200 * 4;

class CMyClass
{
public:
    CMyClass()
    {
        lpData = new char[SINGLE_ALLOC_SIZE];
        assert(lpData);
    };

    ~CMyClass()
    {
        delete[] lpData;
    };

public:
    char* lpData;
};

int main()
{
    do
    {
        printf("start\n");
        {
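            // std::vector variant: the vector's destructor runs at the
            // closing brace below and frees all NUMBER_OF_ALLOCS buffers.
            std::vector<CMyClass> lpList(NUMBER_OF_ALLOCS);
            // Raw-array variant, for comparison:
            //CMyClass* lpList = new CMyClass[NUMBER_OF_ALLOCS];
            printf("after alloc. starting freeing\n");
            //delete[] lpList;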
        }
        printf("end\n");
    } while (1);
    return 0;
}




From: gdal-dev <gdal-dev-bounces at lists.osgeo.org> On Behalf Of Abel Pau via gdal-dev
Sent: Thursday, March 21, 2024 9:52 AM
To: gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] Experience with slowness of free() on Windows with lots of allocations?

Hi Even,

You’re right. We know that too, and I took it into consideration when programming the driver. Our solution is not to rely on Windows to do a good job with memory: we try to reuse as much memory as possible instead of calling calloc/free freely.

For instance, in the driver I have to read or write the coordinates of each feature. I could allocate every time I need to: create memory for reading, put the coordinates on the feature, then free it, and repeat that many times. What do I do instead? When opening the layer I create some memory blocks of 250 MB (due to the format itself) and use that memory for whatever I need. When closing the layer, I free it.
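
The "allocate once, reuse, free once" idea described above can be sketched roughly as follows. This is only an illustration and not the driver's actual code: the class name ScratchArena and its methods are hypothetical, and a real driver would size and manage its blocks according to the format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical helper (not part of GDAL or of the driver discussed here).
// One large block is obtained up front and handed out piecewise, so the
// system allocator sees a single malloc()/free() pair.
class ScratchArena
{
public:
    explicit ScratchArena(std::size_t capacity)
        : m_base(nullptr), m_capacity(capacity), m_used(0)
    {
        assert(capacity > 0);
        m_base = static_cast<char*>(std::malloc(capacity));
        if (!m_base)
            throw std::bad_alloc();
    }

    // The only free() call, e.g. when the layer is closed.
    ~ScratchArena() { std::free(m_base); }

    ScratchArena(const ScratchArena&) = delete;
    ScratchArena& operator=(const ScratchArena&) = delete;

    // Hands out `size` bytes from the block; returns nullptr when the
    // arena is exhausted.
    void* allocate(std::size_t size)
    {
        const std::size_t align = alignof(std::max_align_t);
        // Round the current offset up to the next aligned position.
        const std::size_t start = (m_used + align - 1) / align * align;
        if (start > m_capacity || size > m_capacity - start)
            return nullptr;
        m_used = start + size;
        return m_base + start;
    }

    // Forgets all allocations at once, e.g. before the next feature.
    void reset() { m_used = 0; }

    std::size_t used() const { return m_used; }

private:
    char* m_base;
    std::size_t m_capacity;
    std::size_t m_used;
};
```

A caller would create one ScratchArena when opening the layer, call allocate() for each coordinate buffer, call reset() between features, and let the destructor release the block on close.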

While doing that I observed that sometimes I have to use GDAL code that doesn’t take this into consideration (CPLRecode(), for instance). Perhaps it could be improved as well.

Thanks for noticing that.

From: gdal-dev <gdal-dev-bounces at lists.osgeo.org> On Behalf Of Javier Jimenez Shaw via gdal-dev
Sent: Thursday, March 21, 2024 8:27 AM
To: Even Rouault <even.rouault at spatialys.com>
Cc: gdal dev <gdal-dev at lists.osgeo.org>
Subject: Re: [gdal-dev] Experience with slowness of free() on Windows with lots of allocations?

In my company we confirmed that "Windows heap allocation mechanism sucks."
Closing the application after using the GTiff driver can take many seconds due to memory deallocations.

One workaround was to use tcmalloc. I will ask my colleagues for more details next week.

On Thu, 21 Mar 2024, 01:55 Even Rouault via gdal-dev, <gdal-dev at lists.osgeo.org> wrote:
Hi,

while investigating
https://github.com/OSGeo/gdal/issues/9510#issuecomment-2010950408, I've
come to the conclusion that the Windows heap allocation mechanism sucks.
Basically if you allocate a lot of heap regions of modest size with
malloc()/new[], the time spent when freeing them all with corresponding
free()/delete[] is excruciatingly slow (like ~ 10 seconds for ~ 80,000
allocations). The slowness is clearly quadratic with the number of
allocations. You only start noticing it with ~ 30,000 allocations. And
interestingly, another condition for that slowness is that each
individual allocation must be strictly greater than 4096 * 4 bytes. At
exactly that value, perf is acceptable, but add one extra byte, and it
suddenly drops. I suspect that there must be a threshold from which
malloc() starts using VirtualAlloc() instead of the heap, which must
involve slow system calls, instead of a user-land allocation mechanism.
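
A rough way to check the behaviour described above is to time only the
free() loop for a given number and size of allocations. The sketch below
is an assumption about how one might measure it, not the code used in the
linked issue; the function name timeFreeSeconds is made up for this
example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Allocates `count` blocks of `size` bytes with malloc(), then returns the
// number of seconds spent in the free() loop alone.
// Returns -1.0 if an allocation fails.
double timeFreeSeconds(std::size_t count, std::size_t size)
{
    assert(size > 0);
    std::vector<void*> blocks(count, nullptr);
    for (std::size_t i = 0; i < count; ++i)
    {
        blocks[i] = std::malloc(size);
        if (!blocks[i])
        {
            // Out of memory: release what was obtained so far and give up.
            for (std::size_t j = 0; j < i; ++j)
                std::free(blocks[j]);
            return -1.0;
        }
        // Write one byte so the block is really used.
        static_cast<char*>(blocks[i])[0] = 1;
    }

    const auto start = std::chrono::steady_clock::now();
    for (void* p : blocks)
        std::free(p);
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double>(stop - start).count();
}
```

Comparing timeFreeSeconds(80000, 4096 * 4) with
timeFreeSeconds(80000, 4096 * 4 + 1) would show the size threshold, and
comparing 40000 against 80000 allocations at the larger size would show
whether the cost grows quadratically. Actual timings depend on the machine
and the C runtime, so none are given here.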

Has anyone already hit that and found solutions? The only potential idea
I found until now would be to use a private heap with HeapCreate() with
a fixed maximum size, which is a bit problematic to adopt by default,
basically that would mean that the size of GDAL_CACHEMAX would be
consumed as soon as one uses the block cache.
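
For illustration, a private heap with a fixed maximum size could be
wrapped roughly like this. It is a minimal sketch, not a proposal for
GDAL's block cache: the class name PrivateHeap is hypothetical, only the
Win32 calls HeapCreate(), HeapAlloc(), HeapFree() and HeapDestroy() are
real, and the non-Windows branch exists solely so that the example
compiles elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#else
#include <cstdlib>
#endif

// Hypothetical wrapper around a Win32 private heap of fixed maximum size.
class PrivateHeap
{
public:
    explicit PrivateHeap(std::size_t maxBytes)
    {
        assert(maxBytes > 0);
#ifdef _WIN32
        // Default options, minimal initial commit, fixed maximum size.
        m_heap = HeapCreate(0, 0, maxBytes);
#else
        (void)maxBytes; // fallback below simply uses malloc()/free()
#endif
    }

    ~PrivateHeap()
    {
#ifdef _WIN32
        // Destroys the heap and everything still allocated from it.
        if (m_heap)
            HeapDestroy(m_heap);
#endif
    }

    PrivateHeap(const PrivateHeap&) = delete;
    PrivateHeap& operator=(const PrivateHeap&) = delete;

    // Returns nullptr if the heap could not be created or is full.
    void* allocate(std::size_t size)
    {
#ifdef _WIN32
        return m_heap ? HeapAlloc(m_heap, 0, size) : nullptr;
#else
        return std::malloc(size);
#endif
    }

    void release(void* p)
    {
#ifdef _WIN32
        if (p && m_heap)
            HeapFree(m_heap, 0, p);
#else
        std::free(p);
#endif
    }

private:
#ifdef _WIN32
    HANDLE m_heap = nullptr;
#endif
};
```

Whether this avoids the slow path would have to be measured. Note also
that the HeapCreate() documentation limits the size of a single block in a
fixed-size heap, which may matter for large cache blocks.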

Even

--
http://www.spatialys.com
My software is free, but my time generally not.
