[gdal-dev] Call for discussion on "RFC 45: GDAL datasets and raster bands as virtual memory mappings"
    Even Rouault 
    even.rouault at mines-paris.org
       
    Wed Dec 18 14:02:00 PST 2013
    
    
  
On Wednesday 18 December 2013 21:09:48, Trent Piepho wrote:
> On Wed, Dec 18, 2013 at 11:46 AM, Even Rouault
> <even.rouault at mines-paris.org> wrote:
> > On Wednesday 18 December 2013 19:53:37, Frank Warmerdam wrote:
> >> I imagined an available virtual method on the band which could be
> >> implemented - primarily by the RawBand class to try and mmap() the data
> >> and return the layout.  But when that fails, or is unavailable it could
> >> use your existing methodology with a layout that seems well tuned to
> >> the underlying data organization.
> > 
> > Yes, that should be doable, but with the limitation I raised about the
> > memory management of file-based mmap(): if you mmap() a file larger
> > than RAM and read it entirely, without an explicit madvise() to discard
> > regions no longer needed, it will fill RAM and cause disk swapping. I
> > should retest to confirm. Perhaps there is some OS-level tuning to
> > avoid that?
> 
> For Linux, if you mmap a file and do not write to it, the pages will
> be clean.  This means that under memory pressure those pages can be
> dropped without paging out to swap.  They are already backed on disk
> in the mmaped file.  Only dirty anonymous mapped pages (anon mmap,
> malloc() memory from mmap() or brk(), stack, etc.) would need to be
> written to swap.
Yes, that's the theory. But in practice, on my system (kernel
2.6.32-46-generic 64 bit, Ubuntu 10.04, 4 GB RAM), the system becomes rather
unresponsive as soon as the process has read an amount of the file equal to
the RAM that was free when it started. The 'top' utility shows the process
consuming ~2.7 GB, which must be the free RAM.
Here's the test program I used (test_mmap.c):
#define _LARGEFILE64_SOURCE 1
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char* argv[])
{
    int fd;
    struct stat64 buf;
    char* ptr;
    long long i;
    int res = 0;
    int bDontNeed = 0;
    assert( argc == 2 || argc == 3 );
    if( argc == 3 && strcmp(argv[2], "-dontneed") == 0 )
        bDontNeed = 1;
    fd = open(argv[1], O_RDONLY);
    assert(fd >= 0);
    assert(stat64(argv[1], &buf) == 0);
    ptr = (char*) mmap(NULL, buf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    /* mmap() reports failure with MAP_FAILED, not NULL */
    assert(ptr != MAP_FAILED);
    for(i = 0; i< buf.st_size; i+= 4096)
    {
        /* Discard the pages every 500 MB read */
        if( bDontNeed && ((i % (1024 * 1024 * 500)) == 0) )
            madvise(ptr, buf.st_size, MADV_DONTNEED);
        res += ptr[i];
    }
    close(fd);
    return res;
}
$ gcc -Wall -g test_mmap.c -o test_mmap
$ ./test_mmap eudem_dem_4258_europe.tif
(the file is 20 GB)
--> the system becomes unresponsive
$ ./test_mmap eudem_dem_4258_europe.tif -dontneed
--> the system remains usable. Every 500 MB read, a madvise() call
explicitly discards all pages of the mapping. That's just for testing; it
couldn't be used in practice.
==> Can anyone reproduce similar behaviour?
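
As a rough illustration of how the madvise() idea might be made less blunt than in the test program above, here is a minimal sketch that discards only the chunk that has just been read, instead of the whole mapping. The function name sum_pages_with_discard() and its parameters are hypothetical and not part of GDAL or of the test above; it assumes Linux with glibc and a chunk size that is a multiple of the page size.

```c
#define _DEFAULT_SOURCE       /* exposes madvise() on glibc */
#define _FILE_OFFSET_BITS 64  /* 64-bit off_t, so large files can be stat'ed */
#include <sys/mman.h>
#include <sys/stat.h>
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: reads one byte per page of a read-only file mapping
 * and releases each chunk of `chunk` bytes once it has been read.
 * Returns 0 on success, -1 on error; the sum of the bytes goes to *sum. */
static int sum_pages_with_discard(const char *path, size_t chunk, long *sum)
{
    struct stat st;
    long page = sysconf(_SC_PAGESIZE);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    /* madvise() needs page-aligned addresses, so chunks must be page multiples */
    assert(chunk > 0 && chunk % (size_t)page == 0);

    size_t size = (size_t)st.st_size;
    char *ptr = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (ptr == MAP_FAILED)
        return -1;

    *sum = 0;
    for (size_t start = 0; start < size; start += chunk) {
        size_t len = (size - start < chunk) ? size - start : chunk;
        for (size_t i = 0; i < len; i += (size_t)page)
            *sum += ptr[start + i];
        /* Discard only the chunk just read, not the whole mapping */
        madvise(ptr + start, len, MADV_DONTNEED);
    }
    munmap(ptr, size);
    return 0;
}
```

This only fits a strictly sequential reader that knows it will not come back to earlier chunks; a generic virtual memory mapping exposed to arbitrary callers has no such knowledge.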
> 
> Of course if you touch a large amount of memory and know you'll never
> use it again, you can help the OS out when it comes to deciding which
> pages to free by using madvise.
> 
> One thing to consider is that a 32-bit OS can only memory map about
> 2-3 GB at once, even though there is no trouble using files much
> larger than this size.  If you want to access a large file with
> mmap(), you might need to use some kind of sliding window.
Yes, I'm well aware of that. But 32-bit systems are increasingly legacy, so
we shouldn't worry too much about them.
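
For reference, the sliding-window idea mentioned in the quote could look roughly like the sketch below: map only the byte range currently needed, at a page-aligned file offset, and unmap it before moving on. The file_window type and the window_map() / window_unmap() names are hypothetical, made up for this illustration, and the caller is assumed to keep the requested range inside the file.

```c
#define _FILE_OFFSET_BITS 64  /* 64-bit off_t even on 32-bit systems */
#include <sys/mman.h>
#include <sys/types.h>
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical descriptor for one mapped window of a file */
typedef struct {
    void  *base;    /* page-aligned start of the mapping */
    size_t maplen;  /* length of the mapping in bytes */
    char  *data;    /* pointer to the byte at the requested offset */
} file_window;

/* Map `len` bytes of `fd` starting at `offset` (which need not be aligned).
 * Returns 0 on success, -1 if mmap() fails. */
static int window_map(int fd, off_t offset, size_t len, file_window *w)
{
    off_t page = (off_t)sysconf(_SC_PAGESIZE);
    off_t aligned = offset - (offset % page);  /* mmap() offsets must be page multiples */
    size_t delta = (size_t)(offset - aligned);
    void *p;

    assert(len > 0 && offset >= 0);
    p = mmap(NULL, len + delta, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (p == MAP_FAILED)
        return -1;
    w->base = p;
    w->maplen = len + delta;
    w->data = (char *)p + delta;
    return 0;
}

/* Release the window so its address space can be reused for the next one */
static void window_unmap(file_window *w)
{
    munmap(w->base, w->maplen);
    w->base = NULL;
    w->data = NULL;
    w->maplen = 0;
}
```

A caller would walk a large file by alternating window_map() and window_unmap() over successive offsets, so that only one window's worth of address space is in use at any time.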
> 
> I think also, mmaping many gigabytes has a certain cost in setting up
> the page tables for the mapping that's not insignificant.  Even on a
> 64-bit os, mmaping a 20 GB file just to access some small portion of
> it could be inefficient.
Yes, I agree there are hidden costs in the memory management layers of the OS.
"Huge TLB pages" (2 MB) on AMD64 systems could potentially reduce that cost.
I started experimenting with them a bit, but either my kernel was not recent
enough to benefit from all the functionality, or they didn't seem really
practical to use.
-- 
Geospatial professional services
http://even.rouault.free.fr/services.html
    
    