[gdal-dev] Call for discussion on "RFC 45: GDAL datasets and raster bands as virtual memory mappings"
Trent Piepho
tpiepho at gmail.com
Wed Dec 18 12:09:48 PST 2013
On Wed, Dec 18, 2013 at 11:46 AM, Even Rouault
<even.rouault at mines-paris.org> wrote:
> On Wednesday 18 December 2013 19:53:37, Frank Warmerdam wrote:
>>
>> I imagined an available virtual method on the band which could be
>> implemented - primarily by the RawBand class to try and mmap() the data and
>> return the layout. But when that fails, or is unavailable it could use
>> your existing methodology with a layout that seems well tuned to the
>> underlying data organization.
>
> Yes, that should be doable, but with the limitation I raised about the memory
> management of file-based mmap(): if you mmap() a file larger than RAM and read
> it entirely, without an explicit madvise() to discard regions no longer needed,
> it will fill RAM and cause disk swapping. I should retest to confirm. Perhaps
> there is some OS-level tuning to avoid that?

For Linux, if you mmap a file and do not write to it, the pages will
be clean. This means that under memory pressure those pages can be
dropped without paging out to swap, because they are already backed on
disk by the mmapped file. Only dirty anonymous mapped pages (anon mmap,
malloc() memory from mmap() or brk(), stack, etc.) would need to be
written to swap.

Of course, if you touch a large amount of memory and know you'll never
use it again, you can help the OS decide which pages to free by using
madvise().
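
As a rough illustration of the madvise() hint, here is a minimal sketch in Python, whose mmap module wraps the same system calls. It assumes Python 3.8 or later on a platform that exposes MADV_DONTNEED; the function name checksum_with_madvise and the chunk size are made up for the example. It walks a read-only mapping chunk by chunk and tells the kernel each finished chunk is no longer needed.

```python
import mmap
import os

# Chunk size is an arbitrary choice for this sketch; it is kept a multiple
# of the allocation granularity so every chunk starts on a page boundary.
CHUNK = 16 * mmap.ALLOCATIONGRANULARITY


def checksum_with_madvise(path, chunk=CHUNK):
    """Sum all bytes of a file through a read-only mapping, hinting after
    each chunk that its pages will not be needed again."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be mapped
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # madvise() and MADV_DONTNEED are not available everywhere.
            can_advise = hasattr(m, "madvise") and hasattr(mmap, "MADV_DONTNEED")
            for start in range(0, size, chunk):
                length = min(chunk, size - start)
                total += sum(m[start:start + length])
                if can_advise:
                    # The pages are clean and file-backed, so the kernel can
                    # simply drop them; touching them again re-reads the file.
                    m.madvise(mmap.MADV_DONTNEED, start, length)
    return total
```

Without the hint the result is the same; the kernel just has to guess which clean pages to drop first when memory gets tight.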

One thing to consider is that a 32-bit OS can only memory map about
2-3 GB at once, even though it has no trouble using files much
larger than that. If you want to access a large file with
mmap(), you might need to use some kind of sliding window.
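
One way a sliding window could look, again as a Python sketch rather than anything GDAL does: map one fixed-size window at a time and unmap it before moving on, so the address space in use never exceeds one window. The name sliding_windows and the default window size are assumptions for the example; the offset passed to mmap must be a multiple of mmap.ALLOCATIONGRANULARITY.

```python
import mmap
import os


def sliding_windows(path, window=64 * mmap.ALLOCATIONGRANULARITY):
    """Yield (offset, mapping) pairs covering the file one window at a time.

    Each mapping is closed as soon as the caller asks for the next one,
    so the caller must copy out anything it wants to keep.
    """
    if window <= 0 or window % mmap.ALLOCATIONGRANULARITY:
        raise ValueError("window must be a positive multiple of "
                         "mmap.ALLOCATIONGRANULARITY")
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        for offset in range(0, size, window):
            # The last window may be shorter than the others.
            length = min(window, size - offset)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as m:
                yield offset, m
```

A caller would loop with `for offset, m in sliding_windows(path): ...` and index into `m` relative to `offset`.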

I also think that mmapping many gigabytes has a not insignificant cost
in setting up the page tables for the mapping. Even on a
64-bit OS, mmapping a 20 GB file just to access some small portion of
it could be inefficient.
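
If only a small portion is wanted, one option is to map just the pages that cover it. The sketch below is a hypothetical helper (read_range is not a GDAL or library function) that rounds the start down to the allocation granularity, maps only the requested span, and returns the requested bytes.

```python
import mmap


def read_range(path, start, length):
    """Return bytes [start, start + length) of a file by mapping only the
    pages that cover that range.

    The range must lie inside the file; otherwise mmap raises an error.
    """
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = (start // gran) * gran   # mapping offsets must be aligned
    delta = start - aligned            # where the data sits in the mapping
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + length,
                       access=mmap.ACCESS_READ, offset=aligned) as m:
            return m[delta:delta + length]
```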