[gdal-dev] vsipreload: enabling VSI Virtual File API for regular I/O

Even Rouault even.rouault at mines-paris.org
Tue May 28 15:20:31 PDT 2013


> Now that I have your attention, I brought up with Frank at FOSS4GNA that
> there could sometimes be a need for both MEM drivers to spool off to disk
> through some sort of out-of-core mmap'd allocation.

> Does this currently exist in VSI,

No

> and if not, do you see any use for such a thing?

I didn't yet, but apparently you do :-)

> My thought
> was there might be scenarios where someone working with multi-gb (or
> worse) raster data sources where you'd like to control the paging (ie, you
> have SSD, or you want to create intermediates for some weird processing
> chain).

I'm not sure I understand the mention of paging in your sentence. When 
referring to mmap(), paging makes me think of the page size you get with 
sysconf(_SC_PAGE_SIZE), but that's probably not what you meant.

> Maybe not all that useful in exchange for the added complexity...

I'm not entirely clear on the advantages of using mmap() over a backing file 
rather than just doing a very large malloc(). In both cases I guess you would 
get swap thrashing when you use more virtual memory than there is physical 
memory available (although I see that the madvise() call could be used to 
give a hint so that pages don't stay in RAM for too long). Or perhaps you're 
thinking of an mmap() on a limited portion of the backing file, with the 
application changing the mapping according to where the user requests data? 
But getting good performance with mmap'ing can be tricky ( 
http://stackoverflow.com/questions/6055861/why-is-sequentially-reading-a-large-file-row-by-row-with-mmap-and-madvise-sequen 
), so perhaps the backing to disk could also be done with traditional I/O?
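
To make the "mmap() on a limited portion of the backing file" idea concrete, 
here is a minimal sketch in Python (chosen only for brevity; nothing like 
this exists in GDAL or VSI). The WindowedMap class and its method names are 
hypothetical. It assumes the backing file already exists at its final size, 
maps one fixed-size window at a time, and remaps whenever a request falls 
outside the current window.

```python
import mmap
import os


class WindowedMap:
    """Map one fixed-size window of a backing file, remapping on demand.

    Hypothetical illustration only: not a GDAL/VSI API.
    """

    def __init__(self, path, window_size=16 * mmap.ALLOCATIONGRANULARITY):
        # mmap offsets must be multiples of the allocation granularity, so
        # the window size is required to be such a multiple as well.
        if window_size <= 0 or window_size % mmap.ALLOCATIONGRANULARITY:
            raise ValueError("window_size must be a positive multiple of "
                             "mmap.ALLOCATIONGRANULARITY")
        self._file = open(path, "r+b")
        self._file_size = os.fstat(self._file.fileno()).st_size
        self._window_size = window_size
        self._start = None  # file offset of the currently mapped window
        self._map = None

    def _remap(self, offset):
        """Ensure the window containing `offset` is the one mapped."""
        start = (offset // self._window_size) * self._window_size
        if start == self._start:
            return
        if self._map is not None:
            self._map.flush()
            self._map.close()
        length = min(self._window_size, self._file_size - start)
        self._map = mmap.mmap(self._file.fileno(), length, offset=start)
        self._start = start

    def read(self, offset, size):
        """Return up to `size` bytes at `offset`, truncated at end of file."""
        out = bytearray()
        while size > 0 and offset < self._file_size:
            self._remap(offset)
            rel = offset - self._start
            chunk = self._map[rel:rel + size]
            out += chunk
            offset += len(chunk)
            size -= len(chunk)
        return bytes(out)

    def write(self, offset, data):
        """Write `data` at `offset`; the file is never extended."""
        if offset < 0 or offset + len(data) > self._file_size:
            raise ValueError("write outside the backing file")
        pos = 0
        while pos < len(data):
            self._remap(offset)
            rel = offset - self._start
            n = min(len(data) - pos, len(self._map) - rel)
            self._map[rel:rel + n] = data[pos:pos + n]
            offset += n
            pos += n

    def close(self):
        if self._map is not None:
            self._map.flush()
            self._map.close()
            self._map = None
        self._file.close()
```

Only one window of address space is mapped at any time, so the cost moves 
from swap pressure to remapping: an access pattern that jumps between 
windows remaps on every call, which is one reason plain read()/write() on 
the backing file may do just as well.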

Anyway, this wouldn't be a "usual" VSI file system, since the semantics of a 
VSI file system must be similar to traditional POSIX file I/O (fread(), 
fwrite(), etc.), and mmap'ing is a different beast. This would be more like 
a portability API wrapping the Unix and Windows APIs for establishing a 
memory mapping.

(This discussion makes me think of the latest release of SQLite, where they 
can optionally use mmap(): http://www.sqlite.org/mmap.html . It is limited, 
though, to situations where the mmap() size you give is big enough to fit 
the file size. If that condition is met, they have observed a performance 
boost by a factor of 2 in some situations w.r.t. traditional I/O methods.)
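
For reference, SQLite exposes this through "PRAGMA mmap_size". The sketch 
below uses Python's sqlite3 module; the open_with_mmap helper is a 
hypothetical name, and it assumes the linked SQLite library is recent 
enough to know the pragma. The value SQLite reports back may be lower than 
the one requested, or absent if memory-mapped I/O is not available in that 
build.

```python
import sqlite3


def open_with_mmap(path, mmap_bytes):
    """Open a SQLite database and request memory-mapped I/O.

    Hypothetical helper. Returns (connection, granted), where `granted` is
    the mmap limit SQLite reports back, or 0 if it reports nothing.
    """
    conn = sqlite3.connect(path)
    # Setting the pragma also returns the limit actually in effect.
    row = conn.execute(f"PRAGMA mmap_size = {int(mmap_bytes)}").fetchone()
    granted = row[0] if row is not None else 0
    return conn, granted
```

Comparing the granted value with the size of the database file tells you 
whether the whole file fits within the mapping limit.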

Even

-- 
Geospatial professional services
http://even.rouault.free.fr/services.html

