[gdal-dev] Amazon S3 virtual file system /vsis3/ and /vsis3_streaming/

Even Rouault even.rouault at spatialys.com
Tue Oct 27 11:23:57 PDT 2015


Hi,

Just to mention that trunk has received two new virtual file systems,
/vsis3/ and /vsis3_streaming/, to handle files stored on Amazon S3.

They are specializations of the well-known /vsicurl/ and /vsicurl_streaming/
that deal with S3 authentication and other particularities. In addition to
reading, /vsis3/ supports sequential writing.

For more details, quoting the docs:

/**
 * \brief Install /vsis3/ Amazon S3 file system handler (requires libcurl)
 *
 * A special file handler is installed that allows on-the-fly random reading of files
 * available in AWS S3 buckets, without prior download of the entire file.
 * It also allows sequential writing of files (no seeks or read operations are then
 * allowed).
 *
 * Recognized filenames are of the form /vsis3/bucket/key where
 * bucket is the name of the S3 bucket and key the S3 object "key", i.e.
 * a filename potentially containing subdirectories.
 *
 * Partial downloads are done with a 16 KB granularity by default.
 * If the driver detects sequential reading
 * it will progressively increase the chunk size up to 2 MB to improve download
 * performance.
 *
 * The AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID configuration options *must* be
 * set.
 * The AWS_REGION configuration option may be set to one of the supported
 * <a href="http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region">S3 regions</a>
 * and defaults to 'us-east-1'.
 * The AWS_S3_ENDPOINT configuration option defaults to s3.amazonaws.com.
 *
 * The GDAL_HTTP_PROXY, GDAL_HTTP_PROXYUSERPWD and GDAL_PROXY_AUTH configuration options can be
 * used to define a proxy server. The syntax is that of the Curl CURLOPT_PROXY,
 * CURLOPT_PROXYUSERPWD and CURLOPT_PROXYAUTH options.
 *
 * On reading, the file can be cached in RAM by setting the configuration option
 * VSI_CACHE to TRUE. The cache size defaults to 25 MB, but can be modified by setting
 * the configuration option VSI_CACHE_SIZE (in bytes).
 *
 * On writing, the file is uploaded using the S3 <a href="http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadInitiate.html">multipart upload API</a>.
 * The size of chunks is set to 50 MB by default, which allows creating files of up to
 * 500 GB (10000 parts of 50 MB each). If larger files are needed, increase the
 * value of the VSIS3_CHUNK_SIZE config option (expressed in MB).
 * If the process is killed and the file is not properly closed, the multipart upload
 * will remain open, causing Amazon to charge you for the storage of the parts. You'll
 * have to abort such "ghost" uploads yourself by other means
 * (e.g. with the <a href="http://s3tools.org/s3cmd">s3cmd</a> utility).
 * For files smaller than the chunk size, a simple PUT request is used instead
 * of the multipart upload API.
 */
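
To make the chunk size arithmetic above concrete, here is a minimal Python
sketch. It is not GDAL code: the function names are hypothetical, it uses
1 GB = 1000 MB as the "500 GB" figure in the docs implies, and never going
below the 50 MB default is just a choice made for this illustration.

```python
import math

# Figures quoted in the docs above.
MAX_PARTS = 10000           # parts per multipart upload
DEFAULT_CHUNK_SIZE_MB = 50  # default chunk size, in MB


def max_file_size_mb(chunk_size_mb=DEFAULT_CHUNK_SIZE_MB):
    """Largest file (in MB) that fits in MAX_PARTS chunks of the given size."""
    return MAX_PARTS * chunk_size_mb


def min_chunk_size_mb(file_size_mb):
    """Smallest whole-MB chunk size that lets a file of this size fit.

    Hypothetical helper: never returns less than the 50 MB default.
    """
    needed = math.ceil(file_size_mb / MAX_PARTS)
    return max(DEFAULT_CHUNK_SIZE_MB, needed)


if __name__ == "__main__":
    print(max_file_size_mb())            # 500000 MB, i.e. 500 GB
    print(min_chunk_size_mb(1_000_000))  # 100 MB chunks for a 1 TB file
```

So for a 1 TB file, VSIS3_CHUNK_SIZE would have to be set to at least 100
under these assumptions.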



/**
 * \brief Install /vsis3_streaming/ Amazon S3 file system handler (requires libcurl)
 *
 * A special file handler is installed that allows on-the-fly sequential reading of files
 * streamed from AWS S3 buckets without prior download of the entire file.
 *
 * Recognized filenames are of the form /vsis3_streaming/bucket/key where
 * bucket is the name of the S3 bucket and key the S3 object "key", i.e.
 * a filename potentially containing subdirectories.
 *
 * The AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID configuration options *must* be
 * set.
 * The AWS_REGION configuration option may be set to one of the supported
 * <a href="http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region">S3 regions</a>
 * and defaults to 'us-east-1'.
 * The AWS_S3_ENDPOINT configuration option defaults to s3.amazonaws.com.
 *
 * The GDAL_HTTP_PROXY, GDAL_HTTP_PROXYUSERPWD and GDAL_PROXY_AUTH configuration options can be
 * used to define a proxy server. The syntax is that of the Curl CURLOPT_PROXY,
 * CURLOPT_PROXYUSERPWD and CURLOPT_PROXYAUTH options.
 *
 * The file can be cached in RAM by setting the configuration option
 * VSI_CACHE to TRUE. The cache size defaults to 25 MB, but can be modified by setting
 * the configuration option VSI_CACHE_SIZE (in bytes).
 */
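
As an illustration of the filename form and of the configuration options
described above, here is a minimal Python sketch. It is not how GDAL
implements the handlers: S3Target and resolve() are hypothetical names, and
reading the options from a plain mapping (the process environment by default)
is an assumption made to keep the example self-contained.

```python
import os
from typing import Mapping, NamedTuple

PREFIXES = ("/vsis3_streaming/", "/vsis3/")
REQUIRED = ("AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID")


class S3Target(NamedTuple):
    bucket: str
    key: str
    region: str
    endpoint: str


def resolve(filename: str, options: Mapping[str, str] = os.environ) -> S3Target:
    """Split a /vsis3/ or /vsis3_streaming/ filename and apply the defaults."""
    for prefix in PREFIXES:
        if filename.startswith(prefix):
            rest = filename[len(prefix):]
            break
    else:
        raise ValueError(f"not a /vsis3/ or /vsis3_streaming/ filename: {filename!r}")

    # The bucket is the first path component; the key is everything after it
    # and may itself contain "subdirectories".
    bucket, _, key = rest.partition("/")
    if not bucket or not key:
        raise ValueError(f"expected <prefix>bucket/key, got {filename!r}")

    # Both credentials options must be set.
    for name in REQUIRED:
        if not options.get(name):
            raise KeyError(f"{name} must be set")

    return S3Target(
        bucket=bucket,
        key=key,
        region=options.get("AWS_REGION", "us-east-1"),
        endpoint=options.get("AWS_S3_ENDPOINT", "s3.amazonaws.com"),
    )


if __name__ == "__main__":
    # Placeholder credentials, for demonstration only.
    opts = {"AWS_SECRET_ACCESS_KEY": "secret", "AWS_ACCESS_KEY_ID": "id"}
    print(resolve("/vsis3/mybucket/some/dir/file.tif", opts))
```

With only the two mandatory options set, the region and endpoint fall back to
'us-east-1' and s3.amazonaws.com, as stated in the docs.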


Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

