[gdal-dev] get data from one s3 bucket, process and upload to another bucket as docker container

Daniel Evans daniel.fred.evans at gmail.com
Tue Nov 29 11:24:19 PST 2022


Hi Marcin,

For writing the data to an S3 bucket, two options come to mind:

1) Write directly from GDAL, using the /vsis3/ virtual file system
2) Write the data locally from GDAL, and then copy to S3 using another
Python library (e.g. boto3, s3fs) or the AWS CLI

Option 2 is quite widely adopted. While it seems more complex, writing COGs
directly to S3 is difficult (impossible?), because the writer needs random
write access to the output file, while /vsis3/ only supports sequential
writes.
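
As a rough sketch of option 2, assuming boto3 and the GDAL Python bindings
(3.1 or newer, for the COG driver) are installed in the container and that
AWS credentials come from the environment. The bucket names, keys, the
"mosaic.vrt" file name and the helper names are made up for illustration:

```python
import os
import tempfile


def vsis3_path(bucket, key):
    """Build a GDAL /vsis3/ path for an object in an S3 bucket."""
    return f"/vsis3/{bucket}/{key.lstrip('/')}"


def upload_files(s3_client, local_paths, bucket, prefix=""):
    """Upload local files under an optional key prefix; return the keys."""
    keys = []
    for path in local_paths:
        key = f"{prefix.strip('/')}/{os.path.basename(path)}".lstrip("/")
        s3_client.upload_file(path, bucket, key)
        keys.append(key)
    return keys


def convert_and_upload(src_bucket, src_key, dst_bucket, dst_prefix=""):
    """Read from one bucket, write a COG and a VRT locally, upload both."""
    # Imported here so the helpers above work without GDAL or boto3.
    import boto3
    from osgeo import gdal

    gdal.UseExceptions()
    client = boto3.client("s3")
    name = os.path.splitext(os.path.basename(src_key))[0]

    with tempfile.TemporaryDirectory() as tmp:
        # GDAL reads the source straight from S3 and writes the COG locally.
        cog_path = os.path.join(tmp, f"{name}.tif")
        ds = gdal.Translate(
            cog_path,
            vsis3_path(src_bucket, src_key),
            format="COG",
            creationOptions=["COMPRESS=DEFLATE"],
        )
        ds = None  # close the dataset so the file is fully written
        keys = upload_files(client, [cog_path], dst_bucket, dst_prefix)

        # Build the VRT against the uploaded COG so it references /vsis3/.
        vrt_path = os.path.join(tmp, "mosaic.vrt")
        sources = [vsis3_path(dst_bucket, k) for k in keys]
        vrt = gdal.BuildVRT(vrt_path, sources)
        vrt = None  # close the dataset so the VRT XML is written
        keys += upload_files(client, [vrt_path], dst_bucket, dst_prefix)
    return keys
```

Calling convert_and_upload("in-bucket", "scenes/a.tif", "out-bucket", "cogs")
would then be the whole job for one file. The temporary directory keeps the
container's disk usage bounded to one raster at a time.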

Another option I've seen is to use rasterio, the Pythonic GDAL wrapper, to
write to an in-memory buffer, and then pass this as a byte stream for boto3
to upload to S3. This avoids use of the local filesystem, but won't scale
to very large rasters, since the whole file has to fit in memory. I'm
unsure how easy it would be to achieve this from the plain GDAL Python
library.
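
For what it's worth, a plain-GDAL analogue of the in-memory approach might
look like the sketch below. It does not use rasterio: it writes the COG to
GDAL's /vsimem/ file system, reads the bytes back and hands them to boto3.
I'm assuming GDAL 3.1 or newer and credentials in the environment; the
function names are made up, and the gdal module is passed to the reader
only so it can be swapped out in a test:

```python
import uuid


def read_vsi_bytes(gdal, path):
    """Return the full contents of a GDAL virtual file as bytes."""
    size = gdal.VSIStatL(path).size
    handle = gdal.VSIFOpenL(path, "rb")
    try:
        return bytes(gdal.VSIFReadL(1, size, handle))
    finally:
        gdal.VSIFCloseL(handle)


def cog_to_s3_in_memory(src_path, dst_bucket, dst_key):
    """Write a COG to /vsimem/ and upload its bytes without touching disk."""
    # Imported here so read_vsi_bytes can be used without GDAL or boto3.
    import boto3
    from osgeo import gdal

    gdal.UseExceptions()
    mem_path = f"/vsimem/{uuid.uuid4().hex}.tif"
    try:
        ds = gdal.Translate(mem_path, src_path, format="COG")
        ds = None  # close the dataset so the in-memory file is complete
        body = read_vsi_bytes(gdal, mem_path)
    finally:
        gdal.Unlink(mem_path)  # free the /vsimem/ copy

    boto3.client("s3").put_object(Bucket=dst_bucket, Key=dst_key, Body=body)
    return len(body)
```

Here src_path can itself be a /vsis3/ path. Note that the raster is briefly
held twice in memory (once in /vsimem/, once as a Python bytes object), so
the scaling caveat above applies even more strongly.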

Regards,
Daniel

On Tue, 29 Nov 2022, 10:24 Marcin Niemyjski via gdal-dev, <
gdal-dev at lists.osgeo.org> wrote:

> Hello,
>
> First of all, thank you for providing awesome open-source tools, my work
> would not be possible without them ^^
>
> I'm looking for an effective workflow to get data from an S3 bucket, then
> generate a COG and VRT from it and write them to another S3 bucket.
> Everything should run as a containerized Python application.
>
> The main problem is that I do not know how to transfer processed data
> (COG, VRT) into another bucket.
>
> Best,
> Marcin Niemyjski