[gdal-dev] Cache when dealing with several processes and COG

Sean Gillies sean at mapbox.com
Tue Jul 3 09:34:32 PDT 2018


Hi Guy,

On Tue, Jul 3, 2018 at 12:06 AM Guy Doulberg <guyd at satellogic.com> wrote:

> Hi guys,
>
> I am working on a tile server use case on top of COGs.
>
> I want to find a caching mechanism for my architecture.
>
> The tile server architecture is several Python processes (gunicorn) running
> on several VMs.
>
> I understand how GDAL caches the curl blocks or the raster using
> intra-process caching, but I can't use this cache in the other
> processes/VMs.
>
> I was thinking of using some kind of HTTP proxy server that would cache
> the bytes retrieved from the HTTP server holding the COGs (Azure
> Blob Storage).
>
> There is some data that can be reused (and therefore cached) across all tile
> requests, for example:
> 1. The file size (HEAD)
> 2. The first header block
> 3. The other header blocks
> 4. Maybe, in some cases, the image blocks themselves (in case you take the
> same blocks all the time but change something in the presentation layer)
>
> Have any of you tried this architecture or used a different way to cache
> across servers? Maybe there is a way to share GDAL_CACHE across processes
> that I missed?
>
> Thanks,
> Guy
>
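
The reusable pieces in your list could all be keyed by URL and byte range.
A minimal sketch, assuming an in-process dict as a stand-in for a store that
is actually shared across VMs (Redis, memcached or similar). RangeCache and
fake_fetch are names made up for illustration and are not part of GDAL:

```python
from typing import Callable, Dict, Tuple

# (url, first_byte, last_byte) -> bytes. Hypothetical signature for whatever
# really issues the HTTP range request against blob storage.
Fetch = Callable[[str, int, int], bytes]


class RangeCache:
    """Cache byte ranges keyed by (url, first_byte, last_byte)."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        # Stand-in for a shared store; a plain dict is only visible to one process.
        self._store: Dict[Tuple[str, int, int], bytes] = {}
        self.misses = 0

    def get(self, url: str, first: int, last: int) -> bytes:
        key = (url, first, last)
        if key not in self._store:
            # Only go to the origin when this exact range has not been seen.
            self.misses += 1
            self._store[key] = self._fetch(url, first, last)
        return self._store[key]


def fake_fetch(url: str, first: int, last: int) -> bytes:
    """Pretend origin: returns zero bytes of the requested length."""
    return bytes(last - first + 1)


cache = RangeCache(fake_fetch)
header = cache.get("https://example.com/a.tif", 0, 16383)
again = cache.get("https://example.com/a.tif", 0, 16383)  # served from the cache
```

Exact-range keys only hit when two requests ask for identical ranges, so this
suits the header blocks better than arbitrary image blocks.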

Nginx advertises some support for byte range caching that I've been meaning
to try: https://www.nginx.com/blog/smart-efficient-byte-range-caching-nginx/
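
As I understand the approach in that post, the proxy splits each file into
fixed-size slices and caches those, so different client ranges that fall in
the same slice share one cached object. A rough sketch of just the alignment
arithmetic, not of Nginx's implementation; aligned_slices and the slice sizes
are my own choices for illustration:

```python
from typing import List, Tuple


def aligned_slices(first: int, last: int, slice_size: int) -> List[Tuple[int, int]]:
    """Return the inclusive (first, last) byte ranges of the fixed-size
    slices that cover the requested inclusive range [first, last]."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    if first < 0 or last < first:
        raise ValueError("invalid byte range")
    # Round the start down to the nearest slice boundary.
    start = (first // slice_size) * slice_size
    slices = []
    while start <= last:
        # The final slice may run past the end of the file; a real proxy
        # would clamp it to the file size.
        slices.append((start, start + slice_size - 1))
        start += slice_size
    return slices


# Two different requests inside the same 1 KiB slice map to the same cache key.
same_a = aligned_slices(10, 20, 1024)
same_b = aligned_slices(500, 600, 1024)

# A request straddling a 1 MiB boundary needs two slices.
straddle = aligned_slices(1_000_000, 1_100_000, 1_048_576)
```

The trade-off is over-fetching: a small read pulls a whole slice from the
origin the first time, which pays off only if neighbouring bytes get requested
again.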

My own strategy so far has been to deploy to the same cloud as the data,
profit from the higher bandwidth, and not think about caching very much at
all.

-- 
Sean Gillies