[gdal-dev] Cache when dealing with several processes and COG

Guy Doulberg guyd at satellogic.com
Thu Jul 5 04:00:13 PDT 2018


Thanks, Sean

I followed the links. It seems to work only for HTTP and not for HTTPS,
but I might still use it.

One thing you need to consider when using Nginx byte-range (slice) caching
is the size of the range you configure it to cache.

If your caching range is 1K and your block is bytes 999-1500, nginx will
fetch ranges 0-1023 and 1024-2047 to return the block you originally
requested. I wonder if I can align this range configuration to the data
blocks in a COG. I think I can't, right? There is no way of knowing the
sizes of the blocks without reading the headers, right? Especially if I am
using compression.
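
To make the arithmetic above concrete, here is a minimal Python sketch. It
assumes the proxy always fetches whole fixed-size ranges aligned to multiples
of the configured size, which is my reading of the blog post and not something
I have verified against nginx itself. The function names are made up for
illustration.

```python
def covering_slices(first_byte, last_byte, slice_size):
    """Return the inclusive (start, end) byte ranges of the aligned,
    fixed-size slices needed to cover bytes first_byte..last_byte."""
    first_slice = first_byte // slice_size
    last_slice = last_byte // slice_size
    return [
        (i * slice_size, (i + 1) * slice_size - 1)
        for i in range(first_slice, last_slice + 1)
    ]


def overfetch(first_byte, last_byte, slice_size):
    """Return how many extra bytes are fetched beyond the requested range."""
    fetched = len(covering_slices(first_byte, last_byte, slice_size)) * slice_size
    requested = last_byte - first_byte + 1
    return fetched - requested


if __name__ == "__main__":
    # The example from this mail: a 1K range size and a block at bytes 999-1500.
    print(covering_slices(999, 1500, 1024))
    print(overfetch(999, 1500, 1024))
```

A block that straddles a slice boundary costs two upstream fetches, so a
range size much smaller than the typical compressed block would multiply
requests, and a much larger one would pull in bytes nobody asked for.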

Guy



On Tue, Jul 3, 2018 at 7:34 PM, Sean Gillies <sean at mapbox.com> wrote:

> Hi Guy,
>
> On Tue, Jul 3, 2018 at 12:06 AM Guy Doulberg <guyd at satellogic.com> wrote:
>
>> Hi guys,
>>
>> I am working on a tile server use case on top of COGs.
>>
>> I want to find a caching mechanism for my architecture.
>>
>> The tile-server architecture is several Python processes (gunicorn)
>> running on several VMs.
>>
>> I understand how GDAL caches the curl blocks or the raster using
>> intra-process caching, but I can't use this cache in the other
>> processes/VMs.
>>
>> I was thinking of maybe using some kind of HTTP proxy server that will
>> cache the byte content retrieved from the HTTP server holding the COGs
>> (Azure Blob Storage).
>>
>> There is some data that can be reused (and therefore cached) across all
>> tile requests, for example:
>> 1. The file size (HEAD)
>> 2. The first header block
>> 3. The other header blocks
>> 4. Maybe in some cases the image blocks themselves (in case you take the
>> same blocks all the time but change something in the presentation layer)
>>
>> Have any of you tried this architecture or used a different way to cache
>> across servers? Maybe there is a way to share GDAL_CACHE across processes
>> that I missed?
>>
>> Thanks,
>> Guy
>>
>
> Nginx advertises some support for byte range caching that I've been
> meaning to try:
> https://www.nginx.com/blog/smart-efficient-byte-range-caching-nginx/.
>
> My own strategy so far has been to deploy to the same cloud as the data
> and profit from higher bandwidth and not to think about caching very much
> at all.
>
> --
> Sean Gillies
>