<div dir="ltr">Thanks Even, <div><br></div><div>it's difficult to do async when you need to support so many formats and their corresponding third-party libraries :(</div><br><div class="gmail_quote"><div dir="ltr">On Mon, Jun 18, 2018 at 8:37 PM Even Rouault &lt;<a href="mailto:even.rouault@spatialys.com">even.rouault@spatialys.com</a>&gt; wrote:</div><div dir="ltr"><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

One limitation of a single-threaded asynchronous approach in your use case is <br>

that you use DEFLATE, which is CPU hungry. So you are really taking advantage <br>

of multiple vCPUs. Is the "total_cpu :   92.00 sec" really measuring CPU <br>

activity ? If so, normally I'd expect this to be mostly spent in DEFLATE <br>

decompression (although 92 s to decompress ~ 500 MB seems too much, so <br>

there's some non-negligible CPU activity happening somewhere else)<br></blockquote><div><br></div><div>That's poor reporting on my part; I should probably take that `total_cpu` out or rename it to something more accurate. It is just the sum of time spent waiting for the result across all threads. Most of that is blocking on I/O, and only a tiny fraction is actual work like decompression. The actual CPU load while running the benchmark is relatively low, as observed with `htop` and `time`; unfortunately I'm not recording that. I'm pretty sure I'm not saturating compute and that the bottleneck is I/O. The read latency I measure is quite similar to the latency measured with `wrk`, which just fetches the data, so while I haven't measured the proportion of time spent decompressing versus doing I/O, I am fairly confident that I/O dominates by a large margin.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

> <br>

> I'd like to be able to<br>

> <br>

> 1. Issue multiple open requests from the same thread<br>

<br>

In an asynchronous way? Hard to do</blockquote><div><br></div><div>Yep, it's important to issue several opens at a time, since open accounts for more than 50% of the total latency when reading just one tile. That kind of makes sense: the two requests have similar cost even though the second one is much larger in bytes, because the first one hits a "cold" object and so takes longer than the larger second request.</div><div><br></div><div><br></div></div></div>