[gdal-dev] GDAL, vsis3 and vsisubfile

Tue Jul 25 01:19:56 PDT 2017

On Monday 24 July 2017 22:41:05 CEST Mike Pfaffenberger wrote:
> Hi Even,
> 
> I ran the script you linked, and your hypothesis is absolutely correct.
> 
> <JP2KCodeStream filename="/vsisubfile/4038_901949970,/vsis3/glitch253/
> test2.ntf">
> .
> .
> .
> <Field name="SGcod_Progress" type="uint8" description="RLCP">1</Field>
> <Field name="SGcod_NumLayers" type="uint16">19</Field>
> 
> I also added a quick printf in the vsi subfile read function which prints
> the nSize and nCount variables. Running the python script you linked me
> triggered the vsi subfile read function 75,868 times, mostly with small
> sizes, and nCount=1.
> 
> Doing the same thing on my gdal_translate -srcwin 000 000 1000 1000
> triggered vsi subfile read 9,024 times, almost all with nSize=1 and
> nCount=1024. If the vsisubfile object is wrapping a vsis3 dataset, then
> does each vsi subfile read turn into an HTTP request? 

Not exactly. /vsis3/ reads in chunks of at least 16 KB (with logic to grow the chunk size
when it detects that the chunks are consecutive) and caches them, to avoid issuing too many
small HTTP range requests. The issue with the original NITF file is that those small reads
must be scattered through the whole file, causing a lot of 16 KB chunks to be read. I doubt
that reducing the chunk size would really help with performance, because the bottleneck
is most likely the latency of each HTTP request rather than the number of bytes transferred.
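
To make the effect of scattering concrete, here is a minimal sketch that counts how many fixed-size chunks a set of small reads would touch. It is only a model of the idea described above, not GDAL's actual implementation: it ignores the chunk-growing logic and the cache eviction policy, and the function name and the two access patterns are hypothetical.

```python
CHUNK_SIZE = 16384  # the 16 KB minimum chunk size mentioned above


def chunks_touched(reads, chunk_size=CHUNK_SIZE):
    """Return the set of chunk indices needed to serve (offset, length) reads."""
    touched = set()
    for offset, length in reads:
        if length <= 0:
            continue
        first = offset // chunk_size
        last = (offset + length - 1) // chunk_size
        touched.update(range(first, last + 1))
    return touched


# Hypothetical access patterns: 1000 reads of 100 bytes each.
scattered = [(i * 64 * CHUNK_SIZE, 100) for i in range(1000)]  # one read every 1 MB
clustered = [(i * 100, 100) for i in range(1000)]              # back to back

print(len(chunks_touched(scattered)))  # 1000: every read lands in its own chunk
print(len(chunks_touched(clustered)))  # 7: the same number of reads share a few chunks
```

In this model the same number of small reads costs either 7 or 1000 chunk fetches, depending only on where the reads fall in the file.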

That said, it is not difficult to try. You can edit the port/cpl_vsil_curl.cpp file, change the
value of the DOWNLOAD_CHUNK_SIZE constant to something smaller than the current
16384 (e.g. try 1024), and rebuild.
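
Before rebuilding, a back-of-the-envelope estimate can suggest what to expect. The sketch below assumes a hypothetical 50 ms round-trip latency and 10 MB/s of throughput, and assumes that the number of requests stays the same when the chunk size shrinks (which is the case when reads are scattered). None of these figures are measurements from this thread.

```python
def estimated_seconds(n_requests, chunk_size, latency_s, bandwidth_bytes_per_s):
    """Rough model: every request pays a fixed latency plus the transfer time."""
    return n_requests * (latency_s + chunk_size / bandwidth_bytes_per_s)


# Hypothetical network figures, for illustration only.
LATENCY_S = 0.05
BANDWIDTH = 10_000_000  # bytes per second

for chunk_size in (16384, 1024):
    total = estimated_seconds(1000, chunk_size, LATENCY_S, BANDWIDTH)
    print(chunk_size, round(total, 1))
```

Under these assumptions the two estimates differ by only about 3 percent, because the fixed latency term dominates the cost of each request.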

LRCP with many quality layers is ideal when you want progressive rendering of the whole
image. Remember the old days of very slow Internet connections, when your browser would
display a progressive JPEG or PNG with gradually improving quality.

> That would certainly
> explain the extremely long time to crop my window.
> 
> Just out of curiosity I ran the python script you linked on my JP2 file
> (same image as the NITF, I just ran gdal_translate on it).
> 
> This one appears to have the codestream progression order LRCP with only
> one layer...?:
>  <Field name="SGcod_Progress" type="uint8" description="LRCP">0</Field>
>  <Field name="SGcod_NumLayers" type="uint16">1</Field>
> 
> I'm guessing the fact that my JP2 file only has one layer is the reason
> vsis3 works well with it, regardless of it being LRCP (not optimal for
> windowed reads).

When one of the L, R, C, P "dimensions" is of size 1, it doesn't really count. So with a single
quality layer, this is in fact an RCP layout, which means that resolution levels are scattered
through the file. It is still not the ideal layout for windowed reads at full resolution (unless
there is just one resolution level; you can check that with the value of
<SPcod_NumDecompositions>).
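
The point about size-1 dimensions can be illustrated with a toy model of packet ordering. This is a simplified sketch, not a JPEG2000 parser: it ignores tiles, assumes the same number of precincts at every resolution level, and uses hypothetical dimension sizes (only the 19 layers come from the NITF file discussed above).

```python
from itertools import product


def packet_order(progression, sizes):
    """List packets as (L, R, C, P) tuples in the order implied by `progression`.

    The first letter of `progression` is the outermost loop, the last the innermost.
    """
    ranges = [range(sizes[d]) for d in progression]
    packets = []
    for idx in product(*ranges):
        p = dict(zip(progression, idx))
        packets.append((p["L"], p["R"], p["C"], p["P"]))
    return packets


def runs_of_top_resolution(packets, n_resolutions):
    """Count contiguous runs of packets that belong to the highest resolution."""
    runs, inside = 0, False
    for _, r, _, _ in packets:
        top = r == n_resolutions - 1
        if top and not inside:
            runs += 1
        inside = top
    return runs


one_layer = {"L": 1, "R": 6, "C": 3, "P": 4}
many_layers = {"L": 19, "R": 6, "C": 3, "P": 4}

# With a single layer, LRCP and RLCP enumerate packets in the same order.
print(packet_order("LRCP", one_layer) == packet_order("RLCP", one_layer))  # True
# Full-resolution packets form one run with 1 layer, and 19 runs with 19 layers.
print(runs_of_top_resolution(packet_order("LRCP", one_layer), 6))    # 1
print(runs_of_top_resolution(packet_order("LRCP", many_layers), 6))  # 19
```

In this model, each additional quality layer in an LRCP stream adds one more separate region of the file that a full-resolution read has to visit.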

> 
> Anyway, thanks. I learned some more about JPEG2K here. Unfortunately I
> think I'm pretty out of luck on the prospect of doing remote windowed reads
> quickly on this data.

Yes, I don't think it is really possible to improve that with those datasets unmodified.

I believe that there are some JPEG2000 toolkits that can change the progression order of an
existing JPEG2000 file without adding new loss (even if it uses lossy compression).

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com