[gdal-dev] GDAL, vsis3 and vsisubfile

Even Rouault even.rouault at spatialys.com
Mon Jul 24 08:21:19 PDT 2017


Mike,

(note to other readers: this is the continuation of the thread
[gdal-dev] VSIS3 on digital globe multiview-stereo (NITF) )

> I turned on some debug options that shed some light on to what's going on.
> It appears that the NITF driver must internally open a JPEG 2000 Driver on
> a virtual subfile. In my case, that virtual subfile starts at offset 4038
> and continues to the end of the file, offset 901949970.
> 
> While this is a nice way of providing a JPEG2000 decompression routine to
> the NITF driver, when accessing a remote dataset, it causes the entire file
> to be downloaded even when reading a small window.
> 
> I used gdal_translate locally on my NITF file and turned it into a JP2
> file, then I uploaded this file to S3 and ran my gdal_translate -srcwin 000
> 000 1000 1000 /vsis3/mybucket/jp2file.JP2 local_file.tiff and it ran
> instantly. Is there a way to completely bypass using the NITF driver and
> simply open the NITF file with the JP2 driver wrapped up with vsis3?

Yes, you should be able to open the following filename, but this is actually what the NITF 
driver already does:
/vsisubfile/4038_901949970,/vsis3/glitch253/test2.ntf  (you may need to change the second 
value '901949970' to 901949970-4038, since it is supposed to be a length and not an 
offset)
This should be recognized by one of the JPEG2000 drivers, and you should likely get the same 
performance characteristics as when going through the NITF driver (unless the NITF driver does 
something that requires reading the whole file, but I don't think so)
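
As a minimal sketch of the offset-versus-length adjustment above: the helper below 
(vsisubfile_path is a hypothetical name, not a GDAL function) builds the 
/vsisubfile/<offset>_<size>,<filename> string from a start and an end offset. It assumes 
the second value you saw in the debug output is the end offset of the subfile.

```python
def vsisubfile_path(start_offset, end_offset, filename):
    """Build a /vsisubfile/ name from a start and an end byte offset.

    /vsisubfile/ expects <offset>_<size>, so the end offset is converted
    into a length first. (Hypothetical helper, for illustration only.)
    """
    if end_offset <= start_offset:
        raise ValueError("end_offset must be greater than start_offset")
    length = end_offset - start_offset
    return "/vsisubfile/{}_{},{}".format(start_offset, length, filename)


path = vsisubfile_path(4038, 901949970, "/vsis3/glitch253/test2.ntf")
print(path)
# /vsisubfile/4038_901945932,/vsis3/glitch253/test2.ntf
```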

My hypothesis is that the root cause of the performance issue is the progression order 
of the JPEG2000 codestream of this NITF file, which causes most of the file to be read through. 
Likely only X % of bytes are really read, but as they are scattered throughout the whole file, 
given the chunk by chunk downloading logic of /vsis3, you end up reading the whole file in 
practice.
For example I'd expect LRCP (Layer-Resolution-Component-Precinct), RLCP and RPCL to 
cause issues, whereas PCRL and CPRL should perform better for windowed requests.
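
To illustrate the scattering effect, here is a deliberately simplified toy model, not a 
real JPEG2000 reader. It assumes all packets have the same size, that every resolution 
has the same number of precincts, and that a download chunk holds a fixed number of 
packets. The dimensions (20 layers, 6 resolutions, 1 component, 64 precincts) and the 
chunk size are made-up values, not taken from your file. It only compares LRCP with PCRL 
for a window covering 4 precincts.

```python
from itertools import product

# Toy dimensions: assumptions for illustration, not read from any real file.
DIMS = {"L": 20, "R": 6, "C": 1, "P": 64}


def packet_positions(order, wanted_precincts):
    """Return the sequence indexes of the packets belonging to the wanted precincts.

    `order` is a progression order such as "LRCP": the first letter is the
    outermost loop and the last letter the innermost one.
    """
    ranges = [range(DIMS[axis]) for axis in order]
    p_axis = order.index("P")
    return [i for i, packet in enumerate(product(*ranges))
            if packet[p_axis] in wanted_precincts]


def chunks_touched(order, wanted_precincts, packets_per_chunk=64):
    """Count the distinct fixed-size chunks holding at least one needed packet."""
    positions = packet_positions(order, wanted_precincts)
    return len({pos // packets_per_chunk for pos in positions})


window = {0, 1, 2, 3}  # precincts intersecting the requested window (assumed)
total_chunks = (DIMS["L"] * DIMS["R"] * DIMS["C"] * DIMS["P"]) // 64
for order in ("LRCP", "PCRL"):
    print(order, chunks_touched(order, window), "of", total_chunks, "chunks")
# LRCP 120 of 120 chunks
# PCRL 8 of 120 chunks
```

In this model only 1/16 of the packets is needed in both cases, but with LRCP they are 
spread so that every chunk has to be fetched.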

http://www.gwg.nga.mil/ntb/baseline/docs/bpj2k01/ISOJ2K_profile.pdf recommends using 
LRCP with 19-20 quality layers, so that would indeed cause a lot of seeking through the file. 
You can check the progression order in the output of the following command (look for 
"SGcod_Progress"):

python dump_jp2.py /vsisubfile/4038_901949970,/vsis3/glitch253/test2.ntf

where dump_jp2.py is
https://svn.osgeo.org/gdal/trunk/gdal/swig/python/samples/dump_jp2.py
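
If you only want the progression order, a rough sketch of what to look for is below. 
It is not what dump_jp2.py does internally; it is a small stand-alone parser 
(progression_order is a hypothetical helper) that walks the main header of a raw 
JPEG2000 codestream until the COD marker (0xFF52) and reads the progression byte of 
SGcod. It assumes the bytes start with the SOC marker, i.e. a raw codestream and not a 
JP2 file with boxes. The demo header is synthetic.

```python
import struct

# Progression order codes of the COD marker segment (SGcod, first byte).
PROGRESSION_ORDERS = {0: "LRCP", 1: "RLCP", 2: "RPCL", 3: "PCRL", 4: "CPRL"}


def progression_order(codestream):
    """Return the progression order found in the main-header COD segment."""
    if codestream[:2] != b"\xff\x4f":
        raise ValueError("not a raw JPEG2000 codestream (missing SOC marker)")
    pos = 2
    while pos + 4 <= len(codestream):
        marker, length = struct.unpack(">HH", codestream[pos:pos + 4])
        if marker == 0xFF52:  # COD
            # Layout: marker(2) Lcod(2) Scod(1) then SGcod = progression(1) ...
            if pos + 5 >= len(codestream):
                raise ValueError("truncated COD marker segment")
            code = codestream[pos + 5]
            return PROGRESSION_ORDERS.get(code, "unknown ({})".format(code))
        if marker == 0xFF90:  # SOT: the main header is over
            break
        # The length field counts itself but not the 2-byte marker.
        pos += 2 + length
    raise ValueError("no COD marker found in the main header")


# Synthetic main header: SOC, a dummy SIZ segment, then a COD segment
# declaring LRCP (code 0) with 20 layers.
demo = (b"\xff\x4f"
        + b"\xff\x51" + struct.pack(">H", 4) + b"\x00\x00"
        + b"\xff\x52" + struct.pack(">H", 12)
        + bytes([0, 0, 0, 20, 0, 5, 4, 4, 0, 0]))
print(progression_order(demo))
# LRCP
```

On the real file, the first few kilobytes of the /vsisubfile/ name above could be read 
with gdal.VSIFOpenL / gdal.VSIFReadL and passed to this function, provided the subfile 
really starts with a raw codestream.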

It is likely that your translation into JP2 turned the original codestream into one with a 
progression order that is more seek-friendly (the default progression order may 
differ between drivers).

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com