[gdal-dev] TAR POSIX support
Schindler, Fabian
fabian.schindler at eox.at
Wed Aug 21 06:14:03 PDT 2019
Hi Even,
Thanks for your quick response!
The issue is not that we are dealing with POSIX TAR files (which actually
seem to be handled fine by GDAL 3.1 when we create them using `tar
--format=pax`).
The issue seems to be the handling of the terminator byte of the mode,
user ID and group ID header fields. Looking at the code, the expected
terminator is nul (`\0`) (
https://github.com/OSGeo/gdal/blob/master/gdal/port/cpl_vsil_tar.cpp#L336)
but our file seems to use space characters instead:
$ python
>>> a = open('a.tar', 'rb').read(1024)
>>> a[100:150]
b'0100644 0000000 0000000 00000000162 13460560104 01'
We believe that this is the relevant line in the specification:
https://github.com/Keruspe/tar-parser.rs/blob/master/tar.specs#L202, as it
states that both spaces and nuls shall be allowed in the separation of the
groups.
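To make the offsets concrete, here is a minimal sketch that pulls out the last byte of the mode, uid and gid fields from a header block. It assumes the usual ustar layout (mode at offset 100, uid at 108, gid at 116, 8 bytes each); the name `field_terminators` is made up for illustration, and the header is a synthetic one built from the bytes shown above rather than our real file.

```python
# Assumed ustar layout: (offset, length) of the three 8-byte numeric fields.
FIELDS = {"mode": (100, 8), "uid": (108, 8), "gid": (116, 8)}


def field_terminators(header: bytes) -> dict:
    """Return the last byte of each field, i.e. its terminator."""
    return {
        name: header[offset + length - 1:offset + length]
        for name, (offset, length) in FIELDS.items()
    }


# Synthetic 512-byte header carrying the bytes printed above at offset 100.
header = bytearray(512)
header[100:150] = b'0100644 0000000 0000000 00000000162 13460560104 01'

print(field_terminators(bytes(header)))
# {'mode': b' ', 'uid': b' ', 'gid': b' '}
```

The three terminators sit at offsets 107, 115 and 123, and in our file all of them are spaces.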
We actually succeeded in loading the TIFF when we ran `gdalinfo` under
`gdb`, set a breakpoint at the above location and then did the
following for each file:
(gdb) set abyHeader[115]=0
(gdb) set abyHeader[123]=0
(gdb) set abyHeader[107]=0
This produced a list of files within that archive.
So I think the (already hefty) `if` clause could just be expanded to also
allow spaces as terminators and we would be fine.
Do you think that is feasible?
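As a rough illustration of the relaxed check, written in Python rather than the C++ of cpl_vsil_tar.cpp: the sketch below only models the terminator test at offsets 107, 115 and 123, not the rest of the existing `if` clause, and `terminators_ok` and its `allow_space` flag are hypothetical names.

```python
TERMINATOR_OFFSETS = (107, 115, 123)  # last byte of mode, uid, gid


def terminators_ok(header: bytes, allow_space: bool = True) -> bool:
    """Check the field terminators; optionally accept space as well as nul."""
    allowed = (0x00, 0x20) if allow_space else (0x00,)
    if len(header) <= max(TERMINATOR_OFFSETS):
        return False
    return all(header[i] in allowed for i in TERMINATOR_OFFSETS)


# Synthetic header with space-terminated fields, as in our file.
header = bytearray(512)
header[100:124] = b'0100644 0000000 0000000 '

print(terminators_ok(bytes(header), allow_space=False))  # nul only
print(terminators_ok(bytes(header), allow_space=True))   # nul or space
```

With nul as the only accepted terminator the header is rejected; once space is accepted as well, it passes, which matches what we saw after patching the three bytes in gdb.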
Regards
On Wed, 21 Aug 2019 at 13:24, Even Rouault <even.rouault at spatialys.com>
wrote:
> Fabian,
>
> /vsitar/ has indeed no dedicated support for POSIX.1-2001 (pax)
>
> On a very simple test with a tiny file from GDAL autotest,
>
> tar --format=pax --create -f byte_pax.tar byte.tif
>
> then gdalinfo /vsitar/byte_pax.tar/byte.tif works.
>
> But I see that GDAL sees an extra text file "PaxHeaders.25680/byte.tif"
> which is the one mentioned in
> https://en.wikipedia.org/wiki/Tar_(computing)#POSIX.1-2001/pax which
> contains 3 timestamps with nanosecond precision.
>
> So I suspect your particular .tar file uses specific pax extensions which
> are not UStar compatible. I didn't dig more. Would probably require an
> example to better see what's going on.
>
> Even
>
> > Hi list,
> >
> > We have a large archive of images which are stored in POSIX TAR format
> > (POSIX.1-2001 (pax)). GDAL seems to be unable to open these files (GDAL
> > 2.4.0). Unfortunately, re-packing these files in GNU format (which GDAL
> > seems to be happy with) is not an option.
> >
> > We also tried with the latest Docker image (alpine-small-latest, GDAL
> > 3.1.0dev-3c44aef0f367d0439d42f5384896fc7899317f06), but the same issue
> > persists: the GNU tar is opened without issues, the POSIX one fails.
> >
> > Is it confirmed that GDAL does not have support for that TAR POSIX
> > format?
> > How much effort would it be to add support for that format?
> >
> > Regards,
> > Fabian
>
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
>