[gdal-dev] Why is saving an array to ENVI file format using SaveArray so slow?
Even Rouault
even.rouault at spatialys.com
Mon Jul 27 11:01:05 PDT 2015
On Monday 27 July 2015 16:23:26 mike c wrote:
> Dear Even and GDAL community,
>
> Thanks for the feedback.
>
> After receiving your email I tried to narrow down the source of the
> problem. On further investigation, writing to my local disk is indeed
> quick, as it was for you. The problem occurs when writing to file shares
> on a local area network, and specifically to remote shares whose
> underlying file system is UNIX. Writing an image (128,4000,128) with
> SaveArray(img, remote system, 'ENVI') from my Windows system to a remote
> file share (where the file system is UNIX) takes 3-4 minutes. Writing the
> image between the same systems in GeoTIFF format is quicker: ~24
> seconds. Using ndarray.tofile() to write the same image from the same
> system to the same remote disk takes ~20 seconds. Copying an image of the
> same size between the two systems using Explorer takes only a few seconds.
> Using SaveArray(img, remote system, 'ENVI') to write the image to a remote
> Windows file share on the same network takes a few seconds.
>
> Any thoughts as to why this is happening, and whether there
> is a way of overcoming this problem, would be welcome.
Mike,
Looking more closely at file system access, the GeoTIFF driver writes in
chunks of 64 KB (128 bands * 128 pixels * sizeof(float), since the default
interleaving of the GTiff driver is PIXEL), whereas the ENVI driver writes
in chunks of 512 bytes (128 pixels * sizeof(float), since the default
interleaving of the ENVI driver is BAND). On Linux, the C library actually
caches in chunks of 4 KB. That is the likely explanation.
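
To put numbers on this, here is a rough back-of-the-envelope sketch. It
assumes a float32 array of shape (128, 4000, 128) and that each chunk
described above reaches the file system as one separate write request, which
is a simplification (the variable names are only illustrative). The product
128 pixels * sizeof(float) works out to 512 bytes per ENVI chunk.

```python
# Array shape from the thread: 128 bands x 4000 lines x 128 columns, float32.
BANDS, LINES, COLUMNS = 128, 4000, 128
BYTES_PER_SAMPLE = 4  # sizeof(float)

total_bytes = BANDS * LINES * COLUMNS * BYTES_PER_SAMPLE

# PIXEL interleaving (GTiff default): one line of all bands per chunk.
gtiff_chunk = BANDS * COLUMNS * BYTES_PER_SAMPLE
# BAND interleaving (ENVI default): one line of a single band per chunk.
envi_chunk = COLUMNS * BYTES_PER_SAMPLE

# Assumption: one write request per chunk, with no intermediate caching.
gtiff_writes = total_bytes // gtiff_chunk
envi_writes = total_bytes // envi_chunk

print(total_bytes)                # 262144000
print(gtiff_chunk, gtiff_writes)  # 65536 4000
print(envi_chunk, envi_writes)    # 512 512000
```

Under these assumptions the ENVI path issues 128 times as many write requests
for the same amount of data, which would matter most where each request has a
high fixed cost, such as on a network share.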
Workarounds:
- write to a local file and then copy it using shutil.copy() or another
efficient method
- use SaveArray() to write to a /vsimem/ file and then write it to the final
destination. See https://svn.osgeo.org/gdal/trunk/autotest/gcore/vsifile.py
for examples of how to read from /vsimem/
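
The first workaround could be sketched roughly as follows. The helper name
save_locally_then_copy and its write_func parameter are hypothetical, not
part of GDAL. The sketch assumes write_func has closed its output files by
the time it returns, and it copies every file found in the temporary
directory because the ENVI driver writes a .hdr header next to the data file.

```python
import os
import shutil
import tempfile


def save_locally_then_copy(write_func, dest_path):
    """Write into a local temp directory, then copy the results to dest_path's directory.

    write_func is a caller-supplied function taking a local file path
    (hypothetical helper, not a GDAL API). Returns the copied file names.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    base_name = os.path.basename(dest_path)
    copied = []
    with tempfile.TemporaryDirectory() as tmp_dir:
        # All the small writes happen on the local disk.
        write_func(os.path.join(tmp_dir, base_name))
        # Copy the data file and any sidecar files (e.g. the ENVI .hdr).
        for name in sorted(os.listdir(tmp_dir)):
            shutil.copy(os.path.join(tmp_dir, name), os.path.join(dest_dir, name))
            copied.append(name)
    return copied
```

With GDAL available, and with a made-up share path, it might be called as
save_locally_then_copy(lambda p: gdalnumeric.SaveArray(img, p, 'ENVI'),
r'\\server\share\out.bin').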
Extending /vsicache/ to support write operations could be an interesting
option to solve that kind of issue more generally.
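
As an illustration of the general idea only (this is plain Python, not GDAL
code, and says nothing about how /vsicache/ is or would be implemented), the
sketch below puts a write buffer in front of a backend that counts the write
calls reaching it. The names CountingRaw and count_backend_writes are made up
for the example.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory raw stream that discards data and counts the write calls it receives."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.nbytes = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.nbytes += len(b)
        return len(b)


def count_backend_writes(chunk_size, n_chunks, buffer_size):
    """Write n_chunks chunks through a buffer; return (backend calls, bytes written)."""
    raw = CountingRaw()
    buffered = io.BufferedWriter(raw, buffer_size=buffer_size)
    chunk = bytes(chunk_size)
    for _ in range(n_chunks):
        buffered.write(chunk)
    buffered.flush()
    return raw.calls, raw.nbytes


# 1024 chunks of 512 bytes pushed through a 64 KB buffer.
calls, nbytes = count_backend_writes(512, 1024, 64 * 1024)
print(calls, nbytes)
```

The backend receives far fewer calls than the 1024 small writes made by the
caller, while the total number of bytes is unchanged.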
Even
>
> Thanks
>
> mike
>
> > From: even.rouault at spatialys.com
> > To: gdal-dev at lists.osgeo.org
> > Subject: Re: [gdal-dev] Why is saving an array to ENVI file format using
> > SaveArray so slow?
> > Date: Fri, 24 Jul 2015 21:31:30 +0200
> > CC: mikec7200 at hotmail.com
> >
> > Mike,
> >
> > I've tried the following snippet:
> >
> > from osgeo import gdal, gdalnumeric
> > import numpy
> >
> > ar = numpy.zeros( [128,4000,128], dtype = numpy.float32)
> > gdalnumeric.SaveArray(ar,'testenvi.bin','ENVI')
> > gdalnumeric.SaveArray(ar,'testtif.tif','GTiff')
> >
> > And it runs in ~ 2 seconds on both GDAL 1.11 and trunk. (Linux 64bit, but
> > I'm not sure why the OS would account for such a dramatic difference)
> >
> > Even
> >
> > > Dear GDAL Developers,
> > >
> > > I’m finding that when saving a relatively small array (shape =
> > > 128,4000,128, i.e. 128 bands x 4000 lines x 128 columns of float data =
> > > 256 MB total) to an ENVI file format using the call:
> > >
> > > gdalnumeric.SaveArray(img,strf_out,'ENVI')
> > >
> > > performance is extremely slow (~260 seconds). When I save the same array
> > > to GeoTIFF format using:
> > >
> > > gdalnumeric.SaveArray(img,strf_out,'GTIFF')
> > >
> > > the operation takes ~24 seconds, which is much more acceptable.
> > >
> > > I’m currently using GDAL version 1.11.1.
> > >
> > > My question to the forum is then: why is there such a large discrepancy
> > > between the performance of the two operations, and is there a way to
> > > improve the performance when writing an array to an ENVI file format?
> > >
> > > I have tried changing the GDAL cache options but performance doesn’t
> > > seem to change. When profiling the code which saves the array to GeoTIFF
> > > format I get the output contained in the attached file
> > > timing_savearray_GTIFF.txt. When profiling the code which saves the
> > > array to ENVI format I get the output contained in the attached file
> > > timing_savearray_ENVI.txt.
> > >
> > > I would like to save the data to ENVI file format. Your thoughts on this
> > > matter would be appreciated.
> > >
> > > Thanks
> > >
> > > Regards
> > >
> > > mike
--
Spatialys - Geospatial professional services
http://www.spatialys.com