[gdal-dev] Why is saving an array to ENVI file format using SaveArray so slow?

Even Rouault even.rouault at spatialys.com
Mon Jul 27 11:01:05 PDT 2015


On Monday 27 July 2015 16:23:26 mike c wrote:
> Dear Even and GDAL community,
> 
> Thanks for the feedback.
> 
> After receiving your email I tried to narrow down the source of the
> problem. On further investigation it seems that, as you found, writing to
> my local disk is quick. The problem occurs when writing to file shares on
> a local area network, and specifically to remote shares whose underlying
> file system is UNIX. Writing an image of shape (128,4000,128) with
> SaveArray(img, <remote path>, 'ENVI') from my Windows system to a remote
> UNIX file share takes 3-4 minutes. Writing the same image between the same
> systems in GeoTIFF format is quicker, at ~24 seconds. Using
> ndarray.tofile() to write the same image from the same system to the same
> remote disk takes ~20 seconds. Copying an image of the same size between
> the two systems using Explorer takes only a few seconds. Using
> SaveArray(img, <remote path>, 'ENVI') to write the image to a remote
> Windows file share on the same network also takes only a few seconds.
> 
> Any thoughts as to why this is happening, and whether there is a way of
> overcoming this problem, would be welcomed.

Mike,

Looking more closely at file system access, the GeoTIFF driver writes in 
chunks of 64 KB (128 bands * 128 pixels * sizeof(float), since the default 
interleaving of the GTiff driver is PIXEL), whereas the ENVI driver writes 
in chunks of 512 bytes (128 pixels * sizeof(float), since the default 
interleaving of the ENVI driver is BAND). On Linux, the C library actually 
caches by chunks of 4 KB. That might be the explanation.
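
The arithmetic behind these chunk sizes can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes float32 samples and one write per scanline-sized block, and the variable names are illustrative rather than anything from GDAL itself.

```python
# Image from this thread: 128 bands x 4000 lines x 128 columns, float32.
bands, lines, cols = 128, 4000, 128
bytes_per_sample = 4  # sizeof(float), assuming float32 data

total_bytes = bands * lines * cols * bytes_per_sample

# PIXEL interleaving: one scanline carries all bands.
pixel_chunk = bands * cols * bytes_per_sample
# BAND interleaving: one scanline carries a single band.
band_chunk = cols * bytes_per_sample

# Number of writes needed if each write is exactly one chunk (an assumption).
pixel_writes = total_bytes // pixel_chunk
band_writes = total_bytes // band_chunk

print(pixel_chunk, pixel_writes)  # 65536 4000
print(band_chunk, band_writes)    # 512 512000
```

Under these assumptions the band-interleaved case issues 128 times as many writes for the same amount of data, and each small write can cost a network round trip on a remote share.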

Workarounds:
- write to a local file and then copy it using shutil.copy() or another 
efficient method
- use SaveArray() to a /vsimem/ file and then write it to the final 
destination. See https://svn.osgeo.org/gdal/trunk/autotest/gcore/vsifile.py 
for examples on how to read from /vsimem/
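
The first workaround could be sketched roughly as follows. The helper names save_via_local_copy and save_envi_via_local_copy are hypothetical, not part of GDAL; the sketch assumes the writer may create sidecar files next to the dataset (the ENVI driver writes a header file alongside the data file), so it copies everything produced in the temporary directory.

```python
import os
import shutil
import tempfile


def save_via_local_copy(write_func, dest_dir, filename):
    """Write into a local temporary directory, then copy the results to dest_dir.

    write_func(path) is a caller-supplied callable (hypothetical name) that
    creates the dataset at `path` and possibly sidecar files next to it.
    Returns the sorted list of file names that were copied.
    """
    copied = []
    with tempfile.TemporaryDirectory() as tmp_dir:
        write_func(os.path.join(tmp_dir, filename))
        # Copy every file the writer produced, including any header file.
        for name in sorted(os.listdir(tmp_dir)):
            shutil.copy(os.path.join(tmp_dir, name), os.path.join(dest_dir, name))
            copied.append(name)
    return copied


def save_envi_via_local_copy(array, dest_dir, filename):
    """Usage sketch with GDAL; needs the osgeo Python bindings installed."""
    # Imported lazily so the generic helper above has no GDAL dependency.
    from osgeo import gdalnumeric

    return save_via_local_copy(
        lambda path: gdalnumeric.SaveArray(array, path, "ENVI"),
        dest_dir,
        filename,
    )
```

The many small writes then hit the local disk, and the remote share only sees the larger sequential transfers done by shutil.copy().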

Extending /vsicache/ to support write operations could be an interesting 
option to solve that kind of issue more generally.

Even



> Thanks
> 
> mike
> 
> > From: even.rouault at spatialys.com
> > To: gdal-dev at lists.osgeo.org
> > Subject: Re: [gdal-dev] Why is saving an array to ENVI file format using
> > SaveArray so slow?
> > Date: Fri, 24 Jul 2015 21:31:30 +0200
> > CC: mikec7200 at hotmail.com
> > 
> > Mike,
> > 
> > I've tried the following snippet:
> > 
> > from osgeo import gdal, gdalnumeric
> > import numpy
> > 
> > ar = numpy.zeros( [128,4000,128], dtype = numpy.float32)
> > gdalnumeric.SaveArray(ar,'testenvi.bin','ENVI')
> > gdalnumeric.SaveArray(ar,'testtif.tif','GTiff')
> > 
> > And it runs in ~ 2 seconds on both GDAL 1.11 and trunk. (Linux 64bit, but
> > I'm not sure why the OS would account for such a dramatic difference)
> > 
> > Even
> > 
> > > Dear GDAL Developers,
> > > 
> > > I'm finding that when saving a relatively small array (shape =
> > > 128,4000,128, i.e. 128 bands x 4000 lines x 128 columns of float data,
> > > about 250 MB in total) to ENVI file format using the call:
> > > 
> > > gdalnumeric.SaveArray(img,strf_out,'ENVI')
> > > 
> > > performance is extremely slow (~260 seconds). When I save the same
> > > array to GeoTIFF format using:
> > > 
> > > gdalnumeric.SaveArray(img,strf_out,'GTIFF')
> > > 
> > > the operation takes ~24 seconds, which is much more acceptable.
> > > 
> > > I'm currently using GDAL version 1.11.1.
> > > 
> > > My question to the forum is then: why is there such a large
> > > discrepancy between the performance of the two operations, and is there
> > > a way to improve the performance when writing an array to ENVI file
> > > format?
> > > 
> > > I have tried changing the GDAL cache options, but performance doesn't
> > > seem to change. When profiling the code which saves the array to
> > > GeoTIFF format I get the output contained in the attached file
> > > timing_savearray_GTIFF.txt. When profiling the code which saves the
> > > array to ENVI format I get the output contained in the attached file
> > > timing_savearray_ENVI.txt.
> > > 
> > > I would like to save the data to ENVI file format. Your thoughts on
> > > this matter would be appreciated.
> > > 
> > > Thanks
> > > 
> > > Regards
> > > 
> > > mike

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

