I am processing some raster images and have been struggling with processing times. I have tracked the issue down to GDAL's WriteArray() function. I am cutting the image into blocks and processing it in small chunks. This is to facilitate working with very large images where the entire array cannot be loaded into memory. <div>
<br></div><div>If I call WriteArray() after processing each block, processing takes 3+ hours for a 1 GB image. By collecting the processed blocks (NumPy arrays) in a Python list and then writing several of them out at once, I have cut processing time to under 2 minutes.</div>
<div><br></div><div>What is GDAL doing when WriteArray() is called that requires so much time? Has anyone else encountered this and been able to speed up GDAL's array writing?</div><div><br></div><div>Thanks, Jay</div>
<div><br></div><div>Here is an example in pseudocode:</div><div><br></div><div>open the raster</div><div><br></div><div>iterate through the bands using dataset.GetRasterBand(j)</div><div><br></div><div>iterate through the rows</div>
<div> iterate through the columns</div><div> </div><div> ReadAsArray(a block of rows and columns)</div><div><br></div><div> process the array in NumPy</div><div><br></div><div> # The more often this is called, the slower the entire script runs... why?</div>
<div> write out the array with outdataset.GetRasterBand(j).WriteArray(the modified array, at the x and y offsets of the block)</div>
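<div><br></div><div>The pseudocode above could be sketched in Python roughly as follows. This is only an illustration, assuming the GDAL Python bindings (osgeo) are installed. The names iter_windows and process_raster, the process callback, the GTiff output driver, and the 512 x 512 block size are hypothetical choices and are not taken from the original script.</div>

```python
def iter_windows(xsize, ysize, block_x, block_y):
    """Yield (xoff, yoff, win_xsize, win_ysize) windows that tile the raster.

    Edge windows are clipped so they never extend past the raster bounds.
    """
    for yoff in range(0, ysize, block_y):
        win_ysize = min(block_y, ysize - yoff)
        for xoff in range(0, xsize, block_x):
            win_xsize = min(block_x, xsize - xoff)
            yield xoff, yoff, win_xsize, win_ysize


def process_raster(src_path, dst_path, process, block_x=512, block_y=512):
    """Read each band block by block, apply `process`, and write the result.

    `process` is a hypothetical callback that takes a NumPy array and
    returns an array of the same shape.
    """
    # Imported lazily so iter_windows can be used without GDAL installed.
    from osgeo import gdal

    src = gdal.Open(src_path)
    xsize, ysize = src.RasterXSize, src.RasterYSize

    # Assumption: a GeoTIFF output with the same size, band count, and type.
    driver = gdal.GetDriverByName("GTiff")
    dst = driver.Create(dst_path, xsize, ysize, src.RasterCount,
                        src.GetRasterBand(1).DataType)
    dst.SetGeoTransform(src.GetGeoTransform())
    dst.SetProjection(src.GetProjection())

    for j in range(1, src.RasterCount + 1):  # GDAL band indices start at 1
        in_band = src.GetRasterBand(j)
        out_band = dst.GetRasterBand(j)
        for xoff, yoff, wx, wy in iter_windows(xsize, ysize, block_x, block_y):
            block = in_band.ReadAsArray(xoff, yoff, wx, wy)
            # This is the per-block WriteArray() call in question.
            out_band.WriteArray(process(block), xoff, yoff)
        out_band.FlushCache()

    # Dropping the references closes the datasets and flushes to disk.
    dst = None
    src = None
```

<div>A call such as process_raster("in.tif", "out.tif", lambda a: a * 2) would then write one block per WriteArray() call, which is the pattern described as slow above. The file names here are placeholders.</div>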