[gdal-dev] Re: python downsample API?

K.-Michael Aye kmichael.aye at gmail.com
Wed Apr 11 16:50:23 EDT 2012


On 2012-04-11 19:17:27 +0000, David Shean said:

> Michael,
> Scott is right.  Not sure if this is the preferred approach, but I 
> accomplished this for large datasets by specifying buffer sizes for 
> ReadAsArray.  The doc I consulted is here: 
> http://gdal.org/python/osgeo.gdal_array-module.html#BandReadAsArray.  
> I used masked arrays to exclude nodata values - you may not need to 
> worry about with this.
> -David
> 
> Excerpt from my script:
> 
> src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
> b = src_ds.GetRasterBand(1)
> ndv = b.GetNoDataValue()
> ns = src_ds.RasterXSize
> nl = src_ds.RasterYSize
> 
> #Don't want to load the entire dataset for stats computation
> #This is maximum dimension for reduced resolution array
> max_dim = 1024.
> 
> scale_ns = ns/max_dim
> scale_nl = nl/max_dim
> scale_max = max(scale_ns, scale_nl)
> 
> if scale_max > 1:
>     nl = round(nl/scale_max)
>     ns = round(ns/scale_max)
> 
> #The buf_size parameters determine the final array dimensions
> bm = numpy.ma.masked_equal(numpy.array(b.ReadAsArray(buf_xsize=ns, 
> buf_ysize=nl)), ndv)


Holy excrements. I didn't know about this functionality, most likely 
because I only have one-banded datasets and the ds.ReadAsArray does not 
offer buf_xsize and buf_ysize!
I just tried it out and it's so endlessly fast? My compliments to the 
GDAL team!
And muchas gracias to you and Scott for pointing it out to me.

This will be a happy summer when I finally implement my private fast 
image viewer… ;)

Best regards,
Michael


> 
> On Apr 11, 2012, at 11:17 AM, Scott Arko wrote:
> Hi Michael,
> 
> 
> I may be missing your question, but why aren't you just using 
> ReadAsArray?  It has an option to return a smaller array from the input 
> array.  Now, I'm not sure how it does the resampling (you could look to 
> see), but you can make a call like
> 
> data = 
> banddata.ReadAsArray(0,0,filehandle.RasterXSize,filehandle.RasterYSize,xsize,ysize) 
> 
> 
> where xsize and ysize are smaller than the true RasterXSize or 
> RasterYSize.  I haven't looked at this in a while, but I'm pretty sure 
> this will work.  Did I miss the point of what you were asking?
> 
> 
> Thanks,
> Scott
> 
> 
> On Wed, Apr 11, 2012 at 6:31 AM, K.-Michael Aye <kmichael.aye at gmail.com> wrote:
> Dear all,
> 
> is there a Python API for downsampling a huge dataset?
> What I would like to do:
> 
> * get my dataset
> * read out RasterXSize and RasterYSize
> * calculate how many lines and rows I need to skip to get a quick 
> overview image, e.g. 10 lines to skip.
> * Have a ReadAsArray interface where I can say something like this:
> ** data = ds.ReadAsArray(xoffset, yoffset, 10000, 10000, skipping=10)
> 
> which in numpy terms would give me every 10nth line like this: array[:,:,10]
> 
> I really don't need quality at all, just speed, for a rough overview 
> for further zooming in with lassos, as the images I deal with sometimes 
> have more than 200 MPixels.
> 
> Is this possible in Python?
> I was thinking now, maybe one could use numpy's memmap somehow for 
> this, don't know much about it, though…
> 
> Thanks for any hints!
> 
> Best regards,
> Michael
> 
> 
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
> 
> 
> 
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
> 
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev





More information about the gdal-dev mailing list