[gdal-dev] python downsample API?

David Shean dshean at uw.edu
Wed Apr 11 15:17:27 EDT 2012


Michael,
Scott is right.  Not sure if this is the preferred approach, but I accomplished this for large datasets by specifying buffer sizes for ReadAsArray.  The doc I consulted is here: http://gdal.org/python/osgeo.gdal_array-module.html#BandReadAsArray.  
I used masked arrays to exclude nodata values - you may not need to worry about with this.
-David

Excerpt from my script:

src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
b = src_ds.GetRasterBand(1)
ndv = b.GetNoDataValue()
ns = src_ds.RasterXSize
nl = src_ds.RasterYSize

#Don't want to load the entire dataset for stats computation
#This is maximum dimension for reduced resolution array
max_dim = 1024.

scale_ns = ns/max_dim
scale_nl = nl/max_dim
scale_max = max(scale_ns, scale_nl)

if scale_max > 1:
    nl = round(nl/scale_max)
    ns = round(ns/scale_max)

#The buf_size parameters determine the final array dimensions
bm = numpy.ma.masked_equal(numpy.array(b.ReadAsArray(buf_xsize=ns, buf_ysize=nl)), ndv)


On Apr 11, 2012, at 11:17 AM, Scott Arko wrote:

> Hi Michael,
> 
> 
> I may be missing your question, but why aren't you just using ReadAsArray?  It has an option to return a smaller array from the input array.  Now, I'm not sure how it does the resampling (you could look to see), but you can make a call like
> 
> data = banddata.ReadAsArray(0,0,filehandle.RasterXSize,filehandle.RasterYSize,xsize,ysize)
> 
> where xsize and ysize are smaller than the true RasterXSize or RasterYSize.  I haven't looked at this in a while, but I'm pretty sure this will work.  Did I miss the point of what you were asking?
> 
> 
> Thanks,
> Scott
> 
> 
> On Wed, Apr 11, 2012 at 6:31 AM, K.-Michael Aye <kmichael.aye at gmail.com> wrote:
> Dear all,
> 
> is there a Python API for downsampling a huge dataset?
> What I would like to do:
> 
> * get my dataset
> * read out RasterXSize and RasterYSize
> * calculate how many lines and rows I need to skip to get a quick overview image, e.g. 10 lines to skip.
> * Have a ReadAsArray interface where I can say something like this:
> ** data = ds.ReadAsArray(xoffset, yoffset, 10000, 10000, skipping=10)
> 
> which in numpy terms would give me every 10nth line like this: array[:,:,10]
> 
> I really don't need quality at all, just speed, for a rough overview for further zooming in with lassos, as the images I deal with sometimes have more than 200 MPixels.
> 
> Is this possible in Python?
> I was thinking now, maybe one could use numpy's memmap somehow for this, don't know much about it, though…
> 
> Thanks for any hints!
> 
> Best regards,
> Michael
> 
> 
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
> 
> 
> 
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.osgeo.org/pipermail/gdal-dev/attachments/20120411/edcc4cbf/attachment.html


More information about the gdal-dev mailing list