[gdal-dev] python downsample API?
David Shean
dshean at uw.edu
Wed Apr 11 15:17:27 EDT 2012
Michael,
Scott is right. Not sure if this is the preferred approach, but I accomplished this for large datasets by specifying buffer sizes for ReadAsArray. The doc I consulted is here: http://gdal.org/python/osgeo.gdal_array-module.html#BandReadAsArray.
I used masked arrays to exclude nodata values; you may not need to worry about this.
-David
Excerpt from my script:
import numpy
from osgeo import gdal

src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
b = src_ds.GetRasterBand(1)
ndv = b.GetNoDataValue()
ns = src_ds.RasterXSize
nl = src_ds.RasterYSize
#Don't want to load the entire dataset for stats computation
#This is maximum dimension for reduced resolution array
max_dim = 1024.
scale_ns = ns/max_dim
scale_nl = nl/max_dim
scale_max = max(scale_ns, scale_nl)
if scale_max > 1:
    #Buffer sizes must be integers, so cast after rounding
    nl = int(round(nl/scale_max))
    ns = int(round(ns/scale_max))
#The buf_size parameters determine the final array dimensions
#ReadAsArray already returns a numpy array, so it can be masked directly
bm = numpy.ma.masked_equal(b.ReadAsArray(buf_xsize=ns, buf_ysize=nl), ndv)
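
For anyone who wants to check the size arithmetic without opening a dataset, here is a minimal sketch of the same scaling step as a standalone function. The name reduced_shape and the max(1, ...) guard are my own additions for illustration, not part of GDAL or of the script above; the returned values are what would be passed as buf_xsize and buf_ysize.

```python
def reduced_shape(ns, nl, max_dim=1024):
    """Return (ns, nl) scaled so that neither dimension exceeds max_dim.

    Hypothetical helper: mirrors the scaling logic in the excerpt above.
    The aspect ratio is preserved up to rounding.
    """
    # Scale factor along the larger dimension
    scale_max = max(ns / float(max_dim), nl / float(max_dim))
    if scale_max <= 1:
        # Already small enough, read at full resolution
        return ns, nl
    # Buffer sizes must be integers; never let a dimension drop to zero
    new_ns = max(1, int(round(ns / scale_max)))
    new_nl = max(1, int(round(nl / scale_max)))
    return new_ns, new_nl


# Example: a 20000 x 10000 raster reduced to at most 1024 on a side
buf_xsize, buf_ysize = reduced_shape(20000, 10000)
print(buf_xsize, buf_ysize)
```

With a GDAL band b, the result would then be used as b.ReadAsArray(buf_xsize=buf_xsize, buf_ysize=buf_ysize), as in the last line of the excerpt.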
On Apr 11, 2012, at 11:17 AM, Scott Arko wrote:
> Hi Michael,
>
>
> I may be missing your question, but why aren't you just using ReadAsArray? It has an option to return a smaller array from the input array. Now, I'm not sure how it does the resampling (you could look to see), but you can make a call like
>
> data = banddata.ReadAsArray(0,0,filehandle.RasterXSize,filehandle.RasterYSize,xsize,ysize)
>
> where xsize and ysize are smaller than the true RasterXSize or RasterYSize. I haven't looked at this in a while, but I'm pretty sure this will work. Did I miss the point of what you were asking?
>
>
> Thanks,
> Scott
>
>
> On Wed, Apr 11, 2012 at 6:31 AM, K.-Michael Aye <kmichael.aye at gmail.com> wrote:
> Dear all,
>
> is there a Python API for downsampling a huge dataset?
> What I would like to do:
>
> * get my dataset
> * read out RasterXSize and RasterYSize
> * calculate how many lines and rows I need to skip to get a quick overview image, e.g. 10 lines to skip.
> * Have a ReadAsArray interface where I can say something like this:
> ** data = ds.ReadAsArray(xoffset, yoffset, 10000, 10000, skipping=10)
>
> which in numpy terms would give me every 10th line and column like this: array[::10, ::10]
>
> I really don't need quality at all, just speed, for a rough overview for further zooming in with lassos, as the images I deal with sometimes have more than 200 MPixels.
>
> Is this possible in Python?
> I was thinking now, maybe one could use numpy's memmap somehow for this, don't know much about it, though…
>
> Thanks for any hints!
>
> Best regards,
> Michael
>
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev