<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>Michael,</div><div>Scott is right. Not sure if this is the preferred approach, but I accomplished this for large datasets by specifying buffer sizes for ReadAsArray. The doc I consulted is here: <a href="http://gdal.org/python/osgeo.gdal_array-module.html#BandReadAsArray">http://gdal.org/python/osgeo.gdal_array-module.html#BandReadAsArray</a>. </div><div>I used masked arrays to exclude nodata values - you may not need to worry about this.</div><div>-David</div><div><br></div><div>Excerpt from my script:</div><div><br></div><div>from osgeo import gdal<br>import numpy<br><br>src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)<br>b = src_ds.GetRasterBand(1)<br>ndv = b.GetNoDataValue()<br>ns = src_ds.RasterXSize<br>nl = src_ds.RasterYSize<br><br>#Don't want to load the entire dataset for stats computation</div><div>#This is the maximum dimension for the reduced resolution array<br>max_dim = 1024.0<br><br>scale_ns = ns/max_dim<br>scale_nl = nl/max_dim<br>scale_max = max(scale_ns, scale_nl)<br><br>#Only shrink if the dataset is larger than max_dim; buffer sizes must be integers<br>if scale_max > 1:<br>&nbsp;&nbsp;&nbsp;&nbsp;nl = int(round(nl/scale_max))<br>&nbsp;&nbsp;&nbsp;&nbsp;ns = int(round(ns/scale_max))<br><br>#The buf_xsize/buf_ysize parameters determine the final array dimensions<br>bm = numpy.ma.masked_equal(b.ReadAsArray(buf_xsize=ns, buf_ysize=nl), ndv)</div><div><br></div><div><br></div><div><div>On Apr 11, 2012, at 11:17 AM, Scott Arko wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">Hi Michael,<div><br></div><div><br></div><div>I may be missing your question, but why aren't you just using ReadAsArray? It has an option to return a smaller array from the input array. Now, I'm not sure how it does the resampling (you could look to see), but you can make a call like</div>
<div><br></div><div>data = banddata.ReadAsArray(0,0,filehandle.RasterXSize,filehandle.RasterYSize,xsize,ysize)</div><div><br></div><div>where xsize and ysize are smaller than the true RasterXSize or RasterYSize. I haven't looked at this in a while, but I'm pretty sure this will work. Did I miss the point of what you were asking?</div>
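<div>As a rough sketch of the call above (not taken from either script in this thread), the reduced buffer size can be computed first and then passed to ReadAsArray. The helper names overview_size and read_overview and the max_dim default are made up for illustration; buf_xsize and buf_ysize are the keyword names mentioned elsewhere in this thread. GDAL is imported inside read_overview so the size calculation can be tried without it.</div>

```python
def overview_size(xsize, ysize, max_dim=1024):
    """Return (buf_xsize, buf_ysize) so the larger side is at most max_dim.

    Hypothetical helper: the name and the max_dim default are assumptions.
    """
    scale = max(xsize, ysize) / float(max_dim)
    if scale > 1:
        # Shrink both sides by the same factor to keep the aspect ratio.
        return (max(1, int(round(xsize / scale))),
                max(1, int(round(ysize / scale))))
    # Already small enough: read at full resolution.
    return xsize, ysize


def read_overview(path, max_dim=1024):
    """Read band 1 of `path` as a reduced-resolution array (needs GDAL)."""
    from osgeo import gdal  # lazy import; only needed when actually reading

    ds = gdal.Open(path, gdal.GA_ReadOnly)
    band = ds.GetRasterBand(1)
    buf_x, buf_y = overview_size(ds.RasterXSize, ds.RasterYSize, max_dim)
    # Read the full window, but let GDAL resample into the smaller buffer.
    return band.ReadAsArray(0, 0, ds.RasterXSize, ds.RasterYSize,
                            buf_xsize=buf_x, buf_ysize=buf_y)
```

<div>For example, overview_size(20000, 10000, max_dim=1000) gives (1000, 500), so a 200 MPixel image would come back as a 1000 x 500 array.</div>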
<div><br></div><div><br></div><div>Thanks,</div><div>Scott</div><div><br><br><div class="gmail_quote">On Wed, Apr 11, 2012 at 6:31 AM, K.-Michael Aye <span dir="ltr"><<a href="mailto:kmichael.aye@gmail.com">kmichael.aye@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear all,<br>
<br>
is there a Python API for downsampling a huge dataset?<br>
What I would like to do:<br>
<br>
* get my dataset<br>
* read out RasterXSize and RasterYSize<br>
* calculate how many rows and columns I need to skip to get a quick overview image, e.g. 10 lines to skip.<br>
* Have a ReadAsArray interface where I can say something like this:<br>
** data = ds.ReadAsArray(xoffset, yoffset, 10000, 10000, skipping=10)<br>
<br>
which in numpy terms would give me every 10th row and column like this: array[::10, ::10]<br>
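<br>
The skipping idea can be sketched roughly as follows. Plain Python lists stand in for the image here so the snippet runs without numpy or GDAL; on a 2-D numpy array the same decimation would be array[::step, ::step]. The names skip_step and decimate and the max_dim default are made up for illustration.<br>

```python
import math


def skip_step(xsize, ysize, max_dim=1000):
    """Smallest integer step that keeps both overview sides within max_dim.

    Hypothetical helper: the name and the max_dim default are assumptions.
    """
    return max(1, int(math.ceil(max(xsize, ysize) / float(max_dim))))


def decimate(rows, step):
    """Keep every step-th row and every step-th column of a list of rows."""
    return [row[::step] for row in rows[::step]]


# Tiny 4 x 6 stand-in "image" whose values encode row and column.
grid = [[10 * r + c for c in range(6)] for r in range(4)]
small = decimate(grid, 2)  # keeps rows 0 and 2, columns 0, 2 and 4
```

Note that slicing like this only helps once the data is already in memory; it does not by itself avoid reading the full image.<br>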
<br>
I really don't need quality at all, just speed, for a rough overview for further zooming in with lassos, as the images I deal with sometimes have more than 200 MPixels.<br>
<br>
Is this possible in Python?<br>
I was thinking now, maybe one could use numpy's memmap somehow for this, don't know much about it, though…<br>
<br>
Thanks for any hints!<br>
<br>
Best regards,<br>
Michael<br>
<br>
<br>
_______________________________________________<br>
gdal-dev mailing list<br>
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank">http://lists.osgeo.org/mailman/listinfo/gdal-dev</a><br>
</blockquote></div><br><br><br>
</div>
</blockquote></div><br></body></html>