[gdal-dev] GDAL and numpy, masking arrays

Christopher Barker Chris.Barker at noaa.gov
Wed Nov 12 15:27:15 EST 2008


Armin Burger wrote:
> maybe someone on this list has some experience using numpy in 
> combination with GDAL/Python and could give me some advice.

For further questions about numpy, the numpy list is very helpful.


> difference can be easily calculated after reading both images 
> into an array and substract one from the other.

> The problem is that both images can contain clouds or snow that have 
> predefined pixel values (252,253).

> Does anybody know if this is possible and how to perform it?

yep.

> Looking 
> through the numpy docs I was not able to identify required methods or 
> functions for this. There is something like 'masked arrays' but I have 
> not understood if this could be used for my purpose.

This is exactly the kind of thing masked arrays are for. Yu can create a 
masked array out of your data with something like:

 >>> a
array([ 1,  2,  3,  4,  5,  6,  3,  4, 67,  4,  3,  5,  6,  7])

#a regular array

 >>> import numpy.ma as ma

# create a masked array with the mask set at all elements with a value of 3:
 >>> a = ma.masked_values(a, 3)
 >>> a
masked_array(data = [1 2 -- 4 5 6 -- 4 67 4 -- 5 6 7],
       mask = [False False  True False False False  True False False 
False  True False
  False False],
       fill_value=3)

#another one:
 >>> b = np.array((1,3,4,2,4,7,4,5,23,5,7,3,8,5))
 >>> b = ma.masked_values(b, 3)
 >>> b
masked_array(data = [1 -- 4 2 4 7 4 5 23 5 7 -- 8 5],
       mask = [False  True False False False False False False False 
False False  True
  False False],
       fill_value=3)


add them together:
 >>> c = a+b
 >>> c
masked_array(data = [2 -- -- 6 9 13 -- 9 90 9 -- -- 14 12],
       mask = [False  True  True False False False  True False False 
False  True  True
  False False],
       fill_value=3)


If your values are Floating Point, then Another option would be to 
replace all the "cloud" values with NaN:

 >>> a = np.array((1,2,3,4,5,6,3,4,67,4,3,5,6,7), dtype=np.float)
 >>> a[a==3] = np.nan
 >>> a
array([  1.,   2.,  NaN,   4.,   5.,   6.,  NaN,   4.,  67.,   4.,  NaN,
          5.,   6.,   7.])
 >>> b = np.array((1,3,4,2,4,7,4,5,23,5,7,3,8,5), dtype=np.float)
 >>> b[b==3] = np.nan
 >>> b
array([  1.,  NaN,   4.,   2.,   4.,   7.,   4.,   5.,  23.,   5.,   7.,
         NaN,   8.,   5.])
 >>> a+b
array([  2.,  NaN,  NaN,   6.,   9.,  13.,  NaN,   9.,  90.,   9.,  NaN,
         NaN,  14.,  12.])


-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


More information about the gdal-dev mailing list