[GRASS-dev] Compute mahalanobis distance using Scipy
Glynn Clements
glynn at gclements.plus.com
Fri Jan 30 06:39:56 PST 2015
Paulo van Breugel wrote:
> I would like to compute a raster layer with for each raster cell the
> mahalanobis distance to the centre of the environmental space formed by all
> reference data points (raster cells). In R this can be done as explained
> here [1].
>
> . I would like to do this using python only (no dependency on R). I came up
> with the following, which works, but is very slow. I guess this is because
> it loops over every raster cell to compute the mahalanobis distance? Any
> idea how this can be done faster (I am very new to python, so bear with me
> if I am making stupid mistakes)
The main speed-up will come from "inlining" distance.mahalanobis(),
which is essentially:
delta = u - v
m = np.dot(np.dot(delta, VI), delta)
return np.sqrt(m)
np.dot(v, m) is equivalent to np.sum(v * m,axis=0), and np.dot(u,v) is
equivalent to np.sum(u * v), so the second line above is equivalent to
m = np.sum(np.sum(delta[:,None] * VI * delta[None,:], axis=-1), axis=-1)
Thus, delta can be extended from a 1-D array to 3-D, changing the
result from a scalar to a 2-D array. The calculation of stat_mah then
becomes:
VI = np.linalg.inv(covar)
delta = dat_ref - stat_mean[:,None,None]
m = np.sum(np.sum(delta[:,:,:,None] * delta[:,:,None,:] * VI[None,None,:,:],axis=-1),axis=-1)
stat_mah = np.sqrt(m)
This moves the loops from Python into C, which usually results in a
significant speed increase.
--
Glynn Clements <glynn at gclements.plus.com>
More information about the grass-dev
mailing list