[GRASS-dev] Compute mahalanobis distance using Scipy

Paulo van Breugel p.vanbreugel at gmail.com
Mon Feb 2 06:40:19 PST 2015


On Mon, Feb 2, 2015 at 1:38 PM, Glynn Clements <glynn at gclements.plus.com>
wrote:

>
> Paulo van Breugel wrote:
>
> > > The main speed-up will come from "inlining" distance.mahalanobis(),
> > > which is essentially:
> > >
> > >     delta = u - v
> > >     m = np.dot(np.dot(delta, VI), delta)
> > >     return np.sqrt(m)
> > >
> > > np.dot(v, m) is equivalent to np.sum(v * m,axis=0), and np.dot(u,v) is
> > > equivalent to np.sum(u * v), so the second line above is equivalent to
> > >
> > >     m = np.sum(np.sum(delta[:,None] * VI * delta[None,:], axis=-1),
> > > axis=-1)
> >
> >
> > Smart! I can follow the logic, but I am not sure how to solve the problem
> > below:
> >
> >    Traceback (most recent call last):
> >     File "<stdin>", line 1, in <module>
> >     ValueError: operands could not be broadcast together with shapes
> > (3,1,77,78) (3,3)
> >
> > Which refers to the different dimensions of delta and VI?
>
> The first version of the code (which is quoted above) will only work
> with 1-D arrays. It's just a fairly direct translation of the existing
> distance.mahalanobis() function, given to explain how it can then be
> extened to 3-D arrays (i.e. a 2-D array of 1-D vectors).
>
> The second version:
>
> > >     VI = np.linalg.inv(covar)
> > >     delta = dat_ref - stat_mean[:,None,None]
> > >     m = np.sum(np.sum(delta[:,:,:,None] * delta[:,:,None,:] *
> VI[None,None,:,:],axis=-1),axis=-1)
> > >     stat_mah = np.sqrt(m)
>
> should work with delta being a 3-D array.
>


Yes, but when running the lines above, I am getting the same error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes
(3,77,78,78) (1,1,3,3)

>From what I can see from broadcasting rules, there is mismatch in the last
and fore-last dimensions of VI[None,None,:,:] compared to the other two?

delta[:,:,:,None].shape
(3, 77, 78, 1)

delta[:,:,None,:].shape
(3, 77, 1, 78)

VI[None,None,:,:].shape
(1, 1, 3, 3)

I am really have to get a better grasp of working with these arrays, but in
any case, after a bit trial and error, I came up with the following to
compute m.

 m = np.sum(np.sum(delta[None,:,:,:] * VI[:,:,None,None],axis=1) * delta,
axis=0)


This does, in my limited testing, gives the same result as using the loop
with

pow(distance.mahalanobis(cell_ref, stat_mean, VI),2).


to be sure, does the above makes sense to you?


Paulo










> --
> Glynn Clements <glynn at gclements.plus.com>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/grass-dev/attachments/20150202/124af5bb/attachment.html>


More information about the grass-dev mailing list