[GRASS-dev] Some comments on the v.surf.icw add-on

Hamish hamish_b at yahoo.com
Tue Jan 5 08:17:02 EST 2010


Ben wrote:
> I have just tested v.surf.icw in some detail.

[ http://grass.osgeo.org/wiki/GRASS_AddOns#v.surf.icw ]

> In general, it is a very nice module that produces
> far more realistic results than plain euclidean v.surf.idw

thanks.

> in the right circumstances.

exactly.


> I made some observations and have some questions/comments
> I'd like to share:
> 
> Line 150:
> This produces an error message from the shell if any of the
> 
> env vars does not exist (cosmetics, really):
> 
>   if [ $GIS_FLAG_V -eq 1 ] || [ "$GRASS_VERBOSE" -gt 1 ] ; then

right, there was a code comment on the line above saying that would
happen. now fixed in addons-svn.
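
something like the following quiets the shell by giving the env var a
fallback value when it is unset (an untested sketch, not necessarily the
exact change that went into svn):

  if [ "$GIS_FLAG_V" -eq 1 ] || [ "${GRASS_VERBOSE:-0}" -gt 1 ] ; then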

 
> Line 234:
> 
>   r.cost $VERBOSE -k in=tmp_icw_area_$$ out=cost_site.$NUM coordinate=$EASTING,$NORTHING
>   =
>   r.cost $VERBOSE -k in=tmp_icw_area_$$ output=cost_site.$NUM coordinate=$EASTING,$NORTHING
> 
> Otherwise it no longer works with newer versions of r.cost, where
> the new "outdir=" option makes the abbreviation "out=" ambiguous.

right, fixed in svn.


> Line 239:
> 
> This changes the original input data without any warning or
> documentation about it:

only in what I consider to be a corner case; the intention is to do
just the opposite.

>   # so the divisor exists and the weighting is huge at the exact sample spots
>   # more efficient to reclass to 1?
>   r.mapcalc "cost_site.$NUM = if(cost_site.$NUM == 0, 0.1, cost_site.$NUM)"
> 
> Apparently, this is done to avoid a divison by zero in the
> standard IDW formula (line 246):
> 
>   EXPRESSION="1.0 / pow(cost_site.$NUM $DIVISOR, $FRICTION )"

yes, it is to avoid the DIV0, but more fundamentally it is there to
ensure that if a starting data point falls within a cell, then that
data point will carry (just less than) infinitely more weight than the
other starting points around it.

when a starting point falls inside a cell, the cost to get to that cell
is 0, which makes the inverse distance weight infinite, which makes the
resulting output cell NULL rather than equal to the starting point's
value. So your final output map would have little holes everywhere you
had a starting point. I suppose v.to.rast + r.patch is another way to
solve that.
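
that alternative would go something like this (untested; the map and
column names are just placeholders):

  # burn the exact sample values back in over the interpolated surface
  v.to.rast input=samples output=sample_cells use=attr column=value
  r.patch input=sample_cells,icw_result output=icw_result.patched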

I had not considered the case where the input cost map has real zero-
friction areas in it.


> If so, it needs to be documented. I actually used costs 
> normalized to [0,1] and ran into trouble here. 

I'd need to know a bit more about what you are doing with it to get my
head around a more general solution. (offlist is fine)

0-1 normalized data is not exotic IMO, so I'll entertain thoughts on
how to make this work.

I'm using it with a flat cost map of 1-everywhere to calculate true
distance on a horizontal plane around obstacles, so I had never
considered that zero cost might occur anywhere beyond the starting cells.
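
for reference, that flat cost map is built something like this (sketch
only; the map names are placeholders):

  # friction of 1 everywhere that is passable, NULL over the obstacles
  r.mapcalc "flatcost = if(isnull(obstacles), 1, null())"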


> I realize now that wasn't a good idea, because a movement
> cost of "0" between two spatially distinct locations is physically
> implausible.

it depends on your medium. perhaps physically implausible, but why limit
ourselves to physical applications? or with hi-temp superconductors
coming online..   ..shrug

> However, this behaviour still needs to be documented,
> perhaps advising users to use a minimum cost of "1"?

or just recommend not to set costs of zero.

that 0.1 could be set to, e.g., an order of magnitude smaller than the
cost map's minimum value instead of being hardcoded.
In my case the distances are in thousands of meters, so 0.1 was small
enough.
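
one way to do that would be something like the following (untested
sketch; "costmap" is a placeholder for the input cost map, and it
assumes the map's minimum value is non-zero):

  # take a tenth of the cost map's minimum as the stand-in value
  eval `r.univar -g map=costmap`
  FUDGE=`echo "$min" | awk '{print $1 / 10}'`
  r.mapcalc "cost_site.$NUM = if(cost_site.$NUM == 0, $FUDGE, cost_site.$NUM)"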

 
> Line 246:
> 
> The variable $DIVISOR is never initialized (empty) but
> still used in the expression:
> 
>   EXPRESSION="1.0 / pow(cost_site.$NUM $DIVISOR, $FRICTION )"

that's harmless/obscure/as designed. in bash, if a variable is unset and
you then go and use it, it just expands to a blank, which is what is
intended here. to make it more readable I've explicitly set it to ""
now in svn.
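
i.e. it now reads something like this (paraphrased, not the exact svn
diff):

  DIVISOR=""   # explicitly empty; an unset variable expands to the same blank
  EXPRESSION="1.0 / pow(cost_site.$NUM $DIVISOR, $FRICTION )"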


> In addition, I would like to know where the second 
> IDW formula comes from (-r flag). Any literature
> references?

"Moving Vessel ADCP Measurement of Tidal Streamfunction using Radial
Basis Functions" - Vennell & Beatson, Journal of Physical Oceanography
2006-111.

see section 2 on RBFs and thin plate splines.

@article{vennell2006moving,
  title     = {{Moving vessel acoustic Doppler current profiler measurement of tidal stream function using radial basis functions}},
  author    = {Vennell, R. and Beatson, R.},
  journal   = {Journal of Geophysical Research: Oceans},
  volume    = {111},
  year      = {2006},
  publisher = {American Geophysical Union}
}


there is some nice thin plate spline interpolation method + application
in there.

> When would this be the preferable formula?

to be honest I just threw it in to see how it did. Interpolation is the
art of making stuff up in a plausible way; there is no one correct
method to use for all cases. the best you can do is to choose a method
that minimizes the damage.

What is appropriate or preferable will be case dependent. I suggest you
make a slope map of your output map & view it in nviz + r.profile to get
a handle on how your method is behaving. It's not always obvious by
looking at a colorized map.
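
for example, something like this (the map names are placeholders):

  # slope of the interpolated surface shows up artifacts that a
  # colorized view of the values tends to hide
  r.slope.aspect elevation=icw_result slope=icw_result.slope
  nviz elevation=icw_result color=icw_result.slope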

If you look at a slope map or profile of the output you'll see that 1/d^2
makes for perfectly smooth transitions but also tends to go off wildly at
unconstrained boundaries. 1/d^3 does better at keeping the edges under
control, but the transitions are not as smooth. You could run it twice
and average those two cases with r.series; I'm not sure how a single
1/d^2.5 run would compare.
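
the averaging step would be something like (the two input map names are
placeholders for the friction=2 and friction=3 runs):

  r.series input=icw_friction2,icw_friction3 output=icw_avg method=average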

To maintain confidence in the output map you might also take care to make
another distance-from-all-starting-points cost map and set anything more
than the average distance between sampling stations to be NULL. (That's
why the post_mask= option is there)


IIRC I chose to use ln() instead of log10() for -r because that made it
act in a way more dissimilar to 1/d^n (so more signal to explore), and
it seemed, well, for lack of a better term, more natural. In the
r.volcano addon module I try a number of other decay functions which
could just as easily be dropped in here.


If this all seems a bit loose, it is, but that is the nature and
interesting part of the art of gap-filling IMO. One thing I like about
the above paper is that it looks at how you can decide how much field
sampling you have to do before you can be confident that the
interpolation will be realistic and not just line-between-points voodoo.


Hamish



      

