[GRASS-dev] using rand(x,y) in r.mapcalc (grass7)

Paulo van Breugel p.vanbreugel at gmail.com
Wed Jul 2 23:55:52 PDT 2014


On 03-07-14 03:43, Vaclav Petras wrote:
>
> On Wed, Jul 2, 2014 at 8:15 PM, Glynn Clements
> <glynn at gclements.plus.com> wrote:
>
>     > Shouldn't the seed be generated from, e.g., the OS time,
>     > which would ensure that each run gives a different result?
>
>     No. The reason is to provide reproducibility. Anyone running the same
>     command with the same data should obtain the same result.
>
It is certainly good to be able to reproduce commands. However, I 
think that in most (statistical) software the default / expected 
behaviour is to have a new seed generated automatically at each run. 
In R, for example, you have to set the seed explicitly with the function 
set.seed() if you want reproducible results. I would therefore think 
that most users will expect similar behaviour in GRASS. My personal 
preference would certainly be to have the option to set the seed 
explicitly if you want reproducibility, but to have it generated 
automatically otherwise. But that is just a personal preference.
>
>
> Does the reproducibility go beyond one operating system, compiler or 
> library? I don't think that the first random number is specified by 
> the C language standard. If the results were really reproducible, 
> it would be good for the testing framework, but I'm afraid that they 
> are not (with my limited knowledge about the topic).
>
>     If you want a different result each time, set GRASS_RND_SEED to a
>     different value each time, e.g.
>
>             GRASS_RND_SEED=`date +%N` r.mapcalc "a = rand(0,100)"
>
>     [%N is the nanoseconds portion of the current time; this is a GNU
>     extension.]
>
Perhaps this could be explained like this in the manual page? A far better 
option would be to provide the seed as a normal parameter, so it can be set 
from the GUI or the command line like any other option.
>
> I've heard that this is not enough on powerful computers/clusters, 
> that you also have to use the PID because the nanoseconds might be the 
> same (I think I remember that it was nanoseconds, not seconds).
>
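
A minimal sketch of what combining the nanoseconds with the PID might look like in the shell. The helper name make_seed, the multiplier, and the modulus are only assumptions for illustration (none of this is part of GRASS); the modulus is there on the guess that the seed should stay within a signed 32-bit range.

```shell
# Hypothetical helper (not part of GRASS): derive a seed from time and PID.
make_seed() {
    # %N (nanoseconds) is a GNU date extension; fall back to seconds
    # when the output is not purely numeric.
    ns=$(date +%N)
    case $ns in
        ''|*[!0-9]*) ns=$(date +%s) ;;
    esac
    # Strip leading zeros so the shell does not read the value as octal.
    ns=$(echo "$ns" | sed 's/^0*//')
    # Mix in the PID of the calling shell; the modulus (an assumption)
    # keeps the result within a signed 32-bit range.
    echo $(( (${ns:-0} + $$ * 1000003) % 2147483647 ))
}

seed=$(make_seed)
echo "seed: $seed"
# With GRASS available, the seed would be passed as in the example above:
#   GRASS_RND_SEED=$seed r.mapcalc "a = rand(0,100)"
```

Two shells on the same machine that happen to read the same nanosecond value would still differ in their PID, so they would most likely end up with different seeds.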
>
>     > On a related note, it would be nice to be able to set the seed
>     > (I think there has been such a request before, but not sure
>     > about the answer at that time).
>
>     GRASS_RND_SEED was the answer.
>
>
> I think there should be some possibility of randomization 
> (auto-setting of the seed) built into the modules providing 
> random(ized) results. Perhaps a flag which would turn it on. It could 
> also be an option which would behave like GRASS_RND_SEED but would 
> have one special value for auto-generating the seed. (GRASS_RND_SEED, 
> if present, would override this option.) For the default value of the 
> option, we should ask what the expected behavior of a module giving 
> random results actually is.
Yes, that would be great. As for the default value, see my earlier argument.
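
To make the proposed precedence concrete, here is a rough sketch in shell. The function name resolve_seed and the special value "auto" are hypothetical and only illustrate the idea; no such option exists in the modules at this point, and the way the automatic seed is built is just one possibility.

```shell
# Hypothetical sketch of the proposed seed option; not an existing GRASS API.
# Usage: resolve_seed OPTION_VALUE
resolve_seed() {
    opt=$1
    if [ -n "${GRASS_RND_SEED:-}" ]; then
        # The environment variable, if present, overrides the option.
        echo "$GRASS_RND_SEED"
    elif [ "$opt" = "auto" ]; then
        # Special value: generate a seed from the current time and the PID.
        echo $(( ($(date +%s) % 100000) * 1000 + ($$ % 1000) ))
    else
        # Any other value is used as the seed, giving reproducible results.
        echo "$opt"
    fi
}

resolve_seed 42                          # prints 42 if GRASS_RND_SEED is unset
( GRASS_RND_SEED=7; resolve_seed 42 )    # prints 7: the variable overrides
resolve_seed auto                        # a generated seed, unless the variable is set
```

Whether the option should default to "auto" or to a fixed seed is exactly the open question about the expected default behaviour.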
>
> This would provide a nicer interface in Python, a standard interface 
> on the command line, and the possibility to set it in the GUI (which 
> means the possibility to set it for users who don't use the command 
> line). Moreover, it would provide all users with a way of setting the 
> random seed in the manner which we consider the best according to our 
> knowledge.
Agreed. The current way to set the seed may not be understood by everybody, 
and with all the work going into streamlining the GUI, fairly important 
options like this should also be available through the GUI.
>
> Vaclav
