<div dir="ltr"><div><div><br><br>On Thu, Oct 19, 2017 at 2:31 AM, Daniel Victoria <<a href="mailto:daniel.victoria@gmail.com">daniel.victoria@gmail.com</a>> wrote:<br>><br>> Hi Markus M,<br>><br>> Thanks for your input. But one thing is still confusing me. From what I understood, the multiple comparison problem would arise if I calculated one p-value for all the regressions in a computational region. Say I have a 100 x 100 raster; chances are one of those 10,000 pixels would yield me a significant regression. But in my case, I'm calculating a p-value raster, that is, each pixel has its own p-value and I'm interested in the slopes that have a significant trend (p <= 0.05). Thus each pixel regression is (sort of) independent.<br><br></div>The p-values for each pixel are not really independent because they have been calculated for the same raster series. For example, if you generate a series of random maps and calculate p-values for each pixel of this series, by pure chance some p-values will be very small. This is the multiple comparison problem.<br><br></div>Markus M<br><br><div><div>><br>> In essence, if I generate a raster with regression slope and p-value, and I mask out the areas with high p (above 5%), the slope values in the remaining regions would be significant, correct? What you are saying is that I might be overestimating the area with significant slope?<br>><br>> Cheers and thanks<br>> Daniel<br>><br>> On Wed, Oct 18, 2017 at 6:06 PM Markus Metz <<a href="mailto:markus.metz.giswork@gmail.com">markus.metz.giswork@gmail.com</a>> wrote:<br>>><br>>><br>>><br>>> On Wed, Oct 18, 2017 at 7:19 PM, Daniel Victoria <<a href="mailto:daniel.victoria@gmail.com">daniel.victoria@gmail.com</a>> wrote:<br>>> ><br>>> > I just read on the p-value regression ticket a comment from Markus Metz [1]. If I understood correctly, he mentions that the chance of getting small p-values at random is high and we should do a correction. But this would result in non-significant p-values. 
He concludes that it would be more "appropriate to make prior assumptions about slope, intercept, and effect size, then judge the results according to these prior assumptions".<br>>> ><br>>> > Does this mean that I should not rely on the p-value obtained?<br>>><br>>> Yes and no. The p-value needs to be interpreted correctly. Commonly used thresholds are alpha = 0.05 and alpha = 0.01. That means if p <= alpha, the result is statistically significant. Problems occur if you repeat the test with the same dataset several times:<br>>> <a href="https://en.wikipedia.org/wiki/Multiple_comparisons_problem">https://en.wikipedia.org/wiki/Multiple_comparisons_problem</a><br>>><br>>> In these cases, alpha needs to be corrected in order to decide if a p-value is significant or not. Regarding r.series, millions of repeated tests might be performed (one for each cell in the current computational region). Any standard correction method would thus render pretty much all p-values non-significant. Instead, Bayesian statistics might be a solution.<br>>><br>>> Markus M<br>>><br>>> ><br>>> > Where can I find more information about this? Some colleagues and I are in the process of finishing a paper that applies a regression to annual NDVI data and right now, we are discussing if we should (or not) consider the p-values obtained.<br>>> ><br>>> > Thanks and sorry if this is a bit off topic<br>>> ><br>>> > Cheers<br>>> > Daniel<br>>> ><br>>> > [1] <a href="https://trac.osgeo.org/grass/ticket/2376#comment:3">https://trac.osgeo.org/grass/ticket/2376#comment:3</a><br>>> ><br>>> ><br>>> > On Mon, Oct 16, 2017 at 2:12 PM Daniel Victoria <<a href="mailto:daniel.victoria@gmail.com">daniel.victoria@gmail.com</a>> wrote:<br>>> >><br>>> >> Replying to self, in case it helps anyone.<br>>> >><br>>> >> Solved it by using R and the raster package. 
Here is a Stack Overflow post about it:<br>>> >><br>>> >> <a href="https://stackoverflow.com/questions/20262999/how-to-output-regression-summarye-g-p-value-and-coeff-into-a-rasterbrick">https://stackoverflow.com/questions/20262999/how-to-output-regression-summarye-g-p-value-and-coeff-into-a-rasterbrick</a><br>>> >><br>>> >> Cheers<br>>> >> Daniel<br>>> >><br>>> >> On Wed, Oct 11, 2017 at 10:44 AM Daniel Victoria <<a href="mailto:daniel.victoria@gmail.com">daniel.victoria@gmail.com</a>> wrote:<br>>> >>><br>>> >>> OK, dumb question since I'm a bit (or very) bad at stats.<br>>> >>><br>>> >>> I'm calculating the slope from a series of rasters using r.series. I see that I can also get the t-value and the coefficient of determination. Is there a way to get the p-value for the regression?<br>>> >>><br>>> >>> I've seen that this question has been asked before (in 2012) [1] and it ended with the addition of the t-value calculation in r.series. But I failed to see how the p-value can be obtained.<br>>> >>><br>>> >>> I also found this ticket [2], related to the p-value question.<br>>> >>><br>>> >>> Thanks<br>>> >>> Daniel<br>>> >>><br>>> >>> [1] - <a href="http://osgeo-org.1560.x6.nabble.com/Calculate-p-value-for-regression-slope-in-r-series-td5014228.html">http://osgeo-org.1560.x6.nabble.com/Calculate-p-value-for-regression-slope-in-r-series-td5014228.html</a><br>>> >>><br>>> >>> [2] - <a href="https://trac.osgeo.org/grass/ticket/2376">https://trac.osgeo.org/grass/ticket/2376</a><br>>> >>><br></div></div></div>
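<div><br>The random-map experiment described in the thread (a series of pure-noise maps, one slope test per pixel) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not what r.series computes: the function names <code>slope_p_values</code> and <code>count_significant</code> are made up for this example, the noise is assumed Gaussian with a known standard deviation of 1, and a z-test is therefore used instead of the t-test that an estimated variance would require. The 10,000 pixels and 20 maps are arbitrary, chosen to match the 100 x 100 raster mentioned above.<br><br></div>

```python
import random
from statistics import NormalDist


def slope_p_values(n_pixels, n_maps, seed=0):
    """Two-sided p-values for the per-pixel slope of pure-noise time series.

    Assumption for this sketch: noise is N(0, 1) with KNOWN sigma = 1,
    so the slope can be tested with an exact z-test.
    """
    rng = random.Random(seed)
    t = list(range(n_maps))  # time index of each map in the series
    t_mean = sum(t) / n_maps
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    std_normal = NormalDist()

    p_values = []
    for _ in range(n_pixels):
        # One pixel = one time series with no real trend at all.
        y = [rng.gauss(0.0, 1.0) for _ in t]
        # Least-squares slope; the mean of y cancels because sum(t - t_mean) = 0.
        slope = sum((ti - t_mean) * yi for ti, yi in zip(t, y)) / sxx
        # Under the null hypothesis, slope ~ N(0, sigma^2 / Sxx) with sigma = 1.
        z = slope * sxx ** 0.5
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values


def count_significant(p_values, alpha):
    """Number of pixels whose p-value is at or below alpha."""
    return sum(p <= alpha for p in p_values)


alpha = 0.05
p_values = slope_p_values(n_pixels=10_000, n_maps=20)

uncorrected = count_significant(p_values, alpha)
# Bonferroni correction: divide alpha by the number of tests (pixels).
bonferroni = count_significant(p_values, alpha / len(p_values))

print(f"pixels with p <= {alpha}: {uncorrected} of {len(p_values)}")
print(f"pixels still significant after Bonferroni: {bonferroni}")
```

<div><br>Because none of the simulated pixels has a real trend, every pixel that passes the p <= 0.05 mask is a false positive, and roughly 5% of the pixels are expected to pass. Dividing alpha by the number of pixels removes almost all of them, which also illustrates why such a correction leaves very little significant once the number of cells reaches the millions.<br></div>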