[GRASS-dev] how to reproject a raster map with absolute numbers without losing data

Moritz Lennert mlennert at club.worldonline.be
Fri Dec 2 02:19:23 PST 2016


On 01/12/16 14:31, Markus Metz wrote:
>
>
> On Thu, Dec 1, 2016 at 1:16 PM, Paulo van Breugel
> <p.vanbreugel at gmail.com <mailto:p.vanbreugel at gmail.com>> wrote:
>>
>>
>>
>> On 01-12-16 12:01, Moritz Lennert wrote:
>>>
>>> On 30/11/16 09:40, Paulo van Breugel wrote:
>>>>
>>>>
>>>>> On Tue, Nov 29, 2016 at 5:44 PM, Moritz Lennert
>>>
>>>
>>>>>
>>>>> > An area() function in r.mapcalc would be nice...
>>>>
>>>>
>>>> Would indeed be a nice addition, but it isn't too difficult to compute,
>>>> using:
>>>>
>>>> |g.region rast=afripop10 at PERMANENT|
>>>> |PI=3.1415926536|
>>>> |export `g.region -g`|
>>>> |export `g.proj -j`|
>>>
>>>
>>> This doesn't work for me as the output is
>>>
>>> g.proj -j
>>>
>>> +proj=longlat
>>> +no_defs
>>> +a=6378137
>>> +rf=298.257223563
>>> +towgs84=0.000,0.000,0.000
>>>
>>> and +a is not acceptable as variable name.
>>
>>
>> You are right, that should  be g.proj -g I think

No this does not provide a, but the name of the ellipsoid. I'm not sure 
that you can currently directly parse g.proj output for a...

This works for me:

eval $(g.proj -j | awk '{if(match($1,"=")>0){sub("+", "", $1); print $1}}')

>>
>>>
>>>> |r.mapcalc "rcs = (sin(y()+$nsres/2) - sin(y()-$nsres/2)) * ($ewres *
>>>> $PI/180) * float($a)^2";|
>>>>
>>>
>>> I tested this approach as well. I.e.
>>>
>>> area = above formula
>>> pop_density = pop / area
>>>
>>> reproject pop_density
>>>
>>> pop = pop_density * (ewres*nsres)
>>>
>>> It is faster than the r.in.xyz <http://r.in.xyz> approach. But it
> does not seem to be as precise.
>>>
>>> I still lost about 100.000 inhabitants, and more when I smooth more
> (the "loss" increases from nearest neighbor to bilinear to bicubic).
>
> Note that some of the loss happens when a non-zero cell is bordering a
> NULL cell, unless you use one of the *_f interpolation methods.
>>
>>
>> Just guessing, but an issue I had in the past with layers like Afripop
> is that it may contain cells with extremely high values (a result of the
> inter/extrapolation methods used to create those layers I guess, not
> sure it is an issue for e.g., Wordpop). Depending on the resampling
> method, this could also add to the differences in totals.
>>
>>>
>>> In order to avoid precision issues, I tried with both density by m2
> and density by km2, but results are similar.
>
> Why not per hectare? The target cell size is close to one hectare, it
> could be set to exactly one hectare.
>>
>> Does it make a difference if you use a much higher target resolution?
>
> In theory, a higher target resolution should result in a higher total
> sum, a lower target resolution in a lower total sum.

Why is that if we work in densities ?

>>>
>>>
>>> I don't know which part of the error comes from the use of spheroid
> approximation and which part comes from the resampling at reprojection.
>>
>>
>> It would be interesting to see how results are if using the same
> routine as above (convert to densities, reproject, convert back to
> totals), but using the more precise G_area_of_cell_at_row() function to
> compute the cell areas.
>
> Try the new area() function of r.mapcalc ;-)

Thanks !!

FYI, for all of Madagascar, for pixels of resolution 0:00:02.99988 (i.e. 
+/- 100m at the equator), I get an RMSE of about 42m2 between Paulo's 
formula and area() in r.mapcalc. The error varies between 31 and 52m2, 
with highest values in the North, i.e. the closer you get to the equator.

>
> in the source location
> record absolute count as src_count
> convert absolute count to density per hectare with
> r.mapcalc "pop_dens_ha = 10000.0 * pop_count / area()"
>
> in the target location
> set the region according to r.proj -g
> use reasonable grid geometry with e.g. g.region -pa res=100
> reproject pop_dens_ha with r.proj method=bilinear_f

Actually, using area(), and a 100m target resolution, I get closer to 
the source total with bilinear than with bilinear_f, but in all cases 
the target total is now _above_ the source total:

Source total: 20714000

Target total - Source total:

nn: 22473
bilinear: 2300
bilinear_f: 26873

> now density is equal to the absolute count because density is per
> hectare and cell size is 10000 square meter
> record absolute count as tgt_count
>
> fix any deviations between src_count and tgt_count with
> r.mapcalc "pop_count_calibrated = pop_dens_ha * $src_count / $tgt_count"
>
> this calbration could also be done for administrative units separately

Yes, obviously, by calibrating this way we will get the same total by 
definition. But this means that we postulate that the "error" is 
distributed equally across space...

Thanks to both of you for the very helpful discussion. I'm afraid few 
people dealing with these data think much about this problem.

Moritz


More information about the grass-dev mailing list