[GRASS-dev] how to reproject a raster map with absolute numbers without losing data

Paulo van Breugel p.vanbreugel at gmail.com
Fri Dec 2 02:53:54 PST 2016



On 02-12-16 11:19, Moritz Lennert wrote:
> On 01/12/16 14:31, Markus Metz wrote:
>>
>>
>> On Thu, Dec 1, 2016 at 1:16 PM, Paulo van Breugel
>> <p.vanbreugel at gmail.com <mailto:p.vanbreugel at gmail.com>> wrote:
>>>
>>>
>>>
>>> On 01-12-16 12:01, Moritz Lennert wrote:
>>>>
>>>> On 30/11/16 09:40, Paulo van Breugel wrote:
>>>>>
>>>>>
>>>>>> On Tue, Nov 29, 2016 at 5:44 PM, Moritz Lennert
>>>>
>>>>
>>>>>>
>>>>>> > An area() function in r.mapcalc would be nice...
>>>>>
>>>>>
>>>>> Would indeed be a nice addition, but it isn't too difficult to 
>>>>> compute,
>>>>> using:
>>>>>
>>>>> |g.region rast=afripop10 at PERMANENT|
>>>>> |PI=3.1415926536|
>>>>> |export `g.region -g`|
>>>>> |export `g.proj -j`|
>>>>
>>>>
>>>> This doesn't work for me as the output is
>>>>
>>>> g.proj -j
>>>>
>>>> +proj=longlat
>>>> +no_defs
>>>> +a=6378137
>>>> +rf=298.257223563
>>>> +towgs84=0.000,0.000,0.000
>>>>
>>>> and +a is not acceptable as variable name.
>>>
>>>
>>> You are right, that should  be g.proj -g I think
>
> No this does not provide a, but the name of the ellipsoid. I'm not 
> sure that you can currently directly parse g.proj output for a...

It does to me:

GRASS 7.3.svn (latlon):~ > g.proj -g
name=Lat/Lon
proj=ll
datum=wgs84
a=6378137
es=0.006694379990141316
no_defs=defined
unit=degree
units=degrees
meters=1.0
GRASS 7.3.svn (latlon):~ >


>
> This works for me:
>
> eval $(g.proj -j | awk '{if(match($1,"=")>0){sub("+", "", $1); print 
> $1}}')
>
>>>
>>>>
>>>>> |r.mapcalc "rcs = (sin(y()+$nsres/2) - sin(y()-$nsres/2)) * ($ewres *
>>>>> $PI/180) * float($a)^2";|
>>>>>
>>>>
>>>> I tested this approach as well. I.e.
>>>>
>>>> area = above formula
>>>> pop_density = pop / area
>>>>
>>>> reproject pop_density
>>>>
>>>> pop = pop_density * (ewres*nsres)
>>>>
>>>> It is faster than the r.in.xyz <http://r.in.xyz> approach. But it
>> does not seem to be as precise.
>>>>
>>>> I still lost about 100.000 inhabitants, and more when I smooth more
>> (the "loss" increases from nearest neighbor to bilinear to bicubic).
>>
>> Note that some of the loss happens when a non-zero cell is bordering a
>> NULL cell, unless you use one of the *_f interpolation methods.
>>>
>>>
>>> Just guessing, but an issue I had in the past with layers like Afripop
>> is that it may contain cells with extremely high values (a result of the
>> inter/extrapolation methods used to create those layers I guess, not
>> sure it is an issue for e.g., Wordpop). Depending on the resampling
>> method, this could also add to the differences in totals.
>>>
>>>>
>>>> In order to avoid precision issues, I tried with both density by m2
>> and density by km2, but results are similar.
>>
>> Why not per hectare? The target cell size is close to one hectare, it
>> could be set to exactly one hectare.
>>>
>>> Does it make a difference if you use a much higher target resolution?
>>
>> In theory, a higher target resolution should result in a higher total
>> sum, a lower target resolution in a lower total sum.
>
> Why is that if we work in densities ?
>
> I don't know which part of the error comes from the use of spheroid
>> approximation and which part comes from the resampling at reprojection.
>>>
>>>
>>> It would be interesting to see how results are if using the same
>> routine as above (convert to densities, reproject, convert back to
>> totals), but using the more precise G_area_of_cell_at_row() function to
>> compute the cell areas.
>>
>> Try the new area() function of r.mapcalc ;-)
>
> Thanks !!
>
> FYI, for all of Madagascar, for pixels of resolution 0:00:02.99988 
> (i.e. +/- 100m at the equator), I get an RMSE of about 42m2 between 
> Paulo's formula and area() in r.mapcalc. The error varies between 31 
> and 52m2, with highest values in the North, i.e. the closer you get to 
> the equator.
>
>>
>> in the source location
>> record absolute count as src_count
>> convert absolute count to density per hectare with
>> r.mapcalc "pop_dens_ha = 10000.0 * pop_count / area()"
>>
>> in the target location
>> set the region according to r.proj -g
>> use reasonable grid geometry with e.g. g.region -pa res=100
>> reproject pop_dens_ha with r.proj method=bilinear_f
>
> Actually, using area(), and a 100m target resolution, I get closer to 
> the source total with bilinear than with bilinear_f, but in all cases 
> the target total is now _above_ the source total:
>
> Source total: 20714000
>
> Target total - Source total:
>
> nn: 22473
> bilinear: 2300
> bilinear_f: 26873
>
>> now density is equal to the absolute count because density is per
>> hectare and cell size is 10000 square meter
>> record absolute count as tgt_count
>>
>> fix any deviations between src_count and tgt_count with
>> r.mapcalc "pop_count_calibrated = pop_dens_ha * $src_count / $tgt_count"
>>
>> this calbration could also be done for administrative units separately
>
> Yes, obviously, by calibrating this way we will get the same total by 
> definition. But this means that we postulate that the "error" is 
> distributed equally across space...
>
> Thanks to both of you for the very helpful discussion. I'm afraid few 
> people dealing with these data think much about this problem.
>
> Moritz



More information about the grass-dev mailing list