[GRASS-dev] how to reproject a raster map with absolute numbers without losing data
Moritz Lennert
mlennert at club.worldonline.be
Mon Nov 28 08:51:24 PST 2016
Hi,
In discussions with students the following problem has arisen, and I'm
not sure how to respond to this:
When reprojecting raster data, one has to chose a method of
interpolation. However, when the data you have represents absolute
numbers (such as population grids), you do not wish to interpolate. What
is needed is a technique to redistribute all units (the counts) into a
new partition of space. This is implement in r.resamp.stats, but is not
possible with r.proj.
As an example, I took the 2010 Worldpop population grid [1] of
Madagascar. The data is described as:
SPATIAL RESOLUTION: 0.000833333 decimal degrees (approx 100m at the equator)
PROJECTION: Geographic, WGS84
I import the data into a epsg:4326 location and calculate the sum of the
pixel values, i.e. the total population:
r.univar worldpop_madagascar_2010
total null and non-null cells: 143500959
total null cells: 70052410
Of the non-null cells:
----------------------
n: 73448549
minimum: 0
maximum: 1163.43
range: 1163.43
mean: 0.282021
mean of absolute values: 0.282021
standard deviation: 4.3961
variance: 19.3257
variation coefficient: 1558.79 %
sum: 20714000.1804286
I then reproject the data to UTM 38S (epsg:32738), by first estimating
the correct region settings with
r.proj -g location=LL_WGS84 mapset=mlennert input=worldpop_madagascar_2010
which gives me
n=8678821.71318899 s=7156445.80116155 w=302657.5841309
e=1098038.55818598 rows=16387 cols=8757
I then set the region accordingly in order to stay close to the original
grid and reproject
g.region n=8678821.71318899 s=7156445.80116155 w=302657.5841309
e=1098038.55818598 rows=16387 cols=8757
r.proj location=LL_WGS84 mapset=mlennert input=worldpop_madagascar_2010
When I then run r.univar on the resulting map, I get:
r.univar worldpop_madagascar_2010 100%
total null and non-null cells: 143500959
total null cells: 73263298
Of the non-null cells:
----------------------
n: 70237661
minimum: 0
maximum: 1163.43
range: 1163.43
mean: 0.281982
mean of absolute values: 0.281982
standard deviation: 4.37828
variance: 19.1693
variation coefficient: 1552.68 %
sum: 19805754.8529859
Madagascar has just lost 908,246 inhabitants !
Note also that while the total number of cells is equal between the two,
the number of non-null cells is lower after reprojection.
Also note that the new region has resolutions:
nsres: 92.9014409
ewres: 90.82802033
while g.region -m in the lat-long location gives me:
nsres=92.24099389
ewres=87.22776965
When I use these resolution settings in the UTM location:
g.region n=8678821.71318899 s=7156445.80116155 w=302657.5841309
e=1098038.55818598 nsres=92.24099389 ewres=87.22776965 -a
I get the following result:
r.univar wp_mdg_2010
100%
total null and non-null cells: 150525600
total null cells: 76865463
Of the non-null cells:
----------------------
n: 73660137
minimum: 0
maximum: 1163.43
range: 1163.43
mean: 0.282092
mean of absolute values: 0.282092
standard deviation: 4.38898
variance: 19.2632
variation coefficient: 1555.87 %
sum: 20778920.5206973
i.e. a total sum and a number of non-null cells much closer to those of
the original map.
Two questions out of this (and maybe a potential bug report);
- How does r.proj determine the region settings. These seem quite "off"
at least in what I would expect. How does g.region -m do it ? Could
r.proj just use that procedure ? (BTW, reprojecting with a simple 'Save
as' in QGIS, I get a pixel size of 88.8198,-91.3023, and using zonal
statistiques on the entire country, I get exactly the same population be
in lat-long or in UTM...)
- Would it be possible (desirable?) to implement some equivalent of
r.resamp.stats, ideally with the -w flag for weighting the sum.
Or is there something I fundamentally misunderstand ?
Moritz
P.S. I've tried to summarise the issue on a the wiki page about the
computational region [2]. If anyone wants to add to that, please feel free.
[1] http://www.worldpop.org.uk/
[2]
https://grasswiki.osgeo.org/wiki/Computational_region#Understanding_the_impact_of_region_settings
More information about the grass-dev
mailing list