[GRASS-dev] how to reproject a raster map with absolute numbers without losing data

Moritz Lennert mlennert at club.worldonline.be
Tue Nov 29 08:44:47 PST 2016



Le 29 novembre 2016 15:10:11 GMT+01:00, Markus Metz <markus.metz.giswork at gmail.com> a écrit :
>On Mon, Nov 28, 2016 at 5:51 PM, Moritz Lennert <
>mlennert at club.worldonline.be> wrote:
>>
>> Hi,
>>
>> In discussions with students the following problem has arisen, and
>I'm
>not sure how to respond to this:
>>
>> When reprojecting raster data, one has to chose a method of
>interpolation. However, when the data you have represents absolute
>numbers
>(such as population grids), you do not wish to interpolate. What is
>needed
>is a technique to redistribute all units (the counts) into a new
>partition
>of space. This is implement in r.resamp.stats, but is not possible with
>r.proj.
>>
>> As an example, I took the 2010 Worldpop population grid [1] of
>Madagascar. The data is described as:
>>
>> SPATIAL RESOLUTION: 0.000833333 decimal degrees (approx 100m at the
>equator)
>> PROJECTION: Geographic, WGS84
>>
>> I import the data into a epsg:4326 location and calculate the sum of
>the
>pixel values, i.e. the total population:
>>
>> r.univar worldpop_madagascar_2010
>>
>> total null and non-null cells: 143500959
>> total null cells: 70052410
>>
>> Of the non-null cells:
>> ----------------------
>> n: 73448549
>> minimum: 0
>> maximum: 1163.43
>> range: 1163.43
>> mean: 0.282021
>> mean of absolute values: 0.282021
>> standard deviation: 4.3961
>> variance: 19.3257
>> variation coefficient: 1558.79 %
>> sum: 20714000.1804286
>>
>> I then reproject the data to UTM 38S (epsg:32738), by first
>estimating
>the correct region settings with
>>
>> r.proj -g location=LL_WGS84 mapset=mlennert
>input=worldpop_madagascar_2010
>>
>> which gives me
>>
>> n=8678821.71318899 s=7156445.80116155 w=302657.5841309
>e=1098038.55818598
>rows=16387 cols=8757
>>
>> I then set the region accordingly in order to stay close to the
>original
>grid and reproject
>>
>> g.region n=8678821.71318899 s=7156445.80116155 w=302657.5841309
>e=1098038.55818598 rows=16387 cols=8757
>> r.proj location=LL_WGS84 mapset=mlennert
>input=worldpop_madagascar_2010
>>
>> When I then run r.univar on the resulting map, I get:
>>
>> r.univar worldpop_madagascar_2010 100%
>> total null and non-null cells: 143500959
>> total null cells: 73263298
>>
>> Of the non-null cells:
>> ----------------------
>> n: 70237661
>> minimum: 0
>> maximum: 1163.43
>> range: 1163.43
>> mean: 0.281982
>> mean of absolute values: 0.281982
>> standard deviation: 4.37828
>> variance: 19.1693
>> variation coefficient: 1552.68 %
>> sum: 19805754.8529859
>>
>> Madagascar has just lost 908,246 inhabitants !
>
>You would need to convert absolute numbers to densities before
>reprojection. Note that in latlong, cells have different sizes when
>expressed in square meters.

Yes. Do we have any existing module to calculate area by pixel ? An area() function in r.mapcalc would be nice...

>>
>> Note also that while the total number of cells is equal between the
>two,
>the number of non-null cells is lower after reprojection.
>
>For illustration, you could create a vector grid in the source
>location,
>then reproject the vector grid to the target location to see how cells
>would be differently reprojected in different parts of the target
>computational region.

That would be a nice pedagogical tool, yes. 

>>
>> Also note that the new region has resolutions:
>>
>> nsres:      92.9014409
>> ewres:      90.82802033
>>
>> while g.region -m in the lat-long location gives me:
>>
>> nsres=92.24099389
>> ewres=87.22776965
>>
>> When I use these resolution settings in the UTM location:
>>
>> g.region n=8678821.71318899 s=7156445.80116155 w=302657.5841309
>e=1098038.55818598 nsres=92.24099389 ewres=87.22776965 -a
>>
>> I get the following result:
>>
>> r.univar wp_mdg_2010
>>  100%
>> total null and non-null cells: 150525600
>> total null cells: 76865463
>>
>> Of the non-null cells:
>> ----------------------
>> n: 73660137
>> minimum: 0
>> maximum: 1163.43
>> range: 1163.43
>> mean: 0.282092
>> mean of absolute values: 0.282092
>> standard deviation: 4.38898
>> variance: 19.2632
>> variation coefficient: 1555.87 %
>> sum: 20778920.5206973
>>
>> i.e. a total sum and a number of non-null cells much closer to those
>of
>the original map.
>>
>> Two questions out of this (and maybe a potential bug report);
>>
>> - How does r.proj determine the region settings. These seem quite
>"off"
>at least in what I would expect.
>
>What would you expect?

I guess something that preserves as much as possible the number of pixels ? But yes I'm conscious that that's probably not a generalizable criterion...

 r.proj walks along the border of the input
>raster
>map and reprojects the border coordinates for each cell. Target e,w,n,s
>are
>the repsective maxima and minima of the reprojected coordinates. The
>resolution is simply e.g. (n - s) / rows.

>
>> How does g.region -m do it ?
>
>g.region -m uses the mean of the edge lengths (western, eastern,
>northern,
>southern edge) divided by rows or cols.
>
>> Could r.proj just use that procedure ?
>
>r.proj -g is more accurate than g.region -m because r.proj walks along
>the
>borders. The purpose of r.proj -g is to figure out extents that cover
>the
>whole input image.


OK, thanks for the explanation.

>
>> (BTW, reprojecting with a simple 'Save as' in QGIS, I get a pixel
>size of
>88.8198,-91.3023, and using zonal statistiques on the entire country, I
>get
>exactly the same population be in lat-long or in UTM...)
>
>Is QGIS using gdalwarp internally?

Don't know....

>>
>> - Would it be possible (desirable?) to implement some equivalent of
>r.resamp.stats, ideally with the -w flag for weighting the sum.
>
>IMHO, the -w flag does not make sense because cell borders are no
>longer
>rectangles after reprojection, they are rather something like
>trapezoids.


And why does that imply that -w doesn't make sense ?  

Thanks for the explanation !!

Moritz


More information about the grass-dev mailing list