[GRASS5] v.univar

Radim Blazek blazek at itc.it
Sat Jul 3 06:27:01 EDT 2004


On Friday 02 July 2004 19:52, Roger Bivand wrote:
> On Fri, 2 Jul 2004, Radim Blazek wrote:
> > I have written v.univar. I am not sure how to calculate statistics
> > for lines and areas, does the code below make sense?
>
> Briefly, no. You are calculating weighted means, weighting by line length
> or area surface size. I think it would be better to treat each line or
> area as a discrete, unweighted, unit unless some reason to the contrary is
> given, just like points/sites. 

Then v.univar and v.to.rast + r.univar will give completely different
results for the same data. Is that correct?

What should the 'unit' be: one geometry element in the map, or one category
(one record in the table)? Both fail in some cases, I think.
1) unit = geometry element
   One town, e.g. Bergen, is composed of several isolated areas (land + islands).
   All those areas share the same category and database record
   (town name, number of inhabitants). If I take each island (geometry element)
   as one 'unit' and calculate the mean number of inhabitants of the cities in
   Norway, the result is wrong, I think. The right approach in this case is to
   take one category as one unit.
2) unit = category
   Map of public lighting: each point is one light, but only two types of
   lights are installed, so there are only 2 categories and 2 records
   in the table (type, price). If I want the mean price of the installed
   lights and I use the category as the unit, the result is wrong again
   (the mean of 2 prices, not of all lights).
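
To make the two cases concrete, here is a minimal sketch in Python. It is not v.univar code; the function names and all numbers are invented for illustration. Each element is a (category, value) pair, and all elements of one category are assumed to share one table record.

```python
# Hypothetical data: each tuple is (category, value) for one geometry element.
# All values are invented for illustration only.

def mean_per_geometry(elements):
    """Mean with every geometry element counted once."""
    values = [value for _cat, value in elements]
    return sum(values) / len(values)

def mean_per_category(elements):
    """Mean with every category (table record) counted once."""
    by_cat = {}
    for cat, value in elements:
        by_cat[cat] = value  # elements of one category share one record
    return sum(by_cat.values()) / len(by_cat)

# Case 1: towns. Town 1 consists of three areas sharing one record.
towns = [(1, 200000), (1, 200000), (1, 200000), (2, 50000)]
# Case 2: lights. Five lights of type 1 (price 100), one of type 2 (price 400).
lights = [(1, 100)] * 5 + [(2, 400)]

print(mean_per_geometry(towns))   # 162500.0 (town 1 counted three times)
print(mean_per_category(towns))   # 125000.0
print(mean_per_geometry(lights))  # 150.0
print(mean_per_category(lights))  # 250.0 (ignores how many lights exist)
```

With these numbers, the per-category mean is the sensible one for the towns and the per-geometry mean is the sensible one for the lights, which is why neither unit works as the only choice.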

> It is probably more important to handle
> missing data gracefully than weight the means or other statistics, I
> think. 

What precisely is 'missing data', and what is 'gracefully'?
Currently only non-NULL values are used in the calculation,
and the number of missing records and the number of NULL values are
reported at the end. Is that sufficient?
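
The current behaviour can be sketched roughly as follows. This is an illustration in Python, not the module's code: `univar_mean`, `cats` and `table` are hypothetical names, and None stands in for a database NULL.

```python
# Hypothetical inputs: `cats` holds the category of each geometry element,
# `table` maps category -> value, with None standing for a database NULL.

def univar_mean(cats, table):
    """Return (mean of non-NULL values, missing-record count, NULL count)."""
    values = []
    n_missing = 0  # elements whose category has no record in the table
    n_null = 0     # elements whose record exists but holds NULL
    for cat in cats:
        if cat not in table:
            n_missing += 1
        elif table[cat] is None:
            n_null += 1
        else:
            values.append(table[cat])
    # No usable values at all: report no mean instead of dividing by zero.
    mean = sum(values) / len(values) if values else None
    return mean, n_missing, n_null

mean, n_missing, n_null = univar_mean([1, 2, 3, 4], {1: 10.0, 2: None, 3: 30.0})
print(mean, n_missing, n_null)  # 20.0 1 1
```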

> There may be reasons to weight sometimes, but most often I see
> ratios or rates of two variables, rather than of a single variable and
> length or area.

It could be an option:
1) unit=category (default?)
2) unit=geometry (default?)
3) weighted by area/length
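
For option 3, a minimal sketch of the difference between the weighted and the unweighted mean, again in Python with invented numbers and hypothetical function names. Each pair is (value, area); for lines the weight would be the length instead.

```python
# Hypothetical areas: (value, weight) pairs, where the weight is the area
# size (or the line length). Numbers are invented for illustration.

def unweighted_mean(pairs):
    """Every area counts once, regardless of its size."""
    return sum(v for v, _w in pairs) / len(pairs)

def weighted_mean(pairs):
    """Every area counts in proportion to its weight."""
    total_weight = sum(w for _v, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight

areas = [(10.0, 1.0), (20.0, 3.0)]
print(unweighted_mean(areas))  # 15.0
print(weighted_mean(areas))    # 17.5
```

As far as I understand, rasterizing the areas and computing cell statistics behaves roughly like the weighted case, because each area contributes about as many cells as its size allows. That would explain why v.to.rast + r.univar and an unweighted v.univar disagree.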

Radim



