[GRASS-dev] v.univar question: Why not lines and areas?

Tue Jan 29 20:43:26 EST 2008

On Jan 29, 2008, at 5:12 PM, Moritz Lennert wrote:

> On 28/01/08 16:22, Michael Barton wrote:
>> On Jan 28, 2008, at 5:50 AM, Moritz Lennert wrote:
>>> On 27/01/08 20:30, Michael Barton wrote:
>>>> v.univar only works with points. But since it is calculating
>>>> stats on a field in the attributes table, it should work the same
>>>> for all vector objects. Can we get rid of the limitation that it
>>>> only works with points?
>>> There was some debate [1] about the statistical validity of working
>>> with the other types, as the way it was programmed, the statistics
>>> were calculated with weights which corresponded to line length /
>>> area surface.
>>> I guess we might want to distinguish between a v.univar which works
>>> on the actual vector objects and a v.db.univar which works on any
>>> arbitrary attribute (or combination of attributes). We could write
>>> a C replacement of the current v.db.univar script on the basis of
>>> the code I have for the classification algorithms used in v.class.
>> AFAICT, v.univar does not calculate anything from vector topology,
>> only from an attribute column.
> [...]
>> An attribute is the same whether it's linked to a point, line, or
>> area.
>
> v.univar currently calculates as follows for lines and areas, even  
> though the results are never printed (main.c):
>
> [lines:]
>     l = Vect_line_length ( Points );
>     sum += l*val;
>     sumsq += l*val*val;
>     sum_abs += l * fabs (val);
>     total_size += l;
>
> [areas:]
>     a = Vect_get_area_area ( &Map, area );
>     sum += a*val;
>     sumsq += a*val*val;
>     sum_abs += a * fabs (val);
>     total_size += a;
>
>     if ( (otype & GV_LINES) || (otype & GV_AREA) ) {
>         mean = sum / total_size;
>         mean_abs = sum_abs / total_size;
>
> So the mean is actually a weighted mean, with the line length or area
> as weight. I don't really know why Radim coded it like this at the
> time, and I think we should change this so that it just uses
> unweighted feature counts, just as Roger suggested at the time. Try
> the attached (untested) patch.
>
> One thing that does potentially matter, though, is whether to use the
> features or the attribute columns as a base. If you have several
> features with the same cat value, this can make a difference: in the
> former case they will all be counted individually, whereas in the
> latter case they will only be counted once. If each of the features
> has an individual meaning, then the former case seems more correct,
> but if not (e.g. each island of the Philippines counted separately in
> a table which lists population by country), the latter does.
> Obviously we could just say that it is up to the user to make sure
> that the map data is correct, i.e. in the above example, there
> should be only one centroid linked to data per country.
>
> The way the routines are written in v.class, they take an arbitrary
> array of floats, so it is up to the individual modules to decide how
> to create this array.
>

This is all very interesting. It is a bit worrisome too. I don't want  
a mean of an attribute column weighted by area unless I specifically  
ask for it. This suggests that people using v.univar may not be  
getting what they think they are getting. I think it is an excellent  
option, but should not be a silent default.
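
To make the difference concrete, here is a rough sketch of the two means side by side. It is not the v.univar code: the function names weighted_mean and unweighted_mean are hypothetical, and the arrays simply stand in for the attribute values and the line lengths or area sizes that the quoted main.c accumulates.

```c
#include <stddef.h>
#include <math.h>

/* Size-weighted mean: sum(size_i * val_i) / sum(size_i).
 * This mirrors the sum / total_size accumulation quoted above.
 * Hypothetical helper, not part of the GRASS API. */
double weighted_mean(const double *val, const double *size, size_t n)
{
    double sum = 0.0, total_size = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        sum += size[i] * val[i];
        total_size += size[i];
    }
    /* No features (or zero total size): the mean is undefined. */
    return total_size > 0.0 ? sum / total_size : NAN;
}

/* Plain mean: every feature counts once, whatever its size.
 * Hypothetical helper, not part of the GRASS API. */
double unweighted_mean(const double *val, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += val[i];
    return n > 0 ? sum / (double)n : NAN;
}
```

With values {10, 20, 30} and sizes {1, 1, 8}, the weighted mean is 270 / 10 = 27 while the plain mean is 60 / 3 = 20, so one large area is enough to pull the result well away from what a reader of the attribute table would expect.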

How to count the features is a bit of an issue, but couldn't this be  
left up to the user too--summarize by cat or by individual feature as  
an option?
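
A sketch of what such an option would have to choose between, again with hypothetical names (mean_per_feature, mean_per_cat) and under the assumption that features sharing a cat also share the attribute value, since the value comes from a single table row:

```c
#include <stddef.h>
#include <math.h>

/* Mean over features: each feature contributes once, so a category
 * with several features is counted several times.
 * Hypothetical helper, not part of the GRASS API. */
double mean_per_feature(const double *val, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += val[i];
    return n > 0 ? sum / (double)n : NAN;
}

/* Mean over categories: only the first feature seen for each cat
 * contributes. The O(n^2) duplicate scan keeps the sketch short;
 * a real module would sort or hash the cats instead. */
double mean_per_cat(const int *cat, const double *val, size_t n)
{
    double sum = 0.0;
    size_t count = 0, i, j;

    for (i = 0; i < n; i++) {
        int seen = 0;

        for (j = 0; j < i; j++) {
            if (cat[j] == cat[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            sum += val[i];
            count++;
        }
    }
    return count > 0 ? sum / (double)count : NAN;
}
```

For cats {1, 1, 1, 2} with values {90, 90, 90, 10} (three islands of one country plus a second country), the per-feature mean is 280 / 4 = 70 and the per-cat mean is 100 / 2 = 50.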

Michael

____________________
C. Michael Barton, Professor of Anthropology
Director of Graduate Studies
School of Human Evolution & Social Change
Center for Social Dynamics & Complexity
Arizona State University

Phone: 480-965-6262
Fax: 480-965-7671
www: <www.public.asu.edu/~cmbarton>