[GRASS-dev] matplotlib example script

Glynn Clements glynn at gclements.plus.com
Fri Jul 25 14:59:29 EDT 2008


Michael Barton wrote:

> > Even if it takes just as long, you're less likely to have it fail
> > because "plotlist" consumed all available RAM. As it stands, plotlist
> > will have one entry for every non-null cell in the raster.
> >
> > When processing "bulk" data, anything that uses a fixed amount of
> > memory (e.g. one integer per bin) is preferable to using memory
> > proportional to the size of the input.
> >
> > Hence the recommendation to iterate over the lines of r.stats' output
> > rather than read it all into a list then iterate over the list.
> 
> To do a histogram, I need to send ax.hist a list of values. So I don't  
> know how I can get away without creating that list unless I use a  
> completely different algorithm (something from numpy?).

Don't use axes.hist(), use axes.bar(). I.e. leave the calculations to
GRASS, and only use matplotlib for plotting.
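
A minimal sketch of that approach, assuming an integer map and that "r.stats -c -n" prints one "value count" pair per line. The function names (count_lines, rstats_lines, plot_counts) are made up for illustration; only the per-category counts are ever held in memory, never one entry per cell.

```python
import subprocess


def count_lines(lines):
    """Accumulate {cell value: cell count} from 'value count' lines, one line at a time."""
    counts = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or fields[0] == "*":
            continue  # skip blank/malformed lines and the null ("*") category
        value, count = int(fields[0]), int(fields[1])
        counts[value] = counts.get(value, 0) + count
    return counts


def rstats_lines(mapname):
    """Yield r.stats output line by line (assumes a running GRASS session)."""
    proc = subprocess.Popen(
        ["r.stats", "-c", "-n", "input=" + mapname],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    for line in proc.stdout:
        yield line
    proc.wait()


def plot_counts(counts, filename):
    """Draw the pre-computed counts as bars; matplotlib does no binning here."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    values = sorted(counts)
    fig, ax = plt.subplots()
    ax.bar(values, [counts[v] for v in values], width=1.0)
    ax.set_xlabel("cell value")
    ax.set_ylabel("cell count")
    fig.savefig(filename)
```

Usage would be something like plot_counts(count_lines(rstats_lines("mymap")), "hist.png"), where "mymap" is a placeholder map name.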

ax.hist() and numpy.histogram() are broken by design. They should
accept an iterator as an argument. Requiring the entire data set to be
passed as a list makes them useless for large amounts of data.
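
For illustration, a histogram function that accepts an iterator might look roughly like the sketch below. It is not part of matplotlib or numpy; the name streaming_histogram and its parameters are hypothetical, and it assumes the value range is known in advance. Memory use is one integer per bin, regardless of how many values the iterator yields.

```python
def streaming_histogram(values, lo, hi, nbins):
    """Count values from any iterable into nbins equal-width bins over [lo, hi]."""
    counts = [0] * nbins
    width = (hi - lo) / float(nbins)
    for v in values:
        if v < lo or v > hi:
            continue  # ignore values outside the requested range
        # the upper edge (v == hi) is placed in the last bin
        i = min(int((v - lo) / width), nbins - 1)
        counts[i] += 1
    return counts
```

The resulting list of counts can be handed straight to axes.bar().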

If we were to impose a requirement that maps must fit into memory,
writing GRASS modules would be significantly simpler. GRASS would also
be significantly less useful.

> On the other hand, in spite of recent improvements, d.hist is still  
> pretty ugly, with formatting issues like varying font sizes on a  
> single axis. And it is not very flexible. It would be nice to see  
> where standard deviations lie, customize axis formatting, etc. For  
> this module in particular, it is probably better to bin the data in  
> another way than I have, and input it into one of the matplotlib plot  
> methods.

Actually, for statistical information, the most important feature is
the ability to output data in formats that are useful to *real*
statistical software. There must be packages out there (even free
ones) which will do a far better job than anything we are going to
provide.

Sure, we can provide "basic" functionality, but where do you draw the
line? Which features of a "real" statistics package *wouldn't* be
useful in GRASS? I'm really quite worried about the potential for
feature creep in this area.

And a library of plotting functions isn't a statistics package. You want
something where GRASS hands over the data and forgets about it. We
shouldn't be responsible for relaying details from the user to the
software, such as what type of graph to draw, or the colours or
symbols or whatever.

-- 
Glynn Clements <glynn at gclements.plus.com>

