[GRASS-stats] Scatterplot "thinning" (points reduction)?

Roger Bivand Roger.Bivand at nhh.no
Mon Aug 17 04:43:35 EDT 2009


On Mon, 17 Aug 2009, Markus Neteler wrote:

> On Mon, Aug 17, 2009 at 9:33 AM, Roger Bivand<Roger.Bivand at nhh.no> wrote:
>> On Sun, 16 Aug 2009, Markus Neteler wrote:
>>
>>> Hi,
>>>
>>> I am plotting elevation against temperature and have the problem that
>>> including all points leads to heavy slow graphs... Reducing the raster
>>> resolution is not a solution since it does not maintain the
>>> characteristics
>>> of the graph (since GRASS is using nearest neighbor).
>>
>> One point initially. I'm assuming that you are using a Linux platform - on
>> this platform, there is an order of magnitude speedup if you plot on screen
>> without "cairo", the default x11 type= - try using type="Xlib", which is
>> much faster but not so refined.
>
> (yes, Linux)
> I have searched around but I am not entirely sure to which function
> this type parameter belongs.

In x11(), which opens the screen graphics device. By default the device 
opens by itself with type="cairo" when needed, so you have to open it 
manually with the non-default type, or use X11.options() to have the 
automatically opened devices use "Xlib". Generally, "cairo" is 
preferable, but slower. I'd probably leave "cairo", and use hexbin() 
instead.
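
A minimal sketch of the two ways to get the "Xlib" device, assuming a
Unix-alike interactive R session with X11 support (the random data and
the fast_type variable are only there for illustration):

```r
# Illustrative only: fast_type is a made-up variable name.
fast_type <- "Xlib"

# Guarded so that nothing is opened on platforms or sessions without X11.
if (.Platform$OS.type == "unix" && interactive() && capabilities("X11")) {
  # Option 1: open a single screen device by hand with the faster backend.
  x11(type = fast_type)
  plot(rnorm(1e5), rnorm(1e5), pch = ".")  # pch = "." draws the smallest symbol

  # Option 2: change the default used by devices that open automatically
  # later in this session.
  X11.options(type = fast_type)
}
```

Option 2 only affects devices opened after the call, so it would normally
go at the start of a session.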

Roger

>
>> Given that, consider the cex= argument for varying symbol size, and maybe
>> the pch="." possibility for using a single pt. point. They still all get
>> drawn, so there is no time saving, but they may be more visible.
>
> I am currently plotting like this:
> plot(data$dem ~ data$raw, xlab="LST value [°C]", ylab="elevation [m]")
> points(data$dem ~ data$filt2, col="yellow", cex=0.5, pch=3)
> points(data$dem ~ data$rst, col="green", pch=2)
> abline(lm(data$dem ~ data$raw))
> abline(lm(data$dem ~ data$filt2), col="yellow")
> abline(lm(data$dem ~ data$rst), col="green")
>
> So the background (largest) cloud comes in black circles,
> the interim (smaller) in yellow crosses with many of them in the circles,
> and the upper point cloud (smallest) in green triangles.
> I guess the real problem is the 826896 * 3 points in the plot.
>
>> For very large data sets, consider hexbin() in the hexbin package - I'm not
>> sure how best to display three data sets. For single scatterplots, it is
>> very powerful. Maybe contours of 2D densities of the extra data sets could
>> be overlaid over a base hexbin plot? There is an informative vignette in
>> hexbin.
>
> Oh, this is interesting! Thanks,
> Markus
>
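
A rough sketch of the hexbin() idea for a single scatterplot. It assumes
the hexbin package is installed; the dem and raw vectors are synthetic
stand-ins (not the real data from this thread), and xbins = 50 is an
arbitrary choice:

```r
library(hexbin)  # assumes the package is installed

set.seed(1)
n <- 100000

# Synthetic stand-in data: elevation in metres and a temperature that
# falls with elevation plus noise. The values are invented for illustration.
dem <- runif(n, min = 0, max = 3000)
raw <- 25 - 0.006 * dem + rnorm(n, sd = 2)

# Bin the points into hexagonal cells; only the occupied cells are drawn,
# so drawing cost depends on the number of cells, not the number of points.
hb <- hexbin(raw, dem, xbins = 50)

plot(hb, xlab = "LST value [°C]", ylab = "elevation [m]")
```

The plot shows counts per cell rather than individual points, so the
overall shape of the cloud is kept while far fewer symbols are drawn.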

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

