[GRASS-dev] v.what.rast speedup
glynn at gclements.plus.com
Sun Oct 21 19:58:32 EDT 2007
> > >> >> > since v.what.rast is (for me) extremely slow, I have added
> > >> >> > cover map support to r.random.
> > >> >> ..
> > >> >> > PS: v.what.rast still running on just 300k points while I
> > >> >> > implemented above :-) Anyone who could make v.what.rast faster?
> > >> >>
> > >> >> Maybe it's the qsort(), maybe it's the i,j loop within a loop.
> > >> >
> > >> > The loop is certainly inefficient. Rather than shifting the entire
> > >> > array down every time it finds a duplicate, it should keep separate
> > >> > source and destination indices, e.g. (untested):
> > >> >
> > >> > for (i = j = 0; j < point_cnt; j++)
> > > --- vector/v.what.rast/main.c 17 Oct 2007 14:07:23 -0000 1.26
> > > +++ vector/v.what.rast/main.c 21 Oct 2007 17:52:42 -0000
> > > ... patch
> > Thanks (for your patience): I have made a Spearfish test:
> > g.region rast=elevation.10m
> > r.random elevation.10m vect=elevpnts n=5000 --o
> > v.db.addcol elevpnts col="height double precision"
> > # old
> > time v.what.rast elevpnts rast=elevation.10m col=height
> > real 0m25.253s
> > user 0m24.756s
> > sys 0m0.308s
> > # new
> > real 0m24.040s
> > user 0m23.707s
> > sys 0m0.297s
> > Interestingly, the timings are practically identical (also for 1000
> > points). Using the DBF driver here.
> It is as expected, only a small speed up. While the i,j loop may have
> been inefficient, it was still fast enough to take only about a second to
> get through. The bulk of the time was and is being taken up by running
> db_execute_immediate() for each point later on in the module (with
> db_commit_transaction() run once after the loop).
> To solve this in v.in.garmin and v.in.gpsbabel we wrote all db SET .. to ..
> statements to a temporary file, then ran db.execute once on the tmp file
> instead of running db.execute for every point. It is not a direct
> analogy, but it did make a huge difference there.
That isn't applicable here. The underlying db_execute_immediate()
function can only execute one statement at a time.
> Or maybe it is the bsearch() for every point in the db_execute_immediate() loop?
Nope. bsearch() is O(log n) per call, so it's even less significant
than the duplicate removal (which was O(n) per duplicate and is now O(1)).
It's almost certainly the DBMI overhead. Buffered I/O may make some
difference (but won't work on Windows); beyond that, there's nothing
more that can be done with the existing interface.
Glynn Clements <glynn at gclements.plus.com>