[GRASS-dev] v.what.rast speedup

Glynn Clements glynn at gclements.plus.com
Mon Oct 22 13:59:26 EDT 2007


Markus Neteler wrote:

> >> >> >> > since v.what.rast is (for me) extremely slow
> > ...
> >> >> >> Maybe it's the qsort(), maybe it's the i,j loop within a loop.
> > ...
> >> >> > The loop is certainly inefficient. 
> > ...
> >> > --- vector/v.what.rast/main.c patch
> > ...
> >> I have made a Spearfish test:
> >> 
> >> g.region rast=elevation.10m
> >> r.random elevation.10m vect=elevpnts n=5000 --o
> >> v.db.addcol elevpnts col="height double precision"
> >> 
> >> # old
> >> time v.what.rast elevpnts rast=elevation.10m col=height
> >> real    0m25.253s
> >> user    0m24.756s
> >> sys     0m0.308s
> >> 
> >> # new
> >> real    0m24.040s
> >> user    0m23.707s
> >> sys     0m0.297s
> >> 
> >> Interestingly, the timings are virtually identical (also for 1000
> >> points). I am using the DBF driver here.
> > 
> > It was subsequently suggested that it may be the database access which
> > is the main performance issue. Personally, I suspect that is probably
> > the case.
> > 
> > You could try re-compiling the DBMI libraries with -DUSE_BUFFERED_IO
> > to see if that helps at all.
> 
> It makes things worse: I have killed the job after 6 minutes... (DBF).

That implies there's a deadlock, i.e. one side is waiting for data
which is sitting in the other side's buffer.

I'm not sure how that occurs; reading from a stream should flush any
output streams. It would be useful if you could debug this to see
where it's blocking.
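
For example (assuming strace is available; the PIDs are just
placeholders for whatever ps reports):

v.what.rast elevpnts rast=elevation.10m col=height &
ps ax | grep -E 'v.what.rast|dbf'
# attach to each of the two processes in turn; both sides sitting in
# read() would confirm that buffered data is never getting flushed
strace -p <pid of v.what.rast>
strace -p <pid of dbf driver>
# alternatively, "gdb -p <pid>" followed by "bt" shows the exact
# DBMI call each side is blocked in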

> Compiling again 
> without -DUSE_BUFFERED_IO brings me back to 25 seconds.
> 
> Running the procedure with SQLite backend as suggested:
> 
> time v.what.rast elevpnts rast=elevation.10m col=height
> real    0m7.530s
> user    0m6.992s
> sys     0m0.373s
> 
> Running the same procedure connected to PostgreSQL server
> on a different machine in intranet:
> 
> time v.what.rast elevpnts rast=elevation.10m col=height
> real    0m14.085s
> user    0m1.419s
> sys     0m0.366s
> 
> Since SQLite wins, the bottleneck is not only the DBMI. Still, nothing
> here is particularly fast, given that I tested 5000 points but need to
> work with 300k points for my real work (ideally for several thousand
> maps).

The DBMI may account for a lot of the overhead in the SQLite case. The
difference between SQLite and other drivers has to be due to the
driver, as the DBMI overhead would be the same in each case.
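
One way to gauge how much of that is the update path alone would be to
push the same number of single-row UPDATEs through db.execute, bypassing
v.what.rast entirely (a rough sketch; table and column names as in your
test, the file path is arbitrary):

# note: this clobbers the height column, so use a throwaway copy
for i in `seq 1 5000` ; do
    echo "UPDATE elevpnts SET height = 0 WHERE cat = $i"
done > /tmp/updates.sql
time db.execute input=/tmp/updates.sql

Comparing that figure per driver against the v.what.rast timings should
separate the query/update cost from whatever v.what.rast itself does.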

It may be possible to reduce the SQL overhead by storing the
assignments in a separate table and then using UPDATE ... FROM (or
CREATE TABLE ... AS). But I don't think that the DBF driver supports
that, and simply populating a table through DBMI may still be
unacceptably slow.
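
For the PostgreSQL case that could look roughly like this (elev_tmp is
just an illustrative name for the scratch table, and the bulk load of it
is omitted; this assumes the pg driver passes the statements straight
through to the backend):

echo "CREATE TABLE elev_tmp (cat integer, height double precision)" | db.execute
# ... bulk-load (cat, height) pairs into elev_tmp ...
echo "UPDATE elevpnts SET height = elev_tmp.height FROM elev_tmp WHERE elevpnts.cat = elev_tmp.cat" | db.execute

That turns 5000 (or 300k) individual UPDATEs into a single set-based
statement on the backend side.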

-- 
Glynn Clements <glynn at gclements.plus.com>



