[GRASS-dev] About v.distance, v.what.vect (wrt "count points within...").
nikos.alexandris at felis.uni-freiburg.de
Sat Aug 7 02:33:49 EDT 2010
I attempted to measure the time required to process large vector point maps
(derived from the landcover.30m spearfish map) with v.what.vect/v.distance,
in two ways:
a. v.distance without "dmax=", followed by v.what.vect
b. v.distance with -pa to grab "dmax", then feeding that to v.what.vect
Unfortunately, the process crashed because I (wrongly) ran another heavy
process for my work at the same time. I have no time left to repeat it.
Nevertheless, I have a question: can a good C/C++ programmer without a GRASS
background make it more efficient within, let's say, a day? Is the cause of
v.distance being slow known? I did not find any specific trac ticket.
(I am asking because I know somebody who could possibly help out. If it
takes a _lot_ of time to get into the GRASS C/C++ code, then I won't bother
asking him.)
A bit more...
"v.distance" is slow (for the impatient user) with very large vectors (from
memory, I estimate that it took ~20h for ~600.000 features). Getting "dmax="
first, in order to tell "v.distance" to look for features within a certain
radius, might _not_ be very efficient either. It takes time to get results
from "v.distance -pa" and, depending on the vector(s), adding the _real_
v.distance run on top of that can be a deal-breaker.
However, it might be worthwhile in the end to use "dmax=", as MoritzL
mentioned in some post. I think it makes sense when the area of interest (in
which v.distance _should_ operate) is much smaller than the queried vector map
itself (right?). Unfortunately, the process that was doing exactly this
crashed (it used >6000 points and a vector fishnet, to count the points that
fall inside each box of the fishnet map).
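
To illustrate the fishnet case, here is a minimal Python sketch that counts
points per box with plain arithmetic instead of a distance search. It is not
how v.what.vect or v.distance work internally. It assumes the fishnet is a
regular, axis-aligned grid, and the function name and parameters (west, south,
cell_size, n_cols, n_rows) are made up for the example.

```python
import math
from collections import Counter


def count_points_in_fishnet(points, west, south, cell_size, n_cols, n_rows):
    """Count points per cell of a regular, axis-aligned fishnet.

    Hypothetical helper, not a GRASS function. Returns a Counter keyed by
    (row, col), with row 0 being the southernmost row. Points outside the
    fishnet (including those exactly on its east or north edge) are ignored.
    """
    counts = Counter()
    for x, y in points:
        # Each point is assigned to a cell directly: one pass, no search.
        col = math.floor((x - west) / cell_size)
        row = math.floor((y - south) / cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[(row, col)] += 1
    return counts


if __name__ == "__main__":
    pts = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (2.5, 2.5), (9.0, 9.0)]
    result = count_points_in_fishnet(pts, 0.0, 0.0, 1.0, n_cols=3, n_rows=3)
    for cell, n in sorted(result.items()):
        print(cell, n)
```

The cost grows with the number of points only, not with points times boxes,
which is the appeal of this approach when the boxes really are a regular grid.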
In case "dmax=" does a really good job in some cases, maybe v.what.vect can be
improved. For example, check the features of the vector maps fed in "vect="
and "qvect=" respectively, compare, and, depending on the occasion, decide
automatically whether or not to run a "v.distance -pa" step first.
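
One way such an automatic decision could look is sketched below in Python. It
is only a guess at a heuristic: it compares the bounding boxes of the two maps
and suggests the extra step when the area of interest is much smaller than the
queried map. The function name, the (west, south, east, north) tuples and the
0.25 threshold are all assumptions, not anything v.what.vect does today.

```python
def should_prefilter_with_dmax(interest_bbox, queried_bbox, max_ratio=0.25):
    """Guess whether restricting the search radius first is worth it.

    Hypothetical heuristic. Both bounding boxes are (west, south, east, north)
    tuples. Returns True when the area of interest covers less than
    `max_ratio` of the queried map's bounding box.
    """
    def area(bbox):
        west, south, east, north = bbox
        return max(east - west, 0.0) * max(north - south, 0.0)

    queried_area = area(queried_bbox)
    if queried_area == 0.0:
        # Degenerate extent: no basis for a decision, so skip the extra step.
        return False
    return area(interest_bbox) / queried_area < max_ratio


if __name__ == "__main__":
    # Small area of interest inside a large map: the extra step may pay off.
    print(should_prefilter_with_dmax((0, 0, 1, 1), (0, 0, 10, 10)))
    # Area of interest nearly as large as the map: probably not worth it.
    print(should_prefilter_with_dmax((0, 0, 9, 9), (0, 0, 10, 10)))
```

Whether a bounding-box ratio is a good enough proxy, and where the threshold
should sit, would have to be checked with timings on real maps.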
You are probably aware of all this, but I am posting it anyway. Below is the
information that I have collected.
Thanks for your time, Nikos
The stuff I collected so far:
--Count points within vectors--
"Point count with large vectors"
+ (this is rather old) <http://lists.osgeo.org/pipermail/grass-user/2003-
v.count.points.sh (uses v.what.vect which in turn uses v.distance)
+ (the following post is rather old - maybe fixed in latest
...this script is very slow, because it uses a nested for loop, so if you have
10000 points and 1000 areas, it calls 10000 * 1000 database queries (in case
you provide a class column)...
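
The quoted problem (one database query per point/area pair) can be contrasted
with a single aggregate query. The Python sketch below uses an in-memory
SQLite database and assumes that each point already carries the category of
the area it falls in. The table and column names (points, area_cat,
class_name) are invented for the example and do not come from the script.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory table of points (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE points ("
        "cat INTEGER PRIMARY KEY, area_cat INTEGER, class_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO points VALUES (?, ?, ?)",
        [
            (1, 10, "forest"),
            (2, 10, "forest"),
            (3, 10, "water"),
            (4, 20, "forest"),
            (5, None, "water"),  # point that falls in no area
        ],
    )
    return conn


def count_points_per_area(conn):
    """Return {(area_cat, class_name): count} with one GROUP BY query."""
    rows = conn.execute(
        "SELECT area_cat, class_name, COUNT(*) FROM points "
        "WHERE area_cat IS NOT NULL "
        "GROUP BY area_cat, class_name"
    ).fetchall()
    return {(area, cls): n for area, cls, n in rows}


if __name__ == "__main__":
    for key, n in sorted(count_points_per_area(make_demo_db()).items()):
        print(key, n)
```

With this layout the database is queried once, regardless of how many points
and areas there are, instead of once per combination.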
--Count points in raster cells (not relevant to v.distance)--
- "r.in.xyz method=n" -
+ see also <http://lists.osgeo.org/pipermail/grass-user/2010-
-Information that might be relevant to "count points"-
[Point in polygon]
[r.statistics base=name cover=name method=count ?]
[density surface from points in PostGIS]
--Towards a faster "count of points within..." script--
+ About SQL statements, related to "v.what.vect": <http://www.mail-
archive.com/grass-stats at lists.osgeo.org/msg00321.html>