[GRASS-dev] About v.distance, v.what.vect (wrt "count points within...").

Markus Metz markus.metz.giswork at googlemail.com
Tue Aug 10 06:15:23 EDT 2010


Nikos Alexandris wrote:
> Nikos Alexandris wrote:
>
>> >> "v.distance" is slow (for the impatient user) with very large vectors
>> >> (from my
>> >> memory I estimate that it took ~20h for ~600.000 features). Trying to
>> >> get the
>> >> "dmax=" first in order to tell "v.distance" to look for features within
>> >> a certain radius, might _not_ be very efficient as well. It takes time
>> >> to get
>> >> results from "v.distance -pa" and, depending on the vector(s), the
>> >> addition of
>> >> the _real_ v.distance can be a deal-breaker.
>
> Moritz Lennert:
>
>> > Just an innocent question (partly directed to Markus M): could this be
>> > linked to the spatial index ? And is v.distance faster with spatial index
>> > on file in grass7 ?
>
> Markus Metz:
>
>> No consistent speed difference between grass 6.x and grass 7.
>> Sometimes grass 6 is faster, sometimes grass 7, but nothing dramatic.
>> The time-consuming parts are distance calculations and tests whether a
>> point is inside an area or inside an isle of an area. These
>> calculations should be done only when necessary. I have added some
>> tests using bounding boxes to my local copy to avoid
>> distance-from-point-to-line calculations; in combination with dmax and
>> without -a there is a real speed gain for point to area distances, but
>> I need to do more testing to make sure results stay identical.
>
>> It might save some time to get distances from points to centroids
>> first with -p (I don't think -a is necessary) because getting
>> distances to centroids is much faster than getting distances to areas.
>
> "Area = boundary + centroid", so what exactly is the point to area distance?

If a point is inside an area (the polygon composed of the area's
boundaries), the distance is 0 (zero):

A centroid is always inside the area it belongs to, therefore the
distance of a point outside an area to the area's centroid is always
larger than the distance of that point to the area's boundaries.

> Point to boundary? Specifically, point to all-lines that compose a boundary?
> Many distances? And then, keep the maximum?

If a point is outside an area, v.distance determines the shortest
distance to the area's boundaries (point to all-lines that compose a
boundary), IOW, keep the minimum not the maximum because v.distance
searches for the nearest feature.
A point is also outside an area if it is inside an isle of the area.
This isle might in turn be another area with centroid.

>
> Oh, I am still not within the logic of how exactly the "within area" point
> detection works.

Do you really want to know?
Start with
http://en.wikipedia.org/wiki/Point_in_polygon
GRASS uses the ray casting algorithm

>
> Markus, would you mind pointing out the code lines where distance measuring
> (points to areas) is happening?

from
https://trac.osgeo.org/grass/browser/grass/trunk/vector/v.distance/main.c#L676
to
https://trac.osgeo.org/grass/browser/grass/trunk/vector/v.distance/main.c#L702

IMHO, there is not a lot of potential for speed improvements, a bit,
but that comes at the cost of more complicated code.

I think dmax is the crucial option to speed things up. Without dmax,
v.distance has to determine for each point in <from> the distance to
all areas in <to>. With many points and many areas, this will
obviously take some time. A reasonably fast way to estimate a good
dmax helps much more than trying to hack v.distance for distances to
areas.

Markus M


More information about the grass-dev mailing list