[GRASS-dev] About v.distance, v.what.vect (wrt "count points within...").

Markus Metz markus.metz.giswork at googlemail.com
Tue Aug 10 09:44:02 EDT 2010


Nikos Alexandris wrote:
> @Markus M: Thank you Markus. Below I have put more questions (before reading
> the document) on purpose. No need to reply (except if you have the time) -
> it's for the sake of a potential beginner's "question and answer" wiki-page.
>
Some short answers below.
>
> Nikos A wrote:
>> >> >> "v.distance" is slow (for the impatient user) with very large vectors
>> >> >> (from my memory I estimate that it took ~20h for ~600.000 features).
>> >> >> Trying to get the "dmax=" first in order to tell "v.distance" to look
>> >> >> for features within a certain radius, might _not_ be very efficient as
>> >> >> well. It takes time to get results from "v.distance -pa" and,
>> >> >> depending on the vector(s), the addition of the _real_ v.distance can
>> >> >> be a deal-breaker.
>
> Moritz L:
>
>> >> > Just an innocent question (partly directed to Markus M): could this be
>> >> > linked to the spatial index ? And is v.distance faster with spatial
>> >> > index on file in grass7 ?
>
> Markus M:
>
>> >> No consistent speed difference between grass 6.x and grass 7.
>> >> Sometimes grass 6 is faster, sometimes grass 7, but nothing dramatic.
>> >> The time-consuming parts are distance calculations and tests whether a
>> >> point is inside an area or inside an isle of an area. These
>> >> calculations should be done only when necessary. I have added some
>> >> tests using bounding boxes to my local copy to avoid
>> >> distance-from-point-to-line calculations; in combination with dmax and
>> >> without -a there is a real speed gain for point to area distances, but
>> >> I need to do more testing to make sure results stay identical.
>> >>
>> >> It might save some time to get distances from points to centroids
>> >> first with -p (I don't think -a is necessary) because getting
>> >> distances to centroids is much faster than getting distances to areas.
>
> Nikos A:
>
>> > "Area = boundary + centroid", so what exactly is the point to area
>> > distance?
>
> Markus M:
>
>> If a point is inside an area (the polygon composed of the area's
>> boundaries), the distance is 0 (zero):
>
> This sentence makes me think that it is a priori known (based on something
> else - related to topology?) when a point is inside an area.

This is not known a priori, v.distance checks if a point is inside an
area, if yes, distance is set to zero.

> Why all the need
> to measure distances then in order to count how many points are inside?

If you want to count the number of points inside areas, set dmax to 0 (zero).

>
>> A centroid is always inside the area it belongs to, therefore the
>> distance of a point outside an area to the area's centroid is always
>> larger than the distance of that point to the area's boundaries.
>
> What happens when the centroid (gravity center) falls outside of the
> boundaries?

Corrupt topology. Happens only when manually moving a centroid outside
its area. When GRASS calculates a new centroid for an area, the
centroid is always within the area (e.g. v.in.* and v.centroids).
>
>> > Point to boundary? Specifically, point to all-lines that compose a
>> > boundary? Many distances? And then, keep the maximum?
>
>> If a point is outside an area, v.distance determines the shortest
>> distance to the area's boundaries (point to all-lines that compose a
>> boundary), IOW, keep the minimum not the maximum because v.distance
>> searches for the nearest feature.
>
> All right. But, (repeating the same question): why is distance measuring
> required at all? To compare it with what?

If distance measuring is required and distances are supposed to be
compared with something, then use v.distance ;-) The answer to these
questions comes before the decision to use v.distance.
>
>> A point is also outside an area if it is inside an isle of the area.
>> This isle might in turn be another area with centroid.
>
> This complicates the process I guess.

All taken care of.
>
>> > Oh, I am still not within the logic of how exactly the "within area"
>> > point detection works.
>
>> Do you really want to know?
>
> At some points yes. I am curious to understand stuff - at least their basic
> concept.
>
>> Start with
>> http://en.wikipedia.org/wiki/Point_in_polygon
>> GRASS uses the ray casting algorithm
>
> Seems to have all the answers I want. Thanks.
>
>> > Markus, would you mind pointing out the code lines where distance
>> > measuring (points to areas) is happening?
>>
>> from
>> https://trac.osgeo.org/grass/browser/grass/trunk/vector/v.distance/main.c#L
>> 676 to
>> https://trac.osgeo.org/grass/browser/grass/trunk/vector/v.distance/main.c#L
>> 702
>
>> IMHO, there is not a lot of potential for speed improvements, a bit,
>> but that comes at the cost of more complicated code.
>
> Why are (supposedly) other tools faster in counting points within polygons?

Because if it's only about counting points in polygons, you can write
a special module doing exactly and only that.
>
>> I think dmax is the crucial option to speed things up. Without dmax,
>> v.distance has to determine for each point in <from> the distance to
>> all areas in <to>. With many points and many areas, this will
>> obviously take some time. A reasonably fast way to estimate a good
>> dmax helps much more than trying to hack v.distance for distances to
>> areas.
>
> Then my intentions to test v.distanve/v.what.vect with and without "dmax="
> where on the right track (right?).
>
I think so. But wait a little bit, I might get v.distance much faster soon...

Markus M


More information about the grass-dev mailing list