[GRASS-dev] About v.distance, v.what.vect (wrt "count points within...").

Moritz Lennert mlennert at club.worldonline.be
Wed Aug 11 07:42:10 EDT 2010


On 10/08/10 17:50, Nikos Alexandris wrote:

> Hmmm... "dmax=0.0": Is this _my_ problem perhaps? Instead of setting it
> directly I was trying to estimate it first with "v.distance -pa", which means
> that I misunderstood the whole process :-/

dmax=0.0 means: only those features that are in the same place, i.e. in 
the case of from=points and to=areas => only those points which fall 
into areas.
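
To illustrate what "falls into an area" means for a regular grid, here is a
minimal Python sketch. It is not how v.distance works internally, and the
function name, its parameters and the row-major cell numbering are all
hypothetical (v.mkgrid may number its cells differently).

```python
import math


def cell_cat(x, y, west, south, cell_w, cell_h, rows, cols):
    """Return a 1-based id of the grid cell containing (x, y), or None.

    Assumption: cells are numbered row by row, starting at the
    south-west corner. This numbering is illustrative only.
    """
    col = math.floor((x - west) / cell_w)
    row = math.floor((y - south) / cell_h)
    # A point outside the grid extent matches no cell, which is the
    # analogue of finding no feature within dmax=0.0.
    if not (0 <= col < cols and 0 <= row < rows):
        return None
    return row * cols + col + 1
```

With a 2 x 3 grid of unit cells anchored at (0, 0), the point (2.5, 1.5) lands
in the last cell, while (3.5, 0.5) lies outside the grid and gets no id.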

>>
>> So, using the combination of v.distance and db.select I cannot reproduce
>> your problem with 600,000 points, but maybe the number and nature of
>> polygons can also play a role...
>
> That's interesting. Maybe I have once again done something very messy(?). I
> use the 3rd script inside the attached file in ticket # 804 [1]. Although this
> (old) script still executes a very large number of SQL statements very
> inefficiently, the problem is still only in v.what.vect (and so in
> v.distance), before the SQL calls.
>
> The script counts several point maps (for example: 404347 points)  that fall
> inside boxes (that compose a fishnet which I call cell-grid, for example: 1320
> vector cells). One run with the above mentioned numbers takes more than 10h.
>
> The specific line(s) in the python script is:
>
> # carry low resolution grid-cell "CAT"s over to reference vector points
>                  grass.run_command('v.what.vect',\
>                  flags = '-v',\
>                  quiet = False,\
>                  vector = reference_points_map,\
>                  qvect = lowres_vector_grid,\
>                  column = gridcell_column,\
>                  qcolumn = "cat")
>
> Of course I checked the "problem" with my data by testing only the
> "v.what.vect" commands, separately from my messy script.
>
> Equally, very slow are the trials I did with spearfish (random data). I can
> pass some of my data (off-list please) or let me find some time later or
> tomorrow to copy-paste from my history the exact commands of my test within
> spearfish60.

I just did a similar test with same points and a grid created by

v.mkgrid grid=35,40


(using the same column cat_municip from the previous test example)
time v.distance from=mypoints@sqlite to=mygrid upload=cat column=cat_municip dmax=0.0

real	2m21.205s

Then testing the idea from the link Markus N added to your bug report:

time v.db.update mygrid col=count value="(SELECT count(*) from mypoints WHERE mygrid.cat=mypoints.cat_municip group by cat_municip)"

real	5m28.312s
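
For readers who want to try the counting subquery outside GRASS, here is a
small sketch using Python's sqlite3 module on an in-memory database. The table
and column names follow the commands above, but the table layout, the helper
function and the index are assumptions for illustration only, not what
v.db.update does internally.

```python
import sqlite3


def count_points_per_cell(cell_cats, point_cell_cats):
    """Count points per grid cell with the correlated subquery shown above.

    cell_cats: cat values of the grid cells.
    point_cell_cats: for each point, the cell cat it was assigned
    (None for a point that fell into no cell).
    """
    con = sqlite3.connect(":memory:")
    cur = con.cursor()
    # Hypothetical minimal attribute tables; real GRASS tables have more columns.
    cur.execute('CREATE TABLE mygrid (cat INTEGER PRIMARY KEY, "count" INTEGER)')
    cur.execute("CREATE TABLE mypoints (cat INTEGER PRIMARY KEY, cat_municip INTEGER)")
    cur.executemany("INSERT INTO mygrid (cat) VALUES (?)", [(c,) for c in cell_cats])
    cur.executemany(
        "INSERT INTO mypoints (cat_municip) VALUES (?)",
        [(c,) for c in point_cell_cats],
    )
    # Assumption: an index on the join column; not part of the original command.
    cur.execute("CREATE INDEX mypoints_cell_idx ON mypoints (cat_municip)")
    cur.execute(
        'UPDATE mygrid SET "count" = ('
        "SELECT count(*) FROM mypoints "
        "WHERE mygrid.cat = mypoints.cat_municip "
        "GROUP BY cat_municip)"
    )
    con.commit()
    result = dict(cur.execute('SELECT cat, "count" FROM mygrid ORDER BY cat'))
    con.close()
    return result
```

Note that with the GROUP BY in place, a cell without any points ends up with
NULL rather than 0, because the subquery returns no row for it.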

One hypothesis I had was that since v.what.vect uses the upload=to_attr 
option, thus making it necessary to query the to_map's attribute table, 
this might create significant overhead in database connection, but when 
using

time v.distance from=mypoints@sqlite to=mygrid upload=to_attr column=cat_municip to_column=cat dmax=0.0

I get

real	2m13.741s

so no significant difference...

And final test with v.what.vect:

time v.what.vect mypoints col=cat_municip qvector=mygrid qcolumn=cat

real	2m9.377s

I'm pretty much at a loss as to what causes your problem...

>
> (
> Just a quick look: I did not set dmax=0.0 in my (v.distance) tests. Then
> again, in "v.what.vect" it is set by default to 0.0, right? Isn't this default
> dmax=0.0 passed (by default) to v.distance?
> )

Yes.

Moritz

