[GRASS-dev] About v.distance, v.what.vect (wrt "count points
within...").
Moritz Lennert
mlennert at club.worldonline.be
Wed Aug 11 07:42:10 EDT 2010
On 10/08/10 17:50, Nikos Alexandris wrote:
> Hmmm... "dmax=0.0": Is this _my_ problem perhaps? Instead of setting it
> directly I was trying to estimate it first with "v.distance -pa" which means
> that I misunderstood the whole process :-/
dmax=0.0 means: only those features that are in the same place, i.e. in
the case of from=points and to=areas => only those points which fall
into areas.
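To illustrate what dmax=0.0 implies, here is a minimal Python sketch using a single axis-aligned box in place of an area. It is not how v.distance is implemented; the helper names distance_to_box and matches are hypothetical, and counting points on the boundary as "inside" is an assumption of the sketch.

```python
import math

def distance_to_box(x, y, west, south, east, north):
    """Shortest distance from point (x, y) to an axis-aligned box.

    Returns 0.0 when the point lies inside the box or on its boundary.
    """
    dx = max(west - x, 0.0, x - east)
    dy = max(south - y, 0.0, y - north)
    return math.hypot(dx, dy)

def matches(x, y, box, dmax=0.0):
    """Hypothetical stand-in for the dmax test: keep the point only if
    its distance to the box does not exceed dmax."""
    return distance_to_box(x, y, *box) <= dmax

box = (0.0, 0.0, 10.0, 10.0)  # west, south, east, north

inside = matches(5.0, 5.0, box)                        # point inside the box
outside = matches(12.0, 5.0, box)                      # point 2 units east of the box
outside_with_dmax = matches(12.0, 5.0, box, dmax=3.0)  # same point, larger dmax
```

With dmax=0.0 only the first point is kept; the second one is kept only once dmax is raised to at least its distance from the box.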
>>
>> So, using the combination of v.distance and db.select I cannot reproduce
>> your problem with 600,000 points, but maybe the number and nature of
>> polygons can also play a role...
>
> That's interesting. Maybe I have done once again something very messy(?). I
> use the 3rd script inside the attached file in ticket # 804 [1]. Although this
> (old) script still inefficiently executes a very large number of SQL
> statements, the problem is still only in v.what.vect (so in v.distance) before
> the SQL calls.
>
> The script counts several point maps (for example: 404347 points) that fall
> inside boxes (that compose a fishnet which I call cell-grid, for example: 1320
> vector cells). One run with the above mentioned numbers takes more than 10h.
>
> The specific line(s) in the python script is:
>
> # carry low resolution grid-cell "CAT"s over to reference vector points
> grass.run_command('v.what.vect',\
> flags = '-v',\
> quiet = False,\
> vector = reference_points_map,\
> qvect = lowres_vector_grid,\
> column = gridcell_column,\
> qcolumn = "cat")
>
> Of course I checked the "problem" with my data by testing the
> "v.what.vect" commands on their own, apart from my messy script.
>
> The trials I did with spearfish (random data) are equally slow. I can
> pass some of my data (off-list please) or let me find some time later or
> tomorrow to copy-paste from my history the exact commands of my test within
> spearfish60.
I just did a similar test with the same points and a grid created by
v.mkgrid grid=35,40
(using the same column cat_municip from the previous test example)
time v.distance from=mypoints@sqlite to=mygrid upload=cat
column=cat_municip dmax=0.0
real 2m21.205s
Then testing the idea from the link Markus N added to your bug report:
time v.db.update mygrid col=count value="(SELECT count(*) from mypoints
WHERE mygrid.cat=mypoints.cat_municip group by cat_municip)"
real 5m28.312s
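For readers who want to try the counting query outside GRASS, here is a small sketch using Python's sqlite3 module and an in-memory database. The table and column names (mygrid, mypoints, cat, cat_municip, count) follow the command above; the rows are made-up sample data, and the UPDATE is written out by hand as an assumption about what v.db.update runs, not taken from its source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stand-ins for the attribute tables of the grid and point maps.
# "count" is double-quoted so it is always read as a column name.
cur.execute('CREATE TABLE mygrid (cat INTEGER PRIMARY KEY, "count" INTEGER)')
cur.execute("CREATE TABLE mypoints (cat INTEGER PRIMARY KEY, cat_municip INTEGER)")

# Three grid cells; four points whose cat_municip holds the cell they fall in.
cur.executemany("INSERT INTO mygrid (cat) VALUES (?)", [(1,), (2,), (3,)])
cur.executemany(
    "INSERT INTO mypoints (cat, cat_municip) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 1)],
)

# Same correlated subquery as in the v.db.update call above.
cur.execute(
    """
    UPDATE mygrid SET "count" = (
        SELECT count(*) FROM mypoints
        WHERE mygrid.cat = mypoints.cat_municip
        GROUP BY cat_municip
    )
    """
)

counts = dict(cur.execute('SELECT cat, "count" FROM mygrid ORDER BY cat').fetchall())
conn.close()
```

In this sketch cell 3 contains no points and ends up with NULL rather than 0, because the GROUP BY subquery returns no row for it.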
One hypothesis I had was that since v.what.vect uses the upload=to_attr
option, thus making it necessary to query the to_map's attribute table,
this might create significant overhead in database connection, but when
using
time v.distance from=mypoints@sqlite to=mygrid upload=to_attr
column=cat_municip to_column=cat dmax=0.0
I get
real 2m13.741s
so no significant difference...
And a final test with v.what.vect:
time v.what.vect mypoints col=cat_municip qvector=mygrid qcolumn=cat
real 2m9.377s
I'm pretty much at a loss about what causes your problem...
>
> (
> Just a quick look: I did not set dmax=0.0 in my (v.distance) tests. Then
> again, in "v.what.vect" it is set by default to 0.0, right? Isn't this default
> dmax=0.0 passed (by default) to v.distance?
> )
Yes.
Moritz