[GRASS-dev] About v.distance, v.what.vect (wrt "count points within...").
Nikos Alexandris
nikos.alexandris at felis.uni-freiburg.de
Thu Aug 12 01:17:51 EDT 2010
Nikos A:
> > Hmmm... "dmax=0.0": Is this _my_ problem perhaps? Instead of setting it
> > directly I was trying to estimate it first with "v.distance -pa" which
> > means that I misunderstood the whole process :-/
Moritz L:
> dmax=0.0 means: only those features that are in the same place, i.e. in
> the case of from=points and to=areas => only those points which fall
> into areas.
All clear now!
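
To make the dmax=0.0 rule concrete, here is a minimal pure-Python sketch (not the v.distance implementation). It assumes the areas are axis-aligned boxes given as (west, south, east, north); the names distance_to_box and nearest_box are made up for this illustration. A point inside a box is at distance 0 from it, so with dmax=0.0 only such points get a match.

```python
import math

def distance_to_box(x, y, box):
    """Distance from point (x, y) to a box; 0.0 if the point lies inside it."""
    west, south, east, north = box
    dx = max(west - x, 0.0, x - east)
    dy = max(south - y, 0.0, y - north)
    return math.hypot(dx, dy)

def nearest_box(point, boxes, dmax):
    """Return the cat of the nearest box within dmax, or None if there is none."""
    best_cat, best_dist = None, None
    for cat, box in boxes.items():
        d = distance_to_box(point[0], point[1], box)
        if d <= dmax and (best_dist is None or d < best_dist):
            best_cat, best_dist = cat, d
    return best_cat

# Two hypothetical grid cells and two hypothetical points.
boxes = {1: (0.0, 0.0, 10.0, 10.0), 2: (10.0, 0.0, 20.0, 10.0)}
inside = (3.0, 4.0)     # falls into cell 1
outside = (23.0, 4.0)   # 3 map units east of cell 2

print(nearest_box(inside, boxes, dmax=0.0))
print(nearest_box(outside, boxes, dmax=0.0))
print(nearest_box(outside, boxes, dmax=5.0))
```

With dmax=0.0 the outside point gets no category at all; it is only picked up once dmax is at least its distance to the nearest cell.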
> >> So, using the combination of v.distance and db.select I cannot reproduce
> >> your problem with 600,000 points, but maybe the number and nature of
> >> polygons can also play a role...
(I am repeating your commands, will report in separate post)
> > That's interesting. Maybe I have done once again something very messy(?).
> > I use the 3rd script inside the attached file in ticket # 804 [1].
> > Although this (old) script still executes a very large number of SQL
> > statements quite inefficiently, the problem lies only in v.what.vect
> > (and so in v.distance), before the SQL calls.
> > The script counts several point maps (for example: 404347 points) that
> > fall inside boxes (that compose a fishnet which I call cell-grid, for
> > example: 1320 vector cells). One run with the above mentioned numbers
> > takes more than 10h.
> > The specific line(s) in the python script is:
> > # carry low resolution grid-cell "CAT"s over to reference vector points
> >
> > grass.run_command('v.what.vect',
> >                   verbose=True,  # request verbose module output
> >                   quiet=False,
> >                   vector=reference_points_map,
> >                   qvector=lowres_vector_grid,
> >                   column=gridcell_column,
> >                   qcolumn="cat")
> > Of course I checked the "problem" with my data by testing the
> > "v.what.vect" commands on their own, outside of my messy script.
> >
> > Equally, very slow are the trials I did with spearfish (random data). I
> > can pass some of my data (off-list please) or let me find some time
> > later or tomorrow to copy-paste from my history the exact commands of my
> > test within spearfish60.
> I just did a similar test with same points and a grid created by
>
> v.mkgrid grid=35,40
> (using same column cat_municip from previous test example)
> time v.distance from=mypoints@sqlite to=mygrid upload=cat column=cat_municip dmax=0.0
>
> real 2m21.205s
>
> Then testing the idea from the link Markus N added to your bug report:
Hmm... if I recall correctly, this is where I "stole" the how-to count points
in polygons in the past (when I was writing my "pareto" scripts).
> time v.db.update mygrid col=count value="(SELECT count(*) from mypoints
> WHERE mygrid.cat=mypoints.cat_municip group by cat_municip)"
>
> real 5m28.312s
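
The correlated subquery in the v.db.update call quoted above can be tried in isolation with Python's sqlite3 module. This is only a sketch of the SQL pattern, not of what v.db.update does internally: the table layouts and the sample rows are assumptions for illustration, and only the names mygrid, mypoints, cat, cat_municip and count are taken from the quoted command.

```python
import sqlite3

def count_points_per_cell(conn):
    """Fill mygrid.count using the same correlated subquery as quoted above."""
    conn.execute('''
        UPDATE mygrid SET "count" = (
            SELECT count(*) FROM mypoints
            WHERE mygrid.cat = mypoints.cat_municip
            GROUP BY cat_municip)
    ''')
    return dict(conn.execute('SELECT cat, "count" FROM mygrid ORDER BY cat'))

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mygrid (cat INTEGER PRIMARY KEY, "count" INTEGER)')
conn.execute("CREATE TABLE mypoints (cat INTEGER PRIMARY KEY, cat_municip INTEGER)")

# Three hypothetical grid cells; cell 3 contains no points.
conn.executemany("INSERT INTO mygrid (cat) VALUES (?)", [(1,), (2,), (3,)])
# Four hypothetical points; the last one fell outside every cell (NULL).
conn.executemany(
    "INSERT INTO mypoints (cat, cat_municip) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, None)],
)

counts = count_points_per_cell(conn)
print(counts)
```

In SQLite, a cell without any points ends up with NULL rather than 0 here, because the GROUP BY subquery returns no row for it. Dropping the GROUP BY clause would make the subquery return 0 for such cells.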
>
> One hypothesis I had was that since v.what.vect uses the upload=to_attr
> option, thus making it necessary to query the to_map's attribute table,
> this might create significant overhead in database connection, but when
> using
>
> time v.distance from=mypoints@sqlite to=mygrid upload=to_attr column=cat_municip to_column=cat dmax=0.0
>
> I get
>
> real 2m13.741s
>
> so no significant difference...
Well, there is some difference.
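
To put a number on "some difference", the two "real" timings quoted above (upload=cat and upload=to_attr) can be converted to seconds and compared. A small sketch; the helper name parse_real_time is made up for this illustration, and a single run of each command says little about run-to-run variance.

```python
import re

def parse_real_time(text):
    """Convert a 'real' value printed by time, such as '2m21.205s', to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError("unexpected time format: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))

upload_cat = parse_real_time("2m21.205s")      # v.distance upload=cat
upload_to_attr = parse_real_time("2m13.741s")  # v.distance upload=to_attr

difference = upload_cat - upload_to_attr
relative = difference / upload_cat

print("difference: %.3f s (%.1f%% of the upload=cat run)"
      % (difference, 100 * relative))
```

The upload=to_attr run is the faster one by a few seconds, so these numbers do not support the hypothesis that querying the to_map's attribute table adds noticeable overhead.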
> And final test with v.what.vect:
>
> time v.what.vect mypoints col=cat_municip qvector=mygrid qcolumn=cat
>
> real 2m9.377s
>
> I'm pretty much at a loss about what causes your problem...
Me too :-p -- whenever you have some free time, you could have a look at my
timings (some already posted; a test using your commands is coming...)
> > (
> > Just a quick look: I did not set dmax=0.0 in my (v.distance) tests. Then
> > again, in "v.what.vect" it is set by default to 0.0, right? Isn't this
> > default dmax=0.0 passed (by default) to v.distance?
> > )
>
> Yes.
Good then.
Nikos