[GRASS-user] Does v.kernel have to take 16+ hours?

Aren Cambre aren at arencambre.com
Fri Nov 23 08:35:45 PST 2012


Thanks!

I am not familiar with GRASS's release customs. Will this become part of a
binary release soon, or should I just pull down the latest release in the
6.4.2 trunk? I'm assuming this has been merged into the trunk...

Aren

On Fri, Nov 23, 2012 at 7:32 AM, Markus Metz
<markus.metz.giswork at gmail.com>wrote:

> On Fri, Nov 23, 2012 at 2:07 PM, Aren Cambre <aren at arencambre.com> wrote:
> > Isn't taking about 10,000% too much time considered a bug? :-)
>
> Hmm, yes. v.kernel is fixed in devbr6 and relbr6 with r53982 and
> r53983, respectively.
>
> Markus M
>
> >
> > On Nov 23, 2012 5:11 AM, "Markus Metz" <markus.metz.giswork at gmail.com>
> > wrote:
> >>
> >> On Fri, Nov 23, 2012 at 4:14 AM, Aren Cambre <aren at arencambre.com>
> wrote:
> >> > I'm able to reproduce reliably here. I'll email you details privately.
> >>
> >> Thanks. I can confirm that v.kernel takes a long time in GRASS 6 with
> >> the settings provided by you. It does not crash, however.
> >>
> >> I can speed up v.kernel in GRASS 6 to complete in 10 minutes instead
> >> of 16+ hours, but I am not sure if this fix can/will go into GRASS 6.4
> >> because by now only bugs should be fixed.
> >>
> >> Markus M
> >>
> >> >
> >> > Aren
> >> >
> >> >
> >> > On Thu, Nov 22, 2012 at 9:02 AM, Markus Metz
> >> > <markus.metz.giswork at gmail.com>
> >> > wrote:
> >> >>
> >> >> On Sat, Nov 17, 2012 at 4:06 PM, Aren Cambre <aren at arencambre.com>
> >> >> wrote:
> >> >> > I have a dataset of just over 700,000 incidents that happened in
> >> >> > square-ish
> >> >> > Texas county that's about 30 miles on each side.
> >> >> >
> >> >> > Here's the parameters reported by v.kernel as it's executing:
> >> >> >
> >> >> > STDDEV: 1000.000000
> >> >> > RES: 111.419043 ROWS: 458   COLS: 447
> >> >> >
> >> >> > Writing output raster map using smooth parameter=1000.000000.
> >> >> >
> >> >> > Normalising factor=6482635.018778.
> >> >> >
> >> >> >
> >> >> > I am running this on a Windows 7 x64 machine with 8 GB RAM and an
> >> >> > Intel
> >> >> > Core
> >> >> > i7 Q720 1.6 GHz with 4 physical cores. I notice that it's not
> >> >> > multithreaded,
> >> >> > only using 1 core.
> >> >> >
> >> >> > It takes about 16 hours to complete. Is this correct? I'd like to
> use
> >> >> > this
> >> >> > on a dataset with closer to 5 million records, and I'm really
> >> >> > concerned
> >> >> > how
> >> >> > long it may take.
> >> >>
> >> >> The time required by v.kernel is a function of the number of cells
> and
> >> >> the input parameter stddeviation. The larger any of these values is,
> >> >> the more time v.kernel will need. Nevertheless, I think that the 16+
> >> >> hours are not correct. I tested with a vector with 3 million points
> >> >> for a grid with 2700 rows and 1087 columns, magnitudes larger than
> the
> >> >> grid used by you. v.kernel completes in just over one minute.
> >> >>
> >> >> >
> >> >> > I posted my question about the 16+ hours at
> >> >> >
> >> >> >
> >> >> >
> http://gis.stackexchange.com/questions/41058/how-do-i-compute-v-kernel-maps-in-less-than-16-hours/
> .
> >> >> > Bill Huber, who si apparently knowledgeable about kernel density
> >> >> > calculations in general, posted a response, and he felt like a
> kernel
> >> >> > density map shouldn't take much time at all. But digging more
> deeply,
> >> >> > turns
> >> >> > out he had come up with a kernel density calculation method over a
> >> >> > decade
> >> >> > ago using Fourier transforms. See
> >> >> > http://www.directionsmag.com/features/convolution/129753 and the
> next
> >> >> > two
> >> >> > articles linked to it (they are short articles). Apparently this
> >> >> > transforms
> >> >> > it from an O(n^2) problem to an O(n ln n) complexity problem.
> >> >>
> >> >> The approach of Bill Huber is raster-based, not vector based, making
> >> >> some things easier, at the cost of precision. The coordinate
> >> >> precision, however, is only needed for kernel functions other than
> >> >> uniform. In GRASS, you could get something like a raster-based
> density
> >> >> map by
> >> >>
> >> >> - exporting the points with v.out.ascii
> >> >> - re-importing the points with r.in.xyz method=n to get the number of
> >> >> points per cell
> >> >> - running a neighborhood analysis using a circular window with
> >> >> r.neighbors method=sum -c
> >> >>
> >> >> Optionally you could use the gauss option of r.neighbors to get an
> >> >> equivalent to v.kernel kernel=gaussian
> >> >>
> >> >> HTH,
> >> >>
> >> >> Markus M
> >> >>
> >> >> >
> >> >> > I inspected v.kernel's main.c
> >> >> >
> >> >> >
> >> >> > (
> http://trac.osgeo.org/grass/browser/grass/trunk/vector/v.kernel/main.c),
> >> >> > and looks like v.kernel uses an output-centric method (using Bill's
> >> >> > wording)
> >> >> > of calculating the output, which seems like O(n^2) complexity.
> >> >> >
> >> >> > So I guess what I'm getting at is it appears to me that the
> algorithm
> >> >> > behind
> >> >> > GRASS GIS's v.kernel is straightforward but is a greedy algorithm
> >> >> > (http://en.wikipedia.org/wiki/Greedy_algorithm), which is fine,
> but
> >> >> > it
> >> >> > make
> >> >> > take a while to execute. Is this true?
> >> >> >
> >> >> > Is there not spatial indexing I could add to the dataset? I've done
> >> >> > various
> >> >> > Google searches on that and can't come up with anything clear.
> >> >> >
> >> >> > Aren
> >> >> >
> >> >> > _______________________________________________
> >> >> > grass-user mailing list
> >> >> > grass-user at lists.osgeo.org
> >> >> > http://lists.osgeo.org/mailman/listinfo/grass-user
> >> >> >
> >> >
> >> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/grass-user/attachments/20121123/25712075/attachment-0001.html>


More information about the grass-user mailing list