[GRASS-user] Does v.kernel have to take 16+ hours?

Markus Metz markus.metz.giswork at gmail.com
Thu Dec 6 10:17:28 PST 2012


On Thu, Dec 6, 2012 at 3:13 PM, Aren Cambre <aren at arencambre.com> wrote:
> Thanks. I am using EPSG:3081, and its unit is meters. So is it the case that
> a SD of x on EPSG:3081 means a search radius of 4x meters? If so, why 4x?

Because the gaussian function is infinite, a cutoff of 3 to 6 SDs is
usually applied when using a gaussian function as kernel density
function. In theory, an alternative is to combine the infinite
gaussian function with a finite function, e.g. uniform. I think
historically v.kernel had only one kernel function, gaussian. The
original authors decided to ask for SD and not for the search radius
as input and have set the cutoff to 4 x SD where the gaussian function
is reasonably close to zero.

BTW, search radius = 4 x SD applies only to the gaussian kernel, for
all other kernel functions the search radius is equal to SD.

Markus M

>
> On Thu, Dec 6, 2012 at 2:29 AM, Markus Metz <markus.metz.giswork at gmail.com>
> wrote:
>>
>> On Thu, Dec 6, 2012 at 3:58 AM, Aren Cambre <aren at arencambre.com> wrote:
>> > It's gotten slow again. This run will probably take more than 10 hours.
>> > However, I am using a standard deviation of 1000. Is that what could be
>> > causing this?
>>
>> Yes. With a standard deviation of 1000, the search radius is now 4000,
>> that is, for each cell a 8000x8000 box is searched. With many densely
>> packed points, this can take quite some time.
>>
>> Markus M
>>
>> >
>> > v.kernel input=tickets at PERMANENT output=tickets_new_heatmap_1000
>> > stddeviation=1000
>> > STDDEV: 1000.000000
>> > RES: 18.290457 ROWS: 2370 COLS: 2650
>> >
>> > Writing output raster map using smooth parameter=1000.000000.
>> >
>> > Normalising factor=6482635.018778.
>> >
>> > On Sat, Nov 24, 2012 at 9:03 PM, Aren Cambre <aren at arencambre.com>
>> > wrote:
>> >>
>> >> I installed r53983. The v.kernel execution that took almost a day now
>> >> executes in 25.5 minutes. Thank you!
>> >>
>> >> Aren
>> >>
>> >>
>> >> On Fri, Nov 23, 2012 at 12:51 PM, Markus Metz
>> >> <markus.metz.giswork at gmail.com> wrote:
>> >>>
>> >>> On Fri, Nov 23, 2012 at 5:35 PM, Aren Cambre <aren at arencambre.com>
>> >>> wrote:
>> >>> > Thanks!
>> >>> >
>> >>> > I am not familiar with GRASS's release customs. Will this become
>> >>> > part
>> >>> > of a
>> >>> > binary release soon, or should I just pull down the latest release
>> >>> > in
>> >>> > the
>> >>> > 6.4.2 trunk? I'm assuming this has been merged into the trunk...
>> >>>
>> >>> It should be available as a binary for Windows by tomorrow in the
>> >>> nightly builds [0].
>> >>>
>> >>> Markus M
>> >>>
>> >>> [0] http://wingrass.fsv.cvut.cz/grass64/
>> >>>
>> >>> >
>> >>> > Aren
>> >>> >
>> >>> >
>> >>> > On Fri, Nov 23, 2012 at 7:32 AM, Markus Metz
>> >>> > <markus.metz.giswork at gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> On Fri, Nov 23, 2012 at 2:07 PM, Aren Cambre <aren at arencambre.com>
>> >>> >> wrote:
>> >>> >> > Isn't taking about 10,000% too much time considered a bug? :-)
>> >>> >>
>> >>> >> Hmm, yes. v.kernel is fixed in devbr6 and relbr6 with r53982 and
>> >>> >> r53983, respectively.
>> >>> >>
>> >>> >> Markus M
>> >>> >>
>> >>> >> >
>> >>> >> > On Nov 23, 2012 5:11 AM, "Markus Metz"
>> >>> >> > <markus.metz.giswork at gmail.com>
>> >>> >> > wrote:
>> >>> >> >>
>> >>> >> >> On Fri, Nov 23, 2012 at 4:14 AM, Aren Cambre
>> >>> >> >> <aren at arencambre.com>
>> >>> >> >> wrote:
>> >>> >> >> > I'm able to reproduce reliably here. I'll email you details
>> >>> >> >> > privately.
>> >>> >> >>
>> >>> >> >> Thanks. I can confirm that v.kernel takes a long time in GRASS 6
>> >>> >> >> with
>> >>> >> >> the settings provided by you. It does not crash, however.
>> >>> >> >>
>> >>> >> >> I can speed up v.kernel in GRASS 6 to complete in 10 minutes
>> >>> >> >> instead
>> >>> >> >> of 16+ hours, but I am not sure if this fix can/will go into
>> >>> >> >> GRASS
>> >>> >> >> 6.4
>> >>> >> >> because by now only bugs should be fixed.
>> >>> >> >>
>> >>> >> >> Markus M
>> >>> >> >>
>> >>> >> >> >
>> >>> >> >> > Aren
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > On Thu, Nov 22, 2012 at 9:02 AM, Markus Metz
>> >>> >> >> > <markus.metz.giswork at gmail.com>
>> >>> >> >> > wrote:
>> >>> >> >> >>
>> >>> >> >> >> On Sat, Nov 17, 2012 at 4:06 PM, Aren Cambre
>> >>> >> >> >> <aren at arencambre.com>
>> >>> >> >> >> wrote:
>> >>> >> >> >> > I have a dataset of just over 700,000 incidents that
>> >>> >> >> >> > happened
>> >>> >> >> >> > in
>> >>> >> >> >> > square-ish
>> >>> >> >> >> > Texas county that's about 30 miles on each side.
>> >>> >> >> >> >
>> >>> >> >> >> > Here's the parameters reported by v.kernel as it's
>> >>> >> >> >> > executing:
>> >>> >> >> >> >
>> >>> >> >> >> > STDDEV: 1000.000000
>> >>> >> >> >> > RES: 111.419043 ROWS: 458   COLS: 447
>> >>> >> >> >> >
>> >>> >> >> >> > Writing output raster map using smooth
>> >>> >> >> >> > parameter=1000.000000.
>> >>> >> >> >> >
>> >>> >> >> >> > Normalising factor=6482635.018778.
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> > I am running this on a Windows 7 x64 machine with 8 GB RAM
>> >>> >> >> >> > and
>> >>> >> >> >> > an
>> >>> >> >> >> > Intel
>> >>> >> >> >> > Core
>> >>> >> >> >> > i7 Q720 1.6 GHz with 4 physical cores. I notice that it's
>> >>> >> >> >> > not
>> >>> >> >> >> > multithreaded,
>> >>> >> >> >> > only using 1 core.
>> >>> >> >> >> >
>> >>> >> >> >> > It takes about 16 hours to complete. Is this correct? I'd
>> >>> >> >> >> > like
>> >>> >> >> >> > to
>> >>> >> >> >> > use
>> >>> >> >> >> > this
>> >>> >> >> >> > on a dataset with closer to 5 million records, and I'm
>> >>> >> >> >> > really
>> >>> >> >> >> > concerned
>> >>> >> >> >> > how
>> >>> >> >> >> > long it may take.
>> >>> >> >> >>
>> >>> >> >> >> The time required by v.kernel is a function of the number of
>> >>> >> >> >> cells
>> >>> >> >> >> and
>> >>> >> >> >> the input parameter stddeviation. The larger any of these
>> >>> >> >> >> values
>> >>> >> >> >> is,
>> >>> >> >> >> the more time v.kernel will need. Nevertheless, I think that
>> >>> >> >> >> the
>> >>> >> >> >> 16+
>> >>> >> >> >> hours are not correct. I tested with a vector with 3 million
>> >>> >> >> >> points
>> >>> >> >> >> for a grid with 2700 rows and 1087 columns, magnitudes larger
>> >>> >> >> >> than
>> >>> >> >> >> the
>> >>> >> >> >> grid used by you. v.kernel completes in just over one minute.
>> >>> >> >> >>
>> >>> >> >> >> >
>> >>> >> >> >> > I posted my question about the 16+ hours at
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> > http://gis.stackexchange.com/questions/41058/how-do-i-compute-v-kernel-maps-in-less-than-16-hours/.
>> >>> >> >> >> > Bill Huber, who si apparently knowledgeable about kernel
>> >>> >> >> >> > density
>> >>> >> >> >> > calculations in general, posted a response, and he felt
>> >>> >> >> >> > like a
>> >>> >> >> >> > kernel
>> >>> >> >> >> > density map shouldn't take much time at all. But digging
>> >>> >> >> >> > more
>> >>> >> >> >> > deeply,
>> >>> >> >> >> > turns
>> >>> >> >> >> > out he had come up with a kernel density calculation method
>> >>> >> >> >> > over a
>> >>> >> >> >> > decade
>> >>> >> >> >> > ago using Fourier transforms. See
>> >>> >> >> >> > http://www.directionsmag.com/features/convolution/129753
>> >>> >> >> >> > and
>> >>> >> >> >> > the
>> >>> >> >> >> > next
>> >>> >> >> >> > two
>> >>> >> >> >> > articles linked to it (they are short articles). Apparently
>> >>> >> >> >> > this
>> >>> >> >> >> > transforms
>> >>> >> >> >> > it from an O(n^2) problem to an O(n ln n) complexity
>> >>> >> >> >> > problem.
>> >>> >> >> >>
>> >>> >> >> >> The approach of Bill Huber is raster-based, not vector based,
>> >>> >> >> >> making
>> >>> >> >> >> some things easier, at the cost of precision. The coordinate
>> >>> >> >> >> precision, however, is only needed for kernel functions other
>> >>> >> >> >> than
>> >>> >> >> >> uniform. In GRASS, you could get something like a
>> >>> >> >> >> raster-based
>> >>> >> >> >> density
>> >>> >> >> >> map by
>> >>> >> >> >>
>> >>> >> >> >> - exporting the points with v.out.ascii
>> >>> >> >> >> - re-importing the points with r.in.xyz method=n to get the
>> >>> >> >> >> number
>> >>> >> >> >> of
>> >>> >> >> >> points per cell
>> >>> >> >> >> - running a neighborhood analysis using a circular window
>> >>> >> >> >> with
>> >>> >> >> >> r.neighbors method=sum -c
>> >>> >> >> >>
>> >>> >> >> >> Optionally you could use the gauss option of r.neighbors to
>> >>> >> >> >> get
>> >>> >> >> >> an
>> >>> >> >> >> equivalent to v.kernel kernel=gaussian
>> >>> >> >> >>
>> >>> >> >> >> HTH,
>> >>> >> >> >>
>> >>> >> >> >> Markus M
>> >>> >> >> >>
>> >>> >> >> >> >
>> >>> >> >> >> > I inspected v.kernel's main.c
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> > (http://trac.osgeo.org/grass/browser/grass/trunk/vector/v.kernel/main.c),
>> >>> >> >> >> > and looks like v.kernel uses an output-centric method
>> >>> >> >> >> > (using
>> >>> >> >> >> > Bill's
>> >>> >> >> >> > wording)
>> >>> >> >> >> > of calculating the output, which seems like O(n^2)
>> >>> >> >> >> > complexity.
>> >>> >> >> >> >
>> >>> >> >> >> > So I guess what I'm getting at is it appears to me that the
>> >>> >> >> >> > algorithm
>> >>> >> >> >> > behind
>> >>> >> >> >> > GRASS GIS's v.kernel is straightforward but is a greedy
>> >>> >> >> >> > algorithm
>> >>> >> >> >> > (http://en.wikipedia.org/wiki/Greedy_algorithm), which is
>> >>> >> >> >> > fine,
>> >>> >> >> >> > but
>> >>> >> >> >> > it
>> >>> >> >> >> > make
>> >>> >> >> >> > take a while to execute. Is this true?
>> >>> >> >> >> >
>> >>> >> >> >> > Is there not spatial indexing I could add to the dataset?
>> >>> >> >> >> > I've
>> >>> >> >> >> > done
>> >>> >> >> >> > various
>> >>> >> >> >> > Google searches on that and can't come up with anything
>> >>> >> >> >> > clear.
>> >>> >> >> >> >
>> >>> >> >> >> > Aren
>> >>> >> >> >> >
>> >>> >> >> >> > _______________________________________________
>> >>> >> >> >> > grass-user mailing list
>> >>> >> >> >> > grass-user at lists.osgeo.org
>> >>> >> >> >> > http://lists.osgeo.org/mailman/listinfo/grass-user
>> >>> >> >> >> >
>> >>> >> >> >
>> >>> >> >> >
>> >>> >
>> >>> >
>> >>
>> >>
>> >
>
>


More information about the grass-user mailing list