[GRASS-user] Does v.kernel have to take 16+ hours?
Markus Metz
markus.metz.giswork at gmail.com
Thu Dec 6 00:29:57 PST 2012
On Thu, Dec 6, 2012 at 3:58 AM, Aren Cambre <aren at arencambre.com> wrote:
> It's gotten slow again. This run will probably take more than 10 hours.
> However, I am using a standard deviation of 1000. Is that what could be
> causing this?
Yes. With a standard deviation of 1000, the search radius is now 4000,
that is, for each cell a 8000x8000 box is searched. With many densely
packed points, this can take quite some time.
Markus M
>
> v.kernel input=tickets at PERMANENT output=tickets_new_heatmap_1000
> stddeviation=1000
> STDDEV: 1000.000000
> RES: 18.290457 ROWS: 2370 COLS: 2650
>
> Writing output raster map using smooth parameter=1000.000000.
>
> Normalising factor=6482635.018778.
>
> On Sat, Nov 24, 2012 at 9:03 PM, Aren Cambre <aren at arencambre.com> wrote:
>>
>> I installed r53983. The v.kernel execution that took almost a day now
>> executes in 25.5 minutes. Thank you!
>>
>> Aren
>>
>>
>> On Fri, Nov 23, 2012 at 12:51 PM, Markus Metz
>> <markus.metz.giswork at gmail.com> wrote:
>>>
>>> On Fri, Nov 23, 2012 at 5:35 PM, Aren Cambre <aren at arencambre.com> wrote:
>>> > Thanks!
>>> >
>>> > I am not familiar with GRASS's release customs. Will this become part
>>> > of a
>>> > binary release soon, or should I just pull down the latest release in
>>> > the
>>> > 6.4.2 trunk? I'm assuming this has been merged into the trunk...
>>>
>>> It should be available as a binary for Windows by tomorrow in the
>>> nightly builds [0].
>>>
>>> Markus M
>>>
>>> [0] http://wingrass.fsv.cvut.cz/grass64/
>>>
>>> >
>>> > Aren
>>> >
>>> >
>>> > On Fri, Nov 23, 2012 at 7:32 AM, Markus Metz
>>> > <markus.metz.giswork at gmail.com>
>>> > wrote:
>>> >>
>>> >> On Fri, Nov 23, 2012 at 2:07 PM, Aren Cambre <aren at arencambre.com>
>>> >> wrote:
>>> >> > Isn't taking about 10,000% too much time considered a bug? :-)
>>> >>
>>> >> Hmm, yes. v.kernel is fixed in devbr6 and relbr6 with r53982 and
>>> >> r53983, respectively.
>>> >>
>>> >> Markus M
>>> >>
>>> >> >
>>> >> > On Nov 23, 2012 5:11 AM, "Markus Metz"
>>> >> > <markus.metz.giswork at gmail.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> On Fri, Nov 23, 2012 at 4:14 AM, Aren Cambre <aren at arencambre.com>
>>> >> >> wrote:
>>> >> >> > I'm able to reproduce reliably here. I'll email you details
>>> >> >> > privately.
>>> >> >>
>>> >> >> Thanks. I can confirm that v.kernel takes a long time in GRASS 6
>>> >> >> with
>>> >> >> the settings provided by you. It does not crash, however.
>>> >> >>
>>> >> >> I can speed up v.kernel in GRASS 6 to complete in 10 minutes
>>> >> >> instead
>>> >> >> of 16+ hours, but I am not sure if this fix can/will go into GRASS
>>> >> >> 6.4
>>> >> >> because by now only bugs should be fixed.
>>> >> >>
>>> >> >> Markus M
>>> >> >>
>>> >> >> >
>>> >> >> > Aren
>>> >> >> >
>>> >> >> >
>>> >> >> > On Thu, Nov 22, 2012 at 9:02 AM, Markus Metz
>>> >> >> > <markus.metz.giswork at gmail.com>
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >> On Sat, Nov 17, 2012 at 4:06 PM, Aren Cambre
>>> >> >> >> <aren at arencambre.com>
>>> >> >> >> wrote:
>>> >> >> >> > I have a dataset of just over 700,000 incidents that happened
>>> >> >> >> > in
>>> >> >> >> > square-ish
>>> >> >> >> > Texas county that's about 30 miles on each side.
>>> >> >> >> >
>>> >> >> >> > Here's the parameters reported by v.kernel as it's executing:
>>> >> >> >> >
>>> >> >> >> > STDDEV: 1000.000000
>>> >> >> >> > RES: 111.419043 ROWS: 458 COLS: 447
>>> >> >> >> >
>>> >> >> >> > Writing output raster map using smooth parameter=1000.000000.
>>> >> >> >> >
>>> >> >> >> > Normalising factor=6482635.018778.
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > I am running this on a Windows 7 x64 machine with 8 GB RAM and
>>> >> >> >> > an
>>> >> >> >> > Intel
>>> >> >> >> > Core
>>> >> >> >> > i7 Q720 1.6 GHz with 4 physical cores. I notice that it's not
>>> >> >> >> > multithreaded,
>>> >> >> >> > only using 1 core.
>>> >> >> >> >
>>> >> >> >> > It takes about 16 hours to complete. Is this correct? I'd like
>>> >> >> >> > to
>>> >> >> >> > use
>>> >> >> >> > this
>>> >> >> >> > on a dataset with closer to 5 million records, and I'm really
>>> >> >> >> > concerned
>>> >> >> >> > how
>>> >> >> >> > long it may take.
>>> >> >> >>
>>> >> >> >> The time required by v.kernel is a function of the number of
>>> >> >> >> cells
>>> >> >> >> and
>>> >> >> >> the input parameter stddeviation. The larger any of these values
>>> >> >> >> is,
>>> >> >> >> the more time v.kernel will need. Nevertheless, I think that the
>>> >> >> >> 16+
>>> >> >> >> hours are not correct. I tested with a vector with 3 million
>>> >> >> >> points
>>> >> >> >> for a grid with 2700 rows and 1087 columns, magnitudes larger
>>> >> >> >> than
>>> >> >> >> the
>>> >> >> >> grid used by you. v.kernel completes in just over one minute.
>>> >> >> >>
>>> >> >> >> >
>>> >> >> >> > I posted my question about the 16+ hours at
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > http://gis.stackexchange.com/questions/41058/how-do-i-compute-v-kernel-maps-in-less-than-16-hours/.
>>> >> >> >> > Bill Huber, who si apparently knowledgeable about kernel
>>> >> >> >> > density
>>> >> >> >> > calculations in general, posted a response, and he felt like a
>>> >> >> >> > kernel
>>> >> >> >> > density map shouldn't take much time at all. But digging more
>>> >> >> >> > deeply,
>>> >> >> >> > turns
>>> >> >> >> > out he had come up with a kernel density calculation method
>>> >> >> >> > over a
>>> >> >> >> > decade
>>> >> >> >> > ago using Fourier transforms. See
>>> >> >> >> > http://www.directionsmag.com/features/convolution/129753 and
>>> >> >> >> > the
>>> >> >> >> > next
>>> >> >> >> > two
>>> >> >> >> > articles linked to it (they are short articles). Apparently
>>> >> >> >> > this
>>> >> >> >> > transforms
>>> >> >> >> > it from an O(n^2) problem to an O(n ln n) complexity problem.
>>> >> >> >>
>>> >> >> >> The approach of Bill Huber is raster-based, not vector based,
>>> >> >> >> making
>>> >> >> >> some things easier, at the cost of precision. The coordinate
>>> >> >> >> precision, however, is only needed for kernel functions other
>>> >> >> >> than
>>> >> >> >> uniform. In GRASS, you could get something like a raster-based
>>> >> >> >> density
>>> >> >> >> map by
>>> >> >> >>
>>> >> >> >> - exporting the points with v.out.ascii
>>> >> >> >> - re-importing the points with r.in.xyz method=n to get the
>>> >> >> >> number
>>> >> >> >> of
>>> >> >> >> points per cell
>>> >> >> >> - running a neighborhood analysis using a circular window with
>>> >> >> >> r.neighbors method=sum -c
>>> >> >> >>
>>> >> >> >> Optionally you could use the gauss option of r.neighbors to get
>>> >> >> >> an
>>> >> >> >> equivalent to v.kernel kernel=gaussian
>>> >> >> >>
>>> >> >> >> HTH,
>>> >> >> >>
>>> >> >> >> Markus M
>>> >> >> >>
>>> >> >> >> >
>>> >> >> >> > I inspected v.kernel's main.c
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > (http://trac.osgeo.org/grass/browser/grass/trunk/vector/v.kernel/main.c),
>>> >> >> >> > and looks like v.kernel uses an output-centric method (using
>>> >> >> >> > Bill's
>>> >> >> >> > wording)
>>> >> >> >> > of calculating the output, which seems like O(n^2) complexity.
>>> >> >> >> >
>>> >> >> >> > So I guess what I'm getting at is it appears to me that the
>>> >> >> >> > algorithm
>>> >> >> >> > behind
>>> >> >> >> > GRASS GIS's v.kernel is straightforward but is a greedy
>>> >> >> >> > algorithm
>>> >> >> >> > (http://en.wikipedia.org/wiki/Greedy_algorithm), which is
>>> >> >> >> > fine,
>>> >> >> >> > but
>>> >> >> >> > it
>>> >> >> >> > make
>>> >> >> >> > take a while to execute. Is this true?
>>> >> >> >> >
>>> >> >> >> > Is there not spatial indexing I could add to the dataset? I've
>>> >> >> >> > done
>>> >> >> >> > various
>>> >> >> >> > Google searches on that and can't come up with anything clear.
>>> >> >> >> >
>>> >> >> >> > Aren
>>> >> >> >> >
>>> >> >> >> > _______________________________________________
>>> >> >> >> > grass-user mailing list
>>> >> >> >> > grass-user at lists.osgeo.org
>>> >> >> >> > http://lists.osgeo.org/mailman/listinfo/grass-user
>>> >> >> >> >
>>> >> >> >
>>> >> >> >
>>> >
>>> >
>>
>>
>
More information about the grass-user
mailing list