[GRASSLIST:8906] distance algorithim

Joel Peter William Pitt joel.pitt at gmail.com
Sun Nov 6 21:30:19 EST 2005


Hi all,

I'm trying to optimise a grass module that takes two maps of a
phenomenon at different times. It calculates the distance between the
raster cells in the later map to the nearest cell in the earlier map.

Obviously when the earlier map has alot of cells present it takes a
while to find the closest.

Currently I'm optimising it by making a list of cell coordinates for
the cells in the earlier map. I only add cells to the list that are
not surrounded by other cells - ie I'm only adding cells from the
outlines of contiguous areas. Each cell present in the later map is
compared to the earlier map to check if the same raster cell is
present, if so the distance is 0.

I also split this list into a grid. The grid is basically a lower
resolution raster of the same map, but contains a list in each cell of
the coordinates that fall within the grid position.

e.g

Say the raster region is 100,100 and the grid spacing is 16

This gives a grid size of 7 rounding up from 6.25.

Say we have a coordinate in the later map: 34,50. This evaluates to
grid position (3, 4). Because 34/16 = 2.125 and 50/16=3.125, and each
is rounded up.
To find the nearest cell in the past map we compare distances to all
the coords associated with grid position (3,4) - we also compare the
distances to the 8 surrounding grid positions in case coords in one of
these is closer. If no coords are found then the grid positions
surrounding the surrounding positions are searched, etc.

The grid spacing has some effect on performance - and for my purposes
it seems like 16 does alright.

----

So, the point of this post is:

Is this the best way to be doing this? Does anyone know of a better
algorithm? I know for linear 1 dimensional data I could be using a
range of structures to find the closest, but for two dimensional data
I don't know any.

Alternatively, if my current method is reasonable are there any
additional things I could do to speed it up?

Note, I am doing stochastic dispersal simulations, and although this
does run a 25 timestep simulation within half an hour, I need to do as
many replications as possible. The major hold up at the moment is this
distance evaluation. Part of the problem is that the present cells are
very scattered, so only adding the boundary cells to the lists doesn't
have as great an impact as it usually would.

Any suggestions or comment appreciated, thanks!

Joel




More information about the grass-user mailing list