[GRASSLIST:382] Re: a huge matrix ?
Roger Bivand
Roger.Bivand at nhh.no
Sat Jun 14 05:33:32 EDT 2003
I'm not sure that you found an answer to this yet, Ahmet - if you did,
please ignore this suggestion:
1) use r.mapcalc to create an integer combined index for the combinations
of your raster layers:
tmp1 = mapa*10 + mapb
tmp2 = tmp1*10 + mapc
....
or some such way of generating an integer index differentiating the
different types. If there are raster values outside 0-9, maybe reclass
first, or use 100 instead of 10 to left shift the index where there are
10-99 discrete values.
2) In R, generate the same integer index, and make a data frame with the
integer index in one column, and the probabilities in another.
3) Use the R/GRASS interface (or r.stats -1 and scan()) to move either the
whole of your index raster, or more likely tiles made by changing the
GRASS region, to R.
4) in R, use vectorised lookup to search for the probability value given
the index value - match() is fast and powerful (I haven't checked this,
but do use match() in other code - it will probably work much better on
tiles than your big raster) and assign the probability values to a new
vector.
5) move this vector back to GRASS as FCELL or DCELL. If you are tiling,
finish off by patching the tiles back together.
6) My guess is that the tiling, import, matching, and export can all be
done in a loop within R.
Rationale: the combined index avoids all the if statements - 11K if's is
rather a lot! I did think about pushing it all out say to postgresql and
using SQL (one table of RHS raster values, one table of lookups) but there
were still seven fields on each side to index on, so it seemed at least
worth trying to combine them automatically by left shifting (probably not
the right word) and addition, which are cheap operations in r.mapcalc.
Please let me know how you tackled this problem, your solution will have
general interest, and should be in the general GRASS knowledge base. I can
expand on this off-list if it is of any use.
Roger
On Wed, 11 Jun 2003, Glynn Clements wrote:
>
> orkun wrote:
>
> > >>I have a huge matrix containing 7 column and 10500 rows.
> > >>Each columns stand for different maps. Each rows represent "unique"
> > >>combinations of catagories of maps and probability values
> > >>of each row.
> > >>Something like:
> > >>
> > >> map1 map2 map3 ... prob
> > >>1 cat1 x x x x
> > >>2 cat2 x x x x
> > >>. .
> > >>. .
> > >>10500 .
> > >>
> > >>So I have 10500 if-clause to create a probability map.
> > >>I tried r.mapcalc (as from a text file) but grass gave "stack overflow"
> > >>error.
> > >>
> > >>
> > >
> > >I'm not surprised.
> > >
> > >
> > >
> > >>Is there anyone making suggestions ?
> > >>
> > >>
> > >
> > >You need to describe your problem in more detail; it isn't clear what
> > >you are trying to calculate.
> > >
> > >
> > >
> > Thank you for your interest
> >
> > You wanted to know the problem in more detail:
> >
> > what I did:
> > 1- using r.stat command I produced a text file for statistical calculations
> > 2- using R statistical package I produced probability results of each
> > map's for each categories.
> > it means 7 columns of maps 10500 rows plus probability of each
> > combination. Actually it is a contingency table. Any data in table
> > (matrix) is resulting from unique combination of maps and their
> > categories.
> > 3- I want to port these results to grass to create a map (of landslide
> > susceptibility map) . Firstly I thought I used r.mapcalc with piping
> > from text file. But because text file contains a huge matrix that is
> > made up of 10500 if statement (for same number of rows) process failed.
> > I attached a part of text file after modifying for r.mapcalc. you see
> > all ifs are unique.
>
> OK, so: each line is a rule; for each pixel you scan the list of rules
> to find the first match, then output the associated result. Right?
>
> The most straightforward approach (although not particularly
> efficient), would be to use an iterative approach, i.e.
>
> n=0
> last='null()'
> while read a1 a2 a3 a4 a5 a6 a7 a8 ; do
> result=pr2.pass$n
> r.mapcalc "$result = if (f28geo5==$a1 && f28slpcat==$a2 && f28concavity==$a3 && n_facing==$a4 && road_bas==$a5 &&str_buf==$a6 && f28f4==$a7, $a8, $last)"
> last=$result
> done < rules.txt
>
> where rules.txt looks like:
>
> 21 6 1 1 1 1 0 701.22
> 7 4 2 1 1 2 1 702.06
> ...
>
> Alternatively, you could combine this with your initial approach, only
> matching a limited number of rules on each pass, and using several
> passes.
>
> However, I note that there seem to be very few possible values for
> most of the maps. In that case, a far more efficient approach would be
> to:
>
> 1. Reclass each input map so that all of the significant categories
> are consecutively numbered from zero (0,1,2,3,...); any insignificant
> categories would be merged into the last category.
>
> 2. Compute a composite map, using e.g.
>
> composite = f28geo5 + 22 * (f28slpcat + 9 * (f28concavity + 3 * ...
>
> where the numbers are the total number of categories in the preceding
> map.
>
> 3. Assign the values as labels for the composite categories.
>
> 4. Use "@map" in r.mapcalc to use the label instead of the category.
>
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no
More information about the grass-user
mailing list