[GRASS-user] v.rast.stats and Lidar data

Wesley Roberts wroberts at csir.co.za
Thu Jun 19 10:10:06 EDT 2008

Hi Andy and Hamish,

Thanks for the help, I am starting to get things going...almost. I am having trouble with the 'awk' command. After some fiddling with the command you suggested I have managed to run it with indifferent results. Here is the command I am using.

cat Area45TestNon-Ground.xyz | r.what in=grd_dtm | awk 'BEGIN {FS = "|"}; {print $1, $2, $3, $4, $3-$4}' > Area45CanopyHeight.xyz

I generated grd_dtm using v.surf.rst. Area45TestNon-Ground.xyz contains the non-ground returns for my tile. I had to add the FS and BEGIN command to split the records as the original identified the x, y, z columns only as $1 and thus z could not be used to calculate height. r.what returns the raster value at each point no problem. My issue begins when I print the various columns to Area45CanopyHeight.xyz. If I print only $3 I get the original z values and my fourth column intensity together (Shouldnt FS take care of this). Strangely my output Area45CanopyHeight.xyz has the raster values and the difference i.e. canopy height, all should then be well. However when I try use v.in.ascii on a portion of the data (to save time, still testing) I get the following error.

Maximum input row length: 39
Maximum number of columns: 4
Minimum number of columns: 2

z column number > minimum last column number
(incorrect field separator?)

Where am I going wrong? I think it has to do with 'awk' command and I know this list is for GRASS but I thought you guys may be able to help me out. I have specified space as column separator. 

If I open Area45CanopyHeight.xyz using GEDIT the data are formatted as follows 

-73119.090 -3302000.060 1038.310 0.0
1038.0426025391 0.267397
-73119.100 -3302000.060 1038.270 0.0
1038.0380859375 0.231914
-73001.800 -3302078.460 1023.450 0.0
1022.9335327148 0.516467
-73006.690 -3302075.130 1024.060 0.0
1023.5599365234 0.500063
-73000.920 -3302020.790 1033.830 0.0
1033.3255615234 0.504438
-73008.000 -3302019.730 1034.740 0.0
1034.205078125 0.534922

Many thanks for your help.


Wesley Roberts MSc.
Researcher: Forest Assessment (Remote Sensing & GIS)
Forestry and Forest Products Research Centre
Tel: +27 (31) 242-2353
Fax: +27 (31) 261-1216

"To know the road ahead, ask those coming back."
- Chinese proverb
>>> "andrew haywood" <ahaywood3 at gmail.com> 06/18/08 11:56 PM >>>
Hi Wesley,

as Hamish points out r.in.xyz is the way to go. I have been working with
large lidar data files over forest for a couple of months now and find the
intersection between the ground and the vegetation to create the Digital
Canopy Height Model - quite time consuming. Here is my basic workflow

# read in the ground using v.in.ascii with topology turned off
v.in.ascii -tzbr input=grd.txt  output=grd_pts x=2 y=3 z=4 fs=" "
# create the ground surface using v.surf.rast (this is useful when you have
irregularly spaced strikes under dense canopies)
v.surf.rst grd_pts elev=grd_dtm npmin=120 segmax=25 layer=0

# create a flat vegetation file with non-ground strikes intersecting the DTM
(this can be quite time consuming)
# I use awk to calculate the difference between the canopy elevation and
ground elevation = canopy height
cat AreaThreeNonGround.xyz  | r.what in=grd_dtm | awk 'print $1, $2, $3, $4,
$5, $3-$5)' | veg.txt

#I then use r.in.xyz to calculate statistics of binned data from the flat
vegetation file
g.region res=10
#calculate the mean vegetation height greater than 0.5m for a 10m cell
r.in.xyz input=veg.txt output=mean_10m method=mean type=FCELL fs='|' x=1 y=2
z=6 zrange=0.5,200 percent=100
#calculate the 90th vegetation height percentile
r.in.xyz input=veg.txt output=mean_p90_10m method=percentile pth=90
type=FCELL fs='|' x=1 y=2 z=6 zrange=0.5,200 percent=100

## I think the r.in.xyz has most of the statistics that you would like to
## reading in the point cloud into R can be quite memory intensive
## but it is fun to do



On Thu, Jun 19, 2008 at 3:44 AM, Hamish <hamish_b at yahoo.com> wrote:

> Hi Wesley,
> > I have some LiDAR data of a forested area and I am trying
> > to calculate tree heights / canopy heights based first and
> > second return points (pre-filtered ground and non-ground).
> > I have two questions that I hope some of you may be able to
> > help me with. Firstly I have windowed / clipped my data to a
> > small 'test' region with about 150 000 non-ground
> > and 100 000 ground points.
> >
> > 1. I have interpolated the ground points using idw in grass
> > and have taken that as my ground surface. I would now like
> > to assign an extra column to my non-ground vector file and
> > populate it with the raster values of the DSM to calculate
> > actual point height above ground (points are currently in
> > meters above sea-level). Traditionally, one would just
> > create a digital surface model and a canopy model and
> > subtract the two to get tree canopy height. I would like to
> > do the subtraction in the database and then interpolate the
> > result. I am trying to do this with v.rast.stats and it is
> > taking a really long time. How can I speed this up? Reading
> > the help page I also see that the vector data are rasterised
> > and values are assigend based on categories. I dont want
> > categories, I am just interested in the raster value
> > directly below each point? Is v.rast.stats the correct
> > application to use?
> No, r.in.xyz is. ;)  Well at least it is much less trouble to work with.
> (or if you want to stick with your current method, use v.what.rast instead
> of v.rast.stats to pick off single points at the data point; and v.surf.rst
> might make a nicer surface than IDW, at the risk of overshoots)
> > 2. While the above mentioned data set is fairly small my
> > largest data set is around 700MB with
> >
> > wc -l  =  20602625 AreaThreeNonGround.xyz
> >
> > If I import using v.in.ascii without a db and topology will
> > I still be able to manipulate the vector data as described
> > above
> without a DB you can not populate a DB..
> v.surf.* have been modified to work even without topology, just for this
> case.
> r.in.xyz should handle 700MB input|20m points without complaint.
> > and will I be able to use R scripts and functions on
> > the data set?
> don't know.
> > Apologies for the long mail, I hope it is not too
> > confusing.
> >
> > Many thanks and looking forward to your comments and
> > suggestions.
> hopefully r.in.xyz solves your problems, it is updated with more statistics
> and features in GRASS 6.3.0 if you can get your hands on that.
> Hamish
> _______________________________________________
> grass-user mailing list
> grass-user at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/grass-user

This message is subject to the CSIR's copyright terms and conditions, e-mail legal notice, and implemented Open Document Format (ODF) standard. 
The full disclaimer details can be found at http://www.csir.co.za/disclaimer.html.

This message has been scanned for viruses and dangerous content by MailScanner, 
and is believed to be clean.  MailScanner thanks Transtec Computers for their support.

More information about the grass-user mailing list