[GRASS-user] combine & edit multiple text files
maning sambale
emmanuel.sambale at gmail.com
Mon Aug 14 06:19:47 EDT 2006
Finally did it, using awk, cat, sort. Maybe not the best way, but
gets the job done. Thank you!
# combine 96-02 data series
cat 0001ESA01.FIRE 0002ESA01.FIRE .... > file1
# keep columns 1 to 4 (date, time, lat, lon)
awk '{print $1, $2, $3, $4}' file1 > file2
# combine 03-06 data series
cat 200301ALGO1.FIRE 200302ALGO1.FIRE 200303ALGO1.FIRE .... > file3
# keep columns 1, 3 to 5 (date, time, lat, lon; drop orbit)
awk '{print $1, $3, $4, $5}' file3 > file4
# combine both reduced files
cat file2 file4 > bigfile
sort -k 3 -g bigfile > file5                         # numeric sort on column 3 (lat)
awk '$3 == "5.000", $3 == "21.000"' file5 > file6    # keep the lat range
awk 'END { print NR }' file6                         # count lines
sort -k 4 -g file6 > file7                           # numeric sort on column 4 (lon)
awk '$4 == "114.795", $4 == "126.175"' file7 > file8 # keep the lon range
awk 'END { print NR }' file8                         # count lines
# import to GRASS
cat file8 | v.in.ascii out=fire_96_to_06_astr x=4 y=3 fs=" " \
    columns='label_date varchar(20), label_time varchar(20), x double, y double'
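For completeness, the sort + exact-match range passes are fragile (they only work if those exact values appear in the data); a single numeric awk comparison does the same selection in one pass. A sketch, run here on two made-up sample rows standing in for the combined file:

```shell
# two sample rows (date time lat lon) standing in for the combined file
printf '020201 032428.163 10.500 120.000\n020201 032428.163 30.000 120.000\n' > bigfile
# one pass: keep rows with lat (col 3) in [5, 21] and lon (col 4) in [114.795, 126.175]
awk '$3 >= 5 && $3 <= 21 && $4 >= 114.795 && $4 <= 126.175' bigfile > filtered
awk 'END { print NR }' filtered    # count the matching lines
```

Since the comparison is numeric, no sorting is needed first.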
Maning
On 8/12/06, David Finlayson <david.p.finlayson at gmail.com> wrote:
> Simple, yes, but it took me a few minutes on Google to remember the awk
> syntax. Unix is powerful, but it isn't intuitive.
>
> David
>
>
> On 8/11/06, maning sambale <emmanuel.sambale at gmail.com> wrote:
> > David,
> >
> > thank you! that's the one I need (for now). Funny how simple it is.
> > Another funny anecdote: about a year ago I passed by our GIS lab and
> > saw a girl editing a very large ascii file (mouse click, edit, edit,
> > save, next line), much the same as the files I'm manipulating right now.
> > I asked her if there might be a better way of doing this. She said it's
> > the only way her instructor and the lab technician taught them. :)
> >
> > cheers,
> >
> > Maning
> >
> > On 8/10/06, David Finlayson <david.p.finlayson at gmail.com> wrote:
> > > Try this to print columns 1 and 3. I think it will work on all of your
> > > files no matter how many spaces are in between:
> > >
> > > cat file | awk '{print $1, $3}'
> > >
> > > David
> > >
> > >
> > > On 8/9/06, maning sambale <emmanuel.sambale at gmail.com > wrote:
> > > > David & Kevin,
> > > >
> > > > Yes, python or perl would be great. But what I need right now is a
> > > > quick (maybe dirty) approach. I do intend to study python as I've
> > > > heard a lot about it. But not this time, I'm trying to study Linux
> > > > tools the "modular way":)
> > > >
> > > > Cheers,
> > > >
> > > > Maning
> > > >
> > > > On 8/8/06, Slover, Kevin < kslover at dewberry.com> wrote:
> > > > > Maning,
> > > > > As David says, python or perl are used now for manipulating text
> > > > > files. I have done several quick scripts for doing this with Perl
> > > > > (thanks David for the python script, bout time I learn to use it).
> > > > > A basic perl script would look like this (and note, my Perl is not
> > > > > great, and am sure there are many other ways to do this):
> > > > >
> > > > > Explanation: Files.txt is an ls/dir listing of the files to be
> > > > > combined. The script reads in each file, strips any header
> > > > > information from the columns, and outputs everything into one
> > > > > file. Fairly simple, and a quick web search for file manipulation
> > > > > using Perl will probably turn up a better explanation.
> > > > >
> > > > > $in_file = "Files.txt";
> > > > > $out_file = "outfile.txt";
> > > > >
> > > > > open (INFILE, $in_file) || die "INFILE";
> > > > > open (OUTFILE, ">$out_file") || die "OUTFILE";
> > > > >
> > > > > @infiles = <INFILE>;
> > > > > close(INFILE);
> > > > >
> > > > > print OUTFILE "z,x,y\n";
> > > > >
> > > > > foreach $in_files (@infiles)
> > > > > {
> > > > >     chomp($in_files);    # strip the newline left by reading Files.txt
> > > > >
> > > > >     open (INFILE1, $in_files) || die "Cannot open $in_files";
> > > > >     while (<INFILE1>)
> > > > >     {
> > > > >         chomp($_);
> > > > >         ($x, $y, $z) = split ',', $_;
> > > > >
> > > > >         # skip header lines: keep a row only if field 1 is numeric
> > > > >         if ($x =~ /^-?[\d.]/) {
> > > > >             print OUTFILE "$z,$x,$y\n"; }
> > > > >     }
> > > > >
> > > > >     close(INFILE1);
> > > > > }
> > > > >
> > > > > close(OUTFILE);
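The same read-a-file-list, strip-headers, reorder-and-append idea can be sketched in shell as well; the sample file names below are made up for illustration, and the records are assumed to be comma-separated x,y,z rows:

```shell
# sample input files standing in for the real data (hypothetical names)
printf 'x,y,z\n1.0,2.0,3.0\n' > a.csv
printf 'x,y,z\n4.0,5.0,6.0\n' > b.csv
printf 'a.csv\nb.csv\n' > Files.txt

# combine every file listed in Files.txt, skipping header lines:
# a row is kept only if its first field starts with a digit or a sign
echo "z,x,y" > outfile.txt
while read -r f; do
    awk -F, '$1 ~ /^-?[0-9]/ { print $3 "," $1 "," $2 }' "$f" >> outfile.txt
done < Files.txt
```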
> > > > >
> > > > >
> > > > > Kevin Slover
> > > > > Coastal / GIS Specialist
> > > > > 2872 Woodcock Blvd Suite 230
> > > > > Atlanta GA 30341
> > > > > (P) 678-530-0022
> > > > > (F) 678-530-0044
> > > > >
> > > > > -----Original Message-----
> > > > > From: grassuser-bounces at grass.itc.it
> > > > > [mailto:grassuser-bounces at grass.itc.it] On Behalf Of maning sambale
> > > > > Sent: Tuesday, August 08, 2006 12:12 AM
> > > > > To: grassuser at grass.itc.it
> > > > > Subject: [GRASS-user] combine & edit multiple text files
> > > > >
> > > > > Hi!
> > > > >
> > > > > I have a number of ascii files downloaded from the ATSR fire
> > > > > project on the ESA Ionia site, showing monthly fire incidences from
> > > > > 1996-2006. I intend to combine all these files, remove unwanted
> > > > > columns, and keep only the records from my current region/study
> > > > > area. All records combined, that is 929,155 records! My guess is I
> > > > > need to use the cat, cut, and awk commands.
> > > > >
> > > > > Challenge: the files have different record formatting
> > > > >
> > > > > file 1 is like this (take note of the space as the delimiter):
> > > > >
> > > > > Date Time Lat Lon NDVI Station
> > > > > 020201 032428.163 -38.379 -66.334 -.-- ESR
> > > > > 020201 032428.163 -38.375 -66.323 -.-- ESR
> > > > > 020201 032428.312 -38.378 -66.359 -.-- ESR
> > > > > 020201 032428.312 -38.374 -66.348 -.-- ESR
> > > > > 020201 032428.312 -38.371 -66.337 -.-- ESR
> > > > >
> > > > > file 2 looks like this:
> > > > > Date            Orbit   Time            Lat         Lon
> > > > > 20030101 4384 81704.016 19.364 -155.103
> > > > > 20030101 4384 81704.164 19.373 -155.105
> > > > > 20030101 4384 81704.164 19.375 -155.096
> > > > > 20030101 4385 100833.648 56.638 161.281
> > > > > 20030101 4386 130756.352 -20.340 134.099
> > > > >
> > > > > I only need the columns for date, time, lat, lon
> > > > >
> > > > > Here's what I did:
> > > > >
> > > > > # combine all files (monthly)
> > > > > cat 9904ESA01.FIRE 9905ESA01.FIRE 9906ESA01.FIRE 9907ESA01.FIRE
> > > > > 9908ESA01.FIRE ... > test
> > > > >
> > > > > # cut only desired columns (1-4); delimiter is space ' '
> > > > > cut -d' ' -f1 test > 1
> > > > > cut -d' ' -f2 test > 2
> > > > > cut -d' ' -f3 test > 3
> > > > > cut -d' ' -f4 test > 4
> > > > >
> > > > > # combine all columns
> > > > > paste 1 2 3 4 > test5
> > > > >
> > > > > example output:
> > > > >
> > > > > 021231 223941.761 11.035 -5.016 -.-- ESR
> > > > > 021231 224005.303 12.226 -6.243 -.-- ESR
> > > > > 20030101 4380 25934.057 -37.022 -69.589
> > > > > 20030101 4382 45951.090 33.005 -110.772
> > > > >
> > > > > The problem is that in the file 1 example, the lat and lon columns
> > > > > contain spaces beyond the delimiter: one value is " -38.00" while
> > > > > another is "120.00". In the file 2 example there are even more
> > > > > spaces. I think I need to process the different file formats
> > > > > separately, but how do I solve the problem of extra spaces in the
> > > > > lat/lon columns?
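For what it's worth, this is exactly where cut and awk differ: cut -d' ' starts a new field at every single space, so runs of padding spaces produce empty fields, while awk by default splits on runs of whitespace. A small illustration on a padded row like the file 1 sample:

```shell
# a file-1 style row padded with double spaces
line='020201  032428.163  -38.379  -66.334'
echo "$line" | cut -d' ' -f2      # empty: every single space starts a new field
echo "$line" | awk '{print $2}'   # 032428.163: awk collapses the runs
```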
> > > > >
> > > > > One last question: how do I get the records for my current region
> > > > > only?
> > > > >
> > > > > north: 20:00:01.49976N
> > > > > south: 5:00:01.499767N
> > > > > west: 115:00:01.5012E
> > > > > east: 130:00:01.501193E
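Those bounds work out to roughly 5.0004-20.0004 N and 115.0004-130.0004 E in decimal degrees; a sketch of the deg:min:sec conversion in awk (the helper name is made up):

```shell
# convert a deg:min:sec bound to decimal degrees (N/S/E/W suffix stripped first)
dms2dec() { echo "${1%[NSEW]}" | awk -F: '{ printf "%.6f\n", $1 + $2/60 + $3/3600 }'; }
dms2dec 20:00:01.49976N     # north edge
dms2dec 115:00:01.5012E     # west edge
```

With the bounds in decimal degrees, a numeric comparison on the lat/lon columns in awk can then keep only the in-region records.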
> > > > >
> > > > >
> > > > > I'm starting to understand awk (reading the gawk manual right now),
> > > > > but it may take a while before I can do something magical.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Maning
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > grassuser mailing list
> > > > > grassuser at grass.itc.it
> > > > > http://grass.itc.it/mailman/listinfo/grassuser
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > David Finlayson
> >
> >
> >
> >
>
>
>
> --
> David Finlayson
--
|---------|----------------------------------------------------------|
| __.-._ |"Ohhh. Great warrior. Wars not make one great." -Yoda |
| '-._"7' |"Freedom is still the most radical idea of all" -N.Branden|
| /'.-c |Linux registered user #402901, http://counter.li.org/ |
| | /T |http://esambale.wikispaces.com|
| _)_/LI |http://www.geocities.com/esambale/philbiodivmap/philbirds.html |
|---------|----------------------------------------------------------|