[GRASS-user] combine & edit multiple text files

maning sambale emmanuel.sambale at gmail.com
Fri Aug 11 06:49:58 EDT 2006


David,

thank you!  That's the one I need (for now).  Funny how simple it is.
Another funny anecdote: about a year ago I passed by our GIS lab and
saw a girl editing a very large ASCII file by hand (mouse click, edit,
edit, save, next line), much the same as the files I'm manipulating
right now.  I asked her whether there might be a better way of doing
this.  She said it's the only way her instructor and the lab technician
taught them. :)

cheers,

Maning

On 8/10/06, David Finlayson <david.p.finlayson at gmail.com> wrote:
> Try this to print column 1 and 3. I think it will work on all of your files
> no matter how many spaces are in between:
>
> cat file | awk '{print $1, $3}'
>
> David
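
A minimal sketch of how the whitespace-splitting tip above could be stretched to cover both file layouts quoted further down in this thread. By default awk splits on runs of blanks and ignores leading blanks, so the uneven padding does not matter. The file names format1.txt, format2.txt and combined.txt and the variable has_orbit are made up for illustration, and the sketch assumes the second layout can be recognised by an "Orbit" header in column 2. The two date formats are left as they are.

```shell
# Sample rows copied from the two formats quoted in this thread.
cat > format1.txt <<'EOF'
Date   Time       Lat       Lon     NDVI  Station
020201 032428.163  -38.379  -66.334 -.-- ESR
020201 032428.163   -38.375  -66.323 -.-- ESR
EOF

cat > format2.txt <<'EOF'
    Date                Orbit  Time           Lat       Lon
    20030101        4384     81704.016    19.364  -155.103
    20030101        4385    100833.648    56.638   161.281
EOF

# FNR == 1 is the header line of each file: remember whether it has an
# Orbit column, then skip it. Every other line prints date, time, lat, lon.
awk '
FNR == 1  { has_orbit = ($2 == "Orbit"); next }
has_orbit { print $1, $3, $4, $5; next }
          { print $1, $2, $3, $4 }
' format1.txt format2.txt > combined.txt

cat combined.txt
```

Because the header check runs once per file, the same command should accept any mix of the two layouts on one command line.
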
>
>
> On 8/9/06, maning sambale <emmanuel.sambale at gmail.com> wrote:
> > David & Kevin,
> >
> > Yes, python or perl would be great.  But what I need right now is a
> > quick (maybe dirty) approach.  I do intend to study python as I've
> > heard a lot about it.  But not this time, I'm trying to study Linux
> > tools the "modular way":)
> >
> > Cheers,
> >
> > Maning
> >
> > On 8/8/06, Slover, Kevin <kslover at dewberry.com> wrote:
> > > Maning,
> > >   As David says, python or perl are used now for manipulating text
> > > files.  I have done several quick scripts for doing this with Perl
> > > (thanks David for the python script, bout time I learn to use it).  A
> > > basic perl script would look like this (and note, my Perl is not great,
> > > and am sure there are many other ways to do this) :
> > >
> > > Explanation:  Files.txt is a ls/dir listing of the wanted files to
> > > combine.  Then, the script reads in each file, stripping any sort of
> > > header information from the columns, and outputting everything into one
> > > file.  Fairly simple, and a quick search on the web for file
> > > manipulation using Perl will come up with probably a better explanation.
> > >
> > > use strict;
> > > use warnings;
> > >
> > > my $in_file  = "Files.txt";     # one input file name per line
> > > my $out_file = "outfile.txt";
> > >
> > > open(INFILE, '<', $in_file) || die "Cannot open $in_file: $!";
> > > open(OUTFILE, '>', $out_file) || die "Cannot open $out_file: $!";
> > >
> > > my @infiles = <INFILE>;
> > > close(INFILE);
> > > chomp(@infiles);                # drop the newline after each file name
> > >
> > > print OUTFILE "z,x,y\n";
> > >
> > > foreach my $name (@infiles)
> > > {
> > >         open(INFILE1, '<', $name) || die "Cannot open $name: $!";
> > >         while (<INFILE1>)
> > >         {
> > >                 chomp($_);
> > >                 next if $_ eq '';       # ignore blank lines
> > >                 my ($x, $y, $z) = split ',', $_;
> > >
> > >                 # skip the header line, whose first field is the literal "x"
> > >                 if ($x ne 'x') {
> > >                         print OUTFILE "$z,$x,$y\n";
> > >                 }
> > >         }
> > >         close(INFILE1);
> > > }
> > >
> > > close(OUTFILE);
> > >
> > >
> > > Kevin Slover
> > > Coastal / GIS Specialist
> > > 2872 Woodcock Blvd Suite 230
> > > Atlanta GA 30341
> > > (P) 678-530-0022
> > > (F) 678-530-0044
> > >
> > > -----Original Message-----
> > > From: grassuser-bounces at grass.itc.it
> > > [mailto: grassuser-bounces at grass.itc.it] On Behalf Of maning sambale
> > > Sent: Tuesday, August 08, 2006 12:12 AM
> > > To: grassuser at grass.itc.it
> > > Subject: [GRASS-user] combine & edit multiple text files
> > >
> > > Hi!
> > >
> > > I have a number of ASCII files downloaded from the ASTR fire project
> > > from the ESA Ionia showing monthly fire incidences from 1996-2006.  I
> > > intend to combine all these files, remove unwanted columns and keep
> > > only the records from my current region/study area. Combined, the
> > > files hold 929,155 records!  My guess is I need to use the cat, cut
> > > and awk commands.
> > >
> > > Challenge: the files have different record formatting
> > >
> > > file 1 is like this (take note of the space as the delimiter):
> > >
> > > Date   Time       Lat       Lon     NDVI  Station
> > > 020201 032428.163  -38.379  -66.334 -.-- ESR
> > > 020201 032428.163   -38.375  -66.323 -.-- ESR
> > > 020201 032428.312  -38.378  -66.359 -.-- ESR
> > > 020201 032428.312  -38.374  -66.348 -.-- ESR
> > > 020201 032428.312  -38.371  -66.337 -.-- ESR
> > >
> > > file 2 looks like this:
> > >     Date                Orbit  Time           Lat       Lon
> > >     20030101        4384     81704.016    19.364  -155.103
> > >     20030101        4384     81704.164    19.373  -155.105
> > >     20030101        4384     81704.164    19.375  -155.096
> > >     20030101        4385    100833.648    56.638   161.281
> > >     20030101        4386    130756.352   -20.340   134.099
> > >
> > > I only need the columns for date, time, lat, lon
> > >
> > > Here's what I did:
> > >
> > > # combine all files (monthly)
> > > cat 9904ESA01.FIRE 9905ESA01.FIRE 9906ESA01.FIRE 9907ESA01.FIRE \
> > >     9908ESA01.FIRE ... > test
> > >
> > > # cut only the desired columns (1-4); the delimiter is a space ' '
> > > cut -d' ' -f1 test > 1
> > > cut -d' ' -f2 test > 2
> > > cut -d' ' -f3 test > 3
> > > cut -d' ' -f4 test > 4
> > >
> > > # combine all columns
> > > paste 1 2 3 4 > test5
> > >
> > > example output:
> > >
> > > 021231 223941.761   11.035   -5.016 -.-- ESR
> > > 021231 224005.303   12.226   -6.243 -.-- ESR
> > >     20030101        4380     25934.057   -37.022   -69.589
> > >     20030101        4382     45951.090     33.005  -110.772
> > >
> > > The problem is that in the file 1 example the lat and lon columns are
> > > padded with extra spaces besides the delimiter, for example " -38.00"
> > > in one record and "120.00" in another.  In the file 2 example there
> > > are even more spaces.  I think I need to process the different file
> > > formats separately, but how do I solve the problem of the spaces in
> > > the lat/lon columns?
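
One possible way around the padding while staying with cut, sketched under the assumption that the fields themselves never contain a space: cut -d' ' treats every single space as a delimiter, so each run of spaces has to be collapsed to one first. The file names spaced.txt and squeezed.txt are hypothetical.

```shell
# Two sample rows from the file 1 format; note the extra space before -38.375.
printf '%s\n' \
  '020201 032428.163  -38.379  -66.334 -.-- ESR' \
  '020201 032428.163   -38.375  -66.323 -.-- ESR' > spaced.txt

# sed drops leading spaces, tr -s collapses each run of spaces into one,
# and only then can cut count the fields reliably.
sed 's/^ *//' spaced.txt | tr -s ' ' | cut -d' ' -f1-4 > squeezed.txt

cat squeezed.txt
```

With -f1-4, a single cut call takes the place of the four separate cut calls and the paste step.
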
> > >
> > > One last question: how do I get the records for my current region only?
> > >
> > > north:      20:00:01.49976N
> > > south:      5:00:01.499767N
> > > west:       115:00:01.5012E
> > > east:       130:00:01.501193E
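
A rough sketch of a bounding-box filter in awk. It assumes the records have already been reduced to "date time lat lon" columns in decimal degrees. The bounds are the region limits above converted by hand from D:M:S (for example 20:00:01.49976N is 20 + 1.49976/3600, about 20.000417). The file names points.txt and region.txt are hypothetical, and the third sample row is invented purely so that one record falls inside the box.

```shell
# Rows in "date time lat lon" order. The first two are taken from the thread;
# the last one is invented so that something falls inside the region.
printf '%s\n' \
  '20030101 81704.016 19.364 -155.103' \
  '021231 224005.303 12.226 -6.243' \
  '20030101 81704.016 12.500 121.000' > points.txt

# A pattern with no action prints every line for which it is true:
# lat ($3) between south and north, lon ($4) between west and east.
awk -v n=20.000417 -v s=5.000417 -v w=115.000417 -v e=130.000417 '
$3 >= s && $3 <= n && $4 >= w && $4 <= e
' points.txt > region.txt

cat region.txt
```

Passing the bounds with -v keeps them out of the awk program, so the same one-liner can be reused for a different region.
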
> > >
> > >
> > > I'm starting to understand awk (reading the gawk manual right now) but
> > > it may take a while before I can do something magical.
> > >
> > > Thanks!
> > >
> > > Maning
> > >
> > > --
> > > |---------|----------------------------------------------------------|
> > > | __.-._  |"Ohhh. Great warrior. Wars not make one great." -Yoda     |
> > > | '-._"7' |"Freedom is still the most radical idea of all" -N.Branden|
> > > |  /'.-c  |Linux registered user #402901, http://counter.li.org/     |
> > > |  |  /T  |http://esambale.wikispaces.com|
> > > | _)_/LI  |http://www.geocities.com/esambale/philbiodivmap/philbirds.html   |
> > > |---------|----------------------------------------------------------|
> > >
> > > _______________________________________________
> > > grassuser mailing list
> > > grassuser at grass.itc.it
> > > http://grass.itc.it/mailman/listinfo/grassuser
> > >
> > >
> >
> >
> >
>
>
>
> --
> David Finlayson


-- 
|---------|----------------------------------------------------------|
| __.-._  |"Ohhh. Great warrior. Wars not make one great." -Yoda     |
| '-._"7' |"Freedom is still the most radical idea of all" -N.Branden|
|  /'.-c  |Linux registered user #402901, http://counter.li.org/     |
|  |  /T  |http://esambale.wikispaces.com|
| _)_/LI  |http://www.geocities.com/esambale/philbiodivmap/philbirds.html   |
|---------|----------------------------------------------------------|



