[GRASS-user] combine & edit multiple text files

David Finlayson david.p.finlayson at gmail.com
Tue Aug 8 01:44:50 EDT 2006


The modern solution for problems like these is a scripting language like Perl
or Python.

In Python, a simple script for working with columns of data might look like
this:

infile = "test"   # path to your combined input file
fin = open(infile)
for record in fin:
    fields = record.split()   # split the record into fields on whitespace
    date = fields[0]          # pick the fields you want
    time = fields[1]

    ...

    value2 = fields[9]

    # print them to stdout or write to a file
    # (%s because the fields are still strings at this point)
    print("%s %s %s" % (date, time, value2))
fin.close()


Run the script and capture the output to a file:
python script.py > bigfile.txt
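
For the two layouts quoted below, a rough sketch of the column selection could look like the following. The function name extract is made up for illustration, and telling the layouts apart by the length of the date field (6 digits versus 8) is only an assumption based on the sample rows; check it against the real files before relying on it.

```python
def extract(line):
    """Return (date, time, lat, lon) for one record, or None for headers and blank lines."""
    # split() with no argument treats any run of spaces as a single delimiter,
    # so the padding in front of values like -38.379 or 120.00 does not matter
    fields = line.split()
    if len(fields) < 5 or not fields[0].isdigit():
        return None  # blank line or header row
    if len(fields[0]) == 6:
        # assumed layout 1: Date Time Lat Lon NDVI Station
        date, time, lat, lon = fields[0:4]
    else:
        # assumed layout 2: Date Orbit Time Lat Lon
        date, time, lat, lon = fields[0], fields[2], fields[3], fields[4]
    return date, time, float(lat), float(lon)


if __name__ == "__main__":
    samples = [
        "Date   Time       Lat       Lon     NDVI  Station",
        "020201 032428.163  -38.379  -66.334 -.-- ESR",
        "    20030101        4384     81704.016    19.364  -155.103",
    ]
    for line in samples:
        print(extract(line))
```

Because the split is on whitespace rather than on a single space, the same function can be run over the concatenated file without processing the two formats separately.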

I find cut, paste, sed work well for quick jobs (and they would work in your
case). But as soon as I need to look up the documentation on sed I have
usually reached the point where a Python script would be easier to
implement. For that reason, I never use awk any more.
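
The region question can be handled in the same script. Here is a minimal sketch, assuming the lat/lon values in the files are decimal degrees with north and east positive (which the sample rows suggest). The helper names dms_to_degrees and in_region are hypothetical; the bounds are the ones from the quoted message.

```python
def dms_to_degrees(text):
    """Convert a string such as '20:00:01.49976N' to signed decimal degrees."""
    hemisphere = text[-1].upper()
    degrees, minutes, seconds = (float(part) for part in text[:-1].split(":"))
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # south and west are negative in the usual signed convention
    return -value if hemisphere in "SW" else value


# region bounds copied from the quoted message
NORTH = dms_to_degrees("20:00:01.49976N")
SOUTH = dms_to_degrees("5:00:01.499767N")
WEST = dms_to_degrees("115:00:01.5012E")
EAST = dms_to_degrees("130:00:01.501193E")


def in_region(lat, lon):
    """True if the point lies inside the bounding box (edges included)."""
    return SOUTH <= lat <= NORTH and WEST <= lon <= EAST


if __name__ == "__main__":
    # one record from the sample data; it falls outside the box
    fields = "20030101  4384  81704.016  19.364  -155.103".split()
    lat, lon = float(fields[3]), float(fields[4])
    if in_region(lat, lon):
        print(" ".join(fields))
```

Inside the main loop, you would print a record only when in_region() returns True for its lat and lon.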

My 2 cents,

David






On 8/7/06, maning sambale <emmanuel.sambale at gmail.com> wrote:
>
> Hi!
>
> I have a number of ASCII files downloaded from the ATSR fire project from
> the ESA Ionia showing monthly fire incidences from 1996-2006.  I
> intend to combine all these files, remove unwanted columns and get the
> records from my current region/study area only. All records combined
> is 929,155 records!  My guess is I need to use the cat, cut, awk
> commands.
>
> Challenge: the files have different record formatting
>
> file 1 is like this (take note of the space as the delimiter):
>
> Date   Time       Lat       Lon     NDVI  Station
> 020201 032428.163  -38.379  -66.334 -.-- ESR
> 020201 032428.163  -38.375  -66.323 -.-- ESR
> 020201 032428.312  -38.378  -66.359 -.-- ESR
> 020201 032428.312  -38.374  -66.348 -.-- ESR
> 020201 032428.312  -38.371  -66.337 -.-- ESR
>
> file 2 looks like this:
>     Date                Orbit  Time           Lat         Lon
>     20030101        4384     81704.016    19.364  -155.103
>     20030101        4384     81704.164    19.373  -155.105
>     20030101        4384     81704.164    19.375  -155.096
>     20030101        4385    100833.648    56.638   161.281
>     20030101        4386    130756.352   -20.340   134.099
>
> I only need the columns for date, time, lat, lon
>
> Here's what I did:
>
> #combine all file (monthly)
> cat 9904ESA01.FIRE 9905ESA01.FIRE 9906ESA01.FIRE 9907ESA01.FIRE
> 9908ESA01.FIRE ... > test
>
> # cut only desired columns (1-4); delimiter is space ' '
> cut -d' ' -f1 test > 1
> cut -d' ' -f2 test > 2
> cut -d' ' -f3 test > 3
> cut -d' ' -f4 test > 4
>
> # combine all columns
> paste 1 2 3 4 > test5
>
> example output:
>
> 021231 223941.761   11.035   -5.016 -.-- ESR
> 021231 224005.303   12.226   -6.243 -.-- ESR
>     20030101        4380     25934.057   -37.022   -69.589
>     20030101        4382     45951.090    33.005  -110.772
>
> The problem is for the file example 1, lat and lon columns contain
> spaces other than the delimiter example " -38.00" while another is
> "120.00"  In the file2 example, more spaces are there.  I think I need
> to process different file formats separately but how do I solve the
> problem for spaces in the lat/lon columns?
>
> One last question how do I get the records for my current region only?
>
> north:      20:00:01.49976N
> south:      5:00:01.499767N
> west:       115:00:01.5012E
> east:       130:00:01.501193E
>
>
> I'm starting to understand awk (reading the gawk manual right now) but
> it may take a while to get it to do something magical.
>
> Thanks!
>
> Maning
>
> --
> |---------|----------------------------------------------------------|
> | __.-._  |"Ohhh. Great warrior. Wars not make one great." -Yoda     |
> | '-._"7' |"Freedom is still the most radical idea of all" -N.Branden|
> |  /'.-c  |Linux registered user #402901, http://counter.li.org/     |
> |  |  /T  |http://esambale.wikispaces.com                            |
> | _)_/LI  |http://www.geocities.com/esambale/philbiodivmap/philbirds.html   |
> |---------|----------------------------------------------------------|
>



-- 
David Finlayson