[GRASS-user] Starspan/GRASS question

Jonathan Greenberg greenberg at ucdavis.edu
Thu Mar 6 11:17:24 EST 2008


So if I understand this correctly, you have some polygon layer that 
defines the extent of your subbasins, which you want to use to query and 
summarize the raster values falling within them (e.g. precip and temp), 
correct?  How are you doing it currently?  The main speedup that 
starspan gets you is that the polygon -> raster query should be 
significantly faster than a zonal-stats-type analysis.  That said (and 
I'll need to talk to Carlos, the lead programmer of starspan, about 
this), there might be some tricks you can use if you are extracting with 
the same polygons and the same grids (in terms of raster size, number 
of samples and lines, geographic extent, etc.).  The idea is that there 
are basically two processing bottlenecks: 1) determining the 
polygon/raster intersection, and 2) the i/o and actual extraction of 
the data.  #1 could be made faster if you are always using the same 
polys and grids, since you could (in theory) determine the intersection 
only once...
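To make the "determine the intersection once" idea concrete: if the 
polygons are rasterized once onto the same grid as the NWP fields, every 
subsequent grid reduces to a grouped mean over precomputed zone labels. 
A minimal numpy sketch of that pattern (not starspan's actual 
implementation; the tiny arrays and the function name are hypothetical 
stand-ins):

```python
import numpy as np

# Zone raster: one integer label per cell, computed ONCE by rasterizing
# the subbasin polygons on the same grid as the NWP fields.
# Tiny 3x4 stand-in with two subbasins (labels 1 and 2) and nodata (0).
zones = np.array([[1, 1, 2, 2],
                  [1, 1, 2, 2],
                  [0, 1, 2, 0]])

def basin_means(values, zones, n_zones):
    """Mean of `values` within each zone label 1..n_zones."""
    mask = zones > 0
    sums = np.bincount(zones[mask], weights=values[mask],
                       minlength=n_zones + 1)
    counts = np.bincount(zones[mask], minlength=n_zones + 1)
    return sums[1:] / counts[1:]

# One time step of (say) precipitation on the same grid; the 9.9 cells
# fall outside any basin and are masked out by the zone labels.
precip = np.array([[1.0, 2.0, 3.0, 4.0],
                   [1.0, 2.0, 3.0, 4.0],
                   [9.9, 2.0, 3.0, 9.9]])

means = basin_means(precip, zones, n_zones=2)
# zone 1 cells: 1,2,1,2,2 -> mean 1.6 ; zone 2 cells: 3,4,3,4,3 -> 3.4
```

The point is that only the cheap bincount step runs per grid; the 
expensive polygon/raster intersection lives in the zone array, computed 
once up front.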

While we kick this around on our end, I'd recommend grabbing the latest 
and greatest version of starspan, and trying it out!

--j

Thomas Adams wrote:
> List:
> 
> My apologies for the background info leading up to my question…
> 
> I've seen mention of starspan previously and now I think it's time for 
> me to learn more. Let me pose a problem to you to see if starspan would 
> be of help. I am working on a modeling project with a couple of other 
> people that involves the following:
> 
> (1) downloading and decoding gridded fields of numerical weather 
> prediction (NWP) model output
> (2) ingesting the decoded data into GRASS to calculate basin average 
> precipitation & temperature (separate grids) for each subbasin
> (3) writing out all calculated basin average values for each grid to 
> separate files (one file contains one time step of data for all 
> subbasins) for both temperature & precipitation
> (4) for data management reasons, the files from (3) are written to a 
> PostgreSQL database
> (5) once all time steps of gridded precipitation and temperature field 
> data are written to the PostgreSQL database, another process collects 
> the data and generates individual ascii time series files for each 
> subbasin for both temperature & precipitation
> (6) once (5) is completed a hydrologic model runs using the temperature 
> & precipitation time series as input and hydrologic forecasts are 
> generated.
> 
> This is supposed to be a *real-time* process. The problem I am having is 
> a matter of scale. What I did not say is that there are 12 different 
> sets of NWP output covering a period of 168 hours at 6-hour time steps 
> for both temperature & precipitation. So, this means I must process 
> 12*2*(168/6) = 672 grids. Also, I need the mean areal values for 686 
> subbasins within the domain for each of the grids.
> 
> Steps (2-4) take about 20 seconds total for each of the grids… which is 
> ~7.5 hours total
> Step-5 also takes about 20 seconds for each time series file… which is 
> ~7.5 hours total
> 
> So, about 15 hours total. Now, I can cut this time in half by running 
> the processing of the temperature & precipitation grids and generating 
> their separate time series files in parallel, rather than sequentially. 
> So, I can get to about 7 hours fairly easily — what I am shooting for is 
> to get the processing time from 7 hours to about 3 hours (or less).
> 
> I need to more efficiently generate the many basin average time series 
> files from the numerous grids. Can starspan help by reducing the time 
> it takes to calculate the basin average values?
> 
> I would also appreciate any/all suggestions on how to efficiently go 
> from 'starspan generated basin average values' to my time series files. 
> Realize, of course, that the individual grids are only a slice in time, 
> so I have to track the grids and their resulting individual basin values 
> (in time) to generate the time series files.
> 
> To compound the problem, very soon, I need to add model grids from an 
> additional 21 models, bringing the total from 12 to 33!
> 
> Regards,
> Tom
> 
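Re: going from basin average values to the time series files -- since 
each extraction pass naturally yields (basin, time, value) rows, one 
grouping pass (or a single ORDER BY query per variable against 
PostgreSQL) gets you all 686 series at once, instead of one query per 
subbasin. A rough sketch of the pivot in Python, with all names and 
sample rows hypothetical:

```python
from collections import defaultdict

# Rows as the extraction step produces them: (basin_id, timestamp, value).
# In practice these would come from PostgreSQL via a single
# "ORDER BY basin_id, timestamp" query per variable, rather than
# one query per subbasin.
rows = [
    ("basin_A", "2008-03-06T00", 1.2),
    ("basin_B", "2008-03-06T00", 0.4),
    ("basin_A", "2008-03-06T06", 2.0),
    ("basin_B", "2008-03-06T06", 0.0),
]

series = defaultdict(list)   # basin_id -> [(timestamp, value), ...]
for basin, ts, val in rows:
    series[basin].append((ts, val))

# Each basin's ascii time series can now be written in one pass:
for basin, points in sorted(series.items()):
    body = "\n".join("%s %s" % (ts, val) for ts, val in points)
    # open(basin + ".txt", "w").write(body)  # file output elided
```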

-- 

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Cell: 415-794-5043
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307

