[GRASS5] r.series, max files open

Glynn Clements glynn.clements at virgin.net
Tue May 6 03:32:21 EDT 2003


H Bowman wrote:

> I am trying to get r.series to produce time series stats for 365 raster
> files.
> 
> It dies after opening about 254 files, though. (Sorry, I don't have the
> exact error message on hand.)
> 
> I assume it is running up against a max files open limit,

Yes. Specifically, libgis has a fixed limit of 256 open raster maps,
set at the top of src/libes/gis/G.h:

	#define MAXFILES    256

> so I haven't looked too closely; it could be something else (memory, for
> example, or the fact that inputfiles=raster1,raster2,...,raster365 makes
> the command line pretty long),

That could also be an issue, but it isn't the problem here (if the
command line were too long, you would get an error from the shell
before r.series was even run).

> or a dumb mistake on my part.
> 
> I'm running Linux 2.4.19 with r.series from CVS.
> 
> 
> If it is a max file limit thing, r.series should be able to work around
> that.. open 200, dump them to memory, fclose all, read another 200, etc.
> until all are loaded and then do the math.

Store the entire series in memory?

For large files, that would just replace an out-of-descriptors error
with an out-of-memory error.
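
For example (sizes chosen purely for illustration): a single 1500x1500
CELL map is 1500 * 1500 * 4 bytes, roughly 9Mb of cell data, so holding
365 of them at once would need on the order of 3Gb of RAM.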

> I would think a time series program should be able to handle at
> least 8760 (number of hours/year) records? 365 at minimum anyway.

Unless these are *really* small files, reading 8760 of them into
memory is probably out of the question. OTOH, having 8760 files open
simultaneously is likely to be equally problematic.

We could just increase the MAXFILES value. However, each slot uses 552
bytes on x86, so memory consumption could be an issue (bearing in mind
that it affects every process which uses libgis). Also, there's no
point increasing it beyond the OS's per-process limit on open file
descriptors (so 8760 files may not be possible, even if you can afford
an extra 4.6Mb per process).
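
That descriptor limit can at least be checked at run time. A minimal
sketch (plain POSIX, not GRASS code):

	#include <stdio.h>
	#include <sys/resource.h>

	int main(void)
	{
	    struct rlimit rl;

	    /* RLIMIT_NOFILE: max number of open descriptors per process */
	    if (getrlimit(RLIMIT_NOFILE, &rl) == 0)
	        printf("fd limit: soft %lu, hard %lu\n",
	               (unsigned long) rl.rlim_cur,
	               (unsigned long) rl.rlim_max);

	    return 0;
	}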

The 552-byte figure could be reduced a bit by more sensible memory
management. E.g. each slot includes a "struct Reclass", which
statically allocates 100 bytes for the name and mapset of the base
map, whereas two pointers would only use 8 bytes.
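
To illustrate the idea (the field names and sizes here are only an
approximation of the real struct Reclass in gis.h):

	/* current style: ~100 bytes per slot, whether used or not */
	struct Reclass_fixed {
	    char name[50];      /* name of the base map   */
	    char mapset[50];    /* mapset of the base map */
	    /* ... reclass table, min/max, etc. ... */
	};

	/* pointer style: 8 bytes per slot on x86; the strings are
	   allocated only if the map really is a reclass */
	struct Reclass_ptr {
	    char *name;
	    char *mapset;
	    /* ... */
	};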

But primarily, we would want to allocate the array of slots
dynamically, so that only processes which actually used thousands of
slots would allocate the memory for them.
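
Something along these lines (the identifiers are invented for the sake
of illustration, not the actual libgis names):

	#include <stdio.h>
	#include <stdlib.h>

	struct fileinfo {
	    int open_mode;
	    /* ... per-map state: cell header, reclass info, buffers ... */
	};

	static struct fileinfo *fileinfo;   /* grows as maps are opened */
	static int nfiles;

	/* return the index of a freshly allocated slot */
	static int new_fileinfo(void)
	{
	    struct fileinfo *p =
	        realloc(fileinfo, (nfiles + 1) * sizeof(struct fileinfo));

	    if (!p) {
	        fprintf(stderr, "out of memory extending raster file table\n");
	        exit(1);
	    }
	    fileinfo = p;
	    return nfiles++;
	}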

The main problem there is that the code in question is critical to
GRASS. Any errors could result in large chunks of GRASS being
unusable. The other problem is that most of the code which uses that
structure is an illegible mess.

-- 
Glynn Clements <glynn.clements at virgin.net>



