[GRASS-dev] r.series map names buffer overflow
    Glynn Clements 
    glynn at gclements.plus.com
       
    Fri Nov 14 11:44:36 EST 2008
    
    
  
Markus Neteler wrote:
> >> I am unsure how to track this down (maybe there is a fixed buffer in libgis?).
> >
> > What does "ulimit -n" say? That's the OS-imposed limit on the number
> > of open file descriptors per process.
> 
> Bingo:
> ulimit -n
> 1024
> 
> 
> > On my system, both the hard and soft limits are 1024. The soft limit
> > can be changed with e.g. "ulimit -n 1500", but only up to the hard
> > limit. The hard limit can only be changed by root.
> 
> I forgot about this limitation.
> This is somewhat dangerous, say, could it be trapped? If r.series
> gets more input files than ulimit -n (C equivalent) allows, could
> it spit out an error (the manual suggesting then to split into smaller
> jobs)?

It's possible to detect that this has occurred, but only in the lowest
levels of libgis, i.e. in G__open(). open() should fail with errno set
to EMFILE if it fails due to exceeding the per-process resource limit
(or ENFILE for the system-wide limit, but that's rather unlikely).
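
A minimal sketch of that detection, assuming a plain POSIX open() call.
The helper name open_checked() is hypothetical and is not the actual
G__open() code; it only shows how the two errno values can be told
apart from other failures:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libgis: open a file read-only and
 * report descriptor exhaustion separately from other failures.
 * Returns the descriptor, or -1 with errno left as open() set it. */
int open_checked(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        int err = errno;    /* save it: fprintf() may change errno */

        if (err == EMFILE)
            fprintf(stderr, "%s: per-process descriptor limit reached; "
                    "raise it with `ulimit -n' or use fewer maps\n", path);
        else if (err == ENFILE)
            fprintf(stderr, "%s: system-wide file table is full\n", path);
        else
            fprintf(stderr, "%s: %s\n", path, strerror(err));
        errno = err;        /* let the caller inspect the cause too */
    }
    return fd;
}
```

A caller would treat a -1 return as fatal for that map and could check
errno == EMFILE to decide whether to suggest splitting the job.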
It isn't feasible to accurately predict that it will occur before the
fact. Apart from the descriptors for the [f]cell files, which are held
open throughout the process, other descriptors will already be open on
entry (at least stdin, stdout and stderr will be open, and often a few
others inherited from the caller), and additional descriptors will be
opened temporarily throughout the life of the process (but it's hard
to know how many, e.g. some libc functions will read configuration or
data files upon first use).
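
To illustrate why the starting point is unknown, here is a rough sketch
that counts the descriptors already open in the current process by
probing each number with fcntl(). The helper count_open_fds() and its
max_fd cut-off are assumptions made for this example; the count is only
a snapshot and says nothing about descriptors opened later:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: count the descriptors in [0, max_fd) that are
 * open right now. fcntl(F_GETFD) fails with EBADF on a closed one. */
int count_open_fds(int max_fd)
{
    int fd, n = 0;

    assert(max_fd >= 0);
    for (fd = 0; fd < max_fd; fd++) {
        if (fcntl(fd, F_GETFD) != -1 || errno != EBADF)
            n++;
    }
    return n;
}
```

Calling this at the start of main() typically reports at least three
(stdin, stdout, stderr), plus whatever the caller left open.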
OTOH, it would be straightforward to print a warning if the number of
maps exceeds e.g. limit * 0.95:
	#ifndef __MINGW32__
	#include <sys/resource.h>
	struct rlimit rlim;
	/* nmaps is the number of input maps */
	if (getrlimit(RLIMIT_NOFILE, &rlim) < 0)
	    G_warning("unable to determine resource limit (shouldn't happen)");
	else if (nmaps > rlim.rlim_max * 0.95)
	    G_warning("may exceed hard limit on number of files; consult your sysadmin in event of errors");
	else if (nmaps > rlim.rlim_cur * 0.95)
	    /* ulimit is a Bourne-shell command; csh uses `limit' and `unlimit' */
	    G_warning("may exceed soft limit on number of files; use `ulimit -n' in event of errors");
	#endif /* __MINGW32__ */
BTW, now that the G__.fileinfo array is allocated dynamically, I have
been thinking about making libgis keep open the descriptor for the
null bitmap. Re-opening the file every few rows can have a significant
performance impact for modules which are I/O-bound.
However, this would mean that you need twice as many descriptors (or
will hit the limit with half the number of maps). AFAICT, this was
(part of) the original reason for not keeping the null bitmap open. 
But that was when Linux had a system-wide limit of (by default) 1024
open files, set at compile time. Nowadays, typical defaults are 1024
files per process and ~200k files system-wide, both of which can be
changed at run time (with ulimit -n and /proc/sys/fs/file-max
respectively).
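
The per-process soft limit can also be raised from within a program,
which is the C equivalent of "ulimit -n". A minimal sketch, assuming
POSIX getrlimit()/setrlimit(); the helper name raise_nofile_limit() is
hypothetical, and some systems (e.g. Mac OS X) may refuse a soft limit
above a kernel-defined maximum even when the hard limit is higher:

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_NOFILE up to the hard
 * limit. Returns 0 on success, -1 on failure with errno set by
 * getrlimit() or setrlimit(). */
int raise_nofile_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;
    assert(rl.rlim_cur <= rl.rlim_max);   /* soft never exceeds hard */
    if (rl.rlim_cur == rl.rlim_max)
        return 0;                         /* already at the ceiling */
    rl.rlim_cur = rl.rlim_max;            /* no privilege needed for this */
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

This only helps when the hard limit is above the soft limit; raising
the hard limit itself still requires root.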
Ultimately, if you want to be able to use r.series (or other modules
which process several maps concurrently) with large numbers of maps,
you (or your sysadmin) need to ensure that resource limits are set
accordingly.
-- 
Glynn Clements <glynn at gclements.plus.com>
    
    