[GRASS-dev] r.series map names buffer overflow

Glynn Clements glynn at gclements.plus.com
Fri Nov 14 18:54:30 EST 2008


Markus Neteler wrote:

> >> >> I am unsure how to track this down (maybe there is a fixed buffer in libgis?).
> >> >
> >> > What does "ulimit -n" say? That's the OS-imposed limit on the number
> >> > of open file descriptors per process.
> >>
> >> Bingo:
> >> ulimit -n
> >> 1024
> >>
> >>
> >> > On my system, both the hard and soft limits are 1024. The soft limit
> >> > can be changed with e.g. "ulimit -n 1500", but only up to the hard
> >> > limit. The hard limit can only be changed by root.
> 
> ulimit -n 1500
> bash: ulimit: open files: cannot modify limit: Operation not permitted
> 
> On my box I can only *reduce* the limit as a normal user (1023 works).

Yep. You can't increase the soft limit above the hard limit, and you
can't increase the hard limit (-H flag) unless you're root (or have
the CAP_SYS_RESOURCE capability on systems with capabilities).

The soft limit protects against a runaway process; the hard limit
protects against a user willfully hogging resources.
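
In C terms, all an unprivileged process can do for itself is raise its
own soft limit up to the hard limit via setrlimit(). An untested sketch
(the function name is made up):

	#include <sys/resource.h>

	/* Sketch: raise the soft RLIMIT_NOFILE as far as the hard limit
	   allows. Raising the hard limit itself needs root (or
	   CAP_SYS_RESOURCE). Returns 0 on success, -1 on failure. */
	int raise_nofile_limit(void)
	{
	    struct rlimit rl;

	    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
	        return -1;

	    /* the soft limit may be raised only up to the hard limit */
	    rl.rlim_cur = rl.rlim_max;

	    return setrlimit(RLIMIT_NOFILE, &rl);
	}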

> >> I forgot about this limitation.
> >> This is somewhat dangerous - could it be trapped? If r.series
> >> gets more input files than ulimit -n (C equivalent) allows, could
> >> it spit out an error (the manual then suggesting to split into smaller
> >> jobs)?
> >
> > It's possible to detect that this has occurred, but only in the lowest
> > levels of libgis, i.e. in G__open(). open() should fail with errno set
> > to EMFILE if the per-process resource limit is exceeded (or ENFILE for
> > the system-wide limit, but that's rather unlikely).
> 
> Would it be trivial to just count the number of input maps given to the parser?
> If the user gives more than rlim.rlim_max * 0.95 input files, then bail out.

A fatal error is possibly overkill. That will happen anyhow if the
process actually exceeds the limit.

The only reason I can see for making it a fatal error is if the peak
usage occurs at the end. In that situation, it might perform all of
the work and then fail when it starts closing the maps. But that seems
unlikely, particularly if the loop which closes the input maps is
moved to before the output maps are closed and their history is written.
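
If a check is added at all, a warning after parsing would be enough;
the fatal error from open() will still happen if the limit really is
exceeded. A rough, untested sketch (the function name and the 95%
margin are placeholders; note it is the soft limit, rlim_cur, that the
kernel actually enforces):

	#include <sys/resource.h>
	#include <grass/gis.h>
	#include <grass/glocale.h>

	/* Sketch: warn if the number of requested input maps is close to
	   the per-process descriptor limit. The ~5% margin leaves room for
	   stdin/stdout/stderr and other files already open. */
	static void check_nofile_limit(int num_inputs)
	{
	    struct rlimit rl;

	    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
	        return;

	    if ((rlim_t) num_inputs > rl.rlim_cur - rl.rlim_cur / 20)
	        G_warning(_("%d input maps may exceed the open-file limit (%lu); "
	                    "consider splitting the job or raising 'ulimit -n'"),
	                  num_inputs, (unsigned long) rl.rlim_cur);
	}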

> > BTW, now that the G__.fileinfo array is allocated dynamically, I have
> > been thinking about making libgis keep open the descriptor for the
> > null bitmap. Re-opening the file every few rows can have a significant
> > performance impact for modules which are I/O-bound.
> >
> > However, this would mean that you need twice as many descriptors (or
> > will hit the limit with half the number of maps). AFAICT, this was
> > (part of) the original reason for not keeping the null bitmap open.
> 
> This would definitely be a showstopper for me, since I regularly work
> with (multi-year) time series.

Well, it isn't a problem if you have root access (or an accommodating
sysadmin).

If a user wants to tie up resources, they can do a pretty good job
with the defaults:

	$ ulimit -a
	...
	open files                      (-n) 1024
	...
	max user processes              (-u) 15863

1024 * 15863 = 16243712, which will easily exceed the system-wide
limit.

[The biggest problem with Unix resource limits is the inability to set
cumulative limits per user. Apart from -u, the limits are for each
individual process.]

-- 
Glynn Clements <glynn at gclements.plus.com>

