[GRASS-dev] [GRASS-user] r.to.vect stats
Markus Metz
markus.metz.giswork at gmail.com
Thu May 4 23:43:33 PDT 2017
On Fri, May 5, 2017 at 5:07 AM, Vaclav Petras <wenzeslaus at gmail.com> wrote:
>
> On Wed, May 3, 2017 at 5:34 AM, Moritz Lennert <mlennert at club.worldonline.be> wrote:
> >
> > On 02/05/17 15:53, Vaclav Petras wrote:
> >> I'm using pipe_command(), which is just a convenience function that
> >> sets stdout=PIPE. Similarly, feed_command() just sets stdin=PIPE, which
> >> I'm not using because I'm feeding the stdout of the other process
> >> directly (stdin=first_process.stdout). What I don't understand,
> >> regardless of whether I use stdin=PIPE or stdin=first_process.stdout
> >> for the second process, is what should come next.
> >
> > Do you really need the in_process.communicate()? Here's what I used
> > in a local script, and it works without communicate(). Then again,
> > I don't think the data flowing through this pipe ever exceeded
> > available memory.
> >
> > pin = gscript.pipe_command('v.db.select',
> >                            map=firms_map,
> >                            ...
> > total_turnover_map = 'turnover_%s' % nace2
> > p = gscript.start_command('r.in.xyz',
> >                           input_='-',
> >                           stdin=pin.stdout,
> >                           ...
> > if p.wait() != 0:
> >     gscript.fatal("Error in r.in.xyz with nace %s" % nace2)
>
> The Popen.wait() documentation [1] says: "Warning: This will deadlock
> when using stdout=PIPE and/or stderr=PIPE and the child process generates
> enough output to a pipe such that it blocks waiting for the OS pipe buffer
> to accept more data. Use communicate() to avoid that."
>
> And since I'm using stdout=PIPE (pipe_command()), I use communicate().
> What troubles me is that the Popen.communicate(input=None) documentation [2]
> says: "Note: The data read is buffered in memory, so do not use this method
> if the data size is large or unlimited."
>
> It says "data read", so it probably refers to the case where communicate()
> returns data rather than None(s), i.e. stdout=PIPE and communicate() on
> the same process. That does not apply to this case, so I don't have
> to be troubled. As for wait(), I think it may work (and works most of
> the time); it is just not guaranteed to work with large data, and it depends
> on how smart the OS is.
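
To make the pattern under discussion concrete, here is a minimal sketch in plain subprocess terms, following the "replacing shell pipeline" idiom from the Python documentation. PRODUCER and CONSUMER are hypothetical stand-in commands (small Python one-liners), not GRASS modules, and run_pipeline() is a name made up for this illustration. The parent hands the read end of the pipe to the second process and closes its own copy, so the data never passes through the parent. Waiting on both processes then only blocks as long as the consumer keeps reading its stdin.

```python
import subprocess
import sys

# Stand-in commands (assumptions for illustration, not GRASS modules):
# the producer prints more data than fits into a typical OS pipe buffer,
# the consumer counts the lines it receives on stdin.
PRODUCER = [sys.executable, "-c",
            "for i in range(200000): print('%d|%d|1' % (i, i))"]
CONSUMER = [sys.executable, "-c",
            "import sys; n = sum(1 for _ in sys.stdin); "
            "sys.exit(0 if n == 200000 else 1)"]


def run_pipeline(producer_cmd, consumer_cmd):
    """Connect producer stdout to consumer stdin; return both exit codes."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout)
    # Close the parent's copy of the read end, so that the producer is
    # notified (broken pipe) if the consumer exits early.
    producer.stdout.close()
    # The consumer drains the pipe; the parent never reads the data.
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc


if __name__ == "__main__":
    print(run_pipeline(PRODUCER, CONSUMER))
```

Checking both return codes is what lets the script notice a failure in the first process as well as in the second.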

Maybe it is safer to store the output of v.out.ascii in a temporary file
and then use that file as input for r.in.xyz. You can then not only check
whether v.out.ascii finished successfully, but also use the percent option of
r.in.xyz to reduce memory consumption for large computational regions. The
percent option does not work when piping input to r.in.xyz.
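
A rough sketch of that two-step approach, again in plain subprocess terms so it runs without a GRASS session: EXPORT and IMPORT are hypothetical stand-in commands and export_then_import() is a made-up helper name. In a real script the first step would be v.out.ascii writing to its output option, and the second r.in.xyz reading the file through its input option, with percent set as needed.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in commands (assumptions for illustration, not GRASS modules):
# EXPORT writes point records to stdout, IMPORT reads the file whose
# path is passed as its last argument.
EXPORT = [sys.executable, "-c",
          "for i in range(1000): print('%d|%d|1' % (i, i))"]
IMPORT = [sys.executable, "-c",
          "import sys; n = sum(1 for _ in open(sys.argv[1])); "
          "sys.exit(0 if n == 1000 else 1)"]


def export_then_import(export_cmd, import_cmd):
    """Run the export into a temporary file, check it, then import it."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Step 1: write the export to the temporary file.
        with os.fdopen(fd, "wb") as out:
            export_rc = subprocess.call(export_cmd, stdout=out)
        # The export result is known before the import even starts.
        if export_rc != 0:
            raise RuntimeError("export failed with code %d" % export_rc)
        # Step 2: the import reads a regular file, not a pipe.
        import_rc = subprocess.call(import_cmd + [path])
        if import_rc != 0:
            raise RuntimeError("import failed with code %d" % import_rc)
        return export_rc, import_rc
    finally:
        # Remove the temporary file whether or not the steps succeeded.
        os.remove(path)


if __name__ == "__main__":
    print(export_then_import(EXPORT, IMPORT))
```

The cost is the extra disk space for the intermediate file; the gain is that each step can be checked on its own.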
Markus M
>
> Vaclav
>
> [1] https://docs.python.org/2/library/subprocess.html#subprocess.Popen.wait
> [2] https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate
>