[GRASS-dev] Benchmark the overhead of calling GRASS modules

Wed May 1 15:03:30 PDT 2019

Hi Panos,

IMHO the overhead of calling GRASS modules is insignificant because it is
in the range of milliseconds. I am much more concerned whether executing a
GRASS module takes days or hours or minutes.

Also note that the base of GRASS are C modules using the GRASS C library.
GRASS python modules usually call GRASS C modules (or other GRASS python
modules calling GRASS C modules). The first thing a GRASS Python module
does is calling the GRASS C module g.parser, after that it calls (in the
end) some other GRASS C modules. That means it is not straightforward to
test the overhead of calling GRASS Python modules vs calling GRASS C
modules because it is really GRASS Python + C modules vs GRASS C modules
only. And the overhead is insignificant (not measurable) compared to actual
execution time for larger datasets/regions.

Markus M

On Mon, Apr 29, 2019 at 8:49 AM Panagiotis Mavrogiorgos <pmav99 at gmail.com>
wrote:

> Hello all
>
> You might find it easier to read the following text i
> <https://gist.github.com/pmav99/8f4546fe15940b3cb7db0cfb65e18d33>n a gist
> <https://gist.github.com/pmav99/8f4546fe15940b3cb7db0cfb65e18d33>
>
> ## Introduction
>
> I was trying to write a decorator/contextmanager that would temporary
> change the
> computational region, but while using it I noticed that there was some
> overhead the root
> of which seemed to be the usage of the GRASS modules. So in order to
> quantify this
> I wrote a small benchmark that tries to measure the overhead of calling a
> GRASS Module.
> This is what I found.
>
> ## tl;dr
>
> Calling a GRASS module incurs a constant but measurable overhead which, in
> certain
> cases, e.g. when writing a module that uses a lot of the other modules,
> can quickly
> add up to a significant quantity.
>
> ## Disclaimer
>
> If you try to run the benchmark on your own PC, the actual timings you
> will get will
> probably be different. The differences might be caused by having:
>
> - a stronger/weaker CPU
> - faster/slower hard disk.
> - using different Python version
> - using different compilation flags
>
> Still, I think that some of the findings are reproducible.
>
> For reference, I used:
>
> - OS: Linux 5.0.9
> - CPU: Intel i7 6700HQ
> - Disk type: SSD
> - Python: 3.7
> - `CFLAGS='-O2 -fPIC -march=native -std=gnu99'`
>
> ## Demonstration
>
> The easiest way to demonstrate the performance difference between using a
> GRASS module
> vs using the GRASS API is to run the following snippets.
>
> Both of them do exactly the same thing, i.e. retrieve the current region
> settings 10
> times in a row.  The performance difference is huge though. On my laptop,
> the first one
> needs 0.36 seconds while the second one needs just 0.00038 seconds. That's
> almost
> a 1000x difference...
>
> ``` python
> import time
> import grass.script as gscript
>
> start = time.time()
> for i in range(10):
>     region = gscript.parse_command("g.region", flags="g")
> end = time.time()
> total = end - start
>
> print("Total time: %s" % total)
> ```
>
> vs
>
> ``` python
> import time
> from grass.pygrass.gis.region import Region
>
> start = time.time()
> for i in range(10):
>     region = Region()
> end = time.time()
> total = end - start
>
> print("Total time: %s" % total)
> ```
>
> ## How much is the overhead exactly?
>
> In order to measure the actual overhead of calling a GRASS module, I
> created two new
> GRASS modules that all they do is parse the command line arguments and
> measured how much
> time is needed for their execution. The first module is [`r.simple`]() and
> is
> implemented in Python while the other one is [`r.simple.c`]() and is
> implemented in C.
> The timings are in msec and the benchmark was executed using Python 3.7
>
> | call method                    | r.simple | r.simple.c |
> |--------------------------------|:--------:|:----------:|
> | pygrass.module.Module          |   85.9   |    66.5    |
> | pygrass.module.Module.shortcut |   85.5   |    66.9    |
> | grass.script.run_command       |   41.3   |    30.5    |
> | subprocess.check_call          |   41.8   |    30.3    |
>
> As we can see, `gsrcipt.run_command` and `subprocess` give more or less a
> identical
> results, which is to be expected since `run_command` + friends are just a
> thin wrapper
> around `subprocess`.  Similarly `shortcuts` has the same overhead as
> "vanila"
> `pygrass.Module`.  Nevertheless, it is obvious that `pygrass` is roughly
> 2x times slower
> than `grass.script` (but more about that later).
>
> As far as C vs Python goes, on my computer modules implemented in C seem
> to be 25% faster than their
> Python counterparts.  Nevertheless, a 40 msec startup time doesn't seem
> extraordinary
> for a Python script, while 30 msec feels rather large for a CLI
> application implemented
> in C.
>
> ## Where is all that time being spent?
>
> ### C Modules
>
> Unfortunately, I am not familiar enough with the GRASS internals to easily
> check what is
> going on and I didn't have the time to try to profile the code. I suspect
> that
> `G_gisinit` or something similar is causing the overhead, but someone more
> familiar with
> the C API should be able to enlighten us.
>
> ### Python Modules
>
> In order to gain a better understanding of the overhead we have when
> calling python
> modules, we also need to measure the following quantities:
>
> 1. The time that python needs to spawn a new process
> 2. The startup time for the python interpreter
> 3. The time that python needs to `import grass`
>
> These are the results:
>
> |                         | msec |
> |-------------------------|:----:|
> | subprocess spawn        |  1.2 |
> | python 2 startup        |  9.0 |
> | python 2 + import grass | 24.5 |
> | python 3 startup        | 18.2 |
> | python 3 + import grass | 39.3 |
>
> As we can see:
>
> - the overhead of spawning a new process from within python is more or
> less negligible
>   (at least compared to the other quantities).
> - The overhead of spawning a python 3 interpreter is 2x bigger than Python
> 2; i.e.  the
>   transition to Python 3 will have some performance impact, no matter what.
> - The overhead of spawning a python 3 interpreter accounts for roughly 50%
> of the total
>   overhead (18 msec out of 41 msec).
> - The other 50% of the overhead is pretty much caused just by importing
> the `grass`
>   library.
>
> ## Why is Pygrass 2x times slower?
>
> I haven't carefully looked into this (i.e. I haven't profiled the code),
> but it seems
> that the culprit is [this
> line](
> https://github.com/GRASS-GIS/grass-ci/blob/fe814a4fb73937ce96a36c11b6876db42acf28fa/lib/python/pygrass/modules/interface/module.py#L528
> ).
> In other words, `pygrass` calls a module twice, once to get the
> interface's description
> and once to actually run the module. That's why the overhead is double
> both for
> C modules and Python ones.
>
> ## Is this truly a problem?
>
> Well, it depends :)
>
> The most important factor is probably the size of the computational
> region.  If you are
> dealing with really large regions, then the overhead is probably
> miniscule. If you are
> dealing with much smaller ones though, then it can be significant.
>
> That being said, there are at least two cases where I think that this
> overhead is
> important:
>
> - interactive sessions
> - running tests
>
> Just to give an example, `i.pansharpen` calls ~50 other GRASS modules. On
> my laptop, the
> overhead for these calls is almost 2 seconds. Is this a lot? Well, if you
> are
> pansharpening Landsat Tiles, it probably is not that much, but you will
> have this
> overhead even if you are pansharpening a 4 pixel map (e.g. when running
> tests).
>
> I am pretty sure that there are numerous other modules that are just like
> that, too.
>
> ## What is to be done?
>
> I think that there are two different areas where work can be done:
>
> 1. Reduce the actual overhead; i.e. make calling GRASS modules faster.
> 2. Convert calls to GRASS modules with API calls which, as shown earlier,
> are orders of
>    magnitude faster.
>
> ### Reduce the overhead
>
> The overhead of Python and C modules has different causes.
>
> As I said, I haven't looked into what causes C modules to take so long so
> i can't make
> any suggestions. This should be further looked into though. The reason is
> that modules
> like e.g. `g.region` are being used practically everywhere, so speeding
> these up should
> give a measureable performance improvement.
>
> As far as Python modules go, unfortunately, the overhead seems to be
> relatively
> inelastic.  Speeding up the startup time of the Python interpreter is not
> something that
> GRASS can do or rely upon. So, the only thing that can be actually be done
> is to try to
> reduce the time needed for importing the `grass` library. That being said,
> this should
> probably not skim more than a few ms at best, but at least the gains
> should be there for
> all python modules.
>
> Luckily there is at least one relatively low hanging fruit. Speeding
> pygrass should not
> be that difficult. I haven't looked into this but pre-generating the
> modules' XML
> descriptions seems feasible and it should remove the need to call each
> module twice,
> thus making pygrass performance comparable to `grass.script.run_command`
>
> ### Convert module calls to API calls
>
> Converting the module calls to API calls is what can potentially give the
> bigger
> benefits, but at the same time is what needs the most work.
>
> There are some low hanging fruits here too. E.g. functions like
> `use_temp_region()` and
> `raster_info()` can probably be refactored to use the pygrass API, while
> calls to e.g.
> `g.region` can probably be replaced with `pygrass.gis.Region` objects etc.
>
> Nevertheless, this does not really touch the root of the problem which
> IMHV is the tight
> coupling between the GRASS modules functionality and the CLI. In layman
> terms, at the
> moment, if you want to use module A from module B you are forced to spawn
> a new process
> and suffer the overhead that this entails.
>
> TBH, I am not sure if this a problem that can even be tackled at this
> stage, but if each
> module had one or more functions that could be imported/called by other
> modules,
> everything would be much easier and performance would be significantly
> better.
>
> ## How to run the benchmark?
>
> If you want to run this benchmark on your own you need to:
>
> 1. `git remote add panos https://github.com/pmav99/grass-ci.git`
> <https://github.com/pmav99/grass-ci.git>
> 1. `git fetch --all`
> 1. `git checkout overhead`
> 1. `cd scripts/r.simple && make && cd ../../`
> 1. `cd raster/r.simple.c && make && cd ../../`
> 1. Start a grass session
> 1. `python benchmark.py`
>
> I haven't run the benchmark under Python 2, but even if there are
> incompatibilities,
> they should be trivial to fix.
>
> I will be happy to hear any remarks :)
>
> with kind regards,
> Panos
> _______________________________________________
> grass-dev mailing list
> grass-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/grass-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/grass-dev/attachments/20190502/2c24ae05/attachment-0001.html>