[GRASS-dev] split GRASS (lib / cli / modules / wx / qt / web / etc.)

Mon Mar 21 02:48:08 PDT 2016

On 18/03/16 18:38, Pietro wrote:
> On Fri, Mar 18, 2016 at 2:16 PM, Moritz Lennert
> <mlennert at club.worldonline.be> wrote:
>> On 18/03/16 12:58, Pietro wrote:
>> In your opinion is this true at the module level, or mostly for the wxGUI ?
>
> No, in my opinion things are also quite mixed in the C/Python modules.
>
>
>>> Let's start with a simple example: most of the GRASS modules nicely
>>> mix logic and cli; several of them have a single main function with
>>> everything inside. I think it could be useful to have a clearer
>>> distinction between logic/algorithms and their public interface
>>> (cli/gui).
>>
>>
>> Why ? I really like the fact that each module works kind of like a
>> high-level function with a defined public interface.
>
> Yes, and I like it too! Indeed I don't want to change this. What I would
> like to change is to better distinguish these high-level
> functionalities from the low-level parts. So for instance, just opening
> a GRASS core module at random: r.resamp.stats
>
> https://trac.osgeo.org/grass/browser/grass/trunk/raster/r.resamp.stats/main.c
>
> defined here are:
> - static const struct menu, which could probably also be useful for
> other modules, so it should go to the grass-lib
> - static char *build_method_list(void), which could be generalized to
> be used by other modules as well and should go to the grass-lib
> - static int find_method(const char *name), which could also be
> generalized to be used by other modules and should go to the grass-lib
> - static void resamp_unweighted(void), this function could also be
> changed to be more general and moved to the grass-lib too
> - static void resamp_weighted(void), same as before.
>
> For each of the above functions we can build tests to improve
> reliability, detect performance regressions and so on.
>
> So far you can access these functions only from the CLI interface.
> If we clearly separate these two levels, then we can access them using
> the CLI, but also using C/Python/etc. So for instance, if I need to
> select an option from a menu list in a new module, I have to reinvent
> the wheel and write my own buggy code, and as GRASS developers we end
> up having duplicate buggy code in each module.
>
> The main function could stay or be rewritten in Python; this is not
> really relevant because it just defines the CLI interface, calls the
> functions, and finally adds the history metadata and sets the color
> table.
>
>
>
>>> If we clearly split these two things, the GRASS modules
>>> become just an interface to some functions inside the GRASS libraries.
>>
>>
>> Many modules (if not most) are already that: a combination of GRASS library
>> function calls in order to achieve the specific goal the module is set out
>> for.
>
> yes they are calling GRASS library functions, but they are also adding
> functionalities that (imho) should be included in the GRASS library,
> because they could be useful not only for these specific modules but
> also for others.

The above points are a call for refactoring of the code, not necessarily 
reorganizing it into different packages in different source trees.
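
To make concrete what such a refactoring looks like, here is a minimal 
Python sketch. It is not the actual r.resamp.stats code: all names 
(METHODS, build_method_list, find_method, aggregate, main) are 
hypothetical and only illustrate the idea of a reusable "library" part 
with a thin command-line part on top of it.

```python
import argparse
import statistics

# "Library" part: reusable and testable without any command-line interface.
# The table and helper names below are hypothetical, for illustration only.
METHODS = {
    "average": statistics.mean,
    "median": statistics.median,
    "minimum": min,
    "maximum": max,
}


def build_method_list(methods=METHODS):
    """Return the available method names as a comma-separated string."""
    return ",".join(methods)


def find_method(name, methods=METHODS):
    """Look up an aggregation function by name, or raise ValueError."""
    try:
        return methods[name]
    except KeyError:
        raise ValueError(
            "unknown method %r, expected one of: %s"
            % (name, build_method_list(methods))
        )


def aggregate(values, method="average"):
    """Apply the named method to a sequence of numbers."""
    return find_method(method)(values)


# "Module" part: only defines the interface and delegates to the library.
def main(argv=None):
    parser = argparse.ArgumentParser(description="Aggregate numbers")
    parser.add_argument("--method", default="average", choices=list(METHODS))
    parser.add_argument("values", nargs="+", type=float)
    args = parser.parse_args(argv)
    print(aggregate(args.values, args.method))
```

With this split, the lookup and the aggregation can be tested and reused 
directly, e.g. aggregate([1, 2, 3, 10], "median"), while main() stays a 
thin wrapper. Nothing here requires moving the code to a different 
source tree.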

IMHO, this will always be an issue because of the structure of GRASS: 
Someone develops a new module. Code is specific to this module. Then 
someone else develops a second module and reproduces part of the code of 
the former, because it does what they want and so they just copy it.
At one point several modules share the same code and it becomes clear 
that there is a need for this code at library level.

Unless we work with a much more centralized development system, where a 
limited number of developers review each proposed module and then check 
whether parts of the code should go into the library instead of the 
module, I don't really see a different way of doing things than the way 
that has grown organically throughout the development history of GRASS...

We could decide to organize concerted code review moments aiming at 
identifying relevant parts of code that should go into libraries, but I 
have the feeling that current ad-hoc management is more efficient: 
whenever the need is felt, we do it.

BTW, I don't see how separating the source trees solves this issue. Many 
people will still continue to code things in modules first and only 
after the same code is used in several modules will it become apparent 
that it should go into the libs.

> mmh, ok, so let's add one more layer: grass-lib, grass-py, and then the others..
>
> So grass-lib will contain only C (C++?) code and will not be available
> on PyPI.
> grass-py adds the Python wrapper to grass-lib, adds the API, and goes to PyPI.
>
> This is the same approach as GDAL[0], PROJ4[1], mapnik[2], which are all
> available as Python packages.
>
> I do think that adding GRASS to PyPI can only open new perspectives and
> use cases, reaching a broader group of users and developers.

I have the feeling that the question about possibly extracting the 
Python libs into an installable PyPI package is a different issue. But 
then again, what use would these libraries be without the C libraries 
and actually even without the modules ? Yes, Pygrass allows you to code 
directly with low-level routines, but for me one of the big strengths of 
the GRASS modular structure and our Python APIs is that they allow one to 
work with the modules as "functions", without forcing anyone to use any 
low-level access.

>
>
>>> - grass-modules: provides all the GRASS core modules (this could be
>>> also a pure python interface calling functions in the C/Python
>>> libraries), and could be split in other sub categories (e.g. imagery,
>>> temporal, terrain, etc).
>>
>>
>> What is the difference between these modules and the existing ones ? Except
>> for your idea to make all modules Python modules.
>
> Because things are complicated and the current status quo is not
> flexible and/or very limited.
> Basically you can create python packages that act as both library and
> grass-addons.

That is again another issue: how to handle addons that consist of more 
than just one single file script ? I'm not sure that I would agree that 
every time someone codes a more complex addon, the lib part should 
immediately go into the core GRASS libs. So, I'd rather see an easier 
way to handle such libs in addons. Currently, it is not easy for people 
(see [1] for a recent example), but even though I'm talking without 
enough knowledge, here, I don't think it should be too difficult for 
addon modules to contain files that get installed in .grass7/addons/lib 
instead of the current mix of different solutions.

> So for instance the r.green modules require scipy and numexpr to run,
> and the only way available in 2016 is that users have to take care of
> installing the missing libraries themselves. Even worse, if some of our
> modules depend on other addons, there is no way to handle this.

I agree that dependency management between addons would be nice, but I 
don't see it as that much of an issue. You can always include in your 
module a check for the existence of another and stop with a fatal error 
encouraging the user to install that module.
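
A minimal sketch of such a check, using only the Python standard 
library. It assumes that the required addon is reachable as an 
executable on the PATH (as it would be inside a GRASS session) and that 
Python libraries are given by their top-level package name; the function 
names are hypothetical. Inside a real GRASS script the error would 
normally be reported through grass.script.fatal() rather than sys.exit().

```python
import importlib.util
import shutil
import sys


def missing_dependencies(programs=(), python_packages=()):
    """Return the names of required programs and packages that are not found."""
    # Programs (e.g. other addon modules) are looked up on the PATH.
    missing = [p for p in programs if shutil.which(p) is None]
    # Python libraries are looked up by top-level package name, without importing.
    missing += [
        p for p in python_packages if importlib.util.find_spec(p) is None
    ]
    return missing


def require(programs=(), python_packages=()):
    """Stop with an error message if any dependency is missing."""
    missing = missing_dependencies(programs, python_packages)
    if missing:
        # Stand-in for the fatal error call of the GRASS scripting library.
        sys.exit("ERROR: please install first: " + ", ".join(missing))
```

A module depending on scipy and numexpr could then start with 
require(python_packages=["scipy", "numexpr"]).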

AFAIU, there is also the option to use toolboxes, but personally, I 
haven't looked at this in detail, yet.

> Another thing that is not working with the current setup is that
> there is no way to say for which version of GRASS the addon is
> available.

> So for instance r.green was developed and tested with grass7 stable,
> Sören has improved pygrass vector API in trunk (thank you Sören), but
> now in the grass-addons repository I have to choose if I want to
> support grass-stable or grass-trunk, I have to specify this somewhere
> in the manual and hope users read it. Instead I could create a python
> package and installing r.green (v0.3) will use grass-lib(v7.0.x) or
> installing r.green(v0.4) will use grass-lib(v7.1.x).

Another option might be to distinguish in the addons repository between 
grass_stable and grass_trunk, possibly with some easy option to create a 
sort of "symbolic" link between the two for modules that run with both.
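
Independently of the repository layout, a module could also declare the 
GRASS versions it was tested with and check this at startup. A minimal 
sketch, assuming the running version is available as a string such as 
"7.0.3"; the function names and the version range are hypothetical.

```python
def parse_version(text):
    """Turn '7.0.3' or '7.1.svn' into a tuple of the leading integers."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            # Stop at the first non-numeric component (e.g. 'svn').
            break
        parts.append(int(piece))
    return tuple(parts)


def is_supported(running, minimum, below):
    """True if minimum <= running < below, comparing numeric components."""
    return parse_version(minimum) <= parse_version(running) < parse_version(below)
```

A module tested only with the 7.0 series would call 
is_supported(version, "7.0", "7.1") and stop with a fatal error 
otherwise.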

> Moreover I could
> also have the module documentation using sphinx, instead of writing
> html code.

That's a totally different issue again.

> Perhaps in the future I would like to add a dedicated GUI to this set
> of modules or a web interface and then I will have more dependencies,
> making it almost impossible for an average user to use these new
> features.

I think that if you go that far away from the KISS principle in GRASS 
module elaboration, then you are probably better off packaging the whole 
thing...

> So basically we can remove g.extension and rely on pip or we have to
> reinvent the wheel to get these functionalities in g.extension.

I'm not familiar enough with pip to judge what this would entail for GRASS.

> Sorry if I was not clear; hopefully now it is a bit clearer what kind
> of advantages we (as developers) could have.

Well, I mainly find that you mix many different issues, and I'm not 
convinced that the proposed solution of breaking up the source tree 
really solves all of these.

Maybe it might be better to break up your suggestions into separate 
parts to discuss them individually. If at the end we see that they all 
point to the same solution, your argument will be even more convincing :-)

Just my 2¢,

Moritz

[1] https://lists.osgeo.org/pipermail/grass-dev/2016-March/079481.html