[Live-demo] OSGeoLive 9.5 status: pre-RC - Jupyter Notebooks

Angelos Tzotsos gcpp.kalxas at gmail.com
Wed Mar 9 05:33:50 PST 2016


Thank you Massimo for your detailed reply.

Some comments inline:

On 03/09/2016 09:50 AM, massimo di stefano wrote:
> Hi,
>
> I’ll try to address some of the issues raised in the last couple of mails:
>
> 1)  concern about adding extra materials to the live
> 2) Include or drop jupyter from 9.5 release
> 3) contributing new notebooks to the live
> 4) which notebook to include on Osgeo-Live 9.5
>
>
> 1) About the concern in adding extra materials to the live:
>
> To address:
> Cameron:
>>>> I am concerned about how much new material we are attempting to add related to Jupyter notebooks, all at the last moment.
> Angelos:
>> Furthermore, as part of the Jupyter setup, a big number [2] [3] of Python projects/libraries were also added to OSGeoLive, as soft dependencies to support Notebooks, without official review or decision by the community,
>
> Of the dependencies listed by Angelos and linked here:
>
>> [2] https://github.com/OSGeo/OSGeoLive/blob/master/bin/install_jupyter.sh#L30
>> [3] https://github.com/OSGeo/OSGeoLive/blob/master/bin/install_jupyter.sh#L43
> my proposed PR (https://github.com/OSGeo/OSGeoLive-Notebooks/pull/5)
> uses the following packages:
>
> ###
> python-matplotlib
> python-scipy
> gdal-bin
>
> python-geographiclib
> python-geocoder
> ###
>
> Of these, only the last two are new to OSGeo-Live.
> The other packages were already installed in the previous version of OSGeo-Live
> and are pretty common in the OSGeo software ecosystem.
>
> In the PR comments, I added notes on its footprint, which is on the order of 10 MB.

Thank you for the detailed assessment of the dependencies brought in by 
your work.
Not including gdal-bin was actually a bug ;)
You are correct that the other packages were already present in the 
previous release. I just wanted to point out that we are being strict with 
the Notebooks themselves, yet we include other projects without even 
discussing it...

>
> 2) Include jupyter YES or NO
>
> I’ll leave this to the community, but I should clarify that:
>
> -  there is a tremendous amount of work behind the finally complete Jupyter packaging, which makes the live a better product.
> -  all the notebooks developed for the GSoC work just fine with IPython Notebook as well as with Jupyter Notebook.
>
> And I should point out that:
> -  IPython Notebook is not new to OSGeo-Live; it has shipped with OSGeo-Live for more than a year.
> -  the notebook has been successfully adopted by some of the main OSGeo projects during OSGeo conference workshops
>     (see [1] [2] for examples)
> -  to have the GSoC work merged into the current OSGeo-Live, there is no need for Jupyter ... if you really want to drop it :( .

For the record, I am in favor of including the Python stack in 
OSGeo-Live, including Jupyter.
We DO need a working Jupyter quickstart to be able to include it though, 
and it has to be ready within the next 48 hours.

>
>
> 3) contributing new notebooks to the live:
>
> Although the notebook can be used as a “scripting tool” or as a simple IDE, IMHO it is not just that. The notebook makes it possible to add rich descriptive narrative to a data processing workflow, to interact with data through a nicely done interactive widget system, and finally to print publication-ready reports (see inline LaTeX rendering and PDF export).

+1

>
> Also, on the OSGeo-live, we are aiming (AFAIK) to use the same `common dataset` among projects (or, at least, we attempt to). This way different software projects can be compared more easily, say for complexity of usage or for output quality/performance.

Very valid point raised here regarding the use of the common dataset in 
the Notebooks. This should be a requirement for selection.

>
> Having said that,
> from a first look at all the other notebooks shipped within the OSGeo-live (or I should say, the only ones, if the PR is not accepted), I noticed that almost all the notebooks under the “projects” folder [3] are not the result of original work. Some of those notebooks are just a carbon copy of the example .py (script) code from the respective project's source code on GitHub, and the potential offered by the notebook platform is not exploited at all. I have no problem with having them included on the OSGeo-live, but I hope the contributors of those notebooks will make more effective use of the tool.
>
> Moreover, almost none of those notebooks make use of the data we already ship within the live…
> Hint: the notebook developer (I said developer, right … not just contributor) can make use of one of the strengths of the notebook, which is the ability to mix Python and bash in the same document...
> e.g. you need a GeoJSON file? Just use 'ogr2ogr' to convert one of our shapefiles to the format you need … or make a numpy loop generate novel user-defined data .. perhaps from a nice query to PostGIS (via psycopg) … it will take a few lines of code and a nicely done HTML description for each step.

I agree on the term developer vs contributor.
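
As a rough illustration of the ogr2ogr hint above (a minimal sketch, not code 
taken from any of the Notebooks under discussion): the helper names and file 
paths below are hypothetical, and the conversion is only attempted when 
ogr2ogr is actually installed and the input file exists.

```python
import shutil
import subprocess
from pathlib import Path


def build_ogr2ogr_command(src, dst):
    """Return the ogr2ogr argument list that converts src to GeoJSON at dst."""
    # ogr2ogr expects the destination dataset first, then the source.
    return ["ogr2ogr", "-f", "GeoJSON", str(dst), str(src)]


def shapefile_to_geojson(src, dst):
    """Run the conversion; return False if ogr2ogr or the input is missing."""
    if shutil.which("ogr2ogr") is None or not Path(src).exists():
        return False
    subprocess.run(build_ogr2ogr_command(src, dst), check=True)
    return True


# Hypothetical paths, for illustration only.
cmd = build_ogr2ogr_command("data/countries.shp", "countries.geojson")
print(" ".join(cmd))
```

In a notebook cell the same command line can also be run directly with the 
IPython `!` shell escape, which is the Python/bash mixing mentioned above.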

>
> 4) which notebook to include on OSGeo-Live 9.5
>
> trying to address this:
>
>>> I'm proposing that we release just a few of the Notebooks first, seek community feedback on this small subset, adapt if required. But most importantly build an OSGeo-Live notebook community and buy in before going too wide.
>
> While I’d love to see more people contributing to the notebook ecosystem on the OSGeo-live as well, in regard to the GSoC notebooks I can’t see how it would be possible to apply this:
>
> ''' release just a few of the Notebooks '''
>
> to the GSoC notebooks.
>
> If you have an idea of what GSoC 2015 was all about (and as principal mentor you should!), you will notice that those notebooks are linked together; they are a whole. What you propose in the sentence above (if applied to the GSoC notebooks) sounds like publishing a book without all the pages….

As a co-mentor of the GSoC project, I have fully reviewed the Notebooks 
you provided and reproduced your results back then.
You recently opened a PR about including them in OSGeoLive. Wearing my 
OSGeoLive maintainer hat, I had to re-evaluate your Notebooks for 
missing libraries, dead links, etc. This process/review did not imply 
that your code was not working.
I also have to report that the Notebooks from GSoC are working as 
expected in nightly builds of OSGeoLive.

>
> In the proposed PR I already reduced the number of notebooks: I removed extra features, incomplete placeholders, and other notebooks based on Python 3 (which is not shipped in this release).

Thank you for removing the placeholders; they were giving the wrong 
impression of incomplete work. I hope the interesting topics that got 
removed can be added again later with full content.

>
> As for the vote: I want to see my GSoC work merged, and that’s why I’m still working on the OSGeo-live.
> Of course, I’m biased, and I’ll let the community decide.

Thank you for opening the motion on a different thread.

>
> Cheers,
> Massimo.

Best,
Angelos

>
>
> [1] https://github.com/wenzeslaus/python-grass-addon
> [2] https://github.com/zarch/workshop-pygrass
> [3] https://github.com/OSGeo/OSGeoLive-Notebooks/tree/master/projects
>
>> On Mar 7, 2016, at 8:15 PM, Angelos Tzotsos <gcpp.kalxas at gmail.com> wrote:
>>
>> Hi,
>>
>> This discussion is not only related to the work done by Massimo and other contributors, but has further implications for how we include new projects in OSGeoLive:
>>
>> So far, Jupyter has been included in the development builds as a natural evolution of the IPython project (which was also recently included in OSGeoLive), so Jupyter never followed the path to be officially included [1]. As we speak (far past the feature freeze date) only the overview doc is committed so one could argue in favor of dropping Jupyter from this release completely...
>>
>> Furthermore, as part of the Jupyter setup, a big number [2] [3] of Python projects/libraries were also added to OSGeoLive, as soft dependencies to support Notebooks, without official review or decision by the community, just by following volunteers' vision of what functionality should be demonstrated by the Notebooks. Part of this vision was to keep OSGeoLive relevant in terms of tools and trends in the open source geospatial world.
>>
>> Now the proposal is to select a subset of Notebooks to present to the community in order to get feedback and build a notebook community to support further adoption. A natural follow-up question is: should we also select which supporting projects will be included? If we drop some of the Notebooks, their dependencies should also be dropped, right?
>>
>> So how do we decide which notebooks/supporting libraries to keep? And how do we do that without being disrespectful to all volunteers who contributed their time to maintain the Jupyter/Python stack for some months now (including hacking notebooks, creating debian packages, mentoring or participating in the GSoC project)?
>>
>> Here is my proposal:
>> 1. Massimo and Brian are named official maintainers of Jupyter, as the only actual contributors of Notebooks [4]
>> 2. Jupyter maintainers have to provide all the needed documentation (overview and quickstart) ASAP, else Jupyter is dropped from 9.5
>> 3. Jupyter maintainers get to decide which Notebooks to include in the final iso.
>> 4. If they disagree, the community votes whether to include all notebooks OR drop Jupyter from the 9.5 release and re-evaluate for 10.0
>>
>> Regards,
>> Angelos
>>
>> [1] https://wiki.osgeo.org/wiki/Live_GIS_Disc#How_to_add_a_project_to_OSGeoLive
>> [2] https://github.com/OSGeo/OSGeoLive/blob/master/bin/install_jupyter.sh#L30
>> [3] https://github.com/OSGeo/OSGeoLive/blob/master/bin/install_jupyter.sh#L43
>> [4] https://github.com/OSGeo/OSGeoLive-Notebooks/graphs/contributors
>>
>> On 03/08/2016 12:17 AM, Cameron Shorter wrote:
>>> Hi all,
>>> I should clarify my statement below (as has been pointed out to me off list), as it might appear that I'm implying a lack of future, or quality, of notebooks.
>>>
>>> My comments below relate to the level of external testing and the size of the community that has reviewed Massimo's notebooks.
>>>
>>> I think that Massimo has done an excellent job pioneering notebooks within the OSGeo-Live framework, and these notebooks provide a great platform from which to demonstrate OSGeo functionality.
>>> I think our next step is to work toward bringing a groundswell of community behind the development of these notebooks.
>>>
>>> My suggested approach differs a little from that proposed by Massimo, although I think we are aiming toward the same long-term goal (of wide adoption and community maintenance of Notebooks within the OSGeo-Live framework).
>>>
>>> I'm proposing that we release just a few of the Notebooks first, seek community feedback on this small subset, adapt if required. But most importantly build an OSGeo-Live notebook community and buy in before going too wide.
>>>
>>> This question is still unresolved within the core OSGeo-Live team, and we need to make a decision fast, as our Release Candidate is due next Monday 14 March. Opinions from our OSGeo-Live community would be greatly appreciated so we can make a wise decision moving forward.
>>>
>>> Warm regards, Cameron
>>>
>>> On 7/03/2016 10:54 pm, Cameron Shorter wrote:
>>>> Angelos, all,
>>>> I am concerned about how much new material we are attempting to add related to Jupyter notebooks, all at the last moment.
>>>>
>>>> With OSGeo-Live, we have built our reputation around quality and stability, and I think we should be careful not to compromise that. We will attract more users to Jupyter notebooks if they try one excellent notebook, and look elsewhere for more, than if they try 10 notebooks which almost work.
>>>>
>>>> So before adding a new Notebook, I suggest that it should be tested start to finish, and then thoroughly reviewed  by the author, and then at least one other person, preferably 2.
>>>>
>>>> Am I right in understanding that we are currently proposing to add ~ 30 new notebooks? I'd be inclined to pick out 2 to 5 of these and focus on getting just these working.
>>>> (The remainder can be included on OSGeo-Live for testing and workshops, just ensure that you can only find it if provided with the correct URL)
>>>>
>>>> That said, who do we have available to help test notebooks? If you can help out, please reply to this email, volunteering your services.
>>>>
>>>> On 7/03/2016 10:34 pm, Angelos Tzotsos wrote:
>>>>> 2. Jupyter Notebooks: We currently have a git repository with notebooks to include in the final release and we also have an open pull request to merge the work from GSoC 2015 [5].
>>>>> There is a special nightly build [6][7] including the GSoC notebooks.
>>>>> We need to evaluate all our notebooks for this release and make a decision on the notebooks to be included.
>>>>> Perhaps we need a team of volunteers to go through all notebooks and review them? Perhaps we need a spreadsheet listing all notebooks and their status? Thoughts?
>>
>> -- 
>> Angelos Tzotsos, PhD
>> OSGeo Charter Member
>> http://users.ntua.gr/tzotsos


-- 
Angelos Tzotsos, PhD
OSGeo Charter Member
http://users.ntua.gr/tzotsos



