[Zoo-discuss] 2022 OGC-OSGeo-ASF Code Sprint report

Venka venka.osgeo at gmail.com
Thu Mar 17 19:36:44 PDT 2022


Hi Gerald,

Thanks for the update about ZOO-Project participation in the Code Sprint.

Seems that you had a fruitful sprint implementing new features and
learning about some shortcomings of the current approach to integrating
service execution using OGC API - Features as input.

Hope these new features will be included in the upcoming ZOO-Project
2.0.0 release.

It may be good to consider a feature freeze at some stage and focus on
reviewing/updating documentation.

Thanks again to all the ZOO-Keepers for the time and effort put in during
the code sprint.

Best

Venka

On 3/15/2022 11:05 PM, Gérald Fenoy wrote:
> Hello everyone,
> I am glad to provide a short report on the ZOO-Project's participation in this year's OGC-OSGeo-ASF Code Sprint.
> 
> A number of people from various organisations, including the OGC, OSGeo and the ASF, gathered for three days, from March 8th to 10th 2022, in a joint effort to explore and investigate the various standards available.
> 
> On the ZOO-Project side, as you already know, we had published an ideas page [1] for this code sprint. As always, we had many goals and could not accomplish all of them: some were completed, some only partially, and for some on which we made big progress we realized that we may have been going the wrong way.
> 
> Anyway, let us start with the bright side. The first idea [2] has been partially implemented, meaning that asynchronous requests now effectively use RabbitMQ when it is available. The last point was to consider using the same mechanism for synchronous requests. We concluded that the current implementation would require significant modifications and refactoring to handle synchronous requests as asynchronous by default. In addition, it would add new steps to the processing, making the run take longer than it does now. On the other hand, it would bring advantages, among which is the ability to return a statusInfo for the service when the execution takes longer than a predefined threshold, which we could add to the main.cfg or which the client could send in a dedicated header parameter associated with the request. This has already been discussed in the OGC API - Processes SWG, and it would be a way for us to support this new functionality should it become a requirement.
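The threshold behaviour described above can be illustrated with a minimal Python sketch. It is not ZOO-Kernel code: the function name execute_with_threshold, the in-process thread pool (standing in for the RabbitMQ-backed workers) and the exact dictionary layout are assumptions made for the example. Only the idea is taken from the report: wait up to a threshold, then answer with a statusInfo-like document instead of the result.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical worker pool; a real deployment would hand the job to a queue.
_executor = ThreadPoolExecutor(max_workers=4)


def execute_with_threshold(run_service, threshold_seconds):
    """Run a service and wait at most threshold_seconds for its result.

    Returns the result if the service finishes in time, otherwise a
    statusInfo-like document saying that the job is still running.
    Exceptions raised by the service within the threshold propagate.
    """
    job_id = str(uuid.uuid4())
    future = _executor.submit(run_service)
    try:
        result = future.result(timeout=threshold_seconds)
    except FutureTimeout:
        # Too slow for a synchronous answer: report the job status instead.
        return {"jobID": job_id, "status": "running", "type": "process"}
    return {"jobID": job_id, "status": "successful", "result": result}
```

In this sketch the threshold is a plain argument; it could equally come from a configuration file or from a request header, as suggested in the report.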
> 
> The second idea [3] in the list was also considered during the code sprint, and I would like to personally thank Blasco Brauzzi for joining the Code Sprint and helping us in various ways. First of all, he helped me set up the ADES software properly on my platform and get it working up to the stage-out phase, which is where we got blocked. Once the min.io S3 bucket was correctly set up, and access to it from the other pods used by the ADES platform was correctly configured, this step worked too. As is very well explained on their GitHub repo [4], the ADES platform gives you the opportunity to deploy (and undeploy) new services and then execute them. These services are written in Common Workflow Language (CWL). When a client triggers the execution of the CWL « execution unit » (so, a service), ADES creates a namespace with dedicated pods inside it for executing every step of the service. If some operations can be run in parallel, the pods are started at the same time and the execution runs in parallel. Once the service execution has ended, the pods and the dedicated namespace are automatically removed by ADES. I would like to mention that I am impressed by the capabilities offered out of the box by the ADES platform.
> 
> Still on this second idea, Blasco has implemented support for automatic workspace creation relying on a shell script and made it available here [5] in the feature/user-services branch. While investigating this new feature, we realized that it may be better to use a similar mechanism that would rely not on an external shell script but on a directory considered as a skeleton for creating the new workspaces. This is very similar to what is done on Unix and GNU/Linux platforms when you create a new user and specify a directory to be used as a model for the new home directory. We expect this new feature to be available in the coming weeks.
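The skeleton-directory idea can be sketched in a few lines of Python. This is only an illustration of the mechanism, not the planned ZOO-Project implementation: the function name create_workspace and its parameters are hypothetical, and the real feature may name and locate workspaces differently.

```python
import shutil
from pathlib import Path


def create_workspace(skeleton_dir, workspaces_root, user_name):
    """Create a user's workspace by copying a skeleton directory.

    An existing workspace is left untouched, so the call is safe to
    repeat on every request made by the same user.
    """
    # Refuse names that could escape workspaces_root (e.g. "../x").
    if user_name in ("", ".", "..") or Path(user_name).name != user_name:
        raise ValueError(f"invalid workspace name: {user_name!r}")
    workspace = Path(workspaces_root) / user_name
    if workspace.exists():
        return workspace
    # copytree creates missing parent directories and copies the whole tree.
    shutil.copytree(skeleton_dir, workspace, symlinks=True)
    return workspace
```

Compared with an external shell script, the content of a new workspace is then defined entirely by what is placed in the skeleton directory.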
> 
> About the third idea [6], we did not have enough time to work actively on it. Nevertheless, we detected some issues in the answer provided by the ZOO-Kernel for the execute request [7]: we have extra keys that are not required. This has shown that some minimal testing of OGC API - Processes would be a real asset to the project, helping us detect issues when they are introduced in the code, as is already the case for WPS. So this is a target we should keep in mind and plan for in future work items.
> 
> For the fourth idea [8], we started by creating a new Dockerfile adding the latest MapServer version available on the official GitHub repository [9]. By doing so, we realized that when you want to use OGC API - Features, you should use simplified URLs such as /mapserv/MyMap/ogcapi/, where MyMap should point to the full path of the corresponding MapServer mapfile through an entry in /etc/mapserver.conf (cf. RFC 135 [10]). So, to be able to publish outputs as OGC API - Features, we needed to be able to update mapserver.conf by adding a new entry for every created mapfile. We tried locally modifying the MapServer source code by adding a new function to save the MapServer 8.0 config file, and successfully built and set up this demonstration instance [11]. Over the three days, the MapServer community integrated the proposed new function into the official GitHub repo. Consequently, we modified the Dockerfile to use the official repository rather than the local copy we were using before and pushed this version to GitHub. As this work was done at the very last moment, we decided to host it on my personal fork for further testing, cleaning and debugging before integration into the official GitHub repos of the ZOO-Project.
> 
> Actually, by integrating an upcoming version of MapServer inside the Dockerfile, we realized that it may be better to host the Dockerfiles and the associated docker-compose.yaml in a repository other than the ZOO-Project itself. This way we would be able to build multiple versions of the software, and we may also consider offering different flavors of the ZOO-Kernel rather than having everything integrated within the default latest ZOO-Kernel binary docker image [12]. Any volunteer for this task would be much appreciated.
> 
> In addition, Rajat also joined the Code Sprint and worked on a demo HTML UI that may be used as a base for presenting datasets available as OGC API - Features [13]. It can also serve as a base for integrating service execution using OGC API - Features as input.
> 
> On this specific topic, this is where, as I said in the introduction, we may have gone the wrong way. To explain it quickly, using OGC API - Features for input is a bit more complicated than simply passing a URL for fetching the data. Indeed, if the pre-defined server limit on the number of items returned is reached, your processing will run on only a part of the dataset. Consequently, OGC API - Features support for input would require more than what is currently available for downloading the data. On the other hand, there are ongoing discussions on GitHub [14] about directly accessing the output OGC API - Features collections to get the processing run on the fly for the subset you are currently requesting. Once again, this would require more development to be supported. But it would definitely be a great addition.
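To make the paging issue concrete, here is a minimal Python sketch of what a client has to do to read a whole collection rather than only the first page: keep following the link with rel="next" that an OGC API - Features server adds to an items response when more features remain. The helper names fetch_json and iter_features, the max_pages guard and the assumption that the "next" href is an absolute URL are all choices made for this example, not part of the ZOO-Kernel.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url):
    """Fetch a URL, asking for GeoJSON, and parse the body as JSON."""
    request = Request(url, headers={"Accept": "application/geo+json"})
    with urlopen(request) as response:
        return json.load(response)


def iter_features(items_url, fetch=fetch_json, max_pages=1000):
    """Yield every feature of a collection, one page at a time."""
    url = items_url
    for _ in range(max_pages):
        page = fetch(url)
        yield from page.get("features", [])
        # A further page, if any, is advertised by a rel="next" link.
        url = next(
            (link["href"] for link in page.get("links", [])
             if link.get("rel") == "next"),
            None,
        )
        if url is None:
            return
    raise RuntimeError("max_pages reached before the last page")
```

A single download of the items URL corresponds to the first iteration only, which is why a service fed that way may process just a part of the dataset.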
> 
> To finish and complete the report: unfortunately, due to the time spent on other ideas, we did not have enough time to investigate the fifth idea [15] further, but we still think that it would be worth investigating.
> 
> To conclude the code sprint report, I would like to say that it was very nice to be able to meet and discuss with the participants.
> 
> Any feedback or comments are welcome.
> 
> Best regards,
>   
> 
> [2] https://github.com/opengeospatial/developer-events/tree/master/2022/Joint-OGC-OSGeo-ASF-Code-Sprint%202022/ideas/ZOO-Project/RabbitMQ-OAPIP
> [3] https://github.com/opengeospatial/developer-events/tree/master/2022/Joint-OGC-OSGeo-ASF-Code-Sprint%202022/ideas/ZOO-Project/ADES
> [4] https://github.com/EOEPCA/proc-ades
> [5] https://github.com/bbrauzzi/ZOO-Project/tree/feature/user-services
> [6] https://github.com/opengeospatial/developer-events/tree/master/2022/Joint-OGC-OSGeo-ASF-Code-Sprint%202022/ideas/ZOO-Project/OAPIP-Testing
> [7] https://github.com/ZOO-Project/ZOO-Project/issues/15
> [8] https://github.com/opengeospatial/developer-events/tree/master/2022/Joint-OGC-OSGeo-ASF-Code-Sprint%202022/ideas/ZOO-Project/MapServer-OAPIF
> [9] https://github.com/gfenoy/ZOO-Project/commit/b7952aaee04abc105fb11aedbc9d0bd8cdf51f26
> [10] https://mapserver.org/development/rfc/ms-rfc-135.html
> [11] http://cs2022.geolabs.fr:8112/
> [12] https://hub.docker.com/r/djayzen/zookernel/tags
> [13] https://omshinde.github.io/ogc-api-features-leaflet/
> [14] https://github.com/opengeospatial/ogcapi-processes/issues/279#issuecomment-1046682392
> [15] https://github.com/opengeospatial/developer-events/tree/master/2022/Joint-OGC-OSGeo-ASF-Code-Sprint%202022/ideas/ZOO-Project/ActiveMQ-OAPIP
> 
> 
> Gerald Fenoy
> Chair, ZOO-Project PSC
> 
> 
> 


