[Pywps-dev] Guidelines for netCDF file and opendap access within pywps

David Huard huard.david at ouranos.ca
Wed Jun 27 05:47:54 PDT 2018


I've got something working, but it's not pretty... If the input mime type
is application/x-ogc-dods, the href handler skips the download and assigns
the link to the `data` attribute. If I ask for the `file` attribute, pywps
will download the file locally.

Now if I want my process to support both netCDF files and opendap links,
for file input it's the file_handler that sets the `file` attribute, but
then the `data` attribute holds the actual file's content, not the path to
the file. I guess I could special-case the netcdf mime type in the
file_handler to set `data` to the file path, but it feels clunky.
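
To make the asymmetry concrete, here is a minimal sketch of what a process
would have to do today to get something it can open in either case. The
helper name `dataset_location` is hypothetical, the netCDF mime type string
is an assumption, and the two inputs are plain stand-in objects rather than
real ComplexInput instances; the sketch only assumes the `data`/`file`
behaviour described above.

```python
from types import SimpleNamespace

OPENDAP_MIME = "application/x-ogc-dods"  # mime type named above


def dataset_location(inp):
    """Return an OPeNDAP URL or a local path for the given input.

    Assumes `inp` exposes `data_format.mime_type`, `data` and `file`
    with the semantics described in this message.
    """
    if inp.data_format.mime_type == OPENDAP_MIME:
        # The href handler skipped the download: `data` holds the link.
        return inp.data
    # Any other mime type: fall back to the file on disk.
    return inp.file


# Stand-ins for ComplexInput instances, for illustration only.
remote = SimpleNamespace(
    data_format=SimpleNamespace(mime_type=OPENDAP_MIME),
    data="http://example.org/thredds/dodsC/sample.nc",
    file=None,
)
local = SimpleNamespace(
    data_format=SimpleNamespace(mime_type="application/x-netcdf"),
    data=b"raw file content",
    file="/tmp/sample.nc",
)

print(dataset_location(remote))
print(dataset_location(local))
```

Every process accepting both kinds of input would need this branch, which
is the clunkiness I'd like to avoid.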

I'm wondering if anyone has a better design idea in mind, that could extend
gracefully to other mime types? Should ComplexInput be subclassed by
mimetype, so that the file, stream and data handling as well as validation
is encapsulated in a class?
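
A rough sketch of what I mean by subclassing per mime type. None of these
classes exist in pywps; `MimeTypeInput`, `NetCDFInput`, `OpendapInput` and
their methods are hypothetical names, and the validation is a placeholder.
The point is only that the download decision and the validation live with
the mime type instead of in the generic handlers.

```python
class MimeTypeInput:
    """Hypothetical base class: one subclass per mime type."""

    registry = {}     # mime type -> subclass
    mime_type = None  # set by subclasses

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.mime_type is not None:
            MimeTypeInput.registry[cls.mime_type] = cls

    def __init__(self, href):
        self.href = href

    def needs_download(self):
        # Default behaviour: fetch the referenced file locally.
        return True

    def validate(self):
        # Default behaviour: accept anything.
        return True

    @classmethod
    def for_mime_type(cls, mime_type, href):
        # Unknown mime types fall back to the class this is called on.
        return cls.registry.get(mime_type, cls)(href)


class NetCDFInput(MimeTypeInput):
    mime_type = "application/x-netcdf"  # assumed mime type string

    def validate(self):
        # Placeholder check; a real one would try to open the file.
        return self.href.endswith(".nc")


class OpendapInput(MimeTypeInput):
    mime_type = "application/x-ogc-dods"

    def needs_download(self):
        # Remote access on demand: nothing to download.
        return False


inp = MimeTypeInput.for_mime_type(
    "application/x-ogc-dods", "http://example.org/dodsC/sample.nc"
)
print(type(inp).__name__, inp.needs_download())
```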

One problem I can see cropping up is that as pywps extends support for
other "special" mimetypes, the dependencies will become harder to maintain.
Indeed, the netcdfvalidator requires netCDF4 to be installed, which is not
a light dependency. My guess is that pywps should support out of the box
the "light" mime types, and have a plugin mechanism for more complicated
ones.
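
One possible shape for such a plugin mechanism, as a sketch only: a
registry that maps a mime type to the module it requires and imports that
module lazily, so pywps itself would not depend on netCDF4. The names
`register_validator`, `get_validator` and `_netcdf_factory` are
hypothetical, and the set of exceptions caught around `Dataset` is an
assumption that may vary between netCDF4 versions.

```python
import importlib
import importlib.util

_VALIDATORS = {}  # mime type -> (required top-level module, factory)


def register_validator(mime_type, requires, factory):
    """Register a validator factory without importing its dependency."""
    _VALIDATORS[mime_type] = (requires, factory)


def get_validator(mime_type):
    """Return a validator callable, or None if it is unavailable.

    None is returned both for unregistered mime types and when the
    required module is not installed.
    """
    entry = _VALIDATORS.get(mime_type)
    if entry is None:
        return None
    requires, factory = entry
    if importlib.util.find_spec(requires) is None:
        return None
    # The heavy import happens only when the validator is requested.
    return factory(importlib.import_module(requires))


def _netcdf_factory(nc):
    def validate(path_or_url):
        # Valid if netCDF4 can open it, locally or through opendap.
        try:
            nc.Dataset(path_or_url).close()
            return True
        except (OSError, RuntimeError):
            return False
    return validate


register_validator("application/x-netcdf", "netCDF4", _netcdf_factory)
```

With this layout, an installation without netCDF4 simply gets `None` back
from `get_validator("application/x-netcdf")` and can skip validation or
reject the input, whichever policy pywps prefers.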

David


On Tue, Jun 26, 2018 at 9:23 AM David Huard <huard.david at ouranos.ca> wrote:

> Thanks !
>
> I'll look at it and come back with a PR.
>
> On Tue, Jun 26, 2018 at 9:16 AM Jachym Cepicky <jachym.cepicky at gmail.com>
> wrote:
>
>> I believe this is the place where the data get downloaded:
>> https://github.com/geopython/pywps/blob/master/pywps/app/Service.py#L191
>>
>> On Tue, Jun 26, 2018 at 14:57, David Huard <huard.david at ouranos.ca>
>> wrote:
>>
>>> Hi Jachym,
>>>
>>> Thanks for the pointers, I've started writing validators for netCDF. I'm
>>> still wondering where the decision to download a file is made? Can I
>>> shortcut that decision and avoid a file download if the href is a valid
>>> opendap link, i.e. if it passes the validatenetcdf checks?
>>>
>>>
>>> On Fri, Jun 22, 2018 at 4:53 AM Jachym Cepicky <jachym.cepicky at gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> yes, ComplexInput should work for you - you can pass the URL of the
>>>> data using the "<Reference ... />" element; see [1] for an example.
>>>>
>>>> Any Format can have (and has by default) a `validator` function, which
>>>> returns whether the input data are valid or not [3]. You can also use
>>>> the `get_format` function [4] and set the validator there.
>>>>
>>>> For examples of what a validating function can look like, see the
>>>> shapefile or gml validators [5].
>>>>
>>>> You should probably extend formats [2] with the NetCDF mimetype.
>>>>
>>>> But this will check the file only after it has been downloaded to
>>>> PyWPS - not the URL. Still, is that sufficient?
>>>>
>>>> Jachym
>>>>
>>>> [1]
>>>> https://github.com/geopython/pywps/blob/master/tests/requests/wps_execute_request-responsedocument-1.xml#L24
>>>> [2]
>>>> https://github.com/geopython/pywps/blob/master/pywps/inout/formats/__init__.py
>>>> [3]
>>>> https://github.com/geopython/pywps/blob/master/pywps/inout/formats/__init__.py#L42
>>>> [4]
>>>> https://github.com/geopython/pywps/blob/master/pywps/inout/formats/__init__.py#L215
>>>> [5]
>>>> https://github.com/geopython/pywps/blob/master/pywps/validator/complexvalidator.py
>>>>
>>>>
>>>>
>>>> On Thu, Jun 21, 2018 at 17:15, David Huard <huard.david at ouranos.ca>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'd like to contribute a pull request to better handle netCDF files in
>>>>> pywps but I don't know where to start.
>>>>>
>>>>> We have a number of processes taking netCDF
>>>>> <https://www.unidata.ucar.edu/software/netcdf/> files as inputs. For
>>>>> those less familiar with the format, netCDF is based on HDF5 and a set of
>>>>> conventions <http://cfconventions.org/>. It is the standard data
>>>>> format in oceanography and climatology. netCDF files are usually stored on
>>>>> servers with support for opendap <https://www.opendap.org/>. This
>>>>> means that users can either download the netCDF file and then open it
>>>>> locally, or use the opendap protocol to open it remotely. What that means
>>>>> is that you can do
>>>>>
>>>>> import netCDF4 as nc
>>>>> ds1 = nc.Dataset("<path to local file>")
>>>>> ds2 = nc.Dataset("<link to opendap address>")
>>>>>
>>>>> and both ds1 and ds2 will behave identically. However ds2 is not
>>>>> downloaded locally, but rather read remotely on demand. If a file contains
>>>>> a 3D matrix (time, lat, lon), you can read one slice of the matrix without
>>>>> downloading it all.
>>>>>
>>>>> Some of our pywps.Process subclasses support both netCDF file and opendap access.
>>>>> We define a ComplexInput for the address to an actual netCDF file, and a
>>>>> LiteralInput for the opendap address.
>>>>>
>>>>> My question is whether there would be a clean way for pywps to support
>>>>> both modes with one ComplexInput? Internally, pywps would check if the
>>>>> address supports opendap (just check if nc.Dataset(url) works), and if not,
>>>>> would download the file locally to the server.
>>>>>
>>>>> In both cases, we could do
>>>>>
>>>>> ds = nc.Dataset(request.inputs['resource'][0].file)
>>>>>
>>>>> I'm willing to put in the time to do it, I just don't know where to
>>>>> start.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> David
>>>>>
>>>>> _______________________________________________
>>>>> pywps-dev mailing list
>>>>> pywps-dev at lists.osgeo.org
>>>>> https://lists.osgeo.org/mailman/listinfo/pywps-dev
>>>>
>>>>
>>>>
>>>> --
>>>> Jachym Cepicky
>>>> e-mail: jachym.cepicky gmail com
>>>> URL: http://les-ejk.cz
>>>> GPG: http://les-ejk.cz/pgp/JachymCepicky.pgp
>>>
>>>
>>
>
>