[Zoo-discuss] zoo.update_status in a Python Zoo service

Tue Oct 18 04:14:20 PDT 2016

Dear Gerald,

Many thanks for a comprehensive and helpful reply.  I am now running my service asynchronously, which I clearly hadn't told Zoo to do before, despite thinking I had!  I am pleased that the GetStatus request is available in Zoo even with WPS 1.0.0, where it is not part of the standard.  I was thinking I would have to use 2.0.0, where Execute is only supported via POST, with the request in the body.

I think I'm pretty close to having this working now.  The only problem I now have relates to the URL rewriting.  My execute response document statusLocation looks like:

statusLocation="http://<server>/cgi-bin/zoo_loader.cgi/call/GetStatus/c40d5f9c-951e-11e6-a413-0050560119d2"

which is a hybrid of the non-mod_rewrite URL /cgi-bin/zoo_loader.cgi?... and the more REST-style URL I would expect (/zoo/call/GetStatus/<sid>).  I suspect this is due to the entries in main.cfg:

serverAddress = http://<server>/zoo
rewriteUrl = call

The .htaccess file containing these rules:

RewriteEngine on
RewriteRule call/GetStatus/(.*) /cgi-bin/zoo_loader.cgi?request=Execute&service=WPS&version=1.0.0&Identifier=GetStatus&DataInputs=sid=$1 [L,QSA]
RewriteRule call/(.*)/(.*) /cgi-bin/zoo_loader.cgi?request=Execute&service=WPS&version=1.0.0&Identifier=$1&DataInputs=uuid=$2 [L,QSA]
RewriteRule (.*)/(.*) /cgi-bin/zoo_loader.cgi?metapath=$1 [L,QSA]
RewriteRule (.*) /cgi-bin/zoo_loader.cgi [L,QSA]

is in /var/www/zoo.  As you can see I've modified the documented ones a bit as my service has a "uuid" data input, and I think GetStatus requires it to be called "sid".  My understanding is that these should work, and indeed I can call my service successfully using:

http://<server>/zoo/call/<ServiceIdentifier>/<uuid>

as expected.  What's controlling the URL that gets written into the statusLocation attribute?
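
For reference, the order in which the four rules above are tried can be approximated in Python. This is only a rough sketch under assumptions: it treats each RewriteRule pattern as an unanchored regular expression matched against the path relative to /var/www/zoo, stops at the first match (as [L] does), and ignores QSA and everything else mod_rewrite does. The function name rewrite and the sample paths are hypothetical.

```python
import re

# Assumption: the loader path and query strings are copied from the
# .htaccess rules above; nothing here talks to Apache or ZOO.
LOADER = "/cgi-bin/zoo_loader.cgi"
EXECUTE = LOADER + "?request=Execute&service=WPS&version=1.0.0"

RULES = [
    (r"call/GetStatus/(.*)", EXECUTE + "&Identifier=GetStatus&DataInputs=sid=$1"),
    (r"call/(.*)/(.*)", EXECUTE + "&Identifier=$1&DataInputs=uuid=$2"),
    (r"(.*)/(.*)", LOADER + "?metapath=$1"),
    (r"(.*)", LOADER),
]


def rewrite(path):
    """Return the target of the first rule whose pattern matches path."""
    for pattern, target in RULES:
        match = re.search(pattern, path)
        if match:
            # Replace $1, $2 with the corresponding captured groups.
            return re.sub(r"\$(\d)",
                          lambda ref: match.group(int(ref.group(1))),
                          target)
    return path


if __name__ == "__main__":
    for sample in ("call/GetStatus/some-sid",
                   "call/MyService/some-uuid",
                   "other/path",
                   "anything"):
        print(sample, "->", rewrite(sample))
```

Under these assumptions the GetStatus rule is tried first, so a path such as call/GetStatus/<sid> never reaches the more general call/<Identifier>/<uuid> rule. The sketch only covers how incoming URLs are mapped; it says nothing about how the statusLocation attribute is generated.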

Many thanks,

David.

________________________________
From: Fenoy Gerald <gerald.fenoy at geolabs.fr>
Sent: 18 October 2016 10:19:40
To: Herbert, David J.
Cc: ZOO-discuss
Subject: Re: [Zoo-discuss] zoo.update_status in a Python Zoo service

Hello David,
It seems that you have clearly identified what update_status is for.

As you can see in the very basic demo service [1], which uses update_status, you can also add a message by setting conf["lenv"]["message"]. Personally, I would try this service first on the same platform where the error occurred, to check whether the issue also occurs with it.
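
A minimal sketch of that pattern, setting conf["lenv"]["message"] and then reporting an integer percentage, might look like the following. The helper name report_progress and the message wording are hypothetical. The update function is passed in as an argument so that the sketch can be run outside ZOO; inside a service one would presumably pass zoo.update_status.

```python
def report_progress(conf, done, total, update_status):
    """Set a progress message and report an integer percentage (0-100).

    update_status is any callable taking (conf, percent); inside a ZOO
    service this is assumed to be zoo.update_status.
    """
    if total <= 0:
        percent = 0
    else:
        # Clamp so that the reported value always stays within 0..100.
        percent = max(0, min(100, (100 * done) // total))
    conf["lenv"]["message"] = "Downloaded %d of %d bytes" % (done, total)
    update_status(conf, percent)
    return percent


if __name__ == "__main__":
    # Stand-in for zoo.update_status, used only for this demonstration.
    def fake_update(conf, percent):
        print(percent, conf["lenv"]["message"])

    conf = {"lenv": {}}
    for done in (512, 1024, 2048):
        report_progress(conf, done, 2048, fake_update)
```

Inside a chunk-wise download loop, the call would be something like report_progress(conf, bytes_so_far, expected_bytes, zoo.update_status), where both byte counts are names assumed for illustration.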

Obviously, calling the update_status function alone won't automatically make your service run asynchronously. Asynchronous execution is requested by the client, by using storeExecuteResponse=true and status=true in WPS 1.0.0.

I think this is the issue in your case: you are trying to execute the service without asking for it to run asynchronously. Running asynchronously means that the service runs only after a URL (statusLocation) has been returned to the client, which can then be polled for details about the ongoing status.

So, to try the demo service, load the following URL:

http://localhost/cgi-bin/zoo_loader.cgi?request=Execute&service=wps&version=1.0.0&Identifier=demo&DataInputs=a=toto&status=true&storeExecuteResponse=true
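
If the request is built from a client script rather than typed into a browser, the same URL can be assembled as sketched below. This assumes a Python 3 client; the function name async_execute_url is hypothetical, and only the parameters shown in the URL above are used.

```python
from urllib.parse import urlencode


def async_execute_url(endpoint, identifier, data_inputs):
    """Build a WPS 1.0.0 Execute URL that asks for asynchronous execution."""
    params = [
        ("request", "Execute"),
        ("service", "wps"),
        ("version", "1.0.0"),
        ("Identifier", identifier),
        ("DataInputs", data_inputs),
        # These two parameters are what request asynchronous execution.
        ("status", "true"),
        ("storeExecuteResponse", "true"),
    ]
    # safe="=" keeps the "=" inside DataInputs readable (a=toto).
    return endpoint + "?" + urlencode(params, safe="=")


if __name__ == "__main__":
    print(async_execute_url("http://localhost/cgi-bin/zoo_loader.cgi",
                            "demo", "a=toto"))
```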

This should return a valid Execute response document containing the statusLocation that you can poll. Something like the following:

http://localhost/cgi-bin/mm/zoo_loader.cgi?request=Execute&service=WPS&version=1.0.0&Identifier=GetStatus&DataInputs=sid=B9E80F8C-9512-11E6-8081-000C6C07034C&RawDataOutput=Result

Note that the resource produced by the GetStatus service is not stored; it is generated by combining the Execute response stored on the file system with the latest message/percentage pair stored in a shared memory segment or in a database backend.
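
On the client side, both the initial Execute response and each polled document can be read with a small parser. The sketch below assumes a Python 3 client and the WPS 1.0.0 response layout (a statusLocation attribute on the root element, and a wps:Status element whose single child names the state). The SAMPLE document is made up for illustration and is not real ZOO output.

```python
import xml.etree.ElementTree as ET

WPS_NS = "http://www.opengis.net/wps/1.0.0"

# Hypothetical document, written by hand for this example.
SAMPLE = """<wps:ExecuteResponse xmlns:wps="http://www.opengis.net/wps/1.0.0"
    service="WPS" version="1.0.0"
    statusLocation="http://localhost/cgi-bin/zoo_loader.cgi?request=Execute&amp;service=WPS&amp;version=1.0.0&amp;Identifier=GetStatus&amp;DataInputs=sid=EXAMPLE">
  <wps:Status creationTime="2016-10-18T10:00:00Z">
    <wps:ProcessStarted percentCompleted="42">Step 42</wps:ProcessStarted>
  </wps:Status>
</wps:ExecuteResponse>"""


def read_status(xml_text):
    """Extract statusLocation, state, percentage and message from a response."""
    root = ET.fromstring(xml_text)
    result = {"location": root.get("statusLocation"),
              "state": None, "percent": None, "message": None}
    status = root.find("{%s}Status" % WPS_NS)
    if status is not None and len(status) > 0:
        child = status[0]
        # Strip the namespace: "{...}ProcessStarted" -> "ProcessStarted".
        result["state"] = child.tag.split("}")[-1]
        percent = child.get("percentCompleted")
        if percent is not None:
            result["percent"] = int(percent)
        result["message"] = (child.text or "").strip() or None
    return result


if __name__ == "__main__":
    print(read_status(SAMPLE))
```

A polling client would fetch the location URL repeatedly (for example with urllib.request.urlopen) and stop once the state is ProcessSucceeded or ProcessFailed.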

I hope this simple example works on your side and helps you figure out what happens in your own service. Otherwise, please feel free to send more information from debugging the service execution, perhaps by running it from the command line.

I hope to hear back from you.
Best regards,

[1] http://zoo-project.org/trac/browser/trunk/zoo-project/zoo-services/utils/status/cgi-env/service.py

> On 14 Oct 2016, at 16:55, Herbert, David J. <darb1 at bas.ac.uk> wrote:
>
> Hello,
>
> I am developing a Zoo WPS service in Python on an Ubuntu 14.04 Linux platform.  The particular service includes the download of quite large image files, so can obviously be quite long running, especially for slow ftp sites.
>
> I am confused by the documentation about the Zoo status service, which I am thinking allows the querying by a client of the continued progress of the long running service.  I am assuming it uses the storeExecuteResponse capability of WPS to update a document at a given URL with progress information (a percentage complete figure, for example).
>
> So I have Python code which iterates chunk-wise over the content of a large image, computes an integer percentage complete figure, and calls zoo.update_status(conf, percent_complete).  I would expect this to update a web-accessible document somewhere (another question I have: where exactly, as a URL, is this document, so that I can check the progress of my service?).  I am confident I am passing a valid service configuration object to the call, and a valid integer between 0 and 100.  When I make the call, I get the error:
>
> /usr/lib/python2.7/threading.py:1160: RuntimeWarning: tp_compare didn't return -1 or -2 for exception
> return _active[_get_ident()]
>
> The request then hangs and eventually times out.  If I remove the update_status call, the download proceeds perfectly, according to trace in the Apache log and the returned XML document from the service itself.  Surrounding the zoo.update_status call with a blanket try/except doesn't reveal any more information about what the error was unfortunately.  It seems like it might be a rather low level error?
>
> I built the status service according to:
>
> http://www.zoo-project.org/docs/services/status.html
>
> and did not receive any errors in the build process, and copied all necessary files to the right places as far as I know.  I am also using the mod_rewrite rules for URLs according to:
>
> http://www.zoo-project.org/docs/kernel/install-debian.html#rewrite-rule-configuration
>
> Which appears to be working well, again as far as I can see.
>
> Can someone enlighten me as to what the above error might be?  Googling other occurrences of this particular error (only found in contexts other than Zoo) hasn't sparked off any ideas on my part.
>
> Hope someone can help!  Happy to provide more info/code etc if this can help tracking it down.
>
> Best regards,
>
> David Herbert
> British Antarctic Survey
> This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system.
> _______________________________________________
> Zoo-discuss mailing list
> Zoo-discuss at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/zoo-discuss

Gérald Fenoy
http://wiki.osgeo.org/wiki/User:Djay
