Working towards Running Taverna Workflows on NGS resources
NGS resources are accessed and secured via internationally agreed grid protocols. In some cases, however, user communities require quick and simple access mechanisms that are at odds with the complexity of these grid middleware layers. The pressure on service providers to maintain secure interfaces often conflicts with the user communities' need for easy access, so we have to hide unnecessary complexity from users.
Taverna is a user-community-driven middleware solution for managing computational workflows. Its primary mode of use is via a Graphical User Interface (GUI) to an underlying workflow enactment engine. This allows users to express and manage their scientific process as a workflow. "Effectively, Taverna allows a scientist with limited computing background and limited technical resources and support to construct highly complex analyses over public and private data and computational resources, all from a standard PC, UNIX box or Apple computer". The GUI provides a user-friendly environment in which this work can take place, but it is limited in run time, compute and data resources by the computer on which the Taverna workflow management system (WMS) is running.
To address these limits, it must be possible to enact long-running or CPU/data-intensive tasks remotely, detached from the Taverna GUI. Fortunately, as of version 1.7 of the Taverna software, a command-line tool is provided that allows workflows to be enacted without the GUI. This opens up the possibility of running workflows in an HPC/HTC batch-scheduled environment such as the NGS compute environment. The most recent release of the Taverna software offers a second method for achieving this goal: the enactor can be run as a service.
At the e-Science All Hands meeting 2010, a team from the University of Manchester demonstrated how one might set up a user-friendly mechanism for enacting Taverna workflows on NGS compute resources, using the preproduction cloud offering at Oxford and the established grid computing offering at Manchester. The demonstration took two forms, one for each version of Taverna.
Grid Approach
For the grid computing approach we installed Taverna 1.7.2 on grid resources (currently limited to vidar.ngs.manchester.ac.uk). The application is made available in the same way as other NGS applications; specifically, the enactor can be started using Globus commands (instructions are available for those interested). In keeping with the usability mantra central to Taverna's mission, both Globus and the grid-specific authentication need to be hidden from the user. We did this by combining a simple portal with the NGS SARoNGS service, which obtains the necessary credentials via the UK's Shibboleth federation.
What remained for this part of the demonstration was to help the user obtain and handle workflows. Fortunately, the myExperiment community repository is designed for the storage and social sharing of such workflows. Each workflow page on the myExperiment website has a number of buttons for downloading the workflow or executing it (if you have Taverna installed). The natural extension for this demo was to add a "Run on the NGS" button to the page of every Taverna 1 workflow (Fig. 1). We were able to achieve this, rather slyly, without modifying the myExperiment service, by modifying the browser's rendering of the myExperiment page instead. This was done in the Firefox browser using GreaseMonkey and a script that modifies myExperiment pages as they load.

Fig. 1 myExperiment with "Run on NGS" button
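The page modification described above can be sketched roughly as follows. This is not the script used in the demonstration; it is a minimal illustration under stated assumptions. The portal address, the `workflow` query parameter, the `/workflows/<id>` URL shape and the way the workflow type is passed in are all hypothetical, as are the function and interface names. The document object is reduced to a small interface so that the logic can be exercised outside a browser.

```typescript
// Hypothetical portal address: the article does not give the demo portal's URL.
const PORTAL_RUN_URL = "https://portal.example.org/run";

// Assumed URL shape of a myExperiment workflow page, e.g. ".../workflows/16".
// Returns the numeric workflow id as a string, or null for any other page.
function workflowId(pageUrl: string): string | null {
  const match = /\/workflows\/(\d+)/.exec(pageUrl);
  return match ? match[1] : null;
}

// Build the link the button should point at, or null when no button applies
// (not a workflow page, or not a Taverna 1 workflow).
function runOnNgsLink(pageUrl: string, workflowType: string): string | null {
  const id = workflowId(pageUrl);
  if (id === null || workflowType.indexOf("Taverna 1") !== 0) {
    return null;
  }
  return PORTAL_RUN_URL + "?workflow=" + encodeURIComponent(id);
}

// The small part of the page's document object that this sketch relies on.
interface MinimalElement {
  href: string;
  textContent: string | null;
}
interface MinimalDocument {
  createElement(tag: string): MinimalElement;
  body: { appendChild(node: MinimalElement): unknown };
}

// Add a "Run on the NGS" link to the page when it shows a Taverna 1 workflow.
// Returns the new element, or null if nothing was added.
function addRunButton(
  doc: MinimalDocument,
  pageUrl: string,
  workflowType: string
): MinimalElement | null {
  const href = runOnNgsLink(pageUrl, workflowType);
  if (href === null) {
    return null;
  }
  const link = doc.createElement("a");
  link.href = href;
  link.textContent = "Run on the NGS";
  doc.body.appendChild(link);
  return link;
}
```

In an actual GreaseMonkey script the same logic would run as plain JavaScript against the browser's own document as each myExperiment page loads, with the workflow type read from the page itself. Keeping the decision logic separate from the page manipulation means the "is this a Taverna 1 workflow page?" rule can be checked without a browser.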
The grid part of the demonstration was intended to illustrate, in real time, the principles behind running Taverna 1 workflows via established NGS tooling. As such, it is not currently set up to run in an asynchronous mode (Figs. 2 and 3). We hope to provide such a service in the near future and, with the help of the myExperiment team, to incorporate the "Run on the NGS" button fully into the production myExperiment site.
Fig. 2 Using SARoNGS to authenticate

Fig. 3 Every workflow input needs a value; in this case a test value is supplied for the input called "condition" (left), and the results are then returned in a tarball (right)
Service Approach via cloud
We also demonstrated the first release of Taverna Server 2, which supports the substantially richer workflow model of Taverna 2 while providing more flexibility in management and security. The experimental version we showed has already been used for work in areas ranging from genomic annotation processing to solar wind physics. It supports access from many different types of client, portals in particular. We also provide a virtual machine image so that people can get started with Taverna Server 2 very rapidly and deploy it on Cloud resources as well as on the Grid. Key features planned for the future are seamless integration with the Taverna Workbench and the myExperiment portal, and full integration with the NGS so that workflows will be able to manage jobs running on the NGS.


