Connecting Infrastructure, Connecting Research

A flurry of activity

5 hours 31 min ago
A batch of new speakers has recently been announced for the NGS Innovation Forum '10 and further information about their presentations is now on the website. The latest presentations are NGS tool demos, which will consist of walk-throughs of NGS tools using real research examples so that delegates can leave the event knowing about new tools to use in their research.

NGS tool demos
1. Transcriptome Analysis using the NGS User Interface/Workload Management System (UI/WMS) – Jonathan Churchill, NGS, STFC RAL
The UI/WMS is a tool which allows users to easily submit jobs to the whole of the NGS, relying on the WMS to choose which NGS resources to use for their jobs. Use of the UI/WMS will be demonstrated with a user case study in which analysis time of mRNA was decreased from a month to less than 12 hours.

2. Accessing the NGS using the Application Hosting Environment (AHE) – Stefan Zasada, UCL
An overview of how access to the NGS can be simplified using the Application Hosting Environment, a lightweight application portal system.

3. Using the HERMES data management tool – David Wallom, NGS, University of Oxford
Here we will show how easy it is to install and connect into various NGS resources to move data between them, your home institution and your desktop.

4. The NGS from the CCP4 desktop – Matteo Turilli, NGS, University of Oxford
The NGS R&D theme have been working to build access to the NGS into the desktop tools that researchers use on a day-to-day basis. In this presentation we look at the example of CCP4: Software for Macromolecular X-Ray Crystallography.

We also have a presentation from the Director of the NGS -
The future of the NGS – Neil Geddes, NGS Director, STFC RAL
This presentation will look at the focus of activities for the NGS for the coming 2-3 years and possible longer term opportunities.

Remember that registration for the event is now open and that the call for poster abstracts closes on the 10th of September!

Registration for NGS Innovation Forum open now!

31 August, 2010 - 12:23
The registration for the NGS Innovation Forum is now open - details of how to register can be found on the event page on the NGS website.

We are pleased to announce another speaker for the event. Neil Geddes, director of the NGS, will be speaking about the future of the NGS so if you are a long term NGS user who wants to know the future direction of the NGS or a new user who is planning to use the NGS for the long term, then make sure you attend!

A reminder that we are still looking for NGS users to submit poster abstracts demonstrating how they use our resources in their research. The deadline for abstracts is the 10th of September so it's approaching fast! There are many benefits to submitting an abstract and attending the event -

  • Walk through demos of new NGS tools
  • NGS staff on-hand to answer your questions
  • The opportunity to contribute to, and give feedback on, the future of the NGS
  • The poster abstracts will be peer reviewed by the NGS IF'10 programme committee
  • Publicity for your research both at the event and through accepted posters being placed on the NGS website
  • The chance to win a prize as "best poster" as voted for by IF'10 delegates
All that is required is a short 200 word abstract! Of course you are more than welcome to attend the event without submitting an abstract and you can attend for one or both days. We hope you can come along!

Ravioli code

27 August, 2010 - 22:31
The ngs-vo-tool - a utility for organising your virtual organisations - was released to the NGS's area on NeSCForge earlier this week.

Previous postings have described the reasons why the tool was written and how we worked out what it needed to do. This one is about developing and testing the program code itself.

So if you thought the 'Rough guide to the User Account Service' was dull, look away now.

The ngs-vo-tool was meant to be an example of 'software as documentation'.

If you are running complex software, which needs configuration information to be scattered around multiple files in different formats, and want to describe how you set it up then you have two options...
  • Write a long and detailed description of every stage in the process and expect people to read and follow the documentation.
or
  • Write a program to do the dirty work and expect anyone who wants to know how it works to read the program.
We are doing the latter. This is why ngs-vo-tool is written in Python - a language designed to be readable by people who don't actually know the language. Take this example from the code:

class LcmapsMapper(BaseMapper):
    "LCMAPS gridmapfile and groupmapfile entries"

    ... snip ...

    def add_mapping(self, vo, acinfo):
        self.assert_mapping_args(vo,acinfo)

        gridmap_e,groupmap_e = mapfile_entries(vo,acinfo)

        if gridmap_e not in self.gridmap_entries:
            logger.debug("Adding <%s> to gridmapfile" % gridmap_e)
            self.gridmap_entries.append(gridmap_e)

        if groupmap_e not in self.groupmap_entries:
            logger.debug("Adding <%s> to groupmapfile" % groupmap_e)
            self.groupmap_entries.append(groupmap_e)

    ... snip ...


As long as you can wrap your brain around Python's way of using indentation to group related code together, that describes the process of adding VO information to the contents of an LCMAPS gridmapfile and groupmapfile.

The ngs-vo-tool was also an attempt to apply the kind of software engineering techniques that our colleagues at the Software Sustainability Institute want to encourage in academia. In particular, we tried to use something approximating to Test Driven Development (TDD) to keep the bugs under control.

In TDD, for each bit of program code, there is a corresponding bit of test code to put it through its paces. A test harness is used to run all the tests whenever the code is changed.

If you have the source code, run:

python setup.py test

to watch all 57 tests fly past.

In proper TDD, the test is written first and run before any attempt is made to write the thing being tested. I'm afraid that I was more pragmatic and wrote the tests and code at the same time.
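
To give a flavour of what such a test looks like, here is a minimal sketch using Python's standard unittest module. The GridmapEntries class is a hypothetical stand-in written for this post, not code from ngs-vo-tool, and the real tests may be organised quite differently.

```python
import unittest


class GridmapEntries:
    """Hypothetical stand-in for a mapper: keeps a list of unique entries."""

    def __init__(self):
        self.entries = []

    def add(self, entry):
        # Only append entries that have not been seen before.
        if entry not in self.entries:
            self.entries.append(entry)


class GridmapEntriesTest(unittest.TestCase):
    ENTRY = '"/training.ngs.ac.uk/*" .ngstrain'

    def test_new_entry_is_added(self):
        mapper = GridmapEntries()
        mapper.add(self.ENTRY)
        self.assertEqual(mapper.entries, [self.ENTRY])

    def test_duplicate_entry_is_ignored(self):
        mapper = GridmapEntries()
        mapper.add(self.ENTRY)
        mapper.add(self.ENTRY)
        self.assertEqual(len(mapper.entries), 1)


def run_tests():
    """Run the tests above and return True if they all pass."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(GridmapEntriesTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

In strict TDD the two test methods would be written, and seen to fail, before the add method existed.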

This approach encourages the developer to split the code into largely self-contained modules with well-defined interfaces, simply because these are much easier to write tests against. Some people refer to this as ravioli code.

I would be a fool to claim that the program is bug-free - although the testing did shake out bugs early in the development process and I do think that the code is cleaner and easier to read because of this approach.

Someone at the eScience centre at Southampton once came up with the phrase 'making the useful usable' as a way of describing their work. I hope that someone will find the ngs-vo-tool useful in making their services to the grid usable.

Application hosting environment and UI/WMS - preparations continue!

24 August, 2010 - 13:32
I've just added some details of more presentations at the forthcoming Innovation Forum to the website. We have two demos - one of the UI/WMS, which has proved to be a big hit with our users, and one of the Application Hosting Environment (AHE). Full details of the presentations are available on our event page on the website. Delegates will see walk-throughs of how to use these tools on the NGS and the benefits they can bring to their research.

We have also updated the UI/WMS tutorials if you want to give the UI/WMS a go beforehand!

User success stories at the NGS

20 August, 2010 - 11:12
The programme for the 3rd NGS Innovation Forum is really beginning to come together now. I've managed to secure presentations from 3 NGS users from completely different research areas to talk about how they have used the NGS in their work.

First up we have Zhongwei Guan from the University of Liverpool who leads a research group where many researchers use the NGS. They research the impact of explosions on aircraft fuselages as well as blasts and impacts on concrete, amongst many other things. I have seen Zhongwei speak at previous events and his presentations are colourful and interesting!

Next up in a complete change of direction, we have Luke Rendell from the Centre for Social Learning and Cognitive Evolution at the University of St Andrews. Luke came to the NGS to ask for resources to run an international computer tournament on the evolution of learning. The tournament and results were so productive that the resulting paper was published in Science and was featured in New Scientist.

Finally, and by no means least, we have Narcis Fernandes-Fuentes who has used the NGS for several years to discover novel therapeutic agents. Narcis has had his research turned into an NGS user case study as an example of the type of research that can be performed on the NGS.

The deadline for poster abstracts for the event is approaching (10th Sept) so if you would like to submit an abstract and be in with a chance of winning the best poster prize, please submit soon!

Source code archeology

18 August, 2010 - 23:38
I'm afraid this is going to be technical.

For the last month or so - in the gaps between holidays, meetings and dealing with a power glitch that has knocked out some rather important bits of ngs.leeds.ac.uk - I've been working on the ngs-vo-tool.

The ngs-vo-tool is a utility program that does the tweaking and fiddling needed when adding or removing support for all, or part, of a particular Virtual Organisation (VO).

One of the things the ngs-vo-tool needs to tweak and fiddle is the LCMAPS version of the gridmapfile. This controls which bits of which VO get assigned to which local accounts and consists of entries like...

"/training.ngs.ac.uk/*" .ngstrain
"/monitoring.ngs.ac.uk/lcas_lcmaps/*" .ngsmon

This particular example assigns anyone in the NGS's Training VO to an account in the 'ngstrain' pool but only members of the 'lcas_lcmaps' group within the NGS Monitoring VO to the ngsmon pool.
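
As a rough illustration, entries in this style can be pulled apart with a few lines of Python. This is a sketch only: the function names are made up for this post, and skipping lines that start with '#' as comments is an assumption rather than something taken from the LCMAPS documentation.

```python
import shlex


def parse_gridmap_line(line):
    """Split one entry into (pattern, account).

    Returns None for blank lines and for lines starting with '#'
    (treating '#' as a comment marker is an assumption).
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # shlex.split removes the double quotes around the pattern.
    parts = shlex.split(line)
    if len(parts) != 2:
        raise ValueError("expected a quoted pattern and an account: %r" % line)
    pattern, account = parts
    return pattern, account


def is_pool_account(account):
    """In the examples above, a leading '.' names a pool of accounts."""
    return account.startswith(".")
```

For example, parse_gridmap_line('"/training.ngs.ac.uk/*" .ngstrain') gives the pattern /training.ngs.ac.uk/* and the account .ngstrain.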

Groups can contain subgroups. You can also cherry-pick VO members with a particular role or a particular capability.

Among the last features due to be added to ngs-vo-tool is one to allow any combination of group/role/capability to be specified and have this turned into something fit for an LCMAPS gridmapfile.

As always, it is not as simple as it first appears.

The bit in quotes is a pattern that matches a Fully Qualified Attribute Name (FQAN).

The FQAN is a representation of VO, group, subgroup, role and capability defined in http://edg-wp2.web.cern.ch/edg-wp2/security/voms/edg-voms-credential.pdf as

/VO[/group[/subgroup(s)]][/Role=role][/Capability=cap]
with the additional complication that the FQAN for a VO member with no role can either omit the Role=role bit or explicitly include 'Role=NULL'.
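
To make the format concrete, here is a minimal sketch that splits an FQAN of the shape shown above into its parts. The parse_fqan function is hypothetical and is not part of ngs-vo-tool or LCMAPS. Treating 'Capability=NULL' like 'Role=NULL' is an assumption on my part.

```python
def parse_fqan(fqan):
    """Split an FQAN into VO, group path, role and capability."""
    if not fqan.startswith("/"):
        raise ValueError("an FQAN starts with '/': %r" % fqan)
    parts = fqan[1:].split("/")
    vo = parts[0]
    groups, role, capability = [], None, None
    for part in parts[1:]:
        if part.startswith("Role="):
            role = part[len("Role="):]
        elif part.startswith("Capability="):
            capability = part[len("Capability="):]
        else:
            # Anything else is a group or subgroup name.
            groups.append(part)
    # An explicit 'Role=NULL' means the same as no role at all.
    if role == "NULL":
        role = None
    # Assumption: 'Capability=NULL' follows the same convention.
    if capability == "NULL":
        capability = None
    return {"vo": vo, "groups": groups, "role": role, "capability": capability}
```

With this, '/training.ngs.ac.uk' and '/training.ngs.ac.uk/Role=NULL' come out as the same thing, which is the behaviour the gridmapfile patterns also need to reproduce.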

So in order for the code to do the right thing, I need to work out..
  • What LCMAPS uses to match a pattern to a string. In particular, is it fussy about where the '*' can be placed and how many '*'s can be used?
  • How the FQAN is constructed.
and do this for the slightly elderly version of LCMAPS that some NGS sites have deployed.

So a spot of source code archeology is required and luckily, I don't need to dig too deep as CERN kindly provide access to their source code repository on the web.

The LCMAPS code, and the rest of the gLite code, can be found at http://jra1mw.cvs.cern.ch. Released versions are even conveniently 'tagged' with the version number at that release - allowing the incurably geeky to jump directly to the relevant files.

This is work in progress. So far, I've worked out that the venerable Unix fnmatch function is used to match the pattern to the string and fnmatch allows '*'s to be used anywhere.

The exact details of FQAN construction are still buried somewhere but suitable fnmatch patterns should cope with the many variants of Role.
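
Python's standard fnmatch module gives a quick way to experiment with this kind of pattern. It is a sketch under assumptions: Python's fnmatch is a separate implementation from the C library function that LCMAPS calls, so it only approximates the real behaviour, and the FQANs used here are made-up examples.

```python
from fnmatch import fnmatchcase


def matches(pattern, fqan):
    """Case-sensitive shell-style match of an FQAN against a pattern."""
    return fnmatchcase(fqan, pattern)


def matching_accounts(entries, fqan):
    """Return the accounts of every (pattern, account) entry that matches.

    This says nothing about which match LCMAPS would actually pick.
    """
    return [account for pattern, account in entries if matches(pattern, fqan)]


ENTRIES = [
    ("/training.ngs.ac.uk/*", ".ngstrain"),
    ("/monitoring.ngs.ac.uk/lcas_lcmaps/*", ".ngsmon"),
]
```

In Python's version, '*' matches any run of characters, including '/', so '/training.ngs.ac.uk/*' matches both '/training.ngs.ac.uk/Role=NULL' and '/training.ngs.ac.uk/staff/Role=tutor'. It does not match the bare '/training.ngs.ac.uk', because the pattern requires the trailing '/'.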

The code is in our local version control system. It will be copied to the source code repository at NeSCForge as soon as the important bits of ngs.leeds.ac.uk are back in service.

Innovative NGS

12 August, 2010 - 14:10
The more observant amongst you may have noticed a recent addition to the top level tabs on the NGS website.

As well as making sure that the NGS runs all the services required by our users and sites on a day to day basis, we are also working on services for the future. To keep you up to date with our work "behind the scenes", we have added a new Innovation section to the website.

This section currently includes information on our prototype cloud service and user interfaces such as GSI-PuTTY and R through the NGS (Windows). Some of these services are looking for people to help develop the applications or to be the first users and to help develop them for the wider community.

We hope that you find this section interesting and useful!

Want to know how our databases work?

5 August, 2010 - 14:49
Then come along to the NGS Innovation Forum in November!

We have recently announced the details of our first few presentations with Simon Collins from the NGS team at the University of Manchester bravely giving two presentations.

His first presentation on day 1 will cover using databases on the NGS. It will give an overview of the databases supported on the NGS, what offerings and advantages these give to users and real-life case studies of how NGS users are taking advantage of this service.

Simon will also be giving a second presentation on day 2 aimed more at our member sites which will look at accounting services and solutions for NGS member sites. Current and potential sites will find out how accounting is supported on the NGS, the different choices available for sites to report accounting information and the information and tools that are available to sites that choose to do so.

Keep your eye on the NGS website over the next few weeks to see more details of presentations at the IF.

Splendid Isolation

2 August, 2010 - 10:45

I wandered lonely as a.... typical - I come back from a week's leave with a post on cloud computing to discover that every cloud related title has already been used.

The last few weeks have seen what can only be called a storm of cloud activity - the release of the OpenStack cloud software; Amazon adding a cloud tuned for High Performance Computing applications to their collection; the publishing of two reports on academic use of clouds and, of course, the public launch of the NGS's own prototype cloud service.

Some commentators have claimed the launch of the Amazon service means that The Grid is Dead. Obviously, those of us in the grid world disagree - as Craig Lee of the Open Science Grid explains.

Alternatively, we are now all officially zombies - mmm.. brains.

People who see the cloud as a computational grid with all the nasty authentication stuff ripped out are misunderstanding the point of nasty authentication stuff in a world where not all data is open and access control and audit trails are vital.

The people who say that the cloud is the greatest thing since sliced bread are also missing the point. Most clouds are quick-and-easy ways to get your hands on a virtual machine. It is virtual machines that are the greatest thing since sliced bread.

Why? It is all about keeping things separate.

Many modern applications are bad neighbours. They do not sit happily on a server but need plumbing into web service containers such as Tomcat or Glassfish and databases such as MySQL or PostgreSQL. It is not uncommon for the applications to require specific versions of their supporting software.

Trying to persuade two such applications to co-exist on the same server is the stuff of system administrators' nightmares - sitting on the sysadmin scale of horror somewhere near an operating system upgrade or running out of coffee.

Virtual Machines let you dump these applications and their hangers on in an isolated environment comparatively cheaply and easily. Isolating applications isn't a new idea - it was implemented in the mainframe systems of the 1960s - and system admins have been using techniques such as chroot jails for decades to provide some level of separation.

With apologies for borrowing great poetry to make a cheap point: Wordsworth's poem Daffodils - the one that starts 'I wandered lonely as a cloud' - mentions 'the bliss of solitude'. It is the solitude of computing clouds that makes them useful.



And relax

29 July, 2010 - 15:27
This time of year is always a bit quiet and it's an ideal opportunity to catch up on things and to get organised for the forthcoming conference season.

At the moment I've mainly been focusing on the organisation of the NGS Innovation Forum which will be held in November at STFC RAL.

One of the main new features of the IF will be a poster session on the Tuesday night which will enable NGS users to demonstrate how they have used NGS resources in their work. We're currently calling for abstracts and we'd like to strongly encourage all NGS users to submit a 200 word abstract for the event. There will also be a prize for the best poster as voted for by the delegates on the day.

Deadline for abstracts is the 10th of September!

I have also been busy confirming speakers, so announcements will start appearing in the fortnightly email bulletin and on the website - keep an eye out for those! First announcement tomorrow (Friday) just to keep you all in suspense...

NGS clouds take off!

22 July, 2010 - 12:23
On Tuesday we announced that the NGS cloud prototypes were ready and that we were seeking users. Well I'm glad to say that we have had an excellent response and we have identified several use cases to kick off with. NGS staff are now in discussions with users to take these forward and get them up and running ASAP.

If you would like to be amongst the first to use the new NGS cloud prototypes and to receive assistance in getting started then let us know by contacting the helpdesk with details of your research. Contact details are available on the NGS website.

Cloudy skies

20 July, 2010 - 14:15
There has been a lot of news about clouds recently, and looking out of my office window, I can certainly see plenty! I thought this was supposed to be summer?

Here at the NGS we are talking about clouds in a good way with the announcement that our clouds prototype is now live and we are looking for NGS users who would be interested in using the resources.

We have two cloud infrastructure prototypes available - one based in Edinburgh and one based in Oxford. Both sites have staff on hand to help you get started using the resources. If you are interested in finding out more please see the announcement on the NGS website.

Delivering data

16 July, 2010 - 14:21
There now follows a Public Service announcement from The National Grid Service Department of stating the bleeding obvious.

There is very little point in using Grid software on a machine in Daresbury to run an application on a computer near Didcot if the data you need is stuck on a server in Darwin.

That statement is not going to be a surprise to anyone. After all, the Worldwide LHC Computing Grid was built to ship the flood of data from CERN to somewhere where it could be stored and then on to somewhere where it can be analysed.

When delivering data, there is definitely more than one way to do it: you could use GridFTP or SRB or iRODS or SRM, or SFTP or FTP or WEBDAV or HTTP or even, if you are feeling old fashioned, read and write to files on a local disk.

Things get more complicated when you need to collect data through one mechanism and deliver it through another. In practice, this almost inevitably means that the data is copied onto local storage before being sent to its final destination.

This is not practical if there is a lot of data and you are on a comparatively slow network connection.

This is one of the problems that the DataMINX Data Transfer Service (DTS) aims to solve.

The DTS is an international collaboration jointly funded by the Australian Research Collaboration Service and OMII-UK. It isn't really NGS R+D but it is built on earlier work from the NGS and staff from the NGS have provided much of the development effort.

The idea behind DTS is that you give the job of delivering your data to the DTS in very much the same way as you would give the job of delivering a favourite Aunt's birthday present to a parcel courier service.

A courier will have a network of planes, trains, vans and delivery drivers to collect the parcel and carry it to its destination. You just have to book your collection. Auntie just needs to sign for the parcel.

Delivery in the DTS is done by pools of worker nodes with fast network connections and the wherewithal to send and receive data using the many network protocols. An internal messaging system allows requests for data transfers to be made and the status of the transfers to be reported.

In software terms, the developers of DTS have deliberately avoided reinventing the wheel - something for which the Grid has a not-entirely-undeserved reputation. Where possible, they have adopted and adapted existing widely-used libraries. For example:

There is much more to DTS than can be covered in a blog post. If you want to know more: a PowerPoint presentation describing how DTS works can be found, with the source code, on the project's web site (http://dtsproject.googlecode.com) and a formal paper describing the work is due to be published in Philosophical Transactions of the Royal Society A in late July or early August.

[With thanks to David Meredith of the DTS project.]

Head in the clouds

14 July, 2010 - 15:43
That's certainly where my head has been for the last 2 weeks as I've been on holiday! A big thank you to Jason and Jens for keeping the blog ticking over whilst I've been away.

However the title doesn't just refer to holidays (even though that's what most people are thinking of at this time of year) but also cloud computing. When I was away JISC released 2 reports on Cloud for Research.

The first report is entitled "Using cloud computing for research" and aimed to
  • document use cases for cloud computing in research for data storage and computing;
  • develop guidance on the governance, legal and economic issues around using cloud services for storage and computing in academic research;
  • make recommendations to JISC on possible further work in the area for data storage and computing.
The second report is entitled "Technical review of cloud computing for research" and focused on the following areas -
  • the current status of cloud computing in research communities;
  • state-of-the-art cloud technologies in academic, commercial, and industrial domains;
  • technical guidance on the use, adoption and migration to cloud computing for research;
  • recommendations for future technical and standardisation work to JISC.
The reports and the resulting recommendations make for some interesting reading.

Organising Virtual Organisations

11 July, 2010 - 01:33
In one version of the future of the grid, we will be awash with Virtual Organisations (VOs).

There will be VOs representing everything from whole institutions and research areas, through regional grids, right down to individual research groups.

Grid service providers will pick and choose the VOs that they are willing, able - or even paid - to support and each and every supported VO will have to be added to the system.

Adding support for a VO is not entirely simple...

Each Virtual Organisation needs a Virtual Organisation Membership Service (VOMS). Unsurprisingly, the VOMS maintains the list of who is in the VO and, through the magic of digital certificates, can act as the definitive source for this information.

The Grid service must 'know' about the VOMS server before it can support the VO. It must also be able to associate the VO with local usernames and groups. So for each VO, you must
  • Add the contact details for the VOMS server to the directory /etc/grid-security/vomses
  • Add the public key for the VOMS server to the directory /etc/grid-security/vomsdir.
  • Create accounts and groups to be associated with the VO.
    This can be complicated where the grid site is part of a network where usernames and groups are managed centrally.
  • Add entries to the LCMAPS gridmapfile and groupmapfile mapping VO membership to local usernames and groups.
    The exact location of LCMAPS configuration files depends on your local configuration - they could be within the $GLITE_LOCATION directory or within /etc/grid-security.
  • If you are providing a 'pool' - a set of accounts set aside for a particular VO - add each account in the pool to the gridmapdir.
  • Apply local tweaks - such as modifying the monitoring/osg-user-vo-map.txt used for configuration of some versions of the Virtual Data Toolkit - to reflect your local VO to account mapping.
This is the kind of task that really needs to be automated.

If you use the YAIM tools to manage your site, you can add the VO details to the vo.d directory and the user accounts to user configuration. Our colleagues at Glasgow Scotgrid use the widely used CFENGINE automation tool to prepare, configure and run YAIM when creating VOs.

The NGS provide a script called ngs-voms-configure with the VDT installer from the NGS area on NeSCForge.

Ngs-voms-configure was written at a time when NGS partner sites were expected to support a common set of recognised VOs. It collects lists of VOs from a central service, locates and downloads their certificates and (optionally) creates accounts and updates any files that need updating.

The ngs-voms-configure script needs to be modified if - for example - your site uses more sophisticated methods for creating accounts. It also has problems collecting certificates when there are strict outgoing firewall rules in place.

One of the current R+D projects is the ngs-vo-tool - which extends the automation provided by ngs-voms-configure for the brave new world of VOs everywhere.

We are planning to use 'VO Cards' - downloadable blobs of XML containing almost everything you need to know about a VO - and to allow the mappings between VO and local pool accounts to be defined in a configuration file that contains sections like...

[ngs.ac.uk]

vocard = http://cic.gridops.org/downloadRP.php?section=lavoisier&rpname=vocardPublic&vo=ngs.ac.uk

local_user = ngs0001-1000
local_group = ngspool

The script is being written in Python, developed at Leeds and - as of a few minutes ago - the local repository is being mirrored to the NGS code repository at NeSCForge in a module called ngs-vo-tool.
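
A configuration section in that style can be read with Python's standard configparser module. The sketch below is an illustration only: load_vo_mappings is a name invented for this post, and I am not claiming that this is how ngs-vo-tool itself reads its configuration.

```python
import configparser

# The example section from above, with the URL written on one line.
EXAMPLE = """
[ngs.ac.uk]
vocard = http://cic.gridops.org/downloadRP.php?section=lavoisier&rpname=vocardPublic&vo=ngs.ac.uk
local_user = ngs0001-1000
local_group = ngspool
"""


def load_vo_mappings(text):
    """Return {vo_name: {option: value}} for every section in the text."""
    # Interpolation is switched off so that a '%' in a URL is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {vo: dict(parser[vo]) for vo in parser.sections()}


if __name__ == "__main__":
    for vo, options in load_vo_mappings(EXAMPLE).items():
        print(vo, options["local_user"], options["local_group"])
```

Each section name is a VO and each option within it describes how that VO maps onto local accounts. How a range such as 'ngs0001-1000' gets expanded into individual pool accounts is left to the tool.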

It is not yet complete. When it is, we hope it will be of use in really organising virtual organisations.