Drop and Compute
NGS Drop and Compute is based on the Drop and Compute idea by Ian Cottam at the University of Manchester. Drop and compute allows users to run and monitor jobs via the NGS WMS by manipulating files rather than using the gLite UI commands.
The system is based on the following components:
- A gLite UI node
- MEG (MyProxy Enabled Gsissh)
To make use of this software you'll need these two components installed and configured - you could take NGS R&D's Prepackaged UI as a starting point.
You will also need to install mailx and sendmail if you wish to allow email notification.
Licence: BSD Licence
Download: drop-and-compute-v2_4.zip
Features
The following features have been added to the basic system from Ian:
- Support for .zip, .tar, .tgz, .tar.gz formats
- Multiple job submissions in a single archive
- Cancel/status on sub-job or whole job basis
- Email notification of completion
- Supports multiple users and pool accounts
- User's jobs are fully authenticated as the submitted user
How to use
- The user uses the NGS Certificate Wizard to store a VOMS proxy (ngs.ac.uk VO) in the NGS MyProxy server, with their choice of username and password.
- The user makes an SFTP connection (e.g. with a graphical client such as WinSCP) to port 2223 of the node running "drop and compute", using the above username and password. The MEG software tries the username and password against the MyProxy server and, if successful, downloads the stored credential. The credential is then checked using a standard gsissh server on the node to see if the user is allowed access. If so, it maps the user and stores a copy of the credential in the standard location.
- The user can then copy a tar/tgz/zip file containing their input files and JDL files to the "drop" directory on the "drop and compute" node. The files are expanded into a unique directory and any .jdl files within the archive are submitted to the WMS.
- The system will regularly check on the jobs and when complete will retrieve the output files and move them to a sub-directory of the submit directory. There will also be a file EXIT_INFO_X.txt with a detailed log of the job.
- When all the sub-jobs from a single archive are completed an "ALL_JOBS_DONE.txt" file is produced to inform the user that the jobs are complete. If the user sent a file called "notifyme" or "notifyme.txt" within the main directory of the archive and this file contains a valid email address they are also notified by email.
- When jobs are started (step 3) other files are produced within the expanded submit directory. These have the form X.kill and X.status and, where there is more than one job in an archive, also X.killall and X.statusall. When copied to the user's "drop" directory, these cause the system to cancel, or return status information on, the corresponding sub-job or all jobs. For the status commands, the results are written to files called QUEUEINFO-X.txt in the "drop" directory.
There is also a .cleanup file - when copied to the "drop" directory this causes all sub-jobs in a job to be killed and all the files/directories which contain information about the job to be deleted.
Please note the system is tolerant of the case-insensitivity of Windows/Mac systems, but if the case of the .cleanup/.kill/.status/.killall/.statusall file names is changed then the files will not be recognised correctly.
- Notification of errors also occurs through such text files.
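The client side of the steps above can be sketched in a few lines of Python. This is only an illustration, not part of the distribution: the helper names build_archive and control_action, the example job and the file names are hypothetical. The JDL attributes used (Executable, StdOutput, StdError, OutputSandbox) are standard gLite JDL. The control-file check assumes that recognition depends on the exact lower-case suffix; how the server really matches the names is not described here, and the X part of the name should also be left unchanged.

```python
import io
import os
import tarfile
import tempfile

# Hypothetical minimal job description; only the attribute names are
# standard gLite JDL, the job itself is just an illustration.
HELLO_JDL = """\
Executable = "/bin/hostname";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out", "std.err"};
"""

# Control-file suffixes described above and what copying each one into
# the "drop" directory requests.
CONTROL_ACTIONS = {
    ".kill": "cancel one sub-job",
    ".status": "report status of one sub-job",
    ".killall": "cancel all sub-jobs",
    ".statusall": "report status of all sub-jobs",
    ".cleanup": "cancel all sub-jobs and delete the job's files",
}


def build_archive(path, jdl_files, input_files=None, notify_email=None):
    """Write a .tgz with JDL files, input files and an optional notifyme file.

    jdl_files and input_files map member names to text content.
    Returns the sorted list of member names written.
    """
    members = dict(input_files or {})
    members.update(jdl_files)
    if notify_email:
        # The notification file has to sit in the main directory of the archive.
        members["notifyme.txt"] = notify_email + "\n"
    with tarfile.open(path, "w:gz") as tar:
        for name, text in sorted(members.items()):
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return sorted(members)


def control_action(filename):
    """Return the requested action, or None if the suffix is not recognised.

    The comparison is case-sensitive on purpose (an assumption of this
    sketch): "1.KILL" is treated as unrecognised.
    """
    return CONTROL_ACTIONS.get(os.path.splitext(filename)[1])


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "myjob.tgz")
    print(build_archive(target, {"hello.jdl": HELLO_JDL},
                        notify_email="user@example.org"))
    print(control_action("1.status"))
```

The resulting archive would then be copied over SFTP (port 2223) into the "drop" directory, as in step 3.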
Installation
Once the pre-requisites described in the introduction are installed then simply extract the "drop and compute" distribution to a suitable directory, for example /opt/drop. Then edit the settings at the top of etc/drop. The options are as follows:
- ROOT_DIR : default /opt/drop
The place where the "drop and compute" system was extracted to.
- GLITE_LOCATION : default /opt/glite/etc/profile.d/grid-env.sh
This is the location of the grid-env.sh for your gLite UI install:
  - for RPM installs: /opt/glite/etc/profile.d/grid-env.sh
  - for tar-based UI, installed in /opt/glite: /opt/glite/external/etc/profile.d/grid-env.sh
- USER_REGEX : default "ngs[0-9]+"
This is a space-separated list of regular expressions matching the usernames the script should process. Note that these users also need a "drop" directory; it would be worth adding one to /etc/skel.
- OPTIMIZE_SEARCH : default yes
This toggles faster searching. Setting it to no acts as a kind of debug mode, since what the script is doing is much more obvious, but it is much slower.
- ADMIN_EMAIL : default ""
Setting this enables notification emails. The address is included in emails to users as a contact in case an email reaches the wrong person.
- DEBUG : default ""
This enables more detailed logging (command output of all commands) when set to "yes".
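To check a USER_REGEX value before deploying it, the matching can be approximated in Python. This is a rough sketch, not the script's own logic: the function name matching_users is hypothetical, the patterns are interpreted with Python's re syntax, and each pattern is assumed to have to match the whole username. Whether the real script anchors its patterns in this way is not stated here.

```python
import re


def matching_users(user_regex, usernames):
    """Return the usernames matched by any pattern in a space-separated list.

    Assumption of this sketch: a pattern must match the entire username.
    """
    patterns = [re.compile(p) for p in user_regex.split()]
    return [u for u in usernames if any(p.fullmatch(u) for p in patterns)]


if __name__ == "__main__":
    # With the default setting, pool accounts such as ngs0001 are selected.
    print(matching_users("ngs[0-9]+", ["ngs0001", "root", "ngs", "ngs12x"]))
```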
Finally copy the etc/drop script to /etc/init.d and use commands similar to the following (for RedHat/SL/CentOS) to start the service:
service drop start
chkconfig drop on
The log files are in /opt/drop/log/drop.log and are verbose.

