NGS Workload Management System and User Interface tutorials
Purpose of this tutorial
By the end of this tutorial you will have been taken through the stages of running a job using the NGS UI-WMS (Workload Management System).
This allows you to submit a job to the NGS as a whole and lets the WMS select the most appropriate resources for running your job.
In order to do this you must go through the following steps:
1. You must have a valid digital certificate recognized by the NGS.
2. You must create and upload a valid proxy certificate, which will allow the WMS to run jobs on your behalf.
3. You must describe to the WMS, using a JDL file, some of the parameters which you consider important in selecting the right resources to run your job.
4. You can then send your job to the WMS.
5. After the job has run you can retrieve your output.
All of these steps will be covered in the tutorial.
Notes and prerequisites
This tutorial assumes you already have an NGS certificate which you have exported from your browser. If not, please see the certificate section of the NGS web site (http://www.ngs.ac.uk/How-to-Join). The tutorial also assumes you have an NGS account (see the same section).
How to use the WMS
Creating and uploading a proxy
NOTE - Only create a normal proxy in myproxy.
The UI will create VOMS credentials for you correctly after you choose from a list of supported VOs (see screen grab from a UI login session below).
Login to the UI
To use the UI/WMS resource broker, users should log in to the UI machine at
ngsui03.ngs.ac.uk
You can use a normal SSH client (PuTTY, the ssh command line etc), the VOMS enabled GSI-SSHTerm Java terminal, or any other SSH client you are familiar with.
Details are given below.
SSH Login (preferred method)
You can log in to the UI using any SSH client.
NOTE - Use port 2223 (rather than the normal default of 22).
The username and password to use are those of a proxy you have uploaded to the myproxy server (at myproxy.ngs.ac.uk), e.g.
ssh -p 2223 <myproxy-name>@ngsui03.ngs.ac.uk
GSI-SSHTerm (Alternative method)
The NGS Portal tutorial includes an online tutorial on using the Java VOMS enabled GSI-SSHTerm. You should start this tool from the GSI-SSHTerm page by clicking the orange "Launch" button.
However, if this is the first time you have run the VOMS enabled version of the tool, you should first run the Myproxy Uploader tool at least once to set up the NGS VOMS information required for GSI-SSHTerm.
On first logging in
* You need a valid 'VOMS proxy' to use the WMS to submit jobs.
* Your proxy currently has no VOMS AC component or it has expired.
*
* Which VO, that you are a member of, can you run your job as ?
dteam
gin.ggf.org
mott2.org
nanocmos.ac.uk
ngs.ac.uk
ops
training.ngs.ac.uk
none
: [ ngs.ac.uk ]
Hit <CR> to accept the default of ngs.ac.uk and a normal shell command prompt will be returned.
Checking Credentials
[ngs0055@ngsui03 ~]$ voms-proxy-info -all
subject   : /C=UK/O=eScience/OU=CLRC/L=RAL/CN=jonathan churchill/CN=proxy/CN=proxy/CN=proxy
issuer    : /C=UK/O=eScience/OU=CLRC/L=RAL/CN=jonathan churchill/CN=proxy/CN=proxy
identity  : /C=UK/O=eScience/OU=CLRC/L=RAL/CN=jonathan churchill/CN=proxy/CN=proxy
type      : proxy
strength  : 512 bits
path      : /tmp/x509up_p9398.fileGD96zy.1
timeleft  : 11:59:40
=== VO ngs.ac.uk extension information ===
VO        : ngs.ac.uk
subject   : /C=UK/O=eScience/OU=CLRC/L=RAL/CN=jonathan churchill
issuer    : /C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/Email=support@grid-support.ac.uk
attribute : /ngs.ac.uk/Role=NULL/Capability=NULL
timeleft  : 11:59:56
uri       : voms.ngs.ac.uk:15010
This shows that your grid credentials are valid for 12 hours (timeleft 11:59:40) and that the VOMS part will last the same time (11:59:56). If your proxy expires while your job is running, the job will fail (Note - this may differ from the behaviour of other NGS components).
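Because an expired proxy causes the job to fail, it can be worth checking the remaining lifetime before you submit. The following is a minimal sketch rather than part of the NGS tooling: the function name proxy_seconds_left is hypothetical, and it assumes the "timeleft" lines look like those in the output shown above.

```shell
#!/bin/bash
# Read `voms-proxy-info -all` output on stdin and print the smallest
# remaining lifetime in seconds (proxy or VOMS extension, whichever is lower).
# Assumption: each lifetime appears on a line whose first word is "timeleft"
# and whose last word has the form HH:MM:SS.
proxy_seconds_left() {
    awk '$1 == "timeleft" {
             split($NF, t, ":")
             secs = t[1] * 3600 + t[2] * 60 + t[3]
             if (min == "" || secs < min) min = secs
         }
         END { if (min != "") print min }'
}

# Demonstration using the two timeleft lines from this tutorial.
proxy_seconds_left <<'EOF'
timeleft  : 11:59:40
=== VO ngs.ac.uk extension information ===
timeleft  : 11:59:56
EOF
```

On the UI machine the same function could be fed real data with "voms-proxy-info -all | proxy_seconds_left", and the result compared with the time you expect your job to need.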
Describe your job’s requirements: Create a JDL
Create a file called hostname.jdl with the following contents:
Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hostname.out";
StdError = "hostname.err";
OutputSandbox = {"hostname.err","hostname.out"};
Arguments = "-f";
RetryCount = 3;
ShallowRetryCount = -1;
Requirements = RegExp("ngs",other.GlueCEUniqueID);
Checking resources: Job List Match
glite-wms-job-list-match -a hostname.jdl
Connecting to the service https://ngswms01.ngs.ac.uk:7443/glite_wms_wmproxy_server
==========================================================================
COMPUTING ELEMENT IDs LIST
The following CE(s) matching your job requirements have been found:
*CEId*
- ce2.ppgrid1.rhul.ac.uk:2119/jobmanager-pbs-ngs
- cerb-condor.bris.ac.uk:2119/jobmanager-condor-ngs.ac.uk
- grid.ecdf.ed.ac.uk:2119/jobmanager-sge-ngs
- hepgrid2.ph.liv.ac.uk:2119/jobmanager-lcgpbs-ngs
- ngs.oerc.ox.ac.uk:2119/jobmanager-pbs-workq
- ngs.rl.ac.uk:2119/jobmanager-lsf-ngs
- ngs.wmin.ac.uk:2119/jobmanager-pbs-default
- vidar.ngs.manchester.ac.uk:2119/jobmanager-pbs-workq
- condorngs.cf.ac.uk:2119/jobmanager-condor-INTEL_WINNT51
- ngs.leeds.ac.uk:2119/jobmanager-pbs-mpi
- troilus.wrg.york.ac.uk:2119/jobmanager-sge-ngs.ac.uk
==========================================================================
The details are not important at this point, beyond the fact that as long as you see one or more CE listed, then there is somewhere that your jobs should run successfully.
Note – this does not guarantee anything about the load on those nodes, just that the job would have the correct resources available to it on those nodes. Load information is one of the extra elements we could add to the JDL but it is not included yet.
If you create a second JDL file without the Requirements constraint given above, you will see a larger, complete set of all NGS resources.
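One way to produce that second file is to copy the first and drop the Requirements line. This is only a sketch: the file name hostname-all.jdl is a hypothetical choice for this example, and the JDL text is the one given earlier in this tutorial.

```shell
#!/bin/bash
# Write the JDL from this tutorial to hostname.jdl.
cat > hostname.jdl <<'EOF'
Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hostname.out";
StdError = "hostname.err";
OutputSandbox = {"hostname.err","hostname.out"};
Arguments = "-f";
RetryCount = 3;
ShallowRetryCount = -1;
Requirements = RegExp("ngs",other.GlueCEUniqueID);
EOF

# Copy every line except the Requirements constraint into a second file.
# "hostname-all.jdl" is a hypothetical name used only for this example.
grep -v '^Requirements' hostname.jdl > hostname-all.jdl

# The two match lists can then be compared on the UI machine
# (both commands need a valid proxy):
#   glite-wms-job-list-match -a hostname.jdl
#   glite-wms-job-list-match -a hostname-all.jdl
```

Keeping both files side by side makes it easy to see which resources the Requirements expression is excluding.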
Checking NGS Node Status
lcg-infosites --vo ngs.ac.uk ce
You can use the output of this command to help you build up the requirements in your JDL. For example, if you wish to have a minimum number of CPUs to work with, you can check how many of the NGS nodes meet this requirement before writing it into your JDL. The output will also give you some indication of how busy the various sites were when the command was run.
Note, the load information is by its nature always historical and should be used as a guide. The status of sites may change between the time of running lcg-infosites and submitting your job.
Note – this command starts with lcg- because it is a legacy command that predates gLite (before the WMS was developed, all of the job submission and monitoring commands began with lcg-).
This gives an output such as:
lcg-infosites --vo ngs.ac.uk ce
#CPU    Free    Total Jobs    Running    Waiting    ComputingElement
----------------------------------------------------------
 249       1           707        222        460    ngs.rl.ac.uk:2119/jobmanager-lsf-ngs
 252       0           259         65        194    ngs.leeds.ac.uk:2119/jobmanager-pbs-mpi
1000     110           792        651        101    scarf.rl.ac.uk:2119/jobmanager-lsf-scarf
 252      91            11         11          0    ngs.oerc.ox.ac.uk:2119/jobmanager-pbs-workq
   5       5             0          0          0    grid.ecdf.ed.ac.uk:2119/jobmanager-sge-ngs
 492     370           102        102          0    grid2.lancs.ac.uk:2119/jobmanager-sge-serial
 376     329             0          0          0    ce2.ppgrid1.rhul.ac.uk:2119/jobmanager-pbs-ngs
 218     218             0          0          0    ngs.wmin.ac.uk:2119/jobmanager-pbs-default
   0       0             0          0          0    grid2.lancs.ac.uk:2119/jobmanager-sge-lancaster
   0       0             0          0          0    lancs1.nw-grid.ac.uk:2119/jobmanager-sge-serial
3360      51             1          0          1    lcgce02.gridpp.rl.ac.uk:2119/jobmanager-lcgpbs-gridS
   0       0             0          0          0    troilus.wrg.york.ac.uk:2119/jobmanager-sge-ngs.ac.uk
 144      92            49         49          0    cluster1.epsam.keele.ac.uk:2119/jobmanager-sge-all.q
....
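To see quickly which sites currently report enough free CPUs, the listing can be filtered. The sketch below is an illustration only: the function name ces_with_free_cpus and the threshold of 50 are hypothetical, and it assumes the column layout shown in the sample output above.

```shell
#!/bin/bash
# Print the ComputingElement of every row reporting at least $1 free CPUs.
# Reads lcg-infosites output on stdin. Assumption: data rows have exactly
# six whitespace-separated fields (#CPU, Free, Total Jobs, Running, Waiting,
# ComputingElement); the header, ruler and any other lines are skipped.
ces_with_free_cpus() {
    awk -v min="$1" 'NF == 6 && $1 ~ /^[0-9]+$/ && $2 + 0 >= min + 0 { print $6 }'
}

# Demonstration on a few rows taken from the sample output in this tutorial.
ces_with_free_cpus 50 <<'EOF'
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
249 1 707 222 460 ngs.rl.ac.uk:2119/jobmanager-lsf-ngs
252 91 11 11 0 ngs.oerc.ox.ac.uk:2119/jobmanager-pbs-workq
5 5 0 0 0 grid.ecdf.ed.ac.uk:2119/jobmanager-sge-ngs
EOF
```

On the UI machine the live listing could be piped in with "lcg-infosites --vo ngs.ac.uk ce | ces_with_free_cpus 50". As noted above, the figures are historical and only a guide.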
Running a job: Job Submission
A job can be submitted with the command
glite-wms-job-submit -a -o MyID hostname.jdl
The option "-o" allows you to specify a file (in this case "MyID") to store the unique identifier of your job.
Note - This is useful as these are long URIs which have to be re-typed to get the job status when you are monitoring your job.
You can store the UIDs for multiple jobs in a single file. The option "-a" generates a name for the certificate proxy that is associated with this job - we'll discuss this more later. You should get output similar to the following:
Connecting to the service https://ngswms01.ngs.ac.uk:7443/glite_wms_wmproxy_server
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://ngswms01.ngs.ac.uk:9000/VW8-SUBr6tNo_HU1JTNEXg
The job identifier has been saved in the following file:
/home/ngs0055/MyID
==========================================================================
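If you keep several identifiers in one file, it can be handy to list them one per line. The following is a sketch under a stated assumption: the exact layout of the identifier file is not described in this tutorial, so the hypothetical helper job_ids simply treats every line starting with "https://" as a job identifier and ignores anything else.

```shell
#!/bin/bash
# Print every job identifier stored in the file given as $1.
# Assumption: identifiers are the lines that start with "https://";
# any other lines (for example a header comment) are skipped.
job_ids() {
    grep -E '^https://' "$1"
}

# Demonstration on a throwaway file that mimics an identifier file.
sample=$(mktemp)
printf '%s\n' '# example header line' \
    'https://ngswms01.ngs.ac.uk:9000/VW8-SUBr6tNo_HU1JTNEXg' > "$sample"
job_ids "$sample"
rm -f "$sample"
```

On the UI machine, "job_ids MyID" would list the stored identifiers, and any one of them can be given to the monitoring commands in place of "-i MyID".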
Checking on your job’s progress: Job Status
To check on the current status of your job use the command
glite-wms-job-status -i MyID
If you wish to see more information, you can use the additional options "-v 2" and "-v 3" to increase the amount of information displayed. MyID is the file you created in the previous step to hold your job identifier. When your job is complete you should get a status message similar to the following:
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://ngswms01.ngs.ac.uk:9000/VW8-SUBr6tNo_HU1JTNEXg
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: ngs.oerc.ox.ac.uk:2119/jobmanager-pbs-workq
Submitted: Fri Oct 2 17:50:45 2009 BST
*************************************************************
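Rather than re-running the status command by hand, you could poll it until the job reaches a final state. This is a sketch, not an NGS-provided tool: the function names are hypothetical, the list of final states (Done, Aborted, Cancelled, Cleared) is an assumption based on the usual gLite job states, and wait_for_job assumes the identifier file holds a single job.

```shell
#!/bin/bash
# Extract the text after "Current Status:" from status output on stdin.
current_status() {
    sed -n '/^Current Status:/{s/^Current Status:[[:space:]]*//;s/[[:space:]]*$//;p;q;}'
}

# Succeed if the given status is one the job will not leave again.
# Assumption: these four names cover the final states.
is_finished() {
    case "$1" in
        Done*|Aborted*|Cancelled*|Cleared*) return 0 ;;
        *) return 1 ;;
    esac
}

# Poll the job in identifier file $1 every $2 seconds (default 60).
# Defined here but not called, since it needs the UI machine and a proxy.
wait_for_job() {
    local id_file="$1" interval="${2:-60}" status
    while true; do
        status=$(glite-wms-job-status -i "$id_file" | current_status)
        echo "Status: $status"
        if is_finished "$status"; then
            break
        fi
        sleep "$interval"
    done
}

# Demonstration of the parsing step on the sample output from this tutorial.
printf 'Current Status: Done (Success)\nExit code: 0\n' | current_status
```

On the UI machine, "wait_for_job MyID 120" would check every two minutes. Remember that the proxy must stay valid for the whole time the job is running.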
Getting your results: Job Output
Note - In this tutorial we demonstrate the method for manually retrieving output. Further tutorials will demonstrate the methods for automatically retrieving output to the UI machine.

To manually retrieve the output:
glite-wms-job-output --dir . -i MyID
The output files will be saved in the current directory.
Use "--dir ./<directoryName>" to put the files in a subdirectory.
Note - If you do not use the "--dir <directoryName>" option then the files will be saved in a directory created under /tmp with a name based on the ID of the job.
If successful you should get output similar to:
Connecting to the service https://ngswms01.ngs.ac.uk:7443/glite_wms_wmproxy_server
================================================================================
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
https://ngswms01.ngs.ac.uk:9000/VW8-SUBr6tNo_HU1JTNEXg
have been successfully retrieved and stored in the directory:
/home/ngs0055
================================================================================
Stopping your job: Job Cancel
If anything goes wrong, a job can be canceled with the command:
glite-wms-job-cancel -i MyID
The output in this case is similar to:
Are you sure you want to remove specified job(s) [y/n]y : y
Connecting to the service https://ngswms01.ngs.ac.uk:7443/glite_wms_wmproxy_server
============================= glite-wms-job-cancel Success =============================
The cancellation request has been successfully submitted for the following job(s):
- https://ngswms01.ngs.ac.uk:9000/RWqWoNgRQswP91FJq2wK7w
========================================================================================
This is the end of this tutorial. You should now have successfully:
1. Created and uploaded a valid proxy certificate.
2. Created a simple JDL file to provide the WMS with some information about your job.
3. Identified resources.
4. Run a job using the WMS.
5. Retrieved your job's output.
You should now be ready to go on to examine in more detail how to customise these steps.

