CoG Java API for Globus

Java API

This tutorial focuses on the Java API for Globus, known as the CoG (Commodity Grid) Kit. It uses version 1.2 of the CoG Kit; the next release, version 4, has only recently been released.

To complete this tutorial you will need access to a Linux machine with Globus installed, and you will need to install the Java CoG Kit. You will also need to download the tar file and unpack it in your home directory:

    tar -xvzf gram_srb_practical_files.tgz

All the files you need for this tutorial are now ready.

Note for anyone unfamiliar with Java:
This section does not require a knowledge of Java. The focus is instead on how to use an API to interact with the underlying Globus architecture. A brief introduction to Java can be found at http://forge.nesc.ac.uk/docman/view.php/44/116/leena2.doc.

Please also note: the phrase "after the line(s)" means after the line described and before the next (non-comment) line.

  1. Change directory to access the files you need:
    cd ~/gram2/java_api 

    This folder contains a folder called "cogExample", which holds all the Java files needed for this tutorial. Among them is "Utility.java" (the source for the Java class "cogExample.Utility"), which provides a series of helper functions that are unrelated to the CoG Kit and so need not be examined in this tutorial. Alongside "cogExample" is "build.xml", which ant (a commonly used Java build tool) uses to automate the building of each stage of the tutorial.
     

  2. If you have not done so already, please install the Java CoG Kit on your machine.
  3. The starting point for this section of the tutorial is to repeat in Java the same pinging of a Globus queue that was demonstrated in the previous section. This starting code has already been written for you in the file "cogExample/GatekeeperPing.java". This code (and all subsequent Java code) takes the queue to be used as a command line parameter. Compile stage1 with the command:
    ant stage1 

    If all goes well you should get the output below:

    [user00@pub-234 java_api]$ ant stage1 
        Buildfile: build.xml
        stage1:
        [javac] Compiling 2 source files
        BUILD SUCCESSFUL
        Total time: 4 seconds 

    You can now run the code by typing:

    java cogExample.GatekeeperPing ngs.oerc.ox.ac.uk/jobmanager-pbs 

    If the program ran successfully you should see output similar to the following:

    [user00@pub-234 java_api]$ java cogExample.GatekeeperPing ngs.oerc.ox.ac.uk/jobmanager-pbs 
        Testing contact with ngs.oerc.ox.ac.uk/jobmanager-pbs
        Contact successfully tested
  4. Having got a simple program to run, the next step is to expand it to submit and manage a simple job. This step involves three new files:
    o cogExample/JobSubmission.java: The class that does all the job submission and management.
    o cogExample/SubmitSingleJob.java: This just defines the executable (and its parameters) the job will run and then calls the job submission function from the JobSubmission class.
    o cogExample/GJListen.java: A class that 'listens' to the status of a job.

    Throughout this tutorial new files are introduced. All of these files are commented and if you are familiar with Java then you are encouraged to examine each file.

    Before going into the details of the code, it is worth introducing RSL (Resource Specification Language), the job definition language. An RSL string contains all the information needed for a job to run. In the previous tutorial, the '-x' option simply forced extra information to be included in the RSL string. Both globus-job-run and globus-job-submit are actually wrappers that produce the relevant RSL and pass it to the program globusrun. Run the command

    globus-job-run -dumprsl ngs.oerc.ox.ac.uk/jobmanager-pbs  /bin/hostname -f 

    With the '-dumprsl' option, globus-job-run (and similarly globus-job-submit) merely displays the RSL and does not run any job. You should get the following string:

    &(executable="/bin/hostname")
        (arguments= "-f") 

    Note that the format of an RSL string has changed with Globus Toolkit version 3 and above to use XML.
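
    To make the structure of an RSL string concrete, the following is a minimal Java sketch that assembles one from an executable and its arguments. It is not taken from the tutorial files: the class name RslSketch and the method buildRsl are hypothetical, it follows the quoted form printed by '-dumprsl' above, and it assumes that the values themselves contain no double quotes.

```java
class RslSketch {
    /** Wraps a value in double quotes (assumes the value contains none itself). */
    static String quote(String value) {
        return "\"" + value + "\"";
    }

    /** Builds a minimal RSL string; the arguments clause is omitted when there are none. */
    static String buildRsl(String executable, String... arguments) {
        StringBuilder rsl = new StringBuilder("&(executable=");
        rsl.append(quote(executable)).append(")");
        if (arguments.length > 0) {
            rsl.append("(arguments=");
            for (int i = 0; i < arguments.length; i++) {
                if (i > 0) {
                    rsl.append(' '); // arguments are separated by spaces
                }
                rsl.append(quote(arguments[i]));
            }
            rsl.append(")");
        }
        return rsl.toString();
    }

    public static void main(String[] args) {
        // The same job as the globus-job-run example above.
        System.out.println(buildRsl("/bin/hostname", "-f"));
    }
}
```

    The sketch quotes each argument separately, so a call such as buildRsl("/bin/ls", "-a", "-b") yields an arguments clause of the form (arguments="-a" "-b").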
     

  5. Examine the file cogExample/JobSubmission.java by using the text editor:
    kwrite cogExample/JobSubmission.java & 

    The comments in this file explain how the example works. Before continuing read through these comments. This example will:
    1. submit this RSL string as an interactive job
    2. attach a process to listen for job status changes
    3. run the job and wait for it to finish
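
    The three steps above follow a common listener pattern, which can be sketched without the CoG Kit. In the sketch below a background thread stands in for the remote job, and all of the names (JobListenSketch, StatusListener, RecordingListener, startFakeJob, runAndWait) are hypothetical. Only the 'running' flag and the sleep loop are modelled on the way JobSubmission.java waits on its listener; the real classes are described in the comments of the tutorial files.

```java
import java.util.ArrayList;
import java.util.List;

class JobListenSketch {
    /** Hypothetical callback interface: called whenever the job changes state. */
    interface StatusListener {
        void statusChanged(String status);
    }

    /** Records each status and clears 'running' once a final state is reached. */
    static class RecordingListener implements StatusListener {
        volatile boolean running = true;
        final List<String> seen = new ArrayList<>();

        @Override
        public synchronized void statusChanged(String status) {
            seen.add(status);
            System.out.println("Status: " + status);
            if (status.equals("DONE") || status.equals("FAILED")) {
                running = false;
            }
        }
    }

    /** Simulated job: reports PENDING, ACTIVE and DONE from a background thread. */
    static Thread startFakeJob(StatusListener listener) {
        Thread job = new Thread(() -> {
            for (String status : new String[] {"PENDING", "ACTIVE", "DONE"}) {
                listener.statusChanged(status);
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        job.start();
        return job;
    }

    /** Attaches a listener, starts the simulated job and waits for it to finish. */
    static List<String> runAndWait() {
        RecordingListener listener = new RecordingListener();
        startFakeJob(listener);
        while (listener.running) { // poll until the listener sees a final state
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return new ArrayList<>(listener.seen);
    }

    public static void main(String[] args) {
        System.out.println("Statuses observed: " + runAndWait());
    }
}
```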

    Interactive in this context means that the code maintains a connection to the running job (comparable to globus-job-run). The opposite of interactive is batch (comparable to globus-job-submit). Compile the code by typing:

    ant stage2 

    Once the code has successfully been compiled run the example:

    java cogExample.SubmitSingleJob ngs.oerc.ox.ac.uk/jobmanager-pbs 

    You should get output similar to

    [user00@pub-234 java_api]$ java cogExample.SubmitSingleJob ngs.oerc.ox.ac.uk/jobmanager-pbs
        Testing contact with ngs.oerc.ox.ac.uk/jobmanager-pbs
        Contact successfully tested
        Job Details:
        contact = ngs.oerc.ox.ac.uk/jobmanager-pbs
        command = /bin/hostname -f
        RSL = &(executable=/bin/hostname)
            (arguments=-f)
        Status: https://ngs.oerc.ox.ac.uk:64016/24212/1127081664/ = PENDING
        Status: https://ngs.oerc.ox.ac.uk:64016/24212/1127081664/ = ACTIVE 
        

    You should quickly notice that something is not right. The job appears to be running and no errors have been reported, but the job never reaches the "DONE" state. There are two options now: either wait for the job to time out (this can take a while) or press <CTRL>-c to abort the program.
     

  6. The job failed to finish because nothing was in place to handle its standard output and error. Just as files were staged onto the head node in the previous tutorial, the node running a job must be able to stage files (via the head node) back to the local machine. This is done by running a GASS (Global Access to Secondary Storage) server locally, which uses the HTTP protocol to transfer the files. While it is possible to run one GASS server for multiple jobs, it then becomes difficult to associate each job with its outputs. A better solution is to run one instance of the GASS server for each job.
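
    The idea of a local server that receives a job's output can be illustrated with the HTTP server built into the JDK. The sketch below is only a stand-in for GASS and is not the CoG Kit's GassServer: it uses plain HTTP on the loopback interface with no security, and the names OutputReceiverSketch, getUrl, output and send are hypothetical. It shows a receiver started on a free port (so that each job can have its own), a URL that could be handed to the job, and text arriving back at that URL.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class OutputReceiverSketch {
    private final HttpServer server;
    private final ByteArrayOutputStream received = new ByteArrayOutputStream();

    /** Starts a receiver on a free loopback port; one instance is meant to serve one job. */
    OutputReceiverSketch() {
        try {
            // Port 0 asks the operating system for any free port.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/dev/stdout", exchange -> {
            byte[] body = exchange.getRequestBody().readAllBytes();
            synchronized (received) {
                received.write(body, 0, body.length);
            }
            exchange.sendResponseHeaders(200, -1); // -1 means no response body
            exchange.close();
        });
        server.start();
    }

    /** Base URL of this receiver, comparable in role to the GASS URL in the tutorial. */
    String getUrl() {
        return "http://127.0.0.1:" + server.getAddress().getPort();
    }

    /** Everything received so far, decoded as UTF-8. */
    String output() {
        synchronized (received) {
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    void shutdown() {
        server.stop(0);
    }

    /** Plays the part of the remote site: POSTs text to the URL and returns the HTTP status. */
    static int send(String url, String text) {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        OutputReceiverSketch receiver = new OutputReceiverSketch();
        send(receiver.getUrl() + "/dev/stdout", "output from the job\n");
        System.out.print(receiver.output());
        receiver.shutdown();
    }
}
```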
    Modify the code to start a GASS server before submitting the job. You should already have cogExample/JobSubmission.java open in the text editor. All the necessary changes occur in the submit function. There are four changes that need to be made:

    A. It is necessary to include the relevant classes into the program. Immediately after the line

    import org.globus.gram.GramJob; 

    add the line

    import org.globus.io.gass.server.GassServer; 

    B. Start the GASS server and obtain the URL it is running at. After the lines

    public void submit(String contact, String executable, String params) {
        try
        {

    add the lines

    GassServer gass = new GassServer();
    String gassUrl = gass.getURL();

    C. The next stage is to include information about the GASS server in the RSL so that the site running the job knows where to stage the standard output/error. After the lines

    if (params != null)
    {
        rsl += "(arguments="+params+")";
    }

    insert the line

    rsl += "(stdout=" + gassUrl + "/dev/stdout)" + "(stderr=" + gassUrl + "/dev/stderr)"; 

    D. Finally, the utility function 'display' can also display the GASS URL. Note that the _id parameter is important for the next section of this tutorial. Modify the line

    Utility.display(_id,contact,null,executable + " " + params,rsl); 

    to read

    Utility.display(_id,contact,gassUrl,executable + " " + params,rsl); 

     

  7. Compile the code with the command
    ant stage3 

    and then run the program in the same way as previously. You should get output similar to:

    Testing contact with ngs.oerc.ox.ac.uk/jobmanager-pbs
        Contact successfully tested
        Job Details:
            contact = ngs.oerc.ox.ac.uk/jobmanager-pbs
            GASS Server = https://129.215.30.234:20000
            command = /bin/hostname -f
            RSL = &(executable=/bin/hostname)
                (arguments=-f)
                (stdout=https://129.215.30.234:20000/dev/stdout)
                (stderr=https://129.215.30.234:20000/dev/stderr)
        Status: https://ngs.oerc.ox.ac.uk:64028/5965/1127663915/ = PENDING
        Status: https://ngs.oerc.ox.ac.uk:64028/5965/1127663915/ = ACTIVE
        ngs.oerc12.ox.ac.uk
        Status: https://ngs.oerc.ox.ac.uk:64028/5965/1127663915/ = DONE
        

     

  8. The code in the previous stage made no attempt to handle the standard output and error of the job; they were merely redirected to the standard output and error of the Java program. The CoG Kit, however, makes it possible to handle any data staged back to the GASS server. This is done by introducing a class to handle the data; in this example that class is defined in 'cogExample/JOListen.java', which has already been created for you. The changes needed to 'cogExample/JobSubmission.java' this time are as follows:
    A. As in the previous stage the relevant class must be imported. After the line
    import org.globus.io.gass.server.GassServer; 

    add the line

    import org.globus.io.gass.server.JobOutputStream; 

    B. The second stage is to attach an instance of the JOListen class to the GASS server. Note that it is important that the first parameter to registerJobOutputStream corresponds to the last part of the devices named in the RSL string. For simplicity the job's standard output and error have been merged and are handled by one instance of the class. After the line

    Utility.display(_id,contact,gassUrl,executable + " " + params,rsl); 

    add the following lines

    JOListen joListen = new JOListen();
    JobOutputStream outStream = new JobOutputStream (joListen);
    gass.registerJobOutputStream("out", outStream);
    gass.registerJobOutputStream("err", outStream);

    C. The final step is to display the standard output/error received. A simple utility function has been created for you to display this output (the need for this function will become more apparent in the next section of the tutorial). After the line

    while ( gjListen.running ) { Thread.sleep(1000); }

    add the lines

    System.out.println(_id + "The following output/error was received");
    Utility.printOutput(_id,joListen.output);
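
    The role of the label passed to registerJobOutputStream, and the effect of sharing one stream between standard output and standard error, can be illustrated in plain Java. The sketch below does not use the CoG Kit: LineCollector stands in for the JOListen and JobOutputStream pair, Router stands in for the GASS server's table of registered streams, and all of these names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OutputCaptureSketch {
    /** Collects each complete line written to it, in arrival order. */
    static class LineCollector extends OutputStream {
        final List<String> lines = new ArrayList<>();
        private final ByteArrayOutputStream current = new ByteArrayOutputStream();

        @Override
        public synchronized void write(int b) {
            if (b == '\n') {
                lines.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
                current.reset();
            } else {
                current.write(b);
            }
        }
    }

    /** Hands incoming text to whichever stream was registered under the label. */
    static class Router {
        private final Map<String, OutputStream> streams = new HashMap<>();

        void register(String label, OutputStream stream) {
            streams.put(label, stream);
        }

        void deliver(String label, String text) {
            try {
                streams.get(label).write(text.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Registers one collector under both labels, so output and error are merged. */
    static List<String> demo() {
        LineCollector collector = new LineCollector();
        Router router = new Router();
        router.register("out", collector);
        router.register("err", collector);
        router.deliver("out", "first line\n");
        router.deliver("err", "a warning\n");
        router.deliver("out", "last line\n");
        return collector.lines;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

    Because the same collector is registered under both labels, the lines from the two sources end up interleaved in a single list, which is the merging behaviour described in step B.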

     

  9. Compile the code with the command
    ant stage4 

    and then run the program in the same way as previously. You should get output similar to:

    Testing contact with ngs.oerc.ox.ac.uk/jobmanager-pbs
        Contact successfully tested
        Job Details:
            contact = ngs.oerc.ox.ac.uk/jobmanager-pbs
            GASS Server = https://129.215.30.234:20000
            command = /bin/hostname -f
            RSL = &(executable=/bin/hostname)
                (arguments=-f)
                (stdout=https://129.215.30.234:20000/dev/stdout)
                (stderr=https://129.215.30.234:20000/dev/stderr)
        Status: https://ngs.oerc.ox.ac.uk:64026/28699/1127738894/ = ACTIVE
        Status: https://ngs.oerc.ox.ac.uk:64026/28699/1127738894/ = PENDING
        Status: https://ngs.oerc.ox.ac.uk:64026/28699/1127738894/ = DONE
        The following output/error was received
        ngs.oerc12.ox.ac.uk