NGS-SRB

SRB

The aim of this tutorial is to demonstrate how the Storage Resource Broker (SRB) can be used to store files that may be accessed from multiple locations. This is done by running SRB commands on both a remote machine and your local machine. The local commands require Globus and the Scommands to be installed on your local machine; Scommands downloads and installation instructions can be found on the SRB website.

To see how SRB has been used on the NGS, see the user case study on The effects of defibrillation on the heart.

1. Connect to the STFC-RAL node in the GSI-SSH terminal by typing the command

    gsissh -p 2222 ngs.rl.ac.uk

2. The first stage is to create the configuration file that SRB uses to specify your default settings. In particular, this file defines the default SRB server location, the default user name, and the method used to connect securely.

A script to automatically generate the configuration file has been provided for you on the RAL node. This script will also automatically copy the file to all NGS core nodes for you. Use the following command (see information on the arguments below):

    [ngs0249@grid-data ngs0249]$ /home/srb/create-mdas ngs <your-srb-username>

The arguments to the script are:

1) ngs - this is the SRB domain and will be the same for most NGS users
2) your-srb-username - the SRB username sent to you in the registration email

Inspect the contents of the file using the command

    [ngs0249@grid-data ngs0249]$ cat .srb/.MdasEnv

Note about default-resource: this is your SRB account's default storage vault, and it is easily changed to your local vault. Using the vault nearest to you spreads traffic across the vaults rather than concentrating it on a single one, and decreases transfer time. The NGS-SRB vaults are:

  • STFC-RAL    ral-ngs1
  • Oxford      oxford-ngs1
  • Manchester  mc-ngs1
  • Leeds       leeds-ngs1
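
As a minimal sketch of changing the default vault, the shell functions below map a site name to its vault and rewrite the default resource line of a configuration file. The function names are hypothetical, and the sketch assumes the file stores the setting on a line of the form defaultResource 'ral-ngs1'; compare this with the output of the cat command above before relying on it, and try it on a copy of the file first.

```shell
#!/bin/sh
# Map an NGS site name to its SRB vault (names taken from the list above).
vault_for_site() {
    case "$1" in
        STFC-RAL)   echo "ral-ngs1" ;;
        Oxford)     echo "oxford-ngs1" ;;
        Manchester) echo "mc-ngs1" ;;
        Leeds)      echo "leeds-ngs1" ;;
        *)          echo "unknown site: $1" >&2; return 1 ;;
    esac
}

# Rewrite the default resource line in an .MdasEnv-style file.
# ASSUMPTION: the setting is stored as a line like:  defaultResource 'ral-ngs1'
set_default_resource() {
    env_file=$1
    vault=$2
    tmp="${env_file}.tmp.$$"
    # Write the edited text to a temporary file, then copy it back so the
    # original file keeps its permissions.
    sed "s/^defaultResource .*/defaultResource '${vault}'/" "$env_file" > "$tmp" \
        && cat "$tmp" > "$env_file" \
        && rm -f "$tmp"
}
```

For example, set_default_resource my-copy-of-MdasEnv "$(vault_for_site Leeds)" would point a copy of the file at the Leeds vault.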

 

3. The NGS uses a system of accessing different groups of software, or modules, on request. To access the SRB commands (called the Scommands) it is necessary to modify your remote environment to find the relevant commands:

    [ngs0249@grid-data ngs0249]$ module load srb

(If you have also installed the Scommands on your local machine, you will need to set up the user environment manually. Full instructions can be found on the SRB website.)
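
On either machine, a quick way to confirm that the environment is set up is to check that the commands used in this tutorial can be found on your PATH. The helper below is a small sketch; missing_commands is a hypothetical name, and it relies only on the standard shell built-in command -v.

```shell
#!/bin/sh
# Print the name of every command given as an argument that cannot be
# found on the current PATH. Prints nothing if all of them are available.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}
```

Running missing_commands Sinit Sput Sget Sls Sexit after loading the module should print nothing if the Scommands are available.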

4. Before running SRB commands it is necessary to initialise your environment, using the Sinit command. This command is needed to handle multiple simultaneous SRB sessions from the same host (not done in this tutorial). Run the command

    [ngs0249@grid-data ngs0249]$ Sinit
   

5. Before a file can be transferred into SRB storage it must first be created. Create a file on the NGS node using the command below:

    [ngs0249@grid-data ngs0249]$ hostname > myfile1.ngs18

6. Transfer myfile1.ngs18 into your default (top-level) directory, or collection in SRB terminology:

    [ngs0249@grid-data ngs0249]$ Sput myfile1.ngs18 .

You can check that the file is now stored in SRB by using:

    [ngs0249@grid-data ngs0249]$ Sls
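
Steps 4 to 6 can also be collected into a small script. The sketch below uses only the commands shown above (Sinit, Sput and Sls); the function names and the DRY_RUN variable are hypothetical. By default it only prints the commands it would run, so it can be tried safely on a machine without the Scommands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints each command instead of
# running it. Set DRY_RUN=0 to execute the commands for real.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Initialise the SRB session, upload one file to the default collection,
# and list the collection to confirm the upload.
upload_to_srb() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "refusing to upload missing or empty file: $file" >&2
        return 1
    fi
    run Sinit || return 1
    run Sput "$file" . || return 1
    run Sls
}
```

Checking that the file exists and is non-empty before calling Sput is a design choice of this sketch, not something the Scommands require.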

7. This file may now be easily accessed and read from a local machine that has the Scommands and Globus installed:

    local machine> cd
    local machine> Sget myfile1.ngs18 .
    local machine> cat myfile1.ngs18

8. New collections (directories) can easily be created using the Scommands, and files copied into them:

    [ngs0249@grid-data ngs0249]$ Smkdir mydir.ngs18
    [ngs0249@grid-data ngs0249]$ hostname > myfile2.ngs18
    [ngs0249@grid-data ngs0249]$ Sput myfile2.ngs18 mydir.ngs18

9. It is also possible to read a file from SRB without having to copy the file out of SRB first:

    [ngs0249@grid-data ngs0249]$ Scat mydir.ngs18/myfile2.ngs18

10. Having now covered basic usage, a couple of more advanced topics can be covered. NOTE: Most of the commands in these final topics are shown on the local machine, although they could just as easily be run on the remote machine.

The first of these advanced topics is replication. SRB allows files to be replicated to additional data stores (vaults). When working with files that are replicated in multiple vaults, the SRB commands will (by default) choose which copy to use. File replication has two primary benefits. First, a copy that is 'close' to where a job is running will be quicker to access. Second, access to the file becomes more reliable (e.g. a vault can fail without you losing access to your file). The default vault used in the previous steps is ral-ngs1, but there is also a vault at oxford-ngs1. Run the following commands to see replication in action:

    [ngs0249@grid-data ngs0249]$ Sls -l myfile1.ngs18
    [ngs0249@grid-data ngs0249]$ Sreplicate -S oxford-ngs1 myfile1.ngs18
    [ngs0249@grid-data ngs0249]$ Sls -l myfile1.ngs18
    [ngs0249@grid-data ngs0249]$ SgetD myfile1.ngs18
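
To extend the replication step to the other vaults, the sketch below prints one Sreplicate command per vault other than the one that already holds the file. The function name replicate_commands and the variable NGS_VAULTS are hypothetical; the vault names come from the list in step 2 and the -S option from the command above. This tutorial does not state whether your account may write to every vault, so the commands are only printed for review rather than executed.

```shell
#!/bin/sh
# Vault names taken from the list in step 2.
NGS_VAULTS="ral-ngs1 oxford-ngs1 mc-ngs1 leeds-ngs1"

# Print one Sreplicate command for each vault except the current one.
#   $1 - name of the file in SRB
#   $2 - vault that already holds the file
replicate_commands() {
    file=$1
    current=$2
    for vault in $NGS_VAULTS; do
        if [ "$vault" != "$current" ]; then
            echo "Sreplicate -S $vault $file"
        fi
    done
}
```

For example, replicate_commands myfile1.ngs18 ral-ngs1 prints three commands, the first being the Sreplicate -S oxford-ngs1 myfile1.ngs18 command used above.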

11. The previous commands have in general only handled single files or collections. A potentially more realistic scenario is one in which a collection (and its sub-collections) needs to be synchronized with a local copy. Synchronizing (as opposed to copying everything) prevents unneeded transfers into and out of SRB. SRB uses checksums of each file to identify files that need updating. As a preparatory step it is helpful to tell SRB to generate checksums for each file in the collection to be synchronized (this only needs to be done once).

The first step is to test if any checksums already exist:

    local machine> Schksum -l -r mydir.ngs18

The "0" in your output means the checksum has not yet been generated. Generate the checksum with

    local machine> Schksum -f -r mydir.ngs18

and then list the checksums again to see the difference.
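
The idea behind checksum-based synchronization can be illustrated locally with the standard cksum utility. This is only a sketch of the general technique, not SRB's own implementation: the hypothetical function needs_transfer reports that a copy is required when the destination is missing or when its checksum and size differ from those of the source.

```shell
#!/bin/sh
# Print the CRC checksum and byte count of a file (POSIX cksum reading
# from standard input, so no file name appears in the output).
file_sum() {
    cksum < "$1"
}

# Succeed (exit status 0) if src must be copied to dst, that is, if dst
# does not exist or its checksum and size differ from those of src.
needs_transfer() {
    src=$1
    dst=$2
    [ -f "$dst" ] || return 0
    [ "$(file_sum "$src")" != "$(file_sum "$dst")" ]
}
```

A synchronization tool built this way skips every file for which needs_transfer fails, which is what avoids transfers of unchanged files.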

To synchronize the contents of the remote collection "mydir.ngs18" to the local folder "srb" use the commands:

    local machine> cd srb
    local machine> ls
    local machine> Srsync -r s:mydir.ngs18 .
    local machine> ls

The "s:" indicates that the following file or collection is in SRB.

Finally, create a new local file and then re-synchronize the local folder with the collection in SRB:

    local machine> hostname > newfile.ngs18
    local machine> Srsync -r . s:mydir.ngs18
    local machine> Sls

12. The final topic to be discussed in this tutorial is deleting files and collections. Completely removing a file (or collection) from SRB is a two-step process. To remove the file "newfile.ngs18" first use the command

    local machine> Srm mydir.ngs18/newfile.ngs18

which moves the file to your "trash" collection (cf. the Windows Recycle Bin). The contents of your trash collection can be viewed using:

    local machine> Sls /ngs/trash/home/<srbname>.ngs

where <srbname> is the name of your SRB account (the SRB username from your registration email, as used in step 2). To remove files from your trash collection use the command:

    local machine> Srmtrash
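
If you need the trash collection path in a script, it can be built from the account name. The sketch below follows the pattern of the Sls command above; the function name trash_collection is hypothetical, and applying the same pattern to domains other than ngs is an assumption.

```shell
#!/bin/sh
# Build the path of a user's trash collection.
# ASSUMPTION: the path follows the pattern shown in this tutorial,
# /<domain>/trash/home/<srbname>.<domain>, with "ngs" as the default domain.
trash_collection() {
    srbname=$1
    domain=${2:-ngs}
    echo "/${domain}/trash/home/${srbname}.${domain}"
}
```

The result can then be passed to Sls, for example Sls "$(trash_collection <srbname>)" with your own account name substituted.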

List the contents of your trash collection to confirm that it is now empty. Since this tutorial is almost finished, delete all the files and collections you have created in SRB.

13. When you are finished using SRB it is sensible to exit the session and (optionally) unload the NGS srb module:

    local machine> Sexit
    [ngs0249@grid-data ngs0249]$ Sexit
    [ngs0249@grid-data ngs0249]$ module unload srb
    [ngs0249@grid-data ngs0249]$ exit