*This page is no longer relevant to BaBar users as they can no longer access the RAL Tier A*

Information regarding the RAL xrootd setup and book keeping can be found here

BaBar RAL Tier A

  1. RAL CSF Linux
  2. First Time BaBar Users
  3. Getting a RAL Account
  4. Changing your Password
  5. Setting up the BaBar user environment
  6. Your Home Area and Quotas
  7. Tier A
  8. CVS
  9. CM2 data at RAL
  10. Releases
  11. PID and tracking tables
  12. AFS
  13. Job Submission
  14. Job Status
  15. Grid job submission
  16. Book Keeping Tools
  17. Mirror of the BaBar website
  18. Archiving your data
  19. Support
  20. Old documents

RAL CSF Linux

The Linux farms are managed by the RAL e-Science Centre. BaBar users are no longer able to access the Tier A farm, which is now restricted to Simulation Production (SP). Users wanting to run jobs or perform data analysis should contact their AWG convenor, who will tell them where to log on. For SP, BaBar machines may be accessed by Secure Shell (ssh); telnet access has been turned off. Full documentation can be found in the CSF User Guide. Note that the CSF User Guide was not written with BaBar in mind. Where the documentation and BaBar procedures differ in minor ways, the information in these pages takes precedence.

To get an account for the Linux machines (both farms), read the steps for obtaining a RAL account and fill in the online web form.

First Time BaBar Users

First-time users should follow these steps to get going at RAL:

Getting an Account

To obtain a RAL account please follow the instructions here

Changing your Password

At RAL your AFS and computer (CSF) passwords are independent.

To change your AFS password type:

To change your CSF password (the password required to log into machines) type:

Setting up the BaBar user environment

Your home directory is /home/csf/username on NFS.

Most users won't have to do any special setup. If your RAL CSF account is registered for a different experiment and bfactory is not your default group (e.g. you are working on both BaBar and CMS), then you should either set bfactory as your preferred group:

csf> mkdir ~/.hepix
csf> echo "bfactory" > ~/.hepix/preferred-group

or source the HEPiX central login scripts:

if ( -r /usr/local/lib/hepix/central_login.csh ) then
   source /usr/local/lib/hepix/central_login.csh
endif
if ( -r /usr/local/lib/hepix/central_env.csh ) then
   source /usr/local/lib/hepix/central_env.csh
endif

Your Home Area and Quotas

Unlike at SLAC, your home area is on NFS disk rather than AFS, so you do not need an AFS token to write to it. The top of your home directory has the path:
/home/csf/username
To keep track of the disk space you are using in your home area you can use the command "quota". For example:
[olaiya@csfd]~% quota
Disk quotas for user olaiya (uid 27527):
     Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
csf-home.rl.ac.uk:/home/csf
                 289444  650000  700000            8607       0       0
csf-varmail.rl.ac.uk:/var/mail
                  46108   50000  150000               1       0       0

The usage (blocks), quota and limit numbers are in kBytes.
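As a rough illustration of reading these numbers, the sketch below prints the fraction of the soft quota in use for each filesystem. The function name quota_percent is hypothetical, and the parsing assumes the two-line layout of the sample output above (filesystem name on one line, numbers on the next); other quota versions may format their output differently.

```shell
# Print the percentage of the soft quota used for each filesystem.
# Reads `quota` output on stdin and assumes the layout of the sample above.
quota_percent() {
    awk '
        /:\// { fs = $1; next }              # line naming the filesystem
        $1 ~ /^[0-9]+$/ && $2 > 0 {          # numeric line: blocks, quota, limit, ...
            printf "%s %.1f%%\n", fs, 100 * $1 / $2
        }
    '
}

# Example using the sample output above
quota_percent <<'EOF'
Disk quotas for user olaiya (uid 27527):
     Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
csf-home.rl.ac.uk:/home/csf
                 289444  650000  700000            8607       0       0
csf-varmail.rl.ac.uk:/var/mail
                  46108   50000  150000               1       0       0
EOF
```

On a CSF machine the same function could be fed directly with "quota | quota_percent".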

Tier A

We have finished expanding the Linux farm, which has access to a total of 75 TB of disk space for BaBar usage. Most of this disk is used to store CM2 data, while roughly 3 TB is used for user and AWG scratch disk. Regarding CPU, there are at present approximately 452 dual-processor machines running Scientific Linux 3 (SL3). Of these machines, 284 have 1.4 GHz PIIIs, 8 have 2.66 GHz Xeons and 160 have 2.8 GHz Xeons. The number of CPUs allocated to the farm may vary slightly from time to time. Their setup is as follows:

Login Machines

Disk

AWG Disk Administration

CVS

At RAL there is a mirror of the CVS repository for all BaBar code at SLAC. The clone is updated nightly (actually at about 1 am UK time) and so the latest versions, as well as all previous releases, should always be available. All the major programs maintained at SLAC are copied to RAL.

Important: So that updates to BaBar packages are not scattered between RAL and SLAC, updates are not allowed in the RAL CVS repository. It is locked for writing and you should see warnings if you try to write to the RAL repository. Updates must be made directly to SLAC CVS, from where they will be copied to RAL the following day. To check packages into or update directly from the SLAC repository you can use the command "bcvs". You must have a SLAC AFS token for this to work. Examples of bcvs commands in your release directory:

CM2 data at RAL

Releases

Information on the latest releases can be found here or look under $BFROOT/dist/releases.

PID and Tracking Tables

The PID and tracking tables are copied over every night. However you should be using the database access method.

/nfs/farm/babar/AWG/Tracking/TrkSvtBasedEff ---> /afs/rl.ac.uk/bfactory/AWG/Tracking/TrkSvtBasedEff

/nfs/farm/babar/AWG/PID/pidtables ---> /afs/rl.ac.uk/bfactory/physicstools/pid/pidtables
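If scripts written at SLAC refer to the /nfs/farm paths, the two mappings above could be applied with a small helper along these lines. This is only a sketch: slac_to_ral is a hypothetical name, and it assumes that files below each directory keep the same relative layout in the RAL copy.

```shell
# Translate a SLAC table path into its RAL copy using only the two
# mappings listed above; any other path is reported as unknown.
slac_to_ral() {
    case "$1" in
        /nfs/farm/babar/AWG/Tracking/TrkSvtBasedEff*)
            echo "/afs/rl.ac.uk/bfactory/AWG/Tracking/TrkSvtBasedEff${1#/nfs/farm/babar/AWG/Tracking/TrkSvtBasedEff}" ;;
        /nfs/farm/babar/AWG/PID/pidtables*)
            echo "/afs/rl.ac.uk/bfactory/physicstools/pid/pidtables${1#/nfs/farm/babar/AWG/PID/pidtables}" ;;
        *)
            echo "no RAL mapping known for $1" >&2
            return 1 ;;
    esac
}

slac_to_ral /nfs/farm/babar/AWG/PID/pidtables
```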

AFS

You can access your SLAC home directory from RAL via AFS, but you will need a SLAC AFS token. If you log in from a computer that already has an AFS token and use a version of ssh that supports AFS token passing, then that same token will also be available at RAL. SLAC, CERN, and many other HEP sites have this version; you can check with ssh -h. If the -k option is listed then your ssh supports token passing.

Otherwise, you can obtain a SLAC token with klog user@slac.stanford.edu, or the shortcut, kslac. For example:

csfe ~ > kslac
klog adye@slac.stanford.edu
Password:
csfe ~ > ls /afs/slac.stanford.edu/u/ec/adye

You can also use your RAL AFS area so that your RAL files are easily accessible from elsewhere. Your AFS directory is of the form /afs/rl.ac.uk/user/u/username (e.g. /afs/rl.ac.uk/user/a/adye). Note that when you log in you do not obtain an AFS token automatically, so you should use klog before you try to write there (by default, read access is universal).
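The directory form described above (first letter of the username, then the username) can be built mechanically. A minimal sketch, where ral_afs_home is a hypothetical helper name:

```shell
# Build the RAL AFS directory for a username, following the form
# /afs/rl.ac.uk/user/<first letter>/<username> described above.
ral_afs_home() {
    user=$1
    first=$(printf '%s' "$user" | cut -c1)
    echo "/afs/rl.ac.uk/user/$first/$user"
}

ral_afs_home adye    # /afs/rl.ac.uk/user/a/adye
```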

Job submission

RAL uses PBS (qsub), whereas SLAC uses LSF (bsub). At RAL we try to minimise this difference for BaBar users with the aid of the command bbrbsub. Jobs can be submitted to the farms from an SL3 frontend by specifying the prod PBS batch queue (this feeds into the sl4p queue).

The "-c hh:mm" option requests the same amount of CPU time (cput) and per-process CPU time (pcput) for the job (the two are required to be equal for a job). The "-c" option has the advantage over the equally valid qsub option "-l cput=hh:mm:ss,pcput=hh:mm:ss" of being simpler and the same as the SLAC bsub option. Note that with "-c" you cannot specify seconds, which is no significant loss as CPU requests at the level of minutes are more than sufficiently precise.
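To make the equivalence between the two forms concrete, the sketch below expands an hh:mm value into the matching qsub resource list. It is an illustration only, not bbrbsub's actual code; the function name cput_resource_list is hypothetical and the seconds field is assumed to be 00.

```shell
# Expand an hh:mm CPU request into the equivalent qsub resource list,
# with cput and pcput set to the same value and seconds fixed at 00.
cput_resource_list() {
    case "$1" in
        [0-9][0-9]:[0-5][0-9]) echo "cput=$1:00,pcput=$1:00" ;;
        *) echo "expected hh:mm, got '$1'" >&2; return 1 ;;
    esac
}

cput_resource_list 02:30    # cput=02:30:00,pcput=02:30:00
```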

Also, as the submission queue prod is the default for Linux, omitting the -q option works equally well. Therefore, one can simply submit a job as follows

We also have an SL4 express queue for testing one or two short jobs. One can submit jobs to the SL4 express queue as follows:


bbrbsub

bbrbsub is a wrapper script for the PBS qsub command. We recommend that BaBar users use bbrbsub rather than qsub, as it takes care of many of the differences between qsub and bsub. bbrbsub has the following properties:


1. Makes the batch job execute in directory where execution starts

2. Transfers environment variables (with no failure due to character limits)

3. Keeps stdout and stderr on batch node until job ends

4. Enforces use of the prod feeder queue which directs jobs to the sl3p queue

5. Provides a job name of binary/script but limited to 15 letters

6. Allows multiple arguments to be passed to the script on the command line (e.g. bbrbsub -c hh:mm file.job arg1 arg2 arg3 arg4)
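Property 5 means that long script names are cut short in the queue listing. The sketch below shows one way such a name could be derived; whether bbrbsub strips the directory in exactly this way is an assumption, and job_name and the example path are hypothetical.

```shell
# Derive a job name from a script path: strip the directory and keep
# at most the first 15 characters.
job_name() {
    basename "$1" | cut -c1-15
}

job_name /home/csf/username/my_long_analysis_job.csh    # my_long_analysi
```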


bbrbsub uses the same PBS flags as qsub and, as with qsub, AFS tokens are not passed with the job. At RAL this is not such a problem, as you can use your NFS-mounted disk, where an AFS token is not required. Note that, unlike qsub, bbrbsub does not recognise PBS directives (i.e. lines such as '#PBS -j oe' in the job file). See the qsub man page for details of the qsub options that bbrbsub passes straight through.
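Because directives inside the job file are not recognised, it can help to check a job file for them before submitting. A minimal sketch, where find_pbs_directives is a hypothetical helper and the job file is a throwaway example:

```shell
# List any '#PBS' directive lines in a job file, since bbrbsub will
# not act on them. Returns 0 if directives were found, 1 otherwise.
find_pbs_directives() {
    grep -n '^[[:space:]]*#PBS' "$1"
}

# Example with a throwaway job file
job=$(mktemp)
printf '%s\n' '#!/bin/csh' '#PBS -j oe' 'echo running' > "$job"
if find_pbs_directives "$job"; then
    echo "these directives will not be recognised by bbrbsub" >&2
fi
rm -f "$job"
```

Any option found this way would need to be given on the command line instead, provided it is one that bbrbsub passes through.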

bbrbkill

This is a simple script that functions just as qdel but also offers the additional feature of deleting all your jobs in the queue.

Batch queue scheduling

Details of the batch job scheduling are given here.

One additional control (not currently documented there, 24/Jan/06) is the maximum CPU time you specify on your job with the "-l cput=<time>" option.

The maximum amount of time a job can run on a batch machine is 48 realtime hours. This information can be obtained with the command:

qmgr -c "l q sl3p"

Your jobs may start earlier, and you may be able to run more of them at the same time, if you make a good estimate of the CPU time they will take (how much effect this has depends on the mix of other jobs in the system). In the long run, however, you will not be penalised for using an estimate that is too long. Take care not to make the CPU time estimate too tight, as a job that runs over the limit will be killed.

To estimate the CPU time a job requires you can run a test job (with a high limit) and look at the stdout at the end of the batch job. Remember to add a safety margin as the job will be killed prematurely if it runs for too long.
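The safety margin can be worked out mechanically from the CPU time of the test job. The sketch below is one way to do it; cpu_request is a hypothetical helper, and the 20% default margin is an arbitrary assumption rather than a site recommendation.

```shell
# Turn a measured CPU time (seconds) into an hh:mm request with a
# percentage safety margin, rounding up to the next whole minute.
cpu_request() {
    measured=$1
    margin=${2:-20}                                        # percent; arbitrary default
    padded=$(( (measured * (100 + margin) + 99) / 100 ))   # seconds, rounded up
    minutes=$(( (padded + 59) / 60 ))                      # round up to minutes
    printf '%02d:%02d\n' $(( minutes / 60 )) $(( minutes % 60 ))
}

cpu_request 5000 20    # 01:40
```

The result is in the hh:mm form taken by the "-c" option.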

$WORKDIR

The batch machines at RAL have local disk available for you to write the output of your jobs. The disk is accessible via the $WORKDIR variable on the batch machines. The advantage of using this area is that it is usually much faster to write ntuples etc to the local disk and then copy the final file to /stage/babar-user1 at the end of the job. Not only does it speed up the I/O of your jobs but it also reduces the NFS load on the user and AWG disks. More information on the $WORKDIR variable for the batch machines can be found here
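The pattern described above (write locally, copy once at the end) might look like the following in a job script. This is a sketch under stated assumptions: stage_output and the file name output.txt are hypothetical, the echo stands in for the real job, and the mktemp fallback exists only so the sketch runs on machines where $WORKDIR is not set.

```shell
# Write job output to local scratch first, then copy the finished file
# to shared disk in one go.
stage_output() {
    dest=$1                                   # e.g. a directory under /stage/babar-user1
    work=${WORKDIR:-$(mktemp -d)}             # WORKDIR is set on the batch machines
    out="$work/output.txt"                    # hypothetical output file
    echo "job output goes here" > "$out"      # stand-in for the real job
    mkdir -p "$dest" && cp "$out" "$dest/"    # single copy at the end of the job
}

# Usage (destination directory is up to you):
# stage_output /stage/babar-user1/<your directory>
```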

Job Status

To see what jobs are running on the farm you can use the command:

To check the status of your jobs you can run:

To see the number of jobs running in the different queues at RAL type:

Keep in mind that BaBar users will only have running jobs in the sl3p queue.

To look at the output of jobs that are currently running you can use the qcat command

Grid job submission

The RAL Tier A will accept LCG jobs from users with an LCG certificate in the BaBar Virtual Organisation (among others). See the BaBar Grid Registration page for details of obtaining a certificate, registering it with LCG, and adding it to the BaBar VO.

Grid submission for BaBar analysis jobs is still experimental, but is being worked on.

Book Keeping Tools

The book keeping tools used at SLAC are also available at RAL. To access the relevant database for the RAL site the flag --dbsite=ral must be specified.

For example, to see what skims are at RAL you can run:

To produce onpeak 4 body tcl files each identifying 200000 events you can run the command:

Mirror of BaBar website

An up-to-date mirror of the BaBar website can be found at RAL here

Archiving your data

If you no longer need your data on the home filesystem (/home/csf), user data disk (/stage/babar-user1), or AWG data disk (/stage/babar-awgN), please consider freeing up the space by deleting it - perhaps after copying it back to your desktop machine or local institution, or archiving it to tape at RAL.

Support

All BaBar questions specific to the RAL Tier A centre should be posted on the HyperNews forum: RAL Tier A Centre and UK Farms.

If these do not get a response, contact Tim Adye or E.Olaiya@rl.ac.uk directly.

Old Documents

Old documents
http://hepunx.rl.ac.uk/BFROOT/csflnx.html last modified 5th Jan 2009 by
Emmanuel Olaiya, <E.Olaiya@rl.ac.uk>
and
Tim Adye, <T.J.Adye@rl.ac.uk>