Category: Platform LSF

Content related to the scheduler software IBM Platform LSF (e.g., bsub, bjobs).


Queue Information

To view the available queues and their basic limits, use the bqueues command. To find more detail on a particular queue, use bqueues -l <queuename>. The most meaningful numbers are JL/U, MAX, and RUNLIMIT. JL/U is “Job Limit per User” or “Job Slot Limit per User.” This is somewhat misleading, as it actually limits the number of cores...


Sequential Job Submission

For situations where you need to run several jobs back-to-back, with each waiting for the prior job’s completion, you can use the -K option with bsub. With -K, bsub does not return until the submitted job has finished, so the next command in your script does not start until then. Example…

    #!/bin/bash
    bsub -K -o out.1 sleep 10 &
    bsub -K -o out.2 sleep 5 &
    wait

From the man...
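In the example above, the trailing & puts each bsub in the background, so the two jobs are submitted together and wait returns once both have finished. For strictly back-to-back execution, a minimal sketch is to leave out the & so each bsub -K blocks in turn. The helper name submit_and_wait and the dry-run fallback are illustrative assumptions, not part of LSF. The sketch also assumes bsub -K passes back the job's exit status, which is worth confirming against your site's bsub man page.

```shell
#!/bin/bash
# Hypothetical helper: submit a job with -K and block until it finishes.
# If bsub is not installed (e.g. when testing off-cluster), only print
# the command that would have been run.
submit_and_wait() {
    if command -v bsub >/dev/null 2>&1; then
        bsub -K "$@"
    else
        echo "(dry run) bsub -K $*"
    fi
}

# No trailing "&": the second job is submitted only after the first ends.
# "&&" skips the second step if the first returns a non-zero status.
submit_and_wait -o out.1 sleep 10 &&
submit_and_wait -o out.2 sleep 5
```

Chaining with && is a design choice: it stops the sequence at the first failing step. Use ; instead if later steps should run regardless.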


Mail Notifications

*Please note: we are currently investigating an issue that is preventing compute nodes from sending completion e-mail. A workaround is to use the bsub -K option so that bsub returns only once the job has finished, which lets your script then send mail from the login node. You might want to be notified of the status of your jobs without having to log...
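The workaround described above might be sketched as follows. This is only an illustration under several assumptions: the function names, the job label, and the address user@example.com are hypothetical; it assumes a standard mail command is available on the login node; and it assumes bsub -K returns the job's exit status.

```shell
#!/bin/bash
# Hypothetical address: replace with your own.
NOTIFY_ADDR="${NOTIFY_ADDR:-user@example.com}"

# Build a subject line from a job label and its exit status.
build_subject() {
    label="$1"
    status="$2"
    if [ "$status" -eq 0 ]; then
        echo "LSF job '$label' finished successfully"
    else
        echo "LSF job '$label' failed with exit status $status"
    fi
}

# Submit with -K (blocks until the job ends), then send the mail from
# the node where this script is running, i.e. the login node.
run_and_notify() {
    label="$1"
    shift
    bsub -K -o "$label.out" "$@"
    status=$?
    echo "See $label.out for the job output." |
        mail -s "$(build_subject "$label" "$status")" "$NOTIFY_ADDR"
    return "$status"
}

# Usage, on a login node:
#   run_and_notify mytest sleep 10
```

Because the script blocks until the job ends, it is best started in the background or inside a terminal multiplexer if the job is long.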


Interactive Jobs

In some cases you may want to run tests on the compute nodes, to validate your scripts. A good way to do this is to run an LSF job in interactive mode. This way, you can simulate the exact environment in which your code will run. Here are the suggested commands for experimenting at the compute node level with interactive...


LSF Error Codes

Determining why a job ended unexpectedly is an essential skill for running jobs successfully on the cluster and identifying systemic errors. The basic process for locating error codes, and subsequently an English translation, mostly involves the bjobs and bhist commands. A script for locating job exit information is also provided in /tools/scripts. Here is some information on...
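As a rough aid for reading the exit code shown by bjobs -l <jobid> or bhist -l <jobid>, the sketch below applies the usual convention that values above 128 mean the job was ended by a signal numbered (code - 128). The function name describe_exit_code is hypothetical and is not the script provided in /tools/scripts. Treat the wording of each message as a hint to investigate rather than a definitive diagnosis.

```shell
#!/bin/bash
# Hypothetical helper: give a first-pass reading of a job exit code.
describe_exit_code() {
    code="$1"
    if [ "$code" -eq 0 ]; then
        echo "exit 0: job completed normally"
    elif [ "$code" -gt 128 ]; then
        # Convention: exit code = 128 + signal number.
        sig=$((code - 128))
        echo "exit $code: likely terminated by signal $sig ($code - 128)"
    else
        echo "exit $code: the job's own command returned a non-zero status"
    fi
}

# Example: describe_exit_code 137
```

For example, a code of 137 maps to signal 9 (SIGKILL on Linux), which often points to the job being killed for exceeding a limit, though the bjobs -l and bhist -l output should be checked for the stated reason.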