Job Dependency and Intragroup Contention
The allocation and scheduling of resources by the Torque/Moab workload manager is a complex process. When multiple jobs are submitted to the same reservation, it becomes more complicated still and can lead to unexpected results.
To facilitate preemption, the scheduler runs a node selection algorithm and attempts to free the selected nodes by breaking down any jobs running on them. Depending on the cluster workload or the type of software involved, this can be time-consuming. If the scheduler cannot free the necessary resources within a given amount of time, it moves the preempting job to deferred status, where it waits for a while before attempting the preemption process again.
While this job is deferred, other jobs submitted to the same reservation may be able to grab the previously requested nodes and begin to run. As a result, jobs can run out of their submission order: a job submitted later may jump ahead of one submitted earlier. For example:
10:00 AM Job 1111 submitted by User A
10:15 AM Job 2222 submitted by User A
11:00 AM Job 1111 preemption process begins
11:15 AM Job 2222 preemption process begins
11:15 AM Job 1111 unable to preempt, moved to Deferred
11:20 AM Job 2222 grabs Job 1111's previously requested resources and begins to run
11:45 AM Job 1111 moved back into Idle, now must wait again for free resources
To keep your own jobs from being jumped and run out of the expected order, we recommend using a job dependency flag:
$ rsub -l nodes=3:ppn=20 ./myjob.sh
$ rsub -l nodes=3:ppn=20 -W depend=after:49243.hopper-mgt ./myjob.sh
$ rsub -l nodes=3:ppn=20 -W depend=after:49270.hopper-mgt ./myjob.sh
Here, the first job is submitted normally with rsub, and each following job specifies a dependency on its predecessor, using the job ID of the previous submission. With this in place, the jobs stay in Hold until the dependency is met. In this case, the “after” dependency is used, meaning each job becomes eligible to run once the job named in its dependency has started. There are several other dependency types that you can specify.
active jobs ------------------------
JOBID   USERNAME   STATE     PROCS   REMAINING     STARTTIME
49243   morgaia    Running   20      45:00:00:00   Mon Feb 27 15:07:56
1 active job

blocked jobs -----------------------
JOBID   USERNAME   STATE     PROCS   WCLIMIT       QUEUETIME
49270   morgaia    Hold      20      45:00:00:00   Mon Feb 27 15:19:41
49271   morgaia    Hold      20      45:00:00:00   Mon Feb 27 15:21:29
2 blocked jobs
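The chain of submissions above can also be scripted so that each job ID is captured rather than typed by hand. The following is a minimal sketch, not a site-provided tool: submit_chain, SUBMIT_CMD, RESOURCES, and DEPEND_TYPE are hypothetical names, and it assumes that rsub prints the full job ID (for example 49243.hopper-mgt) on standard output, the way Torque's qsub does.

```shell
#!/usr/bin/env bash
# Sketch: submit scripts in order, each depending on the previous one.
# ASSUMPTION: the submit command prints the new job's full ID on stdout.
# All names below are illustrative, not part of rsub or Torque.

SUBMIT_CMD="${SUBMIT_CMD:-rsub}"            # command used to submit a job
RESOURCES="${RESOURCES:-nodes=3:ppn=20}"    # resource request from the example
DEPEND_TYPE="${DEPEND_TYPE:-after}"         # dependency type used in the chain

submit_chain() {
    local prev="" id script
    for script in "$@"; do
        if [ -z "$prev" ]; then
            # First job: submitted normally, no dependency.
            id=$("$SUBMIT_CMD" -l "$RESOURCES" "$script") || return 1
        else
            # Later jobs: depend on the job submitted just before.
            id=$("$SUBMIT_CMD" -l "$RESOURCES" \
                 -W "depend=${DEPEND_TYPE}:${prev}" "$script") || return 1
        fi
        printf '%s\n' "$id"   # report each job ID as it is assigned
        prev="$id"
    done
}
```

After sourcing the file, a call such as submit_chain ./step1.sh ./step2.sh ./step3.sh would submit the three scripts in order and print one job ID per line. Setting DEPEND_TYPE=afterok, a standard Torque dependency type, would instead hold each job until its predecessor has finished without error, assuming rsub passes the flag through unchanged.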
Note: the job ID in the dependency must have .hopper-mgt appended. Without it, the scheduler returns an “invalid dependency” error.
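If you keep only the numeric part of a job ID, a small helper can add the suffix before building the flag. This is an illustrative sketch only: full_job_id, depend_arg, and SERVER_SUFFIX are hypothetical names, not part of rsub or Torque.

```shell
#!/usr/bin/env bash
# Sketch: build a "-W" dependency argument from a job ID.
# ASSUMPTION: the server suffix is "hopper-mgt", as in the examples above.
# The function and variable names are illustrative.

SERVER_SUFFIX="${SERVER_SUFFIX:-hopper-mgt}"

# Print the job ID with the server suffix, adding it only if it is missing.
full_job_id() {
    local id="$1"
    case "$id" in
        *.*) printf '%s\n' "$id" ;;                      # already qualified
        *)   printf '%s.%s\n' "$id" "$SERVER_SUFFIX" ;;  # bare number
    esac
}

# Print a dependency argument, e.g. "depend=after:49243.hopper-mgt".
depend_arg() {
    local type="$1" id="$2"
    printf 'depend=%s:%s\n' "$type" "$(full_job_id "$id")"
}
```

With these functions sourced, the second submission shown earlier could be written as rsub -l nodes=3:ppn=20 -W "$(depend_arg after 49243)" ./myjob.sh.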