1. Slurm Workload Manager#
1.1. Command line tools#
Slurm provides a complete set of command line tools to manage and control your jobs. Some of them
carry out common tasks, such as submitting job scripts to the queue (sbatch)
or printing information about the queue (squeue). Others have new roles not
found in a classic PBS environment, such as srun.
1.1.1. Job management tools#
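Typical job management tools include sbatch to submit job scripts to the queue, scancel to cancel queued or running jobs, and srun to launch parallel tasks or interactive sessions.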
1.1.2. Slurm monitoring tools#
1.1.2.1. Monitoring jobs#
With the squeue command you can monitor your jobs in the queue.
In the VUB clusters you can also use mysqueue, which shows a more detailed
overview of your jobs currently in the queue, either PENDING to start or
already RUNNING.
mysqueue output#
JOBID PARTITION NAME USER STATE TIME TIME_LIMIT NODES CPUS MIN_MEMORY NODELIST(REASON)
1125244 ampere_gpu gpu_job01 vsc10000 RUNNING 3-01:55:38 5-00:00:00 1 16 7810M node404
1125245 ampere_gpu gpu_job02 vsc10000 PENDING 0:00 5-00:00:00 1 16 10300M (Priority)
1125246 zen5_mpi my_job01 vsc10000 RUNNING 2-19:58:16 4-23:59:00 2 32 8G node[710,719]
1125247 pascal_gpu gpu_job03 vsc10000 PENDING 0:00 3-00:00:00 1 12 230G (Resources)
Each row in the table corresponds to one of your running or pending jobs, or to an individual running job of a job array. You can check the PARTITION where each job is running or trying to start, as well as the resources (TIME, NODES, CPUS, MIN_MEMORY) that are or will be allocated to it.
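For instance, a few common ways to inspect your jobs with the standard Slurm tools (the job ID is taken from the example listing above and is only illustrative):

squeue -u $USER              # show only your own jobs in the queue
squeue -u $USER --start      # show the expected start time of your pending jobs
scontrol show job 1125245    # show the full details of a specific job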
Note
The command mysqueue -t all shows all your jobs from the last 24 hours, regardless of their state.
The column NODELIST(REASON) shows either the list of nodes used by a running job or the reason why a job is still pending. The most common reason codes are the following:
- Priority
The job is waiting for other pending jobs ahead of it in the queue to be processed.
- Resources
The job is at the front of the queue, but no nodes with the requested resources are currently available.
- ReqNodeNotAvail
The requested partition/nodes are not available. This usually happens during a scheduled maintenance.
See also
Full list of reason tags for pending jobs.
1.1.2.2. Monitoring nodes and partitions#
With the sinfo command you can monitor the nodes and partitions in the cluster.
In the VUB clusters you can also use mysinfo to get a more detailed view of
the cluster. It shows an overview in real time of the available hardware
resources for each partition in the cluster, including cores, memory and GPUs,
as well as their current load and running state.
mysinfo output#
CLUSTER: hydra
PARTITION STATE [NODES x CPUS] CPUS(A/I/O/T) CPU_LOAD MEMORY MB GRES GRES_USED
ampere_gpu resv [ 2 x 32 ] 0/64/0/64 0.01-0.03 246989 MB gpu:a100:2(S:1) gpu:a100:0(IDX:N/A)
ampere_gpu mix [ 3 x 32 ] 66/30/0/96 13.92-19.47 257567 MB gpu:a100:2(S:0-1) gpu:a100:2(IDX:0-1)
ampere_gpu alloc [ 3 x 32 ] 96/0/0/96 3.27-32.00 257567 MB gpu:a100:2(S:0-1) gpu:a100:2(IDX:0-1)
zen5_himem alloc [ 1 x 128 ] 128/0/0/128 59.54 1547679 MB (null) (null)
zen5_mpi mix [ 10 x 128 ] 817/463/0/1280 0.00-124.32 773536+ MB (null) (null)
[...]
zen4 mix [ 13 x 64 ] 346/486/0/832 0.02-50.06 386510 MB (null) (null)
zen4 alloc [ 7 x 64 ] 448/0/0/448 2.03-74.64 386510 MB (null) (null)
Tip
The command mysinfo -N shows a detailed overview per node.
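For example, a few common invocations of the standard sinfo command (the partition name is taken from the listing above):

sinfo                            # state of all partitions in the cluster
sinfo --partition=ampere_gpu     # restrict the output to a single partition
sinfo -N -l                      # long format, one line per node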
1.1.2.3. Monitoring job accounting data#
Warning
Use with restraint: avoid including sacct or mysacct in your scripts.
With the sacct command you can
display accounting data of your current and past jobs (and job steps), such as
CPU time and memory used. In the VUB clusters you can also use mysacct to
get a more detailed view of your jobs, and slurm_jobinfo <JOB_ID> to view
the details of a given job (replace <JOB_ID> with the ID of your job).
Tip
Use --starttime and --endtime to specify a time range.
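For example, a minimal sketch (the job ID and dates are placeholders):

sacct -j 1125246 --format=JobID,JobName,Elapsed,NCPUS,MaxRSS,State   # usage of one job and its steps
sacct --starttime=2024-01-01 --endtime=2024-01-31                    # all your jobs in a given time range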
1.1.2.4. Monitoring running jobs#
With the sattach command you can attach your terminal's standard input, output, and error streams to a running job step.
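sattach takes a job step identifier of the form <jobid>.<stepid>; for example (placeholder job ID):

sattach 1125244.0    # attach your terminal to step 0 of job 1125244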
1.2. Job working directory#
In Slurm, a job starts by default in the directory from which it was submitted.
Thus, adding cd $SLURM_SUBMIT_DIR to the job script is not needed. Users can
also use the Slurm option --chdir to specify the directory in which a job
should start.
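For example, a short sketch of both approaches (paths and script name are placeholders):

# default behaviour: the job starts in the directory where sbatch was run
sbatch job_script.sh

# start the job in another directory instead
sbatch --chdir=/path/to/workdir job_script.sh

# the same option can also be set inside the job script
#SBATCH --chdir=/path/to/workdir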
1.3. Torque/Moab to Slurm migration#
Experienced Torque/Moab users can use the translation tables below to quickly get up and running in Slurm.
1.3.1. Submitting and monitoring jobs#
| Torque/Moab | Slurm | Description |
|---|---|---|
| qsub <script_file> | sbatch <script_file> | Submit a job with batch script |
| qsub -I | srun --pty bash -l | Start an interactive job, see Interactive jobs |
| qdel <job_id> | scancel <job_id> | Delete a job |
| qstat | mysqueue --states=all or mysacct --starttime=YYYY-MM-DD | Show job queue status |
| qstat -f <job_id> | scontrol show job <job_id> or slurm_jobinfo <job_id> | Show details about a job |
| | sacct -j <job_id> or mysacct | Show resources usage |
| pbsnodes | sinfo or mysinfo | Show summary of available nodes and their usage |
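For example, a typical submit, monitor and cancel cycle translates as follows (script name and job ID are placeholders):

# Torque/Moab
qsub job_script.sh
qstat
qdel 1125245

# Slurm
sbatch job_script.sh
mysqueue
scancel 1125245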
1.3.2. Requesting resources and other options#
| Torque/Moab | Slurm | Description |
|---|---|---|
| -N <job_name> | --job-name=<job_name> | Set job name to <job_name> |
| -l walltime=HH:MM:SS | --time=DD-HH:MM:SS | Define the time limit |
| -l nodes=1:ppn=1 | --ntasks=1 (the default) | Request a single CPU core |
| -l nodes=1:ppn=X | --cpus-per-task=X | Request multiple cores on 1 node for Parallel non-MPI jobs |
| -l nodes=X:ppn=Y | --ntasks=X, or --ntasks=X --nodes=Y, or --nodes=Y --ntasks-per-node=Z | Request multiple cores on 1 or multiple nodes for Parallel MPI jobs |
| -l pmem=N | --mem-per-cpu=N (default unit = MB) | Request memory per CPU core. Only if needed, see Memory allocation |
| -l feature=<feature> | --partition=<partition> | Request a specific node type, see Features to partitions |
| -M <email> | --mail-user=<email> | Send job alerts to given email address |
| -m abe | --mail-type=<type> (BEGIN, END, FAIL, REQUEUE or ALL; select 1 or a comma separated list) | Conditions for sending alerts by email |
| -o <output_file> | --output=<output_file> | Write stdout to <output_file> |
| -e <error_file> | --error=<error_file> | Write stderr to <error_file> |
| -j oe | (default, unless --error is specified) | Write stdout and stderr to the same file |
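As an illustrative sketch, a Torque job script header and its Slurm counterpart using the options above (job name, times, memory and email address are placeholders):

# Torque/Moab directives
#PBS -N my_job01
#PBS -l walltime=24:00:00
#PBS -l nodes=1:ppn=4
#PBS -l pmem=2gb
#PBS -M user@example.com
#PBS -m abe

# Equivalent Slurm directives
#SBATCH --job-name=my_job01
#SBATCH --time=24:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=2G
#SBATCH --mail-user=user@example.com
#SBATCH --mail-type=BEGIN,END,FAIL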
1.3.3. Environment variables defined by resource managers#
| Torque/Moab | Slurm | Description |
|---|---|---|
| $PBS_JOBID | $SLURM_JOB_ID | Job ID |
| $PBS_O_WORKDIR | $SLURM_SUBMIT_DIR | Directory where job was submitted from, see Job working directory |
| $PBS_NODEFILE (nodes file) | $SLURM_JOB_NODELIST or $(scontrol show hostnames) (nodes string) | List of nodes assigned to job |
| $PBS_JOBNAME | $SLURM_JOB_NAME | Job name |
| $PBS_ARRAYID | $SLURM_ARRAY_TASK_ID | Job array ID (index) number |
| $PBS_NUM_NODES | $SLURM_JOB_NUM_NODES | Number of nodes |
| $PBS_NUM_PPN | $SLURM_CPUS_ON_NODE | Number of cores per node |
| $PBS_NP | $SLURM_NTASKS | Total number of cores |
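For example, a short sketch of a Slurm job script using some of these variables (job name and time limit are placeholders):

#!/bin/bash
#SBATCH --job-name=env_demo
#SBATCH --time=00:05:00

echo "Job $SLURM_JOB_ID ($SLURM_JOB_NAME) was submitted from $SLURM_SUBMIT_DIR"
echo "Running on $SLURM_JOB_NUM_NODES node(s): $SLURM_JOB_NODELIST"
scontrol show hostnames    # expand the compact node list into one hostname per line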
1.3.4. Features to partitions#
| Torque/Moab features | Slurm partitions |
|---|---|
|  |  |