4. Resource allocation#
4.1. CPU-cores allocation#
In Slurm, the way CPU-cores are requested for a job depends on the type of application, and users should distinguish between (at least) two classes of parallel applications (request sketches for both follow the list):
Parallel non-MPI jobs: single-node jobs with a single task and multiple CPU-cores per task
Parallel MPI jobs: multi-task jobs that can run on one or more nodes
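As an illustration, here is a minimal sketch of the Slurm directives typically used for each class; the node, task and core counts as well as the executable names are placeholders, not recommended values.

Parallel non-MPI job (e.g. multithreaded/OpenMP), one task with 8 cores on a single node:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
srun ./my_threaded_program    # placeholder executable

Parallel MPI job, 32 tasks spread over 2 nodes:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=1
srun ./my_mpi_program    # placeholder executable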
4.1.1. Job variables about CPUs#
The total number of cores allocated to your job is given by the product of the following environment variables, which Slurm sets in your batch job:
${SLURM_CPUS_PER_TASK:-1} * ${SLURM_NTASKS:-1}
The number of cores per node allocated to your job is given by:
${SLURM_CPUS_PER_TASK:-1} * ${SLURM_NTASKS:-1} / $SLURM_NNODES
Note, however, that this value is not guaranteed to be the same on each node
unless --ntasks-per-node was specified. Otherwise it is only a mean value.
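For example, inside a batch script you can evaluate both formulas directly (a minimal sketch that only mirrors the expressions above):

TOTAL_CORES=$(( ${SLURM_CPUS_PER_TASK:-1} * ${SLURM_NTASKS:-1} ))
CORES_PER_NODE=$(( TOTAL_CORES / SLURM_NNODES ))
echo "Job uses ${TOTAL_CORES} cores in total, on average ${CORES_PER_NODE} per node"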
4.2. Memory allocation#
Jobs that do not define any specific memory request will get a default
allocation per core, which is the total node memory divided by the number of
cores on the node. In most cases, the default memory allocation is sufficient,
and it is also what we recommend. If your jobs need more than the default
memory, make sure to check their memory usage (e.g. with mysacct) to
avoid allocating more resources than needed.
If your job needs a non-default amount of memory, we highly recommend specifying
the memory allocation of your job with the Slurm option --mem-per-cpu=X, which
sets the memory per core. Avoid using the --mem=X option (memory per node),
especially for multi-node jobs, because it does not guarantee equal memory
allocation for each core.
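For instance, a hypothetical single-node job with 4 cores and 4 GB per core (16 GB in total) could be requested as follows (the executable is a placeholder):

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=4G
srun ./my_program    # placeholder executable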
The default memory unit is megabytes, but you can specify different units using
one of the following one-letter suffixes: K, M, G or T. For example, to request
2GB per core you can use --mem-per-cpu=2000 or --mem-per-cpu=2G.
If your job needs more than 240GB of memory, you have to request a high-memory
node with --partition=zen5_himem. These nodes provide up to 1.5TB of memory.
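As a sketch, a single-node job needing roughly 512GB could combine the high-memory partition with a per-core request like this (the core count and executable are placeholders):

#!/bin/bash
#SBATCH --partition=zen5_himem
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --mem-per-cpu=16G    # 32 cores x 16 GB = 512 GB in total
srun ./my_memory_hungry_program    # placeholder executable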