Assign a unique feature name to the node in the Slurm configuration file (slurm.conf). For example, you can add the following line:

NodeName=node1 Features=explicit-demand

Replace node1 with the name of your node, and explicit-demand with the feature name you want to assign to it.
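
Features accepts a comma-separated list, and node names can be given as ranges, so the same line scales to several nodes. A minimal sketch (the node range and the second feature name here are hypothetical):

NodeName=node[1-4] Features=explicit-demand,highmem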

Restart the Slurm controller daemon so it picks up the new configuration:

sudo systemctl restart slurmctld
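
If you would rather not restart the daemon, you can also change a node's features on the fly with scontrol (changes made this way are typically not preserved across a slurmctld restart unless you also add them to slurm.conf):

sudo scontrol update NodeName=node1 AvailableFeatures=explicit-demand ActiveFeatures=explicit-demand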

In your job submission command or script, request the feature with the --constraint option. For example:

sbatch --constraint=explicit-demand myjob.sh
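
The constraint can also live inside the batch script as an #SBATCH directive, and multiple features can be combined with & (AND) or | (OR). A minimal sketch of myjob.sh (the job name is made up):

#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --constraint=explicit-demand
srun hostname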

Alternatively, to reserve the node for jobs that require a GPU, you can create a separate partition for GPU jobs only and configure the node to belong to that partition. Here's an example configuration for your case:

# Define compute node
NodeName=rtx Gres=gpu:1 CPUs=16 Sockets=1 CoresPerSocket=8 ThreadsPerCore=2 CPUSpecList=0,1 State=UNKNOWN Weight=80
# Define GPU partition
PartitionName=gpu Nodes=rtx State=UP
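
Note that declaring Gres=gpu:1 also requires the gpu GRES type to be enabled in slurm.conf and the device to be listed in gres.conf on the node. A minimal sketch, assuming a single NVIDIA GPU (the device path is an assumption):

# slurm.conf
GresTypes=gpu
# gres.conf on node rtx
Name=gpu File=/dev/nvidia0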

After adding this configuration to your slurm.conf file, restart the slurmctld and slurmd services so the changes take effect:
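
sudo systemctl restart slurmctld   # on the controller host
sudo systemctl restart slurmd      # on the rtx node

You can then submit jobs to the gpu partition using the --partition option, like this: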

srun --partition=gpu echo hi
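
To actually be allocated the GPU (and not just land on the node), also request the GRES. A sketch, assuming nvidia-smi is available on the node:

srun --partition=gpu --gres=gpu:1 nvidia-smi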

This reserves the rtx node for jobs submitted to the gpu partition and prevents all other jobs from running on that node.

You will also need to add a default partition for the other jobs:

# Define default partition (here on a second node named def)
PartitionName=general Nodes=def Default=YES MaxTime=INFINITE State=UP

All other jobs will then run on this default partition. The GPU partition is only used when explicitly requested, even if the default partition has no available nodes, as the following example shows:

> sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
gpu          up   infinite      1   idle rtx
general*     up   infinite      1   unk* def
> srun --partition=gpu echo hi
hi
> srun echo hi
srun: Required node not available (down, drained or reserved)
srun: job 11 queued and waiting for resources
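
In this output the default node is still in an unknown state, so the plain srun job queues instead of running. Once slurmd is up on that node, it can usually be brought back into service with scontrol (node name as in the example above):

sudo scontrol update NodeName=def State=RESUME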
