The supercomputer system provides a queue for each service course you have applied for, and you specify the queue in which your jobs are executed.
Every queue has three computing-resource limits (minimum, standard, and maximum), which together determine the amount of computing resources available in the queue.
Contents | Meaning |
---|---|
Minimum Resources (Rmin) | Number of CPU cores always secured for the queue |
Standard Resources (Rstd) | Maximum number of CPU cores that a single job submitted to the queue can use |
Maximum Resources (Rmax) | Maximum number of CPU cores available in the entire queue |
The minimum resources are the number of CPU cores that are always secured for the queue. They are also called the minimum guaranteed resources.
During regular system operation, the number of CPU cores set as the minimum resources is guaranteed regardless of system congestion.
The standard resources are the maximum number of CPU cores that a job submitted to the queue can use.
You can specify the amount of resources to allocate to a job with the --rsc option when you submit it, but requesting more than the standard resources causes an error.
$ sbatch --rsc p=400 ./jobscript.sh
Too many processors requested. Job not submitted.
The maximum resources are the maximum number of CPU cores available in the entire queue. When system resources are available, jobs submitted to the queue are executed in order, and multiple jobs may be running at the same time. However, the total number of cores used by all jobs in the queue cannot exceed the maximum resources. Once the number of CPU cores used by the entire queue reaches the maximum resources, further jobs stay in the waiting state even if system resources are available.
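The interaction between the standard and maximum resources can be sketched as a small shell function. This is only an illustration of the rules described above, not the scheduler's actual implementation: the function name `queue_decision` and the example values (Rstd = 200, Rmax = 400) are assumptions, and the minimum resources and overall system availability are not modeled.

```shell
# Hypothetical helper (not part of the system): classify a job request
# against the queue limits described above.
# Arguments: Rstd, Rmax, cores currently used by the queue, cores requested.
queue_decision() {
    rstd=$1; rmax=$2; in_use=$3; requested=$4
    if [ "$requested" -gt "$rstd" ]; then
        echo "rejected"   # exceeds the standard resources: error at submission
    elif [ $((in_use + requested)) -gt "$rmax" ]; then
        echo "waiting"    # the queue would exceed its maximum resources
    else
        echo "runnable"   # within the queue limits (system availability permitting)
    fi
}

# Illustrative values only: Rstd=200, Rmax=400
queue_decision 200 400 0 400     # rejected
queue_decision 200 400 300 200   # waiting
queue_decision 200 400 100 200   # runnable
```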
【Amendment】
Option | Description | System A Initial Value | System A Maximum Value | System B Initial Value | System B Maximum Value | System C Initial Value | System C Maximum Value |
---|---|---|---|---|---|---|---|
p | Number of processes | 1 | Standard Resources (when c=1) | 1 | Standard Resources (when c=1) | 1 | Standard Resources (when c=1) |
t | Number of threads per process | 1 | 112 (224) | 1 | 112 (224) | 1 | 112 (224) |
c | Number of cores per process | 1 | 112 | 1 | 112 | 1 | 112 |
m | Amount of memory per process (unit: M, G) | 1071M | 117G | 4571M | 500G | 18392M | 2011G |
Option | Description | System G Initial Value | System G Maximum Value | Cloud Initial Value | Cloud Maximum Value |
---|---|---|---|---|---|
p | Number of processes | 1 | Standard Resources (when c=1) | 1 | 1 |
t | Number of threads per process | 1 | 64 (128) | 1 | 36 (72) |
c | Number of cores per process | 1 | 64 | 1 | 36 |
m | Amount of memory per process (unit: M, G) | 8000M | 500G | 14222M | 500G |
g | Number of GPUs | 1 | Standard Resources | - | - |
The value in parentheses for t is the maximum value when hyper-threading is enabled. To enable hyper-threading, specify t = c × 2.
On System G, you can request resources with either the p, t, c, and m options or the g option.
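As a rough illustration of the hyper-threading rule, the following shell function assembles an --rsc value from the number of processes, cores per process, and memory per process. The function name `rsc_value` and its `ht` argument are hypothetical. Setting t equal to c when hyper-threading is not requested is an assumption (one thread per core), not a rule stated by the system.

```shell
# Hypothetical helper (not part of the system): print an --rsc value for
# p processes with c cores and m memory each.
# An optional fourth argument "ht" applies the hyper-threading rule t = c * 2;
# otherwise t = c is assumed (one thread per core).
rsc_value() {
    p=$1; c=$2; m=$3; mode=${4:-}
    if [ "$mode" = "ht" ]; then
        t=$((c * 2))
    else
        t=$c
    fi
    echo "p=${p}:t=${t}:c=${c}:m=${m}"
}

rsc_value 1 112 500G ht   # p=1:t=224:c=112:m=500G
rsc_value 6 4 20G         # p=6:t=4:c=4:m=20G
```

The printed value could then be passed to the --rsc option, for example `sbatch --rsc "$(rsc_value 6 4 20G)" ./jobscript.sh`.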
A group course queue is an individual queue for a group that has applied for a group course.
The minimum, standard, and maximum resources are determined as follows.
Contents | Value |
---|---|
Minimum Resources | Value calculated from course type and standard resources |
Standard Resources | Value indicated in the application form |
Maximum Resources | Varies between 1 and 2 times the standard resources |
The large-scale job course queue is an individual queue for those who have applied for the large-scale job course.
The minimum, standard, and maximum resources are determined as follows.
Contents | Value |
---|---|
Minimum Resources | Same value as the standard resources |
Standard Resources | Value indicated in the application form |
Maximum Resources | Same value as the standard resources |
The following examples illustrate how the --rsc option works. The option behaves the same way whether it is specified for a batch job or with tssrun.
The examples assume System B, but the rules for the --rsc option are the same on the other systems.
#SBATCH --rsc p=1:t=4:c=4:m=8G
#SBATCH --rsc p=60:t=1:c=1
#SBATCH --rsc p=6:t=4:c=4:m=20G
#SBATCH --rsc p=1:t=224:c=112:m=500G
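To see how many CPU cores each of these requests counts against the queue limits, the total can be computed as p × c (number of processes times cores per process). The function below is a minimal sketch for checking a request by hand. The name `rsc_total_cores` is hypothetical, and omitted options are assumed to fall back to the initial value of 1 listed in the option table.

```shell
# Hypothetical helper (not part of the system): print the total number of
# cores (p * c) requested by an --rsc value such as "p=6:t=4:c=4:m=20G".
# Omitted p or c falls back to the initial value of 1.
rsc_total_cores() {
    p=1; c=1
    old_ifs=$IFS
    IFS=:                          # split the value on ":" into key=value pairs
    for kv in $1; do
        case $kv in
            p=*) p=${kv#p=} ;;     # number of processes
            c=*) c=${kv#c=} ;;     # cores per process
        esac
    done
    IFS=$old_ifs
    echo $((p * c))
}

rsc_total_cores "p=1:t=4:c=4:m=8G"         # 4
rsc_total_cores "p=60:t=1:c=1"             # 60
rsc_total_cores "p=6:t=4:c=4:m=20G"        # 24
rsc_total_cores "p=1:t=224:c=112:m=500G"   # 112
```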