Use the export command to set environment variables, and prefix a variable name with "$" to reference it.
Setting the Environment Variables
# Format: <environment variable name>=<value>; export <environment variable name>
LANG=en_US.UTF-8; export LANG
Referencing the Environment Variables
echo $LANG
Environment Variable Name | Meaning |
---|---|
$SLURM_CPUS_ON_NODE | Number of cores on the node |
$SLURM_DPC_CPUS | Number of physical cores per task |
$SLURM_CPUS_PER_TASK | Number of logical cores per task (twice the number of physical cores) |
$SLURM_JOB_ID | Job ID (use $SLURM_ARRAY_JOB_ID for array jobs) |
$SLURM_ARRAY_JOB_ID | Parent job ID when executing an array job |
$SLURM_ARRAY_TASK_ID | Task ID when executing an array job |
$SLURM_JOB_NAME | Job name |
$SLURM_JOB_NODELIST | Names of the nodes allocated to the job |
$SLURM_JOB_NUM_NODES | Number of nodes allocated to the job |
$SLURM_LOCALID | Node-local index of the task on the executing node |
$SLURM_NODEID | Index of the node within the nodes allocated to the job |
$SLURM_NTASKS | Number of processes for the job |
$SLURM_PROCID | Index (rank) of the task within the job |
$SLURM_SUBMIT_DIR | Directory from which the job was submitted |
$SLURM_SUBMIT_HOST | Host from which the job was submitted |
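For reference, a minimal job script sketch that prints some of these variables to the job output is shown below. The partition name and --rsc values follow the samples used elsewhere on this page and should be adjusted to your environment.
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH -t 0:10:0
#SBATCH --rsc p=2:t=1:c=1:m=1G
#SBATCH -o %x.%j.out
#============ Shell Script ============
# Record where and how the job runs (written to the job output file)
echo "Job ID     : ${SLURM_JOB_ID}"
echo "Job name   : ${SLURM_JOB_NAME}"
echo "Node list  : ${SLURM_JOB_NODELIST} (${SLURM_JOB_NUM_NODES} node(s))"
echo "Processes  : ${SLURM_NTASKS}"
echo "Submit dir : ${SLURM_SUBMIT_DIR}"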
If you run the newgrp command without a hyphen, such as newgrp gr19999, the environment variable LD_LIBRARY_PATH is not inherited in the new group environment. On the other hand, if you add a hyphen, such as newgrp - gr19999, the environment variables are reset to their initial values at login, so LD_LIBRARY_PATH is restored; however, the DISPLAY environment variable is lost, which causes problems when connecting with X11 software. Therefore, we provide the reload_modules command, which restores the environment variables set by the modules that were loaded before newgrp was executed.
# State of LD_LIBRARY_PATH before executing newgrp
[b59999@laurel31 ~]$ echo $LD_LIBRARY_PATH
/opt/system/app/intel/2023.1/mpi/2021.9.0/libfabric/lib:...
# State of LD_LIBRARY_PATH after executing newgrp
[b59999@laurel31 ~]$ newgrp gr19999
[b59999@laurel31 ~]$ echo $LD_LIBRARY_PATH
(The contents have been cleared.)
# LD_LIBRARY_PATH is reset by executing reload_modules
[b59999@laurel31 ~]$ reload_modules
[b59999@laurel31 ~]$ echo $LD_LIBRARY_PATH
/opt/system/app/intel/2023.1/mpi/2021.9.0/libfabric/lib:...
Please note that the following options cannot be specified following #SBATCH in the job script.
Option | | | | |
---|---|---|---|---|
--batch | --clusters(-M) | --constraint(-C) | --contiguous | --core-spec(-S) |
--cores-per-socket | --cpus-per-gpu | --cpus-per-task(-c) | --distribution(-m) | --exclude(-x) |
--exclusive | --gpu-bind | --gpus(-G) | --gpus-per-node | --gpus-per-socket |
--gres | --gres-flags | --mem | --mem-bind | --mem-per-cpu |
--mem-per-gpu | --mincpus | --nodefile(-F) | --nodelist(-w) | --nodes(-N) |
--ntasks(-n) | --ntasks-per-core | --ntasks-per-gpu | --ntasks-per-node | --ntasks-per-socket |
--overcommit(-O) | --oversubscribe(-s) | --qos(-q) | --sockets-per-node | --spread-job |
--switches | --thread-spec | --threads-per-core | --use-min-nodes | |
--get-user-env | --gid | --priority | --reboot | --uid |
The /tmp area can be used as a temporary data write destination on our supercomputer system. Programs that perform frequent, fine-grained file I/O may run faster by using /tmp.
/tmp is a private area for each job, so files are never mixed with those of other jobs. Please take advantage of this feature.
Please note that the /tmp area is automatically deleted at the end of the job. To keep files written to /tmp, you must add a step in the job script that copies them to /home, /LARGE0, or /LARGE1. Deleted files cannot be recovered later.
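For reference, a minimal job script sketch that uses /tmp as a work area and copies the results back before the job ends is shown below. The partition, --rsc values, program name, and result file name are placeholders.
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH -t 2:0:0
#SBATCH --rsc p=1:t=1:c=1:m=8G
#SBATCH -o %x.%j.out
#============ Shell Script ============
# Run the program with /tmp as the working directory so that
# intermediate files go to the fast, job-private area
cd /tmp
srun ${SLURM_SUBMIT_DIR}/a.out
# /tmp is deleted automatically when the job ends, so copy the
# results (here, the placeholder file result.dat) back to permanent storage
cp /tmp/result.dat ${SLURM_SUBMIT_DIR}/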
The /tmp capacity available to a job is calculated as follows, using the capacity per core of each system (see the following table).
Number of processes x Number of cores per process x Capacity per core
System Name | Capacity per core |
---|---|
System A | 2.4GB |
System B | 8.9GB |
System C | 8.9GB |
System G | 15.6GB |
Cloud | 94GB |
For example, if a job with 4 processes and 8 cores per process is submitted on System B, 4 x 8 x 8.9GB = 284.8GB of /tmp will be allocated.
You can control when a job starts according to the execution status of jobs that have already been submitted. Specifically, add the following to the job submission options section.
#SBATCH -d <command specifying the execution order>:<job ID of a previously submitted job>
For example, if you want job script B to run only after job script A has completed successfully, use the following procedure.
1. Submit job script A.
2. If the job ID of the job submitted in step 1 is "200", write job script B as follows.
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH -t 2:0:0
#SBATCH --rsc p=4:t=8:c=8:m=8G
#SBATCH -d afterok:200
#SBATCH -o %x.%j.out
#============ Shell Script ============
srun ./a.out
3. Submit job script B.
You can choose the command specifying the execution order from the following four options.
Command specifying the execution order | Meaning |
---|---|
after | Execute the job after the specified job starts. |
afterany | Execute the job after the specified job ends, regardless of how it ends. |
afterok | Execute the job when the specified job ends normally. If the specified job ends abnormally, the job ends without being executed. |
afternotok | Execute the job when the specified job ends abnormally. If the specified job ends normally, the job ends without being executed. |
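As a reference, the job ID of a previously submitted job can also be captured automatically instead of being typed by hand. The following is a minimal sketch, assuming job scripts named jobscript_A.sh and jobscript_B.sh; the --parsable option makes sbatch print only the job ID.
# Submit job script A and capture its job ID
JOBID_A=$(sbatch --parsable jobscript_A.sh)
# Submit job script B so that it starts only if A ends normally
sbatch -d afterok:${JOBID_A} jobscript_B.sh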
With the array job function, you can execute multiple jobs with different parameters from a single job script. Specifically, add the following to the job submission options section.
#SBATCH -a <start_num>-<end_num>[option]
For example, the following job script passes 1.data, 2.data, and 3.data, placed in the same directory as the job script, to ./a.out for analysis.
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH -t 2:0:0
#SBATCH --rsc p=4:t=8:c=8:m=8G
#SBATCH -a 1-3
#SBATCH -o %x.%A_%a.out
## %x is replaced by the job name, %A by the array job ID, and %a by the array task ID.
#============ Shell Script ============
srun ./a.out ${SLURM_ARRAY_TASK_ID}.data
The following options can be set in the [option] field.
Description in [option] | Execution details |
---|---|
:[0-9] | Execute tasks at the specified interval. For example, if you specify "1-5:2", tasks 1, 3, and 5 will be executed. |
%[0-9] | Limit the number of tasks executed simultaneously to the specified number. For example, if you specify "%2", at most 2 tasks run at the same time. |
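The array task ID can also be used to select parameters rather than file names. The following is a minimal sketch, assuming a hypothetical file params.txt that lists one parameter set per line; each array task reads the line corresponding to its task ID.
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH -t 2:0:0
#SBATCH --rsc p=4:t=8:c=8:m=8G
#SBATCH -a 1-3
#SBATCH -o %x.%A_%a.out
#============ Shell Script ============
# Read the line of params.txt that corresponds to this array task ID
PARAM=$(sed -n "${SLURM_ARRAY_TASK_ID}p" params.txt)
srun ./a.out ${PARAM}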
You can use computing resources effectively by executing multiple programs simultaneously as a single job. To do so, describe the execution commands in a shell script and run that shell script from the job script.
※This method is for sequential programs or programs that are thread-parallelized by OpenMP or the automatic parallelization function. ※To execute multiple MPI programs simultaneously as a single job, please refer to How to execute multiple programs simultaneously (MPMD).
An example of each script is shown below.
Job script to be executed by the sbatch command (sequential)
※The following script is a sample that executes four programs simultaneously.
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH --rsc p=4:t=1:c=1:m=3G
#SBATCH -o %x.%j.out
#============ Shell Script ============
srun ./multiprocess.sh
Shell script to be executed in the job script (sequential)
#!/bin/bash
case $SLURM_PROCID in
0) ./a1.out ;;
1) ./a2.out ;;
2) ./a3.out ;;
3) ./a4.out ;;
esac
Job script to be executed by the sbatch command (OpenMP)
※The following script is a sample that executes 8 programs simultaneously, each running with 4 threads.
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH -t 2:0:0
#SBATCH --rsc p=8:t=4:c=4:m=3G
#SBATCH -o %x.%j.out
#============ Shell Script ============
srun ./multiprocess.sh
Shell script to be executed in the job script (OpenMP)
#!/bin/bash
case $SLURM_PROCID in
0) ./aaa.out ;;
1) ./bbb.out ;;
2) ./ccc.out ;;
3) ./ddd.out ;;
4) ./eee.out ;;
5) ./fff.out ;;
6) ./ggg.out ;;
7) ./hhh.out ;;
esac
By specifying "#SBATCH --rsc t=4" in the job script, the job scheduler automatically sets 4 threads per process, so each program in the above shell script runs with 4 threads.
If you want to vary the number of threads per process, set OMP_NUM_THREADS={number of threads you want to use} for each process as follows.
#!/bin/bash
case $SLURM_PROCID in
0) export OMP_NUM_THREADS=1; ./aaa.out ;;
1) export OMP_NUM_THREADS=2; ./bbb.out ;;
2) export OMP_NUM_THREADS=2; ./ccc.out ;;
3) export OMP_NUM_THREADS=3; ./ddd.out ;;
4) export OMP_NUM_THREADS=3; ./eee.out ;;
5) export OMP_NUM_THREADS=4; ./fff.out ;;
6) export OMP_NUM_THREADS=4; ./ggg.out ;;
7) export OMP_NUM_THREADS=4; ./hhh.out ;;
esac
Job script to be executed by the sbatch command (MPI)
※The following script is a sample that concurrently executes 3 MPI programs, each requiring 4 processes (4 cores and 4 threads per process).
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH -t 2:0:0
#SBATCH --rsc p=12:t=4:c=4:m=3G
#SBATCH -o %x.%j.out
#============ Shell Script ============
srun --multi-prog multiprocess.conf
Configuration file (multiprocess.conf)
## Execute each program as a 4-process MPI program
0-3 ./aaa.out
4-7 ./bbb.out
8-11 ./ccc.out
The number of processes can be changed for each MPI program. For example, if you want aaa.out to use 4 processes and bbb.out to use 8 processes, create the following configuration file. However, the number of cores, threads, and memory per process is automatically set to the values specified by the --rsc option, so it cannot be customized for each MPI program.
## Execute as a 4-process MPI program
0-3 ./aaa.out
## Execute as an 8-process MPI program
4-11 ./bbb.out
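A job script sketch to run this configuration file would request 12 processes in total (4 for aaa.out and 8 for bbb.out) with the --rsc option, in the same way as the MPI example above; the partition, time limit, and memory values below are placeholders.
#!/bin/bash
#============ SBATCH Directives =======
#SBATCH -p gr19999b
#SBATCH -t 2:0:0
#SBATCH --rsc p=12:t=4:c=4:m=3G
#SBATCH -o %x.%j.out
#============ Shell Script ============
# Ranks 0-3 run aaa.out and ranks 4-11 run bbb.out,
# as defined in the configuration file
srun --multi-prog multiprocess.conf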