Execution of Serial Program

This page walks through creating a program in C, compiling it, and executing it as a job on the supercomputer.
For how to log in to the supercomputer, please refer to Access.

This section explains how to create and compile the sample program (sample.c).

We create the sample program with "emacs" here. Other editors installed as standard on the system, such as "vi (vim)", can also be used.

Start "emacs" with the following command and open the empty file “sample.c”.

emacs sample.c

Enter the following program manually or copy and paste it.
In many terminal programs, paste is assigned to the right click or wheel click of the mouse.

#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("hello\n\n");
    sleep(120);
    return 0;
}

When input is completed, press "Ctrl + x Ctrl + s" to save. “Ctrl + x” means “press "x" key while holding down "Ctrl" key on the keyboard”.
When saving is completed, the following message is displayed at the bottom of the screen.

Wrote /home/b/b59999/sample.c

To exit emacs, press "Ctrl + x Ctrl + c".
Check the created file by typing "ls -l" in the terminal. If "sample.c" is listed as shown below, the file was created successfully.

-rw-r----- 1 b59999 b59999 75 Dec 18 10:42 sample.c

Enter the following command to compile. The compile command depends on the system.

| System | Command |
| --- | --- |
| A | cc sample.c |
| B or C | icc sample.c |
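
The table above can be folded into a small shell helper when the same steps are reused on more than one system. This is only a sketch: the function name `compile_command` is hypothetical, and the system letter has to be supplied by the user, because this page does not describe a way to detect it.

```shell
#!/bin/bash
# Hypothetical helper: print the compile command for a given system letter,
# following the table above (System A: cc, System B or C: icc).
compile_command() {
    case "$1" in
        A)   echo "cc sample.c" ;;
        B|C) echo "icc sample.c" ;;
        *)   echo "unknown system: $1" >&2; return 1 ;;
    esac
}

# Example: show the command for System B without running it.
compile_command B
```

Printing the command instead of running it keeps the sketch usable on a machine where neither compiler is installed.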

The compiled program is named “a.out”.
If you type "ls -l" in the terminal and the file name "a.out" is displayed as shown below, there is no problem.

-rw-r----- 1 b59999 b59999 9817 Dec 18 10:42 a.out

This section explains how to execute the program created in the previous section.
Because the login node is shared by many users, the program must be executed on a computing node using the following method.

A job is the unit in which a program is submitted (its execution requested) to a computing node. There are two ways to submit a job: interactive processing and batch processing. This section introduces job submission using batch processing.

When submitting a job using batch processing, you need to describe the execution process in a script called a job script.
This section explains how to create, execute, and confirm job scripts.

Job scripts have basically the same format as shell scripts. A job script consists of an option area, which describes the PBS job submission options, and a user program area (the shell script part), which describes the program to be executed.
In the user program area, in addition to executing the program itself, you can describe processing to be performed before and after it, in the same format as a shell script.
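
As a rough illustration of the two areas, the sketch below adds pre- and post-processing around the program in the user program area. It uses the System B / C form (without aprun) and the sample `#QSUB` values from this page. The `echo` steps and the `status` variable are hypothetical additions, and the existence check is there only so that the sketch can also be run outside a job.

```shell
#!/bin/bash
#============ PBS Options ============
#QSUB -q gr19999b
#QSUB -ug gr19999
#QSUB -W 10:00
#============ Shell Script ============
# Pre-processing (hypothetical): record when the job started.
echo "job started: $(date)"

# Main processing: run the compiled program if it is present.
if [ -x ./a.out ]; then
    ./a.out
    status=$?
else
    echo "a.out not found or not executable" >&2
    status=127
fi

# Post-processing (hypothetical): report the exit status of the program.
echo "job finished with status $status"
```

Because the `#QSUB` lines begin with `#`, the shell treats them as comments, so the same file remains a valid shell script.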

Start emacs with the following command and open the "sample.sh" file.

emacs sample.sh

When emacs starts, enter the following job script manually or copy and paste it.
*The user program area of the job script differs between System A and System B / C. Use the script for the system to which you submit the job. The first script below (with aprun) is for System A, and the second is for System B / C.

#!/bin/bash
#============ PBS Options ============
#QSUB -q gr19999b
#QSUB -ug gr19999
#QSUB -A p=1:c=1:t=1:m=1355M
#QSUB -W 10:00
#============ Shell Script ============
aprun -n $QSUB_PROCS -d $QSUB_THREADS -N $QSUB_PPN ./a.out
  • Note) The aprun command is required for batch processing in System A. Please refer to [here](./#aprun) for details.

#!/bin/bash
#============ PBS Options ============
#QSUB -q gr19999b
#QSUB -ug gr19999
#QSUB -A p=1:c=1:t=1:m=3413M
#QSUB -W 10:00
#============ Shell Script ============
./a.out

Next, modify the job script according to your available system and queue.

Specify the queue after #QSUB -q. The queue name for each service course is as follows.

| Service Course | Queue Name |
| --- | --- |
| Entry course | eb |
| Personal course | pa (System A), pb (System B), pc (System C) |
| Group course, organization fixed rate, HPCI, JHPCN | Queue name notified to applicant |

For example, to specify the group course queue "gr19999b", enter the following.

#QSUB -q gr19999b

The #QSUB -ug option is required for users of the group course, organization fixed rate, HPCI, and JHPCN. Users of the entry course and personal course can delete this line.

Specify the group name after #QSUB -ug. For the group course, the group name is the queue name with the last letter removed. For organization fixed rate, HPCI, and JHPCN, the group name may differ from the queue name, so please use the group name you were given.

For example, if you submit to the group course queue "gr19999b" of System B, enter the following.

#QSUB -ug gr19999
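
The rule for group course queues (remove the last letter of the queue name) can be written with shell parameter expansion. This is a minimal sketch with a hypothetical function name, `group_from_queue`. It applies only to the group course; for organization fixed rate, HPCI, and JHPCN, the notified group name should be used instead.

```shell
#!/bin/bash
# Hypothetical helper: derive the group name of a group course queue
# by removing the last character of the queue name.
group_from_queue() {
    local queue="$1"
    echo "${queue%?}"
}

# Print the two option lines for the sample queue used on this page.
queue="gr19999b"
echo "#QSUB -q $queue"
echo "#QSUB -ug $(group_from_queue "$queue")"
```

The expansion `${queue%?}` removes the shortest suffix matching `?`, which is exactly one character.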

You can specify the resources allocated to the job after #QSUB -A. The following four arguments can be specified. If an argument is not specified, its default value is set automatically.

| -A argument | Description | Default value |
| --- | --- | --- |
| p=procs | Number of processes allocated during job execution | 1 |
| t=threads | Number of threads allocated per process during job execution. The environment variable OMP_NUM_THREADS is set automatically | 1 |
| c=cores | Number of CPU cores allocated per process during job execution. Basically set to the same value as "t" | 1 |
| m=memory | Amount of memory allocated per process during job execution (unit: M, G, T) | 1355M (System A), 3413M (System B), 42666M (System C) |

Because the example here is a serial (non-parallel) program, the following line allocates one process, one core, one thread, and 1355M of memory. For parallel computation, please see Execution of MPI Program.

#QSUB -A p=1:c=1:t=1:m=1355M
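
To show how the four arguments combine into one option line, the sketch below assembles the line from its parts. The function names `default_memory` and `qsub_a_line` are hypothetical; the default memory values are the ones listed in the table above.

```shell
#!/bin/bash
# Default memory per process for each system, taken from the table above.
default_memory() {
    case "$1" in
        A) echo "1355M" ;;
        B) echo "3413M" ;;
        C) echo "42666M" ;;
        *) echo "unknown system: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: build a "#QSUB -A" line from p, c, t and m.
qsub_a_line() {
    local procs="$1" cores="$2" threads="$3" memory="$4"
    echo "#QSUB -A p=${procs}:c=${cores}:t=${threads}:m=${memory}"
}

# Serial job on System A: one process, one core, one thread, default memory.
qsub_a_line 1 1 1 "$(default_memory A)"
```

Note that the option line contains no spaces around `=` or `:`.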

You can specify the upper limit of the job execution time after #QSUB -W. The default value of the job execution time and the upper limit value which can be specified are as follows:

| Service course | Default value | Upper limit value |
| --- | --- | --- |
| Entry course | 1:00 (1 hour) | 1:00 (1 hour) |
| Personal course | 1:00 (1 hour) | 168:00 (168 hours) |
| Group course, organization fixed rate, HPCI, JHPCN | 1:00 (1 hour) | 336:00 (336 hours) |

To set the upper limit of the job execution time to 10 hours, enter the following.

#QSUB -W 10:00
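
The limits in the table above can be checked before a job script is written. The following is a sketch only: the function names and the course keywords (`entry`, `personal`, `group`) are hypothetical, and it handles whole hours only, written as `hours:00` as in the examples on this page.

```shell
#!/bin/bash
# Upper limit of the execution time in hours for each service course,
# taken from the table above. The course keywords are hypothetical labels.
walltime_limit_hours() {
    case "$1" in
        entry)    echo 1 ;;
        personal) echo 168 ;;
        group)    echo 336 ;;   # also organization fixed rate, HPCI, JHPCN
        *)        echo "unknown course: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print a "#QSUB -W" line for a whole number of hours,
# or fail if the request exceeds the limit of the course.
qsub_w_line() {
    local course="$1" hours="$2" limit
    limit=$(walltime_limit_hours "$course") || return 1
    if [ "$hours" -gt "$limit" ]; then
        echo "$hours hours exceeds the $limit hour limit" >&2
        return 1
    fi
    printf '#QSUB -W %d:00\n' "$hours"
}

# Example: a 10 hour limit for a group course job.
qsub_w_line group 10
```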

In System A, the node where the job script is executed (gateway node) is separate from the node where the program is executed (computing node). To run a program on a computing node, the aprun command **must** be used on the line of the job script that executes the program, regardless of whether it is a serial program or an MPI program.

The following are typical options of the aprun command. Please refer to the command manual (man aprun) for details of the options.

| Option | Function | Default value | Remarks |
| --- | --- | --- | --- |
| -n procs | Specify the number of processes to be started | 1 | By specifying $QSUB_PROCS, the number specified for "p" in the #QSUB -A option is inserted. |
| -d cores | Specify the number of CPU cores to be secured per process | 1 | By specifying $QSUB_THREADS, the number specified for "t" in the #QSUB -A option is inserted. |
| -N procs_per_node | Specify the number of processes per node | 1 | By specifying $QSUB_PPN, the optimal number of processes per node is automatically inserted. |
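
To see how the variables in the Remarks column expand, the sketch below prints the aprun command line instead of running it. The function name `aprun_line` is hypothetical. Inside a job the three variables are provided by the system; the fallback value 1 used here is a stand-in so that the sketch also runs outside a job.

```shell
#!/bin/bash
# Hypothetical helper: print the aprun command line that a System A job
# script would run. ${VAR:-1} falls back to 1 when the variable is unset,
# which only happens outside a job.
aprun_line() {
    echo "aprun -n ${QSUB_PROCS:-1} -d ${QSUB_THREADS:-1} -N ${QSUB_PPN:-1} ./a.out"
}

# Example: print the line for the current environment.
aprun_line
```

For the serial job on this page (p=1, t=1), the line is expected to expand to the same values as the fallbacks.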

Submit a job with the following command.

qsub sample.sh

If the job is accepted, the JOBID is displayed as shown below.

$ qsub sample.sh
1255581.jb

You can check the job status with the qs command.

Example of output result:

QUEUE     USER     JOBID    STATUS  PROC  THRD  CORE  MEM    ELAPSE( limit)
gr19999b  b59999   1255581  RUN        1     1     1  1355M   00:00( 10:00)

When the "STATUS" is "RUN", the program is running on the computing node.
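
If the status is needed in a script, it can be picked out of the qs output by column. The sketch below assumes the column layout of the sample output on this page (JOBID in the third column, STATUS in the fourth); the function name `job_status` is hypothetical, and the sample text stands in for a real `qs` call.

```shell
#!/bin/bash
# Hypothetical helper: read qs-style output on standard input and print the
# STATUS column (4th field) of the row whose JOBID (3rd field) matches.
job_status() {
    awk -v id="$1" '$3 == id { print $4 }'
}

# Sample text based on the output shown on this page.
# In practice the input would come from the command itself: qs | job_status 1255581
sample_output='QUEUE     USER     JOBID    STATUS  PROC  THRD  CORE  MEM    ELAPSE( limit)
gr19999b  b59999   1255581  RUN        1     1     1  1355M   00:00( 10:00)'

printf '%s\n' "$sample_output" | job_status 1255581
```

If no row matches the given JOBID, the helper prints nothing.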
When the job ends, the standard output of the program is written to a file whose extension is “o + number”, such as “B121811.o1255581”. The number is the JOBID.
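
The naming rule above can be used to locate the output file from the JOBID printed by qsub. This is a sketch with hypothetical function names (`job_number`, `stdout_files`). Since this page does not explain the part of the file name before the extension, the sketch matches it with a wildcard.

```shell
#!/bin/bash
# Hypothetical helper: strip the suffix from a JOBID such as "1255581.jb".
job_number() {
    echo "${1%%.*}"
}

# Hypothetical helper: list files in the current directory whose extension
# is "o" + job number. Prints nothing if no such file exists yet.
stdout_files() {
    local number
    number=$(job_number "$1")
    ls -- *.o"$number" 2>/dev/null
}

# Example with the JOBID shown on this page.
job_number 1255581.jb
stdout_files 1255581.jb
```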

Use the ls command to find the standard output file, and open it. If the job ran normally, the output "hello" is shown, followed by a job summary, as below.

[b59999@laurel3 b]$ ls
B121811.o1255581 B121811.e1255581

[b59999@laurel3 b]$ cat B121811.o1255581
hello

================================================================================

                Resource Usage on 2017-12-18 11:50:04.791317:

JobId: 1255581.jb
        Job_Name = B1255581
        queue = gr19999b
        Resource_List.Aoption = gr19999:p=1:t=1:c=1:m=1355:c_dash=1
        Resource_List.select = 1:ncpus=2:mpiprocs=1:mem=1355mb:jobfilter=long
        qtime = 2017-12-18 11:49:58
        stime = 2017-12-18 11:50:04
        resource_used.cpupercent = 0
        resource_used.cput = 00:00:00
        resources_used.mem = 0b
        resources_used.ncpus = 2
        resources_used.vmem = 0b
        resources_used.walltime = 00:00:02

================================================================================

A file with the extension “e + number” contains the standard error output.
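
A quick way to check whether a job wrote anything to standard error is to test whether this file has any content. The following is a minimal sketch; the function name `show_job_errors` is hypothetical, and the file name is the sample name used on this page.

```shell
#!/bin/bash
# Hypothetical helper: print the standard error file of a job if it has
# any content, and say so if it is empty or does not exist.
show_job_errors() {
    local file="$1"
    if [ -s "$file" ]; then
        echo "messages in $file:"
        cat -- "$file"
    else
        echo "$file is empty or missing"
    fi
}

# Example with the sample file name.
show_job_errors B121811.e1255581
```

The test `[ -s file ]` is true only when the file exists and its size is greater than zero.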

In the above tutorial, the sample program was executed in a single process in the home directory.

For full-scale calculations, parallel execution with multiple processes and multiple threads using MPI and OpenMP is necessary. This requires #QSUB -A settings that are not covered by the job script in this tutorial. Also, because the default execution time of a job is 1 hour, the #QSUB -W option is required for longer calculations.
For details, please refer to the following page for your system.

If you are not in the entry course, you can use large-capacity disk space in addition to the home directory (capacity: 100 GB). Please refer to the following page for details.

Several types of commercial tools are also available for compilers and calculation libraries. Please refer to the following page for details.

Sample job scripts are provided for each execution type. Please refer to the following page for details.

This concludes the explanation of compiling and executing your application.


Copyright © Academic Center for Computing and Media Studies, Kyoto University, All Rights Reserved.