Parallel and distributed computing

How to program parallel computers?

OpenMP

../../_images/openmp-model.png

The fork-and-join parallel model. Figure taken from https://docs.nersc.gov/development/programming-models/openmp/
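The fork-and-join pattern itself can be sketched in a few lines. The example below is only an analogy written in Python with a thread pool, not OpenMP code. The names `partial_sum` and `fork_join_sum_of_squares` and the choice of four workers are assumptions made for illustration. A serial region splits the work, a team of workers is forked to process the chunks, and the main thread waits at the join before combining the results.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by a single worker inside the parallel region."""
    return sum(x * x for x in chunk)


def fork_join_sum_of_squares(data, num_workers=4):
    # Serial region: the main thread splits the data into one chunk per worker.
    chunks = [data[i::num_workers] for i in range(num_workers)]

    # Fork: a team of workers is started and each one processes its chunk.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Join: leaving the `with` block waits until every worker has finished.

    # Serial region again: the main thread combines the partial results.
    return sum(partials)


total = fork_join_sum_of_squares(list(range(1000)))
print(total)
```

In OpenMP the same structure is usually expressed with compiler directives in C, C++, or Fortran, and the runtime creates and joins the thread team.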

Message Passing Interface (MPI)

../../_images/mpi-basic.png

Basic MPI communication between two processors. Figure taken from https://www.cs.mtsu.edu/~rbutler/courses/pp6330/www.llnl.gov/computing/tutorials/workshops/workshop/mpi/MAIN.html.

Popular implementations of the MPI standard include MPICH and Open MPI. (Open MPI is an MPI library and is unrelated to OpenMP, despite the similar name.)

GPU programming models

../../_images/gpu_acceleration.png

Heterogeneous programming model. Figure taken from https://es.mathworks.com/help/gpucoder/gs/gpu-prog-paradigm.html


Sharing a cluster with others

Scheduling jobs and managing resources is one of the key features that sets HPC clusters apart from other types of computing systems. Since an HPC cluster can have hundreds of users and compute nodes, it would be impractical to decide who gets which resources without a queuing system. Specialized software called a scheduler decides which user's job runs when, and on which resources (nodes).
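To make the scheduler's role concrete, here is a toy simulation of a strict first-come, first-served queue. It is a minimal sketch under stated assumptions, not how any particular scheduler is implemented. The `Job` class, the `fifo_schedule` function, the user names, and the 4-node cluster are all hypothetical. Each job requests a number of nodes and a run time, and a job starts only when enough nodes are free.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    user: str
    nodes: int    # number of nodes requested
    runtime: int  # run time in arbitrary time units


def fifo_schedule(jobs, total_nodes):
    """Return {job index: (start, end)} for strict first-come, first-served order."""
    free = total_nodes
    now = 0
    running = []  # min-heap of (end_time, nodes_to_release)
    schedule = {}
    for idx, job in enumerate(jobs):
        if job.nodes > total_nodes:
            raise ValueError(f"job {idx} requests more nodes than the cluster has")
        # Wait for running jobs to finish until the request fits.
        while free < job.nodes:
            end, released = heapq.heappop(running)
            now = max(now, end)
            free += released
        # Start the job and record when its nodes become free again.
        free -= job.nodes
        schedule[idx] = (now, now + job.runtime)
        heapq.heappush(running, (now + job.runtime, job.nodes))
    return schedule


# Hypothetical queue on a hypothetical 4-node cluster.
jobs = [
    Job("alice", nodes=2, runtime=10),
    Job("bob", nodes=2, runtime=5),
    Job("carol", nodes=4, runtime=3),
    Job("dave", nodes=1, runtime=2),
]
schedule = fifo_schedule(jobs, total_nodes=4)
for idx, (start, end) in schedule.items():
    print(f"{jobs[idx].user}: start={start}, end={end}")
```

In this sketch the 1-node job waits behind the 4-node job even though two nodes sit idle from time 5 to time 10. Production schedulers typically go beyond strict queue order, for example by weighing job priorities or by backfilling small jobs into such gaps.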

../../_images/restaurant_queue_manager.png

An illustration of an HPC job manager, using a restaurant as an analogy. Figure taken from https://epcced.github.io/hpc-intro/13-scheduler/index.html.