Multithreaded MPI-1 Benchmarks
The IMB-MT component of the Intel® MPI Benchmarks provides benchmarks
for some of the MPI-1 functions, running in multiple threads. This implies
the use of the MPI_THREAD_MULTIPLE mode and
execution of several threads per rank, each performing the communication.
The design of multithreaded benchmarks is based on the following key
principles:
- To make the communication patterns meaningful, the benchmark has
  to meet the following requirements:
  - Data must be distributed between threads. So that no two threads
    transfer the same data, the input and output data of a
    multithreaded communication must be divided between the threads.
    This must be done before the main benchmarking loop starts.
  - The communication pattern must ensure a deterministic order of
    data sends and receives. For point-to-point MPI-1 communications,
    this could be done by separating the thread message flows with
    tags. Tags, however, are unavailable for collective MPI-1
    communications. As a result, a different method is used for both
    collective and point-to-point benchmarks: each thread uses
    its own MPI communicator.
- Thread control inside a rank is performed using the OpenMP* API.
IMB-MT benchmarks are always run as if the multiple mode is enabled.
Additionally, they run with the maximum number of processes available
for a benchmark, rather than running multiple times with an increasing
number of processes.
The following benchmarks are available within the IMB-MT component:
- PingPongMT
- PingPingMT
- SendrecvMT
- ExchangeMT
- UnibandMT
- BibandMT
- BcastMT
- ReduceMT
- AllreduceMT
See Also
Command-Line Control for IMB-MT Benchmarks