Multithreaded MPI-1 Benchmarks

The IMB-MT component of the Intel(R) MPI Benchmarks provides benchmarks for some of the MPI-1 functions, run from multiple threads. This requires the MPI_THREAD_MULTIPLE mode and several threads per rank, each of which performs communication.

The design of the multithreaded benchmarks is based on the following key principles, which a benchmark must meet for its communication patterns to be meaningful:

  • Data must be distributed between threads. To keep threads from transferring the same data, the input and output data must be partitioned among the threads before the main benchmarking loop starts.

  • The communication pattern must ensure a deterministic order of sends and receives. For point-to-point MPI-1 communications, this could be done by separating the per-thread message flows with tags. Tags, however, are unavailable for collective MPI-1 communications. As a result, both the collective and the point-to-point benchmarks use a different method: each thread communicates over its own MPI communicator.
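The first requirement can be illustrated with a small partitioning helper. This is a minimal sketch, not the IMB-MT implementation: the names slice_t and thread_slice are hypothetical, and the block distribution shown (remainder elements go to the lowest-numbered threads) is only one common way to give each thread a disjoint part of a buffer.

```c
#include <stddef.h>

/* Hypothetical type: the part of a shared buffer owned by one thread. */
typedef struct {
    size_t offset; /* index of the first element owned by the thread */
    size_t count;  /* number of elements owned by the thread */
} slice_t;

/*
 * Hypothetical helper (an assumption for illustration, not IMB-MT code).
 * Splits `total` elements into `nthreads` contiguous, non-overlapping
 * slices and returns the slice for thread `tid` (0 <= tid < nthreads).
 * When `total` is not divisible by `nthreads`, the first
 * `total % nthreads` threads receive one extra element each.
 */
slice_t thread_slice(size_t total, int nthreads, int tid)
{
    size_t n    = (size_t)nthreads;
    size_t t    = (size_t)tid;
    size_t base = total / n;
    size_t rem  = total % n;
    slice_t s;

    s.count  = base + (t < rem ? 1 : 0);
    s.offset = t * base + (t < rem ? t : rem);
    return s;
}
```

For example, with total = 10 and nthreads = 4, the four threads own the (offset, count) pairs (0, 3), (3, 3), (6, 2), and (8, 2), which cover the buffer exactly once. Computing the slices before the timed loop keeps the partitioning cost out of the measurement.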

Thread control inside a rank is performed using the OpenMP* API.

Note

IMB-MT benchmarks always run as if multiple mode were enabled. Additionally, each benchmark runs once with the maximum number of processes available to it, rather than multiple times with an increasing number of processes.

The following benchmarks are available within the IMB-MT component:

  • PingPongMT

  • PingPingMT

  • SendrecvMT

  • ExchangeMT

  • UnibandMT

  • BibandMT

  • BcastMT

  • ReduceMT

  • AllreduceMT