Classification of MPI-1 Benchmarks#

Intel(R) MPI Benchmarks introduces the following classes of benchmarks:

  • Single Transfer

  • Parallel Transfer

  • Collective

The following table lists the MPI-1 benchmarks in each class:

Single Transfer           Parallel Transfer    Collective
------------------------  -------------------  -------------------------------------
PingPong                  Sendrecv             Bcast, Multi-Bcast
PingPongSpecificSource    Exchange             Allgather, Multi-Allgather
PingPongAnySource         Multi-PingPong       Allgatherv, Multi-Allgatherv
PingPing                  Multi-PingPing       Alltoall, Multi-Alltoall
PingPingSpecificSource    Multi-Sendrecv       Alltoallv, Multi-Alltoallv
PingPingAnySource         Multi-Exchange       Scatter, Multi-Scatter
                          Uniband              Scatterv, Multi-Scatterv
                          Biband               Gather, Multi-Gather
                          Multi-Uniband        Gatherv, Multi-Gatherv
                          Multi-Biband         Reduce, Multi-Reduce
                                               Reduce_scatter, Multi-Reduce_scatter
                                               Allreduce, Multi-Allreduce
                                               Barrier, Multi-Barrier

Each class interprets results in a different way.

Single Transfer Benchmarks#

Single transfer benchmarks involve two processes actively communicating; all other processes wait for the communication to complete. Each benchmark is run with varying message lengths. The timing is averaged between the two processes. The basic MPI data type for all messages is MPI_BYTE.

Throughput values are measured in MBps and can be calculated as follows:

throughput = X/time

where

  • time is measured in μsec.

  • X is the length of a message, in bytes.
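For illustration, the sketch below measures a PingPong-style single transfer between ranks 0 and 1 and reports throughput as X/time. This is not the actual Intel(R) MPI Benchmarks source; the message length X, the repetition count, and the output format are example values chosen here.

```c
/* Minimal PingPong-style sketch (illustrative only; requires at least two ranks). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int X = 1 << 20;        /* message length in bytes (example value) */
    const int repetitions = 1000; /* number of timed round trips (example value) */
    int rank;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(X);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < repetitions; i++) {
        if (rank == 0) {
            MPI_Send(buf, X, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, X, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, X, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, X, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        /* One round trip carries the message twice, so the one-way time is half
         * of the per-iteration time; convert seconds to microseconds. */
        double time_usec = (t1 - t0) / (2.0 * repetitions) * 1.0e6;
        printf("%d bytes: %.2f usec, %.2f MBps\n", X, time_usec, X / time_usec);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```

Run the sketch with at least two processes, for example mpirun -np 2 ./pingpong; additional processes simply idle, as described above.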

Parallel Transfer Benchmarks#

Parallel transfer benchmarks involve more than two processes actively communicating. Each benchmark runs with varying message lengths. The timing is averaged over multiple samples. The basic MPI data type for all messages is MPI_BYTE. The throughput calculations take into account the multiplicity nmsg of messages outgoing from or incoming to a particular process, as shown in the following table:

Benchmark                              Turnover
-------------------------------------  ----------------
Sendrecv (sends and receives X bytes)  2X bytes, nmsg=2
Exchange                               4X bytes, nmsg=4

Throughput values are measured in MBps and can be calculated as follows:

throughput = nmsg*X/time,

where

  • time is measured in μsec.

  • X is the length of a message, in bytes.
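The sketch below illustrates a Sendrecv-style parallel transfer in which every process sends X bytes to its right neighbor and receives X bytes from its left neighbor in a periodic chain, so nmsg = 2 and throughput = 2*X/time. This is not the actual Intel(R) MPI Benchmarks source; the message length, the repetition count, and the use of rank 0's local timing (rather than an average over processes) are simplifying assumptions.

```c
/* Minimal Sendrecv-style sketch over a periodic process chain (illustrative only). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int X = 1 << 20;        /* message length in bytes (example value) */
    const int repetitions = 1000; /* number of timed iterations (example value) */
    int rank, size;
    char *sbuf, *rbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    sbuf = malloc(X);
    rbuf = malloc(X);

    int right = (rank + 1) % size;        /* destination in the periodic chain */
    int left  = (rank - 1 + size) % size; /* source in the periodic chain */

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < repetitions; i++)
        MPI_Sendrecv(sbuf, X, MPI_BYTE, right, 0,
                     rbuf, X, MPI_BYTE, left, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    double t1 = MPI_Wtime();

    if (rank == 0) {
        /* Each iteration moves 2*X bytes per process (one send plus one receive). */
        double time_usec = (t1 - t0) / repetitions * 1.0e6;
        printf("%d bytes: %.2f usec, %.2f MBps\n", X, time_usec, 2.0 * X / time_usec);
    }

    free(sbuf);
    free(rbuf);
    MPI_Finalize();
    return 0;
}
```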

Collective Benchmarks#

Collective benchmarks measure MPI collective operations. Each benchmark is run with varying message lengths. The timing is averaged over multiple samples. The basic MPI data type for all messages is MPI_BYTE for pure data movement functions and MPI_FLOAT for reductions.

Collective benchmarks show bare timings; no throughput values are reported.
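As an illustration of a collective measurement, the sketch below times MPI_Allreduce on MPI_FLOAT data and prints the bare per-call timing. This is not the actual Intel(R) MPI Benchmarks source; the element count, the repetition count, the MPI_SUM operation, and reporting only rank 0's local average are assumptions made for brevity.

```c
/* Minimal collective-timing sketch for MPI_Allreduce (illustrative only). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int count = 1 << 18;    /* number of MPI_FLOAT elements (example value) */
    const int repetitions = 1000; /* number of timed iterations (example value) */
    int rank;
    float *sbuf, *rbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    sbuf = calloc(count, sizeof(float));
    rbuf = calloc(count, sizeof(float));

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < repetitions; i++)
        MPI_Allreduce(sbuf, rbuf, count, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    /* Report the local average time per call; a full benchmark would also
     * aggregate minimum, maximum, and average times across all processes. */
    double time_usec = (t1 - t0) / repetitions * 1.0e6;
    if (rank == 0)
        printf("MPI_Allreduce, %d floats: %.2f usec per call\n", count, time_usec);

    free(sbuf);
    free(rbuf);
    MPI_Finalize();
    return 0;
}
```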