The IMB-MT component accepts a different set of command-line options from the other components. The -input, -include, and -exclude options are common to all components and are described in Command-Line Control.
Here is the formal description of IMB-MT options:
IMB-MT [-thread_level <level>]
[-input <filename>]
[-include benchmark1[,benchmark2[,...]]]
[-exclude benchmark1[,benchmark2[,...]]]
[-stride <value>]
[-warmup <value>]
[-repeat <value>]
[-barrier <value>]
[-count <value>]
[-malloc_align <value>]
[-malloc_algo <algorithm>]
[-check <value>]
[-datatype <type>]
[benchmark1[,benchmark2[,...]]]
The -thread_level option sets the thread support level passed to MPI_Init_thread. The value multiple is required for multithreaded mode. With any other value, the benchmark behaves almost identically to IMB-MPI1 and the results are likely to be the same. In multiple mode, the OMP_NUM_THREADS environment variable controls the number of threads forked in each rank.
Each thread works with a buffer of size count*sizeof(datatype). The threads of a rank do not share one buffer; each thread has its own full-size buffer. Thus, PingPong with two ranks and two threads per rank transfers twice as much data as PingPong with two ranks and one thread per rank.
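The buffer arithmetic above can be checked with a short calculation. This is a minimal sketch, not part of IMB: the function name bytes_per_rank is hypothetical, and sizeof(int) is assumed to be 4 bytes, which holds on most current platforms but is not guaranteed.

```python
INT_SIZE = 4  # assumption: sizeof(int) is 4 bytes on the target platform


def bytes_per_rank(count, datatype_size, num_threads):
    """Bytes one rank moves per transfer when each thread owns a full-size buffer."""
    per_thread = count * datatype_size  # count*sizeof(datatype)
    return per_thread * num_threads     # buffers are not shared among threads


one_thread = bytes_per_rank(count=1024, datatype_size=INT_SIZE, num_threads=1)
two_threads = bytes_per_rank(count=1024, datatype_size=INT_SIZE, num_threads=2)
print(one_thread, two_threads)
```

With the same count, doubling the number of threads doubles the data moved by each rank.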
For point-to-point benchmarks, the -stride option sets the distance between the communicating ranks in each pair.
For example, PingPongMT with stride=1 launches simultaneous point-to-point communications for pairs (0,1), (2,3),...; for stride=2 the pairs are (0,2), (1,3),...; for stride=4 the pairs are (0,4), (1,5),....
The special value stride=0 (default) sets the stride to half the number of MPI ranks.
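Setting the stride equal to the -ppn value is useful for benchmarking inter-node communications only: when ranks are placed on nodes in consecutive blocks, the two ranks of each pair are then on different nodes.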
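The pairing rule can be sketched as follows. This is an illustration that reproduces the pairs listed above, not the IMB implementation: the function name stride_pairs is hypothetical, and leaving ranks without an in-range partner unpaired is an assumption.

```python
def stride_pairs(num_ranks, stride=0):
    """Return (lower, upper) rank pairs for a given stride.

    stride=0 mirrors the documented default: half the number of ranks.
    """
    if stride == 0:
        stride = num_ranks // 2
    if stride <= 0:
        raise ValueError("at least two ranks are needed to form a pair")

    pairs = []
    for rank in range(num_ranks):
        # Ranks in even-numbered blocks of `stride` ranks are the lower
        # member of a pair; their partner is `stride` ranks away.
        if (rank // stride) % 2 == 0:
            partner = rank + stride
            if partner < num_ranks:  # assumption: unmatched ranks stay idle
                pairs.append((rank, partner))
    return pairs


print(stride_pairs(8, 1))
print(stride_pairs(8, 2))
print(stride_pairs(8, 4))
print(stride_pairs(8))
```

With eight ranks, the default stride of 0 gives the same pairs as stride=4.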
The -warmup option sets the number of benchmark cycles that are excluded from timing.
The -repeat option sets the number of timed benchmark cycles.
The -barrier option selects the barrier to be used when measuring time:
| Value | Description |
| on | Use the MPI_Barrier operation |
| off | Do not use a barrier |
| special | Use a special barrier implementation instead of MPI_Barrier |
The -count option sets the message length.
To run benchmarks with several message lengths, specify the lengths as a comma-separated list. Lengths are given as numbers of elements of the message datatype, not as bytes.
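Because lengths are element counts, the message size in bytes depends on the datatype. The sketch below converts a comma-separated count list to byte sizes. The helper name and the size table are assumptions for illustration (4-byte int, 1-byte char); they are not taken from the IMB sources.

```python
# Assumed element sizes in bytes; actual sizes are platform-dependent.
DATATYPE_SIZES = {"int": 4, "char": 1}


def message_sizes_in_bytes(count_arg, datatype="int"):
    """Convert a comma-separated list of element counts to byte sizes."""
    counts = [int(item) for item in count_arg.split(",")]
    return [count * DATATYPE_SIZES[datatype] for count in counts]


sizes = message_sizes_in_bytes("1,1024,65536")
print(sizes)
```

Under these assumptions, a count of 1024 with the default int datatype corresponds to a 4096-byte buffer per thread.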
The -malloc_align option sets the alignment, in bytes, used for manual alignment of allocated buffers. The default value of 64 is suitable for most cases. See the AlignedAllocator class definition in MT/MT_benchmark.h for details.
The -malloc_algo option sets the algorithm for allocating memory in a multithreaded environment. The default value, serial, is suitable for most cases.
The -check option enables or disables correctness checking.
When enabled, the buffers to transfer are filled with integer sequences generated from the rank number. After each transfer iteration, the check procedure verifies that the receive buffers contain the expected values.
This option requires -datatype int.
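The idea behind the check can be sketched without MPI. The fill rule below (rank * count + index) is a hypothetical stand-in, since the actual sequence IMB-MT generates is not specified here, and a list copy stands in for the transfer.

```python
def expected_buffer(sender_rank, count):
    """Hypothetical fill rule: an integer sequence derived from the sender's rank."""
    return [sender_rank * count + i for i in range(count)]


def check_receive(recv_buffer, sender_rank):
    """Verify that a receive buffer holds the sequence expected from sender_rank."""
    return recv_buffer == expected_buffer(sender_rank, len(recv_buffer))


count = 8
send = expected_buffer(sender_rank=0, count=count)
recv = list(send)            # stands in for the MPI transfer
ok = check_receive(recv, sender_rank=0)

recv[3] ^= 1                 # flip one bit to simulate a corrupted element
corrupted_ok = check_receive(recv, sender_rank=0)
print(ok, corrupted_ok)
```

Because the expected contents can be recomputed from the sender's rank alone, the receiver can validate its buffer without any extra communication.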
The -datatype option sets the MPI_Datatype for messages. The default value is int.