Command-Line Control for IMB-MT Benchmarks

The IMB-MT component accepts a different set of command-line options than the other components. The -input, -include, and -exclude options are common to all components and are described in Command-Line Control.

Here is the formal description of IMB-MT options:

IMB-MT [-thread_level <level>]
       [-input <filename>]
       [-include benchmark1[,benchmark2[,...]]]
       [-exclude benchmark1[,benchmark2[,...]]]
       [-stride <value>]
       [-warmup <value>]
       [-repeat <value>]
       [-barrier <value>]
       [-count <value>]
       [-malloc_align <value>]
       [-malloc_algo <algorithm>]
       [-check <value>]
       [-datatype <type>]
       [benchmark1[,benchmark2[,...]]]

-thread_level {single|funneled|serialized|multiple|nompinit} Option

Sets the value of the parameter passed to MPI_Init_thread. The multiple value is required for multithreaded mode. With any other value, the benchmarks behave almost identically to those of IMB-MPI1, and the results are likely to be the same. In multiple mode, the OMP_NUM_THREADS environment variable controls the number of threads forked in each rank.

Each thread works with a buffer of count*sizeof(datatype) bytes. The buffer is not shared among the threads of a rank: each thread has its own full-size buffer. Thus, PingPong with two ranks and two threads per rank transfers twice as much data as PingPong with two ranks and one thread per rank.
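The buffer arithmetic described here can be sketched as follows. This is an illustration only, not code from IMB-MT: the function names are hypothetical, and the element sizes (4 bytes for int, 1 byte for char) are an assumption that holds on common platforms.

```python
# Assumed element sizes in bytes; an int is 4 bytes on common platforms.
ELEMENT_SIZE = {"int": 4, "char": 1}


def thread_buffer_bytes(count, datatype="int"):
    """Size of one thread's buffer: count * sizeof(datatype)."""
    return count * ELEMENT_SIZE[datatype]


def rank_payload_bytes(count, threads, datatype="int"):
    """Data handled by one rank when every thread owns a full-size buffer."""
    return threads * thread_buffer_bytes(count, datatype)


if __name__ == "__main__":
    one_thread = rank_payload_bytes(count=1024, threads=1)
    two_threads = rank_payload_bytes(count=1024, threads=2)
    print(one_thread, two_threads)  # 4096 8192
```

Because the per-thread buffer size does not depend on the thread count, doubling OMP_NUM_THREADS doubles the data volume per rank for the same -count value.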

-stride <stride> Option

For point-to-point benchmarks, sets the distance between the communicating ranks in each pair.

For example, PingPongMT with stride=1 launches simultaneous point-to-point communications for pairs (0,1), (2,3),...; for stride=2 the pairs are (0,2), (1,3),...; for stride=4 the pairs are (0,4), (1,5),....

The special value stride=0 (default) sets the stride to half of the number of MPI ranks.

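Setting the stride equal to the -ppn value (the number of ranks per node) is useful for benchmarking inter-node communication only, because the two ranks of each pair then reside on different nodes when ranks are placed on nodes consecutively.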
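The pairing rule can be reconstructed from the examples above with a short sketch. The function below is hypothetical and is not taken from the IMB-MT sources; in particular, leaving ranks without a partner unpaired is an assumption.

```python
def stride_pairs(num_ranks, stride=0):
    """Return the (lower, upper) rank pairs produced by a given stride."""
    if stride == 0:
        stride = num_ranks // 2  # default: half of the number of ranks
    if stride < 1:
        return []  # fewer than two ranks: nothing to pair
    pairs = []
    for rank in range(num_ranks):
        # Ranks in even-numbered blocks of `stride` ranks act as the lower
        # partner and are paired with the rank `stride` positions above.
        in_lower_block = (rank // stride) % 2 == 0
        if in_lower_block and rank + stride < num_ranks:
            pairs.append((rank, rank + stride))
    return pairs


if __name__ == "__main__":
    print(stride_pairs(8, 1))  # [(0, 1), (2, 3), (4, 5), (6, 7)]
    print(stride_pairs(8, 2))  # [(0, 2), (1, 3), (4, 6), (5, 7)]
    print(stride_pairs(8, 4))  # [(0, 4), (1, 5), (2, 6), (3, 7)]
```

With 8 ranks, stride=0 gives the same pairs as stride=4.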

-warmup Option

Sets the number of benchmark cycles that are run but excluded from timing.

-repeat Option

Sets the number of timed benchmark cycles.
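The interplay of -warmup and -repeat can be pictured with a generic timing loop. This is a minimal sketch of the usual warm-up pattern, not the IMB-MT implementation: the timed_run function is hypothetical, and reporting the average time per cycle is an assumption.

```python
import time


def timed_run(operation, warmup, repeat):
    """Run `operation` warmup + repeat times; time only the last `repeat`."""
    for _ in range(warmup):
        operation()  # warm-up cycles: executed but not timed
    start = time.perf_counter()
    for _ in range(repeat):
        operation()  # counted cycles
    elapsed = time.perf_counter() - start
    # Average time per counted cycle (assumed reporting convention).
    return elapsed / repeat if repeat else 0.0


if __name__ == "__main__":
    calls = []
    avg = timed_run(lambda: calls.append(1), warmup=3, repeat=5)
    print(len(calls))  # 8: three warm-up cycles plus five timed cycles
```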

-barrier {on|off|special} Option

Sets the barrier to use when measuring time:

on       Use the MPI_Barrier operation.
off      Do not use a barrier.
special  Use a special barrier implementation instead of MPI_Barrier.

-count Option

Sets the message length.

To run benchmarks with several message lengths, specify the lengths as a comma-separated list. Each length is given as a number of elements of the message datatype (see the -datatype option), not in bytes.
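Because lengths are expressed in elements, a target size in bytes has to be divided by the element size first. The helper below is a hypothetical convenience, not part of IMB-MT, and it assumes 4-byte int and 1-byte char elements.

```python
# Assumed element sizes in bytes; an int is 4 bytes on common platforms.
ELEMENT_SIZE = {"int": 4, "char": 1}


def count_argument(sizes_in_bytes, datatype="int"):
    """Build the comma-separated -count value for the given byte sizes."""
    elem = ELEMENT_SIZE[datatype]
    counts = []
    for size in sizes_in_bytes:
        if size % elem != 0:
            raise ValueError(f"{size} bytes is not a multiple of {elem}")
        counts.append(str(size // elem))
    return ",".join(counts)


if __name__ == "__main__":
    print(count_argument([1024, 4096, 65536]))          # 256,1024,16384
    print(count_argument([1024, 4096, 65536], "char"))  # 1024,4096,65536
```

For example, under these assumptions, messages of 1024, 4096, and 65536 bytes with the int datatype correspond to -count 256,1024,16384.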

-malloc_align Option

Sets the alignment, in bytes, used for manually aligning allocated buffers. The default value is 64, which is suitable for most cases. See the AlignedAllocator class definition in MT/MT_benchmark.h for details.

-malloc_algo {serial|continuous|parallel} Option

Sets the algorithm for allocating memory in a multithreaded environment. The default value is serial, which is suitable for most cases.

-check {on|off} Option

Enables/disables correctness checking.

When enabled, the buffers to transfer are filled with integer sequences generated based on the rank number. After each transfer iteration, the check procedure verifies that the receive buffers are filled with the expected values.

Correctness checking requires the int datatype (-datatype int), which is the default.
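The idea behind the check can be illustrated with a small sketch. The exact sequence IMB-MT generates is not specified here, so the pattern below (consecutive integers offset by the sender's rank) and both function names are hypothetical.

```python
def fill_send_buffer(rank, count):
    """Hypothetical pattern: consecutive integers offset by the rank."""
    return [rank * count + i for i in range(count)]


def check_recv_buffer(buffer, sender_rank):
    """Verify that a received buffer holds the sender's expected sequence."""
    expected = fill_send_buffer(sender_rank, len(buffer))
    return buffer == expected


if __name__ == "__main__":
    received = fill_send_buffer(rank=1, count=4)  # stands in for a transfer
    print(received)                        # [4, 5, 6, 7]
    print(check_recv_buffer(received, 1))  # True
    received[2] = -1                       # simulate a corrupted element
    print(check_recv_buffer(received, 1))  # False
```

Because the pattern depends on the rank, a buffer delivered from the wrong sender fails the check just as a corrupted one does.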

-datatype {int|char} Option

Sets the MPI_Datatype for messages. The default value is int.