Tuning is very dependent on the specifications of the particular platform. Intel carefully determines the tuning parameters for a limited set of platforms, and makes them available for autotuning using the I_MPI_TUNING_MODE environment variable.
For the full list of platforms supported by the I_MPI_TUNING_MODE environment variable, see Tuning Environment Variables. This variable has no effect on platforms not included in this list. For these platforms, use the I_MPI_TUNING_AUTO Family Environment Variables directly to find the best settings.
The autotuner functionality lets you automatically find the best algorithms for collective operations. You can modify the autotuner search space with the I_MPI_ADJUST_<opname>_LIST variables described in I_MPI_ADJUST Family Environment Variables.
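As a rough sketch of narrowing the search space, the fragment below limits the algorithms the autotuner may try for MPI_Allreduce. It assumes the comma-and-range list syntax described in the I_MPI_ADJUST reference. The algorithm numbers shown are placeholders chosen for illustration, not recommended values; check the reference for the identifiers valid on your version.

```shell
# Enable the autotuner (variable taken from this page).
export I_MPI_TUNING_MODE=auto

# Restrict the MPI_Allreduce search space to a subset of algorithms.
# ASSUMPTION: the value is a comma-separated list of algorithm IDs and ID ranges;
# "1-4,6" is an illustrative placeholder, not a tuned recommendation.
export I_MPI_ADJUST_ALLREDUCE_LIST="1-4,6"
```

A shorter list shortens the tuning phase, at the cost of possibly excluding the algorithm that would have performed best.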
The collectives currently available for autotuning are: MPI_Allreduce, MPI_Bcast, MPI_Barrier, MPI_Reduce, MPI_Gather, MPI_Scatter, MPI_Alltoall, MPI_Allgatherv, MPI_Reduce_scatter, MPI_Reduce_scatter_block, MPI_Scan, MPI_Exscan, MPI_Iallreduce, MPI_Ibcast, MPI_Ibarrier, MPI_Ireduce, MPI_Igather, MPI_Iscatter, MPI_Ialltoall, MPI_Iallgatherv, MPI_Ireduce_scatter, MPI_Ireduce_scatter_block, MPI_Iscan, and MPI_Iexscan.
To get started with autotuning, follow these steps:
1. Launch the application with the autotuner enabled and specify the dump file, which stores the results:
   I_MPI_TUNING_MODE=auto
   I_MPI_TUNING_BIN_DUMP=<tuning_results.dat>
2. Launch the application with the tuning results generated at the previous step:
   I_MPI_TUNING_BIN=<tuning_results.dat>
   Or use the -tune Hydra option.
If you experience performance issues, see Environment Variables for Autotuning.
For example:
$ export I_MPI_TUNING_MODE=auto
$ export I_MPI_TUNING_AUTO_SYNC=1
$ export I_MPI_TUNING_AUTO_ITER_NUM=5
$ export I_MPI_TUNING_BIN_DUMP=./tuning_results.dat
$ mpirun -n 128 -ppn 64 IMB-MPI1 allreduce -iter 1000,800 -time 4800

$ export I_MPI_TUNING_BIN=./tuning_results.dat
$ mpirun -n 128 -ppn 64 IMB-MPI1 allreduce -iter 1000,800 -time 4800
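The two launches above can be wrapped in a small script so that the second run starts only when the first one actually produced a results file. This is a minimal sketch, not part of the Intel MPI Library: the names RESULTS, APP_CMD, tune_phase, and run_phase are hypothetical, and the default command simply mirrors the example above.

```shell
#!/bin/sh
# Hypothetical helper variables; override them in the environment as needed.
RESULTS="${RESULTS:-./tuning_results.dat}"
APP_CMD="${APP_CMD:-mpirun -n 128 -ppn 64 IMB-MPI1 allreduce}"

# Step 1: run the application with the autotuner enabled and dump the results.
# The variables are set only for this one command, not for the whole shell.
tune_phase() {
    I_MPI_TUNING_MODE=auto I_MPI_TUNING_BIN_DUMP="$RESULTS" $APP_CMD
}

# Step 2: rerun with the stored results, but only if the dump file exists
# and is non-empty; otherwise report the problem and return a non-zero status.
run_phase() {
    if [ ! -s "$RESULTS" ]; then
        echo "error: tuning results not found: $RESULTS" >&2
        return 1
    fi
    I_MPI_TUNING_BIN="$RESULTS" $APP_CMD
}
```

After sourcing the script, call tune_phase && run_phase. Because the variables are passed as per-command assignments, the tuning settings from step 1 do not leak into step 2.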
To tune collectives only on specific communicators, identify the communicators with Application Performance Snapshot (APS) and set the following variable at step 1: I_MPI_TUNING_AUTO_COMM_LIST=comm_id_1, … , comm_id_n.