With Intel® Advisor, you can analyze parallel tasks running on a cluster to examine the performance of your MPI application. Use the Intel® MPI Library gtool option with mpiexec or mpirun to invoke the advisor command and spawn MPI processes across the cluster.
You can analyze MPI applications only through the command line interface, but you can view the results in the standalone GUI as well as on the command line.
Consider the following when running collections for an MPI application:
You can use the Intel Advisor with the Intel® MPI Library and other MPI implementations, but be aware of the following details:
You may need to adjust the command examples in this section to work for non-Intel MPI implementations. For example, adjust the commands provided for selecting process ranks to limit the number of processes in the job.
The MPI implementation must be able to operate when the Intel Advisor process (advisor) sits between the launcher process (mpiexec) and the application process. This means that communication information should be passed through environment variables, as most MPI implementations do. Intel Advisor does not work with an MPI implementation that tries to pass communication information directly from its immediate parent process.
You can use Intel Advisor to generate the command lines for collecting results on multiple MPI ranks. To do that:
You can generate command lines for modeling your MPI application performance with Offload Modeling scripts. Run the collect.py script with the --dry-run option:
advisor-python <APM>/collect.py <project-dir> [--config <config-file>] --dry-run -- <application-name> [myApplication-options]
where:
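For example, a dry-run invocation might look like the following (the project directory and application name are hypothetical; <APM> is the same placeholder for the Offload Modeling scripts directory as in the command above):

advisor-python <APM>/collect.py ./advi_results --dry-run -- ./myApplication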
Use the -gtool option of mpiexec with Intel® MPI Library 5.0.2 and higher:
$ mpiexec -gtool "advisor --collect=<analysis-type> --project-dir=<project-dir>:<ranks-set>" -n <N> <application-name> [myApplication-options]
where:
The -gtool option of mpiexec allows you to select the MPI ranks to run analyses for. This can decrease collection overhead.
For detailed syntax, refer to the Intel® MPI Library Developer Reference for Linux* OS or the Intel® MPI Library Developer Reference for Windows* OS.
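For example, the following command runs the Survey analysis only on ranks 0 through 3 of a 16-process job (the project directory and application name are hypothetical):

$ mpiexec -gtool "advisor --collect=survey --project-dir=./advi_results:0-3" -n 16 ./myApplication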
Use mpiexec with the advisor command to spawn processes across the cluster and collect data about the application.
Each process has a rank associated with it. This rank is used to identify the result data.
To collect performance or dependencies data for an MPI program with Intel Advisor, the general form of the mpiexec command is:
$ mpiexec -n <N> "advisor --collect=<analysis-type> --project-dir=<project-dir> --search-dir src:r=<source-dir>" myApplication [myApplication-options]
where:
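For example, a Survey collection for a 4-process run might look like this (the project directory, source directory, and application name are hypothetical):

$ mpiexec -n 4 "advisor --collect=survey --project-dir=./advi_results --search-dir src:r=./src" ./myApplication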
By default, Intel Advisor analyzes the performance of the whole application. In some cases, you may want to focus on the most time-consuming section or disable collection for the initialization or finalization phases. Intel Advisor supports MPI region control with the MPI_Pcontrol() function. This function allows you to enable and disable collection for specific application regions in the source code.
To use the function, add it to your application source code as follows:
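A minimal sketch, assuming the usual semantics in which MPI_Pcontrol(0) pauses Intel Advisor collection and MPI_Pcontrol(1) resumes it (the commented-out regions stand in for your own code):

    #include <mpi.h>

    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);

        MPI_Pcontrol(0);   /* pause collection during the initialization phase */
        /* ... initialization code you do not want analyzed ... */
        MPI_Pcontrol(1);   /* resume collection for the region of interest */
        /* ... compute code that Intel Advisor analyzes ... */

        MPI_Finalize();
        return 0;
    }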
According to the MPI standard, MPI_Pcontrol() accepts other numbers as arguments. For Intel Advisor, only 0 and 1 are relevant for disabling and enabling collection.
You can also use MPI_Pcontrol() to mark specific code regions. Use MPI_Pcontrol(<region>) at the beginning of the region and MPI_Pcontrol(-<region>) at the end of the region, where <region> is a number of 5 or higher.
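For example, a short sketch that marks a hypothetical code region with the number 5:

    MPI_Pcontrol(5);    /* start of marked region 5 */
    /* ... code attributed to region 5 ... */
    MPI_Pcontrol(-5);   /* end of marked region 5 */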
You can model your MPI application performance on an accelerator to determine whether it can benefit from offloading to a target device.
You can run the performance modeling using only the advisor command line interface or a combination of advisor and the analyze.py script. For example, to use advisor and analyze.py:
$ mpiexec -gtool "advisor --collect=<analysis-type> --project-dir=<project-dir>:<ranks-set>" -n <N> <application-name> [myApplication-options]
$ advisor-python <APM>/analyze.py <project-dir> --mpi-rank <n> [--options]
where:
$ advisor-python <APM>/analyze.py <project-dir>/rank.<n> [--options]
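For example, one possible sequence for a 4-rank application might look like the following (the analysis types, project directory, ranks, and application name are illustrative, and the exact set of analyses depends on your Offload Modeling workflow; <APM> is the same placeholder as above):

$ mpiexec -gtool "advisor --collect=survey --project-dir=./advi_results:0-3" -n 4 ./myApplication
$ mpiexec -gtool "advisor --collect=tripcounts --flop --project-dir=./advi_results:0-3" -n 4 ./myApplication
$ advisor-python <APM>/analyze.py ./advi_results --mpi-rank 3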
Consider using the --config=<config-file> option to specify a pre-defined TOML file and/or a path to a custom TOML configuration file if you want to use custom hardware parameters for performance modeling and/or model performance for a multi-rank MPI application. By default, Offload Modeling models performance for a single-rank MPI application on a gen11_icl target configuration.
Configure Performance Modeling for Multi-Rank MPI
By default, Offload Modeling is optimized to model performance for a single-rank MPI application. For multi-rank MPI applications, do one of the following:
Scale Target Device Parameters
By default, Offload Modeling assumes that one MPI process is mapped to one GPU tile. You can configure the performance model and map MPI ranks to a target device configuration.
[scale]
Tiles_per_process = <float>
where <float> is the fraction of a GPU tile that corresponds to a single MPI process. It accepts values from 0.01 to 0.6. This parameter automatically scales the related target device parameters accordingly.
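For example, to model four MPI processes sharing one GPU tile, a custom configuration file (named my_config.toml here for illustration) could contain:

[scale]
Tiles_per_process = 0.25

Then pass the file to the performance modeling step with the --config option: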
$ advisor-python <APM>/analyze.py <project-dir> --config my_config.toml --mpi-rank <n> [--options]
Ignore MPI Time
For multi-rank MPI workloads, the time spent in the MPI runtime can differ from rank to rank, which causes differences in the whole application time and in the Offload Modeling projections. If the MPI time is significant and you see differences between ranks, you can exclude time spent in MPI routines from the analysis.
$ advisor-python <APM>/analyze.py <project-dir> --mpi-rank <n> [--options]
In the generated report, all per-application performance modeling metrics are recalculated based on the application self time, excluding the time spent in MPI calls. This should make modeling results more consistent across ranks.
As a result of the collection, Intel Advisor creates a number of result directories in the directory specified with --project-dir. The nested result directories are named rank.0, rank.1, ..., rank.n, where the numeric suffix n corresponds to the MPI process rank.
To view the performance or dependencies results collected for a specific rank, you can either open a result project file (*.advixeproj) that resides in the project directory with the Intel Advisor GUI, or run the Intel Advisor CLI report:
$ advisor --report=<analysis-type> --project-dir=<project-dir>/rank.<n>
You can view only one rank's results at a time.
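For example, to view the Survey report for rank 3 (the project directory is hypothetical):

$ advisor --report=survey --project-dir=./advi_results/rank.3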
For Offload Modeling, you do not need to run the --report command. The reports are generated automatically after you run performance modeling. You can either open a result project file (*.advixeproj) that resides in the <project-dir> using the Intel Advisor GUI or view an HTML report in the respective rank directory at <project-dir>/rank.<n>/e<NNN>/pp<NNN>/data.0 with your preferred browser.
For more details on analyzing MPI applications, see the Intel® MPI Library documentation and the online MPI documentation on the Intel® Developer Zone at https://software.intel.com/content/www/us/en/develop/tools/mpi-library/get-started.html
Hybrid applications: Intel MPI Library and OpenMP* on the Intel Developer Zone at https://software.intel.com/content/www/us/en/develop/articles/hybrid-applications-intelmpi-openmp.html