Intel® Advisor Help

Run GPU-to-GPU Performance Modeling (Preview)

To model performance of a Data Parallel C++ (DPC++), OpenCL™, or OpenMP* target application on a graphics processing unit (GPU) device, run the GPU-to-GPU modeling workflow of the Offload Modeling perspective.

Note

This is a technical preview feature.

Workflow

The GPU-to-GPU performance modeling workflow is similar to the CPU-to-GPU modeling workflow and includes the following steps:

  1. Measure the performance of GPU-enabled kernels running on an Intel® Graphics device.
  2. Model application performance on a target GPU device and compare the estimated performance metrics to the baseline performance metrics.

Compared to CPU-to-GPU performance modeling, GPU-to-GPU performance modeling has better accuracy because it accounts for the similarities in hardware configuration, compiler code-generation principles, and software implementation between the baseline and the modeled code.

Prerequisites

  1. Configure your system to analyze GPU kernels.
  2. Set Intel Advisor environment variables with an automated script to enable the Intel Advisor command line interface.
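
For example, on Linux you can set the environment variables by sourcing the top-level oneAPI environment script. The path below assumes the default oneAPI installation directory and may differ on your system:

source /opt/intel/oneapi/setvars.sh

On Windows, run the equivalent batch file, for example "C:\Program Files (x86)\Intel\oneAPI\setvars.bat".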

Run the GPU-to-GPU Performance Modeling

You can run the GPU-to-GPU performance modeling only from the command line with the Intel Advisor Python* scripts. Use one of the following methods:

Run the collect.py and analyze.py Scripts

Note

In the commands below, replace <APM> with $APM on Linux OS or with %APM% on Windows OS.

Run the scripts as follows (a combined example is shown after the steps):

  1. Collect performance metrics with the collect.py script and the --gpu option:

    advisor-python <APM>/collect.py <project-dir> --collect=basic --gpu [<analysis-options>] -- <target-application> [<target-options>]

    where <analysis-options> is one or more options that modify the script behavior. See collect.py Script for the full list of options.

    This command runs the Survey, Trip Counts, and FLOP analyses only for the GPU kernels.

  2. Model application performance on a target GPU with the analyze.py script and the --gpu option:

    advisor-python <APM>/analyze.py <project-dir> --gpu [--config=<config-file>] [--out-dir <path>] [<analysis-options>]

    where:

    • --config=<config-file> is a target GPU configuration to model performance for. The following device configurations are available: gen11_icl (default), gen12_tgl, gen12_dg1, gen9_gt4, gen9_gt3, gen9_gt2.
    • --out-dir <path> is a directory to save all generated results files to.
    • <analysis-options> is one or more options that modify the script behavior. See analyze.py Script for the full list of options.
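
For example, the following Linux commands collect baseline metrics for the GPU kernels of an application and then model its performance on the gen12_tgl configuration. The application name ./my_gpu_app, the project directory ./advi_results, and the output directory ./perf_models are placeholders; substitute your own paths:

advisor-python $APM/collect.py ./advi_results --collect=basic --gpu -- ./my_gpu_app
advisor-python $APM/analyze.py ./advi_results --gpu --config=gen12_tgl --out-dir ./perf_models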

Run the run_oa.py Script

Note

In the commands below, replace <APM> with $APM on Linux OS or with %APM% on Windows OS.

Collect baseline performance metrics for GPU kernels and model their performance on a target GPU:

advisor-python <APM>/run_oa.py <project-dir> --collect=basic --gpu [--config=<config-file>] [--out-dir <path>] [<analysis-options>] -- <target-application> [<target-options>]

where the --config, --out-dir, and <analysis-options> arguments have the same meaning as for the analyze.py script described above.

This command runs the Survey, Trip Counts, and FLOP analyses only for the GPU kernels and models their performance on the selected target GPU.
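
For example, the following Linux command runs the collection and the modeling in one pass for the gen12_tgl configuration. The application name ./my_gpu_app and the project directory ./advi_results are placeholders:

advisor-python $APM/run_oa.py ./advi_results --collect=basic --gpu --config=gen12_tgl -- ./my_gpu_app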

Note

After you run run_oa.py and get the first performance modeling results, you can rerun the analyze.py script as many times as you need to remodel the performance with different software and/or hardware parameters.

View the Results

Once Intel Advisor finishes the analyses, it prints a result summary and the result file location to the command prompt. By default, if you did not use the --out-dir option to change the result location, Intel Advisor saves a set of generated reports, including the interactive HTML report, to the <project-dir>/e<NNN>/pp<NNN>/data.0 directory.

Examine the results with the interactive HTML report. See Explore Performance Gain from GPU-to-GPU Modeling for details.
