This recipe illustrates how to estimate application performance when moving from one Intel® graphics processing unit (GPU) architecture to another by running the Offload Modeling perspective of the Intel® Advisor.
Performance estimation plays an important role in determining the next steps for future-generation GPU architectures. In such cases, GPU-to-GPU modeling is more accurate than CPU-to-GPU modeling because of the inherent differences between CPU and GPU execution flows.
In this recipe, use the Intel Advisor to analyze the performance of a SYCL application with the GPU-to-GPU modeling flow of the Offload Modeling perspective and estimate the profitability of offloading the application to Intel® Iris® Xe MAX graphics (gen12_dg1 configuration).
Directions:
This section lists the hardware and software used to produce the specific result shown in this recipe:
Available for download as a standalone and as part of the Intel® oneAPI Base Toolkit.
Available for download as part of the Intel® oneAPI Base Toolkit.
You can download a precollected Offload Modeling report for the Mandelbrot application to follow this recipe and examine the analysis results.
Set up environment variables for oneAPI tools:
source <oneapi-install-dir>/setvars.sh
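After sourcing the script, you can optionally confirm that the Intel Advisor tools are on your PATH before building the sample. This is a quick sanity check, not part of the original recipe:

```shell
# Verify that the oneAPI environment is active:
# advisor and advisor-python should now be on PATH,
# and $APM should point to the Offload Modeling scripts directory.
advisor --version
echo $APM
```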
cd mandelbrot/ && mkdir build && cd build && cmake .. && make
You can run the GPU-to-GPU modeling using Intel Advisor command line interface (CLI), Python* scripts, or Intel Advisor graphical user interface (GUI).
In this section, use a special command line collection preset for the Offload Modeling perspective with the --gpu option to run all perspective analyses for the GPU-to-GPU modeling with a single command:
advisor --collect=offload --project-dir=./mandelbrot-advisor --gpu --config=gen12_dg1 -- ./mandelbrot
This command runs the perspective with the default medium accuracy and runs the following analyses one-by-one:
Important: The command line collection preset does not support MPI applications. To analyze an MPI application, run the analyses separately.
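If you need to run the analyses separately, the GPU-to-GPU flow can be reproduced with individual Intel Advisor CLI commands. The sequence below is a sketch of the standard Offload Modeling analyses (Survey, Characterization, Performance Modeling); verify the option names against your Advisor version:

```shell
# 1. Survey analysis with GPU kernel profiling enabled
advisor --collect=survey --project-dir=./mandelbrot-advisor --profile-gpu -- ./mandelbrot

# 2. Characterization: trip counts and FLOP, also with GPU profiling
advisor --collect=tripcounts --project-dir=./mandelbrot-advisor --flop --profile-gpu -- ./mandelbrot

# 3. Performance modeling for the target GPU (no application re-run needed)
advisor --collect=projection --project-dir=./mandelbrot-advisor --profile-gpu --config=gen12_dg1
```

Running the steps individually also lets you adjust the accuracy of each analysis independently.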
Once the analyses complete, the result summary is printed to the terminal. You can continue to view the results in the Intel Advisor GUI or in an interactive HTML report in your preferred web browser.
In this section, examine the HTML report to understand the GPU-to-GPU modeling results. The HTML report is generated automatically after you run Offload Modeling from the CLI or with the Python scripts and is saved to ./mandelbrot-advisor/e000/report/advisor-report.html. Open the report in your preferred web browser.
In the Summary tab, examine the Top Metrics and Program Metrics panes to understand the performance gain.
You can navigate between the Summary, Accelerated Regions, and Source View tabs to understand details about the offloaded regions and to examine useful metrics and the potential performance gain.
The Accelerated Regions tab provides detailed information for the offloaded code regions along with the source code in the bottom pane. In this view, you can examine various useful metrics for the offloaded regions of interest. For example, examine the following metrics measured for the kernels running on the baseline GPU: iteration space, thread occupancy, SIMD width, local size, and global size.
Also examine the following metrics estimated for the target GPU: performance issues, time, speedup, and data transfer with reuse.
See Accelerator Metrics for a detailed description and interpretation of these metrics.
Run Intel Advisor Python Scripts (Instead of Offload Modeling Collection Preset)
Use the special Python scripts delivered with the Intel Advisor to run the GPU-to-GPU modeling. These scripts use the Intel Advisor Python API to run the analyses.
For example, run the run_oa.py script with the --gpu option to execute the perspective with a single command as follows:
advisor-python $APM/run_oa.py ./mandelbrot-advisor --collect=basic --gpu --config=gen12_dg1 -- ./mandelbrot
The run_oa.py script runs the following analyses one-by-one:
Important: The run_oa.py script does not support MPI applications. Use the Intel Advisor CLI to analyze an MPI application.
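For an MPI application, the common pattern in the Intel Advisor documentation is to wrap the data-collection commands in your MPI launcher so each rank is profiled into the shared project directory. The rank count below is illustrative; check the MPI analysis section of the Advisor user guide for the exact options supported by your version:

```shell
# Collect Survey and Characterization data for each rank of a 4-rank run
mpirun -n 4 advisor --collect=survey --project-dir=./mandelbrot-advisor --profile-gpu -- ./mandelbrot
mpirun -n 4 advisor --collect=tripcounts --project-dir=./mandelbrot-advisor --flop --profile-gpu -- ./mandelbrot

# Model performance for the target GPU from the collected data
advisor --collect=projection --project-dir=./mandelbrot-advisor --profile-gpu --config=gen12_dg1
```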
Once the analyses complete, the result summary is printed to the terminal. You can continue to view the results in the Intel Advisor GUI or in an interactive HTML report in your preferred web browser.
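If you prefer to separate data collection from modeling, the Python API also ships the collect.py and analyze.py scripts, which together cover what run_oa.py does in one step. The options below mirror the run_oa.py command above; treat them as a sketch and confirm against your Advisor version:

```shell
# Collect the analysis data (the collection half of run_oa.py)
advisor-python $APM/collect.py ./mandelbrot-advisor --collect=basic --gpu -- ./mandelbrot

# Model performance for the gen12_dg1 target from the collected data
advisor-python $APM/analyze.py ./mandelbrot-advisor --gpu --config=gen12_dg1
```

Splitting the steps is useful when you want to re-model for a different target configuration without re-running the application.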
Run Intel Advisor GUI (Instead of Offload Modeling Collection Preset)
Prerequisite: Create a project for the Mandelbrot application.
To run the GPU-to-GPU modeling from the Intel Advisor GUI:
Once the perspective is completed, the GPU-to-GPU offload modeling result is shown in the pane on the right.
With the GPU-to-GPU modeling, you can get more accurate projections of your application performance on next-generation GPUs even before you have the hardware. The metrics collected by Offload Modeling can help you understand the performance of the kernels running on the baseline GPU. The interactive HTML report gives a GUI-like experience and allows you to switch between the Offload Modeling and GPU Roofline Insights perspectives, almost as in the GUI.