Intel® Advisor Help

Explore Performance Gain from GPU-to-GPU Modeling (Preview)

Enabled Analyses

Performance data collection for GPU kernels only + Performance modeling for GPU kernels only

Note

This is a preview feature available only from command line with the Intel® Advisor Python* scripts. See Run GPU-to-GPU Offload Modeling.

Result Interpretation

By default, the GPU-to-GPU performance modeling results are written to <project-dir>/e<NNN>/pp<NNN>/data.0. To view the results, go to this directory, or to the directory that you specified with the out-dir option, and open the interactive HTML report report.html.
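Because the e<NNN> and pp<NNN> directory names vary from run to run, you may want to locate the generated reports programmatically. The following is a minimal sketch (not part of Intel Advisor) that globs for report.html files under the default result layout; the ./advi_results project directory name is only an example:

```python
from pathlib import Path

def find_gpu_modeling_reports(project_dir: str) -> list:
    """Collect report.html files under the default
    <project-dir>/e<NNN>/pp<NNN>/data.0 result layout.
    The e<NNN>/pp<NNN> names vary per run, so glob for them."""
    return sorted(Path(project_dir).glob("e*/pp*/data.0/report.html"))

# Example usage: "./advi_results" is a hypothetical project directory.
reports = find_gpu_modeling_reports("./advi_results")
if reports:
    print(f"Open {reports[-1]} in a browser")
```

If you passed the out-dir option to the scripts, point the search at that directory instead of the default project-directory layout.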

The structure and controls of the HTML report generated for the GPU-to-GPU performance modeling are similar to those of the HTML report for the CPU-to-GPU offload modeling, but the content differs: for the GPU-to-GPU modeling, Intel Advisor models performance only for the GPU-enabled parts of your application.

The report includes the following tabs: Summary, Offloaded Regions, Non-Offloaded Regions, Call Tree, Configuration, and Logs. Switch between the tabs using the links at the top left.

Note

The Non-Offloaded Regions tab shows only GPU kernels that cannot be modeled, for example, kernels with some required metrics missing. If all kernels are modeled, the tab is empty. For the GPU-to-GPU modeling, a low estimated speedup is not a reason for not offloading a kernel.

When you open the report, it shows the Summary tab first. In this tab, you can review a summary of the modeling results and estimated performance metrics for some GPU kernels in your application.

Summary of the GPU-to-GPU Offload Modeling HTML report

Note

The Top non offloaded pane shows only GPU kernels that cannot be modeled. If all kernels are modeled, the pane is empty. For the GPU-to-GPU modeling, a low estimated speedup is not a reason for not offloading a kernel.

To see details about each GPU kernel, go to the Offloaded Regions or the Call Tree tab. These tabs report the same metrics, but the Offloaded Regions tab shows only modeled kernels, while the Call Tree tab shows all kernels, including non-modeled ones.

Offloaded Regions tab of the GPU-to-GPU Offload Modeling HTML report

Go to the Configuration tab to review, in read-only mode, the detailed configuration of the target device used for modeling. You can also review the comments for each parameter and its possible values.

Go to the Logs tab to see the command line used to run the analyses and all messages reported to the console during script execution. This tab shows four types of messages, Error, Warning, Info, and Debug, in the order of their appearance in the console.

Note

By default, only Error, Warning, and Info messages are shown. To control which message types are shown, hover over the Severity column header and click the menu icon to open the filters pane.

Next Steps