The GPU Roofline Insights perspective enables you to estimate and visualize the actual performance of GPU kernels against hardware-imposed performance ceilings, using benchmarks and hardware metric profiling, and to determine the main limiting factor.
There are two ways to run the GPU Roofline Insights perspective: from the Intel® Advisor GUI and from the command line interface (CLI). You can open results collected with either method in the GUI.
Run GPU Roofline Insights Perspective from Intel® Advisor GUI
In the Analysis Workflow pane, use the drop-down menu to select the GPU Roofline Insights perspective, set the data collection accuracy level to Low, and click the button to run it. At this accuracy level, Intel Advisor:
For details about data collection accuracy presets, see Intel Advisor User Guide: GPU Roofline Accuracy Presets. Upon completion, Intel Advisor displays a GPU Roofline Summary. Switch to the GPU Roofline Regions tab to view the Roofline Chart and identify the main factors limiting the performance of your application.
GPU profiling is applicable only to Intel® Processor Graphics.
A Roofline chart plots an application's achieved performance and arithmetic intensity against the machine's maximum achievable performance:
In general:
Depending on your system configuration, the following rooflines might be available on the Roofline chart:
The greater the distance between a dot and the highest achievable roofline, the more opportunity exists for performance improvement.
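The relationship between a dot and its roofline can be written with the standard roofline bound. This is a sketch of the general model, not Intel Advisor's exact computation. The symbols are assumptions introduced here: I is the arithmetic intensity of a kernel (operations per byte), B is the bandwidth of a given memory level, and P_peak is the peak compute rate. The numeric values are made up for illustration only.

```latex
P_{\text{attainable}}(I) = \min\left(P_{\text{peak}},\; I \cdot B\right)
% Illustrative values (assumed): P_peak = 1000 GFLOPS, B = 100 GB/s, I = 2 FLOP/byte
P_{\text{attainable}}(2) = \min\left(1000,\; 2 \cdot 100\right) = 200\ \text{GFLOPS}
```

Under these assumed numbers, a kernel measured at 50 GFLOPS sits a factor of 4 below its memory roofline, which is the distance the chart shows between the dot and the line above it.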
The GPU Roofline chart is based on a CPU Roofline chart layout, but there are some differences:
The dots on the chart correspond to OpenCL, OpenMP, Level Zero and SYCL kernels, while in the CPU version, they correspond to individual loops.
Some displayed information and controls (for example, thread/core count) are not relevant to GPU Roofline. For more information, see the table below.
The GPU Roofline chart enables you to view arithmetic intensity of one kernel at multiple memory levels. To do so, double-click a dot representing this kernel or select it and press ENTER. The dots that appear on the Roofline chart correspond to different memory levels used to calculate arithmetic intensity. Hover over a dot to identify its arithmetic intensity. To show or hide certain dots from a chart, use the Memory Level drop-down filter.
Run GPU Roofline Insights Perspective from Command Line Interface
To run the GPU Roofline Insights perspective using the advisor command line interface, use the following command:
advisor --collect=roofline --profile-gpu --project-dir=./advi --search-dir src:p=./advi -- myApplication
Where the project-dir option specifies the project directory in which Intel Advisor stores the collected results, and the profile-gpu option enables profiling of GPU kernels.
This command runs in batch mode and performs two analyses one after another:
advisor --collect=survey --profile-gpu --project-dir=./advi --search-dir src:p=./advi -- myApplication
advisor --collect=tripcounts --no-trip-counts --flop --profile-gpu --project-dir=./advi --search-dir src:p=./advi -- myApplication
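If you prefer to run the two analyses as separate steps, for example to stop when the first one fails, a small wrapper script is one option. The following is a minimal sketch, not an official Intel Advisor script. The names PROJECT_DIR, APP, DRY_RUN, run_step, and collect_gpu_roofline are hypothetical and exist only in this example. The advisor options are copied from the commands in this section. By default, the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical wrapper around the survey and FLOP collection steps.
# PROJECT_DIR, APP, DRY_RUN, run_step and collect_gpu_roofline are
# illustrative names, not part of Intel Advisor.
PROJECT_DIR="${PROJECT_DIR:-./advi}"
APP="${APP:-myApplication}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode; otherwise run it and stop on failure.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@" || exit 1
    fi
}

collect_gpu_roofline() {
    # Step 1: Survey analysis with GPU profiling enabled.
    run_step advisor --collect=survey --profile-gpu \
        --project-dir="$PROJECT_DIR" --search-dir src:p="$PROJECT_DIR" -- "$APP"
    # Step 2: FLOP collection through the tripcounts analysis
    # (trip counts themselves are disabled by --no-trip-counts).
    run_step advisor --collect=tripcounts --no-trip-counts --flop --profile-gpu \
        --project-dir="$PROJECT_DIR" --search-dir src:p="$PROJECT_DIR" -- "$APP"
}

collect_gpu_roofline
```

Both steps write to the same project directory, so the results can be opened together in the GUI or passed to the report command.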
To view the achieved performance of your application against hardware-imposed performance ceilings on an interactive Roofline chart, open the collected results in the Intel Advisor GUI or use the following command to generate an interactive HTML Roofline report:
advisor --report=roofline --profile-gpu --report-output=./advi/advisor-roofline.html --project-dir=./advi
Where the report-output option specifies the directory and the HTML file into which Intel Advisor saves the generated report.
By default, Intel Advisor generates a FLOAT Roofline chart. To switch to the INT Roofline chart, add the --data-type=int option to your command.
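To keep both chart variants side by side, you could generate one HTML report per data type. The sketch below only prints the two report commands so that you can review them before running. The function name roofline_report_cmd, the PROJECT_DIR variable, and the output file names are assumptions made for this example. The advisor options are the ones shown in this section.

```shell
#!/bin/sh
# Sketch only: PROJECT_DIR, roofline_report_cmd and the output file
# names are hypothetical and not defined by Intel Advisor.
PROJECT_DIR="${PROJECT_DIR:-./advi}"

# Print the report command for one data type: "float" (default) or "int".
roofline_report_cmd() {
    dtype="$1"
    out="$PROJECT_DIR/advisor-roofline-$dtype.html"
    cmd="advisor --report=roofline --profile-gpu"
    # FLOAT is the default chart, so only the INT chart needs an extra option.
    if [ "$dtype" = "int" ]; then
        cmd="$cmd --data-type=int"
    fi
    echo "$cmd --report-output=$out --project-dir=$PROJECT_DIR"
}

for dtype in float int; do
    roofline_report_cmd "$dtype"
done
```

Each printed line is a complete command that can be pasted into a terminal once the collection has finished.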
For details about generating CLI reports, see the respective section in the Intel Advisor User Guide or use the following command in your terminal:
advisor --help report
Intel Advisor enables you to create a read-only result snapshot using the following command:
advisor --snapshot --project-dir=./advi --pack --cache-sources --cache-binaries -- /tmp/my_proj_snapshot
What's Next
Use the GPU Roofline Summary (available in GUI only) to compare performance of your application on a CPU and on a GPU device.
Investigate performance metrics for your kernels and recommendations with possible optimization steps in the GPU Code Analytics pane.
See Also
Explore a use case for optimizing GPU usage described in Intel Advisor Cookbook: Identify Code Regions to Offload to GPU and Visualize GPU Usage.