Intel® Advisor Help
To plot a Roofline chart, Intel® Advisor runs two steps: the Survey analysis, followed by the Characterization analysis (trip counts and FLOP).
Intel® Advisor calculates compute operations (FLOP and INTOP) as a weighted sum of the following groups of instructions: BASIC COMPUTE, FMA, BIT, DIV, POW, MATH.
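As a reading aid, the weighted sum described above can be written as the formula below. The symbols are notation introduced here and are not taken from Intel Advisor: OP stands for either FLOP or INTOP, N_g is the number of executed instructions in group g, and w_g is the weight of that group. The weight values are not stated in this section.

```latex
\[
\mathrm{OP} \;=\; \sum_{g \in G} w_g \, N_g,
\qquad
G = \{\text{BASIC COMPUTE},\ \text{FMA},\ \text{BIT},\ \text{DIV},\ \text{POW},\ \text{MATH}\}
\]
```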
Intel Advisor automatically determines the data type of the collected operations from the destination (dst) register.
For convenience, Intel Advisor provides the --collect=roofline shortcut command line action, which runs both the Survey and Characterization analyses with a single command. This shortcut is the recommended way to run the GPU Roofline Insights perspective.
There are two methods to run the GPU Roofline analysis. Use one of the following:
Optionally, you can also run the Performance Modeling analysis as part of the GPU Roofline Insights perspective. If you select this analysis, it models your application's performance on a baseline GPU device as the target and compares it with the actual application performance. This data is used to suggest additional recommendations for performance optimization.
Info: In the commands below, replace myApplication with the path and name of your application executable before running a command. If your application requires additional command line options, add them after the executable name.
Method 1. Run the Shortcut Command
advisor --collect=roofline --profile-gpu --project-dir=./advi_results -- ./myApplication
This command collects data both for GPU kernels and CPU loops/functions in your application. For kernels running on GPU, it generates a Memory-Level Roofline.
advisor --collect=projection --profile-gpu --model-baseline-gpu --project-dir=./advi_results
This command models your application's potential performance on a baseline GPU as a target to determine additional optimization recommendations.
Method 2. Run the Analyses Separately
Use this method if you want to analyze an MPI application.
advisor --collect=survey --profile-gpu --project-dir=./advi_results -- ./myApplication
advisor --collect=tripcounts --flops --profile-gpu --project-dir=./advi_results -- ./myApplication
These commands collect data both for GPU kernels and CPU loops/functions in your application. For kernels running on GPU, they generate a Memory-Level Roofline.
advisor --collect=projection --profile-gpu --model-baseline-gpu --project-dir=./advi_results
This command models your application's potential performance on a baseline GPU as a target to determine additional optimization recommendations.
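The Method 2 commands can be chained in a small wrapper script so that a failed step stops the run. The following is a minimal sketch, not an Intel-provided script: the variables APP, PROJECT_DIR, and DRY_RUN and the functions run_step and gpu_roofline_separately are hypothetical names introduced for this example. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: run the Method 2 analyses in order and stop at the first failure.
# APP, PROJECT_DIR, and DRY_RUN are hypothetical variables for this example.
APP="${APP:-./myApplication}"
PROJECT_DIR="${PROJECT_DIR:-./advi_results}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = also execute them

run_step() {
    # Print the command line; execute it only when DRY_RUN=0.
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@" || exit 1
    fi
}

gpu_roofline_separately() {
    # Survey, then Characterization (trip counts and FLOP).
    run_step advisor --collect=survey --profile-gpu --project-dir="$PROJECT_DIR" -- "$APP"
    run_step advisor --collect=tripcounts --flops --profile-gpu --project-dir="$PROJECT_DIR" -- "$APP"
    # Optional Performance Modeling step.
    run_step advisor --collect=projection --profile-gpu --model-baseline-gpu --project-dir="$PROJECT_DIR"
}

gpu_roofline_separately
```

Set DRY_RUN=0 in the environment to execute the commands instead of only printing them.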
You can view the results in the Intel Advisor graphical user interface (GUI) or in the CLI, or generate an interactive HTML report. See View the Results below for details.
Analysis Details
The CPU / Memory Roofline Insights workflow includes only the Roofline analysis, which sequentially runs the Survey and Characterization (trip counts and FLOP) analyses.
The analysis has a set of additional options that modify its behavior and collect additional performance data.
Consider the following options:
Roofline Options
To run the Roofline analysis, use the following command line action: --collect=roofline.
Recommended action options:
Options | Description |
---|---|
--profile-gpu | Analyze GPU kernels. This option is required for each command. |
--target-gpu | Select a target GPU adapter to collect profiling data. The adapter configuration should be in the following format: <domain>:<bus>:<device-number>.<function-number>. Only decimal numbers are accepted. Use this option if you have more than one GPU adapter on your system. The default is the latest GPU architecture version found on your system. Tip: To see a list of GPU adapters available on your system, run advisor --help target-gpu and see the option description. |
--gpu-sampling-interval=<double> | Set an interval (in milliseconds) between GPU samples. By default, it is set to 1. |
--enable-data-transfer-analysis | Model data transfer between host memory and device memory. Use this option if you want to run the Performance Modeling analysis. |
--track-memory-objects | Attribute memory objects to the analyzed loops that accessed the objects. Use this option if you want to run the Performance Modeling analysis. |
--data-transfer=<level> | Set the level of details for modeling data transfers during Characterization. Use this option if you want to run the Performance Modeling analysis. Use one of the following values: |
See advisor Command Option Reference for more options.
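Because --target-gpu accepts only decimal numbers, an adapter address copied from a tool that prints hexadecimal PCI addresses (for example, lspci -D on Linux) needs converting first. The sketch below shows one way to do that. The helper name to_decimal_bdf and the sample address are hypothetical and not part of Intel Advisor. The list printed by advisor --help target-gpu remains the authoritative source for adapter values.

```shell
#!/bin/sh
# Sketch: convert a hexadecimal PCI address, such as the one printed by lspci -D,
# to the decimal <domain>:<bus>:<device-number>.<function-number> form.
# to_decimal_bdf is a hypothetical helper, not part of Intel Advisor.
to_decimal_bdf() {
    addr="$1"                 # for example 0000:4d:00.0
    domain="${addr%%:*}"      # text before the first ':'
    rest="${addr#*:}"         # bus:device.function
    bus="${rest%%:*}"
    devfn="${rest#*:}"        # device.function
    dev="${devfn%%.*}"
    fn="${devfn#*.}"
    # printf reads each 0x-prefixed argument as hexadecimal and prints it in decimal.
    printf '%d:%d:%d.%d\n' "0x$domain" "0x$bus" "0x$dev" "0x$fn"
}

to_decimal_bdf 0000:4d:00.0   # prints 0:77:0.0
```

The printed value is what you would then pass to the --target-gpu option.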
Performance Modeling Options
To run the Performance Modeling analysis, use the following command line action: --collect=projection.
The action options in the table below are required when you run the Performance Modeling analysis as part of the GPU Roofline Insights perspective:
Options | Description |
---|---|
--profile-gpu | Analyze GPU kernels. This option is required for each command. |
--enforce-baseline-decomposition | Use the same local size and SIMD width as measured on the baseline. This option is required. |
--model-baseline-gpu | Use the baseline GPU configuration as a target device for modeling. This option is required. This option automatically enables the --enforce-baseline-decomposition option, so you can use only --model-baseline-gpu. |
See advisor Command Option Reference for more options.
View the Results
Intel Advisor provides several ways to work with the GPU Roofline results.
View Results in GUI
When you run the Intel Advisor CLI, a project is created automatically in the directory specified with --project-dir. All collected results and analysis configurations are stored in the .advixeproj project, which you can open in the Intel Advisor GUI.
To open the project in GUI, you can run the following command:
advisor-gui <project-dir>
You first see a Summary report that includes performance characteristics for code regions in your code. The left side of the report shows metrics for code regions that run on a GPU; the right side shows metrics for code regions that run on a CPU. The report shows the following data:
Program metrics for all code regions executed on the GPU and loops/functions executed on the CPU, including total execution time, GPU usage effectiveness, and the number of executed operations.
Preview Roofline charts for CPU and GPU parts of your code. The charts plot an application's achieved performance and arithmetic intensity against the maximum achievable performance for the top three dots and for the total dot, which combines all loops/functions (for CPU) and kernels (for GPU). By default, the chart shows the Roofline for the dominant operation data type (INT or FLOAT). You can switch to a different data type using the FLOAT/INT toggle.
This pane also reports the number of operations transferred per second, bandwidth for different memory levels, and an instruction mix histogram (for GPU only).
Top five hotspots on CPU and GPU sorted by elapsed time.
Performance characteristics of how well the application uses hardware resources.
Information about the analyses executed and platforms that the data was collected on.
View an Interactive HTML Report
Intel Advisor enables you to export two types of HTML reports, which you can open in your preferred browser and share:
For details on exporting the HTML reports, see Work with Standalone HTML Reports.
Save a Read-only Snapshot
A snapshot is a read-only copy of a project result, which you can view at any time using the Intel Advisor GUI. To save an active project result as a read-only snapshot:
advisor --snapshot --project-dir=<project-dir> [--cache-sources] [--cache-binaries] -- <snapshot-path>
where:
To open the result snapshot in the Intel Advisor GUI, you can run the following command:
advisor-gui <snapshot-path>
You can visually compare the saved snapshot against the current active result or other snapshot results.
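To keep several snapshots apart for later comparison, one option is to put a date in the snapshot path. The sketch below only prints the resulting command line. The function name snapshot_cmd and the ./snapshots directory layout are hypothetical choices for this example, and both optional caching flags from the syntax above are included as an illustration.

```shell
#!/bin/sh
# Sketch: build the snapshot command line with a dated snapshot path.
# snapshot_cmd and the ./snapshots layout are hypothetical, for illustration only.
snapshot_cmd() {
    project_dir="$1"
    snapshot_path="$2"
    echo "advisor --snapshot --project-dir=$project_dir --cache-sources --cache-binaries -- $snapshot_path"
}

# Print the command for a snapshot named after today's date (YYYYMMDD).
snapshot_cmd ./advi_results "./snapshots/gpu_roofline_$(date +%Y%m%d)"
```

Review the printed command, then run it from the same directory.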
Continue to identify performance bottlenecks on GPU. For details about the metrics reported, see Accelerator Metrics.