Intel® Advisor Help

Run GPU Roofline Insights Perspective from Command Line

To plot a Roofline chart, Intel® Advisor runs two steps:

  1. Collect OpenCL™ kernel timings and memory data using the Survey analysis with GPU profiling.

  2. Measure the hardware limitations and collect floating-point and integer operations data using the Characterization analysis with GPU profiling.

    Intel® Advisor calculates compute operations (FLOP and INTOP) as a weighted sum of the following groups of instructions: BASIC COMPUTE, FMA, BIT, DIV, POW, and MATH.

    Note

    Intel Advisor automatically determines the data type of the collected operations from the destination (dst) register.
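The weighted sum mentioned in step 2 can be written out as below. This is only a notational sketch: the symbols OP, G, w_g, and n_g are introduced here for illustration, and the weight values themselves are not stated on this page.

```latex
\mathrm{OP} \;=\; \sum_{g \in G} w_g \, n_g,
\qquad
G = \{\text{BASIC COMPUTE},\ \text{FMA},\ \text{BIT},\ \text{DIV},\ \text{POW},\ \text{MATH}\}
```

Here n_g is the number of executed instructions in group g and w_g is the weight applied to that group. The same form applies to FLOP and INTOP, with the data type of each instruction taken from its dst register.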

For convenience, Intel Advisor provides the --collect=roofline command line action, a shortcut that runs both the Survey and Characterization analyses with a single command. This shortcut is the recommended way to run the GPU Roofline Insights perspective.
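The two steps can also be run as separate commands. The sketch below assumes that the Survey step maps to --collect=survey and the Characterization step to --collect=tripcounts with --flop. The project directory and application name are placeholders. The script prints both commands and runs them only if the advisor CLI and the application are present.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to your project and application.
PROJECT_DIR=./advi
APP=./myApplication

# Step 1: Survey analysis with GPU profiling (kernel timings and memory data).
SURVEY_CMD="advisor --collect=survey --profile-gpu --project-dir=$PROJECT_DIR -- $APP"

# Step 2: Characterization analysis with GPU profiling.
# Assumption: the tripcounts action with --flop collects FLOP and INTOP data.
CHARACTERIZATION_CMD="advisor --collect=tripcounts --flop --profile-gpu --project-dir=$PROJECT_DIR -- $APP"

# Always print the commands so they can be reviewed or copied.
echo "$SURVEY_CMD"
echo "$CHARACTERIZATION_CMD"

# Execute only when the advisor CLI is on PATH and the application exists.
if command -v advisor >/dev/null 2>&1 && [ -x "$APP" ]; then
    $SURVEY_CMD && $CHARACTERIZATION_CMD
fi
```

Both commands must point to the same project directory so that the Characterization data is added to the Survey result.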

Prerequisites

  1. Configure your system to analyze GPU kernels.
  2. Set Intel Advisor environment variables with an automated script to enable the advisor command line interface (CLI).
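As a quick check for the second prerequisite, the sketch below sources the oneAPI environment script and verifies that the advisor command is available. The script location /opt/intel/oneapi/setvars.sh is an assumption based on a default Linux installation, and ADVISOR_READY is a hypothetical variable name; adjust both for your system.

```shell
#!/bin/bash
# Assumption: default oneAPI install location on Linux.
SETVARS=/opt/intel/oneapi/setvars.sh

# Source the environment script only if it exists at the assumed location.
if [ -f "$SETVARS" ]; then
    . "$SETVARS" >/dev/null || true
fi

# Record whether the advisor CLI is now reachable through PATH.
if command -v advisor >/dev/null 2>&1; then
    ADVISOR_READY=yes
else
    ADVISOR_READY=no
fi

echo "advisor CLI available: $ADVISOR_READY"
```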

Note

In the commands below, the options in square brackets ([--<option>]) are optional. Use them if you want to change what data is collected.

Plot a GPU Roofline Chart

Run the Roofline analysis for GPU using one of the following methods:

where:

Note

The Roofline analysis collects data for both GPU kernels and CPU loops/functions in your application. For kernels running on a GPU, Intel Advisor generates a Memory-Level Roofline by default.

If you want to collect advanced data for loops/functions running on a CPU, use the --stacks and/or --enable-cache-simulation options.

See advisor Command Line Interface Reference for more options.

Example

Collect GPU Roofline data for a GPU adapter with the address 0:0:2.0:

advisor --collect=roofline --project-dir=./advi --profile-gpu --target-gpu=0:0:2.0 -- myApplication
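As a variation on this example, the sketch below also requests the advanced CPU data mentioned in the note, by adding --stacks and --enable-cache-simulation to the same shortcut command. The project directory, application name, and adapter address are placeholders. The command is printed and runs only if the advisor CLI and the application are present.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to your project, application, and adapter.
PROJECT_DIR=./advi
APP=./myApplication
GPU_ADDRESS=0:0:2.0

# GPU Roofline collection plus call stacks and cache simulation for CPU code.
ROOFLINE_CMD="advisor --collect=roofline --profile-gpu --target-gpu=$GPU_ADDRESS --stacks --enable-cache-simulation --project-dir=$PROJECT_DIR -- $APP"

echo "$ROOFLINE_CMD"

# Execute only when the advisor CLI is on PATH and the application exists.
if command -v advisor >/dev/null 2>&1 && [ -x "$APP" ]; then
    $ROOFLINE_CMD
fi
```

The extra options are likely to increase collection overhead, so consider adding them only when you need the CPU-side details.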

View the Results

Intel Advisor provides several ways to work with the GPU Roofline results.

View Results in GUI

When you run the Intel Advisor CLI, a project is created automatically in the directory specified with --project-dir. All collected results and analysis configurations are stored in the .advixeproj project, which you can view in Intel Advisor.

To open the project in GUI, you can run the following command:

advisor-gui <project-dir>

Note

If the report does not open, click Show Result on the Welcome pane.

You first see a Summary report that includes performance characteristics for code regions in your code. The left side of the report shows metrics for code regions that run on a GPU, and the right side shows metrics for code regions that run on a CPU. The report shows the following data:

View an Interactive HTML Report

To generate an interactive HTML report for the GPU Roofline chart from CLI, run the following command:

advisor --report=roofline --project-dir=<project-dir> --report-output=<path> --gpu [--data-type=<type>]

where:

When you open the report, you see the GPU Roofline chart with the selected configuration. In this report, you can:

Interactive GPU Roofline HTML report
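A concrete form of the report command might look like the sketch below. The output file name is a placeholder, and the --data-type value float is an assumption about the accepted values (float, int, and mixed are believed to be valid). The command is printed and runs only if the advisor CLI and the project directory are present.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to your project and preferred output file.
PROJECT_DIR=./advi
REPORT_PATH=./gpu_roofline.html
# Assumption: float, int, and mixed are the accepted data types.
DATA_TYPE=float

REPORT_CMD="advisor --report=roofline --project-dir=$PROJECT_DIR --report-output=$REPORT_PATH --gpu --data-type=$DATA_TYPE"

echo "$REPORT_CMD"

# Execute only when the advisor CLI is on PATH and the project exists.
if command -v advisor >/dev/null 2>&1 && [ -d "$PROJECT_DIR" ]; then
    $REPORT_CMD
fi
```

Open the resulting HTML file in a web browser to explore the chart.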

Save a Read-only Snapshot

A snapshot is a read-only copy of a project result, which you can view at any time using the Intel Advisor GUI. To save an active project result as a read-only snapshot:

advisor --snapshot --project-dir=<project-dir> [--cache-sources] [--cache-binaries] -- <snapshot-path>

where:

  • --cache-sources is an option to add application source code to the snapshot.

  • --cache-binaries is an option to add application binaries to the snapshot.

  • <snapshot-path> is a path and a name for the snapshot. For example, if you specify /tmp/new_snapshot, the snapshot is saved in the /tmp directory as new_snapshot.advixeexpz. You can omit this argument to save the snapshot to the current directory as snapshotXXX.advixeexpz.

To open the result snapshot in the Intel Advisor GUI, you can run the following command:

advisor-gui <snapshot-path>

You can visually compare the saved snapshot against the current active result or other snapshot results.
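The snapshot command with both caching options can be sketched as follows, using the /tmp/new_snapshot path from the description above. The project directory is a placeholder, and SNAPSHOT_FILE is a hypothetical variable that only records the file name the snapshot is expected to have. The command is printed and runs only if the advisor CLI and the project directory are present.

```shell
#!/bin/sh
# Placeholder (assumption): adjust to your project directory.
PROJECT_DIR=./advi
SNAPSHOT_PATH=/tmp/new_snapshot

# Pack sources and binaries so the snapshot can be viewed on another machine.
SNAPSHOT_CMD="advisor --snapshot --project-dir=$PROJECT_DIR --cache-sources --cache-binaries -- $SNAPSHOT_PATH"

# File name the snapshot is expected to get, per the description above.
SNAPSHOT_FILE="${SNAPSHOT_PATH}.advixeexpz"

echo "$SNAPSHOT_CMD"
echo "expected snapshot file: $SNAPSHOT_FILE"

# Execute only when the advisor CLI is on PATH and the project exists.
if command -v advisor >/dev/null 2>&1 && [ -d "$PROJECT_DIR" ]; then
    $SNAPSHOT_CMD
fi
```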

Next Steps

Continue to identify performance bottlenecks on GPU. For details about the metrics reported, see Accelerator Metrics.
