Intel® Advisor Help

GPU Roofline Accuracy Levels in Command Line

For each perspective, Intel® Advisor offers several levels of collection accuracy. Each accuracy level is a set of analyses and properties that controls what data is collected and how detailed the collection is. The higher the accuracy level you choose, the higher the runtime overhead.

In the CLI, each accuracy level corresponds to a set of commands with specific options that you run one by one to get the desired result.

The following accuracy levels are available:

| Comparison / Accuracy Level | Low | Medium |
|---|---|---|
| Overhead | 5 - 10x | 15 - 50x |
| Goal | Analyze kernels in your application running on GPU | Analyze kernels running on GPU and loops/functions running on CPU in more detail |
| Analyses | Survey with GPU profiling + Characterization (FLOP) | Survey with GPU profiling + Characterization (Trip Counts and FLOP with call stacks for CPU and CPU cache simulation) |
| Result for kernels on GPU | Memory-level GPU Roofline (for CARM, L3, SLM, GTI) | Memory-level GPU Roofline (for CARM, L3, SLM, GTI) |
| Result for loops/functions on CPU | Cache-aware CPU Roofline for L1 cache | Memory-level Roofline with call stacks (for L1, L2, L3, DRAM) |

You can generate commands for a desired accuracy level from the Intel Advisor GUI. See Generate Command Lines from GUI for details.

Note

There is a variety of techniques available to minimize data collection, result size, and execution overhead. Check Minimize Analysis Overhead.

Low Accuracy

To run the GPU Roofline Insights perspective with low accuracy:

advisor --collect=roofline --project-dir=./advi --profile-gpu -- myApplication
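
The accuracy table lists the Low level as Survey with GPU profiling plus Characterization (FLOP). As a rough sketch, the single roofline command above can therefore also be run as two separate collections against the same project directory. The `run` wrapper and the `DRY_RUN` variable are hypothetical helpers added for illustration, not part of Intel Advisor. By default the script only prints the commands; set `DRY_RUN=0` to execute them. Option availability may vary between Intel Advisor versions.

```shell
#!/bin/sh
# Sketch: low-accuracy GPU Roofline as two explicit steps.
# ./advi and myApplication are the placeholder names used in this topic.
PROJECT_DIR=./advi
APP=myApplication

# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

low_accuracy() {
    # Step 1: Survey analysis with GPU profiling.
    run advisor --collect=survey --profile-gpu --project-dir="$PROJECT_DIR" -- "$APP" &&
    # Step 2: Characterization that collects FLOP data for the same project.
    run advisor --collect=tripcounts --flop --profile-gpu --project-dir="$PROJECT_DIR" -- "$APP"
}

low_accuracy
```

Both steps must use the same project directory so that the Characterization data is attached to the Survey result.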

Medium Accuracy

To run the GPU Roofline Insights perspective with medium accuracy:

advisor --collect=roofline --project-dir=./advi --profile-gpu --stacks --enable-data-transfer-analysis -- myApplication

You can view the results in the Intel Advisor GUI or generate an interactive HTML report.
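
For example, the following sketch shows one way to produce the HTML report or open the result from a terminal. It assumes the `./advi` project directory used above. The report file name `gpu_roofline_report.html`, the `run` wrapper, and the `DRY_RUN` variable are hypothetical choices made for illustration. The script prints the commands by default; set `DRY_RUN=0` to execute them. Check the CLI reference for your Intel Advisor version to confirm the report options.

```shell
#!/bin/sh
# Sketch: view GPU Roofline results collected into ./advi.
PROJECT_DIR=./advi
REPORT=./gpu_roofline_report.html   # hypothetical output file name

# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Generate an interactive HTML report from the collected result.
make_html_report() {
    run advisor --report=all --project-dir="$PROJECT_DIR" --report-output="$REPORT"
}

# Open the same result in the Intel Advisor GUI.
open_gui() {
    run advisor-gui "$PROJECT_DIR"
}

make_html_report
```

Call `open_gui` instead of `make_html_report` to inspect the result in the GUI.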
