Intel® Advisor Help

Run Vectorization and Code Insights Perspective from Command Line

The Vectorization and Code Insights perspective includes several analyses that you can run depending on the desired result. The main analysis is the Survey, which collects performance data for loops and functions in your application and identifies under-vectorized and non-vectorized loops/functions. The Survey analysis alone is enough to get basic insights into your application performance.

Tip

See the Intel Advisor cheat sheet for a quick reference on the command line interface.

Prerequisites

Set Intel Advisor environment variables with an automated script to enable the command line interface (CLI).
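
The following is a minimal sketch of checking that the CLI is available before you run any collection. The helper name ensure_advisor and the ADVISOR_ENV_SCRIPT variable are hypothetical, and the default script path is only an assumption about a typical installation, so adjust it to your system.

```shell
#!/bin/sh
# Hypothetical helper: make sure the `advisor` command can be found.
# ADVISOR_ENV_SCRIPT is an assumed variable; its default value is an
# assumption about a typical installation and may differ on your system.
ADVISOR_ENV_SCRIPT="${ADVISOR_ENV_SCRIPT:-/opt/intel/oneapi/setvars.sh}"

ensure_advisor() {
    # Already available: nothing to do.
    if command -v advisor >/dev/null 2>&1; then
        return 0
    fi
    # Otherwise source the environment script in the current shell, if present.
    if [ -f "$ADVISOR_ENV_SCRIPT" ]; then
        . "$ADVISOR_ENV_SCRIPT" >/dev/null 2>&1
    fi
    # Report whether the command is available now.
    command -v advisor >/dev/null 2>&1
}

if ensure_advisor; then
    echo "advisor CLI is available"
else
    echo "advisor CLI not found; set the environment variables first"
fi
```

Sourcing the script with `.` rather than executing it matters here, because the variables must be set in the shell that later runs the advisor commands.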

Run Vectorization and Code Insights Perspective

Info: In the commands below, replace myApplication with the path and name of your application executable before running a command. If your application requires additional command line options, add them after the executable name.

  1. Run the Survey analysis.
    advisor --collect=survey --project-dir=./advi_results -- ./myApplication
  2. Run the Characterization analysis to collect trip counts and FLOP data:
    advisor --collect=tripcounts --flop --stacks --project-dir=./advi_results -- ./myApplication
  3. Optional: Run the Memory Access Patterns analysis for loops/functions with the Possible Inefficient Memory Access Pattern issue:
    advisor --collect=map --select=has-issue --project-dir=./advi_results -- ./myApplication
  4. Optional: Run the Dependencies analysis to check for loop-carried dependencies in loops/functions with the Assumed dependency present issue:
    advisor --collect=dependencies --project-dir=./advi_results --select=has-issue -- ./myApplication
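
The steps above can be sketched as a single script. This version only prints the four commands so that it is safe to try without a project; the function name perspective_commands and the APP and PROJECT_DIR variables are hypothetical, and the default values are the placeholders used in the steps.

```shell
#!/bin/sh
# Print the four perspective commands in order. Printing instead of running
# keeps the sketch safe to try; pipe the output to `sh` to execute it.
# APP and PROJECT_DIR are hypothetical variables with placeholder defaults.
APP="${APP:-./myApplication}"
PROJECT_DIR="${PROJECT_DIR:-./advi_results}"

perspective_commands() {
    app=$1
    project_dir=$2
    echo "advisor --collect=survey --project-dir=$project_dir -- $app"
    echo "advisor --collect=tripcounts --flop --stacks --project-dir=$project_dir -- $app"
    # Optional, high-overhead analyses limited to loops with reported issues.
    echo "advisor --collect=map --select=has-issue --project-dir=$project_dir -- $app"
    echo "advisor --collect=dependencies --select=has-issue --project-dir=$project_dir -- $app"
}

perspective_commands "$APP" "$PROJECT_DIR"
```

All four commands share one project directory, so each later analysis adds its data to the result that the Survey created.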

You can view the results in the Intel Advisor graphical user interface (GUI), print a summary to a command prompt/terminal, or save to a file. See View the Results below for details.

Analysis Details

The Vectorization and Code Insights workflow includes the following analyses:

  1. Survey to collect initial performance data.
  2. Characterization with trip counts and FLOP data to collect additional performance details.
  3. Memory Access Patterns (optional) to identify memory traffic data and memory usage issues.
  4. Dependencies (optional) to identify loop-carried dependencies.

Each analysis has a set of additional options that modify its behavior and collect additional performance data. The more analyses you run and the more options you use, the more data you get about your application.

Consider the following options:

Characterization Options

To run the Characterization analysis, use the following command line action: --collect=tripcounts.

Recommended action options:

  • --flop: Collect data about floating-point and integer operations, memory traffic, and mask utilization metrics for AVX-512 platforms.
  • --stacks: Enable advanced collection of call stack data.
  • --enable-cache-simulation: Model CPU cache behavior on your target application.
  • --cache-config=<config>: Set the cache hierarchy to collect modeling data for CPU cache behavior. Use with --enable-cache-simulation. The value should follow the template [<num_of_caches>]:[<num_of_ways_caches_connected>]:[<cache_size>]:[<cacheline_size>] for each of the three cache levels, separated with a /.
  • --cachesim-associativity=<num>: Set the cache associativity for modeling CPU cache behavior: 1 | 2 | 4 | 8 (default) | 16. Use with --enable-cache-simulation.
  • --cachesim-mode=<mode>: Set the focus for modeling CPU cache behavior: cache-misses | footprint | utilization. Use with --enable-cache-simulation.

See advisor Command Option Reference for more options.
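
As an illustration of combining these options, the sketch below prints a Characterization command with cache simulation enabled and rejects values outside the sets listed above. The function name characterization_command is hypothetical, and the project directory and application name are the placeholders used earlier.

```shell
#!/bin/sh
# Hypothetical helper: print a Characterization command with cache simulation,
# rejecting values that are not in the documented sets.
characterization_command() {
    assoc=$1   # cache associativity
    mode=$2    # cache simulation focus
    case "$assoc" in
        1|2|4|8|16) ;;
        *) echo "unsupported associativity: $assoc" >&2; return 1 ;;
    esac
    case "$mode" in
        cache-misses|footprint|utilization) ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
    # echo joins its arguments with single spaces into one command line.
    echo "advisor --collect=tripcounts --flop --stacks" \
         "--enable-cache-simulation" \
         "--cachesim-associativity=$assoc --cachesim-mode=$mode" \
         "--project-dir=./advi_results -- ./myApplication"
}

characterization_command 8 footprint
```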

Memory Access Patterns Options

The Memory Access Patterns analysis is optional because it adds a high overhead. To run the Memory Access Patterns analysis, use the following command line action: --collect=map.

Recommended action options:

  • --select=<string>: Select loops for the analysis by loop IDs, source locations, or criteria such as scalar, has-issue, or markup=<markup-mode>. This option is required. See select for more selection options.
  • --enable-cache-simulation: Model CPU cache behavior on your target application.
  • --cachesim-cacheline-size=<num>: Set the cache line size (in bytes) for modeling CPU cache behavior: 4 | 8 | 16 | 32 | 64 (default) | 128 | 256 | 512 | 1024 | 2048 | 4096 | 8192 | 16384 | 32768 | 65536. Use with --enable-cache-simulation.
  • --cachesim-sets=<num>: Set the cache set size (in bytes) for modeling CPU cache behavior: 256 | 512 | 1024 | 2048 | 4096 (default) | 8192. Use with --enable-cache-simulation.

See advisor Command Option Reference for more options.
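
The cache line sizes listed above are the powers of two from 4 to 65536 bytes. A minimal sketch of checking a value before building the Memory Access Patterns command might look as follows; the function names is_valid_cacheline_size and map_command are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: accept only powers of two from 4 to 65536 bytes.
is_valid_cacheline_size() {
    size=$1
    case "$size" in
        ''|0*|*[!0-9]*) return 1 ;;   # reject empty, zero-prefixed, or non-numeric input
    esac
    [ "$size" -ge 4 ] && [ "$size" -le 65536 ] || return 1
    # Divide by two until an odd value remains; a power of two ends at 1.
    while [ $((size % 2)) -eq 0 ]; do
        size=$((size / 2))
    done
    [ "$size" -eq 1 ]
}

# Print a Memory Access Patterns command for loops with reported issues.
map_command() {
    line_size=$1
    is_valid_cacheline_size "$line_size" || return 1
    echo "advisor --collect=map --select=has-issue --enable-cache-simulation" \
         "--cachesim-cacheline-size=$line_size" \
         "--project-dir=./advi_results -- ./myApplication"
}

map_command 64
```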

Dependencies Options

The Dependencies analysis is optional because it adds a high overhead and is mostly necessary if you have scalar loops/functions in your application. To run the Dependencies analysis, use the following command line action: --collect=dependencies.

Recommended action options:

  • --select=<string>: Select loops for the analysis by loop IDs, source locations, or criteria such as scalar, has-issue, or markup=<markup-mode>. This option is required. See select for more selection options.
  • --filter-reductions: Mark all potential reductions with a specific diagnostic.

See advisor Command Option Reference for more options.
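
Because --select is required for this analysis, a wrapper can refuse to build a command without it. The sketch below is one way to do that; dependencies_command is a hypothetical name, and the command is printed rather than run.

```shell
#!/bin/sh
# Hypothetical wrapper: print a Dependencies command, refusing to build one
# without a selection because --select is required for this analysis.
dependencies_command() {
    selection=$1
    if [ -z "$selection" ]; then
        echo "a --select value is required, for example has-issue or scalar" >&2
        return 1
    fi
    echo "advisor --collect=dependencies --select=$selection --filter-reductions" \
         "--project-dir=./advi_results -- ./myApplication"
}

dependencies_command has-issue
```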

View the Results

Intel Advisor provides several ways to view the Vectorization and Code Insights results.

View Result in CLI

You can print the results collected in the CLI and save them to a .txt, .csv, or .xml file.

For example, to generate the Survey report:

advisor --report=survey --project-dir=./advi_results

You should see a similar result:

 ID   Function Call Sites                      Total   Self                   Type                              Why No Vectorization   Vector ISA   Compiler         Average         Min              Max                                 Call Count          Transformations       Source Location                Module
                and Loops                      Time    Time                                                                                         Estimated Gain   Trip Count      Trip Count       Trip Count  
__________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
14   [loop in main at mmult_serial.cpp:79]     0.495s  0.495s   Vectorized Versions   1 vectorization possible but seems inefficient...      SSE2     <2.42x        127; 127; 1; 7   127; 127; 1; 7   128; 128; 1; 7   524252; 524324; 530432; 530432   Interchanged; Unrolled   mmult_serial.cpp:79   1_mmult_serial.exe   
 6   -[loop in main at mmult_serial.cpp:79]    0.275s  0.275s     Vectorized (Body)                                                          SSE2      2.42x                   127              127              128                           524252   Unrolled; Interchanged   mmult_serial.cpp:79   1_mmult_serial.exe   
 3   -[loop in main at mmult_serial.cpp:79]    0.205s  0.205s     Vectorized (Body)                                                          SSE2      2.42x                   127              127              128                           524324   Unrolled; Interchanged   mmult_serial.cpp:79   1_mmult_serial.exe   
 7   -[loop in main at mmult_serial.cpp:79]    0.015s  0.015s                Peeled                                                                                                               1                1                1          530432             Interchanged   mmult_serial.cpp:79   1_mmult_serial.exe   
11   -[loop in main at mmult_serial.cpp:79]        0s      0s             Remainder     vectorization possible but seems inefficient...                                                           7                7                7          530432             Interchanged   mmult_serial.cpp:79   1_mmult_serial.exe   
 4   [loop in main at mmult_serial.cpp:79]     0.510s  0.015s                Scalar     inner loop was already vectorized                                                     1024             1024             1024                             1024             Interchanged   mmult_serial.cpp:79   1_mmult_serial.exe   
12   [loop in main at mmult_serial.cpp:79]     0.510s      0s       Scalar Versions     1 inner loop was already vectorized                                                   1024             1024             1024                                1                            mmult_serial.cpp:79   1_mmult_serial.exe   
 5   -[loop in main at mmult_serial.cpp:79]    0.510s      0s                Scalar     inner loop was already vectorized                                                     1024             1024             1024                                1                            mmult_serial.cpp:79   1_mmult_serial.exe   

The result is also saved into a text file advisor-survey.txt located at ./advi_results/eNNN/hsNNN.
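
Because the eNNN and hsNNN directory names differ between runs, it can be easier to search for the report by file name. A minimal sketch, assuming the project directory used in the commands above (find_survey_reports is a hypothetical name):

```shell
#!/bin/sh
# List every saved Survey text report under a project directory.
# Prints nothing if the directory does not exist.
find_survey_reports() {
    project_dir=$1
    [ -d "$project_dir" ] || return 0
    find "$project_dir" -type f -name 'advisor-survey.txt'
}

find_survey_reports "${PROJECT_DIR:-./advi_results}"
```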

You can generate a report for any analysis you run. The generic report command looks as follows:

advisor --report=<analysis-type> --project-dir=<project-dir> --format=<format>

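where:

  • <analysis-type> is the analysis to generate the report for, such as survey, map, or dependencies.
  • <project-dir> is the project directory that you specified when collecting the data.
  • <format> is the report file format: text, csv, or xml.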

You can also generate a report with the data from all analyses run and save it to a CSV file with the --report=joined action as follows:

advisor --report=joined --report-output=<path-to-csv>

where --report-output=<path-to-csv> is a path and a name for a .csv file to save the report to. For example, /home/report.csv. This option is required to generate a joined report.
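
To produce a report for each analysis you ran, the generic report command can be repeated in a loop. The sketch below prints the commands instead of running them; report_commands is a hypothetical name, and the csv format value is an assumption based on the .csv files mentioned above.

```shell
#!/bin/sh
# Print one report command per analysis. The analysis names match the
# --collect actions used earlier; the format value is passed in by the caller.
report_commands() {
    project_dir=$1
    format=$2
    for analysis in survey map dependencies; do
        echo "advisor --report=$analysis --project-dir=$project_dir --format=$format"
    done
}

report_commands ./advi_results csv
```

A report is only meaningful for an analysis that was actually collected, so drop map or dependencies from the loop if you skipped those optional steps.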

See advisor Command Line Interface Reference for more options.

View Result in GUI

When you run the Intel Advisor CLI, a project is created automatically in the directory specified with --project-dir. All the collected results and analysis configurations are stored in the .advixeproj project, which you can view in the Intel Advisor GUI.

To open the project in GUI, you can run the following command:

advisor-gui <project-dir>

Note

If the report does not open, click Show Result on the Welcome pane.

You first see a Vectorization Summary report with overall information about the vectorized and non-vectorized loops/functions in your code and the vectorization efficiency.

[Figure: Vectorization summary report]

Save a Read-only Snapshot

A snapshot is a read-only copy of a project result, which you can view at any time using the Intel Advisor GUI. To save an active project result as a read-only snapshot:

advisor --snapshot --project-dir=<project-dir> [--cache-sources] [--cache-binaries] -- <snapshot-path>

where:

  • --cache-sources is an option to add application source code to the snapshot.
  • --cache-binaries is an option to add application binaries to the snapshot.
  • <snapshot-path> is a path and a name for the snapshot. For example, if you specify /tmp/new_snapshot, a snapshot is saved in the /tmp directory as new_snapshot.advixeexpz. You can skip this argument to save the snapshot to the current directory as snapshotXXX.advixeexpz.

To open the result snapshot in the Intel Advisor GUI, you can run the following command:

advisor-gui <snapshot-path>
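
The optional snapshot flags can be assembled conditionally. The sketch below prints the resulting command; snapshot_command and its yes/no arguments are hypothetical, and the paths are the placeholders used above.

```shell
#!/bin/sh
# Hypothetical helper: print a snapshot command. Pass "yes" as the third and
# fourth arguments to pack sources and binaries into the snapshot.
snapshot_command() {
    project_dir=$1
    snapshot_path=$2
    with_sources=$3
    with_binaries=$4
    cmd="advisor --snapshot --project-dir=$project_dir"
    if [ "$with_sources" = "yes" ]; then
        cmd="$cmd --cache-sources"
    fi
    if [ "$with_binaries" = "yes" ]; then
        cmd="$cmd --cache-binaries"
    fi
    echo "$cmd -- $snapshot_path"
}

snapshot_command ./advi_results /tmp/new_snapshot yes no
```

Packing sources and binaries makes the snapshot larger, but lets someone open it on a machine that does not have the application.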

You can visually compare the saved snapshot against the current active result or other snapshot results.

Next Steps

Continue to Find Loops that Benefit from Better Vectorization to understand the results. For details about the metrics reported, see CPU and Memory Metrics.
