Intel® VTune™ Profiler User Guide
Introduction
Tuning Methodology
Tutorials and Samples
Notational Conventions
Get Help
Product Website and Support
Related Information
Install Intel® VTune™ Profiler
Sampling Drivers
Set Up System for GPU Analysis
Rebuild and Install the Kernel for GPU Analysis
Rebuild and Install Module i915 for GPU Analysis on CentOS*
Rebuild and Install Module i915 for GPU Analysis on Ubuntu*
Verify Intel® VTune™ Profiler Installation
Install VTune Profiler Server
Set Up Transport Security
Configure User Authentication/Authorization
Security Best Practices
Open Intel® VTune™ Profiler
Get Started with Intel® VTune™ Profiler
Intel® VTune™ Profiler Graphical User Interface
Web Server Interface
Microsoft Visual Studio* Integration
Eclipse* and Intel System Studio IDE Integration
Containerization Support
Run VTune Profiler in a Container
Profile Container Targets from the Host
Set Up Project
WHERE: Analysis System
Analysis System Options
WHAT: Analysis Target
Analysis Target Options
HOW: Analysis Types
Search Directories
Search Order
Set Up Analysis Target
Prepare Application for Analysis
Windows* Targets
Install the Sampling Drivers for Windows* Targets
Debug Information for Windows* Application Binaries
Compiler Switches for Performance Analysis on Windows* Targets
Debug Information for Windows* System Libraries
Add Administrative Privileges
Linux* Targets
Build and Install the Sampling Drivers for Linux* Targets
Debug Information for Linux* Application Binaries
Compiler Switches for Performance Analysis on Linux* Targets
Enable Linux* Kernel Analysis
Resolution of Symbol Names for Linux-Loadable Kernel Modules
Analyze Statically Linked Binaries on Linux* Targets
Set Up Remote Linux* Target
Set Up Linux* System for Remote Analysis
Configure SSH Access for Remote Collection
Search Directories for Remote Linux* Targets
Temporary Directory for Performance Results on Linux* Targets
Embedded Linux* Targets
Configure Yocto Project* and VTune Profiler with the Integration Layer
Configure Yocto Project*/Wind River* Linux* and Intel® VTune™ Profiler with the Intel System Studio Integration Layer
Configure Yocto Project* and Intel® VTune™ Profiler with the Linux* Target Package
FreeBSD* Targets
Set Up FreeBSD* System
QNX* Targets
Managed Code Targets
.NET* Targets
Windows Store Application Targets
Go* Application Targets
Android* Targets
Build and Install Sampling Drivers for Android* Targets
Set Up Android* System
Enable Java* Analysis on Android* System
Prepare an Android* Application for Analysis
Analyze Unplugged Devices
Search Directories for Android* Targets
Intel® Xeon Phi™ Processor Targets
Targets in Virtualized Environments
Profile Targets on a VMware* Guest System
Profile Targets on a Parallels* Guest System
Profile Targets on a KVM* Guest System
Profile KVM Kernel Modules from the Host
Profile KVM Kernel and User Space on the KVM System
Profile KVM Kernel and User Space from the Host
Profile Targets on a Xen* Virtualization Platform
Profile Targets in the Hyper-V* Environment
Targets in a Cloud Environment
Arbitrary Targets
Embedded System Targets
Analyze Performance
User-Mode Sampling and Tracing Collection
Hardware Event-based Sampling Collection
Allow Multiple Runs or Multiplex Events
Hardware Event-based Sampling Collection with Stacks
Performance Snapshot
Algorithm Group
Hotspots Analysis for CPU Usage Issues
Hotspots View
Anomaly Detection Analysis (preview)
Anomaly Detection View
Memory Consumption Analysis
Memory Consumption and Allocations View
Microarchitecture Analysis Group
Microarchitecture Exploration Analysis for Hardware Issues
Microarchitecture Exploration View
Microarchitecture Pipe
Memory Access Analysis for Cache Misses and High Bandwidth Issues
Memory Usage View
Parallelism Analysis Group
Threading Analysis
Threading Efficiency View
HPC Performance Characterization Analysis
HPC Performance Characterization View
Input and Output Analysis
Analyze Platform Performance
Analyze DPDK Applications
Analyze SPDK Applications
Analyze Linux Kernel I/O
Accelerators Analysis Group
GPU Offload Analysis
GPU Compute/Media Hotspots Analysis (Preview)
GPU Compute/Media Hotspots View
CPU/FPGA Interaction Analysis
CPU/FPGA Interaction View
NPU Exploration Analysis
NPU Exploration View
Platform Analysis Group
System Overview Analysis
Analyze Interrupts
Analyze Latency Issues
Platform Analysis
Hybrid CPU Analysis
Source Code Analysis
Custom Analysis
Custom Analysis Options
Hardware Event List
Hardware Event Skid
Instructions Retired Event
Precise Events
Linux* and Android* Kernel Analysis
Sampling Interval
Sample After Value
Energy Analysis
Run Energy Analysis
View Energy Analysis Data with Intel® VTune™ Profiler
Interpret Energy Analysis Data with Intel® VTune™ Profiler
Code Profiling Scenarios
Java* Code Analysis
Python* Code Analysis
Intel® Threading Building Blocks Code Analysis
MPI Code Analysis
OpenSHMEM* Code Analysis with Fabric Profiler
GPU Application Analysis on Intel® HD Graphics and Intel® Iris® Graphics
GPU OpenCL™ Application Analysis
Intel® Media SDK Program Analysis
Frame Data Analysis
Task Analysis
Control Data Collection
Finalization
Pause Data Collection
Limit Data Collection
Generate Command Line Configuration from GUI
Minimize Collection Overhead
Import External Data
Use a Custom Collector
Create a CSV File with External Data
Import Linux Perf* Trace with VTune Profiler Metrics
Examples of CSV Format and Imported Data
Manage Data Views
Switch Viewpoints
Control Window Synchronization
View Stacks
Call Stack Mode
Metrics Distribution Over Call Stacks
Manage Grid Views
Manage Timeline View
Change Threshold Values
Choose Data Format
Group and Filter Data
View Data on Inline Functions
Analyze Loops
Stitch Stacks for Intel® oneAPI Threading Building Blocks or OpenMP* Analysis
Search for Data
Manage Result Files
VTune Profiler Filenames and Locations
Import Results and Traces into VTune Profiler GUI
Compare Results
Compare Source Code
View Comparison Data
Comparison Summary
Bottom-up Comparison
Top-down Tree Comparison
Intel® VTune™ Profiler Command Line Interface
vtune Command Syntax
vtune Actions
Run Command Line Analysis
performance-snapshot Command Line Analysis
hotspots Command Line Analysis
anomaly-detection Command Line Analysis
threading Command Line Analysis
memory-consumption Command Line Analysis
hpc-performance Command Line Analysis
uarch-exploration Command Line Analysis
memory-access Command Line Analysis
tsx-exploration Command Line Analysis
tsx-hotspots Command Line Analysis
sgx-hotspots Command Line Analysis
gpu-hotspots Command Line Analysis
gpu-offload Command Line Analysis
npu Command Line Analysis
graphics-rendering Command Line Analysis
fpga-interaction Command Line Analysis
io Command Line Analysis
system-overview Command Line Analysis
runsa/runss Custom Command Line Analysis
Configure Analysis Options from Command Line
Collect System-Wide Data from Command Line
Collect Data on Remote Linux* Systems from Command Line
Configure GPU Analysis from Command Line
Specify Search Directories from Command Line
Specify Result Directory from Command Line
Pause Collection from Command Line
Manage Analysis Duration from Command Line
Limit Data Collection from Command Line
Work with Results from Command Line
View Command Line Results in the GUI
Import Results from Command Line
Re-finalize Results from Command Line
Generate Command Line Reports
Summary Report
Hotspots Report
Hardware Events Report
Callstacks Report
Timeline Report
Top-down Report
GPU Compute/Media Hotspots Report
gprof-cc Report
Difference Report
View Source Objects from Command Line
Save and Format Command Line Reports
Filter and Group Command Line Reports
Command Line Usage Scenarios
Use VTune Profiler Server in Containers
Android* Target Analysis from the Command Line
OpenMP* Analysis from the Command Line
Java* Code Analysis from the Command Line
Command Line Interface Reference
Option Descriptions and General Rules
allow-multiple-runs
analyze-kvm-guest
analyze-system
app-working-dir
archive
call-stack-mode
collect
collect-with
column
command
cpu-mask
csv-delimiter
cumulative-threshold-percent
custom-collector
data-limit
discard-raw-data
duration
filter
finalization-mode
finalize
format
group-by
help
import
inline-mode
knob
kvm-guest-kallsyms
kvm-guest-modules
limit
loop-mode
mrte-mode
no-follow-child
no-summary
no-unplugged-mode
quiet
report
report-knob
report-output
report-width
result-dir
resume-after
return-app-exitcode
ring-buffer
search-dir
show-as
sort-asc
sort-desc
source-object
source-search-dir
stack-size
start-paused
strategy
target-install-dir
target-system
target-tmp-dir
target-duration-type
target-pid
target-process
time-filter
trace-mpi
user-data-dir
verbose
version
Report Problems from Command Line
API Support
Support for Instrumentation and Tracing Technology API (ITT API)
Basic Usage and Configuration
Configure Your Build System
Attach ITT APIs to a Launched Application
Instrument Your Application
Minimize ITT API Overhead
View Instrumentation and Tracing Technology (ITT) API Task Data in Intel® VTune™ Profiler
Instrumentation and Tracing Technology API Reference
Domain API
String Handle API
Collection Control API
Thread Naming API
Task API
Frame API
Histogram API
User-Defined Synchronization API
Event API
Counter API
Context Metadata API
Load Module API
Memory Allocation APIs
JIT Profiling API
Using JIT Profiling API
JIT Profiling API Reference
iJIT_NotifyEvent
iJIT_IsProfilingActive
iJIT_GetNewMethodID
System APIs Supported by Intel® VTune™ Profiler
Troubleshooting
Best Practices: Resolve Intel® VTune™ Profiler BSODs, Crashes, and Hangs in Windows* OS
Error Message: Application Sets Its Own Handler for Signal
Error Message: Cannot Enable Event-Based Sampling Collection
Error Message: Cannot Collect GPU Hardware Metrics
Error Message: Cannot Load Data File
Error Message: Cannot Locate Debugging Information
Error Message: Cannot Open Data
Error Message: Client Is Not Authorized to Connect to Server
Error Message: Root Privileges Required for Processor Graphics Events
Error Message: No Pre-built Driver Exists for This System
Error Message: Not All OpenCL™ API Profiling Callbacks Are Received
Error Message: Problem Accessing the Sampling Driver
Error Message: Required Key Not Available
Error Message: Scope of ptrace System Call Is Limited
Error Message: Stack Size Is Too Small
Error Message: Symbol File Is Not Found
Problem: Analysis of the .NET* Application Fails
Problem: Cannot Access VTune Profiler Documentation
Problem: CPU time for Hotspots or Threading Analysis is Too Low
Problem: 'Events = Sample After Value (SAV) * Samples' Is Not True If Multiple Runs Are Disabled
Problem: Guessed Stack Frames
Problem: GUI Hangs or Crashes
Problem: Inaccurate Sum in the Grid
Problem: Information Collected via ITT API Is Not Available When Attaching to a Process
Problem: No GPU Utilization Data Is Collected
Problem: Same Functions Are Compared As Different Instances
Problem: Skipped Stack Frames
Problem: Stack in the Top-Down Tree Window Is Incorrect
Problem: Stacks in Call Stack and Bottom-Up Panes Are Different
Problem: System Functions Appear in the User Functions Only Mode
Problem: Intel® VTune™ Profiler is Slow to Respond When Collecting or Displaying Data
Problem: Intel® VTune™ Profiler is Slow on X-Servers with SSH Connection
Problem: Unexpected Paused Time
Problem: {Unknown Timer} in the Platform Power Analysis Viewpoint
Problem: Unknown Critical Error Due to Disabled Loopback Interface
Problem: Unknown Frames
Problem: Unsupported Microsoft* Windows* OS
Reference
User Interface
Context Menu: Grid
Context Menus: Call Stack Pane
Context Menus: Project Navigator
Context Menus: Source/Assembly Window
Dialog Box: Binary/Symbol Search
Dialog Box: Source Search
Hot Keys
Menu: Customize Grouping
Menu: Intel VTune Profiler
Pane: Call Stack
Pane: Options - General
Pane: Options - Result Location
Pane: Options - Source/Assembly
Project Navigator
Pane: Timeline
Toolbar: Configure Analysis
Toolbar: Filter
Toolbar: Source/Assembly
Toolbar: Intel VTune Profiler
Window: Bandwidth - Platform Power Analysis
Window: Bottom-up
Window: Caller/Callee
Window: Cannot Find <file type> File
Window: Collection Log
Window: Compare Results
Window: Configure Analysis
Window: Core Wake-ups - Platform Power Analysis
Window: Correlate Metrics - Platform Power Analysis
Window: CPU C/P States - Platform Power Analysis
Window: Debug
Window: Event Count - Hardware Events
Window: Flame Graph
Window: Graphics - GPU Compute/Media Hotspots
Window: Graphics C/P States - Platform Power Analysis
Window: NC Device States - Platform Power Analysis
Window: Platform
Window: Platform Power Analysis
Window: Sample Count - Hardware Events
Window: SC Device States - Platform Power Analysis
Window: Summary
Window: Summary - Input and Output Summary
Window: Summary - Microarchitecture Exploration
Window: Summary - GPU Analysis
Window: Summary - Hardware Events
Window: Summary - Hotspots by CPU Utilization
Window: Summary - HPC Performance Characterization
Window: Summary - Memory Consumption
Window: Summary - Memory Usage
Window: Summary - Platform Power Analysis
Window: System Sleep States - Platform Power Analysis
Window: Temperature/Thermal Sample - Platform Power Analysis
Window: Timer Resolution - Platform Power Analysis
Window: Top-down Tree
Window: Uncore Event Count - Hardware Events
Window: Wakelocks - Platform Power Analysis
CPU Metrics Reference
GPU Metrics Reference
ALU0 Active
ALU0 Instructions
ALU1 Active
ALU1 Instructions
ALU2 Active
ALU2 Instructions
ALU0 and ALU1 Active
ALU0 and ALU2 Active
ALU0 and XMX Utilization
Average Time
Computing Threads Started
Computing Threads Started, Threads/sec
CPU Time
EU 2 FPU Pipelines Active
EU Array Active
EU Array Idle
EU Array Stalled/Idle
EU Array Stalled
EU IPC Rate
EU Send pipeline active
EU Threads Occupancy
Global
GPU EU Array Usage
GPU Instruction Cache L3 Miss Ratio
GPU L3 Atomics
GPU L3 Bound
GPU L3 Miss Ratio
GPU L3 Misses
GPU L3 Misses, Misses/sec
GPU Load Store Cache Miss Ratio
GPU Load Store Cache L3 Miss Ratio
GPU LSC Atomics
GPU LSC Fences
GPU Media Read Requests
GPU Media Write Requests
GPU SLM Atomics
GPU SLM Fences
GPU Memory Read Bandwidth, GB/sec
GPU Memory Texture Read Bandwidth, GB/sec
GPU Memory Write Bandwidth, GB/sec
GPU Sampler L3 Miss Ratio
GPU Texel Quads Count, Count/sec
GPU Utilization
Graphics Security Controller Busy
Host to GPU Memory Read Bandwidth
Host-to-GPU Memory Write Bandwidth
Instance Count
Instruction Cache Miss Ratio
L3 Busy
L3 Input Available
L3 Instruction Cache Bandwidth
L3 Load Store Cache Read Bandwidth
L3 Load Store Cache Write Bandwidth
L3 Miss Ratio
L3 Output Ready
L3 Read Bandwidth
L3 SQ Full
L3 Stalled
L3 Write Bandwidth
L3 Sampler Bandwidth, GB/sec
L3 Shader Bandwidth, GB/sec
LLC Miss Rate due GPU Lookups
LLC Miss Ratio due GPU Lookups
LSC Input Available
LSC Output Ready
LSC Partial Writes
Local
Maximum GPU Utilization
Multiple Pipe Utilization
Occupancy
PS EU Active %
PS EU Stall %
Ratio to Max Bandwidth, %
Ratio to Max Bandwidth, %
Ratio to Max Bandwidth, %
Render/GPGPU Command Streamer Loaded
Sampler Input Available
Sampler Output Ready
Samples Blended
Samples Killed in PS, pixels
Samples Written
Sampler Busy
Sampler Is Bottleneck
Shared Local Memory Read Bandwidth, GB/sec
Shared Local Memory Write Bandwidth, GB/sec
SIMD Width
SLM Bank Conflicts
Stack-to-stack Incoming Bandwidth
Stack-to-stack Outgoing Bandwidth
System Memory Read Bandwidth
System Memory Write Bandwidth
Size
Total, GB/sec
Thread Dispatcher Active
TLB Misses
Total Time
Typed Memory Read Bandwidth, GB/sec
Typed Memory Write Bandwidth, GB/sec
Typed Reads Coalescence
Typed Writes Coalescence
Untyped Memory Read Bandwidth, GB/sec
Untyped Memory Write Bandwidth, GB/sec
Untyped Reads Coalescence
Untyped Writes Coalescence
Video Codec Busy
Video Codec Read Requests
Video Codec Write Requests
Video Codec 2 Busy
Video Codec 2 Read Requests
Video Codec 2 Write Requests
Video Enhancement Busy
Video Enhancement Read Requests
Video Enhancement Write Requests
Video Enhancement 2 Busy
Video Enhancement 2 Read Requests
Video Enhancement 2 Write Requests
VS EU Active
VS EU Stall
XVE Barrier Stall
XVE Bit Manipulation Instructions
XVE Control Stall
XVE Dist or Acc Stall
XVE INT16\INT32\INT64\FP16\FP32\FP64 Instructions
XVE FP16\BF16\INT8\INT4\INT2 XMX Instructions
XVE Instruction Fetch Stall
XVE Pipe Stall
XVE Send Stall
XVE SBID Stall
XVE XMX Instructions
XVE XMX Pipeline Active
OpenCL™ Kernel Analysis Metrics Reference
Computing Task Total Time
Instance Count
SIMD Width
SIMD Utilization
Work Size
Energy Analysis Metrics Reference
Available Core Time
C-State
D0ix States
DRAM Self Refresh
Energy Consumed (mJ)
Idle Wake-ups
P-State
S0ix States
Temperature
Timer Resolution
Total Time in C0 State
Total Time in Non-C0 States
Total Time in S0 State
Total Wake-up Count
Wake-ups
Wake-ups/sec per Core
Intel Processor Events Reference
Notices and Disclaimers