Debugging Options
Auto-Attach#
The auto-attach feature lets the debugger listen to debug events from the
GPU. When the feature is enabled, the debugger launches a gdbserver-ze
process that listens to GPU debug events, and connects that process to the
debugger. For each device on the system, an inferior is created on the
gdbserver-ze connection. This ensures that you can debug kernels offloaded
to the GPU.
The auto-attach feature is enabled by default and does not affect debugging
on the CPU device. To eliminate the extra output that the feature produces,
you can turn it off with the INTELGT_AUTO_ATTACH_DISABLE environment
variable. To do so, run the following command in the shell before starting
gdb-oneapi:
export INTELGT_AUTO_ATTACH_DISABLE=1
To enable the feature again:
unset INTELGT_AUTO_ATTACH_DISABLE
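If the debugger is started from a script rather than an interactive shell,
the same variable can be set for a single session only. The following is a
minimal sketch in Python; the helper names debugger_env and launch_debugger
are hypothetical, and the program path passed to launch_debugger is whatever
application you want to debug.

```python
import os
import subprocess


def debugger_env(disable_auto_attach, base_env=None):
    """Return a copy of the environment with auto-attach switched on or off."""
    env = dict(os.environ if base_env is None else base_env)
    if disable_auto_attach:
        # Same effect as: export INTELGT_AUTO_ATTACH_DISABLE=1
        env["INTELGT_AUTO_ATTACH_DISABLE"] = "1"
    else:
        # Same effect as: unset INTELGT_AUTO_ATTACH_DISABLE
        env.pop("INTELGT_AUTO_ATTACH_DISABLE", None)
    return env


def launch_debugger(program, disable_auto_attach=True):
    """Start gdb-oneapi on `program` using the adjusted environment."""
    return subprocess.run(["gdb-oneapi", program],
                          env=debugger_env(disable_auto_attach))
```

Because only a copy of the environment is modified, the calling shell keeps
its own setting and later sessions are unaffected.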
Reducing Overhead#
The info threads command may take a noticeable amount of time to complete
because, when debugging GPUs, it has to fetch data for a large number of
threads. To reduce the overhead, run info threads without printing the
frame arguments. To change the setting globally, use:
set print frame-arguments none
To apply the setting to a single command execution, use:
with print frame-arguments none -- info threads
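If you use the single-command form often, it can be wrapped in a custom
command through GDB's Python API. The following is a minimal sketch,
assuming the debugger was built with Python support; the command name
threads-brief and the helper without_frame_arguments are hypothetical.

```python
try:
    import gdb  # only available when the script runs inside the debugger
except ImportError:
    gdb = None


def without_frame_arguments(command):
    """Prefix a GDB command so it runs with frame-argument printing off."""
    return "with print frame-arguments none -- " + command.strip()


if gdb is not None:
    class ThreadsBrief(gdb.Command):
        """List threads without printing frame arguments."""

        def __init__(self):
            # "threads-brief" is a hypothetical name chosen for this sketch.
            super().__init__("threads-brief", gdb.COMMAND_STATUS)

        def invoke(self, argument, from_tty):
            # Forward any arguments (for example -stopped) to info threads.
            gdb.execute(without_frame_arguments("info threads " + argument))

    ThreadsBrief()
```

After the script is loaded with the source command, typing threads-brief
runs info threads with frame arguments suppressed and leaves the global
setting untouched.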
Pretty-Printing#
The pretty-printing feature simplifies the display of complex objects. If a pretty-printer is registered for the type of value you are going to print, the debugger simplifies the output. Otherwise, the debugger prints the value normally.
Intel® Distribution for GDB* supports pretty-printing for the SYCL* types
id, buffer, and range from the sycl namespace.
You can write your own pretty-printer for any type. Refer to "Writing a Pretty Printer" for more information.
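As an illustration, the following minimal sketch uses GDB's Python API to
define a pretty-printer for a hypothetical type, struct point { int x; int y; }.
The type, the printer class PointPrinter, and the collection name "example"
are assumptions made for this sketch and are not part of the product.

```python
try:
    import gdb
    import gdb.printing  # only available when the script runs inside the debugger
except ImportError:
    gdb = None


class PointPrinter:
    """Pretty-printer for a hypothetical `struct point { int x; int y; }`."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # Fields of a gdb.Value are accessed by name, like dictionary keys.
        return "point(x={}, y={})".format(int(self.val["x"]),
                                          int(self.val["y"]))


def build_pretty_printer():
    """Collect the printers of this sketch under the name "example"."""
    printers = gdb.printing.RegexpCollectionPrettyPrinter("example")
    # The regular expression is matched against the type name of the value.
    printers.add_printer("point", "^point$", PointPrinter)
    return printers


if gdb is not None:
    gdb.printing.register_pretty_printer(gdb.current_objfile(),
                                         build_pretty_printer())
```

After the script is loaded with the source command, values of the matching
type are printed through to_string, and the new printer should be listed
together with the built-in ones.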
To display the list of pretty-printers available, run the following command:
info pretty-print
Example output:
global pretty-printers:
  SYCL
    sycl::_V1::buffer
    sycl::_V1::id
    sycl::_V1::range
  builtin
    mpx_bound128
Pretty-printing is enabled by default. For example, when you print the
value of the index variable:
print index
The output is the following:
$10 = sycl::_V1::id<1> = {32}
To bypass pretty-printing for a single command and display the raw content
instead, use the /r flag:
print /r index
Example output:
$11 = {<sycl::_V1::detail::array<1>> = {common_array = {32, <No data fields>}}}
To disable all pretty-printers, use the following command:
disable pretty-printer
To enable pretty-printers:
enable pretty-printer
Prettify Frames#
Some C++ template and SYCL constructs produce function names that span
multiple lines, which makes the output of info threads and backtrace
difficult to read.
You can use frame filters to change how frames are printed by the backtrace command. For details, refer to the Frame Filter API section of the GDB documentation.
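As an illustration, the following minimal sketch defines a frame filter that
truncates long function names in backtrace output. It assumes the debugger
was built with Python support; the filter name short-names, the priority of
100, the 40-character limit, and the helper shorten are all hypothetical
choices made for this sketch.

```python
try:
    import gdb
    from gdb.FrameDecorator import FrameDecorator
except ImportError:
    gdb = None  # running outside the debugger


def shorten(name, limit=40):
    """Truncate `name` to at most `limit` characters, marking the cut."""
    if len(name) <= limit:
        return name
    return name[: limit - 3] + "..."


if gdb is not None:
    class ShortNameDecorator(FrameDecorator):
        """Frame decorator that reports a shortened function name."""

        def function(self):
            name = super().function()
            # function() may return an address instead of a name.
            return shorten(name) if isinstance(name, str) else name

    class ShortNameFilter:
        """Frame filter that wraps every frame in ShortNameDecorator."""

        def __init__(self):
            self.name = "short-names"  # hypothetical filter name
            self.priority = 100
            self.enabled = True
            # Register the filter globally.
            gdb.frame_filters[self.name] = self

        def filter(self, frame_iter):
            return map(ShortNameDecorator, frame_iter)

    ShortNameFilter()
```

After the script is loaded with the source command, the filter is applied to
subsequent backtrace output.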
You can also change what is printed for each frame globally with the
print frame-info setting:
(gdb) set print frame-info source-line
(gdb) info threads -stopped
Id Target Id Frame
<...>
2.3:[0-15] Thread 1.3 53 int dim1 = wiID[1];
2.4:[0-15] Thread 1.4 53 int dim1 = wiID[1];
2.5:[0-15] Thread 1.5 53 int dim1 = wiID[1];
2.6:[0-15] Thread 1.6 53 int dim1 = wiID[1];
2.7:[0-15] Thread 1.7 53 int dim1 = wiID[1];
2.8:[0-15] Thread 1.8 53 int dim1 = wiID[1];
<...>
Refer to the GDB documentation for more information.
Print Settings for Kernel Data#
Consider the following excerpt from the sample program array-transform.cpp:
using namespace std;
using namespace sycl;
[...]
int main(int argc, char *argv[]) {
  constexpr size_t length = 64;
  int input[length];
  int output[length];

  // Initialize the input
  for (int i = 0; i < length; i++)
    input[i] = i + 100;

  try {
    queue q(default_selector_v, dpc_common::exception_handler);
    [...]
    range data_range{length};
    buffer buffer_in{input, data_range};
    buffer buffer_out{output, data_range};

    q.submit([&](auto &h) {
      accessor in(buffer_in, h, read_only);
      accessor out(buffer_out, h, write_only);

      // kernel-start
      h.parallel_for(data_range, [=](id<1> index) {
        size_t id0 = GetDim(index, 0);
        int element = in[index]; // breakpoint-here
To inspect the contents of the sycl::buffer object buffer_in, which has
length 64, from inside the kernel, use the following cast to make it print
as an array:
(gdb) print (int[64]) *in.MData
$1 = {100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113,
114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128,
129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143,
144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158,
159, 160, 161, 162, 163}
Note that inside the kernel, the data in the host's sycl::buffer must be
accessed through a sycl::accessor (here, in).
Use the print elements setting to limit the number of elements printed:
(gdb) set print elements 10
(gdb) print (int[64]) *in.MData
$2 = {100, 101, 102, 103, 104, 105, 106, 107, 108, 109...}
If the output contains repeated elements, consider adjusting the
print repeats setting:
(gdb) show print repeats
Threshold for repeated print elements is 10.