Troubleshooting
Template Operators Cannot Be Found
The compiler does not emit code for a method of a template class
unless that method is used in the program. This is standard C++
behavior, and it becomes inconvenient when you want to invoke an
omitted function from the debugger. The issue is seen in SYCL*
because the basic classes (range, id, nd_range, accessor, and others)
are templates that have several overloaded operators. Examples:
p index
Output:
$1 = sycl::_V1::id<1> = {32}
p index + 5
Output:
Could not find operator +.
As a solution, you can explicitly instantiate a template class in
your source. The methods of the template instance are then available
in the binary. You can surround the instantiations with
#ifndef NDEBUG and #endif to avoid code bloat in release builds.
Example:
#ifndef NDEBUG
template class sycl::id<1>;
template class sycl::id<2>;
template class sycl::id<3>;
template class sycl::range<1>;
template class sycl::nd_range<1>;
#endif // #ifndef NDEBUG
Accessor Operator [] Cannot Be Resolved
Elements of an accessor object cannot be accessed using the
multi-dimensional access syntax during expression evaluation. See the
example below:
print anAccessor[5][3][4]
Example output:
Cannot resolve function operator[] to any overloaded instance
Instead, use an id object:
print workItemId
Example output:
$1 = sycl::_V1::id<3> = {5, 3, 4}
print anAccessor[workItemId]
Example output:
$2 = 1234
GDB Appears to Hang at Synchronization Points
Synchronization points (e.g., a semaphore) may be implicitly inserted by the compiler at kernel boundaries. For example, consider the following code:
1 #include <omp.h>
2
3 int main(void)
4 {
5
6 const int N = 8;
7 float sum = 0.f;
8 const float alpha = 2.f;
9
10 #pragma omp parallel for simd reduction(+: sum)
11 for(size_t j = 0; j < N; j++)
12 sum += alpha;
13
14 return 0;
15 }
You can run this code on your supported GPU device:
OMP_TARGET_OFFLOAD=MANDATORY LIBOMPTARGET_DEVICETYPE=GPU LIBOMPTARGET_PLUGIN=LEVEL0 gdb-oneapi -q omp_test
A barrier is implicitly inserted by the compiler between lines 10 and 11.
You can see this by inspecting the disassembly:
(gdb) info line 9
Line 9 of "main.cpp" starts at address 0xfffdf000 <main.extracted(void)> and ends at 0xfffdf460 <main.extracted(void)+1120>.
(gdb) info line 11
Line 11 of "main.cpp" starts at address 0xfffe4a70 <main.extracted(void)+23152> and ends at 0xfffe4bb0 <main.extracted(void)+23472>.
(gdb) disassemble /m 0xfffdf000,0xfffe4bb0
0x00000000fffe0980 <main.extracted(void)+6528>: (W) send.gtwy (1|M0) null r4 null 0x0 0x02000004 {@1,$11} // wr:1+0, rd:0; signal barrier
Inserting a breakpoint at line 11 and stepping into the for-loop via
set scheduler-locking step and step may cause the debugger to appear
to hang. Only the current thread is resumed, while the other threads,
which are expected to signal the barrier, stay stopped. To avoid this,
set the scheduler-locking mode to replay or off at synchronization
points.
Note
By default, GDB runs in all-stop mode, i.e., all threads stop when GDB takes control. In non-stop mode, by contrast, other threads continue to run when a breakpoint is hit. Refer to the GDB documentation for more information.
Another way to work around this issue and step a single thread over a synchronization point is to use non-stop mode and insert a temporary breakpoint at the synchronization point. A temporary breakpoint is hit by only one thread and is deleted afterwards. Together with non-stop mode, this ensures that only the first thread that hits the breakpoint is stopped, while the other threads continue to the synchronization point. You can then step the stopped thread over the synchronization point.
Kernel Stops Responding
If the kernel that is offloaded to a GPU stops responding:
Check whether there are any stray `gdbserver-ze` processes running in the background:
ps -u $USER | grep gdbserver-ze
Stop background `gdbserver-ze` processes, if there are any:
killall -9 gdbserver-ze
If the breakpoints defined inside the kernel are not hit when running on a GPU, and Virtualization Technology for Directed I/O (VT-d) is enabled, disable VT-d in the BIOS menu.