Device Selection
Offloading code to a device (such as a CPU, GPU, or FPGA) is available for both DPC++ and OpenMP* applications.
DPC++ Device Selection in the Host Code
Host code can explicitly select a device type. To select a device, create a queue and initialize its device with one of the following selectors:
default_selector
cpu_selector
gpu_selector
accelerator_selector
If default_selector is used, the kernel runs on a device chosen by a
heuristic from the available compute devices (all of them, or a subset
based on the value of the ONEAPI_DEVICE_SELECTOR environment variable).
If a specific device type (such as cpu_selector or gpu_selector) is
used, that device type is expected to be available in the platform and,
when ONEAPI_DEVICE_SELECTOR is set, included in its filter. If no such
device is available, the runtime system throws an exception indicating
that the requested device is not available. This can happen, for
example, when an ahead-of-time (AOT) compiled binary is run on a
platform that does not contain the specified device type.
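The exception behavior described above can be handled in host code by trying a specific device type first and falling back to the default heuristic. The following is a minimal sketch, not a definitive implementation. It assumes a SYCL 2020 toolchain that provides <sycl/sycl.hpp> together with gpu_selector_v and default_selector_v (the SYCL 2020 spellings of gpu_selector and default_selector). The helper name selected_device_name is hypothetical, and the __has_include guard is there only so the file still compiles where no SYCL headers are installed.

```cpp
#include <iostream>
#include <string>

#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>

// Hypothetical helper: prefer a GPU, fall back to the default heuristic,
// and return the name of the device the queue ended up on.
std::string selected_device_name() {
  try {
    // Throws sycl::exception if no GPU is present or it is filtered out.
    sycl::queue q{sycl::gpu_selector_v};
    return q.get_device().get_info<sycl::info::device::name>();
  } catch (const sycl::exception& e) {
    std::cerr << "GPU not available: " << e.what() << '\n';
  }
  try {
    // Let the runtime choose among whatever devices remain visible.
    sycl::queue q{sycl::default_selector_v};
    return q.get_device().get_info<sycl::info::device::name>();
  } catch (const sycl::exception& e) {
    return std::string("no usable SYCL device: ") + e.what();
  }
}
#else
// Assumption: without SYCL headers there is nothing to select.
std::string selected_device_name() { return "SYCL headers not found"; }
#endif
```

Calling selected_device_name() once at startup and logging the result is a cheap way to confirm which device an AOT-compiled binary actually landed on.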
Note
While DPC++ applications can run on any supported target hardware, tuning is required to derive the best performance advantage on a given target architecture. For example, code tuned for a CPU likely will not run as fast on a GPU accelerator without modification.
ONEAPI_DEVICE_SELECTOR is a complex environment variable that lets you
limit the runtimes, compute device types, and compute device IDs that
the DPC++ runtime may use to a subset of all available combinations. The
compute device IDs correspond to those returned by the SYCL API, clinfo,
or sycl-ls (with the numbering starting at 0). An ID has no relation to
whether the device with that ID is of a certain type or supports a
specific runtime. Using a selector in code (such as gpu_selector) to
request a filtered-out device causes an exception to be thrown. Refer to
the environment variable description on GitHub for usage details and
example values:
https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md
The sycl-ls tool enumerates the devices available in the system. It is
strongly recommended to run this tool before running any SYCL or DPC++
program to make sure the system is configured properly. As part of the
enumeration, sycl-ls prints the ONEAPI_DEVICE_SELECTOR string as a
prefix of each device listing. The format of the sycl-ls output is:
[ONEAPI_DEVICE_SELECTOR] Platform_name, Device_name, Device_version [driver_version]
In the following example, the string enclosed in brackets ([ ]) at the
beginning of each line is the ONEAPI_DEVICE_SELECTOR string that
designates the specific device on which the program will run.
Device Selection Example
$ sycl-ls
[opencl:acc:0] Intel® FPGA Emulation Platform for OpenCL™, Intel® FPGA Emulation Device 1.2 [2021.12.9.0.24_005321]
[opencl:gpu:1] Intel® OpenCL HD Graphics, Intel® UHD Graphics 630 [0x3e92] 3.0 [21.37.20939]
[opencl:cpu:2] Intel® OpenCL, Intel® Core™ i7-8700 CPU @ 3.20GHz 3.0 [2021.12.9.0.24_005321]
[level_zero:gpu:0] Intel® Level-Zero, Intel® UHD Graphics 630 [0x3e92] 1.1 [1.2.20939]
[host:host:0] SYCL host platform, SYCL host device 1.2 [1.2]
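As a small illustration of the listing format, the bracketed prefix can be split into its three fields. This is a minimal sketch in standard C++ that assumes the backend:device_type:id layout shown in the example output above. DeviceTag and parse_device_tag are hypothetical names and are not part of any SYCL or sycl-ls API.

```cpp
#include <string>

// Fields of the bracketed prefix, e.g. "[opencl:gpu:1]".
struct DeviceTag {
  std::string backend;  // e.g. "opencl", "level_zero"
  std::string type;     // e.g. "cpu", "gpu", "acc"
  int id;               // device ID, numbered from 0
};

// Parses the leading "[backend:type:id]" of one sycl-ls line into `out`.
// Returns false if the line does not start with a well-formed prefix.
bool parse_device_tag(const std::string& line, DeviceTag& out) {
  if (line.empty() || line[0] != '[') return false;
  const std::string::size_type close = line.find(']');
  if (close == std::string::npos) return false;

  const std::string tag = line.substr(1, close - 1);
  const std::string::size_type first = tag.find(':');
  const std::string::size_type last = tag.rfind(':');
  // Two distinct colons are required to separate the three fields.
  if (first == std::string::npos || first == last) return false;

  const std::string id_text = tag.substr(last + 1);
  // Accept only a short, all-digit ID so std::stoi cannot throw.
  if (id_text.empty() || id_text.size() > 9 ||
      id_text.find_first_not_of("0123456789") != std::string::npos) {
    return false;
  }

  out.backend = tag.substr(0, first);
  out.type = tag.substr(first + 1, last - first - 1);
  out.id = std::stoi(id_text);
  return true;
}
```

For the second line of the listing above, this yields backend "opencl", type "gpu", and id 1. Lines without a bracketed prefix, such as the shell prompt line, are rejected.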
Additional information about device selection is available from the DPC++ Language Guide and API Reference.
OpenMP* Device Query and Selection in the Host Code
OpenMP provides a set of APIs for querying and setting the device on which offloaded code runs. Host code can explicitly select and set a device number. For each offload region, a programmer can also use a device clause to specify the target device that executes that particular region.
int omp_get_num_procs(void): returns the number of processors available to the device.
void omp_set_default_device(int device_num): controls the default target device for offloading code or data.
int omp_get_default_device(void): returns the default target device.
int omp_get_num_devices(void): returns the number of non-host devices available for offloading code or data.
int omp_get_device_num(void): returns the device number of the device on which the calling thread is executing.
int omp_is_initial_device(void): returns true if the current task is executing on the host device; otherwise, it returns false.
int omp_get_initial_device(void): returns a device number that represents the host device.
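The query and set routines listed above can be combined as follows. This is a minimal sketch, assuming a compiler with OpenMP enabled. The helper names offload_device_count and set_default_offload_device are hypothetical, and the _OPENMP guard only keeps the file compilable when OpenMP is switched off, in which case no offload devices are reported.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: number of non-host devices visible to the runtime,
// or 0 when the file is built without OpenMP.
int offload_device_count() {
#ifdef _OPENMP
  return omp_get_num_devices();
#else
  return 0;
#endif
}

// Hypothetical helper: make `device_num` the default target device, but
// only if it names an existing non-host device (0 .. count - 1).
// Returns true if the default device was changed as requested.
bool set_default_offload_device(int device_num) {
#ifdef _OPENMP
  if (device_num < 0 || device_num >= omp_get_num_devices()) {
    return false;  // leave the current default untouched
  }
  omp_set_default_device(device_num);
  return omp_get_default_device() == device_num;
#else
  (void)device_num;  // nothing to select without OpenMP
  return false;
#endif
}
```

Validating the device number before calling omp_set_default_device keeps a bad value from silently redirecting later target regions that have no device clause.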
A programmer can also select a device type with the environment
variable LIBOMPTARGET_DEVICETYPE = [ CPU | GPU ]. If a specific device
type such as CPU or GPU is specified, that device type is expected to be
available in the platform. If no such device is available and the
environment variable OMP_TARGET_OFFLOAD is set to mandatory, the runtime
system reports an error that the requested device type is not available;
otherwise, execution falls back to the initial device.
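One way to observe the fallback described above is to run a trivial target region and ask omp_is_initial_device() where it executed. The following is a minimal sketch, assuming a compiler with OpenMP enabled. The helper names host_device_number and region_ran_on_host are hypothetical, and the _OPENMP guard only keeps the file compilable without OpenMP, in which case everything runs on the host.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: device number that represents the host.
// Assumption: 0 is used as a placeholder when built without OpenMP.
int host_device_number() {
#ifdef _OPENMP
  return omp_get_initial_device();
#else
  return 0;
#endif
}

// Hypothetical helper: run an empty-ish target region on `device_num`
// and report whether it actually executed on the host (initial device).
bool region_ran_on_host(int device_num) {
  int on_host = 1;  // stays 1 when built without OpenMP
#ifdef _OPENMP
  #pragma omp target device(device_num) map(tofrom: on_host)
  {
    // Evaluated wherever the region really runs.
    on_host = omp_is_initial_device();
  }
#else
  (void)device_num;
#endif
  return on_host != 0;
}
```

If region_ran_on_host(0) returns true on a system where a GPU was expected, the region fell back to the host; setting OMP_TARGET_OFFLOAD to mandatory turns that silent fallback into an error.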
Additional information about device selection is available from the
OpenMP 5.2 specification. Details about environment variables are
available from GitHub:
https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md.