Configuring data layouts#
This page describes how to configure a descriptor object for a specific data layout. All the relevant types and enumerations mentioned below belong to the oneapi::mkl::dft namespace and are declared in oneapi/mkl/dft.hpp (the file to be included). The prepended namespace specifier oneapi::mkl::dft is omitted below for conciseness.
The DPC++ interface provides the configuration parameter config_param::FWD_STRIDES
(resp. config_param::BWD_STRIDES
) to define the data layout locating entries (or parts thereof) of relevant data sequences in the forward (resp. backward) domain. In case of batched transforms, i.e., if the configuration value for config_param::NUMBER_OF_TRANSFORMS
is set to an integer \(M\) larger than \(1\), the value set for configuration parameter config_param::FWD_DISTANCE
(resp. config_param::BWD_DISTANCE
) completes the description of the data layout by specifying the distances between successive data sequences in the forward (resp. backward) domain.
This topic leverages the general notations from the introduction, and uses the superscript \(\text{fwd}\) (resp. \(\text{bwd}\)) for data sequences in forward (resp. backward) domain. A placeholder label \(\text{v}\) is also used to capture a possible distinction between the real (if \(\text{v}\) is \(\text{r}\)) and imaginary (if \(\text{v}\) is \(\text{i}\)) parts of a complex data entry; naturally, that placeholder label \(\text{v}\) is relevant only for data layouts that distinguish real and imaginary parts of complex data entries.
A non-redundant entry \(\left(\cdot\right)^{m}_{k_{1}, k_{2}, \ldots, k_{d}}\) (or its real or imaginary part, if relevant) is stored at index \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) of the appropriate data container (sycl::buffer
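object or device-accessible USM allocation) provided to a compute function, the base data type of which is (possibly implicitly re-interpreted as) documented in the table below. That index value is defined as
\[J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right) = s^{\text{xwd}}_{0} + k_{1}\ s^{\text{xwd}}_{1} + \dots + k_{d-1}\ s^{\text{xwd}}_{d-1} + p^{\text{xwd}}_{d}\left(k_{d};\text{v}\right)\ s^{\text{xwd}}_{d} + m\ l^{\text{xwd}}\]
wherein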
\(s^{\text{xwd}}_{j}\), \(\forall j \in \lbrace 0, \ldots, d\rbrace\) represents the offset and generalized strides defining the locations of relevant values within each \(d\)-dimensional data sequence in the forward (resp. backward) domain if \(\text{x} = \text{f}\) (resp. if \(\text{x} = \text{b}\)), counted in number of elements of the relevant implicitly-assumed elementary data type;
\(l^{\text{xwd}}\) represents the distance between successive \(d\)-dimensional data sequences in the forward (resp. backward) domain if \(\text{x} = \text{f}\) (resp. if \(\text{x} = \text{b}\)), counted in number of elements of the relevant implicitly-assumed elementary data type;
the relation \(p^{\text{xwd}}_{d}\left(k_{d};\text{v}\right)\) simplifies into the identity \(p^{\text{xwd}}_{d}\left(k_{d};\text{v}\right) = k_{d}\) in all recommended use cases or if \(\text{x} = \text{f}\), i.e., \(\text{v}\) is either irrelevant or unused in such cases. However, for some one-dimensional real descriptors using deprecated configurations, the real and imaginary parts of entry \(\left(\cdot\right)^{m}_{k_{1}}\) in backward domain, are to be considered separately from one another and the corresponding indices are denoted by \(J^{\text{bwd}}\left(k_{1}, m;\text{r}\right) = s^{\text{bwd}}_{0} + p^{\text{bwd}}_{1}\left(k_{1};\text{r}\right)\ s^{\text{bwd}}_{1} + m\ l^{\text{bwd}}\) and \(J^{\text{bwd}}\left(k_{1}, m;\text{i}\right) = s^{\text{bwd}}_{0} + p^{\text{bwd}}_{1}\left(k_{1};\text{i}\right)\ s^{\text{bwd}}_{1} + m\ l^{\text{bwd}}\), respectively.
In this page, it is assumed that only non-redundant data sequence entries are of interest, i.e., that \(0\leq m < M\), \(0\leq k_{j} < n_{j}, \forall j \in \lbrace 1, 2, \ldots, d-1\rbrace\) and that \(0 \leq k_{d} \leq \lfloor \frac{n_{d}}{2}\rfloor\) (resp. \(0 \leq k_{d} < n_{d}\)) for entries that do (resp. do not) belong to the backward domain of a real DFT.
Note that all elements accessed as a value stored at index \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) of a given user-provided data container must belong to the same block allocation.
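The index relation described above can be sketched in a few lines of standard C++. This is a minimal illustration, not oneMKL code: the helper name `layout_index` and its argument layout are assumptions made for this example, and the sketch assumes the identity relation \(p^{\text{xwd}}_{d}\left(k_{d};\text{v}\right) = k_{d}\) that applies to all recommended use cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of oneMKL): evaluates
//   J = s_0 + k_1 * s_1 + ... + k_d * s_d + m * l
// assuming the identity relation p_d(k_d; v) = k_d.
std::int64_t layout_index(const std::vector<std::int64_t>& strides,  // {s_0, s_1, ..., s_d}
                          std::int64_t distance,                     // l
                          const std::vector<std::int64_t>& k,        // {k_1, ..., k_d}
                          std::int64_t m) {                          // batch index
    assert(strides.size() == k.size() + 1);
    std::int64_t index = strides[0] + m * distance;
    for (std::size_t j = 0; j < k.size(); ++j) {
        index += k[j] * strides[j + 1];  // k[j] holds k_{j+1}
    }
    return index;
}
```

For instance, with strides \(\lbrace 0, 4, 1\rbrace\), a distance of \(12\), the multi-index \((2, 3)\) and \(m = 1\), the helper evaluates \(0 + 2 \cdot 4 + 3 \cdot 1 + 12\).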
Implicitly-assumed elementary data type#
When reading or writing an element at index \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) of any user-provided data container used at compute time, a descriptor
object may first re-interpret the base data type of that data container into an implicitly-assumed elementary data type. That implicitly-assumed data type depends on the object type; that is, on the specialization values used for the template parameters when instantiating the descriptor
class, and on other configuration value(s). The table below lists the implicitly-assumed data type in either domain (last 2 columns) based on the object type and its configuration value(s).
| Type of descriptor and relevant configuration values | Implicitly-assumed elementary data type in forward domain | Implicitly-assumed elementary data type in backward domain |
|---|---|---|
| Complex descriptor with config_value::COMPLEX_COMPLEX set for config_param::COMPLEX_STORAGE (default behavior) | std::complex<float> or std::complex<double> | std::complex<float> or std::complex<double> |
| Complex descriptor with config_value::REAL_REAL set for config_param::COMPLEX_STORAGE (not implemented via the DPC++ interface of oneMKL) | float or double | float or double |
| Real descriptor with config_value::COMPLEX_COMPLEX set for config_param::CONJUGATE_EVEN_STORAGE (default behavior) | float or double | std::complex<float> or std::complex<double> |
| Real descriptor with config_value::COMPLEX_REAL set for config_param::CONJUGATE_EVEN_STORAGE (supported only for 1D DFTs on CPU, deprecated) | float or double | float or double |
Descriptors that implicitly assume an elementary data type of float
or double
(resp. std::complex<float>
or std::complex<double>
) in a domain are referred to as “descriptors expecting real (resp. complex) data” in that domain.
Configuring strides in forward and backward domains#
The values \(s^{\text{xwd}}_0, s^{\text{xwd}}_1, \dots, s^{\text{xwd}}_d\) defining \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) are to be communicated as elements (in that order) of a std::vector<std::int64_t>
object of size \((d+1)\), passed as the configuration value for config_param::FWD_STRIDES
if \(\text{x} = \text{f}\) (resp. config_param::BWD_STRIDES
if \(\text{x} = \text{b}\)) using the relevant configuration-setting member function. The element \(s^{\text{xwd}}_0\) represents an absolute offset (or “displacement”) in the data sets while the subsequent elements \(s^{\text{xwd}}_j\ (j > 0)\) are generalized strides to be considered along dimensions \(j \in \lbrace 1, \ldots, d\rbrace\).
When created, the descriptors are default-configured for unbatched, in-place transforms using a unit stride along the last dimension, no offset and the default configuration settings documented in the above table. For real descriptors, minimal padding is used in forward domain, aligning with the data layout requirements for in-place transforms.
In other words, the default stride values are \(s^{\text{fwd}}_0 = s^{\text{bwd}}_0 = 0\), \(s^{\text{fwd}}_d = s^{\text{bwd}}_d = 1\) and, for \(d\)-dimensional transforms with \(d > 1\),
\(s^{\text{fwd}}_{d-1} = s^{\text{bwd}}_{d-1} = n_{d}\) for complex descriptors;
\(s^{\text{bwd}}_{d-1} = \lfloor \frac{n_{d}}{2} \rfloor + 1\), and \(s^{\text{fwd}}_{d-1} = 2 s^{\text{bwd}}_{d-1}\) for real descriptors;
if \(d > 2\), \(s^{\text{xwd}}_k = n_{k+1} s^{\text{xwd}}_{k+1}\) for \(k \in \lbrace 1, \ldots, d - 2\rbrace\) (for \(\text{x} = \text{f}\) and \(\text{x} = \text{b}\)).
The usage of these default strides for unbatched, in-place transforms is illustrated in the usage examples.
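As an illustration of the default values listed above, the following sketch computes them from the transform lengths. It is a standalone C++ example rather than oneMKL code; the names `DefaultStrides` and `default_strides` are hypothetical, and the vector `n` is assumed to hold \(n_{1}, \ldots, n_{d}\) in that order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two stride vectors {s_0, ..., s_d}.
struct DefaultStrides {
    std::vector<std::int64_t> fwd;
    std::vector<std::int64_t> bwd;
};

// Hypothetical helper: default strides as described in the text above.
// n holds the lengths {n_1, ..., n_d}.
DefaultStrides default_strides(const std::vector<std::int64_t>& n, bool real_descriptor) {
    const std::size_t d = n.size();
    assert(d >= 1);
    std::vector<std::int64_t> fwd(d + 1, 0), bwd(d + 1, 0);  // s_0 = 0 (no offset)
    fwd[d] = bwd[d] = 1;                                      // unit stride along dimension d
    if (d > 1) {
        const std::int64_t n_d = n[d - 1];
        bwd[d - 1] = real_descriptor ? n_d / 2 + 1 : n_d;
        fwd[d - 1] = real_descriptor ? 2 * bwd[d - 1] : n_d;  // padding for real descriptors
        // s_k = n_{k+1} * s_{k+1} for k = d - 2, ..., 1 (n[k] holds n_{k+1}).
        for (std::size_t k = d - 2; k >= 1; --k) {
            fwd[k] = n[k] * fwd[k + 1];
            bwd[k] = n[k] * bwd[k + 1];
        }
    }
    return {fwd, bwd};
}
```

For lengths \((4, 5, 6)\), this yields \(\lbrace 0, 30, 6, 1\rbrace\) in both domains for a complex descriptor, and \(\lbrace 0, 40, 8, 1\rbrace\) (forward) and \(\lbrace 0, 20, 4, 1\rbrace\) (backward) for a real descriptor.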
Configuring batched transforms#
The value \(l^{\text{xwd}}\) completing the definition of \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) is to be set as a std::int64_t configuration value for config_param::FWD_DISTANCE if \(\text{x} = \text{f}\) (resp. config_param::BWD_DISTANCE if \(\text{x} = \text{b}\)) using the relevant configuration-setting member function. This value is irrelevant for unbatched transforms, i.e., for descriptors set to handle a number of transforms \(M\) equal to \(1\) (default behavior).
In case of batched transforms, the desired number \(M\) of DFTs must be set explicitly as a std::int64_t configuration value for config_param::NUMBER_OF_TRANSFORMS using the relevant configuration-setting member function. In that case, config_param::FWD_DISTANCE and config_param::BWD_DISTANCE must also be set explicitly since their default configuration values of \(0\) would break the data layout requirements for any \(M > 1\).
The configuration of batched transforms is illustrated in the usage examples.
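To make the choice of distances more concrete, here is a small sketch of one possible choice for batched one-dimensional transforms. It assumes unit strides, zero offsets, and data sequences stored back to back; the names `Distances` and `packed_batch_distances_1d` are hypothetical and not part of oneMKL. The returned values are the kind of std::int64_t values one would then set for config_param::FWD_DISTANCE and config_param::BWD_DISTANCE.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair of distances, counted in elements of the
// implicitly-assumed elementary data type of each domain.
struct Distances {
    std::int64_t fwd;
    std::int64_t bwd;
};

// Hypothetical helper: smallest distances for a batch of 1D transforms of
// length n stored back to back with unit strides and zero offsets.
Distances packed_batch_distances_1d(std::int64_t n, bool real_descriptor, bool in_place) {
    assert(n > 0);
    if (!real_descriptor) {
        return {n, n};  // n complex entries per sequence in both domains
    }
    const std::int64_t bwd = n / 2 + 1;               // non-redundant complex entries
    const std::int64_t fwd = in_place ? 2 * bwd : n;  // in-place needs l_fwd = 2 * l_bwd
    return {fwd, bwd};
}
```

For a real descriptor with \(n_{1} = 8\), the sketch returns \((10, 5)\) for in-place transforms and \((8, 5)\) for out-of-place transforms.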
Deprecated layouts in backward domain of one-dimensional real transforms#
All complex descriptors and all real descriptors expecting complex data in backward domain use the straightforward identity relation \(p^{\text{bwd}}_{d}\left(k_{d}; \text{v}\right) = k_{d}\), i.e., \(\text{v}\) is irrelevant in that case. Every default behavior and recommended usage falls into this category; the reader is referred to the usage examples for more details and illustrations about the resulting layouts and default (or otherwise recommended) strides and distances.
For real descriptors expecting real data in backward domain (deprecated usage, supported for 1D real DFTs on CPU only), the relation \(p^{\text{bwd}}_{d}\left(k_{d}; \text{v}\right)\) takes a more intricate form. In backward domain, such descriptors expect real data in the sense that the real and imaginary parts of the data sequence entries are not necessarily stored contiguously in memory (or not even stored at all). The specific form of \(p^{\text{bwd}}_{d}\left(k_{d}; \text{v}\right)\) depends on the value set for config_param::PACKED_FORMAT
. For real descriptors expecting real data in backward domain, three different values (documented below) are possible for that configuration parameter: config_value::CCS_FORMAT
, config_value::PACK_FORMAT
and config_value::PERM_FORMAT
. Given the limited support for 1D transforms on CPUs, \(d = 1\) is used in the rest of this section to simplify the presentation. Illustrations are also given for unbatched cases; that is, \(M = 1\), so the then-superfluous batch index \(m = 0\) is omitted in this section’s illustrative tables, too.
config_value::CCS_FORMAT value set for config_param::PACKED_FORMAT#
If the configuration value config_value::CCS_FORMAT
is used, then
\(p^{\text{bwd}}_{1}\left(k_{1}; \text{r}\right) = 2k_{1}\);
\(p^{\text{bwd}}_{1}\left(k_{1}; \text{i}\right) = 2k_{1} + 1\).
Given that all non-redundant entries in backward domain are captured by \(0\leq k_{1} \leq \lfloor \frac{n_{1}}{2}\rfloor\), the range of relevant values for \(p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right)\) is \(0 \leq p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right) < n_{1} + 2\) (resp. \(0 \leq p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right) < n_{1} + 1\)) if \(n_{1}\) is even (resp. odd), in this case.
This format is illustrated in the table below for \(M = 1\), \(s^{\text{bwd}}_{0} = 0\) and \(s^{\text{bwd}}_{1} = 1\).
| \(J^{\text{bwd}}\left(k_{1}; \text{v}\right)\) | \(0\) | \(1\) | \(2\) | \(3\) | \(\ldots\) | \(2\lfloor n_{1}/2\rfloor\) | \(2\lfloor n_{1}/2\rfloor + 1\) |
|---|---|---|---|---|---|---|---|
| \(\left(k_{1}; \text{v}\right)\) | \(\left(0; \text{r}\right)\) | \(\left(0; \text{i}\right)\) | \(\left(1; \text{r}\right)\) | \(\left(1; \text{i}\right)\) | \(\ldots\) | \(\left(\lfloor n_{1}/2\rfloor; \text{r}\right)\) | \(\left(\lfloor n_{1}/2\rfloor; \text{i}\right)\) |
| Stored value | \(\Re\left(\left(\cdot\right)_{0}\right)\) | \(0\) | \(\Re\left(\left(\cdot\right)_{1}\right)\) | \(\Im\left(\left(\cdot\right)_{1}\right)\) | \(\ldots\) | \(\Re\left(\left(\cdot\right)_{\lfloor n_{1}/2\rfloor}\right)\) | \(\Im\left(\left(\cdot\right)_{\lfloor n_{1}/2\rfloor}\right)\) |
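The CCS relation above is simple enough to express directly. The following standalone sketch uses hypothetical names (`Part`, `p_ccs`, `ccs_length`) that are not part of oneMKL; it also computes the number of real elements spanned by one data sequence, as implied by the range stated above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag for the placeholder label v.
enum class Part { real, imag };

// Hypothetical helper: p_1(k_1; v) for the CCS format.
std::int64_t p_ccs(std::int64_t k, Part v) {
    assert(k >= 0);
    return 2 * k + (v == Part::imag ? 1 : 0);
}

// Hypothetical helper: number of real elements spanned by one sequence of
// length n, i.e., n + 2 if n is even and n + 1 if n is odd.
std::int64_t ccs_length(std::int64_t n) {
    assert(n > 0);
    return 2 * (n / 2) + 2;
}
```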
config_value::PACK_FORMAT value set for config_param::PACKED_FORMAT#
If the configuration value config_value::PACK_FORMAT
is used, then
\(p^{\text{bwd}}_{1}\left(0; \text{r}\right) = 0\);
\(p^{\text{bwd}}_{1}\left(0; \text{i}\right)\) does not exist (\(0\)-valued imaginary parts are not stored explicitly);
\(p^{\text{bwd}}_{1}\left(k_{1}; \text{r}\right) = 2k_{1} - 1\) for any \(0 < k_{1} \leq \lfloor n_{1}/2 \rfloor\);
\(p^{\text{bwd}}_{1}\left(k_{1}; \text{i}\right) = 2k_{1}\) for any \(0 < k_{1} < \lfloor n_{1}/2 \rfloor\). \(p^{\text{bwd}}_{1}\left(k_{1}; \text{i}\right) = 2k_{1}\) holds for \(k_{1} = \lfloor n_{1}/2 \rfloor\) if \(n_{1}\) is odd; \(p^{\text{bwd}}_{1}\left(\lfloor n_{1}/2 \rfloor; \text{i}\right)\) does not exist if \(n_{1}\) is even (\(0\)-valued imaginary parts are not stored explicitly).
Given that all non-redundant entries in backward domain are captured by \(0\leq k_{1} \leq \lfloor \frac{n_{1}}{2}\rfloor\), the range of relevant values for \(p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right)\) is \(0 \leq p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right) < n_{1}\) in this case (regardless of whether \(n_{1}\) is even or odd).
This format is illustrated in the tables below for \(M = 1\), \(s^{\text{bwd}}_{0} = 0\) and \(s^{\text{bwd}}_{1} = 1\).
If \(n_{1}\) is even:
| \(J^{\text{bwd}}\left(k_{1}; \text{v}\right)\) | \(0\) | \(1\) | \(2\) | \(3\) | \(\ldots\) | \(n_{1} - 2\) | \(n_{1} - 1\) |
|---|---|---|---|---|---|---|---|
| \(\left(k_{1}; \text{v}\right)\) | \(\left(0; \text{r}\right)\) | \(\left(1; \text{r}\right)\) | \(\left(1; \text{i}\right)\) | \(\left(2; \text{r}\right)\) | \(\ldots\) | \(\left(n_{1}/2 - 1; \text{i}\right)\) | \(\left(n_{1}/2; \text{r}\right)\) |
| Stored value | \(\Re\left(\left(\cdot\right)_{0}\right)\) | \(\Re\left(\left(\cdot\right)_{1}\right)\) | \(\Im\left(\left(\cdot\right)_{1}\right)\) | \(\Re\left(\left(\cdot\right)_{2}\right)\) | \(\ldots\) | \(\Im\left(\left(\cdot\right)_{n_{1}/2 - 1}\right)\) | \(\Re\left(\left(\cdot\right)_{n_{1}/2}\right)\) |

If \(n_{1}\) is odd:
| \(J^{\text{bwd}}\left(k_{1}; \text{v}\right)\) | \(0\) | \(1\) | \(2\) | \(3\) | \(\ldots\) | \(n_{1} - 2\) | \(n_{1} - 1\) |
|---|---|---|---|---|---|---|---|
| \(\left(k_{1}; \text{v}\right)\) | \(\left(0; \text{r}\right)\) | \(\left(1; \text{r}\right)\) | \(\left(1; \text{i}\right)\) | \(\left(2; \text{r}\right)\) | \(\ldots\) | \(\left(\lfloor n_{1}/2\rfloor; \text{r}\right)\) | \(\left(\lfloor n_{1}/2\rfloor; \text{i}\right)\) |
| Stored value | \(\Re\left(\left(\cdot\right)_{0}\right)\) | \(\Re\left(\left(\cdot\right)_{1}\right)\) | \(\Im\left(\left(\cdot\right)_{1}\right)\) | \(\Re\left(\left(\cdot\right)_{2}\right)\) | \(\ldots\) | \(\Re\left(\left(\cdot\right)_{\lfloor n_{1}/2\rfloor}\right)\) | \(\Im\left(\left(\cdot\right)_{\lfloor n_{1}/2\rfloor}\right)\) |
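The PACK relation can be sketched as follows. This is a standalone illustration with hypothetical names (`Part`, `kNotStored`, `p_pack`), none of which belong to oneMKL; the sentinel value `-1` is an arbitrary choice made here to flag the \(0\)-valued imaginary parts that are not stored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag for the placeholder label v.
enum class Part { real, imag };

// Hypothetical sentinel: the requested part is not stored explicitly.
constexpr std::int64_t kNotStored = -1;

// Hypothetical helper: p_1(k_1; v) for the PACK format, with 0 <= k <= n / 2.
std::int64_t p_pack(std::int64_t n, std::int64_t k, Part v) {
    assert(n > 0 && k >= 0 && k <= n / 2);
    if (v == Part::real) {
        return k == 0 ? 0 : 2 * k - 1;
    }
    // Imaginary parts of entry 0, and of entry n / 2 for even n, are not stored.
    if (k == 0 || (n % 2 == 0 && k == n / 2)) {
        return kNotStored;
    }
    return 2 * k;
}
```

For every stored part, the returned value lies in the range \(0 \leq p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right) < n_{1}\) stated above.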
config_value::PERM_FORMAT value set for config_param::PACKED_FORMAT#
If the configuration value config_value::PERM_FORMAT
is used, the \(p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right)\) relation differs according to whether \(n_{1}\) is even or odd.
If \(n_{1}\) is even, then
\(p^{\text{bwd}}_{1}\left(0; \text{r}\right) = 0\) and \(p^{\text{bwd}}_{1}\left(0; \text{i}\right)\) does not exist (\(0\)-valued imaginary parts are not stored explicitly);
\(p^{\text{bwd}}_{1}\left(n_{1}/2; \text{r}\right) = 1\) and \(p^{\text{bwd}}_{1}\left(n_{1}/2; \text{i}\right)\) does not exist (\(0\)-valued imaginary parts are not stored explicitly);
\(p^{\text{bwd}}_{1}\left(k_{1}; \text{r}\right) = 2k_{1}\) for any \(0 < k_{1} < n_{1}/2\);
\(p^{\text{bwd}}_{1}\left(k_{1}; \text{i}\right) = 2k_{1} + 1\) for any \(0 < k_{1} < n_{1}/2\).
If \(n_{1}\) is odd, this format is equivalent to config_value::PACK_FORMAT, i.e.,
\(p^{\text{bwd}}_{1}\left(0; \text{r}\right) = 0\) and \(p^{\text{bwd}}_{1}\left(0; \text{i}\right)\) does not exist (\(0\)-valued imaginary parts are not stored explicitly);
\(p^{\text{bwd}}_{1}\left(k_{1}; \text{r}\right) = 2k_{1} - 1\) for any \(0 < k_{1} \leq \lfloor n_{1}/2\rfloor\);
\(p^{\text{bwd}}_{1}\left(k_{1}; \text{i}\right) = 2k_{1}\) for any \(0 < k_{1} \leq \lfloor n_{1}/2\rfloor\).
Given that all non-redundant entries in backward domain are captured by \(0\leq k_{1} \leq \lfloor \frac{n_{1}}{2}\rfloor\), the range of relevant values for \(p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right)\) is \(0 \leq p^{\text{bwd}}_{1}\left(k_{1}; \text{v}\right) < n_{1}\) in this case (regardless of whether \(n_{1}\) is even or odd).
This format is illustrated in the tables below for \(M = 1\), \(s^{\text{bwd}}_{0} = 0\) and \(s^{\text{bwd}}_{1} = 1\).
If \(n_{1}\) is even:
| \(J^{\text{bwd}}\left(k_{1}; \text{v}\right)\) | \(0\) | \(1\) | \(2\) | \(3\) | \(\ldots\) | \(n_{1} - 2\) | \(n_{1} - 1\) |
|---|---|---|---|---|---|---|---|
| \(\left(k_{1}; \text{v}\right)\) | \(\left(0; \text{r}\right)\) | \(\left(n_{1}/2; \text{r}\right)\) | \(\left(1; \text{r}\right)\) | \(\left(1; \text{i}\right)\) | \(\ldots\) | \(\left(n_{1}/2 - 1; \text{r}\right)\) | \(\left(n_{1}/2 - 1; \text{i}\right)\) |
| Stored value | \(\Re\left(\left(\cdot\right)_{0}\right)\) | \(\Re\left(\left(\cdot\right)_{n_{1}/2}\right)\) | \(\Re\left(\left(\cdot\right)_{1}\right)\) | \(\Im\left(\left(\cdot\right)_{1}\right)\) | \(\ldots\) | \(\Re\left(\left(\cdot\right)_{n_{1}/2 - 1}\right)\) | \(\Im\left(\left(\cdot\right)_{n_{1}/2 - 1}\right)\) |

If \(n_{1}\) is odd:
| \(J^{\text{bwd}}\left(k_{1}; \text{v}\right)\) | \(0\) | \(1\) | \(2\) | \(3\) | \(\ldots\) | \(n_{1} - 2\) | \(n_{1} - 1\) |
|---|---|---|---|---|---|---|---|
| \(\left(k_{1}; \text{v}\right)\) | \(\left(0; \text{r}\right)\) | \(\left(1; \text{r}\right)\) | \(\left(1; \text{i}\right)\) | \(\left(2; \text{r}\right)\) | \(\ldots\) | \(\left(\lfloor n_{1}/2\rfloor; \text{r}\right)\) | \(\left(\lfloor n_{1}/2\rfloor; \text{i}\right)\) |
| Stored value | \(\Re\left(\left(\cdot\right)_{0}\right)\) | \(\Re\left(\left(\cdot\right)_{1}\right)\) | \(\Im\left(\left(\cdot\right)_{1}\right)\) | \(\Re\left(\left(\cdot\right)_{2}\right)\) | \(\ldots\) | \(\Re\left(\left(\cdot\right)_{\lfloor n_{1}/2\rfloor}\right)\) | \(\Im\left(\left(\cdot\right)_{\lfloor n_{1}/2\rfloor}\right)\) |
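The PERM relation, including its dependence on the parity of \(n_{1}\), can be sketched as follows. The names `Part`, `kNotStored`, and `p_perm` are hypothetical and not part of oneMKL; the sentinel `-1` is an arbitrary choice made here for parts that are not stored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag for the placeholder label v.
enum class Part { real, imag };

// Hypothetical sentinel: the requested part is not stored explicitly.
constexpr std::int64_t kNotStored = -1;

// Hypothetical helper: p_1(k_1; v) for the PERM format, with 0 <= k <= n / 2.
std::int64_t p_perm(std::int64_t n, std::int64_t k, Part v) {
    assert(n > 0 && k >= 0 && k <= n / 2);
    if (k == 0) {
        return v == Part::real ? 0 : kNotStored;
    }
    if (n % 2 == 0) {
        if (k == n / 2) {
            // The real part of entry n / 2 is moved to position 1.
            return v == Part::real ? 1 : kNotStored;
        }
        return v == Part::real ? 2 * k : 2 * k + 1;
    }
    // Odd n: same positions as the PACK format.
    return v == Part::real ? 2 * k - 1 : 2 * k;
}
```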
Note
Real descriptors expecting complex data in backward domain do not support any other value than config_value::CCE_FORMAT
for config_param::PACKED_FORMAT
. Therefore, config_value::CCE_FORMAT
is used as the default value for config_param::PACKED_FORMAT
(consistently with config_value::COMPLEX_COMPLEX
being the default value set for config_param::CONJUGATE_EVEN_STORAGE
).
The value for config_param::PACKED_FORMAT must be set explicitly (to either config_value::CCS_FORMAT, config_value::PACK_FORMAT or config_value::PERM_FORMAT) for real descriptors expecting real data in backward domain, as it further specifies the descriptor’s behavior in that case (see explanations above). Real descriptors expecting real data in backward domain are supported for 1D real DFTs on CPU only. Their support is deprecated.
Data layout requirements#
In general, the distances and strides must be set so that
values of \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) are non-negative for all \((d+1)\)-tuples \((k_{1}, k_{2}, \dots, p^{\text{xwd}}_{d}\left(k_{d};\text{v}\right), m)\) within relevant ranges;
every value of \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) corresponds to a unique value relevant to the data sequences under consideration. In other words, there must not be one value of \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) corresponding to two different \((d+1)\)-tuples \((k_{1}, k_{2}, \dots, p^{\text{xwd}}_{d}\left(k_{d};\text{v}\right), m)\) that would both be within relevant ranges.
Additionally, for in-place transforms (configuration value config_value::INPLACE
set for config_param::PLACEMENT
), the following “consistency requirements” apply:
descriptors expecting the same data type in either domain (e.g., complex descriptors) must use the same offset, stride(s), and distance values in forward and backward domains;
for real descriptors expecting complex data in backward domain (default behavior for real descriptors), the memory address(es) of leading entry(ies) along the last dimension must be identical in forward and backward domains. Specifically, that requirement translates into the conditions \(s^{\text{fwd}}_{j} = 2 s^{\text{bwd}}_{j}, \ \forall j \in \lbrace 0, \ldots, d - 1\rbrace\) as well as, if \(M > 1\), \(l^{\text{fwd}} = 2 l^{\text{bwd}}\). Note that this requirement leads to some data padding to be used in forward domain if unit strides are used along dimension \(d\) in forward and backward domains (recommended usage, as set by default).
Note
Support for negative strides with a sufficiently large (positive) offset index guaranteeing non-negativeness of all \(J^{\text{xwd}}\left(k_{1}, \ldots, k_{d}, m; \text{v}\right)\) is not enabled yet (unimplemented).
One-dimensional real descriptors expecting real data in backward domain and using configuration value config_value::CCS_FORMAT for config_param::PACKED_FORMAT also require \(l^{\text{bwd}} \geq n_{1} + 2\).
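The general requirements above (non-negative, non-colliding indices) and the in-place consistency condition for default real descriptors can be checked by brute force on small problems. The sketch below is a standalone illustration, not a oneMKL facility: the function names are hypothetical, the identity relation \(p^{\text{xwd}}_{d}\left(k_{d};\text{v}\right) = k_{d}\) is assumed, and `extents` is assumed to hold the number of relevant values of each \(k_{j}\).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical check: every index J is non-negative and no two distinct
// (k_1, ..., k_d, m) tuples map to the same index.
bool layout_is_valid(const std::vector<std::int64_t>& strides,  // {s_0, ..., s_d}
                     std::int64_t distance,                     // l
                     const std::vector<std::int64_t>& extents,  // relevant range of each k_j
                     std::int64_t batch) {                      // M
    assert(strides.size() == extents.size() + 1);
    std::int64_t total = 1;
    for (std::int64_t e : extents) {
        assert(e > 0);
        total *= e;
    }
    std::set<std::int64_t> seen;
    for (std::int64_t m = 0; m < batch; ++m) {
        for (std::int64_t t = 0; t < total; ++t) {
            // Decode the linear counter t into the multi-index (k_1, ..., k_d).
            std::int64_t rest = t;
            std::int64_t index = strides[0] + m * distance;
            for (std::size_t j = extents.size(); j-- > 0;) {
                index += (rest % extents[j]) * strides[j + 1];
                rest /= extents[j];
            }
            if (index < 0 || !seen.insert(index).second) {
                return false;
            }
        }
    }
    return true;
}

// Hypothetical check of the in-place condition for real descriptors expecting
// complex data in backward domain: s_fwd_j = 2 * s_bwd_j for j = 0, ..., d - 1,
// and l_fwd = 2 * l_bwd if M > 1.
bool real_inplace_consistent(const std::vector<std::int64_t>& fwd_strides,
                             const std::vector<std::int64_t>& bwd_strides,
                             std::int64_t fwd_distance, std::int64_t bwd_distance,
                             std::int64_t batch) {
    assert(fwd_strides.size() == bwd_strides.size() && !fwd_strides.empty());
    for (std::size_t j = 0; j + 1 < fwd_strides.size(); ++j) {
        if (fwd_strides[j] != 2 * bwd_strides[j]) {
            return false;
        }
    }
    return batch <= 1 || fwd_distance == 2 * bwd_distance;
}
```

Note that a batch of two sequences with a distance of \(0\) fails the first check, which is why the default distances cannot be kept for \(M > 1\).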
Configuring strides for input and output data [deprecated]#
Instead of specifying strides by domain, one may choose to specify the strides for input and output data sequences. Let \(s^{\text{x}}_{j}, \ j \in \lbrace 0, 1, \ldots, d\rbrace\) be the stride values for input (resp. output) data sequences if \(\text{x} = \text{i}\) (resp. \(\text{x} = \text{o}\)). Such \(s^{\text{x}}_0, s^{\text{x}}_1, \dots, s^{\text{x}}_d\) values may be communicated as elements (in that order) of a std::vector<std::int64_t>
object of size \((d+1)\), passed as the configuration value for config_param::INPUT_STRIDES
if \(\text{x} = \text{i}\) (resp. config_param::OUTPUT_STRIDES
if \(\text{x} = \text{o}\)) using the relevant configuration-setting member function.
The values of \(s^{\text{i}}_{j}\) and \(s^{\text{o}}_{j}\) are to be used and considered by oneMKL if and only if \(s^{\text{fwd}}_{j} = s^{\text{bwd}}_{j} = 0, \forall j \in \lbrace 0, 1, \ldots, d\rbrace\). This will happen automatically if config_param::INPUT_STRIDES
and config_param::OUTPUT_STRIDES
are set and config_param::FWD_STRIDES
and config_param::BWD_STRIDES
are not (see the note below). In such a case, descriptor
objects must consider the data layouts corresponding to the two compute directions separately. As detailed above, relevant data sequence entries are accessed as elements of data containers (sycl::buffer
objects or device-accessible USM allocations) provided to the compute function, the base data type of which is (possibly implicitly re-interpreted as) documented in the above table. If using input and output strides, the index to be used when accessing a data sequence entry \(\left(\cdot\right)^{m}_{k_{1}, k_{2}, \ldots, k_{d}}\) (or part thereof) in forward domain is
\[s^{\text{x}}_{0} + k_{1}\ s^{\text{x}}_{1} + k_{2}\ s^{\text{x}}_{2} + \dots + k_{d}\ s^{\text{x}}_{d} + m\ l^{\text{fwd}}\]
where \(\text{x} = \text{i}\) (resp. \(\text{x} = \text{o}\)) for forward (resp. backward) DFTs. Similarly, the index to be used when accessing a data sequence entry (or part thereof) in backward domain is
\[s^{\text{x}}_{0} + k_{1}\ s^{\text{x}}_{1} + \dots + k_{d-1}\ s^{\text{x}}_{d-1} + p^{\text{bwd}}_{d}\left(k_{d};\text{v}\right)\ s^{\text{x}}_{d} + m\ l^{\text{bwd}}\]
where \(\text{x} = \text{o}\) (resp. \(\text{x} = \text{i}\)) for forward (resp. backward) DFTs.
As a consequence, configuring descriptor
objects using these deprecated configuration parameters makes their configuration direction-dependent when different stride values are used in forward and backward domains. Since the intended compute direction is unknown to the descriptor
object when committing it, every direction that results in a legitimate data layout in forward and backward domains must be supported by successfully committed descriptor
objects.
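The direction dependence described above amounts to a simple swap, sketched below in standalone C++. The names `Direction`, `DomainStrides`, and `strides_by_domain` are hypothetical and only model the rule; they are not oneMKL types or functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical tag for the compute direction.
enum class Direction { forward, backward };

// Hypothetical result: the strides that end up describing each domain.
struct DomainStrides {
    std::vector<std::int64_t> fwd_domain;
    std::vector<std::int64_t> bwd_domain;
};

// Hypothetical helper: a forward DFT reads forward-domain data (input strides)
// and writes backward-domain data (output strides); a backward DFT swaps roles.
DomainStrides strides_by_domain(const std::vector<std::int64_t>& input_strides,
                                const std::vector<std::int64_t>& output_strides,
                                Direction direction) {
    assert(input_strides.size() == output_strides.size());
    if (direction == Direction::forward) {
        return {input_strides, output_strides};
    }
    return {output_strides, input_strides};
}
```

With identical input and output strides the result does not depend on the direction, which is the only case where such a descriptor can serve both directions without being re-configured.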
Note
For descriptor
objects with strides configured via these deprecated configuration parameters, the data layout requirements may be satisfied for only one of the two compute directions, i.e., for only one of the forward or backward DFT. The behavior of oneMKL is undefined if using that object for the compute direction that does not align with the data layout requirements.
Setting either of config_param::INPUT_STRIDES
or config_param::OUTPUT_STRIDES
triggers any (default or previously-set) values for config_param::FWD_STRIDES
and config_param::BWD_STRIDES
to reset to \(0\)-valued vectors, and vice versa. This implicit behavior prevents mix-and-matching either of config_param::INPUT_STRIDES
or config_param::OUTPUT_STRIDES
with either of config_param::FWD_STRIDES
or config_param::BWD_STRIDES
, which is not supported by oneMKL. If such a configuration is attempted, an exception is thrown at commit time due to invalid configuration, as the stride values that were implicitly reset invalidate the data layout requirements for any non-trivial DFT.
If specifying the data layout strides using these deprecated configuration parameters and if the strides differ in forward and backward domain, the descriptor must be re-configured and re-committed for computing the DFT in the reverse direction as shown below.
// ...
// Forward DFT: input data lies in forward domain, output data in backward domain.
desc.set_value(config_param::INPUT_STRIDES, fwd_domain_strides);
desc.set_value(config_param::OUTPUT_STRIDES, bwd_domain_strides);
desc.commit(queue);
compute_forward(desc, ...);
// ...
// Backward DFT: the roles are swapped, so re-configure and re-commit first.
desc.set_value(config_param::INPUT_STRIDES, bwd_domain_strides);
desc.set_value(config_param::OUTPUT_STRIDES, fwd_domain_strides);
desc.commit(queue);
compute_backward(desc, ...);
The config_param::INPUT_STRIDES and config_param::OUTPUT_STRIDES parameters have been deprecated since oneMKL 2024.1. A compile-time deprecation warning advising users to update their usage to config_param::FWD_STRIDES and config_param::BWD_STRIDES is emitted for any application using these configuration parameters.
Supported layouts on GPU devices#
On GPU devices, oneMKL requires
the rank \(d\) of the transform to be no greater than \(3\);
the offset values \(s^{\text{fwd}}_{0}\) and \(s^{\text{bwd}}_{0}\) to be \(0\);
either \(l^{\text{xwd}} = n_{1} s^{\text{xwd}}_{1}\) or \(s^{\text{xwd}}_{1} = M l^{\text{xwd}}\) for batched, two-dimensional real transforms (for \(\text{x} = \text{f}\) and \(\text{x} = \text{b}\));
either \(s^{\text{xwd}}_{1} = n_{2} s^{\text{xwd}}_{2}\) (along with \(l^{\text{xwd}} = n_{1} s^{\text{xwd}}_{1}\) if \(M > 1\)) or \(s^{\text{xwd}}_{2} = n_{1} s^{\text{xwd}}_{1}\) (along with \(s^{\text{xwd}}_{1} = M l^{\text{xwd}}\) if \(M > 1\)) for three-dimensional real transforms (for \(\text{x} = \text{f}\) and \(\text{x} = \text{b}\));
real descriptors to use config_value::COMPLEX_COMPLEX for config_param::CONJUGATE_EVEN_STORAGE and config_value::CCE_FORMAT for config_param::PACKED_FORMAT.
Note
If parts of the block allocation used for output results are irrelevant to the computed DFT (e.g., padding elements between successive data sequences), a descriptor object committed to a GPU device may overwrite them.
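Some of the GPU conditions listed above lend themselves to a quick pre-commit check. The sketch below covers only the rank and offset conditions and the condition for batched, two-dimensional real transforms; the function names are hypothetical, and the code does not query or call oneMKL in any way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical check: rank d no greater than 3 and zero offsets in both domains.
bool gpu_rank_and_offsets_ok(const std::vector<std::int64_t>& fwd_strides,
                             const std::vector<std::int64_t>& bwd_strides) {
    assert(fwd_strides.size() == bwd_strides.size() && fwd_strides.size() >= 2);
    const std::size_t d = fwd_strides.size() - 1;
    return d <= 3 && fwd_strides[0] == 0 && bwd_strides[0] == 0;
}

// Hypothetical check for batched 2D real transforms in one domain:
// either l = n_1 * s_1 or s_1 = M * l.
bool gpu_batched_2d_real_ok(const std::vector<std::int64_t>& strides,  // {s_0, s_1, s_2}
                            std::int64_t distance,                     // l
                            std::int64_t n1,
                            std::int64_t batch) {                      // M
    assert(strides.size() == 3);
    return distance == n1 * strides[1] || strides[1] == batch * distance;
}
```

The second helper is meant to be evaluated once per domain, i.e., with the forward strides and distance and then with the backward ones.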