Solves groups of systems of linear equations with triangular coefficient matrices.
Group API
event trsm_batch(queue &exec_queue, side *left_right, uplo *upper_lower, transpose *trans, diag *unit_diag, std::int64_t *m, std::int64_t *n, T *alpha, T **a, std::int64_t *lda, std::int64_t *stridea, T **b, std::int64_t *ldb, std::int64_t *strideb, std::int64_t group_count, std::int64_t *group_size, const vector_class<event> &dependencies = {});
Strided API
void trsm_batch(queue &exec_queue, side left_right, uplo upper_lower, transpose trans, diag unit_diag, std::int64_t m, std::int64_t n, T alpha, buffer<T,1> &a, std::int64_t lda, std::int64_t stridea, buffer<T,1> &b, std::int64_t ldb, std::int64_t strideb, std::int64_t batch_size);
event trsm_batch(queue &exec_queue, side left_right, uplo upper_lower, transpose trans, diag unit_diag, std::int64_t m, std::int64_t n, T alpha, T *a, std::int64_t lda, std::int64_t stridea, T *b, std::int64_t ldb, std::int64_t strideb, std::int64_t batch_size, const vector_class<event> &dependencies = {});
trsm_batch supports the following precisions and devices.
T | Devices Supported |
---|---|
float | Host, CPU, and GPU |
double | Host, CPU, and GPU |
std::complex<float> | Host, CPU, and GPU |
std::complex<double> | Host, CPU, and GPU |
The trsm_batch routines solve a series of equations of the form op(A) * X = alpha * B or X * op(A) = alpha * B. They are similar to the trsm routine counterparts, but the trsm_batch routines solve linear equations with groups of matrices. The groups contain matrices with the same parameters.
The operation for the strided API is defined as
for i = 0 … batch_size - 1
    A and B are matrices at offset i * stridea and i * strideb in a and b.
    if (left_right == mkl::side::L) then
        computes X such that op(A) * X = alpha * B
    else
        computes X such that X * op(A) = alpha * B
    end if
    B := X
end for
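The strided loop above can be sketched in plain C++. This is a reference illustration only, not the library implementation: it covers a single case (A on the left, lower triangular, not transposed, non-unit diagonal, column-major storage, `double` precision), and the function name `trsm_batch_strided_ref` is hypothetical. It shows how matrix i is located at offset `i * stridea` or `i * strideb` and how B is overwritten by X.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reference sketch (assumption: left side, lower triangular, no transpose,
// non-unit diagonal, column-major). Solves A_i * X_i = alpha * B_i for every
// matrix in the batch and overwrites B_i with X_i.
void trsm_batch_strided_ref(std::int64_t m, std::int64_t n, double alpha,
                            const std::vector<double> &a, std::int64_t lda,
                            std::int64_t stridea,
                            std::vector<double> &b, std::int64_t ldb,
                            std::int64_t strideb, std::int64_t batch_size) {
    // Constraints taken from the parameter descriptions.
    assert(m >= 0 && n >= 0 && batch_size >= 0);
    assert(lda > 0 && lda >= m && ldb > 0 && ldb >= m);
    assert(stridea >= lda * m && strideb >= ldb * n);
    assert(static_cast<std::int64_t>(a.size()) >= stridea * batch_size);
    assert(static_cast<std::int64_t>(b.size()) >= strideb * batch_size);

    for (std::int64_t i = 0; i < batch_size; ++i) {
        const double *A = a.data() + i * stridea;  // matrix i of a
        double *B = b.data() + i * strideb;        // matrix i of b
        for (std::int64_t col = 0; col < n; ++col) {
            double *x = B + col * ldb;             // one right-hand-side column
            // Forward substitution; x[k] for k < r already holds the solution.
            for (std::int64_t r = 0; r < m; ++r) {
                double sum = alpha * x[r];
                for (std::int64_t k = 0; k < r; ++k)
                    sum -= A[r + k * lda] * x[k];
                x[r] = sum / A[r + r * lda];
            }
        }
    }
}
```

With two 2-by-2 systems packed back to back (`stridea = 4`, `strideb = 2`), one call solves both and leaves the solutions in `b`.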
The operation for the group API is defined as
idx = 0
for i = 0 … group_count - 1
    left_right, upper_lower, alpha, and group_size are taken at position i in their respective arrays.
    for j = 0 … group_size - 1
        A and B are the matrices at position idx in the a and b pointer arrays.
        if (left_right == mkl::side::L) then
            computes X such that op(A) * X = alpha * B
        else
            computes X such that X * op(A) = alpha * B
        end if
        B := X
        idx := idx + 1
    end for
end for
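The index bookkeeping of the group loop can be illustrated with a short, library-independent C++ sketch. The names `BatchEntry`, `compute_total_batch_count`, and `enumerate_batch` are hypothetical helpers introduced here for illustration. They show how the running index idx maps each matrix to its group, and that the pointer arrays must hold the sum of all group_size entries.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Location of one matrix in the flat pointer arrays of the group API.
struct BatchEntry {
    std::int64_t group;  // index i into the per-group arrays (m, n, alpha, ...)
    std::int64_t idx;    // index into the pointer arrays a and b
};

// Number of matrices over all groups: the sum of the group_size entries.
std::int64_t compute_total_batch_count(const std::vector<std::int64_t> &group_size) {
    std::int64_t total = 0;
    for (std::int64_t g : group_size) {
        assert(g >= 0);
        total += g;
    }
    return total;
}

// Visits the matrices in the same order as the group pseudocode:
// all matrices of group 0 first, then group 1, and so on.
std::vector<BatchEntry> enumerate_batch(const std::vector<std::int64_t> &group_size) {
    std::vector<BatchEntry> entries;
    std::int64_t idx = 0;
    const std::int64_t group_count = static_cast<std::int64_t>(group_size.size());
    for (std::int64_t i = 0; i < group_count; ++i) {
        for (std::int64_t j = 0; j < group_size[i]; ++j) {
            entries.push_back({i, idx});
            ++idx;
        }
    }
    return entries;
}
```

For two groups of sizes 2 and 3, the pointer arrays hold five entries; entries 0 and 1 use the parameters of group 0, and entries 2 through 4 use those of group 1.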
where:
op(A) is one of op(A) = A, or op(A) = A^T, or op(A) = A^H
alpha is a scalar
A is a triangular matrix
B and X are m x n general matrices
For the strided API, the a and b buffers (arrays, for the USM API) contain all the input matrices. The stride between matrices is either given by the exact size of the matrix or by the stride parameter. The total number of matrices in a and b is given by the batch_size parameter.
For the group API, the a and b arrays contain pointers to all the input matrices. The total number of matrices in a and b is given by total_batch_count, the sum of all the group_size entries.
A is either m x m or n x n, depending on whether it multiplies X on the left or right. On return, the matrix B is overwritten by the solution matrix X.
Input Parameters
Strided API
left_right: Specifies whether the matrices A multiply X on the left (side::left) or on the right (side::right). See Data Types for more details.
upper_lower: Specifies whether the matrices A are upper or lower triangular. See Data Types for more details.
trans: Specifies op(A), the transposition operation applied to the matrices A. See Data Types for more details.
unit_diag: Specifies whether the matrices A are assumed to be unit triangular (all diagonal elements are 1). See Data Types for more details.
m: Number of rows of the B matrices. Must be at least zero.
n: Number of columns of the B matrices. Must be at least zero.
alpha: Scaling factor for the solutions.
a: Buffer (array, for the USM API) holding the input matrices A. Must have size at least stridea*batch_size.
lda: Leading dimension of the matrices A. Must be at least m if left_right = side::left, and at least n if left_right = side::right. Must be positive.
stridea: Stride between the different A matrices. If left_right = side::left, the matrices A are m-by-m matrices, so stridea must be at least lda*m. If left_right = side::right, the matrices A are n-by-n matrices, so stridea must be at least lda*n.
b: Buffer (array, for the USM API) holding the input matrices B. Must have size at least strideb*batch_size.
ldb: Leading dimension of the matrices B. If matrices are stored column-major, ldb must be at least m. If matrices are stored row-major, ldb must be at least n. Must be positive.
strideb: Stride between the different B matrices. If matrices are stored column-major, strideb must be at least ldb*n. If matrices are stored row-major, strideb must be at least ldb*m.
batch_size: Specifies the number of triangular linear systems to solve.
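The size constraints for the strided API can be collected in a small helper. The following is a minimal sketch, not part of the library: `Side`, `Layout`, `StridedSizes`, and `min_strided_sizes` are hypothetical stand-ins defined here so the example compiles on its own, and the layout is passed as an explicit argument purely for illustration. It computes the smallest leading dimensions and strides allowed for tightly packed matrices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

enum class Side { left, right };             // stand-in for the library's side type
enum class Layout { col_major, row_major };  // hypothetical layout selector

struct StridedSizes {
    std::int64_t min_lda;
    std::int64_t min_stridea;
    std::int64_t min_ldb;
    std::int64_t min_strideb;
};

// Smallest values that satisfy the documented lower bounds.
StridedSizes min_strided_sizes(Side left_right, Layout layout,
                               std::int64_t m, std::int64_t n) {
    assert(m >= 0 && n >= 0);
    const std::int64_t one = 1;
    // A is m-by-m when it multiplies X on the left, n-by-n on the right.
    const std::int64_t order = (left_right == Side::left) ? m : n;
    const bool col = (layout == Layout::col_major);

    StridedSizes s{};
    s.min_lda = std::max(order, one);          // leading dimensions must be positive
    s.min_stridea = s.min_lda * order;         // lda*m (left) or lda*n (right)
    s.min_ldb = std::max(col ? m : n, one);    // m (column-major) or n (row-major)
    s.min_strideb = s.min_ldb * (col ? n : m); // ldb*n or ldb*m
    return s;
}
```

The a and b buffers then need at least `min_stridea * batch_size` and `min_strideb * batch_size` elements. Larger strides are also valid, for example to pad each matrix to an alignment boundary.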
Group API
left_right: Array of size group_count which specifies whether the matrices A in each group multiply each X in the same group on the left (side::left) or on the right (side::right). See Data Types for more details.
upper_lower: Array of size group_count which specifies whether the matrices A in each group are upper or lower triangular. See Data Types for more details.
trans: Array of size group_count which specifies op(A), the transposition operation applied to the matrices A in each group. See Data Types for more details.
unit_diag: Array of size group_count which specifies whether the matrices A in each group are assumed to be unit triangular (all diagonal elements are 1). See Data Types for more details.
m: Array of size group_count which specifies the number of rows of the B matrices in each group. Each must be at least zero.
n: Array of size group_count which specifies the number of columns of the B matrices in each group. Each must be at least zero.
alpha: Array of size group_count containing scaling factors for the solutions in each group.
a: Array of size total_batch_count holding pointers to each A matrix. If left_right[i] = side::left, the matrices A in group i are m[i]-by-m[i] matrices, so stridea[i] must be at least lda[i]*m[i]. If left_right[i] = side::right, the matrices A in group i are n[i]-by-n[i] matrices, so stridea[i] must be at least lda[i]*n[i].
lda: Array of size group_count containing leading dimensions of the matrices A in each group. Each lda[i] must be at least m[i] if left_right[i] = side::left, and at least n[i] if left_right[i] = side::right. Each must be positive.
b: Array of size total_batch_count holding pointers to each B matrix. If matrices are stored column-major, strideb[i] must be at least ldb[i]*n[i]. If matrices are stored row-major, strideb[i] must be at least ldb[i]*m[i].
ldb: Array of size group_count containing leading dimensions of the matrices B in each group. If matrices are stored column-major, each ldb[i] must be at least m[i]. If matrices are stored row-major, each ldb[i] must be at least n[i]. Each ldb[i] must be positive.
group_count: Specifies the number of groups.
group_size: Array of size group_count. Element i specifies the number of triangular linear systems to solve in group i.
Output Parameters
Strided API
b: Output buffer (output array, for the USM API), overwritten by batch_size solution matrices X.
Group API
b: Output array, containing pointers to arrays overwritten by the total_batch_count solution matrices X.
If alpha = 0, matrix B is set to zero, and the matrices A and B do not need to be initialized before calling trsm_batch.