syrk_batch¶
Computes rank-k updates on a group of symmetric matrices by a group of general matrices.
Description¶
The syrk_batch routines perform a series of symmetric rank-k updates. They are similar to the syrk routines, but the syrk_batch routines operate on groups of matrices, where all matrices within a group share the same parameters.
The operation for the strided API is defined as

    for i = 0 … batch_size – 1
        A and C are the matrices at offsets i * stride_a and i * stride_c in a and c, respectively.
        C = alpha * op(A) * op(A)^T + beta * C
    end for
The operation for the group API is defined as

    idx = 0
    for i = 0 … group_count – 1
        n, k, alpha, beta, lda, ldc, and group_size are taken from position i of their respective arrays.
        for j = 0 … group_size – 1
            A and C are the matrices at position idx of their respective pointer arrays.
            C = alpha * op(A) * op(A)^T + beta * C
            idx := idx + 1
        end for
    end for
where:

op(X) is one of op(X) = X, op(X) = X^T, or op(X) = X^H,

alpha and beta are scalars,

A is a general matrix and C is a symmetric matrix.

The a and c buffers contain all the input matrices. The stride between matrices is either given by the exact size of a matrix or by the stride parameter. The batch_size parameter gives the total number of matrices in the a and c buffers.
Here, op(A) is n-by-k and C is n-by-n.
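To make the strided semantics concrete, the following serial reference sketch (not part of the oneMKL API) computes the same result for column major storage with trans = transpose::nontrans; the function name, the std::vector containers, and the upper flag are illustrative assumptions.

#include <cstdint>
#include <vector>

// Reference-only sketch of the strided batch, column major, trans = nontrans.
template <typename T>
void syrk_batch_strided_reference(std::int64_t n, std::int64_t k, T alpha,
                                  const std::vector<T> &a, std::int64_t lda,
                                  std::int64_t stride_a, T beta, std::vector<T> &c,
                                  std::int64_t ldc, std::int64_t stride_c,
                                  std::int64_t batch_size, bool upper) {
    for (std::int64_t b = 0; b < batch_size; ++b) {
        const T *A = a.data() + b * stride_a; // i-th A matrix starts at offset i * stride_a
        T *C = c.data() + b * stride_c;       // i-th C matrix starts at offset i * stride_c
        for (std::int64_t j = 0; j < n; ++j) {
            // Only the triangle of C selected by upper is read and written.
            std::int64_t i_first = upper ? 0 : j;
            std::int64_t i_last  = upper ? j : n - 1;
            for (std::int64_t i = i_first; i <= i_last; ++i) {
                T sum = T(0);
                for (std::int64_t l = 0; l < k; ++l)
                    sum += A[i + l * lda] * A[j + l * lda]; // (A * A^T)(i, j)
                C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
            }
        }
    }
}

The triangle of C not selected by upper_lower is left untouched, which matches the syrk semantics.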
API¶
Syntax¶
Group API
namespace oneapi::mkl::blas::column_major {
    cl::sycl::event syrk_batch(queue &queue,
                               uplo *upper_lower,
                               transpose *trans,
                               std::int64_t *n,
                               std::int64_t *k,
                               T *alpha,
                               const T **a,
                               std::int64_t *lda,
                               T *beta,
                               T **c,
                               std::int64_t *ldc,
                               std::int64_t group_count,
                               std::int64_t *group_size,
                               const cl::sycl::vector_class<cl::sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::row_major {
    cl::sycl::event syrk_batch(queue &queue,
                               uplo *upper_lower,
                               transpose *trans,
                               std::int64_t *n,
                               std::int64_t *k,
                               T *alpha,
                               const T **a,
                               std::int64_t *lda,
                               T *beta,
                               T **c,
                               std::int64_t *ldc,
                               std::int64_t group_count,
                               std::int64_t *group_size,
                               const cl::sycl::vector_class<cl::sycl::event> &dependencies = {});
}
Strided API
namespace oneapi::mkl::blas::column_major {
    cl::sycl::event syrk_batch(queue &exec_queue,
                               uplo upper_lower,
                               transpose trans,
                               std::int64_t n,
                               std::int64_t k,
                               T alpha,
                               const T *a,
                               std::int64_t lda,
                               std::int64_t stride_a,
                               T beta,
                               T *c,
                               std::int64_t ldc,
                               std::int64_t stride_c,
                               std::int64_t batch_size,
                               const cl::sycl::vector_class<cl::sycl::event> &dependencies = {});

    void syrk_batch(queue &queue,
                    uplo upper_lower,
                    transpose trans,
                    std::int64_t n,
                    std::int64_t k,
                    T alpha,
                    cl::sycl::buffer<T,1> &a,
                    std::int64_t lda,
                    std::int64_t stride_a,
                    T beta,
                    cl::sycl::buffer<T,1> &c,
                    std::int64_t ldc,
                    std::int64_t stride_c,
                    std::int64_t batch_size);
}
namespace oneapi::mkl::blas::row_major {
    cl::sycl::event syrk_batch(queue &exec_queue,
                               uplo upper_lower,
                               transpose trans,
                               std::int64_t n,
                               std::int64_t k,
                               T alpha,
                               const T *a,
                               std::int64_t lda,
                               std::int64_t stride_a,
                               T beta,
                               T *c,
                               std::int64_t ldc,
                               std::int64_t stride_c,
                               std::int64_t batch_size,
                               const cl::sycl::vector_class<cl::sycl::event> &dependencies = {});

    void syrk_batch(queue &queue,
                    uplo upper_lower,
                    transpose trans,
                    std::int64_t n,
                    std::int64_t k,
                    T alpha,
                    cl::sycl::buffer<T,1> &a,
                    std::int64_t lda,
                    std::int64_t stride_a,
                    T beta,
                    cl::sycl::buffer<T,1> &c,
                    std::int64_t ldc,
                    std::int64_t stride_c,
                    std::int64_t batch_size);
}
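As a usage illustration (not taken from the specification), here is a minimal sketch that calls the buffer-based strided API; the default queue selection, the dimensions, the constant fill data, and the "oneapi/mkl.hpp" header path are assumptions.

#include <cstdint>
#include <vector>
#include <CL/sycl.hpp>
#include "oneapi/mkl.hpp"

int main() {
    cl::sycl::queue q;                              // default-selected device (assumption)
    std::int64_t n = 64, k = 32, batch_size = 8;
    std::int64_t lda = n, ldc = n;                  // column major, trans = nontrans
    std::int64_t stride_a = lda * k;                // one A matrix per stride
    std::int64_t stride_c = ldc * n;                // one C matrix per stride

    std::vector<float> a_host(stride_a * batch_size, 1.0f);
    std::vector<float> c_host(stride_c * batch_size, 0.0f);
    {
        cl::sycl::buffer<float, 1> a_buf(a_host.data(), cl::sycl::range<1>(a_host.size()));
        cl::sycl::buffer<float, 1> c_buf(c_host.data(), cl::sycl::range<1>(c_host.size()));

        // C_i = 1.0 * A_i * A_i^T + 0.0 * C_i for i = 0 ... batch_size - 1
        oneapi::mkl::blas::column_major::syrk_batch(
            q, oneapi::mkl::uplo::lower, oneapi::mkl::transpose::nontrans,
            n, k, 1.0f, a_buf, lda, stride_a,
            0.0f, c_buf, ldc, stride_c, batch_size);
    } // buffer destruction synchronizes results back into c_host
    return 0;
}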
syrk_batch supports the following precisions and devices:

T | Devices Supported
---|---
float | Host, CPU, and GPU
double | Host, CPU, and GPU
std::complex<float> | Host, CPU, and GPU
std::complex<double> | Host, CPU, and GPU
Input Parameters¶
Strided API
- upper_lower
  Specifies whether data in C is stored in its upper or lower triangle. For more details, see Data Types.
- trans
  Specifies op(A), the transposition operation applied to A. Conjugation is never performed, even if trans = transpose::conjtrans. For more details, see Data Types.
- n
  Number of rows in op(A), and number of rows and columns in C. The value of n must be at least zero.
- k
  Number of columns in op(A). The value of k must be at least zero.
- alpha
  Scaling factor for the rank-k update.
- a
  Buffer that holds the input matrices A. If trans = transpose::nontrans, each A is an n-by-k matrix, so the array a must have size at least lda*k (respectively, lda*n) if column (respectively, row) major layout is used to store matrices. Otherwise, each A is a k-by-n matrix, so the array a must have size at least lda*n (respectively, lda*k) if column (respectively, row) major layout is used to store matrices. See Matrix and Vector Storage for more details. (These size rules are illustrated in the sketch after this list.)
- lda
  Leading dimension of A. If matrices are stored using column major layout, lda must be at least n if trans = transpose::nontrans, and at least k otherwise. If matrices are stored using row major layout, lda must be at least k if trans = transpose::nontrans, and at least n otherwise. Must be positive.
- stride_a
  Stride between the different A matrices. The value must be nonnegative.
- beta
  Scaling factor for the matrices C.
- c
  Buffer that holds the input/output matrices C. Must have size at least ldc*n. For more details, see Matrix and Vector Storage.
- ldc
  Leading dimension of C. Must be positive and at least n.
- stride_c
  Stride between the different C matrices. The value of stride_c must be at least ldc*n.
- batch_size
  Specifies the number of rank-k update (syrk) operations to perform.
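As a small illustration of the size rules above (not part of oneMKL), the following sketch computes the minimum element counts of the a and c buffers, assuming the matrices are tightly packed so that each stride equals one matrix's footprint; the helper name and struct are hypothetical.

#include <cstdint>

struct StridedSizes {
    std::int64_t a_elems;   // minimum size of the a buffer, in elements
    std::int64_t c_elems;   // minimum size of the c buffer, in elements
};

StridedSizes min_strided_sizes(bool column_major, bool nontrans,
                               std::int64_t n, std::int64_t k,
                               std::int64_t lda, std::int64_t ldc,
                               std::int64_t batch_size) {
    // One A matrix occupies lda*k elements (column major, nontrans) or lda*n
    // otherwise, with the two cases swapped for row major storage.
    std::int64_t a_one = (column_major == nontrans) ? lda * k : lda * n;
    std::int64_t c_one = ldc * n;   // C is always n-by-n
    return { a_one * batch_size, c_one * batch_size };
}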
Group API
- upper_lower
  Array of size group_count. Each element i in the array specifies whether the data in the C matrices of group i is stored in its upper or lower triangle. For more details, see Data Types.
- trans
  Array of size group_count. Each element i in the array specifies op(A), the transposition operation applied to the A matrices of group i. For more details, see Data Types.
- n
  Array of size group_count of numbers of rows of op(A) and C. Each element must be at least zero.
- k
  Array of size group_count of numbers of columns of op(A). Each element must be at least zero.
- alpha
  Array of size group_count that contains the scaling factors for the rank-k updates.
- a
  Array of size total_batch_count of pointers to the A matrices, where total_batch_count is the sum of all the group_size elements. If matrices are stored in column (respectively, row) major layout, the array allocated for an A matrix of group i must have size at least lda[i]*k[i] (respectively, lda[i]*n[i]) if A is not transposed, and at least lda[i]*n[i] (respectively, lda[i]*k[i]) if A is transposed. (A setup sketch follows this list.)
- lda
  Array of size group_count of leading dimensions of the A matrices. If matrices are stored using column major layout, lda[i] must be at least n[i] if A is not transposed, and at least k[i] if A is transposed. If matrices are stored using row major layout, lda[i] must be at least k[i] if A is not transposed, and at least n[i] if A is transposed. Each element must be positive.
- beta
  Array of size group_count containing the scaling factors for the C matrices.
- c
  Array of size total_batch_count of pointers to the C matrices. The array allocated for a C matrix of group i must have size at least ldc[i]*n[i].
- ldc
  Array of size group_count of leading dimensions of the C matrices. ldc[i] must be at least n[i].
- group_count
  Number of groups. Must be at least 0.
- group_size
  Array of size group_count. The element group_size[i] is the number of matrices in group i. Each element in group_size must be at least 0.
- dependencies
  List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
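As a setup illustration for the group API (not taken from the specification), the following sketch allocates the parameter and pointer arrays, uses unified shared memory for the matrices, and issues one call for a single group; the use of cl::sycl::malloc_shared, the header path, and all dimensions are assumptions.

#include <cstdint>
#include <CL/sycl.hpp>
#include "oneapi/mkl.hpp"

int main() {
    cl::sycl::queue q;
    const std::int64_t group_count = 1;
    std::int64_t n[] = {64}, k[] = {32}, lda[] = {64}, ldc[] = {64};
    std::int64_t group_size[] = {4};
    float alpha[] = {1.0f}, beta[] = {0.0f};
    oneapi::mkl::uplo uplo_arr[] = {oneapi::mkl::uplo::lower};
    oneapi::mkl::transpose trans_arr[] = {oneapi::mkl::transpose::nontrans};

    std::int64_t total_batch_count = group_size[0];   // sum over all groups

    // The pointer arrays a and c have total_batch_count entries, one per matrix.
    const float **a = cl::sycl::malloc_shared<const float *>(total_batch_count, q);
    float **c = cl::sycl::malloc_shared<float *>(total_batch_count, q);
    for (std::int64_t i = 0; i < total_batch_count; ++i) {
        float *ai = cl::sycl::malloc_shared<float>(lda[0] * k[0], q);
        float *ci = cl::sycl::malloc_shared<float>(ldc[0] * n[0], q);
        // ... fill ai and ci with application data here ...
        a[i] = ai;
        c[i] = ci;
    }

    cl::sycl::event done = oneapi::mkl::blas::column_major::syrk_batch(
        q, uplo_arr, trans_arr, n, k, alpha, a, lda, beta, c, ldc,
        group_count, group_size);
    done.wait();

    // ... use the results, then release every allocation with cl::sycl::free ...
    return 0;
}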
Output Parameters¶
Strided API
- c
  Output buffer, overwritten by batch_size rank-k update operations of the form C = alpha * op(A) * op(A)^T + beta * C.
Group API
- c
  Output array of pointers to the C matrices, overwritten by total_batch_count rank-k update operations of the form C = alpha * op(A) * op(A)^T + beta * C.