Performs a Hermitian rank-k update.
event herk(queue &exec_queue, uplo upper_lower, transpose trans, std::int64_t n, std::int64_t k, T_real alpha, const T* a, std::int64_t lda, T_real beta, T* c, std::int64_t ldc, const vector_class<event> &dependencies = {});
herk supports the following precisions and devices:
T | T_real | Devices Supported |
---|---|---|
std::complex<float> | float | Host, CPU, and GPU |
std::complex<double> | double | Host, CPU, and GPU |
The herk routines compute a rank-k update of a Hermitian matrix C by a general matrix A. The operation is defined as:
C <- alpha*op(A)*op(A)^H + beta*C
where:
op(X) is one of op(X) = X or op(X) = X^H (the conjugate transpose of X),
alpha and beta are real scalars,
C is a Hermitian matrix and A is a general matrix.
Here op(A) is n x k, and C is n x n.
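To make the operation concrete, the following is a minimal host-only sketch of what the update computes for column major storage. It is not the library's implementation: the names `herk_reference`, `Uplo`, and `Trans` are hypothetical stand-ins for illustration, no queue or event is involved, and it assumes (as is conventional for BLAS-style Hermitian routines) that only the triangle selected by the upper/lower argument is read and written.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the library's uplo and transpose enums.
enum class Uplo { upper, lower };
enum class Trans { nontrans, conjtrans };

// Naive column-major reference: C <- alpha*op(A)*op(A)^H + beta*C.
// Assumption: only the triangle selected by `uplo` is touched.
template <typename Real>
void herk_reference(Uplo uplo, Trans trans, std::int64_t n, std::int64_t k,
                    Real alpha, const std::complex<Real>* a, std::int64_t lda,
                    Real beta, std::complex<Real>* c, std::int64_t ldc) {
    assert(n >= 0 && k >= 0 && lda > 0 && ldc > 0);

    // Element (i, l) of op(A), which is always n x k.
    auto op_a = [&](std::int64_t i, std::int64_t l) {
        return trans == Trans::nontrans ? a[i + l * lda]              // A is n x k
                                        : std::conj(a[l + i * lda]);  // A is k x n
    };

    for (std::int64_t j = 0; j < n; ++j) {
        const std::int64_t i_begin = (uplo == Uplo::upper) ? 0 : j;
        const std::int64_t i_end = (uplo == Uplo::upper) ? j + 1 : n;
        for (std::int64_t i = i_begin; i < i_end; ++i) {
            std::complex<Real> sum{};
            for (std::int64_t l = 0; l < k; ++l)
                sum += op_a(i, l) * std::conj(op_a(j, l));
            std::complex<Real> result = alpha * sum + beta * c[i + j * ldc];
            // Diagonal entries of a Hermitian matrix are real.
            if (i == j) result = std::complex<Real>(result.real(), Real(0));
            c[i + j * ldc] = result;
        }
    }
}
```

A sketch like this is mainly useful as a correctness check for small n and k; its cost grows as n*n*k and it makes no attempt at blocking or parallelism.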
The queue where the routine should be executed.
Specifies whether C's data is stored in its upper or lower triangle. See Data Types for more details.
Specifies op(A), the transposition operation applied to A. See Data Types for more details. Supported operations are transpose::nontrans and transpose::conjtrans.
The number of rows and columns in C. The value of n must be at least zero.
The number of columns in op(A). The value of k must be at least zero.
Real scaling factor for the rank-k update.
Pointer to input matrix A. If trans = transpose::nontrans, A is an n-by-k matrix, so the array a must have size at least lda*k if column major layout is used, or at least lda*n if row major layout is used. Otherwise, A is a k-by-n matrix, so the array a must have size at least lda*n if column major layout is used, or at least lda*k if row major layout is used. See Matrix and Vector Storage for more details.
Leading dimension of A. If matrices are stored using column major layout, lda must be at least n if trans=transpose::nontrans, and at least k otherwise. If matrices are stored using row major layout, lda must be at least k if trans=transpose::nontrans, and at least n otherwise. Must be positive.
Real scaling factor for matrix C.
Pointer to input/output matrix C. Must have size at least ldc*n. See Matrix and Vector Storage for more details.
Leading dimension of C. Must be positive and at least n.
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
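The size and leading-dimension rules for A above can be summarized in a small helper. This is an illustrative sketch only: `Layout`, `Trans`, `min_lda`, and `min_a_size` are hypothetical names, not part of the library, and the functions simply encode the constraints stated in the parameter descriptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical enums for illustration; the library defines its own types.
enum class Layout { column_major, row_major };
enum class Trans { nontrans, conjtrans };

// Smallest legal lda for A. The result is at least 1 because lda must be positive.
std::int64_t min_lda(Layout layout, Trans trans, std::int64_t n, std::int64_t k) {
    assert(n >= 0 && k >= 0);
    const std::int64_t rows = (trans == Trans::nontrans) ? n : k;  // rows of A
    const std::int64_t cols = (trans == Trans::nontrans) ? k : n;  // columns of A
    const std::int64_t extent = (layout == Layout::column_major) ? rows : cols;
    return std::max<std::int64_t>(1, extent);
}

// Minimum number of elements the array a must hold for a given lda.
std::int64_t min_a_size(Layout layout, Trans trans, std::int64_t n, std::int64_t k,
                        std::int64_t lda) {
    assert(n >= 0 && k >= 0 && lda > 0);
    const std::int64_t rows = (trans == Trans::nontrans) ? n : k;
    const std::int64_t cols = (trans == Trans::nontrans) ? k : n;
    // Column major stores `cols` columns of stride lda; row major stores `rows` rows.
    return lda * ((layout == Layout::column_major) ? cols : rows);
}
```

For example, with n = 5, k = 3, column major layout, and trans = transpose::nontrans, the smallest lda is 5 and the array a needs at least 15 elements.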
Pointer to the output matrix, overwritten by alpha*op(A)*op(A)^H + beta*C. The imaginary parts of the diagonal elements are set to zero.
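The zeroed diagonal imaginary parts are consistent with the definition of the operation. As a short derivation, writing B = op(A) with entries b_il (shorthand introduced here, not part of the routine's notation), each diagonal entry of the result is real whenever alpha and beta are real and C is Hermitian:

```latex
\[
\bigl(\alpha B B^{H} + \beta C\bigr)_{ii}
  = \alpha \sum_{l=1}^{k} b_{il}\,\overline{b_{il}} + \beta\, c_{ii}
  = \alpha \sum_{l=1}^{k} \lvert b_{il} \rvert^{2} + \beta\, c_{ii}
  \in \mathbb{R},
\]
```

since c_ii is real for a Hermitian C. Any nonzero imaginary part on the diagonal can therefore only come from the stored input or from rounding, which is presumably why the routine clears it.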
Output event to wait on to ensure computation is complete.