Computes a matrix-vector product with a general band matrix.
event gbmv(queue &exec_queue, transpose trans, std::int64_t m, std::int64_t n, std::int64_t kl, std::int64_t ku, T alpha, const T *a, std::int64_t lda, const T *x, std::int64_t incx, T beta, T *y, std::int64_t incy, const vector_class<event> &dependencies = {});
The USM version of gbmv supports the following precisions and devices.
T | Devices Supported |
---|---|
float | Host, CPU, and GPU |
double | Host, CPU, and GPU |
std::complex<float> | Host, CPU, and GPU |
std::complex<double> | Host, CPU, and GPU |
The gbmv routines compute a scalar-matrix-vector product and add the result to a scalar-vector product, with a general band matrix. The operation is defined as
y <- alpha*op(A)*x + beta*y
where:
op(A) is one of op(A) = A, op(A) = A^T, or op(A) = A^H,
alpha and beta are scalars,
A is an m-by-n matrix with kl sub-diagonals and ku super-diagonals,
x and y are vectors.
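To make the operation above concrete, the following is a minimal reference sketch in plain C++, not the library's implementation. It assumes real (double) data, unit strides, and the conventional column-major BLAS band layout in which element A(i, j) is stored at a[(ku + i - j) + j*lda]. The names gbmv_ref and Op are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the library's transpose type (real data only,
// so the conjugate transpose case is omitted).
enum class Op { NoTrans, Trans };

// Reference y <- alpha*op(A)*x + beta*y for an m-by-n band matrix with
// kl sub-diagonals and ku super-diagonals.
// Assumptions: column-major band storage, A(i, j) at a[(ku + i - j) + j*lda],
// and unit strides for x and y.
void gbmv_ref(Op op, std::int64_t m, std::int64_t n,
              std::int64_t kl, std::int64_t ku,
              double alpha, const std::vector<double>& a, std::int64_t lda,
              const std::vector<double>& x, double beta,
              std::vector<double>& y) {
    assert(m >= 0 && n >= 0 && kl >= 0 && ku >= 0);
    assert(lda >= kl + ku + 1);
    assert(static_cast<std::int64_t>(a.size()) >= lda * n);

    const std::int64_t len_x = (op == Op::NoTrans) ? n : m;
    const std::int64_t len_y = (op == Op::NoTrans) ? m : n;
    assert(static_cast<std::int64_t>(x.size()) >= len_x);
    assert(static_cast<std::int64_t>(y.size()) >= len_y);

    // Scale y by beta first; beta == 0 overwrites y instead of multiplying.
    for (std::int64_t i = 0; i < len_y; ++i) {
        y[i] = (beta == 0.0) ? 0.0 : beta * y[i];
    }

    // Visit only the entries inside the band of each column j.
    for (std::int64_t j = 0; j < n; ++j) {
        const std::int64_t i_lo = std::max<std::int64_t>(0, j - ku);
        const std::int64_t i_hi = std::min<std::int64_t>(m - 1, j + kl);
        for (std::int64_t i = i_lo; i <= i_hi; ++i) {
            const double aij = a[(ku + i - j) + j * lda];
            if (op == Op::NoTrans) {
                y[i] += alpha * aij * x[j];  // y = alpha*A*x + beta*y
            } else {
                y[j] += alpha * aij * x[i];  // y = alpha*A^T*x + beta*y
            }
        }
    }
}
```

As a worked case, take the 3-by-3 tridiagonal matrix with rows (1 2 0), (3 4 5), (0 6 7), stored with kl = ku = 1 and lda = 3 as {0, 1, 3, 2, 4, 6, 5, 7, 0}. With alpha = 2, beta = 3, and x = y = (1, 1, 1), the sketch yields y = (9, 27, 29) without transposition and y = (11, 27, 27) with transposition.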
exec_queue: The queue where the routine should be executed.
trans: Specifies op(A), the transposition operation applied to A. See Data Types for more details.
m: Number of rows of A. Must be at least zero.
n: Number of columns of A. Must be at least zero.
kl: Number of sub-diagonals of the matrix A. Must be at least zero.
ku: Number of super-diagonals of the matrix A. Must be at least zero.
alpha: Scaling factor for the matrix-vector product.
a: Pointer to input matrix A. The array holding input matrix A must have size at least lda*n if column major layout is used, or at least lda*m if row major layout is used.
lda: Leading dimension of matrix A. Must be positive and at least (kl + ku + 1).
x: Pointer to input vector x. The length len of vector x is n if A is not transposed, and m if A is transposed. The array holding input vector x must be of size at least (1 + (len - 1)*abs(incx)). See Matrix and Vector Storage for more details.
incx: Stride of vector x.
beta: Scaling factor for vector y.
y: Pointer to input/output vector y. The length len of vector y is m if A is not transposed, and n if A is transposed. The array holding input/output vector y must be of size at least (1 + (len - 1)*abs(incy)). See Matrix and Vector Storage for more details.
incy: Stride of vector y.
dependencies: List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
y (output): Pointer to the updated vector y.
Return value: Output event to wait on to ensure computation is complete.
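The size requirements stated above for the matrix and vector arrays can be checked before allocating USM memory. The helpers below are a small sketch of that arithmetic in plain C++. The names Op, Layout, x_length, y_length, min_vector_size, min_lda, and min_matrix_size are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the library's transpose and layout choices.
enum class Op { NoTrans, Trans, ConjTrans };
enum class Layout { ColMajor, RowMajor };

// Logical length of x: n if A is not transposed, m if it is.
std::int64_t x_length(Op op, std::int64_t m, std::int64_t n) {
    return (op == Op::NoTrans) ? n : m;
}

// Logical length of y: m if A is not transposed, n if it is.
std::int64_t y_length(Op op, std::int64_t m, std::int64_t n) {
    return (op == Op::NoTrans) ? m : n;
}

// Minimum element count of a strided vector: 1 + (len - 1)*abs(inc).
// The formula is applied here only for len >= 1.
std::int64_t min_vector_size(std::int64_t len, std::int64_t inc) {
    assert(len >= 1);
    const std::int64_t abs_inc = (inc < 0) ? -inc : inc;
    return 1 + (len - 1) * abs_inc;
}

// Smallest valid leading dimension for the band matrix: kl + ku + 1.
std::int64_t min_lda(std::int64_t kl, std::int64_t ku) {
    assert(kl >= 0 && ku >= 0);
    return kl + ku + 1;
}

// Minimum element count of the array holding A:
// lda*n for column major layout, lda*m for row major layout.
std::int64_t min_matrix_size(Layout layout, std::int64_t m, std::int64_t n,
                             std::int64_t lda) {
    return lda * ((layout == Layout::ColMajor) ? n : m);
}
```

For example, with m = 4, n = 6, kl = 1, ku = 2, and incx = -2 on a non-transposed call, x has logical length 6 and needs at least 11 elements, lda must be at least 4, and a column major array for A with lda = 4 needs at least 24 elements.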