gbmv (USM Version)
Computes a matrix-vector product with a general band matrix.
Description
The gbmv routines compute a scalar-matrix-vector product and add the result to a scalar-vector product, with a general band matrix. The operation is defined as

$$y \leftarrow \mathrm{alpha} \cdot \mathrm{op}(A) \cdot x + \mathrm{beta} \cdot y$$

where:
- op(A) is one of op(A) = A, op(A) = A^T, or op(A) = A^H,
- alpha and beta are scalars,
- A is an m-by-n matrix with kl sub-diagonals and ku super-diagonals,
- x and y are vectors.
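To make the operation concrete, the following is a minimal sequential sketch of what gbmv computes, written in plain C++ rather than with the library. It assumes column major layout with the conventional BLAS band storage, in which element A(i, j) is stored at a[(ku + i - j) + j*lda], and unit strides (incx = incy = 1). The names `ref_gbmv`, `Trans`, and `conj_if_complex` are hypothetical helpers for illustration only; they are not part of the API documented here.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the library's transpose enum (N = none, T = transpose,
// C = conjugate transpose).
enum class Trans { N, T, C };

// Conjugation that is the identity for real types.
template <typename T> T conj_if_complex(T v) { return v; }
template <typename R> std::complex<R> conj_if_complex(std::complex<R> v) { return std::conj(v); }

// Reference (sequential) y <- alpha*op(A)*x + beta*y with unit strides.
// Assumption: column major band storage, A(i, j) == a[(ku + i - j) + j*lda].
template <typename T>
void ref_gbmv(Trans trans, std::int64_t m, std::int64_t n,
              std::int64_t kl, std::int64_t ku, T alpha,
              const T *a, std::int64_t lda, const T *x, T beta, T *y) {
    assert(m >= 0 && n >= 0 && kl >= 0 && ku >= 0);
    assert(lda >= kl + ku + 1);

    // y has m elements when A is not transposed, n elements otherwise.
    const std::int64_t leny = (trans == Trans::N) ? m : n;
    for (std::int64_t i = 0; i < leny; ++i) y[i] = beta * y[i];

    // Visit only the entries inside the band of each column j.
    for (std::int64_t j = 0; j < n; ++j) {
        const std::int64_t lo = std::max<std::int64_t>(0, j - ku);
        const std::int64_t hi = std::min<std::int64_t>(m - 1, j + kl);
        for (std::int64_t i = lo; i <= hi; ++i) {
            const T aij = a[(ku + i - j) + j * lda];
            if (trans == Trans::N)      y[i] += alpha * aij * x[j];
            else if (trans == Trans::T) y[j] += alpha * aij * x[i];
            else                        y[j] += alpha * conj_if_complex(aij) * x[i];
        }
    }
}
```

For example, the 3-by-3 tridiagonal matrix with rows (1, 2, 0), (3, 4, 5), (0, 6, 7) has kl = ku = 1 and, under the storage assumption above, is held as the nine values 0, 1, 3, 2, 4, 6, 5, 7, 0 with lda = 3.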
API
Syntax
event gbmv(queue &exec_queue,
           transpose trans,
           std::int64_t m,
           std::int64_t n,
           std::int64_t kl,
           std::int64_t ku,
           T alpha,
           const T *a,
           std::int64_t lda,
           const T *x,
           std::int64_t incx,
           T beta,
           T *y,
           std::int64_t incy,
           const vector_class<event> &dependencies = {})
The USM version of gbmv supports the following precisions and devices.
T | Devices Supported |
---|---|
float | Host, CPU, and GPU |
double | Host, CPU, and GPU |
std::complex<float> | Host, CPU, and GPU |
std::complex<double> | Host, CPU, and GPU |
Input Parameters
- exec_queue
The queue where the routine should be executed.
- trans
Specifies op(A), the transposition operation applied to A. See Data Types for more details.
- m
Number of rows of A. Must be at least zero.
- n
Number of columns of A. Must be at least zero.
- kl
Number of sub-diagonals of the matrix A. Must be at least zero.
- ku
Number of super-diagonals of the matrix A. Must be at least zero.
- alpha
Scaling factor for the matrix-vector product.
- a
The array holding input matrix A must have size at least lda*n if column major layout is used, or at least lda*m if row major layout is used.
- lda
Leading dimension of matrix A. Must be at least (kl + ku + 1), and positive.
- x
Pointer to input vector x. The length len of vector x is n if A is not transposed, and m if A is transposed. The array holding input vector x must be of size at least (1 + (len - 1)*abs(incx)). See Matrix Storage for more details.
- incx
Stride of vector x.
- beta
Scaling factor for vector y.
- y
Pointer to input/output vector y. The length len of vector y is m if A is not transposed, and n if A is transposed. The array holding input/output vector y must be of size at least (1 + (len - 1)*abs(incy)), where len is this length. See Matrix Storage for more details.
- incy
Stride of vector y.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
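The size rules in the parameter list can be checked with a few lines of host code before calling the routine. The sketch below is plain C++ with hypothetical helper names (`vector_len`, `min_vector_array_size`, `min_band_array_size`, `pack_band`) and a boolean `transposed` standing in for the `trans` argument; none of these are part of the library. The packing helper additionally assumes column major layout with the conventional BLAS band storage, in which A(i, j) is stored at a[(ku + i - j) + j*lda], and it uses the smallest permitted leading dimension, lda = kl + ku + 1.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Logical length of x (is_x = true) or y (is_x = false).
// x has n elements when A is not transposed and m otherwise; y is the reverse.
std::int64_t vector_len(bool is_x, bool transposed, std::int64_t m, std::int64_t n) {
    return (is_x != transposed) ? n : m;
}

// Minimum number of elements in the array backing a strided vector:
// 1 + (len - 1) * abs(inc). Assumes len >= 1.
std::int64_t min_vector_array_size(std::int64_t len, std::int64_t inc) {
    return 1 + (len - 1) * std::abs(inc);
}

// Minimum number of elements in the array holding A:
// lda*n for column major layout, lda*m for row major layout.
std::int64_t min_band_array_size(bool column_major, std::int64_t m, std::int64_t n,
                                 std::int64_t lda) {
    return lda * (column_major ? n : m);
}

// Pack a dense column major m-by-n matrix (leading dimension m) into band
// storage with lda = kl + ku + 1. Entries outside the band are dropped and
// unused band slots are left as zero.
std::vector<double> pack_band(const std::vector<double> &dense, std::int64_t m,
                              std::int64_t n, std::int64_t kl, std::int64_t ku) {
    const std::int64_t lda = kl + ku + 1;
    std::vector<double> a(static_cast<std::size_t>(lda * n), 0.0);
    for (std::int64_t j = 0; j < n; ++j) {
        const std::int64_t lo = std::max<std::int64_t>(0, j - ku);
        const std::int64_t hi = std::min<std::int64_t>(m - 1, j + kl);
        for (std::int64_t i = lo; i <= hi; ++i) {
            a[(ku + i - j) + j * lda] = dense[i + j * m];
        }
    }
    return a;
}
```

With m = 4, n = 5, and no transposition, x has 5 elements, so a stride of incx = -2 requires an array of at least 1 + 4*2 = 9 elements.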
Output Parameters
- y
Pointer to the updated vector y.
Return Values
Output event to wait on to ensure computation is complete.