imatcopy_batch#
Computes a group of in-place scaled matrix transpose or copy operations using general matrices.
Description#
The imatcopy_batch routines perform a series of in-place scaled matrix
copies or transpositions. They are similar to the imatcopy
routines, but the imatcopy_batch routines perform their operations with
groups of matrices. The groups contain matrices with the same parameters.
imatcopy_batch supports the following precisions:
T |
|---|
|
|
|
|
imatcopy_batch (Buffer Version)#
Buffer version of imatcopy_batch supports only strided API.
Strided API#
The operation for the strided API is defined as:
for i = 0 … batch_size – 1
AB is a matrix at offset i * stride in ab
AB = alpha * op(AB)
end for
where:
op(X)is one ofop(X) = X,op(X) = X', orop(X) = conjg(X')alphais a scalarABis a matrix to be transformed in place
For the strided API, the single buffer AB contains all the matrices
to be transformed in place. The locations of the individual matrices within
the buffer are given by stride lengths, while the number of
matrices is given by the batch_size parameter.
Syntax#
namespace oneapi::mkl::blas::column_major {
void imatcopy_batch(sycl::queue &queue,
transpose trans,
std::int64_t m,
std::int64_t n,
T alpha,
sycl::buffer<T, 1> &ab,
std::int64_t lda,
std::int64_t ldb,
std::int64_t stride,
std::int64_t batch_size);
}
namespace oneapi::mkl::blas::row_major {
void imatcopy_batch(sycl::queue &queue,
transpose trans,
std::int64_t m,
std::int64_t n,
T alpha,
sycl::buffer<T, 1> &ab,
std::int64_t lda,
std::int64_t ldb,
std::int64_t stride,
std::int64_t batch_size);
}
Input Parameters#
- queue
The queue where the routine should be executed.
- trans
Specifies
op(AB), the transposition operation applied to the matricesAB.- m
Number of rows for each matrix
ABon input. Must be at least 0.- n
Number of columns for each matrix
ABon input. Must be at least 0.- alpha
Scaling factor for the matrix transpose or copy operation.
- ab
Buffer holding the matrices
AB. Must have size at leaststride*batch_size.- lda
Leading dimension of the
ABmatrices on input. If matrices are stored using column major layout,ldamust be at leastm. If matrices are stored using row major layout,ldamust be at leastn. Must be positive.- ldb
Leading dimension of the matrices
ABon output. Must be positive.trans=transpose::nontranstrans=transpose::transortrans=transpose::conjtransColumn major
Must be at least
mMust be at least
nRow major
Must be at least
nMust be at least
m- stride
Stride between the different
ABmatrices. It must be at leastmax(ldb,lda)*max(ka, kb), where:kaismif column major layout is used ornif row major layout is usedkbisnif column major layout is used andABis not transposed, ormotherwise
- batch_size
Specifies the number of matrices to transpose or copy. Must be at least zero.
Output Parameters#
- ab
Output buffer, overwritten by
batch_sizematrix multiply operations of the formalpha*op(AB).
imatcopy_batch (USM Version)#
USM version of imatcopy_batch supports group API and strided API.
Group API#
The type Ti of integer pointers in the group API may be either std::int64_t or std::int32_t.
The operation for the group API is defined as:
idx = 0
for i = 0 … group_count – 1
m,n, alpha, lda, ldb and group_size at position i in their respective arrays
for j = 0 … group_size – 1
AB is a matrix at position idx in AB
AB = alpha * op(AB)
idx := idx + 1
end for
end for
where:
op(X)is one ofop(X) = X,op(X) = X', orop(X) = conjg(X')alphais a scalarABis a matrix to be transformed in place
For the group API, the matrices are given by arrays of pointers. AB
represents a matrix stored at the address pointed to by ab.
The total number of entries in ab is given by:
Syntax#
namespace oneapi::mkl::blas::column_major {
sycl::event imatcopy_batch(sycl::queue &queue,
const transpose *trans,
const Ti *m,
const Ti *n,
const T *alpha, T **ab,
const Ti *lda,
const Ti *ldb,
std::int64_t group_count,
const Ti *groupsize,
const std::vector<sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::row_major {
sycl::event imatcopy_batch(sycl::queue &queue,
const transpose *trans,
const Ti *m,
const Ti *n,
const T *alpha, T **ab,
const Ti *lda,
const Ti *ldb,
std::int64_t group_count,
const Ti *groupsize,
const std::vector<sycl::event> &dependencies = {});
}
Input Parameters#
- queue
The queue where the routine should be executed.
- trans
Array of size
group_count. Each elementiin the array specifiesop(AB)the transposition operation applied to the matricesAB.- m
Array of
group_countintegers.m[i]specifies the number of rows inAB[i]on input. Each entry must be at least zero.- n
Array of
group_countintegers.n[i]specifies the number of columns inAB[i]on input. Each entry must be at least zero.- alpha
Array of size
group_countcontaining scaling factors for the matrix transpositions or copies.- ab
Array of size
total_batch_count, holding pointers to arrays used to storeABmatrices.- lda
Array of
group_countintegers.lda[i]specifies the leading dimension of the matrix inputAB. If matrices are stored using column major layout,lda[i]must be at leastm[i]. If matrices are stored using row major layout,lda[i]must be at leastn[i]. Must be positive.- ldb
Array of
group_countintegers.ldb[i]specifcies the leading dimension of the matrixABon output. Eachldb[i]must be positive and satisfy:trans[i]=transpose::nontranstrans[i]=transpose::transortrans[i]=transpose::conjtransColumn major
Must be at least
m[i]Must be at least
n[i]Row major
Must be at least
n[i]Must be at least
m[i]- group_count
Number of groups. Must be at least 0.
- group_size
Array of size
group_count. The elementgroup_size[i]is the number of matrices in the groupi. Each element ingroup_sizemust be at least 0.- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters#
- ab
Output array of pointers to
ABmatrices, overwritten bytotal_batch_countmatrix transpose or copy operations of the formalpha*op(AB).
Return Values#
Output event to wait on to ensure computation is complete.
Strided API#
The operation for the strided API is defined as:
for i = 0 … batch_size – 1
AB is a matrix at offset i * stride in ab
AB = alpha * op(AB)
end for
where:
op(X)is one ofop(X) = X,op(X) = X', orop(X) = conjg(X')alphais a scalarABis a matrix to be transformed in place
For the strided API, the single array ab contains all the matrices AB
to be transformed in place. The locations of the individual matrices within
the array are given by stride lengths, while the number of
matrices is given by the batch_size parameter.
Syntax#
namespace oneapi::mkl::blas::column_major {
sycl::event imatcopy_batch(sycl::queue &queue,
transpose trans,
std::int64_t m,
std::int64_t n,
oneapi::mkl::value_or_pointer<T> alpha,
T *ab,
std::int64_t lda,
std::int64_t ldb,
std::int64_t stride,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::column_major {
sycl::event imatcopy_batch(sycl::queue &queue,
transpose trans,
std::int64_t m,
std::int64_t n,
oneapi::mkl::value_or_pointer<T> alpha,
T *ab,
std::int64_t lda,
std::int64_t ldb,
std::int64_t stride,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {});
}
Input Parameters#
- queue
The queue where the routine should be executed.
- trans
Specifies
op(AB), the transposition operation applied to the matricesAB.- m
Number of rows for each matrix
ABon input. Must be at least 0.- n
Number of columns for each matrix
ABon input. Must be at least 0.- alpha
Scaling factor for the matrix transpose or copy operation. See Scalar Arguments for more information on the
value_or_pointerdata type.- ab
Array holding the matrices
AB. Must have size at leaststride*batch_size.- lda
Leading dimension of the
ABmatrices on input. If matrices are stored using column major layout,ldamust be at leastm. If matrices are stored using row major layout,ldamust be at leastn. Must be positive.- ldb
Leading dimension of the matrices
ABon output. Must be positive.trans=transpose::nontranstrans=transpose::transortrans=transpose::conjtransColumn major
Must be at least
mMust be at least
nRow major
Must be at least
nMust be at least
m- stride
Stride between the different
ABmatrices. It must be at leastmax(ldb,lda)*max(ka, kb), where:kaismif column major layout is used ornif row major layout is usedkbisnif column major layout is used andABis not transposed, ormotherwise
- batch_size
Specifies the number of matrices to transpose or copy. Must be at least zero.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters#
- ab
Output array, overwritten by
batch_sizematrix multiply operations of the formalpha*op(AB).
Return Values#
Output event to wait on to ensure computation is complete.