Computes a group of out-of-place scaled matrix additions using general matrices.
void mkl_somatadd_batch_strided(char ordering, char transa, char transb, size_t rows, size_t cols, float alpha, const float * A, size_t lda, size_t stridea, float beta, const float * B, size_t ldb, size_t strideb, float * C, size_t ldc, size_t stridec, size_t batch_size);
void mkl_domatadd_batch_strided(char ordering, char transa, char transb, size_t rows, size_t cols, double alpha, const double * A, size_t lda, size_t stridea, double beta, const double * B, size_t ldb, size_t strideb, double * C, size_t ldc, size_t stridec, size_t batch_size);
void mkl_comatadd_batch_strided(char ordering, char transa, char transb, size_t rows, size_t cols, MKL_Complex8 alpha, const MKL_Complex8 * A, size_t lda, size_t stridea, MKL_Complex8 beta, const MKL_Complex8 * B, size_t ldb, size_t strideb, MKL_Complex8 * C, size_t ldc, size_t stridec, size_t batch_size);
void mkl_zomatadd_batch_strided(char ordering, char transa, char transb, size_t rows, size_t cols, MKL_Complex16 alpha, const MKL_Complex16 * A, size_t lda, size_t stridea, MKL_Complex16 beta, const MKL_Complex16 * B, size_t ldb, size_t strideb, MKL_Complex16 * C, size_t ldc, size_t stridec, size_t batch_size);
The mkl_omatadd_batch_strided routines perform a series of scaled matrix additions. They are similar to the mkl_omatadd routines, but operate on a group of matrices rather than on a single set of matrices.
The matrices A, B, and C are stored at a constant stride from each other in memory, given by the parameters stridea, strideb, and stridec. The operation is defined as:
for i = 0 … batch_size - 1
    A is a matrix at offset i * stridea in the array a
    B is a matrix at offset i * strideb in the array b
    C is a matrix at offset i * stridec in the array c
    C = alpha * op(A) + beta * op(B)
end for
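where op(A) and op(B) are the forms of A and B selected by transa and transb (for example, the matrix itself or its transpose), and alpha and beta are scalars.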
The input arrays a and b (passed as the A and B arguments) contain all the input matrices, and the single output array c (passed as the C argument) contains all the output matrices. The location of each matrix within its array is determined by the corresponding stride (stridea, strideb, or stridec), and the number of matrices is given by the batch_size parameter.
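The batched operation can be illustrated with a plain C reference loop for the single-precision real case. This is a minimal sketch, not the library's implementation. The names ref_somatadd_batch_strided and op_elem are hypothetical. The sketch assumes that rows and cols are the dimensions of each output matrix C, that ordering is 'R' (row-major) or 'C' (column-major), and that for real data 'T' and 'C' both mean transpose while any other value leaves the matrix unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element (i, j) of op(X) for real data.
   Assumption: 'T'/'C' transpose X; any other value leaves X as stored. */
static float op_elem(const float *x, size_t ld, char ordering, char trans,
                     size_t i, size_t j)
{
    int transposed = (trans == 'T' || trans == 't' ||
                      trans == 'C' || trans == 'c');
    size_t r = transposed ? j : i;  /* row index in the stored matrix */
    size_t c = transposed ? i : j;  /* column index in the stored matrix */
    int row_major = (ordering == 'R' || ordering == 'r');
    return row_major ? x[r * ld + c] : x[c * ld + r];
}

/* Reference loop (not the MKL implementation):
   for each k, C_k = alpha * op(A_k) + beta * op(B_k),
   where matrix k starts at offset k * stride in its array. */
static void ref_somatadd_batch_strided(
    char ordering, char transa, char transb, size_t rows, size_t cols,
    float alpha, const float *a, size_t lda, size_t stridea,
    float beta, const float *b, size_t ldb, size_t strideb,
    float *c, size_t ldc, size_t stridec, size_t batch_size)
{
    int row_major = (ordering == 'R' || ordering == 'r');

    /* Sanity check: the leading dimension of C must cover one stored line. */
    assert(ldc >= (row_major ? cols : rows));

    for (size_t k = 0; k < batch_size; ++k) {
        const float *ak = a + k * stridea;  /* k-th input matrix A */
        const float *bk = b + k * strideb;  /* k-th input matrix B */
        float *ck = c + k * stridec;        /* k-th output matrix C */

        for (size_t i = 0; i < rows; ++i) {
            for (size_t j = 0; j < cols; ++j) {
                float v = alpha * op_elem(ak, lda, ordering, transa, i, j)
                        + beta  * op_elem(bk, ldb, ordering, transb, i, j);
                if (row_major)
                    ck[i * ldc + j] = v;
                else
                    ck[j * ldc + i] = v;
            }
        }
    }
}
```

For example, with two row-major 2-by-3 outputs, A not transposed (lda = 3, stridea = 6) and B stored as 3-by-2 and transposed (ldb = 2, strideb = 6), each output matrix occupies 6 consecutive elements of c when ldc = 3 and stridec = 6.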