axpy_batch

Computes a group of vector-scalar products added to a vector.

Description

The axpy_batch routines perform a series of scalar-vector products, each added to a vector. They are similar to their axpy routine counterparts, but the axpy_batch routines perform vector operations on groups of vectors.

For the group API, each group contains vectors with the same parameters (size and increment). The operation for the group API is defined as

idx = 0
for i = 0 … group_count – 1
     n, alpha, incx, incy and group_size at position i in n_array, alpha_array, incx_array, incy_array and group_size_array
     for j = 0 … group_size – 1
         x and y are vectors of size n at position idx in x_array and y_array
         y := alpha * x + y
         idx := idx + 1
     end for
end for

The number of entries in x_array and y_array is total_batch_count, which is the sum of all the group_size entries.
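The group semantics above can be sketched in plain C++ as a host-side reference loop. This only illustrates the pseudocode and is not the library implementation: axpy_ref and axpy_batch_group_ref are hypothetical helper names, and the treatment of negative increments assumes the usual BLAS convention, where the vector is traversed backwards from its last stored element.

```cpp
#include <cassert>
#include <cstdint>

// One axpy: y := alpha * x + y over n elements.
// Assumption: negative increments follow the usual BLAS convention,
// so the starting offset is (1 - n) * inc when inc < 0.
template <typename T>
void axpy_ref(std::int64_t n, T alpha, const T *x, std::int64_t incx,
              T *y, std::int64_t incy) {
    std::int64_t ix = incx < 0 ? (1 - n) * incx : 0;
    std::int64_t iy = incy < 0 ? (1 - n) * incy : 0;
    for (std::int64_t k = 0; k < n; ++k, ix += incx, iy += incy) {
        y[iy] = alpha * x[ix] + y[iy];
    }
}

// Hypothetical reference routine mirroring the group pseudocode:
// parameters are shared within a group, while idx walks over all
// total_batch_count vectors in x and y.
template <typename T>
void axpy_batch_group_ref(const std::int64_t *n, const T *alpha,
                          const T **x, const std::int64_t *incx,
                          T **y, const std::int64_t *incy,
                          std::int64_t group_count,
                          const std::int64_t *group_size) {
    assert(group_count >= 0);
    std::int64_t idx = 0;
    for (std::int64_t i = 0; i < group_count; ++i) {
        assert(group_size[i] >= 0);
        for (std::int64_t j = 0; j < group_size[i]; ++j) {
            axpy_ref(n[i], alpha[i], x[idx], incx[i], y[idx], incy[i]);
            ++idx;
        }
    }
}
```

With two groups of sizes 2 and 1, the pointer arrays hold three entries: the first two vectors use the parameters of group 0, and the third uses those of group 1.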

For the strided API, all vectors x (respectively, y) have the same parameters (size, increment) and are stored at a constant stride stridex (respectively, stridey) from each other. The operation for the strided API is defined as

for i = 0 … batch_size – 1
    X and Y are vectors at offset i * stridex and i * stridey in x and y
    Y := alpha * X + Y
end for
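The strided semantics can likewise be sketched as a host-side reference loop over raw pointers. The name axpy_batch_strided_ref is hypothetical, the code is not the library implementation, and negative increments are assumed to follow the usual BLAS convention (backwards traversal from the last stored element).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference routine mirroring the strided pseudocode:
// vector i of the batch starts at x + i * stridex and y + i * stridey.
template <typename T>
void axpy_batch_strided_ref(std::int64_t n, T alpha,
                            const T *x, std::int64_t incx, std::int64_t stridex,
                            T *y, std::int64_t incy, std::int64_t stridey,
                            std::int64_t batch_size) {
    assert(batch_size >= 0);
    for (std::int64_t i = 0; i < batch_size; ++i) {
        const T *X = x + i * stridex;
        T *Y = y + i * stridey;
        // Assumption: BLAS convention for negative increments.
        std::int64_t ix = incx < 0 ? (1 - n) * incx : 0;
        std::int64_t iy = incy < 0 ? (1 - n) * incy : 0;
        for (std::int64_t k = 0; k < n; ++k, ix += incx, iy += incy) {
            Y[iy] = alpha * X[ix] + Y[iy];
        }
    }
}
```

The two strides are independent, so x can be packed tightly (stridex equal to n when incx is 1) while y is padded with a larger stridey; elements in the padding are never touched.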

API

Syntax

Group API

namespace oneapi::mkl::blas::column_major {
    sycl::event axpy_batch(sycl::queue &queue,
                           std::int64_t *n,
                           T *alpha,
                           const T **x,
                           std::int64_t *incx,
                           T **y,
                           std::int64_t *incy,
                           std::int64_t group_count,
                           std::int64_t *group_size,
                           const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
    sycl::event axpy_batch(sycl::queue &queue,
                           std::int64_t *n,
                           T *alpha,
                           const T **x,
                           std::int64_t *incx,
                           T **y,
                           std::int64_t *incy,
                           std::int64_t group_count,
                           std::int64_t *group_size,
                           const std::vector<sycl::event> &dependencies = {})
}

Strided API

namespace oneapi::mkl::blas::column_major {
    void axpy_batch(sycl::queue &queue,
                    std::int64_t n,
                    T alpha,
                    sycl::buffer<T, 1> &x,
                    std::int64_t incx,
                    std::int64_t stridex,
                    sycl::buffer<T, 1> &y,
                    std::int64_t incy,
                    std::int64_t stridey,
                    std::int64_t batch_size)
}
namespace oneapi::mkl::blas::row_major {
    void axpy_batch(sycl::queue &queue,
                    std::int64_t n,
                    T alpha,
                    sycl::buffer<T, 1> &x,
                    std::int64_t incx,
                    std::int64_t stridex,
                    sycl::buffer<T, 1> &y,
                    std::int64_t incy,
                    std::int64_t stridey,
                    std::int64_t batch_size)
}

axpy_batch supports the following precisions and devices.

T                       Devices Supported
float                   Host, CPU, and GPU
double                  Host, CPU, and GPU
std::complex<float>     Host, CPU, and GPU
std::complex<double>    Host, CPU, and GPU

Input Parameters

Group API

exec_queue

The queue where the routine should be executed.

n_array

Array of size group_count. For group i, n_i = n_array[i] is the number of elements in vectors x and y.

alpha_array

Array of size group_count. For group i, alpha_i = alpha_array[i] is the scalar alpha.

x_array

Array of size total_batch_count of pointers used to store x vectors. The array allocated for each x vector of group i must be of size at least (1 + (n_i - 1)*abs(incx_i)), where n_i = n_array[i] and incx_i = incx_array[i]. See Matrix Storage for more details.

incx_array

Array of size group_count. For group i, incx_i = incx_array[i] is the stride of vector x.

y_array

Array of size total_batch_count of pointers used to store y vectors. The array allocated for each y vector of group i must be of size at least (1 + (n_i - 1)*abs(incy_i)), where n_i = n_array[i] and incy_i = incy_array[i]. See Matrix Storage for more details.

incy_array

Array of size group_count. For group i, incy_i = incy_array[i] is the stride of vector y.

group_count

Number of groups. Must be at least 0.

group_size_array

Array of size group_count. The element group_size_array[i] is the number of vectors in group i. Each element in group_size_array must be at least 0.

dependencies

List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
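The size requirements for the group API arrays can be computed on the host before allocating. The following is a minimal sketch; total_batch_count and min_vector_elements are hypothetical helper names that are not part of the library, and the element-count formula is meant for vectors with at least one element.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <numeric>

// Number of pointers required in x_array and y_array:
// the sum of all group_size_array entries.
inline std::int64_t total_batch_count(const std::int64_t *group_size,
                                      std::int64_t group_count) {
    assert(group_count >= 0);
    return std::accumulate(group_size, group_size + group_count,
                           std::int64_t{0});
}

// Minimum number of elements behind each vector pointer of a group
// with length n and increment inc: 1 + (n - 1) * abs(inc).
inline std::int64_t min_vector_elements(std::int64_t n, std::int64_t inc) {
    return 1 + (n - 1) * std::abs(inc);
}
```

For example, a group with n = 3 and incx = 2 needs at least 5 elements behind each x pointer, and the same holds for incx = -2 because only the absolute value of the increment enters the formula.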

Strided API

exec_queue

The queue where the routine should be executed.

n

Number of elements in vectors x and y.

alpha

Specifies the scalar alpha.

x

Buffer or USM pointer accessible by the queue’s device holding all the input vectors x. The buffer or allocated memory must be of size at least batch_size * stridex.

incx

Stride between two consecutive elements of the x vectors.

stridex

Stride between two consecutive x vectors, must be at least (1 + (n-1)*abs(incx)). See Matrix Storage for more details.

y

Buffer or USM pointer accessible by the queue’s device holding all the input vectors y. The buffer or allocated memory must be of size at least batch_size * stridey.

incy

Stride between two consecutive elements of the y vectors.

stridey

Stride between two consecutive y vectors, must be at least (1 + (n-1)*abs(incy)). See Matrix Storage for more details.

batch_size

Number of axpy computations to perform, which is also the number of x vectors and of y vectors. Must be at least 0.
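The constraints on the strided layout can be checked on the host before a call. The sketch below uses a hypothetical StridedLayout struct and hypothetical helper names; it only encodes the two rules stated in the parameter descriptions (minimum stride and minimum allocation size) and is meant for vectors with at least one element.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical description of one strided operand (x or y).
struct StridedLayout {
    std::int64_t n;           // elements per vector
    std::int64_t inc;         // increment between elements
    std::int64_t stride;      // distance between consecutive vectors
    std::int64_t batch_size;  // number of vectors
};

// True when batch_size >= 0 and stride >= 1 + (n - 1) * abs(inc).
inline bool layout_is_valid(const StridedLayout &l) {
    return l.batch_size >= 0 &&
           l.stride >= 1 + (l.n - 1) * std::abs(l.inc);
}

// Minimum number of elements in the buffer or USM allocation:
// batch_size * stride.
inline std::int64_t min_buffer_elements(const StridedLayout &l) {
    return l.batch_size * l.stride;
}
```

For n = 3 and inc = 2, the smallest valid stride is 5, so a batch of 4 such vectors needs an allocation of at least 20 elements.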

Output Parameters

Group API

y_array

Array of pointers holding the total_batch_count updated vectors y.

Strided API

y

Array or buffer holding the batch_size updated vectors y.