Intel® oneAPI Math Kernel Library Developer Reference - Fortran

Two-stage Algorithm for Inspector-executor Sparse BLAS routines

You can use a two-stage algorithm for Inspector-executor Sparse BLAS routines that produce a sparse matrix. The applicable routines are:

In the two-stage algorithm:

  1. The first stage constructs the structure of the output matrix.
    • For the BSR/CSR storage formats, it fills out the row pointer arrays: rows_start and rows_end for the 4-array variation, or rowIndex for the 3-array variation.
    • For the CSC storage format, it fills out the column pointer arrays: cols_start and cols_end for the 4-array variation, or colIndex for the 3-array variation.
    This stage also lets you estimate the memory required for the desired operation.
  2. The second stage constructs the remaining arrays and performs the desired operation.
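
To make the split between the two stages concrete, here is a minimal sketch in plain C of a CSR-by-CSR product done in two passes. It does not call oneMKL and is not how the library is implemented. The function names spgemm_count and spgemm_fill are hypothetical, and the sketch assumes zero-based indexing, the 3-array CSR variation, int indices, and double values.

```c
#include <assert.h>
#include <stdlib.h>

/* Stage 1 (structure only): fills the row pointer array c_ptr[0..m] of
 * C = A * B and returns nnz(C), or -1 if allocation fails.
 * m is the number of rows of A, n the number of columns of B. */
int spgemm_count(int m, int n,
                 const int *a_ptr, const int *a_col,
                 const int *b_ptr, const int *b_col,
                 int *c_ptr)
{
    assert(m >= 0 && n > 0);
    int *mark = malloc((size_t)n * sizeof *mark);
    if (!mark) return -1;
    for (int j = 0; j < n; ++j) mark[j] = -1;   /* no column seen yet */

    int nnz = 0;
    c_ptr[0] = 0;
    for (int i = 0; i < m; ++i) {
        for (int p = a_ptr[i]; p < a_ptr[i + 1]; ++p) {
            int k = a_col[p];
            for (int q = b_ptr[k]; q < b_ptr[k + 1]; ++q) {
                int j = b_col[q];
                if (mark[j] != i) {             /* first hit of column j in row i */
                    mark[j] = i;
                    ++nnz;
                }
            }
        }
        c_ptr[i + 1] = nnz;
    }
    free(mark);
    return nnz;
}

/* Stage 2 (remaining arrays): fills c_col and c_val, which the caller has
 * sized to the nnz returned by stage 1. Column indices within a row are
 * stored in order of first appearance, not sorted. Returns 0 on success. */
int spgemm_fill(int m, int n,
                const int *a_ptr, const int *a_col, const double *a_val,
                const int *b_ptr, const int *b_col, const double *b_val,
                const int *c_ptr, int *c_col, double *c_val)
{
    assert(m >= 0 && n > 0);
    int *pos = malloc((size_t)n * sizeof *pos);
    if (!pos) return -1;
    for (int j = 0; j < n; ++j) pos[j] = -1;

    for (int i = 0; i < m; ++i) {
        int start = c_ptr[i];
        int next = start;
        for (int p = a_ptr[i]; p < a_ptr[i + 1]; ++p) {
            int k = a_col[p];
            double av = a_val[p];
            for (int q = b_ptr[k]; q < b_ptr[k + 1]; ++q) {
                int j = b_col[q];
                if (pos[j] < start) {           /* column j is new in row i */
                    pos[j] = next;
                    c_col[next] = j;
                    c_val[next] = av * b_val[q];
                    ++next;
                } else {
                    c_val[pos[j]] += av * b_val[q];
                }
            }
        }
    }
    free(pos);
    return 0;
}
```

The caller runs spgemm_count first, allocates c_col and c_val from the returned count, and only then runs spgemm_fill. This is the same ordering the two-stage interface imposes.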

You can call each stage separately or perform the entire computation in a single call. The sparse_request_t parameter selects which of these the routine does:

Values for sparse_request_t parameter

  Value                         Description
  SPARSE_STAGE_NNZ_COUNT        In the first stage, the algorithm computes only the row (CSR/BSR format) or column (CSC format) pointer array of the matrix storage format. The computed number of non-zeroes in the output matrix helps to calculate the amount of memory required.
  SPARSE_STAGE_FINALIZE_MULT    In the second stage, the algorithm computes the remaining column (CSR/BSR format) or row (CSC format) index and value arrays for the output matrix. Use this value only after calling the function with SPARSE_STAGE_NNZ_COUNT first.
  SPARSE_STAGE_FULL_MULT        Combine the two stages by performing the entire computation in a single step.
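
The ordering rule for the three values can be pictured as a small state check. The sketch below is illustrative only: request_t, status_t, output_state_t, and run_request are hypothetical stand-ins, not oneMKL types or functions, and the source does not say how the library itself reacts to a finalize request issued before the count request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins that mirror the three request values. */
typedef enum { STAGE_NNZ_COUNT, STAGE_FINALIZE_MULT, STAGE_FULL_MULT } request_t;
typedef enum { STATUS_OK, STATUS_BAD_ORDER } status_t;

/* Tracks which parts of the output matrix have been computed. */
typedef struct {
    int structure_ready;  /* row or column pointer array is available */
    int values_ready;     /* index and value arrays are available     */
} output_state_t;

/* Applies one request to the output state and enforces the ordering:
 * a finalize request is accepted only after a count request. */
status_t run_request(output_state_t *c, request_t req)
{
    assert(c != NULL);
    switch (req) {
    case STAGE_NNZ_COUNT:
        c->structure_ready = 1;
        c->values_ready = 0;
        return STATUS_OK;
    case STAGE_FINALIZE_MULT:
        if (!c->structure_ready) return STATUS_BAD_ORDER;
        c->values_ready = 1;
        return STATUS_OK;
    case STAGE_FULL_MULT:
        c->structure_ready = 1;
        c->values_ready = 1;
        return STATUS_OK;
    }
    return STATUS_BAD_ORDER;
}
```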

This example uses the two-stage algorithm for the mkl_sparse_sp2m routine with a matrix in CSR format:

First stage (sparse_request_t = SPARSE_STAGE_NNZ_COUNT)

  1. The algorithm calls the mkl_sparse_sp2m routine with the request parameter set to SPARSE_STAGE_NNZ_COUNT.
  2. The algorithm exports the computed rows_start and rows_end arrays using the mkl_sparse_x_export_csr routine.
  3. These arrays are used to calculate the number of non-zeroes (nnz) of the resulting output matrix.

Note that at this stage, the arrays related to column index and values for the output matrix have not been computed.

status = mkl_sparse_sp2m ( opA, descrA, csrA, opB, descrB, csrB, SPARSE_STAGE_NNZ_COUNT, &csrC);

/* optional: compute nnz of the resulting output matrix to estimate the memory requirement */
/* (x in mkl_sparse_x_export_csr stands for the precision character: s, d, c, or z) */
status = mkl_sparse_x_export_csr ( csrC, &indexing, &rows, &cols, &rows_start, &rows_end, &col_indx, &values);

MKL_INT nnz = rows_end[rows-1] - rows_start[0];
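
The nnz formula above subtracts two entries of the exported pointer arrays, so the index base (zero or one) cancels out. It equals the number of stored entries only when rows are stored back to back, that is, when rows_end[i] equals rows_start[i+1]. The following sketch wraps the formula and a rough memory estimate in two helpers. The names csr_nnz and csr_bytes are hypothetical, and long long is used here as a stand-in for MKL_INT, whose actual width depends on the interface layer you build against.

```c
#include <assert.h>
#include <stddef.h>

/* Number of stored entries from the 4-array row pointers.
 * Works for zero-based and one-based indexing because the base cancels. */
long long csr_nnz(long long rows,
                  const long long *rows_start,
                  const long long *rows_end)
{
    assert(rows >= 0);
    if (rows == 0) return 0;
    return rows_end[rows - 1] - rows_start[0];
}

/* Rough byte count for a 4-array CSR matrix: one column index and one
 * value per stored entry, plus the two row pointer arrays. value_size is
 * the size of one matrix element, for example sizeof(double). */
size_t csr_bytes(long long rows, long long nnz, size_t value_size)
{
    assert(rows >= 0 && nnz >= 0);
    size_t index_size = sizeof(long long);
    return (size_t)nnz * (index_size + value_size)
         + 2u * (size_t)rows * index_size;
}
```

An estimate like this lets the application check available memory after the first stage and before it requests the second.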

Second stage (sparse_request_t = SPARSE_STAGE_FINALIZE_MULT)

The algorithm computes the remaining storage arrays (related to column index and values for the output matrix) and performs the desired operation.

status = mkl_sparse_sp2m ( opA, descrA, csrA, opB, descrB, csrB, SPARSE_STAGE_FINALIZE_MULT, &csrC);

Alternatively, you can perform both operations in a single step:

Single stage operation (sparse_request_t = SPARSE_STAGE_FULL_MULT)

status = mkl_sparse_sp2m ( opA, descrA, csrA, opB, descrB, csrB, SPARSE_STAGE_FULL_MULT, &csrC);