MPI-1 Benchmarks with GPU support

The IMB-MPI1-GPU component provides benchmarks for each of the MPI-1 functions running on a GPU. To do this, it uses the CUDA library or Level Zero.

Note

For IMB-MPI1-GPU benchmarks, memory is allocated on the GPU if a GPU is present. Set I_MPI_OFFLOAD=1 to enable GPU support on the Intel MPI side.

The following benchmarks are available within the IMB-MPI1-GPU component:

  • PingPong

  • PingPongSpecificSource (excluded by default)

  • PingPongAnySource (excluded by default)

  • PingPing

  • PingPingSpecificSource (excluded by default)

  • PingPingAnySource (excluded by default)

  • Sendrecv

  • Exchange

  • Uniband (excluded by default)

  • Biband (excluded by default)

  • Bcast

  • Allgather

  • Allgatherv

  • Scatter

  • Scatterv

  • Gather

  • Gatherv

  • Alltoall

  • Alltoallv

  • Reduce

  • Reduce_scatter

  • Allreduce

  • Barrier

For example, if you run the following command:

I_MPI_OFFLOAD=1 mpirun -np 2 IMB-MPI1-GPU -msglog 3:7 PingPong

Intel(R) MPI Benchmarks selects GPU buffers, because the default value of the -mem_alloc_type option is device.

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.44         0.00
            8         1000         1.86         4.29
           16         1000         1.72         9.28
           32         1000         1.82        17.58
           64         1000         1.86        34.40
          128         1000         2.76        46.40
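
The columns of this output can also be read programmatically. The following is a minimal Python sketch, not part of Intel(R) MPI Benchmarks, that parses the sample table above and recomputes the throughput column. The helper name parse_imb_table is hypothetical, and the formula (bytes divided by time in microseconds, with 1 Mbyte taken as 10^6 bytes) is an assumption inferred from the numbers shown here, not a documented definition.

```python
# Hypothetical helper: not part of Intel(R) MPI Benchmarks.
# SAMPLE_OUTPUT is the sample PingPong output shown in this section.
SAMPLE_OUTPUT = """\
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.44         0.00
            8         1000         1.86         4.29
           16         1000         1.72         9.28
           32         1000         1.82        17.58
           64         1000         1.86        34.40
          128         1000         2.76        46.40
"""


def parse_imb_table(text):
    """Return a list of (bytes, repetitions, t_usec, mbytes_per_sec) rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly four fields and do not start with '#';
        # header and separator lines are skipped.
        if len(fields) != 4 or fields[0].startswith("#"):
            continue
        rows.append(
            (int(fields[0]), int(fields[1]), float(fields[2]), float(fields[3]))
        )
    return rows


rows = parse_imb_table(SAMPLE_OUTPUT)
for nbytes, reps, t_usec, reported in rows:
    # Assumption: throughput = bytes / time in usec (10^6 bytes per Mbyte).
    recomputed = nbytes / t_usec
    print(f"{nbytes:>6} bytes: reported {reported:8.2f}, "
          f"recomputed {recomputed:8.2f} Mbytes/sec")
```

Small differences between the reported and recomputed values are expected, because the t[usec] column is rounded to two decimal places before it is printed.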

Alternatively, you can specify the -mem_alloc_type option:

I_MPI_OFFLOAD=1 mpirun -np 2 IMB-MPI1-GPU -msglog 3:7 -mem_alloc_type cpu PingPong

Intel(R) MPI Benchmarks selects CPU buffers.

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.44         0.00
            8         1000         0.53        15.15
           16         1000         0.52        30.62
           32         1000         0.53        60.94
           64         1000         0.69        92.44
          128         1000         0.56       227.15
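
To compare the two sample runs, the latency columns can be put side by side. The following Python sketch is illustrative only: the function name latency_ratio is hypothetical, and the values are copied from the two sample tables in this section, so the resulting ratios describe these examples and should not be read as general performance figures.

```python
# Latencies t[usec] per message size, copied from the two sample
# PingPong runs in this section (device buffers and CPU buffers).
device_usec = {0: 0.44, 8: 1.86, 16: 1.72, 32: 1.82, 64: 1.86, 128: 2.76}
cpu_usec = {0: 0.44, 8: 0.53, 16: 0.52, 32: 0.53, 64: 0.69, 128: 0.56}


def latency_ratio(device, cpu):
    """Return {message size: device latency / CPU latency} for shared sizes."""
    return {
        size: device[size] / cpu[size]
        for size in sorted(device)
        if size in cpu
    }


ratios = latency_ratio(device_usec, cpu_usec)
for size, ratio in ratios.items():
    print(f"{size:>6} bytes: device/cpu latency ratio = {ratio:.2f}")
```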

See Also