The most significant parameters in HPL.dat are P, Q, NB, and N. Specify them as follows:
P and Q - the number of rows and columns in the process grid, respectively.
P*Q must equal the number of MPI processes that HPL uses.
Choose P≤Q.
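As an illustration of the two constraints above, the following minimal Python sketch lists every process grid that satisfies P*Q = number of MPI processes and P≤Q. The function name `process_grids` is hypothetical and is not part of HPL or the Intel® Distribution for LINPACK* Benchmark. The sketch only enumerates valid grids and does not rank them.

```python
def process_grids(num_procs):
    """Return every (P, Q) pair with P * Q == num_procs and P <= Q."""
    grids = []
    p = 1
    # Only test P up to sqrt(num_procs); beyond that, P would exceed Q.
    while p * p <= num_procs:
        if num_procs % p == 0:
            grids.append((p, num_procs // p))
        p += 1
    return grids


# Example: candidate grids for a run with 8 MPI processes.
print(process_grids(8))
```

Any pair returned by this helper can be entered as the P and Q values in HPL.dat.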
N – the problem size:
For homogeneous runs, choose N divisible by NB*LCM(P,Q), where LCM is the least common multiple of the two numbers.
For heterogeneous runs, see Heterogeneous Support in the Intel® Distribution for LINPACK* Benchmark for how to choose N.
Increasing N usually increases performance, but N is bounded by the available memory. In general, the memory that each MPI process needs to store its part of the matrix (not counting internal buffers) is 8*N*N/(P*Q) bytes, where N is the problem size and P and Q are the process grid dimensions in HPL.dat. A general rule is to choose a problem size that fills 80% of memory.
NB – the block size of the data distribution.
The table below shows the recommended values of NB and element sizes for the CPU version:
Processors | Intel® Distribution for LINPACK* Benchmark | Intel® Optimized HPL-AI* Benchmark |
---|---|---|
Intel® Xeon Processor supporting Intel® Advanced Vector Extensions (Intel® AVX) instructions or older architecture | 256 | 256 |
Intel® Xeon Processor supporting Intel® Advanced Vector Extensions 2 (Intel® AVX2) instructions | 192 | 192 |
Intel® Xeon Processor supporting Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions | 384 | 384 |
Intel® Xeon Processor supporting Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions with Intel® Deep Learning Boost and bfloat16 | 384 | 768 |
Intel® Xeon Processor supporting Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions with Intel® AMX bfloat16 | 384 | 1536 |
Element size | 8 bytes | 4 bytes |
The table below shows the recommended values of NB and element sizes for the GPU version:
Processors | Intel® Distribution for LINPACK* Benchmark | Intel® Optimized HPL-AI* Benchmark |
---|---|---|
Intel® Data Center GPU Series | 384 | 1152 or 1536 |
Element size | 8 bytes | 2 bytes |
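The sizing rules for N can be combined into a short calculation. The Python sketch below is illustrative only and applies to a homogeneous run. It picks the largest N that is divisible by NB*LCM(P,Q) and whose matrix fits within a chosen fraction of memory. The function names, the `total_mem_bytes` argument (assumed here to be the memory summed over all MPI processes), and the `element_size` parameter are assumptions introduced for this example. Using an element size other than 8 bytes in the memory formula is also an assumption based on the element sizes listed in the tables.

```python
import math


def matrix_bytes_per_process(n, p, q, element_size=8):
    """Matrix storage per MPI process, excluding internal buffers."""
    return element_size * n * n / (p * q)


def suggest_problem_size(total_mem_bytes, p, q, nb,
                         mem_fraction=0.80, element_size=8):
    """Largest N divisible by NB*LCM(P,Q) that fits in the memory budget.

    total_mem_bytes is assumed to be the memory summed over all P*Q
    processes, so the constraint is element_size * N * N <= budget.
    Returns 0 if not even one NB*LCM(P,Q) step fits.
    """
    budget = int(total_mem_bytes * mem_fraction)
    n_max = math.isqrt(budget // element_size)
    lcm_pq = p * q // math.gcd(p, q)
    step = nb * lcm_pq
    return (n_max // step) * step


# Example with assumed values: 256 GiB in total, a 2 x 4 grid, NB = 384.
n = suggest_problem_size(256 * 2**30, p=2, q=4, nb=384)
print(n, matrix_bytes_per_process(n, 2, 4))
```

Treat the result as a starting point. Internal buffers are not counted in the formula, so a smaller N may be needed in practice.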