You can obtain the best performance on systems with multi-core processors by requiring that threads do not migrate from core to core. To do this, bind threads to the CPU cores by setting an affinity mask for the threads. Use one of the following options:
Consider the following performance issue:
To resolve this issue, before calling Intel® oneAPI Math Kernel Library, set an affinity mask for each OpenMP thread using the KMP_AFFINITY environment variable or the SetThreadAffinityMask system function. The following code example, built with the Intel compiler, shows how to resolve the issue by setting an affinity mask through operating system facilities. The code calls the function SetThreadAffinityMask to bind the threads to appropriate cores, preventing migration of the threads. Then the Intel® oneAPI Math Kernel Library LAPACK routine is called:
// Set affinity mask
#include <windows.h>
#include <omp.h>

int main(void) {
    #pragma omp parallel default(shared)
    {
        int tid = omp_get_thread_num();
        // 2 packages x 2 cores/pkg x 1 threads/core (4 total cores)
        // Bind each thread to its own core: thread tid goes to logical
        // processor tid % 4. The cast keeps the shift pointer-sized.
        DWORD_PTR mask = (DWORD_PTR)1 << (tid % 4);
        SetThreadAffinityMask( GetCurrentThread(), mask );
    }
    // Call Intel MKL LAPACK routine
    return 0;
}
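The choice of which bit to set in the mask can be isolated into a small function. The following is a minimal sketch, not code from the library: the helper name affinity_mask_for_thread is hypothetical, the round-robin mapping of one thread per core is an assumption rather than a requirement, and uintptr_t stands in for the Windows type DWORD_PTR so that the arithmetic can be checked on any platform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of oneMKL or the Windows API.
   Returns a one-bit affinity mask that assigns thread `tid` to
   logical processor `tid % num_cores` (round-robin, an assumption).
   uintptr_t stands in for DWORD_PTR, which is likewise a
   pointer-sized unsigned integer. */
static uintptr_t affinity_mask_for_thread(int tid, int num_cores)
{
    assert(tid >= 0);
    assert(num_cores > 0 && num_cores <= (int)(sizeof(uintptr_t) * 8));
    /* Shift a pointer-sized 1 rather than an int, so that bits
       above 31 remain reachable on 64-bit systems. */
    return (uintptr_t)1 << (tid % num_cores);
}
```

On the four-core system described in the example, threads 0 through 3 would receive the masks 1, 2, 4, and 8, so no two threads share a core.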
Compile the application with the Intel compiler using the following command:
icl /Qopenmp test_application.c
where test_application.c is the filename for the application.
Build the application. Run it with four threads, for example, by using the OMP_NUM_THREADS environment variable to set the number of threads:
set OMP_NUM_THREADS=4
test_application.exe
See Windows API documentation at msdn.microsoft.com/ for the restrictions on the usage of Windows API routines and particulars of the SetThreadAffinityMask function used in the above example.
See also a similar example at en.wikipedia.org/wiki/Affinity_mask.
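On success, SetThreadAffinityMask returns the thread's previous affinity mask, and it returns zero on failure, so the returned value can be inspected to see where a thread was allowed to run before it was bound. The following is a minimal sketch of decoding such a mask. The helper name print_mask_cores is hypothetical, and uintptr_t is again used as a portable stand-in for DWORD_PTR.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: prints the index of every logical processor
   whose bit is set in `mask` and returns how many bits were set. */
static int print_mask_cores(uintptr_t mask)
{
    int count = 0;
    for (unsigned bit = 0; bit < sizeof(mask) * 8; ++bit) {
        if ((mask >> bit) & 1u) {
            printf("core %u\n", bit);
            ++count;
        }
    }
    return count;
}
```

A return value of 0 means that the mask was empty, which for a value returned by SetThreadAffinityMask indicates that the call failed.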
Product and Performance Information |
---|
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex. Notice revision #20201201 |