Lets you change the default method used to allocate local copies of variable-length/assumed-size arrays that appear in privatization clauses (private, firstprivate, and so on) on OpenMP* teams and distribute constructs.
Linux: -fopenmp-target-teams-default-vla-alloc-mode=arg
Windows: /Qopenmp-target-teams-default-vla-alloc-mode:arg
malloc: Use malloc/free to allocate and deallocate the local copies, making them shared across the threads of a team. The size of the buffer that serves these malloc/free calls may need to be adjusted using LIBOMPTARGET_DYNAMIC_MEMORY_SIZE=<num-mbytes>.
wilocal: Default. Use stack allocation, making the local copies private to each thread of each team, that is, local to each work-item. This uses more memory but avoids synchronization overhead.
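As a rough aid for choosing a value for LIBOMPTARGET_DYNAMIC_MEMORY_SIZE in malloc mode, the sketch below estimates a lower bound in megabytes. It assumes one copy of the array per team and that a megabyte means 2^20 bytes; neither assumption is stated by this topic, and the runtime may need additional space for other allocations. The helper name vla_heap_mbytes is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rough lower bound, in megabytes, for the dynamic
   memory needed when each team holds one malloc-allocated copy of a VLA.
   Assumes 1 megabyte = 2^20 bytes and ignores any other runtime usage. */
size_t vla_heap_mbytes(size_t num_teams, size_t elems, size_t elem_size) {
    assert(num_teams > 0 && elems > 0 && elem_size > 0);
    const size_t mib = (size_t)1 << 20;
    size_t bytes = num_teams * elems * elem_size;  /* one copy per team */
    return (bytes + mib - 1) / mib;                /* round up to whole megabytes */
}
```

For instance, 64 teams that each privatize an array of 2^20 four-byte elements would need at least 256 megabytes under these assumptions.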
This option specifies how local copies are allocated by default for variable-length/assumed-size arrays that are privatized on teams and distribute constructs when targeting spir64 devices.
When the compiler's analyses determine that a VLA does not need to be shared across the threads of a team, the compiler always uses the wilocal mode to allocate its private copies, regardless of the mode selected with this option.
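The allocation mode matters when threads of a team rely on sharing the privatized array. The following is a minimal sketch of such a case, assuming the same offload setup as the examples in this topic; the function name count_threads_seeing_writes is hypothetical. The team's initial thread fills the array, and each thread in the nested parallel region then checks whether it can see those values. With malloc mode, all threads read the same copy. With wilocal mode, each thread reads its own copy, so the count may differ; this sketch makes no claim about the exact result on a given device.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts how many threads of the team observe the
   values written into the privatized VLA before the parallel region. */
int count_threads_seeing_writes(int n) {
    assert(n > 0);  /* a VLA must have a positive length */
    int x[n];
    int seen = 0;

    #pragma omp target teams num_teams(1) thread_limit(4) private(x) map(tofrom: seen)
    {
        /* Runs before the parallel region, on the team's initial thread. */
        for (int i = 0; i < n; i++)
            x[i] = i + 1;

        #pragma omp parallel shared(x) reduction(+: seen)
        {
            /* Each thread checks whether it sees the values written above. */
            int ok = 1;
            for (int i = 0; i < n; i++)
                if (x[i] != i + 1)
                    ok = 0;
            seen += ok;
        }
    }
    return seen;
}
```

Calling count_threads_seeing_writes(4) from main and printing the result under each mode is one way to check whether code that depends on sharing needs the malloc mode.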
None
The following examples show how to use this option.
Linux
cat tgt_teams_par_priv_vla.c
#include <stdio.h>
void f1(int n) {
  int x[n];
  #pragma omp target teams num_teams(1) private(x) thread_limit(4)
  {
    #pragma omp parallel shared(x)
    #pragma omp critical
    printf("%p\n", (void *)&x[0]);
  }
}
int main() { f1(4); }
# Default (wilocal): each thread in the team has its own x, so the addresses differ.
icpx -O0 -fiopenmp -fopenmp-targets=spir64 tgt_teams_par_priv_vla.c && ./a.out
0x3f00000000860390
0x3f000000008603a0
0x3f000000008603b0
0x3f000000008603c0
# malloc: every thread in the team shares the same x, so the addresses are identical.
icpx -O0 -fiopenmp -fopenmp-targets=spir64 tgt_teams_par_priv_vla.c -fopenmp-target-teams-default-vla-alloc-mode=malloc && ./a.out
0xff00000026600000
0xff00000026600000
0xff00000026600000
0xff00000026600000
Windows
type tgt_teams_par_priv_vla.c
#include <stdio.h>
void f1(int n) {
  int x[n];
  #pragma omp target teams num_teams(1) private(x) thread_limit(4)
  {
    #pragma omp parallel shared(x)
    #pragma omp critical
    printf("%p\n", (void *)&x[0]);
  }
}
int main() { f1(4); }
# Default (wilocal): each thread in the team has its own x, so the addresses differ.
icx /Od /Qiopenmp /Qopenmp-targets:spir64 tgt_teams_par_priv_vla.c && tgt_teams_par_priv_vla.exe
0x3f00000000860390
0x3f000000008603a0
0x3f000000008603b0
0x3f000000008603c0
# malloc: every thread in the team shares the same x, so the addresses are identical.
icx /Od /Qiopenmp /Qopenmp-targets:spir64 tgt_teams_par_priv_vla.c /Qopenmp-target-teams-default-vla-alloc-mode:malloc && tgt_teams_par_priv_vla.exe
0xff00000026600000
0xff00000026600000
0xff00000026600000
0xff00000026600000