Intel® Xeon Phi Coprocessor DEVELOPERS QUICK START GUIDE
}
}
return ret;
}
Code Example 5: C/C++: Using OpenMP in Offloaded Reduction Code
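real function FTNReductionOMP(data, size)
  implicit none
  integer :: size
  real, dimension(size) :: data
  integer :: i      ! loop index; must be declared under implicit none
  real :: ret
  ! Assign here rather than in the declaration: an initializer would
  ! give ret the SAVE attribute, so the sum would carry over between calls.
  ret = 0.0
  !dir$ omp offload target(mic) in(size) in(data:length(size))
  !$omp parallel do reduction(+:ret)
  do i=1,size
    ret = ret + data(i)
  enddo
  !$omp end parallel do
  FTNReductionOMP = ret
  return
end function FTNReductionOMP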
Code Example 6: Fortran: Using OpenMP* in Offloaded Reduction Code
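The reduction pattern in Code Examples 5 and 6 can also be tried on the host alone, which is a convenient way to check the arithmetic before adding offload. The following is a minimal sketch, not part of the original guide: the function name host_reduction_omp is hypothetical, and the offload pragma is deliberately left out. If the compiler is not invoked with OpenMP support, the pragma is ignored and the loop runs serially with the same result.

```c
/* Host-only sketch (hypothetical name, no offload): sums an array
   with an OpenMP reduction. Each thread accumulates a private copy
   of ret, and the copies are added together when the loop ends. */
float host_reduction_omp(const float *data, int size)
{
    float ret = 0.0f;
    int i;
    #pragma omp parallel for reduction(+:ret)
    for (i = 0; i < size; i++) {
        ret += data[i];
    }
    return ret;
}
```

For example, calling host_reduction_omp on an array holding the values 1 through 8 returns 36. With larger arrays of arbitrary floats the result may differ slightly between runs, because the order in which the partial sums are combined is not fixed.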
Parallel Programming on the Intel® Xeon Phi™ Coprocessor: OpenMP* + Intel® Cilk™ Plus Extended Array Notation
The following code sample extends the OpenMP example with Intel Cilk Plus Extended Array Notation.
The array is split into one contiguous chunk per OpenMP thread, and each thread reduces its chunk with
the built-in __sec_reduce_add() reduction function, which lets the compiler use all 32 of the Intel® MIC
Architecture’s 512-bit vector registers. Any elements left over when the array size is not a multiple of
the thread count are added in a serial loop.
float OMPnthreads_CilkPlusEAN_reduction(float *data, int size)
{
    float ret=0;
    #pragma offload target(mic) in(data:length(size))
    {
        int nthreads = omp_get_max_threads();
        int ElementsPerThread = size/nthreads;
        // each iteration reduces one contiguous chunk of the array
        #pragma omp parallel for reduction(+:ret)
        for(int i=0;i<nthreads;i++)
        {
            ret += __sec_reduce_add(
                data[i*ElementsPerThread:ElementsPerThread]);
        }
        //rest of the array
        for(int i=nthreads*ElementsPerThread; i<size; i++)
        {
            ret+=data[i];
        }