System information

Intel® Xeon Phi Coprocessor DEVELOPERS QUICK START GUIDE
Host Version:
The following sample code implements the host version of the reduction in C.
float reduction(float *data, int size)
{
    float ret = 0.f;
    for (int i=0; i<size; ++i)
    {
        ret += data[i];
    }
    return ret;
}
Code Example 1: Implementing Reduction Code in C/C++
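As a quick usage check of the host version, the sketch below calls reduction on a small array. The helper
host_reduction_demo and its input values are hypothetical and are not part of the guide; reduction is repeated
only so that the block compiles on its own.

```c
/* Host version of the reduction, repeated from Code Example 1. */
float reduction(float *data, int size)
{
    float ret = 0.f;
    for (int i = 0; i < size; ++i)
    {
        ret += data[i];
    }
    return ret;
}

/* Hypothetical helper (not from the guide): sums the integers 1 through 8. */
float host_reduction_demo(void)
{
    float data[8] = {1.f, 2.f, 3.f, 4.f, 5.f, 6.f, 7.f, 8.f};
    return reduction(data, 8);
}
```

These small integer values are exactly representable as floats, so the result can be compared for equality.
For arbitrary data, a tolerance-based comparison is the safer choice.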
Creating the Offload Version
Serial Reduction with Offload
The programmer uses #pragma offload target(mic), as shown in the example below, to mark statements
(offload constructs) that should execute on the Intel® Xeon Phi™ Coprocessor. The offloaded region consists
of the offload construct plus any additional code that runs on the target as a result of function calls made
from it. By default the offload blocks: execution on the host resumes only after the statements on the target
have finished and the results are available on the host. A variant of this pragma allows asynchronous
execution. The in, out, and inout clauses specify the direction in which data is transferred between the
host and the target.
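The data-direction clauses described above can be sketched as follows. This is a minimal illustration, not
code from the guide: scale_offload and its parameters are hypothetical names. The __INTEL_OFFLOAD guard
assumes that the Intel compiler defines this macro when offload is enabled, so that other compilers skip the
pragma and build a host-only version.

```c
/*
 * Hypothetical helper (not from the guide): writes factor * src[i] into dst[i].
 * src is only read on the target, so it is listed in an in clause.
 * dst is only written on the target, so it is listed in an out clause.
 */
void scale_offload(float *src, float *dst, int size, float factor)
{
#ifdef __INTEL_OFFLOAD  /* assumption: defined by the Intel compiler when offload is enabled */
    #pragma offload target(mic) in(src:length(size)) out(dst:length(size))
#endif
    for (int i = 0; i < size; ++i)
    {
        dst[i] = factor * src[i];
    }
}
```

Listing src under in and dst under out avoids copying the input back to the host and avoids copying the
uninitialized output buffer to the target. A buffer that is both read and modified on the target would use
inout instead.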
Variables used within an offload construct that are declared outside the scope of the construct (including
at file scope) are, by default, copied to the target before execution on the target begins and copied back to
the host on completion.
For example, in the code below, the variable ret is automatically copied to the target before execution on the
target and copied back to the host on completion. The offloaded code below is executed by a single thread on
a single Intel® MIC Architecture core.
float reduction(float *data, int size)
{
    float ret = 0.f;
    #pragma offload target(mic) in(data:length(size))
    for (int i=0; i<size; ++i)
    {
        ret += data[i];
    }
    return ret;
}
Code Example 2: Serial Reduction with Offload
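The implicit copy of ret in Code Example 2 can also be written out explicitly. The sketch below is a variant
for illustration, not code from the guide: reduction_explicit is a hypothetical name. The __INTEL_OFFLOAD
guard assumes that the Intel compiler defines this macro when offload is enabled, so that other compilers
build a host-only version.

```c
/* Hypothetical variant of Code Example 2 that names the transfer of ret explicitly. */
float reduction_explicit(float *data, int size)
{
    float ret = 0.f;
#ifdef __INTEL_OFFLOAD  /* assumption: defined by the Intel compiler when offload is enabled */
    /* inout(ret): copy ret to the target before the loop and back to the host afterwards. */
    #pragma offload target(mic) in(data:length(size)) inout(ret)
#endif
    for (int i = 0; i < size; ++i)
    {
        ret += data[i];
    }
    return ret;
}
```

Spelling out inout(ret) does not change the behavior described above. It only documents which variables cross
the host/target boundary, which can make larger offload regions easier to review.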