User's Guide

9 GPU Computing
The input values x1 and x2 correspond to pInOut and c in the C function prototype. The
output argument y corresponds to the value of pInOut in the C function prototype after
the C kernel has executed.
The following is a slightly more complicated example that shows a combination of const
and non-const pointers:
void moreComplicated( const float * pIn, float * pInOut1, float * pInOut2 )
The corresponding kernel object in MATLAB then has the properties:
MaxNumLHSArguments: 2
NumRHSArguments: 3
ArgumentTypes: {'in single vector' 'inout single vector' 'inout single vector'}
You can use feval on this code’s kernel (k) with the syntax:
[y1,y2] = feval(k,x1,x2,x3)
The three input arguments x1, x2, and x3 correspond to the three arguments that are
passed into the C function. The output arguments y1 and y2 correspond to the values of
pInOut1 and pInOut2 after the C kernel has executed.
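The argument mapping described above can be sketched in plain C on the host. This is only an illustration under stated assumptions: the body of moreComplicated is hypothetical (the text gives only its prototype), the __global__ qualifier is omitted so the code compiles with an ordinary C compiler, each vector is reduced to a single element, and feval_sketch is an invented helper name that stands in for what feval does with the inout arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical body: the guide shows only the prototype.
   pIn is read-only; both pInOut arguments are read and written. */
void moreComplicated(const float *pIn, float *pInOut1, float *pInOut2)
{
    *pInOut1 += *pIn;   /* update the first inout argument  */
    *pInOut2 *= *pIn;   /* update the second inout argument */
}

/* Host-side stand-in (assumed name) for [y1,y2] = feval(k,x1,x2,x3),
   restricted to scalars. The inout inputs x2 and x3 are copied into the
   outputs first, so the caller's inputs are left unchanged. */
void feval_sketch(float x1, float x2, float x3, float *y1, float *y2)
{
    assert(y1 != NULL && y2 != NULL);
    *y1 = x2;                       /* y1 starts as the value of x2 */
    *y2 = x3;                       /* y2 starts as the value of x3 */
    moreComplicated(&x1, y1, y2);   /* outputs hold the inout values afterwards */
}
```

The const pointer contributes an input only, which is why the kernel object reports three right-hand-side arguments but at most two left-hand-side arguments.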
Complete Kernel Workflow
“Add Two Numbers”
“Add Two Vectors”
“Example with CU and PTX Files”
Add Two Numbers
This example adds two doubles together on the GPU. You need the NVIDIA CUDA Toolkit
installed and CUDA-capable drivers for your device.
1. The CU code to do this is as follows.
__global__ void add1( double * pi, double c )
{
    *pi += c;
}
The __global__ qualifier marks the function as an entry point to a kernel. The code
returns its result through the pointer pi, which is both an input and an output. Put
this code in a file called test.cu in the current directory.
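To see why pi acts as both input and output while c is input only, the kernel body can be tried on the host as ordinary C. This is a minimal sketch, not the GPU workflow itself: the __global__ qualifier is dropped so that a standard C compiler accepts the code, and add1_on_host is a hypothetical wrapper name introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same body as the kernel in test.cu, without the __global__ qualifier. */
void add1(double *pi, double c)
{
    assert(pi != NULL);
    *pi += c;   /* the result is written back through the pointer */
}

/* Hypothetical wrapper: takes both inputs by value and returns the
   value that pi points to after the call. */
double add1_on_host(double x1, double x2)
{
    double y = x1;   /* pi starts out holding the first input */
    add1(&y, x2);    /* c is passed by value and is never written back */
    return y;
}
```

Because c is passed by value, only the value behind pi can carry a result back to the caller.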