User's Guide

9 GPU Computing
These rules have some implications. The most notable is that every output from a kernel
must also be an input to the kernel. A kernel cannot allocate memory on the GPU, so the
caller must supply an input argument that defines the size of each output.
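A minimal sketch of this rule follows. It assumes that the conv.ptx and conv.cu files used elsewhere in this section exist, and that the kernel takes one read-only single vector and one in/out single vector. The vector length n is a hypothetical value. The output is preallocated in MATLAB and passed in as an input, so its size is fixed before the kernel runs:

```matlab
% Assumption: conv.ptx/conv.cu define a kernel with the signature
% (const float * in, float * out), as in this section's example.
k = parallel.gpu.CUDAKernel('conv.ptx','conv.cu');

n   = 1024;                            % hypothetical vector length
in  = gpuArray(rand(n,1,'single'));    % read-only input
out = gpuArray(zeros(n,1,'single'));   % preallocated output, passed in as an input

% The second right-hand argument is returned as the single left-hand output.
out = feval(k, in, out);

result = gather(out);                  % copy the result back to the MATLAB workspace
```

The kernel never allocates anything itself. It only writes into the memory that the preallocated out argument already occupies.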
CUDAKernel Object Properties
When you create a kernel object without a terminating semicolon, or when you type the
object variable at the command line, MATLAB displays the kernel object properties. For
example:
k = parallel.gpu.CUDAKernel('conv.ptx','conv.cu')

k =
  parallel.gpu.CUDAKernel handle
  Package: parallel.gpu

  Properties:
       ThreadBlockSize: [1 1 1]
    MaxThreadsPerBlock: 512
              GridSize: [1 1 1]
      SharedMemorySize: 0
            EntryPoint: '_Z8theEntryPf'
    MaxNumLHSArguments: 1
       NumRHSArguments: 2
         ArgumentTypes: {'in single vector'  'inout single vector'}

The properties of a kernel object control some of its execution behavior. Use dot notation
to alter those properties that can be changed.
For descriptions of the object properties, see the CUDAKernel object reference page. A
typical reason for modifying the settable properties is to specify the number of threads,
as described below.
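As an illustration of setting properties with dot notation, the following sketch sizes a one-dimensional launch to cover n elements. It assumes the same conv.ptx and conv.cu files as the example above. The value of n is a hypothetical problem size, and using one thread per element is only one possible choice:

```matlab
% Assumption: conv.ptx/conv.cu exist, as in the example above.
k = parallel.gpu.CUDAKernel('conv.ptx','conv.cu');

n = 10000;                                  % hypothetical number of elements

% Read-only properties such as MaxThreadsPerBlock can be queried
% but not assigned.
threadsPerBlock = k.MaxThreadsPerBlock;

% Settable properties are changed with dot notation.
k.ThreadBlockSize = [threadsPerBlock 1 1];
k.GridSize        = [ceil(n / threadsPerBlock) 1 1];
```

Rounding up the number of blocks means the last block can contain threads with no element to process, so a kernel launched this way would need its own bounds check.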
Specify Entry Points
If your PTX file contains multiple entry points, you can identify the particular kernel in
myfun.ptx that you want the kernel object k to refer to:
k = parallel.gpu.CUDAKernel('myfun.ptx','myfun.cu','myKernel1');
A single PTX file can contain multiple entry points to different kernels. Each of these
entry points has a unique name. These names are generally mangled (as in C++
mangling). However, when generated by nvcc the PTX name always contains the
original function name from the CU file. For example, if the CU file defines the kernel
function as