User's Guide

9 GPU Computing

These rules have some implications. The most notable is that every output from a kernel must also be an input to the kernel, because the input is what lets the user define the size of the output. This follows from the kernel being unable to allocate memory on the GPU.
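As a minimal sketch of this implication, the following assumes a hypothetical kernel in myfun.ptx and myfun.cu whose prototype is `__global__ void scale(const float * in, float * out)`. The file names, the kernel, and the vector length are illustrative assumptions, not examples from this guide. Because the kernel cannot allocate memory, the output vector is created in MATLAB and passed in as an input, which fixes its size:

```matlab
% Hypothetical kernel files; the assumed prototype is
%   __global__ void scale(const float * in, float * out)
k = parallel.gpu.CUDAKernel('myfun.ptx', 'myfun.cu');

n   = 1024;                               % illustrative vector length
in  = gpuArray(rand(n, 1, 'single'));     % input data on the GPU
out = gpuArray(zeros(n, 1, 'single'));    % preallocated; defines the output size

% The launch configuration is left at its defaults in this sketch.
% 'out' is passed as an input and comes back as the single output.
result = feval(k, in, out);

hostResult = gather(result);              % copy the result to the MATLAB workspace
```

The non-const pointer argument is treated as both an input and an output, so the returned array has the same size and type as the array that was passed in.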

CUDAKernel Object Properties

When you create a kernel object without a terminating semicolon, or when you type the

object variable at the command line, MATLAB displays the kernel object properties. For

example:

k = parallel.gpu.CUDAKernel('conv.ptx','conv.cu')

k =
  parallel.gpu.CUDAKernel handle
  Package: parallel.gpu

  Properties:
       ThreadBlockSize: [1 1 1]
    MaxThreadsPerBlock: 512
              GridSize: [1 1 1]
      SharedMemorySize: 0
            EntryPoint: '_Z8theEntryPf'
    MaxNumLHSArguments: 1
       NumRHSArguments: 2
         ArgumentTypes: {'in single vector' 'inout single vector'}

The properties of a kernel object control some of its execution behavior. Use dot notation to alter those properties that can be changed.

For descriptions of the object properties, see the CUDAKernel object reference page. A typical reason for modifying the settable properties is to specify the number of threads, as described below.
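For instance, the launch properties might be changed with dot notation as in the sketch below. It reuses the conv.ptx and conv.cu files from the display example and assumes that the kernel processes one element per thread. The element count n is an illustrative assumption.

```matlab
k = parallel.gpu.CUDAKernel('conv.ptx', 'conv.cu');

n = 100000;                               % illustrative number of elements

% Use as many threads per block as the device allows, but no more than n.
blockSize = min(n, k.MaxThreadsPerBlock);
k.ThreadBlockSize = [blockSize 1 1];

% Enough blocks to cover all n elements (the last block may be partly idle).
k.GridSize = [ceil(n / blockSize) 1 1];

% Display the settings that will be used at the next launch.
disp(k.ThreadBlockSize)
disp(k.GridSize)
```

With this configuration the grid can contain more threads than elements, so the kernel code itself would need to ignore thread indices at or beyond n.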

Specify Entry Points

If your PTX file contains multiple entry points, you can identify the particular kernel in myfun.ptx that you want the kernel object k to refer to:

k = parallel.gpu.CUDAKernel('myfun.ptx','myfun.cu','myKernel1');

A single PTX file can contain multiple entry points to different kernels. Each of these entry points has a unique name. These names are generally mangled (as in C++ mangling). However, when generated by nvcc, the PTX name always contains the original function name from the CU file. For example, if the CU file defines the kernel function as