User's Guide
CUDAKernel
Kernel executable on GPU
Constructor
parallel.gpu.CUDAKernel
Description
A CUDAKernel object represents a CUDA kernel that can execute on a GPU. You create
the kernel when you compile PTX or CU code, as described in “Run CUDA or PTX Code
on GPU” on page 9-20.
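As a minimal sketch of how such an object might be created and configured, the
following assumes that a CUDA source file and its compiled PTX file already exist.
The file names myfun.cu and myfun.ptx, the problem size N, and the preferred block
length of 256 are illustrative assumptions, not part of this guide. The code also
assumes that a supported GPU is selected and that the kernel itself ignores thread
indices beyond N.

```matlab
% Assumption: myfun.ptx was compiled from myfun.cu beforehand
% (for example with "nvcc -ptx myfun.cu"); both names are hypothetical.
k = parallel.gpu.CUDAKernel('myfun.ptx', 'myfun.cu');

% Assumed number of elements the kernel should process.
N = 4096;

% Choose a one-dimensional block no larger than this kernel allows.
blockLen = min(256, k.MaxThreadsPerBlock);
k.ThreadBlockSize = [blockLen, 1, 1];

% Launch enough blocks to cover all N elements.
k.GridSize = [ceil(N / blockLen), 1, 1];

% Show the resulting kernel configuration.
disp(k)
```

After configuration, the kernel can be launched with feval(k, ...), passing
arguments that match the prototype of the kernel in the CU file.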
Methods
Properties
A CUDAKernel object has the following properties:
Property Name         Description

ThreadBlockSize       Size of block of threads on the kernel. This can be an integer vector
                      of length 1, 2, or 3 (since thread blocks can be up to 3-dimensional).
                      The product of the elements of ThreadBlockSize must not exceed
                      the MaxThreadsPerBlock for this kernel, and no element of
                      ThreadBlockSize can exceed the corresponding element of the
                      GPUDevice property MaxThreadBlockSize.

MaxThreadsPerBlock    Maximum number of threads permissible in a single block for this
                      CUDA kernel. The product of the elements of ThreadBlockSize
                      must not exceed this value.

GridSize              Size of grid (effectively the number of thread blocks that will be
                      launched independently by the GPU). This is an integer vector
                      of length 3. None of the elements of this vector can exceed the