User's Guide

Run CUDA or PTX Code on GPU

The following sections provide details of these commands and workflow steps.

Create a CUDAKernel Object

• Compile a PTX File from a CU File
• Construct CUDAKernel Object with CU File Input
• Construct CUDAKernel Object with C Prototype Input
• Supported Data Types
• Argument Restrictions
• CUDAKernel Object Properties
• Specify Entry Points
• Specify Number of Threads

Compile a PTX File from a CU File

If you have a CU file you want to execute on the GPU, you must first compile it to create a PTX file. One way to do this is with the nvcc compiler in the NVIDIA CUDA Toolkit.

For example, if your CU file is called myfun.cu, you can create a compiled PTX file with the shell command:

nvcc -ptx myfun.cu

This generates the file named myfun.ptx.
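The same compile step can also be driven from within MATLAB. The following is a minimal sketch, assuming that nvcc is on the system path and that myfun.cu is in the current folder; the error identifiers are hypothetical names chosen for illustration.

```matlab
% Sketch: compile myfun.cu to PTX by calling nvcc through the system shell.
% Assumes nvcc (NVIDIA CUDA Toolkit) is on the system path.
cuFile  = 'myfun.cu';
ptxFile = 'myfun.ptx';

% Run the same shell command shown above and capture its status and output.
[status, output] = system(['nvcc -ptx ' cuFile]);

% A nonzero status means the compiler reported a failure.
if status ~= 0
    error('myfun:compileFailed', 'nvcc failed:\n%s', output);
end

% Confirm that the PTX file was written to the current folder.
if exist(ptxFile, 'file') ~= 2
    error('myfun:ptxMissing', 'Expected %s was not created.', ptxFile);
end
```

Checking both the exit status and the presence of the PTX file helps catch a missing compiler or a failed build before you try to construct the kernel object.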

Construct CUDAKernel Object with CU File Input

With a .cu file and a .ptx file, you can create a CUDAKernel object in MATLAB that you can then use to evaluate the kernel:

k = parallel.gpu.CUDAKernel('myfun.ptx','myfun.cu');

Note: You cannot save or load CUDAKernel objects.
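As an illustration of how the constructed object might then be used, the following sketch evaluates the kernel on a small array. It assumes a hypothetical kernel signature in myfun.cu, __global__ void myfun(float *out, const float *in, int n), which is not taken from this guide; adapt the arguments and thread layout to your own kernel. It also requires a supported GPU.

```matlab
% Assumed (hypothetical) kernel in myfun.cu:
%   __global__ void myfun(float *out, const float *in, int n)
k = parallel.gpu.CUDAKernel('myfun.ptx', 'myfun.cu');

% Prepare single-precision data on the GPU to match the float pointers.
n   = 1024;
in  = gpuArray(single(1:n));
out = gpuArray(zeros(1, n, 'single'));

% Choose a launch configuration that covers all n elements.
k.ThreadBlockSize = [256, 1, 1];
k.GridSize        = [ceil(n / 256), 1, 1];

% Non-const pointer arguments (here, out) are returned as outputs of feval.
out = feval(k, out, in, int32(n));

% Copy the result back to host memory.
result = gather(out);
```

Because CUDAKernel objects cannot be saved or loaded, a script like this reconstructs the object from the PTX and CU files each time it runs.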

Construct CUDAKernel Object with C Prototype Input

If you do not have the CU file corresponding to your PTX file, you can specify the C prototype for your C kernel instead of the CU file. For example: