User's Guide
Run CUDA or PTX Code on GPU
The following sections provide details of these commands and workflow steps.
Create a CUDAKernel Object
• “Compile a PTX File from a CU File”
• “Construct CUDAKernel Object with CU File Input”
• “Construct CUDAKernel Object with C Prototype Input”
• “Supported Data Types”
• “Argument Restrictions”
• “CUDAKernel Object Properties”
• “Specify Entry Points”
• “Specify Number of Threads”
Compile a PTX File from a CU File
If you have a CU file you want to execute on the GPU, you must first compile it to create
a PTX file. One way to do this is with the nvcc compiler in the NVIDIA® CUDA® Toolkit.
For example, if your CU file is called myfun.cu, you can create a compiled PTX file with
the shell command:
nvcc -ptx myfun.cu
This generates the file named myfun.ptx.
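The same compile step can also be run from the MATLAB command line. The following is a minimal sketch, assuming that nvcc is on the system path and that myfun.cu is in the current folder; the variable names status and cmdout are illustrative only.

```matlab
% Sketch: invoke nvcc from MATLAB.
% Assumes nvcc is on the system path and myfun.cu is in the current folder.
[status, cmdout] = system('nvcc -ptx myfun.cu');

% A nonzero exit status means the compiler reported a failure.
if status ~= 0
    error('nvcc failed:\n%s', cmdout);
end

% Confirm that the PTX file now exists in the current folder.
ptxExists = (exist('myfun.ptx', 'file') == 2);
assert(ptxExists, 'Expected myfun.ptx was not created.');
```

Checking the exit status before using the PTX file avoids constructing a kernel from a stale file left over from an earlier build.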
Construct CUDAKernel Object with CU File Input
With a .cu file and a .ptx file, you can create a CUDAKernel object in MATLAB that you
can then use to evaluate the kernel:
k = parallel.gpu.CUDAKernel('myfun.ptx','myfun.cu');
Note: You cannot save or load CUDAKernel objects.
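To illustrate how the object can then be used to evaluate the kernel, here is a minimal sketch. It assumes, hypothetically, that myfun.cu defines a single kernel with the prototype shown in the comments, which adds one vector to another using the thread index; the actual prototype depends on your own CU file. The variable names N, in1, in2, result, and resultOnHost are illustrative.

```matlab
% Hypothetical kernel assumed to be in myfun.cu (not from the original text):
%   __global__ void myfun(double *v1, const double *v2)
%   {
%       int idx = threadIdx.x;
%       v1[idx] += v2[idx];
%   }
k = parallel.gpu.CUDAKernel('myfun.ptx','myfun.cu');

N = 128;                      % number of elements, one thread per element
k.ThreadBlockSize = N;        % launch a single block of N threads

in1 = ones(N,1,'gpuArray');   % passed to the non-const pointer (in/out)
in2 = ones(N,1,'gpuArray');   % passed to the const pointer (input only)

% feval launches the kernel; non-const pointer arguments are returned
% as outputs.
result = feval(k, in1, in2);

% Copy the result from GPU memory back to the MATLAB workspace.
resultOnHost = gather(result);
```

Because the kernel indexes only by threadIdx.x, this sketch uses one thread block sized to the data; larger inputs would need a grid of blocks and a matching index calculation in the kernel.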
Construct CUDAKernel Object with C Prototype Input
If you do not have the CU file corresponding to your PTX file, you can specify the C
prototype for your C kernel instead of the CU file. For example: