User's Guide
9 GPU Computing
Execution time on GPU = 0.0020537
Maximum absolute error = 1.1374e-14
In conclusion, vectorizing the code makes both the CPU and GPU versions run faster,
but it helps the GPU version much more than the CPU version. The improved CPU version
is nearly twice as fast as the original; the improved GPU version is 13 times faster
than the original. The GPU code went from being 40% slower than the CPU code in the
original version to about five times faster in the revised version.
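
As a rough consistency check on these ratios, the following sketch recomputes the revised GPU-to-CPU speedup from the figures quoted above. It uses relative times only, not measured data. The variable names are illustrative, and it assumes that "40% slower" means the original GPU version took 1.4 times as long as the original CPU version.

```matlab
% Relative run times, normalized so that the original CPU time is 1.
% These are ratios taken from the text, not measurements.
cpuOriginal = 1;
gpuOriginal = 1.4 * cpuOriginal;   % assumption: "40% slower" = 1.4x the run time

cpuRevised = cpuOriginal / 2;      % "nearly twice as fast" (rounded to 2x)
gpuRevised = gpuOriginal / 13;     % "13 times faster"

% How many times faster the revised GPU version is than the revised CPU version
speedup = cpuRevised / gpuRevised;
fprintf('Revised GPU speedup over revised CPU: %.1fx\n', speedup);
```

Under that assumption the ratio works out to 13/2.8, or roughly 4.6, which is consistent with "about five times faster".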