NAG Numerical Routines for GPUs

Performance Results

Single-threaded CPU equivalents of all the GPU routines are provided so that results can be verified. These CPU equivalents are not optimised and should not be used in performance comparisons. For benchmarks, we instead compare our GPU code against Intel's highly optimised MKL/VSL library, with the VSL random number generators parallelised using OpenMP so that they run on multiple CPU cores. The performance figures below were obtained on the following system:

CPU: Intel Core i7 860 running at 2.8 GHz
RAM: 8 GB
GPU: NVIDIA Tesla C2050
OS: Windows 7 64-bit
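
The text does not say how the CPU generators were split across cores. One common scheme is block-splitting with skip-ahead: each worker jumps directly to the start of its own block of the sequence, so the combined output is identical to the sequential stream. The sketch below illustrates the idea in Python with a simple 64-bit linear congruential generator. The generator, its constants, and the function names are all assumptions chosen for illustration. They are not the generators or the API used by NAG or VSL.

```python
# Illustrative LCG constants (Knuth's MMIX); an assumption, not the NAG/VSL generator.
A = 6364136223846793005
C = 1442695040888963407
M = 2**64

def lcg_next(state):
    """Advance the generator by one step."""
    return (A * state + C) % M

def skip_ahead(state, k):
    """Return the state k steps after `state` in O(log k) operations."""
    acc_a, acc_c = 1, 0      # accumulated affine map x -> acc_a*x + acc_c
    cur_a, cur_c = A, C      # affine map for 2**i steps
    while k > 0:
        if k & 1:
            acc_a, acc_c = (cur_a * acc_a) % M, (cur_a * acc_c + cur_c) % M
        # Square the current map: 2**i steps become 2**(i+1) steps.
        cur_a, cur_c = (cur_a * cur_a) % M, ((cur_a + 1) * cur_c) % M
        k >>= 1
    return (acc_a * state + acc_c) % M

def sequential(seed, n):
    """Generate n values on a single worker."""
    out, x = [], seed
    for _ in range(n):
        x = lcg_next(x)
        out.append(x)
    return out

def block_split(seed, n, workers):
    """Generate the same n values, one contiguous block per worker."""
    chunk = -(-n // workers)          # ceiling division
    out = [0] * n
    for w in range(workers):          # iterations are independent of each other
        start = w * chunk
        stop = min(start + chunk, n)
        if start >= stop:
            continue
        x = skip_ahead(seed, start)   # jump to the start of this block
        for i in range(start, stop):
            x = lcg_next(x)
            out[i] = x
    return out
```

Because the loop over workers carries no dependency between iterations, it is the kind of loop that an OpenMP parallel-for could distribute across cores in a C implementation. Here it runs serially to keep the sketch self-contained.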

GPU Table

Figures in bold are for double precision.

Testing and Verification

Verification is performed through a suite of rigorous test programs and by comparing the CPU and GPU values. For the uniform random number generators, the CPU and GPU values are always identical. For the non-uniform distributions (e.g., Normal), small numerical differences may arise because the two platforms implement special functions differently and because many CPU chips use extended precision in intermediate calculations.
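
The two kinds of comparison described above can be sketched as follows: a bit-for-bit check for uniform generators and a tolerance-based check for non-uniform distributions. This is a minimal illustration, not NAG's test suite. The function names and the default tolerance `rel_tol` are assumptions, and a suitable tolerance would depend on the distribution and the precision in use.

```python
import struct

def bits(x):
    """Return the raw 64-bit pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def uniform_streams_match(cpu, gpu):
    """Uniform generators: require bit-for-bit identical values."""
    return len(cpu) == len(gpu) and all(
        bits(a) == bits(b) for a, b in zip(cpu, gpu)
    )

def max_relative_difference(cpu, gpu):
    """Largest relative difference between paired values."""
    worst = 0.0
    for a, b in zip(cpu, gpu):
        scale = max(abs(a), abs(b))
        if scale > 0.0:
            worst = max(worst, abs(a - b) / scale)
    return worst

def nonuniform_streams_match(cpu, gpu, rel_tol=1e-12):
    """Non-uniform distributions: allow small rounding-level differences.

    rel_tol is a hypothetical threshold chosen for this sketch.
    """
    return len(cpu) == len(gpu) and max_relative_difference(cpu, gpu) <= rel_tol
```

The strict check suits generators whose output depends only on integer arithmetic and exact scaling. The tolerant check absorbs differences of a few units in the last place, such as those introduced by differing special-function implementations.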

Acknowledgements

We would like to thank the Technology Strategy Board (TSB) and the Smith Institute for their support in sponsoring this project, and EPSRC for supporting Professor Giles’ academic research.
 
