MACHINE SPECIFICATION
================================
CPU: Intel Core i7 860 at 2.80GHz
RAM: 8GB
GPU: NVIDIA Tesla C2050
================================

=========================================================================================
BSCallDemo
=========================================================================================
----------------------------------------------------
GPU PROPERTIES
Using NVIDIA Device:   Tesla C2050
Compute Version:       2.0
Clock Rate:            1147MHz
Num. Processors:       14
Max Threads/Block:     1024
Warp Size:             32 threads
Total Memory:          2687MB
Constant Memory:       64KB
Shared Memory:         48KB
Max Registers/Block:   32768
Concurrent Cpy & Exec: true
Concurrent Kernels:    true
----------------------------------------------------
CPU initialization time = 1830.01ms
GPU initialization time = 0ms

Start of Auto-tuning: workPerThd = 5 to 50
Found new min: runtime=10.27843ms, workPerThd=5, thdsPerBlk=32, Est.Blks/SM=133.9286
Found new min: runtime=7.90224ms, workPerThd=5, thdsPerBlk=64, Est.Blks/SM=67
Found new min: runtime=7.860992ms, workPerThd=5, thdsPerBlk=160, Est.Blks/SM=26.78572
Found new min: runtime=7.800288ms, workPerThd=6, thdsPerBlk=128, Est.Blks/SM=27.92857
Found new min: runtime=7.72912ms, workPerThd=8, thdsPerBlk=192, Est.Blks/SM=14
Found new min: runtime=7.70288ms, workPerThd=12, thdsPerBlk=128, Est.Blks/SM=14
Found new min: runtime=7.568864ms, workPerThd=14, thdsPerBlk=192, Est.Blks/SM=8
Found new min: runtime=7.52544ms, workPerThd=17, thdsPerBlk=160, Est.Blks/SM=7.928571
Found new min: runtime=7.498272ms, workPerThd=34, thdsPerBlk=128, Est.Blks/SM=4.928571
Auto-tuning complete
Tuning Statistics: nRuns=736 ave=9.472231 min=7.498272 max=15.10717 stdDev=1.557113

DESCRIPTION:
Simple Black-Scholes path dynamics with deterministic term structures of interest and volatility.
Uses NAG GPU device-level normal random number generators.
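
The ALGTHM line for this demo names the Milstein scheme. As a rough illustration only, the following Python sketch prices a European call by Monte Carlo with Milstein steps and compares the result with the closed-form Black-Scholes price. It is not the NAG implementation: the constant rate (0.05) and volatility (0.2) are assumptions, since the log does not print the term structures, the step count and path count are reduced for speed, and neither the caching nor the single-precision arithmetic of the demo is reproduced. The function names are hypothetical.

```python
import math
import random


def milstein_call_price(s0, strike, maturity, rate, vol, n_steps, n_paths, seed=0):
    """Monte Carlo price of a European call using Milstein steps for GBM.

    Assumes a constant rate and volatility (the demo uses term structures).
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    sqrt_dt = math.sqrt(dt)
    payoff_sum = 0.0
    for _path in range(n_paths):
        s = s0
        for _step in range(n_steps):
            dw = sqrt_dt * rng.gauss(0.0, 1.0)
            # Euler terms plus the Milstein correction 0.5 * vol^2 * S * (dW^2 - dt)
            s += s * (rate * dt + vol * dw + 0.5 * vol * vol * (dw * dw - dt))
        payoff_sum += max(s - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / n_paths


def black_scholes_call(s0, strike, maturity, rate, vol):
    """Closed-form Black-Scholes call price, used only as a sanity check."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * vol * vol) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


# S(0), Strike and Maturity are taken from the log; rate and vol are assumed.
mc = milstein_call_price(100.0, 91.0, 0.77, 0.05, 0.2, n_steps=16, n_paths=20000)
exact = black_scholes_call(100.0, 91.0, 0.77, 0.05, 0.2)
print(f"Milstein MC = {mc:.4f}, closed form = {exact:.4f}")
```

Because the rate and volatility are assumed, the printed prices are not expected to match the prices reported in this log.
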
ALGTHM: Milstein (with caching), SINGLE precision

OPTION:
Type = European CALL
Maturity = 0.77
Strike = 91
S(0) = 100
Time step = 0.006875

SIMULATION PARAMS:
NumTrials = 300000
ThdsPerBlk = 128
WorkPerThd = 34
Est.Blks/SM = 4.928571

RESULTS:
CPU Price = 18.93683475
Std Error of CPU estimate = 0.01594193839
GPU Price = 18.93683476
Std Error of GPU estimate = 0.01594193839
CPU runtime = 1930.135986ms - NOTE: this is single thread, unoptimised
GPU runtime = 7.535711765ms
Speedup = 256.1318665x

Attempting to match CPU values ...
RESULTS:
CPU Price = 18.936834745
Std Error of CPU estimate = 0.015941938385
GPU Price = 18.936834761
Std Error of GPU estimate = 0.015941938385

=========================================================================================
SobolBSCallDemo
=========================================================================================
----------------------------------------------------
GPU PROPERTIES
Using NVIDIA Device:   Tesla C2050
Compute Version:       2.0
Clock Rate:            1147MHz
Num. Processors:       14
Max Threads/Block:     1024
Warp Size:             32 threads
Total Memory:          2687MB
Constant Memory:       64KB
Shared Memory:         48KB
Max Registers/Block:   32768
Concurrent Cpy & Exec: true
Concurrent Kernels:    true
----------------------------------------------------
CPU initialization time = 2480.2229004ms
GPU initialization time = 55.661441803ms

Start of Auto-tuning: workPerThd = 3 to 50
Found new min: runtime=6.529216ms, workPerThd=3, thdsPerBlk=32, Est.Blks/SM=240
Found new min: runtime=3.640032ms, workPerThd=3, thdsPerBlk=64, Est.Blks/SM=121.1429
Found new min: runtime=2.750912ms, workPerThd=3, thdsPerBlk=96, Est.Blks/SM=80
Found new min: runtime=2.747744ms, workPerThd=3, thdsPerBlk=128, Est.Blks/SM=61.71429
Found new min: runtime=2.747328ms, workPerThd=4, thdsPerBlk=192, Est.Blks/SM=32
Found new min: runtime=2.711168ms, workPerThd=5, thdsPerBlk=96, Est.Blks/SM=48
Found new min: runtime=2.69216ms, workPerThd=6, thdsPerBlk=128, Est.Blks/SM=32
Found new min: runtime=2.686752ms, workPerThd=8, thdsPerBlk=192, Est.Blks/SM=16
Found new min: runtime=2.67936ms, workPerThd=15, thdsPerBlk=96, Est.Blks/SM=16
Found new min: runtime=2.633568ms, workPerThd=17, thdsPerBlk=160, Est.Blks/SM=9.142858
Auto-tuning complete
Tuning Statistics: nRuns=768 ave=3.971313 min=2.633568 max=8.404032 stdDev=1.011381

DESCRIPTION:
Simple Black-Scholes path dynamics with deterministic term structures of interest and volatility.
Uses NAG GPU quasi-random (Sobol) generator with optional scrambling and constructs sample paths using a Brownian bridge.
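
The description above mentions building sample paths with a Brownian bridge. A minimal Python sketch of one common bridge construction follows, assuming an equally spaced time grid whose number of steps is a power of two. The function name is hypothetical, and ordinary pseudo-random normals stand in for the scrambled Sobol points that the demo draws from the NAG generator. The first normal fixes the terminal value, and later normals fill in midpoints from coarse to fine, which is why the construction pairs well with low-discrepancy sequences.

```python
import math
import random


def brownian_bridge_path(normals, maturity):
    """Build W(t) on an equally spaced grid from standard normals.

    Assumes len(normals) is a power of two. normals[0] sets W(maturity);
    the remaining normals set midpoints level by level.
    """
    n = len(normals)
    if n < 1 or n & (n - 1):
        raise ValueError("number of steps must be a power of two")
    dt = maturity / n
    w = [0.0] * (n + 1)
    w[n] = math.sqrt(maturity) * normals[0]
    idx = 1
    span = n
    while span > 1:
        half = span // 2
        # Conditional std dev of the midpoint given both endpoints:
        # sqrt((t_mid - t_left) * (t_right - t_mid) / (t_right - t_left))
        std = math.sqrt(half * dt / 2.0)
        for left in range(0, n, span):
            right = left + span
            mid = left + half
            w[mid] = 0.5 * (w[left] + w[right]) + std * normals[idx]
            idx += 1
        span = half
    return w


# Usage with pseudo-random normals (an assumption; the demo uses Sobol points).
rng = random.Random(0)
path = brownian_bridge_path([rng.gauss(0.0, 1.0) for _ in range(8)], maturity=0.77)
print(path)
```

With every normal after the first set to zero, the path reduces to a straight line from 0 to the terminal value, which is a quick way to check the ordering of the bridge.
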
ALGTHM: Milstein (with caching) with OWEN scrambling, SINGLE precision

OPTION:
Type = European CALL
Maturity = 0.77
Strike = 91
S(0) = 100

SIMULATION PARAMS:
NumTrials = 10001
ThdsPerBlk = 160
WorkPerThd = 17
Est.Blks/SM = 9.142858

RESULTS:
CPU Price = 18.94181399, Std Error of CPU estimate = 0.0009396159439
GPU Price = 18.94181393, Std Error of GPU estimate = 0.000939610647
CPU runtime = 2606.839844ms - NOTE: this is single thread, unoptimised
GPU runtime = 58.32470703ms
Speedup = 44.69529343x

=========================================================================================
MultiOptionSobol
=========================================================================================
----------------------------------------------------
GPU PROPERTIES
Using NVIDIA Device:   Tesla C2050
Compute Version:       2.0
Clock Rate:            1147MHz
Num. Processors:       14
Max Threads/Block:     1024
Warp Size:             32 threads
Total Memory:          2687MB
Constant Memory:       64KB
Shared Memory:         48KB
Max Registers/Block:   32768
Concurrent Cpy & Exec: true
Concurrent Kernels:    true
----------------------------------------------------
CPU initialization time = 155.0850067ms
GPU initialization time = 0.9502720237ms

Start of Auto-tuning: workPerThd = 5 to 50
Found new min: runtime=2.654944ms, workPerThd=5, thdsPerBlk=32, Est.Blks/SM=54
Found new min: runtime=1.47456ms, workPerThd=5, thdsPerBlk=64, Est.Blks/SM=27
Found new min: runtime=1.14608ms, workPerThd=5, thdsPerBlk=96, Est.Blks/SM=18
Found new min: runtime=0.954976ms, workPerThd=6, thdsPerBlk=96, Est.Blks/SM=15
Found new min: runtime=0.9264ms, workPerThd=12, thdsPerBlk=96, Est.Blks/SM=7.714286
Auto-tuning complete
Tuning Statistics: nRuns=736 ave=2.157277 min=0.9264 max=4.313568 stdDev=0.7155555

DESCRIPTION:
Simple Black-Scholes path dynamics with deterministic term structures of interest and volatility.
Prices multiple European Call options with different parameters in parallel.
Uses NAG GPU quasi-random (Sobol) generator, and constructs sample paths using a Brownian bridge.

ALGTHM: Milstein (with caching), SINGLE precision

OPTION:
OPTION 1  TYPE=CALL  S0=100  K=100  T=1.7
OPTION 2  TYPE=CALL  S0=100  K=100  T=1.7
OPTION 3  TYPE=CALL  S0=100  K=70   T=1.5
OPTION 4  TYPE=CALL  S0=100  K=100  T=1
OPTION 5  TYPE=CALL  S0=100  K=100  T=1.7
OPTION 6  TYPE=CALL  S0=100  K=70   T=1.2

SIMULATION PARAMS:
NumTrials = 20001
ThdsPerBlk = 96
WorkPerThd = 12
Est.Blks/SM = 7.714286

RESULTS:
           OPTION 1     OPTION 2     OPTION 3     OPTION 4     OPTION 5     OPTION 6
CPUPrice:  19.30513388  26.84605282  37.66472196  10.08480107  32.11863106  31.67306764
GPUPrice:  19.30513331  26.84605114  37.66471917  10.08480103  32.11862916  31.67368598

CPU runtime = 252.1610107ms - NOTE: this is single thread, unoptimised
GPU runtime = 1.910976052ms
Speedup = 131.9540405x
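
The Est.Blks/SM and Speedup figures in this log can be cross-checked with simple arithmetic. The Python sketch below is an inference from the logged numbers, not a formula documented by the demos: it assumes the block count is ceil(NumTrials / (WorkPerThd * ThdsPerBlk)) times a per-demo multiplier, divided by the 14 multiprocessors reported under GPU PROPERTIES. A multiplier of 1 reproduces the BSCallDemo figures and 6 (the number of options) reproduces MultiOptionSobol. A multiplier of 32 reproduces SobolBSCallDemo, although the log does not say what that factor represents. The function name and the multiplier parameter are hypothetical.

```python
import math

NUM_SMS = 14  # "Num. Processors" in the GPU PROPERTIES block


def est_blocks_per_sm(n_trials, work_per_thread, threads_per_block, multiplier=1):
    """Estimated thread blocks per multiprocessor (inferred formula, see lead-in)."""
    trials_per_block = work_per_thread * threads_per_block
    blocks = -(-n_trials // trials_per_block) * multiplier  # integer ceiling
    return blocks / NUM_SMS


# (demo, NumTrials, WorkPerThd, ThdsPerBlk, assumed multiplier, logged Est.Blks/SM)
tuned = [
    ("BSCallDemo", 300000, 34, 128, 1, 4.928571),
    ("SobolBSCallDemo", 10001, 17, 160, 32, 9.142858),
    ("MultiOptionSobol", 20001, 12, 96, 6, 7.714286),
]
for name, n_trials, work, threads, mult, logged in tuned:
    est = est_blocks_per_sm(n_trials, work, threads, mult)
    print(f"{name}: computed {est:.6f}, logged {logged}")

# (demo, CPU runtime in ms, GPU runtime in ms, logged speedup)
runtimes = [
    ("BSCallDemo", 1930.135986, 7.535711765, 256.1318665),
    ("SobolBSCallDemo", 2606.839844, 58.32470703, 44.69529343),
    ("MultiOptionSobol", 252.1610107, 1.910976052, 131.9540405),
]
for name, cpu_ms, gpu_ms, logged in runtimes:
    print(f"{name}: CPU/GPU = {cpu_ms / gpu_ms:.4f}, logged {logged}")
```

In every case the logged speedup is the CPU runtime divided by the GPU runtime, so the comparison is against a CPU run that the log itself describes as single thread and unoptimised.
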