CVA in the Cloud

Solve Large Scale CVA Computations


  • Solve large scale CVA computations with NAG software expertise

  • Computing CVA and sensitivities using Origami and AD offers significant performance benefits over finite differences on legacy grid execution software, which can take many hours.

Summary

NAG has developed, in collaboration with Xi-FINTIQ, a CVA demonstration code to show how the NAG Library and NAG Algorithmic Differentiation (AD) tool dco/c++ combined with Origami – a Grid/Cloud Task Execution Framework available through NAG – can work together to solve large scale CVA computations.

CVA Demonstrator

Origami is a lightweight task execution framework. Users combine tasks into a task graph that Origami can execute on an ad-hoc cluster of workstations, on a dedicated in-house grid, in a production cloud, or on a hybrid of all of these. Origami handles all data transfers.
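The task-graph idea can be illustrated with a minimal sketch. Origami's actual API is not shown in this document, so the executor below is a hypothetical stand-in: tasks are run in dependency order and upstream results are handed to downstream tasks automatically.

```python
from graphlib import TopologicalSorter

# Illustrative sketch only -- not Origami's API. A task graph maps each
# task to the set of tasks it depends on; each task receives the results
# of its upstream tasks.

def run_graph(tasks, deps):
    """tasks: name -> callable(upstream results); deps: name -> set of upstream names."""
    results = {}
    for name in TopologicalSorter(deps).static_order():
        upstream = {d: results[d] for d in deps.get(name, ())}
        results[name] = tasks[name](upstream)
    return results

# Toy example: two valuation batches feeding one aggregation task.
tasks = {
    "batch_a": lambda up: 10.0,
    "batch_b": lambda up: 32.0,
    "cva":     lambda up: up["batch_a"] + up["batch_b"],
}
deps = {"batch_a": set(), "batch_b": set(), "cva": {"batch_a", "batch_b"}}
print(run_graph(tasks, deps)["cva"])  # 42.0
```

In a real deployment the tasks would run on different machines, with the framework (rather than a local dict) moving results between them.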

In our CVA demonstrator the trades in each netting set are valued in batches. CVA is calculated per netting set by running the code forward as normal. The graph is then reversed and the dco/c++ adjoint version of each task is run to calculate sensitivities with respect to market instruments. The resulting graph has a large number of tasks with non-trivial dependencies, which Origami automatically schedules and executes.
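Reversing the graph for the adjoint pass amounts to flipping every dependency edge: the adjoint of a task can only run once the adjoints of all its consumers have run. A minimal sketch (hypothetical helper, not part of Origami):

```python
def reverse_graph(deps):
    """Flip dependency edges: the adjoint of task t depends on the
    adjoints of every task that consumed t's output in the forward pass."""
    rev = {name: set() for name in deps}
    for name, upstream in deps.items():
        for d in upstream:
            rev[d].add(name)
    return rev

# Forward graph: an aggregation task "cva" depends on two valuation batches.
deps = {"batch_a": set(), "batch_b": set(), "cva": {"batch_a", "batch_b"}}
rev = reverse_graph(deps)
print(rev)  # {'batch_a': {'cva'}, 'batch_b': {'cva'}, 'cva': set()}
```

In the reversed graph the adjoint of the aggregation task runs first and the adjoints of the valuation batches run last, mirroring the forward execution order.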

Optimizations

Before running in the cloud we profiled our demonstrator and identified two main opportunities for performance improvements:

  • Reducing the amount of I/O, since disk I/O can perform poorly in cloud environments;
  • Optimizing the adjoint calculations, which consumed a significant proportion of the execution time.

Reducing I/O

Earlier versions of our CVA demonstrator relied on writing intermediate results (e.g. the outputs from the intermediate tasks in the graph) out to disk. Because this can introduce a large performance penalty, we modified the demonstrator to maintain these results in memory instead.

Writing a Symbolic Adjoint

The original version of the CVA demonstrator spent approximately 25% of its runtime calculating the adjoint of the NAG Library linear regression routine (g02daf). We therefore wrote a symbolic adjoint for this routine to improve the performance of the AD tasks.
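The idea behind a symbolic adjoint is to replace the tape-based adjoint of a routine with a closed-form expression for its derivative. For ordinary least squares, the core of a linear regression, the solution is beta = (X^T X)^{-1} X^T y, and its adjoint can be written down directly. The sketch below is our own illustrative derivation, not NAG's implementation of the g02daf adjoint; it verifies the closed form against central finite differences.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients beta = (X^T X)^{-1} X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ols_adjoint(X, y, beta, beta_bar):
    """Symbolic adjoint: given the output adjoint beta_bar, return the
    input adjoints (X_bar, y_bar). With w = (X^T X)^{-1} beta_bar and
    residual r = y - X beta:
        y_bar = X w,   X_bar = r w^T - (X w) beta^T."""
    w = np.linalg.solve(X.T @ X, beta_bar)
    r = y - X @ beta
    y_bar = X @ w
    X_bar = np.outer(r, w) - np.outer(X @ w, beta)
    return X_bar, y_bar

rng = np.random.default_rng(0)
X, y = rng.standard_normal((20, 3)), rng.standard_normal(20)
beta = ols(X, y)
beta_bar = rng.standard_normal(3)
X_bar, y_bar = ols_adjoint(X, y, beta, beta_bar)

# Verify y_bar against central finite differences on f(y) = beta_bar . beta(y).
h = 1e-6
fd = np.array([(beta_bar @ (ols(X, y + h * e) - ols(X, y - h * e))) / (2 * h)
               for e in np.eye(20)])
assert np.allclose(fd, y_bar, atol=1e-5)
```

Because the adjoint is evaluated from a closed form, no tape of the regression's internal operations needs to be recorded or interpreted, which is where the runtime saving comes from.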

Scaling

We ran the code on Microsoft’s Azure cloud computing service. We used D4s_v3 virtual machine (VM) images with 4 virtual CPUs and 16 GB RAM running Ubuntu Server 18.04 LTS. The input data set contained 8 netting sets comprising a total of 28 843 swaps and 23 875 Bermudan swaptions. The Monte Carlo simulation used 2 000 paths. The code scales well as the number of VMs is increased.

Adjoint Efficiency

We measured the efficiency of our AD scheme by the adjoint ratio: the runtime of the AD computation divided by the runtime of the forward computation (lower is therefore better). We observe that the symbolic adjoint reduces the adjoint ratio.
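As a worked illustration of the metric (the timings below are hypothetical, not measured values from this work):

```python
def adjoint_ratio(t_adjoint, t_forward):
    """Runtime of the AD (adjoint) computation over the forward computation."""
    return t_adjoint / t_forward

# Hypothetical timings in seconds, for illustration only:
print(adjoint_ratio(600.0, 100.0))  # 6.0 -- fully tape-based adjoint
print(adjoint_ratio(350.0, 100.0))  # 3.5 -- with a symbolic adjoint for one hot spot
```

A lower ratio means the sensitivities cost fewer multiples of a single forward valuation.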

Performance

Computing CVA and sensitivities using Origami and AD offers significant performance benefits compared with using finite differences and legacy grid execution software, which might take many hours.

  Configuration                          Elapsed Time
  Origami (4 VMs x 4 cores) and AD       49m 8s
  Origami and AD, symbolic adjoint       36m 23s
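From the elapsed times in the table, the saving from the symbolic adjoint can be quantified directly:

```python
# Elapsed times from the table above, converted to seconds.
t_tape = 49 * 60 + 8    # Origami and AD:                   2948 s
t_sym  = 36 * 60 + 23   # Origami and AD, symbolic adjoint: 2183 s

speedup = t_tape / t_sym
reduction = 1 - t_sym / t_tape
print(f"speedup {speedup:.2f}x, runtime reduction {reduction:.0%}")
# speedup 1.35x, runtime reduction 26%
```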

Possible Future Work

  • Investigate different checkpointing strategies with the aim of reducing the code’s memory consumption;
  • Port the code to other operating architectures, e.g. GPUs.

Availability

To find out more about this work and related NAG products, please contact support@nag.com and see the NAG Algorithmic Differentiation Solutions area.

Acknowledgements

We gratefully acknowledge the support of the POP CoE which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreements No. 676553 and 824080, and has partially contributed to this work.

We would like to thank Microsoft for providing the Azure Sponsorship to produce the results presented here.

Jacques du Toit, Nick Dingle, Ian Hotchkiss, Viktor Mosenkis, Justin Ware