NAG Algorithmic Differentiation Solutions
Algorithmic Differentiation (AD) is a mathematical and computer science technique for computing accurate sensitivities quickly. For many models, Adjoint AD (AAD) can compute sensitivities tens, hundreds or even thousands of times faster than finite differences. NAG are pioneers in providing AD technologies. We help organisations apply AD to their computations: blue chip clients in finance are reaping the benefits of NAG’s expertise in this field, and other industries could benefit extensively from implementing NAG AD Solutions.
AD and AAD are extremely powerful technologies, but unfortunately they are demanding to use. Applying them by hand to production-sized codes is a serious, lengthy undertaking that requires a team of specialists. Code maintenance and updating also become more expensive and complex. For this reason, most people have turned to AD tools to get sensitivities of their simulation codes.
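The difference in cost between adjoint AD and finite differences can be seen in a toy setting. The sketch below is a minimal operator-overloading tape written for illustration only: it is not dco/c++ and does not use any NAG API, and the names `Var`, `TAPE`, `model` and `Counted` are hypothetical. It assumes a scalar-valued model with n inputs. Central finite differences need 2n model evaluations, while the adjoint needs one recorded evaluation followed by one reverse sweep over the tape.

```python
import math

TAPE = []  # every Var is appended in creation order, which is a valid topological order


class Var:
    """A recorded value; parents holds (parent Var, local partial derivative) pairs."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.parents = parents
        self.adjoint = 0.0
        TAPE.append(self)

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__


def sin(x):
    """sin that works on plain floats and on recorded Vars."""
    if isinstance(x, Var):
        return Var(math.sin(x.value), ((x, math.cos(x.value)),))
    return math.sin(x)


def model(x):
    """Hypothetical stand-in for a simulation: n inputs, one output."""
    total = 0.0
    for i in range(len(x)):
        total = total + sin(x[i]) * x[(i + 1) % len(x)]
    return total


class Counted:
    """Wraps a function and counts how often it is evaluated."""

    def __init__(self, f):
        self.f, self.calls = f, 0

    def __call__(self, x):
        self.calls += 1
        return self.f(x)


def adjoint_gradient(f, x0):
    """One recorded evaluation, then one reverse sweep over the tape."""
    TAPE.clear()
    xs = [Var(v) for v in x0]
    y = f(xs)
    y.adjoint = 1.0
    for node in reversed(TAPE):
        for parent, partial in node.parents:
            parent.adjoint += partial * node.adjoint
    return [x.adjoint for x in xs]


def central_fd_gradient(f, x0, h=1e-6):
    """Two evaluations per input: 2n evaluations in total."""
    grad = []
    for i in range(len(x0)):
        up, down = list(x0), list(x0)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


x0 = [0.5 + 0.1 * i for i in range(5)]
f_ad, f_fd = Counted(model), Counted(model)
g_ad = adjoint_gradient(f_ad, x0)
g_fd = central_fd_gradient(f_fd, x0)
print("model evaluations: adjoint =", f_ad.calls, " central FD =", f_fd.calls)
```

The recorded evaluation and reverse sweep cost more than a plain evaluation, but by a factor that does not grow with the number of inputs, which is where the large speedups for models with many inputs come from. The tape also shows why memory is the main practical concern for adjoint codes: every intermediate operation is stored until the reverse sweep has used it.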
Our world is increasingly driven by simulation. Computer programs simulate the behaviour of systems, and the results are used to make business decisions, control machines, or are fed into further simulations. However, in many industries it has become important not only to compute the simulation value, but also its sensitivity (mathematical derivative) to the model inputs.
- In machine learning, the sensitivity of the model error is used to fit the network parameters
- In seismic applications, the derivative of the demigration operator is needed for Least Squares Reverse Time Migration (LS-RTM)
- In disaster modelling (flood, tsunamis, fire) the sensitivities of the simulation with respect to the geometry of the city or countryside can help in designing defences
- In finance, sensitivities are used to hedge risk and explain exposures
- In portfolio management, sensitivities show how robust a portfolio is to changes in market conditions, and how quickly it could lose value in a downturn
- In aerodynamics (Formula 1, consumer automotive, aerospace, marine, defence) sensitivities of drag, lift or ballistic profile with respect to geometry are used to optimise vehicle shape
- In flood defence and river management, sensitivities with respect to pebble size show how river paths can change over time, and how quickly
- In climate modelling, model sensitivities give new insights into weather patterns and show how quickly and severely local conditions can change
The power of AD comes at a price: adopting it can be costly for an organisation. A hand-written ‘in-house’ approach to AD means writing (adjoint) AD versions of all your simulation codes. This is expensive and time-consuming, and requires skilled developers who understand AD very well. In addition, whenever the simulation codes change, the AD codes must be updated to match or they will give incorrect results.
Adopting AD is a serious undertaking with far-reaching implications for the business, its software assets, and the productivity of its staff. NAG has been helping customers implement AD for over 10 years: we have a wealth of experience which we offer through our AD Solutions. Blue chip clients are reaping the benefits of NAG’s expertise in this field, and others could benefit extensively from implementing NAG AD Solutions.
NAG AD Solutions provide:
- AD Proof of Concept (PoC) Studies
- Software Tools (dco/c++ and dco/map)
- ‘In-house’ Training
- Ongoing Technical Support
Most businesses start exploring AD with a Proof of Concept (PoC) assigned to one, or sometimes two, ‘in-house’ developers. Very often, however, these PoCs don't fare very well because:
- the developer is new to (A)AD and has to learn a lot quickly, and usually on his/her own
- the developer must learn how to control memory use efficiently for production-size codes (our tools make this easy, but doing it effectively requires a good understanding of AD concepts and of the underlying code)
- typically, the PoCs are very time-limited: the developer must report back to the business by a given date, schedules are tight, and any hiccups or problems mean the schedules quickly slip
- the developer needs to learn how to use our AD tools: many people don't have the time to read through our documentation and example programs
- the developer often gets pulled onto other more urgent business matters mid-way through the PoC, then has to come back later on and try to remember where he/she was
- once the AD code is working, the developer then needs to explain results: e.g., why are there zero derivatives with respect to some parameters when finite difference estimates are non-zero? Typically this is because the underlying code is not differentiable: ideally this should be spotted right at the outset and addressed before AD is applied
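The last point in the list above, zero AD derivatives where finite differences are non-zero, can be reproduced with a few lines of code. The sketch below is an illustration only: `Dual` is a minimal forward-mode AD number written for this example (not a NAG tool), and `digital_payoff` is a hypothetical step-function payoff. AD differentiates the branch that was actually taken, and that branch returns a constant, so the derivative is exactly zero. A finite difference whose bump straddles the jump instead reports the jump height divided by the bump width.

```python
class Dual:
    """Forward-mode AD number: a value plus its derivative (tangent)."""

    def __init__(self, value, tangent=0.0):
        self.value = value
        self.tangent = tangent

    def __gt__(self, other):
        # A comparison only inspects values; no derivative flows through it.
        other_value = other.value if isinstance(other, Dual) else other
        return self.value > other_value


def digital_payoff(spot, strike=100.0):
    """Hypothetical payoff: 1 if spot is above strike, otherwise 0."""
    return 1.0 if spot > strike else 0.0


def ad_derivative(f, x):
    """Seed the input tangent with 1 and read the output tangent."""
    out = f(Dual(x, 1.0))
    # A plain float result means the output never depended on x
    # through a differentiable operation, so its derivative is zero.
    return out.tangent if isinstance(out, Dual) else 0.0


def central_fd(f, x, h):
    return (f(x + h) - f(x - h)) / (2.0 * h)


spot = 99.8  # just below the strike, so a bump of 0.5 crosses the jump
ad = ad_derivative(digital_payoff, spot)
fd = central_fd(digital_payoff, spot, h=0.5)
print("AD:", ad, " central FD:", fd)
```

Neither number is a meaningful sensitivity here: the AD value ignores the jump entirely, and the finite difference value depends on the bump size chosen. This is why it helps to inspect the code for such constructs before applying AD.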
This is quite a daunting list. NAG's AD Proof of Concept Support service has grown through client demand: in a nutshell, NAG experts conduct the PoC with the client.
- The organisation takes one of our AD developers in-house, typically for a week.
- Our developer works with the organisation's developers to get the entire PoC up and running as quickly as possible
- It's a combination of coding help and on-the-job training: we answer questions, point out pitfalls, explain AD concepts, inspect the code for non-differentiabilities, point out where analytic adjoints are desirable, and generally offer advice on design, implementation and testing.
- The PoCs are completed much faster, developers learn faster because there's someone next to them to answer questions, and the resulting AD code is efficient and has predictable memory use.
NAG Proof of Concept AD Support Results
For an organisation new to AD, the NAG Proof of Concept AD Support service allows a rapid deep-dive into what AD means for your business, with the security of knowing there are experts at hand who make sure you get the answers you need, on time. By the end of the PoC the developers can not only report results back to the business with confidence, but they will be familiar with our tools and will have a non-trivial reference code from which to start for the next project.
Contact us for more information.
NAG provides best-in-class C++ operator-overloading AD tools called dco/c++ (derivative computation through overloading) and dco/map (dco meta adjoint programming).
- dco/c++ integrates easily with your code base, is extremely flexible, and has been applied to huge (millions of lines) production codes
- dco/c++ allows the memory use of adjoint codes to be constrained almost arbitrarily
- dco/map is a cutting-edge C++11 tape-free operator-overloading AD tool designed specifically to handle accelerators (GPUs, Xeon Phi, etc). An overview is available in the technical poster "High Performance Tape-Free Adjoint AD for C++11"
- dco/map combines the benefits of operator overloading (easy application, one single source code) with the benefits of handwritten adjoints (very fast and minimal memory use)
- dco/map can handle the race conditions inherent in parallel adjoint codes easily and efficiently
- dco/c++ and dco/map are easily coupled to create AAD implementations of heterogeneous codes
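Constraining the memory use of an adjoint code usually means trading storage for recomputation. One standard technique is checkpointing, sketched below for a simple time-stepping loop. This is an illustration under stated assumptions and not a description of how dco/c++ works internally or of its interface: the model `step`, its hand-written adjoint `step_adjoint`, and the parameter `every` are all hypothetical. Instead of storing every intermediate state for the reverse sweep, the code stores one state every `every` steps and recomputes each segment when the reverse sweep reaches it.

```python
DT = 0.01  # time step of the hypothetical model


def step(x, theta):
    """One explicit Euler step of logistic growth with rate theta."""
    return x + DT * theta * x * (1.0 - x)


def step_adjoint(x, theta, x_next_bar):
    """Hand-written adjoint of step: output adjoint -> input adjoints."""
    x_bar = x_next_bar * (1.0 + DT * theta * (1.0 - 2.0 * x))
    theta_bar = x_next_bar * DT * x * (1.0 - x)
    return x_bar, theta_bar


def adjoint_store_all(x0, theta, n_steps):
    """Stores every state; memory grows linearly with n_steps."""
    states = [x0]
    for _ in range(n_steps):
        states.append(step(states[-1], theta))
    x_bar, theta_bar = 1.0, 0.0
    for k in reversed(range(n_steps)):
        x_bar, d_theta = step_adjoint(states[k], theta, x_bar)
        theta_bar += d_theta
    return states[-1], theta_bar, len(states)


def adjoint_checkpointed(x0, theta, n_steps, every):
    """Stores one state per segment and recomputes the rest on demand."""
    checkpoints = {}
    x = x0
    for k in range(n_steps):
        if k % every == 0:
            checkpoints[k] = x
        x = step(x, theta)
    x_final = x

    x_bar, theta_bar, peak = 1.0, 0.0, 0
    for start in sorted(checkpoints, reverse=True):
        end = min(start + every, n_steps)
        # Recompute the states at steps start .. end-1 from the checkpoint.
        segment = [checkpoints[start]]
        for _ in range(start, end - 1):
            segment.append(step(segment[-1], theta))
        # Rough count of states held at once (checkpoints plus one segment).
        peak = max(peak, len(checkpoints) + len(segment))
        for j in reversed(range(end - start)):
            x_bar, d_theta = step_adjoint(segment[j], theta, x_bar)
            theta_bar += d_theta
    return x_final, theta_bar, peak


xf_all, grad_all, stored_all = adjoint_store_all(0.1, 0.5, 1000)
xf_cp, grad_cp, stored_cp = adjoint_checkpointed(0.1, 0.5, 1000, every=25)
print("d x_final / d theta:", grad_all, grad_cp)
print("states held at once:", stored_all, "vs", stored_cp)
```

Both variants return the same derivative. The checkpointed version holds far fewer states at once, at the price of roughly one extra forward pass of recomputation. Changing `every` moves the balance between memory and runtime.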
Learn more about NAG AD Software Tools.
NAG AD Software Tools are very efficient. However, it is sometimes desirable to produce an AD implementation in a different way, often to handle computationally expensive sections of code. Typically, this is through hand-writing an adjoint implementation, or using a high-performance tool such as dco/map to make the AD implementation, or even porting the AD code to a GPU. NAG AD Solution services can assist in all these cases, whether it be writing adjoints for particular pieces of code, or porting to GPU and making GPU adjoint implementations. Contact us for more information.
AD Solutions Case Studies
Figure 2: MITgcm sensitivities of zonal ocean water flow through the Drake Passage to changes in bottom topography.
AD helps our understanding of climate change and improves weather predictions.
Figure 2 shows the sensitivity of the amount of water flowing through the Drake Passage to changes in the topography of the ocean floor. The simulation was performed with the AD-enabled MIT General Circulation Model (MITgcm) run on a supercomputer. The ocean was meshed with 64,800 grid points. (2)
Obtaining the gradient through finite differences took a month and a half. The adjoint AD code obtained the gradient in less than 10 minutes.
The gradient information can be used to further refine climate prediction models and our understanding of global weather. For example, the flow is highly sensitive to the topography over the South Pacific Ridge and the Indonesian Throughflow, even though these are far away from the Drake Passage.
Utke J, Naumann U, Wunsch C, Hill C, Heimbach P, Fagan M, Tallent N and Strout M. (2008). OpenAD/F: A modular, open-source tool for automatic differentiation of Fortran codes. ACM Trans. Math. Softw, 34(4) 18:1-18:36.
Figure 1: Sensitivity of drag coefficient to the surface geometry of a car at high speed (left) and low speed (right)
AD enables sensitivity analyses of huge simulations, making possible shape optimization, intelligent design and comprehensive risk studies.
Figure 1 shows sensitivities of the drag coefficient to each point on a car's surface when it moves at high speed (left) and low speed (right). The simulation was performed with AD-enabled OpenFOAM built on top of the AD Software Tool dco. The surface mesh had 5.5 million cells and the gradient vector was 18GB.
The normal simulation on a top-end desktop took 44s, while the AD-enabled simulation took 273s. To obtain the same gradient information, on the same machine, by finite differences would take roughly 5 years.
The gradient information can now be used to optimize the shape of the car so as to reduce the drag.
Towara M and Naumann U (2013). A discrete adjoint model for OpenFOAM. Procedia Comp. Sci. Volume 18.
AD and AAD are used in finance to get sensitivities of complex instruments quickly, enabling real-time risk management and hedging of quantities like xVA.
Here we show some results from a paper which studied two typical codes arising in finance: Monte Carlo and PDEs.
Table 1: Run times and memory requirements as a function of gradient size n for Monte Carlo and PDE applications.
Table 1 compares the runtime of a first-order adjoint code built with dco against central finite differences on a typical finance application (option pricing under a local volatility model, with 10K sample paths or spatial points and 360 time steps). The second column, f, is the normal runtime of the application; cfd is the runtime for central finite differences; and AD is the adjoint AD runtime, reported along with the additional memory required (tape size). Calculations were run on a laptop, so only the relative runtimes AD/f and cfd/AD are important; the latter shows the speedup of AD over finite differences.
In finance such derivative information is often used for hedging and risk calculations, so these gradients must be computed many times per day.
du Toit J and Naumann U (2014). Adjoint algorithmic differentiation tool support for typical numerical patterns in computational finance. NAG Technical Report TR3/14.