The most advanced AD Software in the world

Algorithmic Differentiation (AD) and Adjoint AD (AAD) are extremely powerful technologies. Applying them by hand to production-sized codes is a serious, lengthy undertaking that requires a team of specialists, and code maintenance and updating become more expensive and complex. For this reason, most people turn to AD tools to get sensitivities of their simulation codes.

NAG AD Software: dco/c++, dco/map & NAG AD Library

NAG's AD Software is based on over 15 person-years of research. Organizations using the tools are reaping the extensive benefits of implementing AD methods in their computations. NAG AD Software has been battle-proven, at scale, in business-critical applications.

Get fast, accurate sensitivities of any order
dco/c++
dco/map
NAG AD Library
(A)AD software tool for computing sensitivities of C++ codes
  • It embodies over 15 person-years of R&D, much of which has required original research
  • It's an operator-overloading tool with a slick API: the tool is easy to learn, easy to use, can be applied quickly to a code base and integrates easily with build and testing frameworks
  • Arbitrary order derivatives of any code can be computed, accurate to machine precision: the tool can answer all your sensitivity-related questions
  • Tier 1 and 2 banks have applied it to their core pricing and risk libraries and use it in production: the tool has been battle proven, at scale, in business critical applications
  • Both customer and in-house testing show that dco/c++ offers best-in-class performance, thanks to an advanced template engine and highly optimized internal data structures
  • Baseline memory use is low and the intuitive checkpointing interface allows memory use to be further controlled and constrained almost arbitrarily: the success of adjoint AD lies in balancing memory use and computation, and dco/c++ gives the user full control in a very natural way
  • The checkpointing interface allows handwritten adjoints to be specified for any part of the code, allows interfacing with GPUs, and much more: as users learn more about AD, dco/c++ allows them to implement all the tricks that people have developed to handle particular code patterns
  • It supports parallel adjoints: on modern architectures exploiting parallelism effectively is crucial, and dco/c++ allows the parallelism to be carried over into the adjoint code as well

dco/c++ Key Features

  • Slick, productivity-orientated interface
  • Very fast computation with expression templates and highly optimized tape
  • Full control over memory use
  • Supports parallelism and GPUs (in combination with dco/map)
  • Vector tangent and adjoint modes
  • Activity analysis
  • Sparsity pattern detection
  • Tape compression
  • Direct tape manipulation
  • Adjoint MPI support  

Together these features allow advanced manipulation of the computational graph (DAG) and enable users to create highly efficient adjoint implementations. dco/c++ also provides combined tangent-adjoint debugging types, supporting detailed exploration and tuning of the adjoint computation for the fastest, most efficient implementation.

 

A C++11, tape-free, operator-overloading AD tool designed to handle accelerators (GPUs, etc.)
  • dco/map combines the benefits of operator overloading (easy application, one single source code) with the benefits of handwritten adjoints (very fast and minimal memory use)
  • dco/map can handle the race conditions inherent in parallel adjoint codes easily and efficiently
  • First and second order tangent and adjoint
  • Produces single unified code for primal, tangent and adjoint
  • Thread safe by design: high performance array and scalar types for shared input data
  • Primal code runs as fast as the equivalent non-dco/map primal
  • Specialised high-performance array types to handle race conditions inherent in parallel adjoints
  • Supports the whole of C++11; cross-platform
  • API for storing things you don't want to recompute
  • Easy integration with NAG’s dco/c++ via external adjoint interface
World-class adjoint numerical and statistical solvers
  • The NAG AD Library integrates seamlessly with NAG's AD tool, dco/c++, and can be used with any other AD solution
  • Delivers exact derivatives where finite differences would give only approximations
  • NAG Library users who apply AD can now use high-quality adjoint routines from NAG:
    • No need to write adjoint versions of these routines or resort to inferior replacements
  • No specific product dependency: the NAG AD Library can be used with any AD tool
  • Easy switch between symbolic and algorithmic adjoints
    • Same interface for symbolic and algorithmic adjoints making use quick and easy
  • NAG provides a single AD solution when utilizing both dco/c++ and the NAG AD Library
  • Use of internal representation (IR) to reverse the routine call tree
  • Differentiate output variables with respect to a specified set of input variables only
  • Adjoints need only be interpreted once per output variable, which reduces computation time when, as is usually the case, the number of outputs is small relative to the number of inputs
  • Adjoint routines can be used as intrinsics in dco/c++ (just like using sin(x) in C++)
  • Algorithmic adjoints of routines with user-supplied functions can be used without additional development time
  • Smooth transition from a non-dco/c++ solution to a solution with dco/c++
  • No need to copy variables when used with dco/c++ (binary compatible data types)
  • The NAG AD Library is fully documented, maintained and supported by computational experts: receive first-line technical support as and when needed
AD Benefits

Easy to learn and easy to use

  • dco/c++ is an operator-overloading tool with a slick API: the tool is easy to learn, easy to use, can be applied quickly to a code base and integrates easily with build and testing frameworks

Battle proven, at scale, in business critical applications

  • Tier 1 and 2 banks have applied it to their core pricing and risk libraries and use it in production: the tool has been battle proven, at scale, in business critical applications

Answer all your sensitivity-related questions

  • Arbitrary order derivatives of any code can be computed, accurate to machine precision: the tool can answer all your sensitivity-related questions

Allows interfacing with GPUs, and much more

  • The checkpointing interface allows handwritten adjoints to be specified for any part of the code, allows interfacing with GPUs, and much more: as users learn more about AD, dco/c++ allows them to implement all the tricks that people have developed to handle particular code patterns.
Software Details
dco/c++
dco/map
NAG AD Library
NAG Fortran Compiler for AD
Custom AD Software Solutions
What's new in dco/c++ v3.4

Vector Type (support for vectorization):

The dco/c++ data type gv<DCO_BASE_TYPE, VECTOR_SIZE>::type implements a vector data type primarily useful for SSE/AVX vectorization. This type can then be used as the base type for primal, tangent and adjoint dco/c++ types.

Binary Compatible Passive Type:

This type can be used to turn parts of an adjoint computation passive, i.e. no tape activity is performed for objects of this type. Declaring a gbcp type of an adjoint type results in equal type sizes (=binary compatibility), while the gbcp type only provides access to the value object of the active type. A gbcp type can safely be cast to its value type when performing passive computations. Chaining gbcp types can be used to access any lower order of a higher order active type.

Thread-local (global tape):

dco::ga1s<T>, dco::ga1v<T>, etc. are thread safe by default thanks to the use of a thread-local global tape

Complex data type:

dco::complex_t is used as a specialization of std::complex; this is required on Windows and with older gcc versions

Faster compilation:

Inlining is important to achieve best run time performance but it can increase compilation time. The user can now switch off the aggressive inlining in dco/c++ for faster compilation.

Learn more about the new and improved features of dco/c++ v3.4 here

New in dco/c++ v3.3.0

Modulo adjoint propagation (less memory use)

The vector of adjoints is compressed by analysing the maximum number of distinct adjoint memory locations required. During interpretation, adjoint memory that is no longer required is overwritten, and thus reused, by indexing through modulo operations. This feature is especially useful for iterative algorithms (e.g. time iterations): the memory required for the vector of adjoints usually stays constant, independent of the number of iterations. Combined with use of a disk tape, almost arbitrarily sized tapes can be generated, which may be of particular interest for prototyping or validation purposes.

Sparse tape interpretation (debug capability to avoid NaNs)

The adjoint interpretation of the tape can omit propagation along edges when the corresponding adjoint to be propagated is zero. This is useful when NaNs or Infs occur as local partial derivatives (e.g. when computing the square root of zero) but the local result feeds only subsequent computations that are not relevant to the overall output. The feature may have a performance impact on tape interpretation and should therefore be reserved for debug configurations.

New in dco/c++ v3.2

  • Re-engineered internals mean dco/c++ is now roughly 30% faster and uses roughly 30% less memory (based on internal testing)
  • Vector reverse mode: for simulations with more than one output, several columns of the Jacobian or Hessian can now be computed at once using vector data types
  • Parallel reverse mode: for simulations with more than one output, the columns of the Jacobian or Hessian can now easily be computed in parallel.  This can be combined with vector reverse mode.
  • Jacobian pre-accumulation: sections of the computation can be collapsed into a pre-computed Jacobian, further reducing memory use
  • Disk tape: allows the tape to be recorded straight to disk.  Although slower, this allows very large computations to complete without having to use checkpointing to reduce memory use
  • Tape activity logging and improved error handling
Overview

dco/map is used to create adjoints of performance-critical sections of code, be they C++/OpenMP or CUDA. It has found application notably in accelerated XVA platforms where it helps deliver first and second order sensitivities. An overview is available here: High Performance Tape-Free Adjoint AD for C++11

  • dco/map combines the benefits of operator overloading (easy application, one single source code) with the benefits of handwritten adjoints (very fast and minimal memory use)
  • dco/map can handle the race conditions inherent in parallel adjoint codes easily and efficiently

What's new in v1.6 dco/map

  • New bitwise-copyable reduction push array.  For many workloads, reduction push is still the fastest array type, and the new class makes it easy to access this performance in existing C++ codes
  • Array management functions now allocate bitwise-copyable adjoint arrays directly, making it significantly easier to integrate the array classes into existing codes and class hierarchies.  It is now easier to apply dco/map to an existing C++ code base
  • Performance improvements to atomic push arrays – these are now approximately 50% faster in double precision
  • Improved dco/map external adjoint object for easier interoperability with dco/c++
  • Enhanced MAP_PRINT functionality to give more information to the user and make it easier to process the data
  • Overhauled the training material: users should now find it easier to get up to speed with dco/map
Overview

The NAG AD Library provides expertly developed, tested, documented, and supported numerical and statistical routines that make the Algorithmic Differentiation process quicker, more efficient and more productive, and eliminate the need to write your own code or rely on unsupported code. The Library has been designed so that it can be used with or without any other Algorithmic Differentiation (AD) tool; however, the conversion from code containing calls to primal routines to an adjoint version becomes most seamless when combined with NAG's AD tool, dco/c++.

Using the NAG AD Library

The numerical and statistical routines in the NAG AD Library can be called from C, C++ and Fortran. Example programs are available for each adjoint routine in both C++ and Fortran. The Library can be used with or without any other AD tool, though conversion from primal calls to an adjoint version is most seamless with dco/c++.

NAG AD Library Interfaces

The interface of a NAG AD Library routine follows closely the interface of the primal NAG Library routine on which it is based. The main differences are:

  • real-valued variables change type to a special defined data type;
  • an extra C pointer argument is added as the first argument;
  • functions (with a non-void return type) are replaced by a void function / subroutine with an extra argument to provide the return value.

The same changes apply to function / subroutine arguments, and user workspace arguments are always provided for those that perform computation. Adjoints with respect to active parameters provided in user workspace can also be computed.

Technical Report: Why do we need Adjoint routines?

Documentation

The NAG Library Manual, Mark 27 is available in the following formats:

  • HTML: the full, online manual;
  • ZIP file: the archive as a ZIP file.

The Library consists of a number of generic interfaces:

  • the FL interface, a standard set of interfaces that utilise only simple types which makes them suitable for calling from a wide range of languages, including Fortran (NAG's traditional Fortran Library interfaces), C, C++, VBA and others;
  • the CL Interface, NAG's traditional set of C Library interfaces;
  • and the NAG AD Library interfaces to support Algorithmic Differentiation.

In addition to the generic interfaces described in this manual, NAG supports interfaces tailored to specific environments and programming languages, including Python, Java, .NET and MATLAB®.

The Library is organized into Chapters – each being documented with its own Introduction and Contents list followed by a comprehensive document for each function detailing its purpose, description, list of parameters and possible error exits. Example programs and results are also supplied. All examples are available online to facilitate their use as templates for the users' calling programs.

The NAG Library Manual - prior releases

Previous releases of the NAG Library Manual are available from here

Installer's Notes and Users' Notes

Support documentation for the installation and use of each implementation of the NAG Library is available.

Overview

For simulation programs written in Fortran, a version of the NAG Fortran Compiler has been extended to serve as a pre-processor to dco. This facilitates the seamless integration of AD into complex build systems, so the modifications the user must make to the original source code can be minimized or even eliminated entirely.

Overview

dco/c++ is a very efficient, high-productivity AD tool. However, it is sometimes desirable to produce the AD implementation in a different way, often to handle computationally expensive sections of code. Typically this means hand-writing an adjoint implementation, using a high-performance tool such as dco/map, or even porting the AD code to a GPU.

NAG AD Solution Services can assist in all these cases, whether it be writing adjoints for particular pieces of code, or porting to GPU and making GPU adjoint implementations.