# NAG Library

## 1Introduction

The NAG Library Manual is the principal documentation for the NAG AD Library and full details of how to use the NAG Library and its documentation can be found in How to Use the NAG Library and its Documentation. Details specific to the NAG AD Library are given in the following sections and in the X10 Chapter Introduction.
The NAG AD Library is designed to be compatible with the AD tool dco. Using the NAG AD Library in conjunction with dco provides the best user experience. However, the use of dco is not essential; the NAG AD Library can be used on its own or in conjunction with any other AD solution. Other than in Section 5, the description assumes no dco licence.

## 2Motivating Example

Let us assume that you regularly run numerical simulations of a scalar objective $y$ (for example, the price of a financial product) depending on $n$ uncertain arguments ${x}_{i}$ (for example, market volatilities). Suppose that a single run of the given implementation of the (pricing) function $y=f\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)$ as a program takes one minute on the available computer. In addition to the value $y$, you might be interested in the sensitivity of the value $y$ to small changes in the input values; that is, the derivatives of $y$ with respect to all the ${x}_{i}$. The rudimentary approach to obtaining an approximation to one of these derivatives is to perturb one of the input values ${x}_{j}$, obtain a perturbed output value $\stackrel{^}{y}$ and calculate finite differences approximations based on the perturbed and original values. The NAG AD Library offers an alternative approach by computing, principally, the adjoint of $f$ for retrieving the derivatives. Finite difference approximation of the corresponding $n$ gradient entries requires $\mathcal{O}\left(n\right)$ evaluations of $f$. Their accuracy suffers from truncation and/or numerical effects due to cancellation and rounding in finite precision floating-point arithmetic. Moreover, if, for example, $n=\mathrm{1000}$, the calculation of the approximated gradient with finite differences will take at least $\mathrm{1001}$ times the time for the primal calculation (normal forward evaluations with no additional derivative evaluations, also referred to as passive calculation). In adjoint mode this computation can be expected to take less than $\mathrm{20}$ (often less than $\mathrm{10}$) times the time for the accumulation of the same gradient with machine accuracy.

## 3Brief Introduction to Adjoint Derivatives

To introduce adjoint derivatives and their advantages, we first briefly illustrate the tangent model and a commonly used implementation of this model which approximates by finite differences. For more detailed information on adjoints in general, see Dunford et al. (1971), and for algorithmic adjoints by algorithmic differentiation in particular, see Griewank and Walther (2008) and Naumann (2012).
For a continuously differentiable function (the primal)
 $y := fx with ​ f: ℝn → ℝm ,$
and assuming distinct inputs and outputs, the tangent model is defined as
 $y1 := f1 x,x1 = ∇x fx x1 ,$
with tangents ${x}^{\left(1\right)}\in {ℝ}^{n}$ and ${y}^{\left(1\right)}\in {ℝ}^{m}$, and the tangent function ${f}^{\left(1\right)}:{ℝ}^{n}×{ℝ}^{n}\to {ℝ}^{m}$. The symbol ${\nabla }_{x}$ stands for the full derivative tensor of $f$ with respect to $x$, i.e., the Jacobian. The tangent model calculates a weighted sum of the columns of the Jacobian matrix, which corresponds to a directional derivative of $f$ in direction ${x}^{\left(1\right)}$.
The tangent model can easily be approximated using finite differences by perturbing the input into a scaled direction ${x}^{\left(1\right)}$ with appropriate scaling factor $h$ and calculating the difference quotient
 $f x+h x1 - fx h ≈ ∇x fx x1 .$
Using the tangent model to compute the full Jacobian matrix, we need to calculate the directional derivatives in the direction of the $n$ Cartesian basis vectors in ${ℝ}^{n}$. This approach delivers the Jacobian column-by-column. Since it requires the evaluation of $f$ at the original point $x$ in addition to the $n$ perturbed points, it has a computational complexity of $\mathcal{O}\left(n\right)·\text{cost}\left(f\right)$, which corresponds to $n$ evaluations of the tangent model.
The adjoint model is defined as
 $x1 y1 := f1 x,x1,y1 = x1 + ∇x fx T y1 0 ,$
with adjoint variables ${x}_{\left(1\right)}\in {ℝ}^{n}$ and ${y}_{\left(1\right)}\in {ℝ}^{m}$ (corresponding to respective primal variables $x$ and $y$), and the adjoint function ${f}_{\left(1\right)}:{ℝ}^{n}×{ℝ}^{n}×{ℝ}^{m}\to {ℝ}^{n}×{ℝ}^{m}$ which calculates the product of the transposed Jacobian with ${y}_{\left(1\right)}$, and which sets ${y}_{\left(1\right)}$ to zero on output. The calculated weighted sum of rows of the Jacobian matrix corresponds to an adjoint directional derivative of $f$ in the adjoint direction ${y}_{\left(1\right)}$.
The computation of the full Jacobian matrix requires adjoint directional derivatives in the direction of the $m$ Cartesian basis vectors in ${ℝ}^{m}$. This approach delivers the Jacobian row-by-row and has a computational complexity of $\mathcal{O}\left(m\right)·\text{cost}\left(f\right)$. This is an enormous advantage over the tangent model whenever $R·m, where $R$ quantifies the implementation overhead of the adjoint. For example, in many optimization problems, a single scalar objective value ($m=1$) is computed from a large number of input model arguments ($n$ large).

## 4How to Use the NAG AD Library

As indicated in the previous section, each real-valued variable in the domain of the functional (primal variable), e.g., $x$, has a corresponding adjoint variable, e.g., ${x}_{\left(1\right)}$. The NAG AD Library interface uses special data types (called active data types) to reference the primal variable and the adjoint variable. The respective components are accessible via functions acting on these active data types.

### 4.1Naming Convention

See Section 3.3 in the X10 Chapter Introduction for details of how routines in the NAG AD Library are named; in particular what ‘_a1w_’ means in data types and routine names.

### 4.2How to Calculate an Adjoint

Calculating adjoints using the NAG AD Library requires the following basic steps. Users of dco will be familiar with this procedure. In the following, we assume pure inputs and outputs (no overwriting), i.e., a function $y:=f\left(x\right)\text{.}$ (Please see Section 4.4 for further details on how to handle variables that are both input and output.)
 $x1 := x1 + ∇x fx T y1 .$
The basic procedure around the calling of NAG AD Library routines is (note that IR = Internal Representation):
 1 allocate global memory for internal data / IR (API call) 2 copy routine input data into active data types (Note: this is just an assignment statement in C++ and Fortran) 3 register input variables to IR (API call) (Note: only register those inputs with regard to which adjoints are required) 4 call library routine(s) 5 increment adjoints of outputs ${y}_{\left(1\right)}$ (API call) 6 calculate adjoints of inputs by interpreting the IR (API call) 7 get adjoints of inputs ${x}_{\left(1\right)}$ (API call) 8 free internal memory (API call)
The steps in this procedure are described in more detail for different languages in the next sections. As a simple example of a NAG AD Library routine see s01ba_a1w_f.

#### 4.2.1Driver interface

This section describes the interfaces of routines provided to perform the above steps (except step 4.) The interfaces are very similar across the three programming languages considered: C, C++ and Fortran. Since the NAG AD Library is designed to be compatible with dco, there is a corresponding dco/c++ interface for C++ available; see Section 5. The dco/c++ interface can be used without having a dco/c++ licence by using dco/c++/light, which is part of the NAG AD Library. Using dco/c++/light, though not having the overloading functionality of full dco/c++, provides an easy transition to full dco/c++.
##### 4.2.1.1Modules and includes
C / C++
```#include <nag.h>
#include <nag_stdlib.h>```
Fortran
```Use iso_c_binding, Only: c_ptr
Use nag_library, Only: nag_wp```
##### 4.2.1.2Basic types
The following basic types represent those for real-valued variables and their first order adjoints in working precision.
Accessing primal values from active type:
 Variable C C++ Fortran x x.value nagad_a1w_get_value(x) x%value
##### 4.2.1.3Operating on the Internal Representation (IR)
Initialize IR
Destroy IR
Register active variable (with respect to which we want to differentiate )
##### 4.2.1.4Derivative evaluation
Increment the adjoint component of active type (transposed Jacobians are applied to the set of increments, which are zero by default)
 C / C++: void nagad_a1w_inc_derivative(const nagad_a1w_w_rtype* x, double inc); Fortran: Subroutine nagad_a1w_inc_derivative(x, ax) Type (nagad_a1w_w_rtype) :: x Real (Kind=nag_wp), Intent (In) :: ax
Extract the derivatives of active types with respect to a given registered variable
##### 4.2.1.5Utilities for configuration and callback data objects
To use the NAG AD Library, the X10 Chapter Introduction is essential reading. This chapter contains routines for handling data objects used in the NAG AD Library. In particular, it provides a mechanism for choosing whether to compute adjoints algorithmically or, symobolically (see Section 3.2.2 in the X10 Chapter Introduction), to compute adjoints symbolically.
By default differentiation is performed algorithmically; in this case derivative calculations are performed in the algorithmic computational mode. For those routines (as listed in Section 3.2.2 in the X10 Chapter Introduction) that have symbolic adjoints available, the symbolic computational mode can be set.
##### 4.2.1.6Examples of using NAG AD Library interfaces
A simple example program, in C, C++ and Fortran, of using the above interfaces is shown in Section 10 in x10aa_a1w_f. Additionally, each computational routine in the NAG AD Library contains an example program in C++ and Fortran; C examples also exist for some of these.

#### 4.2.2Routines without user-supplied subroutines or functions

For an example of the interface differences between a routine in the NAG Library and its corresponding routine in the NAG AD Library, compare f07caf (dgtsv) and f07ca_a1w_f. The example programs in Section 10 in f07ca_a1w_f show how the NAG AD Library routine is called in relation to the basic interface calls.

#### 4.2.3Routines with user-supplied subroutines

Many routines require user-supplied subroutines, e.g., the zero finder routine c05ay_a1w_f. Since the output value of the routine depends on the implementation of the user-supplied routine, the derivative (i.e., the adjoint) does so as well. For c05ay_a1w_f the solution and its derivative depend on the residual function. Note that for those primal routines with function arguments, those arguments have been transformed into subroutines for the NAG AD Library equivalent routine. The AD variant subroutine argument has an extra output argument (ret) to provide the return value. This extra return value argument is consistently placed immediately before any user workspace arrays. For example compare the primal function (with function argument) d01fbf and the NAG AD equivalent routine d01fb_a1w_f.
In the algorithmic computational mode, if the primal calculations of the supplied routine consist of simple arithmetic operations and intrinsic functions then the conversion from primal callback to adjoint callback is simply a matter (in C++ with dco/c++) of performing the same operations on the basic AD types, see Section 10 in c05ay_a1w_f. Otherwise (C or C++ without dco/c++) the procedure, enumerated below, must be implemented in the procedure argument. This involves writing an additional callback (for adjoints) which always has the same fixed interface and which is run only during the adjoint interpretation phase. This additional callback is referred to, within the NAG AD Library, as the companion callback to the supplied procedure argument.
In the symbolic computational mode, the computation needs to be broken down into four stages, where the stage to be computed is determined by a callback mode. This is described in more detail in Section 2.4 in the X10 Chapter Introduction and Section 3 in x10bd_a1w_f.
The following description assumes an original user-supplied function calculating $z:=g\left(p,x\right)\text{,}$ where p is assumed to be an argument passed via the ruser variable and x is the current state. The adjoint model then calculates
 $p1 := p1 + ∇p gp,x T z1$
and
 $x1 := x1 + ∇x gp,x T z1 .$
It is recommended that the example programs listed in Section 10 in c05ay_a1w_f and Section 10 in e04gb_a1w_f be examined to see how procedure arguments in symbolic mode are handled for a simple case; in particular it shows how to handle both algorithmic and symbolic computational modes.The basic procedure for the symbolic adjoint computation of a procedure argument using a companion adjoint callback is
 1 create callback data object (API call in C++/Fortran) 2 write input arguments to callback data object e.g., $p$ and $x$ (API call in C++/Fortran) 3 calculate primal values $z$ 4 register output variables (API call) 5 write registered variables to callback data object (API call in C++/Fortran) 6 insert companion callback in IR (API call).
The basic procedure of the companion adjoint callback is
 1 read data from callback data object (API call in C++/Fortran) 2 get adjoints of outputs ${z}_{\left(1\right)}$ (API call) 3 calculate adjoint increments of inputs ${p}_{\left(1\right)}^{+}={\left[{\nabla }_{p}g\left(p,x\right)\right]}^{\mathrm{T}}{z}_{\left(1\right)}$ and ${x}_{\left(1\right)}^{+}={\left[{\nabla }_{x}g\left(p,x\right)\right]}^{\mathrm{T}}{z}_{\left(1\right)}$ 4 increment adjoints of inputs ${p}_{\left(1\right)}+={p}_{\left(1\right)}^{+}$ and ${x}_{\left(1\right)}+={x}_{\left(1\right)}^{+}$ (API call).
As a particular example of calculating adjoint increments, if
 $z := gp,x = sinpx2$
then
 $∇x gp,x = 2coszxp$
and
 $∇p gp,x = coszx2 .$

Since we use AD (algorithmic differentiation) software internally to automatically generate the adjoint code, in standard configuration all adjoint routines compute algorithmic (or discrete) adjoints. Symbolic adjoints on the other hand cannot be generated automatically and thus have to be written by hand. The benefit of symbolic adjoints is that it may exploit mathematical properties of a NAG library routine yielding a more efficient adjoint calculation. The possible efficiency gains through symbolic adjoints depend on the specific routine. Examples for huge benefits are the linear and the nonlinear solvers. See Giles (2008) for more on the symbolic treatment of a linear solver and Naumann et al. (2015) for more on the symbolic treatment of a nonlinear solver. Section 3.1 in f07ca_a1w_f provides a description of how the adjoint is computed symbolically for that routine. Similar descriptions are provided in each routine document for which a symbolic adjoint is available. See Section 3.2.2 in the X10 Chapter Introduction for a list of routines for which this mode is available.

### 4.4How to Handle Overwriting of Variables

In the previous sections, we assumed pure inputs and outputs, i.e., no overwriting of variables. When overwriting variables, accessing adjoints later may result in wrong values. This is due to the fact, that adjoints are accessed via a reference inside the active variable. When a variable gets overwritten, this reference changes.
See Section 10 in f07ca_a1w_f which shows how copies are made of active input/output variables to obtain correct derivative values.

### 4.5Error handling

The NAG AD Library uses the same error reporting mechanism as the main library as described in Section 3 in How to Use the NAG Library and its Documentation. There may be additional routine specific error exits for NAG AD Library routines, relating to cases where the adjoint could not be computed accurately, and two additional error exits common to all routines in the NAG AD Library as described in the following subsections.

#### 4.5.1Dynamic Memory Allocation

In the case where a routine detects a failure to dynamically allocate sufficient memory, the routine will set an error condition, setting $\mathbf{ifail}=-899$, and exit with an appropriate error message.

#### 4.5.2Unexpected Errors

Internal calls to Library routines are checked for error exits even when these exits are not to be expected. Should an unexpected error exit occur the routine will set an error condition by setting ifail and exit with an appropriate error message. The value $\mathbf{ifail}=-89$ is set for unexpected error detection.

## 5The NAG AD Library and dco

dco is available as the licenced tool dco/c++. It implements Algorithmic Differentiation using operator overloading with features presented in its documentation (see NAG and Algorithmic Differentiation).
As already mentioned, the interface of the NAG AD Library is compatible with a dco solution. To be more specific, the data types and the functions acting on those types are binary compatible with dco. The interfaces given in Section 4.2.1 and those from the X10 Chapter Introduction therefore have a corresponding dco/c++ interface. This interface also comes with dco/c++/light, which is part of the NAG AD Library. dco/c++/light can only be used in conjuction with the NAG AD Library.
dco/c++ has various configuration options at compile time. When using dco/c++ or dco/c++/light interfaces, you have to use the correct configuration, i.e., the default. Do not set any compile time flags as described in Section 2.2 of the dco/c++ User Guide (full documentation is available only with dco). If you require and of the flags, please contact NAG.
Algorithmic differentiation can be used within the NAG Library in two different ways, as AD of the NAG Library and AD for the NAG Library.

As described in this document, the NAG AD Library makes it possible for the user to calculate derivatives of NAG Library routines. As shown, the AD interface requires use of special data types. It is advantageous to the understanding of these types if you are familiar with how dco/c++ works, i.e., have read the documentation of dco/c++. In addition, those special data types are in fact binary compatible with the dco/c++ data types. This means that the NAG AD Library routines can be seamlessly integrated into an overall dco/c++ solution.
dco/c++ is particularly convenient for those routines with supplied procedure arguments. In algorithmic computational mode, the conversion from NAG Library to NAG AD Library version of this procedure argument is implemented simply by replacing active types (e.g., double) by the corresponding dco/c++ or nagad data types. Nothing further is required in this case.
For example, if a procedure argument for a NAG Library routine looked like:
```void fcn(const double *x,
double *z,
Integer iuser[],
double ruser[]) {
z = sin(ruser[0]*x*x) - 0.1;
}```
then the NAG AD Library equivalent routine is simply
```void fcn_a1w(void **ad_handle, const nagad_a1w_w_rtype *x,
Integer iuser[],
z = sin(ruser[0]*x*x) - 0.1;
}```
or, using dco/c++ types,
```void fcn_a1w(void **ad_handle, const dco::ga1s<double>::type *x,
dco::ga1s<double>::type *z,
Integer iuser[],
dco::ga1s<double>::type ruser[]) {
z = sin(ruser[0]*x*x) - 0.1;
}
```

### 5.2AD for the NAG Library

Many routines within the NAG Library require derivative information of a user-supplied function or subroutine. dco/c++ can deliver those derivatives automatically by using operator overloading on the function implementation. Depending on the required derivatives (e.g., Jacobian or Gradient), tangent or adjoint modes can be used. In the NAG AD Library, the unconstrained minimization routine e04gb_a1w_f has an objective function argument (lsqfun) for which Jacobian matrix values can be returned. Therefore, the supplied procedure argument can create its own IR, register input variables and interpret adjoints of the objective function in order to return Jacobian values.
As an example, see the implementation of the user-supplied function in Section 10 in e04gb_a1w_f.

## 6How to Use the NAG AD Library Documentation

At Mark 26.2 the adjoint routines are collated together under a NAG AD Library Contents which is separate to the main NAG Library Manual chapter structure described in Section 4.2 in How to Use the NAG Library and its Documentation.
Each routine provides a link to:
• the NAG Library Manual
• the NAG C Library Manual
• the NAG AD Library Contents
• the primal routine from which the adjoint routine is derived
• the primal Chapter Introduction
• the primal Chapter Contents document

### 6.1Specification

Each specification provides details of the Fortran and C++ header interfaces available. Hovering over an argument name will reveal a tooltip popup.
The Fortran specification has a first argument (ad_handle) of type c_ptr which requires a Use iso_c_binding statement at the start of the calling Fortran program unit (e.g., in any Fortran example program of the NAG AD Library). Subsequent arguments that were real-valued in the primal version of the NAG Library are replaced by arguments of the same name with NAG AD type, e.g., nagad_a1w_w_rtype.
The C++ header specifications begin with a #include "nagad.h" line. If full dco/c++ is available (i.e., dco/c++ is installed and licensed) then this may be preceded by the line #include "dco.hpp" as shown in the C++ examples e.g., Section 10 in c05ay_a1w_f.

### 6.2Description

When a NAG AD Library routine can use the symbolic computational mode, details of the symbolic adjoint computation will be provided under the routine description.

### 6.3Argument Descriptions

The argument descriptions are simply short summaries taken from the corresponding passive routine argument description from the NAG Library.

### 6.4Example

Each example is a variant of the example for the primal routine which has been modified to demonstrate calling the NAG AD Library. This will demonstrate the basic stages of adjoint calculation including the calling of NAG AD Library computational routines.
As for the main NAG Library the example programs are designed so that they can be fairly easily modified, and so serve as the basis for a simple program to solve your problem.
For each implementation of the Library, NAG distributes the example programs in machine-readable form, with all necessary modifications applied. Many sites make the programs accessible to you in this form. Generic forms of the programs, without implementation-specific modifications, may be obtained directly from the NAG web site. The Users' Note for your implementation will mention any special changes which need to be made to the example programs.
Note that the results obtained from running the example programs may not be identical in all implementations and may not agree exactly with the results in the Manual.

## 7References

Dunford N, Schwartz J T, Bade W G and Bartle R G (1971) Linear Operators Wiley Interscience, New York
Giles M (2008) Collected Matrix Derivative Results for Forward and Reverse Mode Algorithmic Differentiation Springer
Griewank A and Walther A (2008) Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiations (2nd Edition) SIAM
Hascoët L, Naumann U and Pascual V (2005) ‘To be Recorded’ Analysis in Reverse-Mode Automatic Differentiation Future Generation Computer Systems 21(8) 299–304 Elsevier
Naumann U (2012) The Art of Differentiating Computer Programs: An Introduction to Algorithmic Differentiation SIAM
Naumann U, Lotz J, Leppkes K, Towara M (2015) Algorithmic Differentiation of Numerical Methods: Tangent and Adjoint Solvers for Parameterized Systems of Nonlinear Equations ACM Trans. Math. Softw.