E04 Chapter Contents
NAG Library Manual

# NAG Library Chapter IntroductionE04 – Minimizing or Maximizing a Function

## 1  Scope of the Chapter

An optimization problem involves minimizing a function (called the objective function) of several variables, possibly subject to restrictions on the values of the variables defined by a set of constraint functions. Most routines in the Library are concerned with function minimization only, since the problem of maximizing a given objective function F(x) is equivalent to minimizing $-F\left(x\right)$. Some routines allow you to specify whether you are solving a minimization or maximization problem, carrying out the required transformation of the objective function in the latter case.
In general routines in this chapter find a local minimum of a function $f$, that is a point ${x}^{*}$ s.t. for all $x$ near .
The Chapter E05 contains routines to find the global minimum of a function $f$. At a global minimum  for all $x$.
The Chapter H contains routines typically regarded as belonging to the field of operations research.
This introduction is only a brief guide to the subject of optimization designed for the casual user. Anyone with a difficult or protracted problem to solve will find it beneficial to consult a more detailed text, such as Gill et al. (1981) or Fletcher (1987).
If you are unfamiliar with the mathematics of the subject you may find some sections difficult at first reading; if so, you should concentrate on Sections 2.1, 2.2, 2.5, 2.6 and 3.

## 2  Background to the Problems

### 2.1  Types of Optimization Problems

The solution of optimization problems by a single, all-purpose, method is cumbersome and inefficient. Optimization problems are therefore classified into particular categories, where each category is defined by the properties of the objective and constraint functions, as illustrated by some examples below.
 Properties of Objective Function Properties of Constraints Nonlinear Nonlinear Sums of squares of nonlinear functions Sparse linear Quadratic Linear Sums of squares of linear functions Bounds Linear None
For instance, a specific problem category involves the minimization of a nonlinear objective function subject to bounds on the variables. In the following sections we define the particular categories of problems that can be solved by routines contained in this chapter. Not every category is given special treatment in the current version of the Library; however, the long-term objective is to provide a comprehensive set of routines to solve problems in all such categories.

#### 2.1.1  Unconstrained minimization

In unconstrained minimization problems there are no constraints on the variables. The problem can be stated mathematically as follows:
 $minimizexFx$
where $x\in {R}^{n}$, that is, $x={\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)}^{\mathrm{T}}$.

#### 2.1.2  Nonlinear least squares problems

Special consideration is given to the problem for which the function to be minimized can be expressed as a sum of squared functions. The least squares problem can be stated mathematically as follows:
 $minimizex fTf=∑i=1mfi2x , x∈Rn$
where the $i$th element of the $m$-vector $f$ is the function ${f}_{i}\left(x\right)$.

#### 2.1.3  Minimization subject to bounds on the variables

These problems differ from the unconstrained problem in that at least one of the variables is subject to a simple bound (or restriction) on its value, e.g., ${x}_{5}\le 10$, but no constraints of a more general form are present.
The problem can be stated mathematically as follows:
 $minimizex Fx, x∈Rn$
subject to ${l}_{\mathit{i}}\le {x}_{\mathit{i}}\le {u}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,n$.
This format assumes that upper and lower bounds exist on all the variables. By conceptually allowing ${u}_{i}=+\infty$ and ${l}_{i}=-\infty$ all the variables need not be restricted.

#### 2.1.4  Minimization subject to linear constraints

A general linear constraint is defined as a constraint function that is linear in more than one of the variables, e.g., $3{x}_{1}+2{x}_{2}\ge 4$. The various types of linear constraint are reflected in the following mathematical statement of the problem:
 $minimizex Fx, x∈Rn$
subject to the
 equality constraints: ${a}_{i}^{\mathrm{T}}x={b}_{i}$ $i=1,2,\dots ,{m}_{1}$; inequality constraints: ${a}_{i}^{\mathrm{T}}x\ge {b}_{i}$ $i={m}_{1}+1,{m}_{1}+2,\dots ,{m}_{2}$; ${a}_{i}^{\mathrm{T}}x\le {b}_{i}$ $i={m}_{2}+1,{m}_{2}+2,\dots ,{m}_{3}$; range constraints: ${s}_{j}\le {a}_{i}^{\mathrm{T}}x\le {t}_{j}$ $i={m}_{3}+1,{m}_{3}+2,\dots ,{m}_{4}\text{;}$ $j=1,2,\dots ,{m}_{4}-{m}_{3}\text{;}$ bounds constraints: ${l}_{i}\le {x}_{i}\le {u}_{i}$ $i=1,2,\dots ,n$
where each ${a}_{i}$ is a vector of length $n$; ${b}_{i}$, ${s}_{j}$ and ${t}_{j}$ are constant scalars; and any of the categories may be empty.
Although the bounds on ${x}_{i}$ could be included in the definition of general linear constraints, we prefer to distinguish between them for reasons of computational efficiency.
If $F\left(x\right)$ is a linear function, the linearly-constrained problem is termed a linear programming problem (LP); if $F\left(x\right)$ is a quadratic function, the problem is termed a quadratic programming problem (QP). For further discussion of LP and QP problems, including the dual formulation of such problems, see Dantzig (1963).

#### 2.1.5  Minimization subject to nonlinear constraints

A problem is included in this category if at least one constraint function is nonlinear, e.g., ${x}_{1}^{2}+{x}_{3}+{x}_{4}-2\ge 0$. The mathematical statement of the problem is identical to that for the linearly-constrained case, except for the addition of the following constraints:
 equality constraints: ${c}_{i}\left(x\right)=0$ $i=1,2,\dots ,{m}_{5}$; inequality constraints: ${c}_{i}\left(x\right)\ge 0$ $i={m}_{5}+1,{m}_{5}+2,\dots ,{m}_{6}$; range constraints: ${v}_{j}\le {c}_{i}\left(x\right)\le {w}_{j}$ $i={m}_{6}+1,{m}_{6}+2,\dots ,{m}_{7}$, $j=1,2,\dots ,{m}_{7}-{m}_{6}$
where each ${c}_{i}$ is a nonlinear function; ${v}_{j}$ and ${w}_{j}$ are constant scalars; and any category may be empty. Note that we do not include a separate category for constraints of the form ${c}_{i}\left(x\right)\le 0$, since this is equivalent to $\text{}-{c}_{i}\left(x\right)\ge 0$.
Although the general linear constraints could be included in the definition of nonlinear constraints, again we prefer to distinguish between them for reasons of computational efficiency.
If $F\left(x\right)$ is a nonlinear function, the nonlinearly-constrained problem is termed a nonlinear programming problem (NLP). For further discussion of NLP problems, see Gill et al. (1981) or Fletcher (1987).

#### 2.1.6  Minimization subject to bounds on the objective function

In all of the above problem categories it is assumed that
 $a≤Fx≤b$
where $a=-\infty$ and $b=+\infty$. Problems in which $a$ and/or $b$ are finite can be solved by adding an extra constraint of the appropriate type (i.e., linear or nonlinear) depending on the form of $F\left(x\right)$. Further advice is given in Section 3.5.

#### 2.1.7  Multi-objective optimization

Sometimes a problem may have two or more objective functions which are to be optimized at the same time. Such problems are called multi-object, multi-criteria or multi-attribute optimization. If the constraints are linear and the objectives are all linear then the terminology ‘goal programming’ is also used.
Techniques used in this chapter and in Chapter E05 may be employed to address such problems.

### 2.2  Geometric Representation and Terminology

To illustrate the nature of optimization problems it is useful to consider the following example in two dimensions:
 $Fx=ex14x12+2x22+4x1x2+2x2+1.$
(This function is used as the example function in the documentation for the unconstrained routines.)
Figure 1
Figure 1 is a contour diagram of $F\left(x\right)$. The contours labelled ${F}_{0},{F}_{1},\dots ,{F}_{4}$ are isovalue contours, or lines along which the function $F\left(x\right)$ takes specific constant values. The point ${x}^{*}={\left(\frac{1}{2},-1\right)}^{\mathrm{T}}$ is a local unconstrained minimum, that is, the value of $F\left({x}^{*}\right)$ ($\text{}=0$) is less than at all the neighbouring points. A function may have several such minima. The lowest of the local minima is termed a global minimum. In the problem illustrated in Figure 1, ${x}^{*}$ is the only local minimum. The point ${x}_{s}$ is said to be a saddle point because it is a minimum along the line AB, but a maximum along CD.
If we add the constraint ${x}_{1}\ge 0$ (a simple bound) to the problem of minimizing $F\left(x\right)$, the solution remains unaltered. In Figure 1 this constraint is represented by the straight line passing through ${x}_{1}=0$, and the shading on the line indicates the unacceptable region (i.e., ${x}_{1}<0$). The region in ${R}^{n}$ satisfying the constraints of an optimization problem is termed the feasible region. A point satisfying the constraints is defined as a feasible point.
If we add the nonlinear constraint ${c}_{1}\left(x\right):{x}_{1}+{x}_{2}-{x}_{1}{x}_{2}-\frac{3}{2}\ge 0$, represented by the curved shaded line in Figure 1, then ${x}^{*}$ is not a feasible point because ${c}_{1}\left({x}^{*}\right)<0$. The solution of the new constrained problem is ${x}_{b}\simeq {\left(1.1825,-1.7397\right)}^{\mathrm{T}}$, the feasible point with the smallest function value (where $F\left({x}_{b}\right)\simeq 3.0607$).

The vector of first partial derivatives of $F\left(x\right)$ is called the gradient vector, and is denoted by $g\left(x\right)$, i.e.,
 $gx= ∂Fx ∂x1 , ∂Fx ∂x2 ,…, ∂Fx ∂xn T.$
For the function illustrated in Figure 1,
 $gx= Fx+ex18x1+ 4x2 ex1 4x2+ 4x1+ 2 .$
The gradient vector is of importance in optimization because it must be zero at an unconstrained minimum of any function with continuous first derivatives.

#### 2.2.2  Hessian matrix

The matrix of second partial derivatives of a function is termed its Hessian matrix. The Hessian matrix of $F\left(x\right)$ is denoted by $G\left(x\right)$, and its $\left(i,j\right)$th element is given by ${\partial }^{2}F\left(x\right)/\partial {x}_{i}\partial {x}_{j}$. If $F\left(x\right)$ has continuous second derivatives, then $G\left(x\right)$ must be positive definite at any unconstrained minimum of $F$.

#### 2.2.3  Jacobian matrix; matrix of constraint normals

In nonlinear least squares problems, the matrix of first partial derivatives of the vector-valued function $f\left(x\right)$ is termed the Jacobian matrix of $f\left(x\right)$ and its $\left(i,j\right)$th component is $\partial {f}_{i}/\partial {x}_{j}$.
The vector of first partial derivatives of the constraint ${c}_{i}\left(x\right)$ is denoted by
 $aix= ∂cix ∂x1 , ∂cix ∂x2 ,…, ∂cix ∂xn T.$
The matrix whose columns are the vectors $\left\{{a}_{i}\right\}$ is termed the matrix of constraint normals. At a point $\stackrel{^}{x}$, the vector ${a}_{i}\left(\stackrel{^}{x}\right)$ is orthogonal (normal) to the isovalue contour of ${c}_{i}\left(x\right)$ passing through $\stackrel{^}{x}$; this relationship is illustrated for a two-dimensional function in Figure 2.
Figure 2
Note that if ${c}_{i}\left(x\right)$ is a linear constraint involving ${a}_{i}^{\mathrm{T}}x$, then its vector of first partial derivatives is simply the vector ${a}_{i}$.

### 2.3  Sufficient Conditions for a Solution

All nonlinear functions will be assumed to have continuous second derivatives in the neighbourhood of the solution.

#### 2.3.1  Unconstrained minimization

The following conditions are sufficient for the point ${x}^{*}$ to be an unconstrained local minimum of $F\left(x\right)$:
 (i) $‖g\left({x}^{*}\right)‖=0$; and (ii) $G\left({x}^{*}\right)$ is positive definite,
where $‖g‖$ denotes the Euclidean length of $g$.

#### 2.3.2  Minimization subject to bounds on the variables

At the solution of a bounds-constrained problem, variables which are not on their bounds are termed free variables. If it is known in advance which variables are on their bounds at the solution, the problem can be solved as an unconstrained problem in just the free variables; thus, the sufficient conditions for a solution are similar to those for the unconstrained case, applied only to the free variables.
Sufficient conditions for a feasible point ${x}^{*}$ to be the solution of a bounds-constrained problem are as follows:
 (i) $‖\stackrel{-}{g}\left({x}^{*}\right)‖=0$; and (ii) $\stackrel{-}{G}\left({x}^{*}\right)$ is positive definite; and (iii) ${g}_{j}\left({x}^{*}\right)<0,{x}_{j}={u}_{j}$; ${g}_{j}\left({x}^{*}\right)>0,{x}_{j}={l}_{j}$,
where $\stackrel{-}{g}\left(x\right)$ is the gradient of $F\left(x\right)$ with respect to the free variables, and $\stackrel{-}{G}\left(x\right)$ is the Hessian matrix of $F\left(x\right)$ with respect to the free variables. The extra condition (iii) ensures that $F\left(x\right)$ cannot be reduced by moving off one or more of the bounds.

#### 2.3.3  Linearly-constrained minimization

For the sake of simplicity, the following description does not include a specific treatment of bounds or range constraints, since the results for general linear inequality constraints can be applied directly to these cases.
At a solution ${x}^{*}$, of a linearly-constrained problem, the constraints which hold as equalities are called the active or binding constraints. Assume that there are $t$ active constraints at the solution ${x}^{*}$, and let $\stackrel{^}{A}$ denote the matrix whose columns are the columns of $A$ corresponding to the active constraints, with $\stackrel{^}{b}$ the vector similarly obtained from $b$; then
 $A^Tx*=b^.$
The matrix $Z$ is defined as an $n×\left(n-t\right)$ matrix satisfying:
 $A^TZ=0; ZTZ=I.$
The columns of $Z$ form an orthogonal basis for the set of vectors orthogonal to the columns of $\stackrel{^}{A}$.
Define
• ${g}_{Z}\left(x\right)={Z}^{\mathrm{T}}g\left(x\right)$, the projected gradient vector of $F\left(x\right)$;
• ${G}_{Z}\left(x\right)={Z}^{\mathrm{T}}G\left(x\right)Z$, the projected Hessian matrix of $F\left(x\right)$.
At the solution of a linearly-constrained problem, the projected gradient vector must be zero, which implies that the gradient vector $g\left({x}^{*}\right)$ can be written as a linear combination of the columns of $\stackrel{^}{A}$, i.e., $g\left({x}^{*}\right)=\sum _{i=1}^{t}{\lambda }_{i}^{*}{\stackrel{^}{a}}_{i}=\stackrel{^}{A}{\lambda }^{*}$. The scalar ${\lambda }_{i}^{*}$ is defined as the Lagrange multiplier corresponding to the $i$th active constraint. A simple interpretation of the $i$th Lagrange multiplier is that it gives the gradient of $F\left(x\right)$ along the $i$th active constraint normal; a convenient definition of the Lagrange multiplier vector (although not a recommended method for computation) is:
 $λ*=A^TA^-1A^Tgx*.$
Sufficient conditions for ${x}^{*}$ to be the solution of a linearly-constrained problem are:
 (i) ${x}^{*}$ is feasible, and ${\stackrel{^}{A}}^{\mathrm{T}}{x}^{*}=\stackrel{^}{b}$; and (ii) $‖{g}_{Z}\left({x}^{*}\right)‖=0$, or equivalently, $g\left({x}^{*}\right)=\stackrel{^}{A}{\lambda }^{*}$; and (iii) ${G}_{Z}\left({x}^{*}\right)$ is positive definite; and (iv) ${\lambda }_{i}^{*}>0$ if ${\lambda }_{i}^{*}$ corresponds to a constraint ${\stackrel{^}{a}}_{i}^{\mathrm{T}}{x}^{*}\ge {\stackrel{^}{b}}_{i}$; ${\lambda }_{i}^{*}<0$ if ${\lambda }_{i}^{*}$ corresponds to a constraint ${\stackrel{^}{a}}_{i}^{\mathrm{T}}{x}^{*}\le {\stackrel{^}{b}}_{i}$. The sign of ${\lambda }_{i}^{*}$ is immaterial for equality constraints, which by definition are always active.

#### 2.3.4  Nonlinearly-constrained minimization

For nonlinearly-constrained problems, much of the terminology is defined exactly as in the linearly-constrained case. The set of active constraints at $x$ again means the set of constraints that hold as equalities at $x$, with corresponding definitions of $\stackrel{^}{c}$ and $\stackrel{^}{A}$: the vector $\stackrel{^}{c}\left(x\right)$ contains the active constraint functions, and the columns of $\stackrel{^}{A}\left(x\right)$ are the gradient vectors of the active constraints. As before, $Z$ is defined in terms of $\stackrel{^}{A}\left(x\right)$ as a matrix such that:
 $A^TZ=0; ZTZ=I$
where the dependence on $x$ has been suppressed for compactness.
The projected gradient vector ${g}_{Z}\left(x\right)$ is the vector ${Z}^{\mathrm{T}}g\left(x\right)$. At the solution ${x}^{*}$ of a nonlinearly-constrained problem, the projected gradient must be zero, which implies the existence of Lagrange multipliers corresponding to the active constraints, i.e., $g\left({x}^{*}\right)=\stackrel{^}{A}\left({x}^{*}\right){\lambda }^{*}$.
The Lagrangian function is given by:
 $Lx,λ=Fx-λTc^x.$
We define ${g}_{L}\left(x\right)$ as the gradient of the Lagrangian function; ${G}_{L}\left(x\right)$ as its Hessian matrix, and ${\stackrel{^}{G}}_{L}\left(x\right)$ as its projected Hessian matrix, i.e., ${\stackrel{^}{G}}_{L}={Z}^{\mathrm{T}}{G}_{L}Z$.
Sufficient conditions for ${x}^{*}$ to be the solution of a nonlinearly-constrained problem are:
 (i) ${x}^{*}$ is feasible, and $\stackrel{^}{c}\left({x}^{*}\right)=0$; and (ii) $‖{g}_{Z}\left({x}^{*}\right)‖=0$, or, equivalently, $g\left({x}^{*}\right)=\stackrel{^}{A}\left({x}^{*}\right){\lambda }^{*}$; and (iii) ${\stackrel{^}{G}}_{L}\left({x}^{*}\right)$ is positive definite; and (iv) ${\lambda }_{i}^{*}>0$ if ${\lambda }_{i}^{*}$ corresponds to a constraint of the form ${\stackrel{^}{c}}_{i}\ge 0$. The sign of ${\lambda }_{i}^{*}$ is immaterial for equality constraints, which by definition are always active.
Note that condition (ii) implies that the projected gradient of the Lagrangian function must also be zero at ${x}^{*}$, since the application of ${Z}^{\mathrm{T}}$ annihilates the matrix $\stackrel{^}{A}\left({x}^{*}\right)$.

### 2.4  Background to Optimization Methods

All the algorithms contained in this chapter generate an iterative sequence $\left\{{x}^{\left(k\right)}\right\}$ that converges to the solution ${x}^{*}$ in the limit, except for some special problem categories (i.e., linear and quadratic programming). To terminate computation of the sequence, a convergence test is performed to determine whether the current estimate of the solution is an adequate approximation. The convergence tests are discussed in Section 2.6.
Most of the methods construct a sequence $\left\{{x}^{\left(k\right)}\right\}$ satisfying:
 $x k+1 =x k +α k p k ,$
where the vector ${p}^{\left(k\right)}$ is termed the direction of search, and ${\alpha }^{\left(k\right)}$ is the steplength. The steplength ${\alpha }^{\left(k\right)}$ is chosen so that $F\left({x}^{\left(k+1\right)}\right) and is computed using one of the techniques for one-dimensional optimization referred to in Section 2.4.1.

#### 2.4.1  One-dimensional optimization

The Library contains two special routines for minimizing a function of a single variable. Both routines are based on safeguarded polynomial approximation. One routine requires function evaluations only and fits a quadratic polynomial whilst the other requires function and gradient evaluations and fits a cubic polynomial. See Section 4.1 of Gill et al. (1981).

#### 2.4.2  Methods for unconstrained optimization

The distinctions among methods arise primarily from the need to use varying levels of information about derivatives of $F\left(x\right)$ in defining the search direction. We describe three basic approaches to unconstrained problems, which may be extended to other problem categories. Since a full description of the methods would fill several volumes, the discussion here can do little more than allude to the processes involved, and direct you to other sources for a full explanation.
 (a) Newton-type Methods (Modified Newton Methods) Newton-type methods use the Hessian matrix $G\left({x}^{\left(k\right)}\right)$, or a finite difference approximation to $G\left({x}^{\left(k\right)}\right)$, to define the search direction. The routines in the Library either require a subroutine that computes the elements of $G\left({x}^{\left(k\right)}\right)$ directly, or they approximate $G\left({x}^{\left(k\right)}\right)$ by finite differences. Newton-type methods are the most powerful methods available for general problems and will find the minimum of a quadratic function in one iteration. See Sections 4.4 and 4.5.1 of Gill et al. (1981). (b) Quasi-Newton Methods Quasi-Newton methods approximate the Hessian $G\left({x}^{\left(k\right)}\right)$ by a matrix ${B}^{\left(k\right)}$ which is modified at each iteration to include information obtained about the curvature of $F$ along the current search direction ${p}^{\left(k\right)}$. Although not as robust as Newton-type methods, quasi-Newton methods can be more efficient because $G\left({x}^{\left(k\right)}\right)$ is not computed directly, or approximated by finite differences. Quasi-Newton methods minimize a quadratic function in $n$ iterations, where $n$ is the number of variables. See Section 4.5.2 of Gill et al. (1981). (c) Conjugate-gradient Methods Unlike Newton-type and quasi-Newton methods, conjugate-gradient methods do not require the storage of an $n$ by $n$ matrix and so are ideally suited to solve large problems. Conjugate-gradient type methods are not usually as reliable or efficient as Newton-type, or quasi-Newton methods. See Section 4.8.3 of Gill et al. (1981).

#### 2.4.3  Methods for nonlinear least squares problems

These methods are similar to those for unconstrained optimization, but exploit the special structure of the Hessian matrix to give improved computational efficiency.
Since
 $Fx=∑i=1mfi2x$
the Hessian matrix $G\left(x\right)$ is of the form
 $Gx = 2 JxT Jx + ∑i=1m fix Gix ,$
where $J\left(x\right)$ is the Jacobian matrix of $f\left(x\right)$, and ${G}_{i}\left(x\right)$ is the Hessian matrix of ${f}_{i}\left(x\right)$.
In the neighbourhood of the solution, $‖f\left(x\right)‖$ is often small compared to $‖J{\left(x\right)}^{\mathrm{T}}J\left(x\right)‖$ (for example, when $f\left(x\right)$ represents the goodness-of-fit of a nonlinear model to observed data). In such cases, $2J{\left(x\right)}^{\mathrm{T}}J\left(x\right)$ may be an adequate approximation to $G\left(x\right)$, thereby avoiding the need to compute or approximate second derivatives of $\left\{{f}_{i}\left(x\right)\right\}$. See Section 4.7 of Gill et al. (1981).

#### 2.4.4  Methods for handling constraints

Bounds on the variables are dealt with by fixing some of the variables on their bounds and adjusting the remaining free variables to minimize the function. By examining estimates of the Lagrange multipliers it is possible to adjust the set of variables fixed on their bounds so that eventually the bounds active at the solution should be correctly identified. This type of method is called an active set method. One feature of such methods is that, given an initial feasible point, all approximations ${x}^{\left(k\right)}$ are feasible. This approach can be extended to general linear constraints. At a point, $x$, the set of constraints which hold as equalities being used to predict, or approximate, the set of active constraints is called the working set.
Nonlinear constraints are more difficult to handle. If at all possible, it is usually beneficial to avoid including nonlinear constraints during the formulation of the problem. The methods currently implemented in the Library handle nonlinearly constrained problems by transforming them into a sequence of quadratic programming problems. A feature of such methods is that ${x}^{\left(k\right)}$ is not guaranteed to be feasible except in the limit, and this is certainly true of the routines currently in the Library. See Chapter 6, particularly Sections 6.4 and 6.5, of Gill et al. (1981).
Anyone interested in a detailed description of methods for optimization should consult the references.

#### 2.4.5  Methods for handling multi-objective optimization

Suppose we have objective functions ${f}_{i}\left(x\right)$, $i>1$, all of which we need to minimize at the same time. There are two main approaches to this problem:
(a) Combine the individual objectives into one composite objective. Typically this might be a weighted sum of the objectives, e.g.,
 $w1 f1x + w2 f2x + ⋯ + wn fnx$
Here you choose the weights to express the relative importance of the corresponding objective. Ideally each of the ${f}_{i}\left(x\right)$ should be of comparable size at a solution.
(b) Order the objectives in order of importance. Suppose ${f}_{\mathit{i}}$ are ordered such that ${f}_{\mathit{i}}\left(x\right)$ is more important than ${f}_{\mathit{i}+1}\left(x\right)$, for $\mathit{i}=1,2,\dots ,n-1$. Then in the lexicographical approach to multi-objective optimization a sequence of subproblems are solved. Firstly solve the problem for objective function ${f}_{1}\left(x\right)$ and denote by ${r}_{1}$ the value of this minimum. If $\left(\mathit{i}-1\right)$ subproblems have been solved with results ${r}_{\mathit{i}-1}$ then subproblem $\mathit{i}$ becomes $\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({f}_{\mathit{i}}\left(x\right)\right)$ subject to ${r}_{k}\le {f}_{k}\left(x\right)\le {r}_{k}$, for $\mathit{k}=1,2,\dots ,i-1$ plus the other constraints.
Clearly the bounds on ${f}_{k}$ might be relaxed at your discretion.
In general, if NAG routines from the Chapter E04 are used then only local minima are found. This means that a better solution to an individual objective might be found without worsening the optimal solutions to the other objectives. Ideally you seek a Pareto solution; one in which an improvement in one objective can only be achieved by a worsening of another objective.
To obtain a Pareto solution routines from Chapter E05 might be used or, alternatively, a pragmatic attempt to derive a global minimum might be tried (see E05UCF). In this approach a variety of different minima are computed for each subproblem by starting from a range of different starting points. The best solution achieved is taken to be the global minimum. The more starting points chosen the greater confidence you might have in the computed global minimum.

### 2.5  Scaling

Scaling (in a broadly defined sense) often has a significant influence on the performance of optimization methods. Since convergence tolerances and other criteria are necessarily based on an implicit definition of ‘small’ and ‘large’, problems with unusual or unbalanced scaling may cause difficulties for some algorithms. Although there are currently no user-callable scaling routines in the Library, scaling is automatically performed by default in the routines which solve sparse LP, QP or NLP problems and in some newer dense solver routines. The following sections present some general comments on problem scaling.

#### 2.5.1  Transformation of variables

One method of scaling is to transform the variables from their original representation, which may reflect the physical nature of the problem, to variables that have certain desirable properties in terms of optimization. It is generally helpful for the following conditions to be satisfied:
 (i) the variables are all of similar magnitude in the region of interest; (ii) a fixed change in any of the variables results in similar changes in $F\left(x\right)$. Ideally, a unit change in any variable produces a unit change in $F\left(x\right)$; (iii) the variables are transformed so as to avoid cancellation error in the evaluation of $F\left(x\right)$.
Normally, you should restrict yourself to linear transformations of variables, although occasionally nonlinear transformations are possible. The most common such transformation (and often the most appropriate) is of the form
 $xnew=Dxold,$
where $D$ is a diagonal matrix with constant coefficients. Our experience suggests that more use should be made of the transformation
 $xnew=Dxold+v,$
where $v$ is a constant vector.
Consider, for example, a problem in which the variable ${x}_{3}$ represents the position of the peak of a Gaussian curve to be fitted to data for which the extreme values are $150$ and $170$; therefore ${x}_{3}$ is known to lie in the range $150$$170$. One possible scaling would be to define a new variable ${\stackrel{-}{x}}_{3}$, given by
 $x-3=x3170.$
A better transformation, however, is given by defining ${\stackrel{-}{x}}_{3}$ as
 $x-3=x3-16010.$
Frequently, an improvement in the accuracy of evaluation of $F\left(x\right)$ can result if the variables are scaled before the routines to evaluate $F\left(x\right)$ are coded. For instance, in the above problem just mentioned of Gaussian curve-fitting, ${x}_{3}$ may always occur in terms of the form $\left({x}_{3}-{x}_{m}\right)$, where ${x}_{m}$ is a constant representing the mean peak position.

#### 2.5.2  Scaling the objective function

The objective function has already been mentioned in the discussion of scaling the variables. The solution of a given problem is unaltered if $F\left(x\right)$ is multiplied by a positive constant, or if a constant value is added to $F\left(x\right)$. It is generally preferable for the objective function to be of the order of unity in the region of interest; thus, if in the original formulation $F\left(x\right)$ is always of the order of ${10}^{+5}$ (say), then the value of $F\left(x\right)$ should be multiplied by ${10}^{-5}$ when evaluating the function within an optimization routine. If a constant is added or subtracted in the computation of $F\left(x\right)$, usually it should be omitted, i.e., it is better to formulate $F\left(x\right)$ as ${x}_{1}^{2}+{x}_{2}^{2}$ rather than as ${x}_{1}^{2}+{x}_{2}^{2}+1000$ or even ${x}_{1}^{2}+{x}_{2}^{2}+1$. The inclusion of such a constant in the calculation of $F\left(x\right)$ can result in a loss of significant figures.

#### 2.5.3  Scaling the constraints

A ‘well scaled’ set of constraints has two main properties. Firstly, each constraint should be well-conditioned with respect to perturbations of the variables. Secondly, the constraints should be balanced with respect to each other, i.e., all the constraints should have ‘equal weight’ in the solution process.
The solution of a linearly- or nonlinearly-constrained problem is unaltered if the $i$th constraint is multiplied by a positive weight ${w}_{i}$. At the approximation of the solution determined by a Library routine, any active linear constraints will (in general) be satisfied ‘exactly’ (i.e., to within the tolerance defined by machine precision) if they have been properly scaled. This is in contrast to any active nonlinear constraints, which will not (in general) be satisfied ‘exactly’ but will have ‘small’ values (for example, ${\stackrel{^}{c}}_{1}\left({x}^{*}\right)={10}^{-8}$, ${\stackrel{^}{c}}_{2}\left({x}^{*}\right)={-10}^{-6}$, and so on). In general, this discrepancy will be minimized if the constraints are weighted so that a unit change in $x$ produces a similar change in each constraint.
A second reason for introducing weights is related to the effect of the size of the constraints on the Lagrange multiplier estimates and, consequently, on the active set strategy. This means that different sets of weights may cause an algorithm to produce different sequences of iterates. Additional discussion is given in Gill et al. (1981).

### 2.6  Analysis of Computed Results

#### 2.6.1  Convergence criteria

The convergence criteria inevitably vary from routine to routine, since in some cases more information is available to be checked (for example, is the Hessian matrix positive definite?), and different checks need to be made for different problem categories (for example, in constrained minimization it is necessary to verify whether a trial solution is feasible). Nonetheless, the underlying principles of the various criteria are the same; in non-mathematical terms, they are:
 (i) is the sequence $\left\{{x}^{\left(k\right)}\right\}$ converging? (ii) is the sequence $\left\{{F}^{\left(k\right)}\right\}$ converging? (iii) are the necessary and sufficient conditions for the solution satisfied?
The decision as to whether a sequence is converging is necessarily speculative. The criterion used in the present routines is to assume convergence if the relative change occurring between two successive iterations is less than some prescribed quantity. Criterion (iii) is the most reliable but often the conditions cannot be checked fully because not all the required information may be available.

#### 2.6.2  Checking results

Little a priori guidance can be given as to the quality of the solution found by a nonlinear optimization algorithm, since no guarantees can be given that the methods will not fail. Therefore, you should always check the computed solution even if the routine reports success. Frequently a ‘solution’ may have been found even when the routine does not report a success. The reason for this apparent contradiction is that the routine needs to assess the accuracy of the solution. This assessment is not an exact process and consequently may be unduly pessimistic. Any ‘solution’ is in general only an approximation to the exact solution, and it is possible that the accuracy you have specified is too stringent.
Further confirmation can be sought by trying to check whether or not convergence tests are almost satisfied, or whether or not some of the sufficient conditions are nearly satisfied. When it is thought that a routine has returned a nonzero value of IFAIL only because the requirements for ‘success’ were too stringent it may be worth restarting with increased convergence tolerances.
For nonlinearly-constrained problems, check whether the solution returned is feasible, or nearly feasible; if not, the solution returned is not an adequate solution.
Confidence in a solution may be increased by resolving the problem with a different initial approximation to the solution. See Section 8.3 of Gill et al. (1981) for further information.

#### 2.6.3  Monitoring progress

Many of the routines in the chapter have facilities to allow you to monitor the progress of the minimization process, and you are encouraged to make use of these facilities. Monitoring information can be a great aid in assessing whether or not a satisfactory solution has been obtained, and in indicating difficulties in the minimization problem or in the ability of the routine to cope with the problem.
The behaviour of the function, the estimated solution and first derivatives can help in deciding whether a solution is acceptable and what to do in the event of a return with a nonzero value of IFAIL.

#### 2.6.4  Confidence intervals for least squares solutions

When estimates of the parameters in a nonlinear least squares problem have been found, it may be necessary to estimate the variances of the parameters and the fitted function. These can be calculated from the Hessian of $F\left(x\right)$ at the solution.
In many least squares problems, the Hessian is adequately approximated at the solution by $G=2{J}^{\mathrm{T}}J$ (see Section 2.4.3). The Jacobian, $J$, or a factorization of $J$ is returned by all the comprehensive least squares routines and, in addition, a routine is available in the Library to estimate variances of the parameters following the use of most of the nonlinear least squares routines, in the case that $G=2{J}^{\mathrm{T}}J$ is an adequate approximation.
Let $H$ be the inverse of $G$, and $S$ be the sum of squares, both calculated at the solution $\stackrel{-}{x}$; an unbiased estimate of the variance of the $i$th parameter ${x}_{i}$ is
 $var⁡x-i=2S m-n Hii$
and an unbiased estimate of the covariance of ${\stackrel{-}{x}}_{i}$ and ${\stackrel{-}{x}}_{j}$ is
 $covarx-i,x-j=2S m-n Hij.$
If ${x}^{*}$ is the true solution, then the $100\left(1-\beta \right)%\text{}$ confidence interval on $\stackrel{-}{x}$ is
 $x-i-var⁡x-i. t1-β/2,m-n
where ${t}_{\left(1-\beta /2,m-n\right)}$ is the $100\left(1-\beta \right)/2$ percentage point of the $t$-distribution with $m-n$ degrees of freedom.
In the majority of problems, the residuals ${f}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,m$, contain the difference between the values of a model function $\varphi \left(z,x\right)$ calculated for $m$ different values of the independent variable $z$, and the corresponding observed values at these points. The minimization process determines the parameters, or constants $x$, of the fitted function $\varphi \left(z,x\right)$. For any value, $\stackrel{-}{z}$, of the independent variable $z$, an unbiased estimate of the variance of $\varphi$ is
 $var⁡ϕ=2S m-n ∑i=1n ∑j=1n ∂ϕ ∂xi z- ∂ϕ ∂xj z- Hij.$
The $100\left(1-\beta \right)%$ confidence interval on $F$ at the point $\stackrel{-}{z}$ is
 $ϕz-,x--var⁡ϕ.tβ/2,m-n<ϕz-,x*<ϕz-,x-+var⁡ϕ.tβ/2,m-n.$
For further details on the analysis of least squares solutions see Bard (1974) and Wolberg (1967).

## 3  Recommendations on Choice and Use of Available Routines

The choice of routine depends on several factors: the type of problem (unconstrained, etc.); the level of derivative information available (function values only, etc.); your experience (there are easy-to-use versions of some routines); whether or not storage is a problem; whether or not the routine is to be used in a multithreaded environment; and whether computational time has a high priority. Not all choices are catered for in the current version of the Library.

### 3.1  Easy-to-use and Comprehensive Routines

Many routines appear in the Library in two forms: a comprehensive form and an easy-to-use form. The objective in the easy-to-use forms is to make the routine simple to use by including in the calling sequence only those parameters absolutely essential to the definition of the problem, as opposed to parameters relevant to the solution method. If you are an experienced user the comprehensive routines have additional parameters which enable you to improve their efficiency by ‘tuning’ the method to a particular problem. If you are a casual or inexperienced user, this feature is of little value and may in some cases cause a failure because of a poor choice of some parameters.
In the easy-to-use routines, these extra parameters are determined either by fixing them at a known safe and reasonably efficient value, or by an auxiliary routine which generates a ‘good’ value automatically.
For routines introduced since Mark 12 of the Library a different approach has been adopted towards the choice of easy-to-use and comprehensive routines. The optimization routine has an easy-to-use parameter list, but additional parameters may be changed from their default values by calling an ‘option’ setting routine before the call to the main optimization routine. This approach has the advantages of allowing the options to be given in the form of keywords and requiring only those options that are to be different from their default values to be set.

Many of the routines in this chapter come in pairs, with each routine in the pair having exactly the same functionality, except that one of them has additional parameters in order to make it safe for use in multithreaded applications. The routine that is safe for use in multithreaded applicatons has an ‘A’ as the last character in the name, in place of the usual ‘F’.
An example of such a pair is E04ABA and E04ABF.
All ‘F’ routines not scheduled for withdrawal from the Library and where there is no ‘A’ version of that routine are thread safe provided that the implementation as a whole is considered thread safe (refer to the Users' Note for your implementation).

### 3.3  Reverse Communication Routines

Most of the routines in this chapter are called just once in order to compute the minimum of a given objective function subject to a set of constraints on the variables. The objective function and nonlinear constraints (if any) are specified by you and written as subroutines to a very rigid format described in the relevant routine document.
This chapter also contains a pair of reverse communication routines, E04UFF/E04UFA, which solve dense NLP problems using a sequential quadratic programming method. These may be convenient to use when the minimization routine is being called from a computer language which does not fully support procedure arguments in a way that is compatible with the Library. See Section 3.2.3 in the Essential Introduction for more information about reverse communication routines.

### 3.4  Service Routines

One of the most common errors in the use of optimization routines is that user-supplied subroutines do not evaluate the relevant partial derivatives correctly. Because exact gradient information normally enhances efficiency in all areas of optimization, you are encouraged to provide analytical derivatives whenever possible. However, mistakes in the computation of derivatives can result in serious and obscure run-time errors. Consequently, service routines are provided to perform an elementary check on the gradients you supplied. These routines are inexpensive to use in terms of the number of calls they require to user-supplied subroutines.
The appropriate checking routines are as follows:
 Minimization routine Checking routine(s) E04KDF E04HCF E04LBF E04HCF and E04HDF E04GBF E04YAF E04GDF E04YAF E04HEF E04YAF and E04YBF
It should be noted that routines E04UFF/E04UFA, E04USF/E04USA, E04VHF and E04WDF each incorporate a check on the gradients being supplied. This involves verifying the gradients at the first point that satisfies the linear constraints and bounds. There is also an option to perform a more reliable (but more expensive) check on the individual gradient elements being supplied. Note that the checks are not infallible.
A second type of service routine computes a set of finite differences to be used when approximating first derivatives. Such differences are required as input parameters by some routines that use only function evaluations.
E04YCF estimates selected elements of the variance-covariance matrix for the computed regression parameters following the use of a nonlinear least squares routine.
E04XAF/E04XAA estimates the gradient and Hessian of a function at a point, given a routine to calculate function values only, or estimates the Hessian of a function at a point, given a routine to calculate function and gradient values.

### 3.5  Function Evaluations at Infeasible Points

All the routines for constrained problems will ensure that any evaluations of the objective function occur at points which approximately satisfy any simple bounds or linear constraints. Satisfaction of such constraints is only approximate because routines which estimate derivatives by finite differences may require function evaluations at points which just violate such constraints even though the current iteration just satisfies them.
There is no attempt to ensure that the current iteration satisfies any nonlinear constraints. If you wish to prevent your objective function being evaluated outside some known region (where it may be undefined or not practically computable), you may try to confine the iteration within this region by imposing suitable simple bounds or linear constraints (but beware as this may create new local minima where these constraints are active).
Note also that some routines allow you to return the parameter (IFLAG or MODE) with a negative value to force an immediate clean exit from the minimization routine when the objective function (or nonlinear constraints where appropriate) cannot be evaluated.

### 3.6  Related Problems

Apart from the standard types of optimization problem, there are other related problems which can be solved by routines in this or other chapters of the Library.
H02BBF solves dense integer LP problems, H02CBF solves dense integer QP problems, H02CEF solves sparse integer QP problems and H03ABF solves a special type of such problem known as a ‘transportation’ problem.
Several routines in Chapters F04 and F08 solve linear least squares problems, i.e., $\mathrm{minimize}\sum _{i=1}^{m}{r}_{i}{\left(x\right)}^{2}$ where ${r}_{i}\left(x\right)={b}_{i}-\sum _{j=1}^{n}{a}_{ij}{x}_{j}$.
E02GAF solves an overdetermined system of linear equations in the ${l}_{1}$ norm, i.e., minimizes $\sum _{i=1}^{m}\left|{r}_{i}\left(x\right)\right|$, with ${r}_{i}$ as above, and E02GBF solves the same problem subject to linear inequality constraints.
E02GCF solves an overdetermined system of linear equations in the ${l}_{\infty }$ norm, i.e., minimizes $\underset{i}{\mathrm{max}}\phantom{\rule{0.25em}{0ex}}\left|{r}_{i}\left(x\right)\right|$, with ${r}_{i}$ as above.
Chapter E05 contains routines for global minimization.
Section 2.4.5 describes how a multi-objective optimization problem might be addressed using routines from this chapter and from Chapter E05.

### 3.7  Choosing Between Variant Routines for Some Problems

As evidenced by the wide variety of routines available in Chapter E04, it is clear that no single algorithm can solve all optimization problems. It is important to try to match the problem to the most suitable routine, and that is what the decision trees in Section 4 help to do.
Sometimes in Chapter E04 more than one routine is available to solve precisely the same minimization problem. Thus, for example, the general nonlinear programming routines E04UCF/E04UCA and E04WDF are based on similar methods. Experience shows that although both routines can usually solve the same problem and get similar results, sometimes one routine will be faster, sometimes one might find a different local minimum to the other, or, in difficult cases, one routine may obtain a solution when the other one fails.
After using one of these routines, if the results obtained are unacceptable for some reason, it may be worthwhile trying the other routine instead. In the absence of any other information, in the first instance you are recommended to try using E04UCF/E04UCA, and if that proves unsatisfactory, try using E04WDF. Although the algorithms used are very similar, the two routines each have slightly different optional parameters which may allow the course of the computation to be altered in different ways.
Other pairs of routines which solve the same kind of problem are E04NQF (recommended first choice) or E04NKF/E04NKA, for sparse quadratic or linear programming problems, and E04UGF/E04UGA or E04VHF, for sparse nonlinear programming. In these cases the parameter lists are not so similar as E04UCF/E04UCA or E04WDF, but the same considerations apply.

## 4  Decision Trees

### Tree 1: Selection chart for unconstrained problems

 Only one variable? Are first derivatives available? E04BBF yes yes no no E04ABF Does the function have many discontinuities? E04CBF yes no Is store size a problem? E04DGF yes no Is the function a sum of squares? Are you an experienced user? Are first derivatives available? Are second derivatives available? E04HEF yes yes yes yes no no no no Are there more than ten variables? E04GBF yes no E04GDF E04FCF Are first derivatives available? Are second derivatives available? E04HYF yes yes no no Are there more than ten variables? E04GYF yes no E04GZF E04FYF Are you an experienced user? Are first derivatives available? Are second derivatives available? E04LBF yes yes yes no no no Is computational cost critical? E04UCF, E04UFF, E04UGF, E04VHF or E04WDF yes no E04KDF E04UCF, E04UFF, E04UGF, E04VHF or E04WDF Are first derivatives available? Are second derivatives available? E04LYF yes yes no no Is computational cost critical? E04KYF yes no E04KZF E04JYF

### Tree 2: Selection chart for bound-constrained, linearly-constrained and nonlinearly-constrained problems

 Are there any nonlinear constraints? Is the objective function a sum of squares? (A least squares problem) E04USF yes yes no no Are the constraints sparse? E04UGF or E04VHF yes no E04UCF, E04UFF or E04WDF Is the objective function linear? (An LP problem) See Tree 3 yes no Is the objective function quadratic? (A $QP$ or least squares problem) Is the problem a least squares problem? Are the constraints simple bounds? E04PCF yes yes yes no no no E04NCF and E04PCF See Tree 3 Is the objective function a sum of squares? (A least squares problem) E04USF yes no Are the constraints simple bounds? Are you an experienced user? Are first derivatives available? Are second derivatives available? E04LBF yes yes yes yes no no no no E04KDF, E04UCF, E04UFF, E04UGF or E04VHF E04UCF, E04UFF, E04UGF, E04VHF or E04WDF Are first derivatives available? Are second derivatives available? E04LYF yes yes no no Is computational cost critical? E04KYF yes no E04KZF E04JCF or E04JYF E04UCF, E04UFF, E04UGF or E04VHF

### Tree 3: Linear and Quadratic Programming (LP and QP)

 Is the linear constraint matrix sparse? E04NQF, E04NKF, E04VHF and E04UGF yes no Is the objective function linear (an LP problem)? E04MFF yes no Is the QP problem convex? E04NCF yes no E04NFF

## 5  Functionality Index

 Constrained minimum of a sum of squares, nonlinear constraints,
 using function values and optionally first derivatives, sequential QP method,
 dense E04USF
 Convex QP problem or linearly-constrained linear least squares problem (dense) E04NCF
 Linear least squares with bounds on the variables E04PCF
 Linear programming (LP) problem (dense) E04MFF
 LP or QP problem (sparse) E04NKF
 LP or QP problem (sparse) (recommended – see Section 3.7) E04NQF
 Minimum, function of one variable,
 using first derivative E04BBF
 using function values only E04ABF
 Minimum, function of several variables, nonlinear constraints,
 using function values and optionally first derivatives, sequential QP method,
 dense E04WDF
 dense (recommended – see Section 3.7) E04UCF
 sparse E04VHF
 sparse (recommended – see Section 3.7) E04UGF
 Minimum, function of several variables, nonlinear constraints (comprehensive),
 using function values and optionally first derivatives, sequential QP method,
 reverse communication (dense) E04UFF
 Minimum, function of several variables, simple bounds,
 using first and second derivatives, modified Newton algorithm E04LBF
 Minimum, function of several variables, simple bounds (comprehensive),
 using first derivatives, modified Newton algorithm E04KDF
 Minimum, function of several variables, simple bounds (easy-to-use),
 using first and second derivatives, modified Newton algorithm E04LYF
 using first derivatives,
 modified Newton algorithm E04KZF
 quasi-Newton algorithm E04KYF
 using function values only, by quadratic approximation E04JCF
 using function values only, quasi-Newton algorithm E04JYF
 Quadratic programming (QP) problem (dense) E04NFF
 Service routines,
 check user's routine for calculating,
 first derivatives of function E04HCF
 Hessian of a sum of squares E04YBF
 Jacobian of first derivatives E04YAF
 second derivatives of function E04HDF
 convert MPS data file defining LP or QP problem to format required by E04NQF (recommended) E04MXF
 convert MPSX data file defining LP or QP problem to format required by E04NQF E04MZF
 covariance matrix for nonlinear least squares problem E04YCF
 determine Jacobian sparsity structure before a call of E04VHF E04VJF
 estimate gradient and/or Hessian of a function E04XAF
 initialization routine for,
 E04DGA, E04MFA, E04NCA, E04NFA, E04UFA, E04UGA and E04USA E04WBF
 E04NQF E04NPF
 E04VHF E04VGF
 E04WDF E04WCF
 retrieve integer optional parameter values used by,
 E04NQF E04NXF
 E04VHF E04VRF
 E04WDF E04WKF
 retrieve real optional parameter values used by,
 E04NQF E04NYF
 E04VHF E04VSF
 E04WDF E04WLF
 supply integer optional parameter values to,
 E04NQF E04NTF
 E04VHF E04VMF
 E04WDF E04WGF
 supply optional parameter values from external file for,
 E04DGF/E04DGA E04DJF
 E04MFF/E04MFA E04MGF
 E04NCF/E04NCA E04NDF
 E04NFF/E04NFA E04NGF
 E04NKF/E04NKA E04NLF
 E04NQF E04NRF
 E04UCF/E04UCA E04UDF
 E04UGF/E04UGA E04UHF
 E04USF/E04USA E04UQF
 E04VHF E04VKF
 E04WDF E04WEF
 supply optional parameter values to,
 E04DGF/E04DGA E04DKF
 E04MFF/E04MFA E04MHF
 E04NCF/E04NCA E04NEF
 E04NFF/E04NFA E04NHF
 E04NKF/E04NKA E04NMF
 E04NQF E04NSF
 E04UCF/E04UCA E04UEF
 E04UGF/E04UGA E04UJF
 E04USF/E04USA E04URF
 E04VHF E04VLF
 E04WDF E04WFF
 supply real optional parameter values to,
 E04NQF E04NUF
 E04VHF E04VNF
 E04WDF E04WHF
 Unconstrained minimum, function of several variables,
 using first derivatives, pre-conditioned conjugate gradient algorithm E04DGF
 using function values only, simplex algorithm E04CBF
 Unconstrained minimum of a sum of squares (comprehensive):
 using first derivatives,
 combined Gauss–Newton and modified Newton algorithm E04GDF
 combined Gauss–Newton and quasi-Newton algorithm E04GBF
 using function values only,
 combined Gauss–Newton and modified Newton algorithm E04FCF
 using second derivatives,
 combined Gauss–Newton and modified Newton algorithm E04HEF
 Unconstrained minimum of a sum of squares (easy-to-use):
 using first derivatives,
 combined Gauss–Newton and modified Newton algorithm E04GZF
 combined Gauss–Newton and quasi-Newton algorithm E04GYF
 using function values only,
 combined Gauss–Newton and modified Newton algorithm E04FYF
 using second derivatives,
 combined Gauss–Newton and modified Newton algorithm E04HYF

## 6  Auxiliary Routines Associated with Library Routine Parameters

 E04CBK nagf_opt_uncon_simplex_dummy_monitSee the description of the parameter MONIT in E04CBF. E04FCV nagf_opt_lsq_uncon_quasi_deriv_comp_lsqlin_funSee the description of the parameter LSQLIN in E04GBF. E04FDZ nagf_opt_lsq_dummy_lsqmonSee the description of the parameter LSQMON in E04FCF, E04GDF and E04HEF. E04HEV nagf_opt_lsq_uncon_quasi_deriv_comp_lsqlin_derivSee the description of the parameter LSQLIN in E04GBF. E04JCP nagf_opt_bounds_bobyqa_func_dummy_monfunSee the description of the parameter MONFUN in E04JCF. E54NFU nagf_opt_qp_dense_sample_qphessSee the description of the parameter QPHESS in E04NFF/E04NFA and H02CBF. E04NFU nagf_opt_qp_dense_sample_qphess_oldSee the description of the parameter QPHESS in E04NFF/E04NFA and H02CBF. E54NKU nagf_opt_qpconvex1_sparse_dummy_qphxSee the description of the parameter QPHX in E04NKF/E04NKA and H02CEF. E04NKU nagf_opt_qpconvex1_sparse_dummy_qphx_oldSee the description of the parameter QPHX in E04NKF/E04NKA and H02CEF. E04NSH nagf_opt_qpconvex2_sparse_dummy_qphxSee the description of the parameter QPHX in E04NQF. E04UDM nagf_opt_nlp1_dummy_confunSee the description of the parameter CONFUN in E04UCF/E04UCA and E04USF/E04USA. E04UGN nagf_opt_nlp1_sparse_dummy_objfunSee the description of the parameter OBJFUN in E04UGF/E04UGA. E04UGM nagf_opt_nlp1_sparse_dummy_confunSee the description of the parameter CONFUN in E04UGF/E04UGA. E04WDP nagf_opt_nlp2_dummy_confunSee the description of the parameter CONFUN in E04WDF.

## 7  Routines Withdrawn or Scheduled for Withdrawal

The following lists all those routines that have been withdrawn since Mark 18 of the Library or are scheduled for withdrawal at one of the next two marks.
 WithdrawnRoutine Mark ofWithdrawal Replacement Routine(s) E04CCF/E04CCA 24 E04CBF E04FDF 19 E04FYF E04GCF 19 E04GYF E04GEF 19 E04GZF E04HFF 19 E04HYF E04JAF 19 E04JYF E04KAF 19 E04KYF E04KCF 19 E04KZF E04LAF 19 E04LYF E04MBF 18 E04MFF/E04MFA E04NAF 18 E04NFF/E04NFA E04UNF 22 E04USF/E04USA E04UPF 19 E04USF/E04USA E04ZCF/E04ZCA 24 No longer required

## 8  References

Bard Y (1974) Nonlinear Parameter Estimation Academic Press
Dantzig G B (1963) Linear Programming and Extensions Princeton University Press
Fletcher R (1987) Practical Methods of Optimization (2nd Edition) Wiley
Gill P E and Murray W (ed.) (1974) Numerical Methods for Constrained Optimization Academic Press
Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press
Murray W (ed.) (1972) Numerical Methods for Unconstrained Optimization Academic Press
Wolberg J R (1967) Prediction Analysis Van Nostrand

E04 Chapter Contents