NAG Toolbox: nag_opt_uncon_conjgrd_comp (e04dg)

Purpose

nag_opt_uncon_conjgrd_comp (e04dg) minimizes an unconstrained nonlinear function of several variables using a pre-conditioned, limited memory quasi-Newton conjugate gradient method. First derivatives (or an ‘acceptable’ finite difference approximation to them) are required. It is intended for use on large scale problems.

Syntax

[iter, objf, objgrd, x, user, lwsav, iwsav, rwsav, ifail] = e04dg(objfun, x, lwsav, iwsav, rwsav, 'n', n, 'user', user)
[iter, objf, objgrd, x, user, lwsav, iwsav, rwsav, ifail] = nag_opt_uncon_conjgrd_comp(objfun, x, lwsav, iwsav, rwsav, 'n', n, 'user', user)
Before calling nag_opt_uncon_conjgrd_comp (e04dg), or the option setting function nag_opt_uncon_conjgrd_option_string (e04dk), function nag_opt_init (e04wb) must be called.
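A minimal sketch of this calling sequence is shown below (the option string is illustrative only, and objfun is assumed to be defined as in Section [Example]):

x = [-1; 1];                                            % initial estimate of the solution
[cwsav, lwsav, iwsav, rwsav, ifail] = e04wb('e04dg');   % initialize the communication arrays
[lwsav, iwsav, rwsav, inform] = e04dk('Print Level = 1', lwsav, iwsav, rwsav);   % optional
[iter, objf, objgrd, x, user, lwsav, iwsav, rwsav, ifail] = ...
    e04dg(@objfun, x, lwsav, iwsav, rwsav);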

Description

nag_opt_uncon_conjgrd_comp (e04dg) is designed to solve unconstrained minimization problems of the form
minimize F(x) over x ∈ R^n,  subject to  −∞ ≤ x ≤ ∞,
where x is an n-element vector.
You must supply an initial estimate of the solution.
For maximum reliability, it is preferable to provide all first partial derivatives. If all of the derivatives cannot be provided, you are recommended to obtain approximate values (using finite differences) by calling nag_opt_estimate_deriv (e04xa) from within objfun.
The method used by nag_opt_uncon_conjgrd_comp (e04dg) is described in Section [Algorithmic Details].

References

Gill P E and Murray W (1979) Conjugate-gradient methods for large-scale nonlinear optimization Technical Report SOL 79-15 Department of Operations Research, Stanford University
Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press

Parameters

Compulsory Input Parameters

1:     objfun – function handle or string containing name of m-file
objfun must calculate the objective function F(x), and possibly its gradient as well, for a specified n-element vector x.
[mode, objf, objgrd, user] = objfun(mode, n, x, nstate, user)

Input Parameters

1:     mode – int64 scalar
Indicates which values must be assigned during each call of objfun. Only the following values need be assigned:
mode = 0
objf.
mode = 2
objf and objgrd.
2:     n – int64 scalar
n, the number of variables.
3:     x(n) – double array
x, the vector of variables at which the objective function and its gradient are to be evaluated.
4:     nstate – int64 scalar
Will be 1 on the first call of objfun by nag_opt_uncon_conjgrd_comp (e04dg), and 0 for all subsequent calls. Thus, you may wish to test nstate within objfun in order to perform certain calculations once only. For example, you may read data or initialize global variables when nstate = 1 (see the sketch at the end of this list of compulsory input parameters).
5:     user – Any MATLAB object
objfun is called from nag_opt_uncon_conjgrd_comp (e04dg) with the object supplied to nag_opt_uncon_conjgrd_comp (e04dg).

Output Parameters

1:     mode – int64 scalar
May be set to a negative value if you wish to terminate the solution to the current problem, and in this case nag_opt_uncon_conjgrd_comp (e04dg) will terminate with ifail set to mode.
2:     objf – double scalar
The value of the objective function at x.
3:     objgrd(n) – double array
If mode = 2, objgrd(i) must contain the value of ∂F/∂xi evaluated at x, for i = 1,2,…,n.
4:     user – Any MATLAB object
Note:  objfun should be tested separately before being used in conjunction with nag_opt_uncon_conjgrd_comp (e04dg). See also the description of the optional parameter Verify.
2:     x(n) – double array
n, the dimension of the array, must satisfy the constraint n > 0.
An initial estimate of the solution.
3:     lwsav(120) – logical array
4:     iwsav(610) – int64 array
5:     rwsav(475) – double array
The arrays lwsav, iwsav and rwsav must not be altered between calls to any of the functions nag_opt_uncon_conjgrd_comp (e04dg), nag_opt_uncon_conjgrd_option_string (e04dk) or nag_opt_init (e04wb).
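As an illustration of the calling protocol described above, the following is a minimal sketch of an objfun that performs its one-time set-up when nstate = 1 and supplies the gradient only when mode = 2. The quadratic objective and the coefficient vector c are invented purely for this sketch:

function [mode, objf, objgrd, user] = objfun(mode, n, x, nstate, user)
   % Illustrative objective F(x) = sum(c.*x.^2); c is set up once, on the first call.
   persistent c
   if nstate == 1
      c = (1:double(n))';           % one-time initialization (first call only)
   end
   objf = sum(c .* x.^2);           % objective value: required for mode = 0 and mode = 2
   objgrd = zeros(double(n), 1);
   if mode == 2
      objgrd = 2 * c .* x;          % gradient: required only when mode = 2
   end
end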

Optional Input Parameters

1:     n – int64 scalar
Default: the dimension of the array x.
n, the number of variables.
Constraint: n > 0.
2:     user – Any MATLAB object
user is not used by nag_opt_uncon_conjgrd_comp (e04dg), but is passed to objfun. Note that for large objects it may be more efficient to use a global variable which is accessible from the m-files than to use user.
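For example (a sketch only; the struct data and its field A are invented for illustration), problem data can be packed into a MATLAB struct and passed through to objfun via user:

data.A = eye(10);                % hypothetical problem data
[iter, objf, objgrd, xOut, dataOut, lwsav, iwsav, rwsav, ifail] = ...
    e04dg(@objfun, x, lwsav, iwsav, rwsav, 'user', data);
% Inside objfun the same struct arrives as the user argument, e.g.
%   objf = 0.5 * x' * (user.A * x);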

Input Parameters Omitted from the MATLAB Interface

iwork work iuser ruser

Output Parameters

1:     iter – int64 scalar
The total number of iterations performed.
2:     objf – double scalar
The value of the objective function at the final iterate.
3:     objgrd(n) – double array
The gradient of the objective function at the final iterate (or its finite difference approximation).
4:     x(n) – double array
The final estimate of the solution.
5:     user – Any MATLAB object
6:     lwsav(120) – logical array
7:     iwsav(610) – int64 array
8:     rwsav(475) – double array
9:     ifail – int64 scalar
ifail = 0 unless the function detects an error (see [Error Indicators and Warnings]).
nag_opt_uncon_conjgrd_comp (e04dg) returns with ifail = 0 if the following three conditions are satisfied:
(i) Fk−1 − Fk < τF (1 + |Fk|)
(ii) ||xk−1 − xk|| < sqrt(τF) (1 + ||xk||)
(iii) ||gk|| ≤ τF^(1/3) (1 + |Fk|)  or  ||gk|| < εA
where τF is the value of the optional parameter Optimality Tolerance (default value = ε^0.8) and εA is the absolute error associated with computing the objective function.
For a full discussion on termination criteria see Chapter 8 of Gill et al. (1981).
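The three tests can be applied directly to returned results. The fragment below is a hedged sketch only: Fk, Fprev, xk, xprev and gk are hypothetical names for the objective values, iterates and final gradient recorded from the last two iterations (for example, by recording them inside objfun), tauF is the Optimality Tolerance and epsA an estimate of the absolute error in computing F:

test1 = (Fprev - Fk) < tauF * (1 + abs(Fk));
test2 = norm(xprev - xk) < sqrt(tauF) * (1 + norm(xk));
test3 = norm(gk) <= tauF^(1/3) * (1 + abs(Fk)) || norm(gk) < epsA;
acceptable = test1 && test2 && test3;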

Error Indicators and Warnings

Note: nag_opt_uncon_conjgrd_comp (e04dg) may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

W ifail < 0
A negative value of ifail indicates an exit from nag_opt_uncon_conjgrd_comp (e04dg) because you set mode < 0 in objfun. The value of ifail will be the same as your setting of mode.
  ifail = 1
Not used by this function.
  ifail = 2
Not used by this function.
  ifail = 3
The limiting number of iterations (as determined by the optional parameter Iteration Limit (default value = max(50, 5n))) has been reached.
If the algorithm appears to be making satisfactory progress, then optional parameter Iteration Limit may be too small. If so, increase its value and rerun nag_opt_uncon_conjgrd_comp (e04dg). If the algorithm seems to be making little or no progress, then you should check for incorrect gradients as described under ifail = 7.
  ifail = 4
The computed upper bound on the step length taken during the linesearch was too small. A rerun with an increased value of the optional parameter Maximum Step Length (ρ say) may be successful unless ρ ≥ 10^20 (the default value), in which case the current point cannot be improved upon.
  ifail = 5
Not used by this function.
W ifail = 6
The conditions for an acceptable solution (see parameter ifail in Section [Parameters]) have not all been met, but a lower point could not be found.
If objfun computes the objective function and its gradient correctly, then this may occur because an overly stringent accuracy has been requested, i.e., the value of the optional parameter Optimality Tolerance (default value = ε^0.8) is too small, or if αk ≃ 0. In this case you should apply the three tests described under ifail = 0 to determine whether or not the final solution is acceptable. For a discussion of attainable accuracy see Gill et al. (1981).
If many iterations have occurred in which essentially no progress has been made or nag_opt_uncon_conjgrd_comp (e04dg) has failed to move from the initial point, objfun may be incorrect. You should refer to the comments below under ifail = 7 and check the gradients using the optional parameter Verify (default value = 0). Unfortunately, there may be small errors in the objective gradients that cannot be detected by the verification process. Finite difference approximations to first derivatives are catastrophically affected by even small inaccuracies.
W ifail = 7
The user-supplied derivatives of the objective function appear to be incorrect.
Large errors were found in the derivatives of the objective function. This value of ifail will occur if the verification process indicated that at least one gradient element had no correct figures. You should refer to the printed output to determine which elements are suspected to be in error.
As a first step, you should check that the code for the objective values is correct – for example, by computing the function at a point where the correct value is known. However, care should be taken that the chosen point fully tests the evaluation of the function. It is remarkable how often the values x = 0 or x = 1 are used to test function evaluation procedures, and how often the special properties of these numbers make the test meaningless.
Special care should be used in this test if computation of the objective function involves subsidiary data communicated in global storage. Although the first evaluation of the function may be correct, subsequent calculations may be in error because some of the subsidiary data has accidentally been overwritten.
Errors in programming the function may be quite subtle in that the function value is almost correct. For example, the function may not be accurate to full precision because of the inaccurate calculation of a subsidiary quantity, or the limited accuracy of data upon which the function depends. A common error on machines where numerical calculations are usually performed in double precision is to include even one single precision constant in the calculation of the function; since some compilers do not convert such constants to double precision, half the correct figures may be lost by such a seemingly trivial error.
  ifail = 8
The gradient g = ∂F/∂x at the starting point x0 is ‘too small’. More precisely, the value of g(x0)^T g(x0) is less than εr |1 + F(x0)|, where εr is the value of the optional parameter Function Precision (default value = ε^0.9).
The problem should be rerun from a different starting point.
  ifail = 9
An input parameter is invalid.

Accuracy

On successful exit (ifail = 0) the accuracy of the solution will be as defined by the optional parameter Optimality Tolerance (default value = ε^0.8).

Further Comments

To evaluate an ‘acceptable’ set of finite difference intervals using nag_opt_estimate_deriv (e04xa) requires 2 function evaluations per variable for a well-scaled problem and up to 6 function evaluations per variable for a badly scaled problem.

Description of Printed Output

This section describes the intermediate printout and final printout produced by nag_opt_uncon_conjgrd_comp (e04dg). You can control the level of printed output (see the description of the optional parameter Print Level). Note that the intermediate printout and final printout are produced only if Print Level ≥ 10 (the default for nag_opt_uncon_conjgrd_comp (e04dg)), although by default no output is produced by nag_opt_uncon_conjgrd_comp (e04dg).
The following line of summary output (< 80 characters) is produced at every iteration. In all cases, the values of the quantities are those in effect on completion of the given iteration.
Itn is the iteration count.
Step is the step αk taken along the computed search direction. On reasonably well-behaved problems, the unit step (i.e., αk = 1) will be taken as the solution is approached.
Nfun is the cumulative number of evaluations of the objective function needed for the linesearch. Evaluations needed for the verification of the gradients by finite differences are not included. Nfun is printed as a guide to the amount of work required for the linesearch. nag_opt_uncon_conjgrd_comp (e04dg) will perform at most 11 function evaluations per iteration.
Objective is the value of the objective function at xk.
Norm G is the Euclidean norm of the gradient of the objective function at xk.
Norm X is the Euclidean norm of xk.
Norm (X(k-1)-X(k)) is the Euclidean norm of xk−1 − xk.
The following describes the printout for each variable.
Variable gives the name (Varbl) and index j, for j = 1,2,…,n, of the variable.
Value is the value of the variable at the final iteration.
Gradient Value is the value of the gradient of the objective function with respect to the jth variable at the final iteration.
Numerical values are output with a fixed number of digits; they are not guaranteed to be accurate to this precision.

Example

function nag_opt_uncon_conjgrd_comp_example
x = [-1;
     1];
[cwsav,lwsav,iwsav,rwsav,ifail] = nag_opt_init('nag_opt_uncon_conjgrd_comp');
[iter, objf, objgrd, xOut, user, lwsavOut, iwsavOut, rwsavOut, ifail] = ...
    nag_opt_uncon_conjgrd_comp(@objfun, x, lwsav, iwsav, rwsav);
 iter, objf, objgrd, xOut, ifail

function [mode, objf, objgrd, user] = objfun(mode, n, x, nstate, user)
   % Objective: F(x) = exp(x(1)) * (4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) + 2*x(2) + 1).
   expx1 = exp(x(1));
   objf = expx1*(4.0*x(1)^2 + 2.0*x(2)^2 + 4.0*x(1)*x(2) + 2.0*x(2) + 1.0);
   objgrd = zeros(2, 1);
   if (mode == 2)
      % Analytic first derivatives of F.
      objgrd(1) = 4.0*expx1*(2.0*x(1) + x(2)) + objf;
      objgrd(2) = 2.0*expx1*(2.0*x(2) + 2.0*x(1) + 1.0);
   end
 

iter =

                   10


objf =

   5.3083e-14


objgrd =

   1.0e-06 *

    0.9125
    0.8316


xOut =

    0.5000
   -1.0000


ifail =

                    0



Note: the remainder of this document is intended for more advanced users. Section [Algorithmic Details] contains a detailed description of the algorithm which may be needed in order to understand Section [Optional Parameters]. Section [Optional Parameters] describes the optional parameters which may be set by calls to nag_opt_uncon_conjgrd_option_string (e04dk).

Algorithmic Details

This section contains a description of the method used by nag_opt_uncon_conjgrd_comp (e04dg).
nag_opt_uncon_conjgrd_comp (e04dg) uses a pre-conditioned conjugate gradient method and is based upon algorithm PLMA as described in Section 4.8.3 of Gill and Murray (1979) and Gill et al. (1981).
The algorithm proceeds as follows:
Let x0 be a given starting point and let k denote the current iteration, starting with k = 0. The iteration requires gk, the gradient vector evaluated at xk, the kth estimate of the minimum. At each iteration a vector pk (known as the direction of search) is computed and the new estimate xk+1 is given by xk + αk pk, where αk (the step length) minimizes the function F(xk + αk pk) with respect to the scalar αk. A choice of initial step α0 is taken as
α0 = min{1, 2 × |Fk − Fest| / (gkT gk)}
where Fest is a user-supplied estimate of the function value at the solution. If Fest is not specified, the software always chooses the unit step length for α0. Subsequent step length estimates are computed using cubic interpolation with safeguards.
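In MATLAB notation (Fk, Fest and gk are illustrative names for the current function value, the user-supplied estimate and the gradient at the current point), the initial step is simply:

alpha0 = min(1, 2 * abs(Fk - Fest) / (gk' * gk));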
A quasi-Newton method can be used to compute the search direction pk by updating the inverse of the approximate Hessian Hk and computing
pk+1 = −Hk+1 gk+1.   (1)
The updating formula for the approximate inverse is given by
Hk+1 = Hk − (1/(ykT sk)) (Hk yk skT + sk ykT Hk) + (1/(ykT sk)) (1 + (ykT Hk yk)/(ykT sk)) sk skT,   (2)
where yk = gk+1 − gk and sk = xk+1 − xk = αk pk.
The method used to obtain the search direction is based upon computing pk+1 as −Hk+1 gk+1, where Hk+1 is a matrix obtained by updating the identity matrix with a limited number of quasi-Newton corrections. The storage of an n by n matrix is avoided by storing only the vectors that define the rank-two corrections – hence the term ‘limited-memory’ quasi-Newton method. The precise method depends upon the number of updating vectors stored. For example, the direction obtained with the ‘one-step’ limited memory update is given by (1) using (2) with Hk equal to the identity matrix, viz.
pk+1 = −gk+1 + (1/(ykT sk)) (skT gk+1 yk + ykT gk+1 sk) − ((skT gk+1)/(ykT sk)) (1 + (ykT yk)/(ykT sk)) sk.
Using a limited-memory quasi-Newton formula, such as the one above, guarantees pk+1 to be a descent direction if all the inner products ykT sk are positive for all vectors yk and sk used in the updating formula.
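The following MATLAB fragment (all variable names are hypothetical) evaluates the ‘one-step’ direction above from stored vectors sk and yk and the new gradient gnew, and checks the descent property just described:

ys = yk' * sk;                          % curvature inner product ykT sk
p  = -gnew + ((sk' * gnew) * yk + (yk' * gnew) * sk) / ys ...
     - ((sk' * gnew) / ys) * (1 + (yk' * yk) / ys) * sk;
isDescent = (gnew' * p < 0);            % expected to hold whenever ys > 0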

Optional Parameters

Several optional parameters in nag_opt_uncon_conjgrd_comp (e04dg) define choices in the problem specification or the algorithm logic. In order to reduce the number of formal parameters of nag_opt_uncon_conjgrd_comp (e04dg) these optional parameters have associated default values that are appropriate for most problems. Therefore, you need only specify those optional parameters whose values are to be different from their default values.
The remainder of this section can be skipped if you wish to use the default values for all optional parameters.
The following is a list of the optional parameters available. A full description of each optional parameter is provided in Section [Description of the optional parameters].
Optional parameters may be specified by calling nag_opt_uncon_conjgrd_option_string (e04dk) before a call to nag_opt_uncon_conjgrd_comp (e04dg).
nag_opt_uncon_conjgrd_option_string (e04dk) can be called to supply options directly, one call being necessary for each optional parameter. For example,
[lwsav, iwsav, rwsav, inform] = e04dk('Print Level = 1', lwsav, iwsav, rwsav);
nag_opt_uncon_conjgrd_option_string (e04dk) should be consulted for a full description of this method of supplying optional parameters.
All optional parameters not specified by you are set to their default values. Optional parameters specified by you are unaltered by nag_opt_uncon_conjgrd_comp (e04dg) (unless they define invalid values) and so remain in effect for subsequent calls unless altered by you.
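For example, several optional parameters can be set with successive calls to nag_opt_uncon_conjgrd_option_string (e04dk) before the call to nag_opt_uncon_conjgrd_comp (e04dg); the option values shown here are illustrative only:

[lwsav, iwsav, rwsav, inform] = e04dk('Iteration Limit = 500', lwsav, iwsav, rwsav);
[lwsav, iwsav, rwsav, inform] = e04dk('Optimality Tolerance = 1.0e-6', lwsav, iwsav, rwsav);
[lwsav, iwsav, rwsav, inform] = e04dk('Verify Level = 1', lwsav, iwsav, rwsav);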

Description of the Optional Parameters

For each option, we give a summary line, a description of the optional parameter and details of constraints.
The summary line contains the keyword, a parameter value where appropriate (i denotes an integer value and r a real value) and the default value, where the symbol ε is a generic notation for machine precision.
Keywords and character values are case and white space insensitive.
Defaults  
This special keyword may be used to reset all optional parameters to their default values.
Estimated Optimal Function Value  r
This value of r specifies the user-supplied guess of the optimum objective function value Fest. This value is used to calculate an initial step length α0 (see Section [Algorithmic Details]). If the value of r is not specified (the default), then this has the effect of setting α0 to unity. It should be noted that for badly scaled functions a unit step along the steepest descent direction will often compute the objective function at very large values of x.
Function Precision  r
Default = ε^0.9
This parameter defines εr, which is intended to be a measure of the accuracy with which the problem function F(x) can be computed. If r < ε or r ≥ 1, the default value is used.
The value of εr should reflect the relative precision of 1 + |F(x)|; i.e., εr acts as a relative precision when |F| is large, and as an absolute precision when |F| is small. For example, if F(x) is typically of order 1000 and the first six significant digits are known to be correct, an appropriate value for εr would be 10^−6. In contrast, if F(x) is typically of order 10^−4 and the first six significant digits are known to be correct, an appropriate value for εr would be 10^−10. The choice of εr can be quite complicated for badly scaled problems; see Chapter 8 of Gill et al. (1981) for a discussion of scaling techniques. The default value is appropriate for most simple functions that are computed with full accuracy. However, when the accuracy of the computed function values is known to be significantly worse than full precision, the value of εr should be large enough so that no attempt will be made to distinguish between function values that differ by less than the error inherent in the calculation.
Iteration Limit  i
Default = max(50, 5n)
Iters
Itns
The value of i specifies the maximum number of iterations allowed before termination. If i < 0, the default value is used.
Problems whose Hessian matrices at the solution contain sets of clustered eigenvalues are likely to be minimized in significantly fewer than n iterations. Problems without this property may require anything between n and 5n iterations, with approximately 2n iterations being a common figure for moderately difficult problems.
Linesearch Tolerance  r
Default = 0.9
The value of r controls the accuracy with which the step α taken during each iteration approximates a minimum of the function along the search direction (the smaller the value of r, the more accurate the linesearch). The default value r = 0.9 requests an inaccurate search, and is appropriate for most problems. A more accurate search may be appropriate when it is desirable to reduce the number of iterations – for example, if the objective function is cheap to evaluate. If r < 0 or r ≥ 1, the default value is used.
List
Default for e04dg = List
Nolist
Default for e04dg = Nolist
Normally each optional parameter specification is printed as it is supplied. Optional parameter Nolist may be used to suppress the printing and optional parameter List may be used to restore printing.
Maximum Step Length  r
Default = 10^20
If r > 0, the maximum allowable step length for the linesearch is taken as min(1/x02am(), r/||pk||). If r ≤ 0, the default value is used.
Optimality Tolerance  r
Default = εR^0.8
The parameter r specifies the accuracy to which you wish the final iterate to approximate a solution of the problem. Broadly speaking, r indicates the number of correct figures desired in the objective function at the solution. For example, if r is 10^−6 and termination occurs with ifail = 0 (see Section [Parameters]), then the final point satisfies the termination criteria, where τF represents Optimality Tolerance. If r < εr or r ≥ 1, the default value is used.
Print Level  i
The value of i controls the amount of printout produced by nag_opt_uncon_conjgrd_comp (e04dg), as indicated below. A detailed description of the printout is given in Section [Description of Printed Output] (summary output at each iteration and the final solution).
i    Output
00   No output.
01   The final solution only.
05   One line of summary output (< 80 characters; see Section [Description of Printed Output]) for each iteration (no printout of the final solution).
10   The final solution and one line of summary output for each iteration.
Start Objective Check at Variable  i1
Default = 1
Stop Objective Check at Variable  i2
Default = n
These keywords take effect only if Verify Level > 0. They may be used to control the verification of gradient elements computed by objfun. For example, if the first 30 elements of the objective gradient appeared to be correct in an earlier run, so that only element 31 remains questionable, it is reasonable to specify Start Objective Check at Variable = 31. If the first 30 variables appear linearly in the objective, so that the corresponding gradient elements are constant, the above choice would also be appropriate.
If i1 ≤ 0 or i1 > max(1, min(n, i2)), the default value is used. If i2 ≤ 0 or i2 > n, the default value is used.
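For the situation described above (checking only gradient element 31), a hedged sketch of the corresponding option-setting calls is:

[lwsav, iwsav, rwsav, inform] = e04dk('Verify Level = 1', lwsav, iwsav, rwsav);
[lwsav, iwsav, rwsav, inform] = e04dk('Start Objective Check at Variable = 31', lwsav, iwsav, rwsav);
[lwsav, iwsav, rwsav, inform] = e04dk('Stop Objective Check at Variable = 31', lwsav, iwsav, rwsav);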
Verify Level  i
Default = 0
Verify
Verify Gradients
Verify Objective Gradients
These keywords refer to finite difference checks on the gradient elements computed by objfun. Gradients are verified at the user-supplied initial estimate of the solution. The possible choices for i are as follows:
i    Meaning
−1   No checks are performed.
0    Only a ‘cheap’ test will be performed, requiring one call to objfun.
1    In addition to the ‘cheap’ test, individual gradient elements will also be checked using a reliable (but more expensive) test.
For example, the objective gradient will be verified if Verify, Verify = YES, Verify Gradients, Verify Objective Gradients or Verify Level = 1 is specified.


© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013