f11gr:: Large Scale Linear Systems (NAG Toolbox)

nag_sparse_complex_herm_basic_setup (f11gr) is a setup function which must be called before the iterative solver nag_sparse_complex_herm_basic_solver (f11gs). nag_sparse_complex_herm_basic_diag (f11gt), the third function in the suite, can be used to return additional information about the computation. Either of two methods can be used:

1.	Conjugate Gradient Method (CG). For this method (see Hestenes and Stiefel (1952), Golub and Van Loan (1996), Barrett et al. (1994) and Dias da Cunha and Hopkins (1994)), the matrix $A$ should ideally be positive definite. The application of the Conjugate Gradient method to indefinite matrices may lead to failure or to lack of convergence.
2.	Lanczos Method (SYMMLQ). This method, based upon the algorithm SYMMLQ (see Paige and Saunders (1975) and Barrett et al. (1994)), is suitable for both positive definite and indefinite matrices. It is more robust than the Conjugate Gradient method but less efficient when $A$ is positive definite.

Both CG and SYMMLQ methods start from the residual $r_{0} = b - A x_{0}$, where $x_{0}$ is an initial estimate for the solution (often $x_{0} = 0$), and generate an orthogonal basis for the Krylov subspace $\mathrm{span}\{A^{k} r_{0}\}$, for $k = 0, 1, \dots$, by means of three-term recurrence relations (see Golub and Van Loan (1996)). A sequence of real symmetric tridiagonal matrices $\{T_{k}\}$ is also generated. Here and in the following, the index $k$ denotes the iteration count. The resulting real symmetric tridiagonal systems of equations are usually more easily solved than the original problem. A sequence of solution iterates $\{x_{k}\}$ is thus generated such that the sequence of the norms of the residuals $\{\|r_{k}\|\}$ converges to a required tolerance. Note that, in general, the convergence is not monotonic.
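
The iteration described above can be illustrated with a minimal, unpreconditioned CG loop in plain MATLAB. This is only a sketch under stated assumptions: the small test matrix, the right-hand side and all variable names are hypothetical, and the loop is a textbook formulation, not the code used inside nag_sparse_complex_herm_basic_solver (f11gs).

```matlab
% Textbook CG sketch for a Hermitian positive definite system (illustrative only).
n = 5;
% Hypothetical test matrix: Hermitian and strictly diagonally dominant with a
% positive diagonal, hence positive definite.
A = 4*eye(n) + diag((-1+1i)*ones(n-1,1), 1) + diag((-1-1i)*ones(n-1,1), -1);
b = A*ones(n,1);                 % exact solution is a vector of ones

x   = zeros(n,1);                % initial estimate x_0 = 0
r   = b - A*x;                   % initial residual r_0
p   = r;                         % first search direction
rho = real(r'*r);                % ' is the conjugate transpose
resnorm = zeros(n,1);            % residual norms, kept for inspection

for k = 1:n
    q     = A*p;                 % the only place where A is needed
    alpha = rho / real(p'*q);
    x     = x + alpha*p;         % solution iterate x_k
    r     = r - alpha*q;         % residual iterate r_k
    resnorm(k) = norm(r);
    rho_new = real(r'*r);
    p     = r + (rho_new/rho)*p; % short recurrence for the next direction
    rho   = rho_new;
end
```

Inspecting `resnorm` shows how the residual norm evolves from one iteration to the next; for CG it need not decrease at every step.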

Faster convergence can be achieved using a preconditioner (see Golub and Van Loan (1996) and Barrett et al. (1994)). A preconditioner maps the original system of equations onto a different system, say

$$\bar{A} \bar{x} = \bar{b}, \tag{1}$$

with, hopefully, better characteristics with respect to its speed of convergence: for example, the condition number of the matrix of the coefficients can be improved or eigenvalues in its spectrum can be made to coalesce. An orthogonal basis for the Krylov subspace $\mathrm{span}\{\bar{A}^{k} \bar{r}_{0}\}$, for $k = 0, 1, \dots$, is generated and the solution proceeds as outlined above. The algorithms used are such that the solution and residual iterates of the original system are produced, not their preconditioned counterparts. Note that an unsuitable preconditioner or no preconditioning at all may result in a very slow rate, or lack, of convergence. However, preconditioning involves a trade-off between the reduction in the number of iterations required for convergence and the additional computational costs per iteration. Also, setting up a preconditioner may involve non-negligible overheads.

A preconditioner must be Hermitian and positive definite, i.e., representable by $M = E E^{H}$, where $M$ is nonsingular, and such that $\bar{A} = E^{-1} A E^{-H} \sim I_{n}$ in (1), where $I_{n}$ is the identity matrix of order $n$. Also, we can define $\bar{r} = E^{-1} r$ and $\bar{x} = E^{H} x$. These are formal definitions, used only in the design of the algorithms; in practice, only the means to compute the matrix-vector products $v = A u$ and to solve the preconditioning equations $M v = u$ are required, that is, explicit information about $M$, $E$ or their inverses is not required at any stage.
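
The point that only the two operations $v = A u$ and $M v = u$ are needed can be sketched with a small preconditioned CG loop driven by two function handles. This is an illustrative sketch, not the library's implementation: the Jacobi (diagonal) preconditioner, the test matrix and the names `applyA` and `solveM` are assumptions introduced here.

```matlab
% Preconditioned CG sketch using only v = A*u and the solve M*v = u.
n = 6;
d = 4 + 2*(0:n-1)';                          % hypothetical positive diagonal
A = diag(d) + diag((-1+1i)*ones(n-1,1), 1) + diag((-1-1i)*ones(n-1,1), -1);
x_true = (1:n)' + 1i*(n:-1:1)';
b = A*x_true;

applyA = @(u) A*u;                           % matrix-vector product v = A*u
solveM = @(u) u ./ d;                        % solves M*v = u with M = diag(d),
                                             % i.e. E = diag(sqrt(d)), never formed

tol = 1e-10;  maxit = 50;
x = zeros(n,1);
r = b - applyA(x);
z = solveM(r);
p = z;
rho = real(r'*z);
for k = 1:maxit
    q     = applyA(p);
    alpha = rho / real(p'*q);
    x     = x + alpha*p;                     % iterate of the ORIGINAL system
    r     = r - alpha*q;                     % residual of the ORIGINAL system
    if norm(r) <= tol*norm(b)
        break;
    end
    z       = solveM(r);
    rho_new = real(r'*z);
    p       = z + (rho_new/rho)*p;
    rho     = rho_new;
end
```

Note that `x` and `r` are the solution and residual of the original system throughout; the factor $E$ appears only in the comments.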

The first termination criterion

$$\|r_{k}\|_{p} \leq \tau \left( \|b\|_{p} + \|A\|_{p} \times \|x_{k}\|_{p} \right) \tag{2}$$

is available for both conjugate gradient and Lanczos (SYMMLQ) methods. In (2), $p = 1$, $\infty$ or $2$ and $\tau$ denotes a user-specified tolerance subject to $\max(10, \sqrt{n})\,\varepsilon \leq \tau < 1$, where $\varepsilon$ is the machine precision. Facilities are provided for the estimation of the norm of the matrix of the coefficients $\|A\|_{1} = \|A\|_{\infty}$, when this is not known in advance, used in (2), by applying Higham's method (see Higham (1988)). Note that $\|A\|_{2}$ cannot be estimated internally. This criterion uses an error bound derived from backward error analysis to ensure that the computed solution is the exact solution of a problem as close to the original as the termination tolerance requires. Termination criteria employing bounds derived from forward error analysis could be used, but any such criteria would require information about the condition number $\kappa(A)$ which is not easily obtainable.
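
As a hedged illustration, criterion (2) and the bounds on $\tau$ can be evaluated directly in plain MATLAB for a small dense problem. The matrix, the iterate `x_k` and the variable names are hypothetical; in the suite itself the test is applied internally by the solver, and the dense `norm(A, p)` call below stands in for the norm that is either supplied or estimated.

```matlab
% Evaluate termination criterion (2) for a given iterate (illustrative only).
n   = 5;
A   = 4*eye(n) + diag((-1+1i)*ones(n-1,1), 1) + diag((-1-1i)*ones(n-1,1), -1);
b   = A*ones(n,1);
x_k = A\b;                          % hypothetical iterate: a direct solve
tau = 1e-8;                         % user-specified tolerance
p   = 1;                            % norm choice: 1, Inf or 2

% The tolerance must satisfy max(10, sqrt(n))*eps <= tau < 1.
tau_min = max(10, sqrt(n))*eps;
tau_ok  = (tau >= tau_min) && (tau < 1);

% For Hermitian A the 1-norm and infinity-norm coincide.
anorm_1   = norm(A, 1);
anorm_inf = norm(A, Inf);

r_k = b - A*x_k;
lhs = norm(r_k, p);                                  % left-hand side of (2)
rhs = tau*(norm(b, p) + norm(A, p)*norm(x_k, p));    % right-hand side of (2)
converged = (lhs <= rhs);
```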

The second termination criterion

$$\|\bar{r}_{k}\|_{2} \leq \tau \max\left(1.0, \|b\|_{2} / \|r_{0}\|_{2}\right) \left( \|\bar{r}_{0}\|_{2} + \sigma_{1}(\bar{A}) \times \|\Delta \bar{x}_{k}\|_{2} \right) \tag{3}$$

is available only for the Lanczos method (SYMMLQ). In (3), $\sigma_{1}(\bar{A}) = \|\bar{A}\|_{2}$ is the largest singular value of the (preconditioned) iteration matrix $\bar{A}$. This termination criterion monitors the progress of the solution of the preconditioned system of equations and is less expensive to apply than criterion (2). When $\sigma_{1}(\bar{A})$ is not supplied, facilities are provided for its estimation by $\sigma_{1}(\bar{A}) \sim \max_{k} \sigma_{1}(T_{k})$. The interlacing property $\sigma_{1}(T_{k-1}) \leq \sigma_{1}(T_{k})$ and Gerschgorin's theorem provide lower and upper bounds from which $\sigma_{1}(T_{k})$ can be easily computed by bisection. Alternatively, the less expensive estimate $\sigma_{1}(\bar{A}) \sim \max_{k} \|T_{k}\|_{1}$ can be used, where $\sigma_{1}(\bar{A}) \leq \|T_{k}\|_{1}$ by Gerschgorin's theorem. Note that only order of magnitude estimates are required by the termination criterion.
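
The two estimates of $\sigma_{1}$ can be compared on a small example by running a few steps of the Hermitian Lanczos recurrence and inspecting the leading blocks $T_{k}$. This is a sketch only: the test matrix, the starting vector and the variable names are assumptions, no preconditioner is applied (so $\bar{A} = A$), and dense `norm` calls replace the bisection used in practice.

```matlab
% Lanczos sketch: build T_k and compare sigma_1(T_k) with the cheaper ||T_k||_1.
n = 8;  m = 5;                              % m Lanczos steps, m < n
A = diag(2*(1:n)) + diag((1-1i)*ones(n-1,1), 1) + diag((1+1i)*ones(n-1,1), -1);
r0 = ones(n,1);                             % hypothetical initial residual

T = zeros(m);                               % real symmetric tridiagonal matrix
sig_exact = zeros(m,1);                     % sigma_1(T_k)
sig_cheap = zeros(m,1);                     % ||T_k||_1
v_prev = zeros(n,1);
v      = r0/norm(r0);
beta_prev = 0;
for k = 1:m
    w      = A*v - beta_prev*v_prev;        % three-term recurrence
    alpha  = real(v'*w);                    % diagonal entry (real for Hermitian A)
    w      = w - alpha*v;
    beta   = norm(w);                       % off-diagonal entry
    T(k,k) = alpha;
    if k > 1
        T(k,k-1) = beta_prev;
        T(k-1,k) = beta_prev;
    end
    Tk = T(1:k,1:k);
    sig_exact(k) = norm(Tk, 2);             % largest singular value of T_k
    sig_cheap(k) = norm(Tk, 1);             % upper bound on sigma_1(T_k)
    v_prev = v;
    v      = w/beta;
    beta_prev = beta;
end
sigma_est = max(sig_exact);                 % estimate of sigma_1(A)
```

In this sketch `sig_exact` is nondecreasing in `k`, as the interlacing property states, and each `sig_cheap(k)` bounds the corresponding `sig_exact(k)` from above at the cost of a column-sum computation.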

Termination criterion (2) is the recommended choice, despite its (small) additional costs per iteration when using the Lanczos method (SYMMLQ). Also, if the norm of the initial estimate is much larger than the norm of the solution, that is, if $\|x_{0}\| \gg \|x\|$, a dramatic loss of significant digits could result in complete lack of convergence. The use of criterion (2) will enable the detection of such a situation, and the iteration will be restarted at a suitable point. No such restart facilities are provided for criterion (3).

The sequence of calls to the functions comprising the suite is enforced: first, the setup function nag_sparse_complex_herm_basic_setup (f11gr) must be called, followed by the solver nag_sparse_complex_herm_basic_solver (f11gs). nag_sparse_complex_herm_basic_diag (f11gt) can be called either when nag_sparse_complex_herm_basic_solver (f11gs) is carrying out a monitoring step or after nag_sparse_complex_herm_basic_solver (f11gs) has completed its tasks. Incorrect sequencing will raise an error condition.

References

Parameters

Compulsory Input Parameters

Optional Input Parameters

Output Parameters

Error Indicators and Warnings

Accuracy

Further Comments

When $\sigma_{1}(\bar{A})$ is not supplied (sigmax $\leq 0.0$) but it is required, it is estimated by nag_sparse_complex_herm_basic_solver (f11gs) using either of the two methods described in Description, as specified by the argument sigcmp. In particular, if sigcmp $=$ 'S', then the computation of $\sigma_{1}(\bar{A})$ is deemed to have converged when three successive values of $\sigma_{1}(T_{k})$ differ, in a relative sense, by less than the tolerance sigtol, i.e., when

$$\max\left( \frac{|\sigma_{1}^{(k)} - \sigma_{1}^{(k-1)}|}{\sigma_{1}^{(k)}}, \frac{|\sigma_{1}^{(k)} - \sigma_{1}^{(k-2)}|}{\sigma_{1}^{(k)}} \right) \leq \text{sigtol},$$

where $\sigma_{1}^{(k)} = \sigma_{1}(T_{k})$. The computation of $\sigma_{1}(\bar{A})$ is also terminated when the iteration count exceeds the maximum value allowed, i.e., $k \geq$ maxits.
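
The convergence test above can be written out as a short loop. The sequence of estimates in `sig` is invented for illustration, and the loop is a sketch of the stated test rather than the code used inside the solver; only the names sigtol and maxits come from the text.

```matlab
% Relative-change test on three successive estimates of sigma_1(T_k).
sig    = [3.10; 3.90; 3.9990; 3.99995; 3.99996];   % hypothetical estimates
sigtol = 1e-3;                                     % relative tolerance
maxits = 10;                                       % iteration limit

converged_at = 0;                                  % 0 means "not converged"
for k = 3:numel(sig)
    rel = max(abs(sig(k) - sig(k-1)), abs(sig(k) - sig(k-2))) / sig(k);
    if rel <= sigtol
        converged_at = k;                          % test satisfied at step k
        break;
    end
    if k >= maxits
        break;                                     % give up: limit reached
    end
end
```

With these numbers the relative changes at steps 3 and 4 are still above sigtol, and the test is first satisfied at step 5.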

Example

function f11gr_example


fprintf('f11gr example results\n\n');

% Solve sparse Hermitian system Ax = b using CG method with
% Incomplete Cholesky preconditioning (IC)

% Define A and b 
n  = int64(9);
nz = int64(23);
a    = complex(zeros(3*nz,1));
irow = zeros(3*nz, 1, 'int64');
icol = zeros(3*nz, 1, 'int64');
a(1:nz) = [ 6 + 0i;         -1 + 1i; 6 + 0i;         0 + 1i; 5 + 0i;
            5 + 0i;          2 - 2i; 4 + 0i;         1 + 1i; 2 + 0i; 6 + 0i;
           -4 + 3i; 0 + 1i; -1 + 0i; 6 + 0i;        -1 - 1i; 0 - 1i; 9 + 0i; 
            1 + 3i; 1 + 2i; -1 + 0i; 1 + 4i; 9 + 0i];

irow(1:nz) = [1; 2;2; 3;3;  4; 5;5; 6;6;6;  7;7;7;7; 8;8;8;  9;9;9;9;9];
icol(1:nz) = [1; 1;2; 2;3;  4; 1;5; 3;4;6;  2;5;6;7; 4;6;8;  1;5;6;8;9];

b = [ 8 + 54i; -10 - 92i; 25 + 27i;
     26 - 28i;  54 + 12i; 26 - 22i;
     47 + 65i;  71 - 57i; 60 + 70i];

% Setup IC factorization
lfill  = int64(0);
dtol   = 0;
mic    = 'N';
dscale = 0;
ipiv   = zeros(n, 1, 'int64');

[a, irow, icol, ipiv, istr, nnzc, npivm, ifail] = ...
  f11jn( ...
         nz, a, irow, icol, lfill, dtol, mic, dscale, ipiv);

% Iterative method setup
method = 'CG    ';
precon = 'Preconditioned';
tol    = (x02aj)^(3/8);
maxitn = int64(20);
anorm  = 0;
sigmax = 0;
maxits = int64(9);
monit  = int64(2);

[lwreq, work, ifail] = ...
  f11gr( ...
         method, precon, int64(n), tol, maxitn, anorm, sigmax, ...
         maxits, monit, 'sigcmp', 's', 'norm_p', '1');

% Reverse communication loop calling f11gs
irevcm = int64(0);
u      = complex(zeros(n,1));
v      = b;
wgt    = zeros(n,1);

while (irevcm ~= 4)
  [irevcm, u, v, work, ifail] = ...
    f11gs( ...
           irevcm, u, v, wgt, work);

  if (irevcm == 1)
    % v = Au
    [v, ifail] = f11xs( ...
                        a(1:nz), irow(1:nz), icol(1:nz), 'N', u);
  elseif (irevcm == 2)
    % Solve (IC)v = u
    [v, ifail] = f11jp( ...
                        a, irow, icol, ipiv, istr, 'N', u);
  elseif (irevcm == 3)
    % Monitoring
    [itn, stplhs, stprhs, anorm, sigmax, its, sigerr, ifail] = ...
    f11gt(work);
    fprintf('\nMonitoring at iteration number %2d\n',itn);
    fprintf('residual norm:              %14.4e\n', stplhs);
    fprintf('\n   Solution Vector\n');
    disp(u);
    fprintf('\n   Residual Vector\n');
    disp(v);
  end
end

% Get information about the computation
[itn, stplhs, stprhs, anorm, sigmax, its, sigerr, ifail] = ...
f11gt(work);

fprintf('\nNumber of iterations for convergence:     %4d\n', itn);
fprintf('Residual norm:                           %14.4e\n', stplhs);
fprintf('Right-hand side of termination criteria: %14.4e\n', stprhs);
fprintf('i-norm of matrix a:                      %14.4e\n', anorm);
fprintf('\n   Solution Vector\n');
disp(u);
fprintf('\n   Residual Vector\n');
disp(v);

f11gr example results


Monitoring at iteration number  2
residual norm:                  1.4937e+01

   Solution Vector
   0.2142 + 4.5333i
  -1.6589 -12.6722i
   2.4101 + 7.4551i
   4.4400 - 6.4174i
   9.1135 + 3.7812i
   4.4419 - 4.0382i
   1.4757 + 1.2662i
   8.4872 - 3.5347i
   5.9948 + 0.9685i


   Residual Vector
  -1.8370 + 3.6956i
  -0.6501 + 0.2546i
  -0.1262 - 0.1362i
  -0.1312 + 0.1413i
  -1.1471 + 0.7339i
  -0.5505 - 1.0535i
   1.7165 - 1.4614i
  -0.3583 + 0.2876i
  -0.3028 - 0.3532i


Monitoring at iteration number  4
residual norm:                  1.4602e+00

   Solution Vector
   1.0061 + 8.9847i
   1.9637 - 7.9768i
   3.0067 + 7.0285i
   3.9830 - 5.9636i
   5.0390 + 5.0432i
   6.0488 - 4.0771i
   6.9710 + 3.0168i
   8.0118 - 1.9806i
   9.0074 + 0.9646i


   Residual Vector
   0.0115 - 0.0282i
   0.0135 - 0.1734i
   0.0182 + 0.0196i
   0.0189 - 0.0204i
  -0.0909 - 0.1090i
  -0.2389 + 0.3244i
   0.1903 - 0.0155i
   0.0516 - 0.0414i
   0.0436 + 0.0509i


Number of iterations for convergence:        5
Residual norm:                               9.0594e-14
Right-hand side of termination criteria:     2.8433e-03
i-norm of matrix a:                          2.2000e+01

   Solution Vector
   1.0000 + 9.0000i
   2.0000 - 8.0000i
   3.0000 + 7.0000i
   4.0000 - 6.0000i
   5.0000 + 5.0000i
   6.0000 - 4.0000i
   7.0000 + 3.0000i
   8.0000 - 2.0000i
   9.0000 + 1.0000i


   Residual Vector
   1.0e-13 *

  -0.0178 + 0.0000i
   0.0355 - 0.2842i
  -0.0355 + 0.0355i
   0.0355 - 0.0711i
  -0.0711 + 0.0355i
  -0.0711 + 0.0000i
   0.0000 + 0.0000i
   0.0000 - 0.0711i
   0.0000 - 0.1421i
