NAG Library Routine Document
f11grf
(complex_herm_basic_setup)
1
Purpose
f11grf is a setup routine, the first in a suite of three routines for the iterative solution of a complex Hermitian system of simultaneous linear equations.
f11grf must be called before
f11gsf, the iterative solver. The third routine in the suite,
f11gtf, can be used to return additional information about the computation.
These three routines are suitable for the solution of large sparse complex Hermitian systems of equations.
2
Specification
Fortran Interface
Subroutine f11grf (method, precon, sigcmp, norm, weight, iterm, n, tol, &
                   maxitn, anorm, sigmax, sigtol, maxits, monit, lwreq, &
                   work, lwork, ifail)
Integer, Intent (In) :: iterm, n, maxitn, maxits, monit, lwork
Integer, Intent (Inout) :: ifail
Integer, Intent (Out) :: lwreq
Real (Kind=nag_wp), Intent (In) :: tol, anorm, sigmax, sigtol
Complex (Kind=nag_wp), Intent (Out) :: work(lwork)
Character (*), Intent (In) :: method
Character (1), Intent (In) :: precon, sigcmp, norm, weight

C Header Interface
#include <nagmk26.h>
void 
f11grf_ (
const char *method,
const char *precon,
const char *sigcmp,
const char *norm,
const char *weight,
const Integer *iterm,
const Integer *n,
const double *tol,
const Integer *maxitn,
const double *anorm,
const double *sigmax,
const double *sigtol,
const Integer *maxits,
const Integer *monit,
Integer *lwreq,
Complex work[],
const Integer *lwork,
Integer *ifail,
const Charlen length_method,
const Charlen length_precon,
const Charlen length_sigcmp,
const Charlen length_norm,
const Charlen length_weight) 

3
Description
The suite consisting of the
routines
f11grf,
f11gsf and
f11gtf
is designed to solve the complex Hermitian system of simultaneous linear equations
$Ax=b$ of order
$n$, where
$n$ is large and the matrix of the coefficients
$A$ is sparse.
f11grf is a setup routine which must be called before the iterative solver
f11gsf.
f11gtf, the third routine in the suite, can be used to return additional information about the computation. Either of two methods can be used: the conjugate gradient method (CG) or the Lanczos method (SYMMLQ).
Both CG and SYMMLQ methods start from the residual
${r}_{0}=b-A{x}_{0}$, where
${x}_{0}$ is an initial estimate for the solution (often
${x}_{0}=0$), and generate an orthogonal basis for the Krylov subspace
$\mathrm{span}\left\{{A}^{\mathit{k}}{r}_{0}\right\}$, for
$\mathit{k}=0,1,\dots $, by means of three-term recurrence relations (see
Golub and Van Loan (1996)). A sequence of real symmetric tridiagonal matrices
$\left\{{T}_{k}\right\}$ is also generated. Here and in the following, the index
$k$ denotes the iteration count. The resulting real symmetric tridiagonal systems of equations are usually more easily solved than the original problem. A sequence of solution iterates
$\left\{{x}_{k}\right\}$ is thus generated such that the sequence of the norms of the residuals
$\left\{\Vert {r}_{k}\Vert \right\}$ converges to a required tolerance. Note that, in general, the convergence is not monotonic.
In exact arithmetic, after $n$ iterations, this process is equivalent to an orthogonal reduction of $A$ to real symmetric tridiagonal form, ${T}_{n}={Q}^{\mathrm{H}}AQ$; the solution ${x}_{n}$ would thus achieve exact convergence. In finite-precision arithmetic, cancellation and round-off errors accumulate, causing loss of orthogonality. These methods must therefore be viewed as genuinely iterative methods, able to converge to a solution within a prescribed tolerance.
The orthogonal basis is not formed explicitly in either method. The basic difference between the two methods lies in the method of solution of the resulting real symmetric tridiagonal systems of equations: the conjugate gradient method is equivalent to carrying out an $LD{L}^{\mathrm{H}}$ (Cholesky) factorization whereas the Lanczos method (SYMMLQ) uses an $LQ$ factorization.
Faster convergence can be achieved using a
preconditioner (see
Golub and Van Loan (1996) and
Barrett et al. (1994)). A preconditioner maps the original system of equations onto a different system, say
$\bar{A}\bar{x}=\bar{b}$, (1)
with, hopefully, better characteristics with respect to its speed of convergence: for example, the condition number of the matrix of the coefficients can be improved or eigenvalues in its spectrum can be made to coalesce. An orthogonal basis for the Krylov subspace
$\mathrm{span}\left\{{\bar{A}}^{\mathit{k}}{\bar{r}}_{0}\right\}$, for
$\mathit{k}=0,1,\dots $, is generated and the solution proceeds as outlined above. The algorithms used are such that the solution and residual iterates of the original system are produced, not their preconditioned counterparts. Note that an unsuitable preconditioner or no preconditioning at all may result in a very slow rate, or lack, of convergence. However, preconditioning involves a trade-off between the reduction in the number of iterations required for convergence and the additional computational costs per iteration. Also, setting up a preconditioner may involve non-negligible overheads.
A preconditioner must be
Hermitian and positive definite, i.e., representable by
$M=E{E}^{\mathrm{H}}$, where
$M$ is nonsingular, and such that
$\bar{A}={E}^{-1}A{E}^{-\mathrm{H}}\sim {I}_{n}$ in
(1), where
${I}_{n}$ is the identity matrix of order
$n$. Also, we can define
$\bar{r}={E}^{-1}r$ and
$\bar{x}={E}^{\mathrm{H}}x$. These are formal definitions, used only in the design of the algorithms; in practice, only the means to compute the matrix-vector products
$v=Au$ and to solve the preconditioning equations
$Mv=u$ are required, that is, explicit information about
$M$,
$E$ or their inverses is not required at any stage.
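The first termination criterion
${\Vert {r}_{k}\Vert}_{p}\le \tau \left({\Vert b\Vert}_{p}+{\Vert A\Vert}_{p}{\Vert {x}_{k}\Vert}_{p}\right)$, (2)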
is available for both conjugate gradient and Lanczos (SYMMLQ) methods. In
(2),
$p=1,\infty \text{ or }2$ and
$\tau $ denotes a user-specified tolerance subject to
$\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(10,\sqrt{n}\right)\epsilon \le \tau <1$, where
$\epsilon $ is the
machine precision. Facilities are provided for the estimation of the norm of the matrix of the coefficients
${\Vert A\Vert}_{1}={\Vert A\Vert}_{\infty}$, when this is not known in advance, used in
(2), by applying Higham's method (see
Higham (1988)). Note that
${\Vert A\Vert}_{2}$ cannot be estimated internally. This criterion uses an error bound derived from
backward error analysis to ensure that the computed solution is the exact solution of a problem as close to the original as the termination tolerance requires. Termination criteria employing bounds derived from
forward error analysis could be used, but any such criteria would require information about the condition number
$\kappa \left(A\right)$ which is not easily obtainable.
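As an illustration of a backward-error test of this kind, the sketch below assumes that criterion (2) has the usual form $\Vert r_k\Vert_p \le \tau(\Vert b\Vert_p + \Vert A\Vert_p \Vert x_k\Vert_p)$. The function name `criterion2_satisfied` and its arguments are hypothetical; the norms are assumed to have been computed already in the chosen $p$-norm.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when ||r_k||_p <= tau * (||b||_p + ||A||_p * ||x_k||_p).
   All norms are supplied by the caller and must be non-negative. */
bool criterion2_satisfied(double rnorm, double bnorm, double anorm,
                          double xnorm, double tau)
{
    assert(rnorm >= 0.0 && bnorm >= 0.0 && anorm >= 0.0 && xnorm >= 0.0);
    return rnorm <= tau * (bnorm + anorm * xnorm);
}
```

Only an order-of-magnitude value of $\Vert A\Vert_p$ is needed for such a test to be useful, which is why an internal estimate is acceptable when the norm is not known in advance.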
The second termination criterion
is available only for the Lanczos method (SYMMLQ). In
(3),
${\sigma}_{1}\left(\bar{A}\right)={\Vert \bar{A}\Vert}_{2}$ is the largest singular value of the (preconditioned) iteration matrix
$\bar{A}$. This termination criterion monitors the progress of the solution of the preconditioned system of equations and is less expensive to apply than criterion
(2). When
${\sigma}_{1}\left(\bar{A}\right)$ is not supplied, facilities are provided for its estimation by
${\sigma}_{1}\left(\bar{A}\right)\sim {\displaystyle \underset{k}{\mathrm{max}}}\phantom{\rule{0.25em}{0ex}}{\sigma}_{1}\left({T}_{k}\right)$. The interlacing property
${\sigma}_{1}\left({T}_{k-1}\right)\le {\sigma}_{1}\left({T}_{k}\right)$ and Gerschgorin's theorem provide lower and upper bounds from which
${\sigma}_{1}\left({T}_{k}\right)$ can be easily computed by bisection. Alternatively, the less expensive estimate
${\sigma}_{1}\left(\bar{A}\right)\sim {\displaystyle \underset{k}{\mathrm{max}}}\phantom{\rule{0.25em}{0ex}}{\Vert {T}_{k}\Vert}_{1}$ can be used, where
${\sigma}_{1}\left(\bar{A}\right)\le {\Vert {T}_{k}\Vert}_{1}$ by Gerschgorin's theorem. Note that only order of magnitude estimates are required by the termination criterion.
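The inexpensive estimate relies on the 1-norm of a real symmetric tridiagonal matrix, which is cheap to evaluate. The following sketch shows one way to compute it; the function name `tridiag_norm1` and the storage layout (diagonal `d`, off-diagonal `e`) are assumptions made for illustration and do not describe how the suite stores ${T}_{k}$.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* 1-norm (maximum absolute column sum) of the k-by-k real symmetric
   tridiagonal matrix with diagonal d[0..k-1] and off-diagonal e[0..k-2].
   e is not accessed when k == 1. */
double tridiag_norm1(size_t k, const double *d, const double *e)
{
    assert(k > 0);
    double best = 0.0;
    for (size_t j = 0; j < k; ++j) {
        double s = fabs(d[j]);
        if (j > 0)     s += fabs(e[j - 1]); /* entry above the diagonal */
        if (j + 1 < k) s += fabs(e[j]);     /* entry below the diagonal */
        if (s > best) best = s;
    }
    return best;
}
```

The cost is linear in $k$, in contrast with bisection, whose cost grows with the iteration count.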
Termination criterion
(2) is the recommended choice, despite its (small) additional costs per iteration when using the Lanczos method (SYMMLQ). Also, if the norm of the initial estimate is much larger than the norm of the solution, that is, if
$\Vert {x}_{0}\Vert \gg \Vert x\Vert $, a dramatic loss of significant digits could result in complete lack of convergence. The use of criterion
(2) will enable the detection of such a situation, and the iteration will be restarted at a suitable point. No such restart facilities are provided for criterion
(3).
Optionally, a vector
$w$ of user-specified weights can be used in the computation of the vector norms in termination criterion
(2), i.e.,
${{\Vert v\Vert}_{p}}^{\left(w\right)}={\Vert {v}^{\left(w\right)}\Vert}_{p}$, where
${\left({v}^{\left(w\right)}\right)}_{\mathit{i}}={w}_{\mathit{i}}{v}_{\mathit{i}}$, for
$\mathit{i}=1,2,\dots ,n$. Note that the use of weights increases the computational costs.
The sequence of calls to the routines comprising the suite is enforced: first, the setup routine
f11grf must be called, followed by the solver
f11gsf.
f11gtf can be called either when
f11gsf is carrying out a monitoring step or after
f11gsf has completed its tasks. Incorrect sequencing will raise an error condition.
4
References
Barrett R, Berry M, Chan T F, Demmel J, Donato J, Dongarra J, Eijkhout V, Pozo R, Romine C and Van der Vorst H (1994) Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods SIAM, Philadelphia
Dias da Cunha R and Hopkins T (1994) PIM 1.1 — the parallel iterative method package for systems of linear equations user's guide — Fortran 77 version Technical Report Computing Laboratory, University of Kent at Canterbury, Kent, UK
Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hestenes M and Stiefel E (1952) Methods of conjugate gradients for solving linear systems J. Res. Nat. Bur. Stand. 49 409–436
Higham N J (1988) FORTRAN codes for estimating the onenorm of a real or complex matrix, with applications to condition estimation ACM Trans. Math. Software 14 381–396
Paige C C and Saunders M A (1975) Solution of sparse indefinite systems of linear equations SIAM J. Numer. Anal. 12 617–629
5
Arguments
 1: $\mathbf{method}$ – Character(*) Input

On entry: the iterative method to be used.
 ${\mathbf{method}}=\text{'CG'}$
 Conjugate gradient method.
 ${\mathbf{method}}=\text{'SYMMLQ'}$
 Lanczos method (SYMMLQ).
Constraint:
${\mathbf{method}}=\text{'CG'}$ or $\text{'SYMMLQ'}$.
 2: $\mathbf{precon}$ – Character(1) Input

On entry: determines whether preconditioning is used.
 ${\mathbf{precon}}=\text{'N'}$
 No preconditioning.
 ${\mathbf{precon}}=\text{'P'}$
 Preconditioning.
Constraint:
${\mathbf{precon}}=\text{'N'}$ or $\text{'P'}$.
 3: $\mathbf{sigcmp}$ – Character(1) Input

On entry: determines whether an estimate of
${\sigma}_{1}\left(\bar{A}\right)={\Vert {E}^{-1}A{E}^{-\mathrm{H}}\Vert}_{2}$, the largest singular value of the preconditioned matrix of the coefficients, is to be computed using the bisection method on the sequence of tridiagonal matrices
$\left\{{T}_{k}\right\}$ generated during the iteration. Note that
$\bar{A}=A$ when a preconditioner is not used.
If
${\mathbf{sigmax}}>0.0$ (see below), i.e., when
${\sigma}_{1}\left(\bar{A}\right)$ is supplied, the value of
sigcmp is ignored.
 ${\mathbf{sigcmp}}=\text{'S'}$
 ${\sigma}_{1}\left(\bar{A}\right)$ is to be computed using the bisection method.
 ${\mathbf{sigcmp}}=\text{'N'}$
 The bisection method is not used.
If the termination criterion
(3) is used, requiring
${\sigma}_{1}\left(\bar{A}\right)$, an inexpensive estimate is computed and used (see
Section 3).
Suggested value:
${\mathbf{sigcmp}}=\text{'N'}$.
Constraint:
${\mathbf{sigcmp}}=\text{'S'}$ or $\text{'N'}$.
 4: $\mathbf{norm}$ – Character(1) Input

On entry: defines the matrix and vector norm to be used in the termination criteria.
 ${\mathbf{norm}}=\text{'1'}$
 Use the ${l}_{1}$ norm.
 ${\mathbf{norm}}=\text{'I'}$
 Use the ${l}_{\infty}$ norm.
 ${\mathbf{norm}}=\text{'2'}$
 Use the ${l}_{2}$ norm.
Suggested values:
 if ${\mathbf{iterm}}=1$, ${\mathbf{norm}}=\text{'I'}$;
 if ${\mathbf{iterm}}=2$, ${\mathbf{norm}}=\text{'2'}$.
Constraints:
 if ${\mathbf{iterm}}=1$, ${\mathbf{norm}}=\text{'1'}$, $\text{'I'}$ or $\text{'2'}$;
 if ${\mathbf{iterm}}=2$, ${\mathbf{norm}}=\text{'2'}$.
 5: $\mathbf{weight}$ – Character(1) Input

On entry: specifies whether a vector
$w$ of user-supplied weights is to be used in the vector norms used in the computation of termination criterion
(2) (
${\mathbf{iterm}}=1$):
${{\Vert v\Vert}_{p}}^{\left(w\right)}={\Vert {v}^{\left(w\right)}\Vert}_{p}$, where
${v}_{\mathit{i}}^{\left(w\right)}={w}_{\mathit{i}}{v}_{\mathit{i}}$, for
$\mathit{i}=1,2,\dots ,n$. The suffix
$p=1,2,\infty $ denotes the vector norm used, as specified by the argument
norm. Note that weights cannot be used when
${\mathbf{iterm}}=2$, i.e., when criterion
(3) is used.
 ${\mathbf{weight}}=\text{'W'}$
 User-supplied weights are to be used and must be supplied on initial entry to f11gsf.
 ${\mathbf{weight}}=\text{'N'}$
 All weights are implicitly set equal to one. Weights do not need to be supplied on initial entry to f11gsf.
Suggested value:
${\mathbf{weight}}=\text{'N'}$.
Constraints:
 if ${\mathbf{iterm}}=1$, ${\mathbf{weight}}=\text{'W'}$ or $\text{'N'}$;
 if ${\mathbf{iterm}}=2$, ${\mathbf{weight}}=\text{'N'}$.
 6: $\mathbf{iterm}$ – Integer Input

On entry: defines the termination criterion to be used.
 ${\mathbf{iterm}}=1$
 Use the termination criterion defined in (2) (both conjugate gradient and Lanczos (SYMMLQ) methods).
 ${\mathbf{iterm}}=2$
 Use the termination criterion defined in (3) (Lanczos method (SYMMLQ) only).
Suggested value:
${\mathbf{iterm}}=1$.
Constraints:
 if ${\mathbf{method}}=\text{'CG'}$, ${\mathbf{iterm}}=1$;
 if ${\mathbf{method}}=\text{'SYMMLQ'}$, ${\mathbf{iterm}}=1$ or $2$.
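The cross-constraints between method, iterm, norm and weight stated so far can be collected in a small pre-flight check. This is a sketch only: the function name `options_consistent` is hypothetical, it accepts only the exact upper-case strings shown in this document, and it does not replace the argument checking performed by f11grf itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when method, iterm, norm and weight satisfy the documented
   constraints:
     iterm = 1: method 'CG' or 'SYMMLQ'; norm '1', 'I' or '2'; weight 'W' or 'N'
     iterm = 2: method 'SYMMLQ' only;    norm '2';             weight 'N'      */
bool options_consistent(const char *method, int iterm, char norm, char weight)
{
    assert(method != NULL);
    bool cg = strcmp(method, "CG") == 0;
    bool symmlq = strcmp(method, "SYMMLQ") == 0;
    if (!cg && !symmlq) return false;

    if (iterm == 1) {
        bool norm_ok = norm == '1' || norm == 'I' || norm == '2';
        bool weight_ok = weight == 'W' || weight == 'N';
        return norm_ok && weight_ok;
    }
    if (iterm == 2) {
        return symmlq && norm == '2' && weight == 'N';
    }
    return false; /* iterm must be 1 or 2 */
}
```

A check of this kind makes it possible to report an inconsistent combination before the setup routine is called.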
 7: $\mathbf{n}$ – Integer Input

On entry: $n$, the order of the matrix $A$.
Constraint:
${\mathbf{n}}>0$.
 8: $\mathbf{tol}$ – Real (Kind=nag_wp) Input

On entry: the tolerance
$\tau $ for the termination criterion.
If ${\mathbf{tol}}\le 0.0$, $\tau =\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(\sqrt{\epsilon},\sqrt{n}\epsilon \right)$ is used, where $\epsilon $ is the machine precision.
Otherwise $\tau =\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{tol}},10\epsilon ,\sqrt{n}\epsilon \right)$ is used.
Constraint:
${\mathbf{tol}}<1.0$.
 9: $\mathbf{maxitn}$ – Integer Input

On entry: the maximum number of iterations.
Constraint:
${\mathbf{maxitn}}>0$.
 10: $\mathbf{anorm}$ – Real (Kind=nag_wp) Input

On entry: if
${\mathbf{anorm}}>0.0$, the value of
${\Vert A\Vert}_{p}$ to be used in the termination criterion
(2) (
${\mathbf{iterm}}=1$).
If
${\mathbf{anorm}}\le 0.0$,
${\mathbf{iterm}}=1$ and
${\mathbf{norm}}=\text{'1'}$ or
$\text{'I'}$,
${\Vert A\Vert}_{1}={\Vert A\Vert}_{\infty}$ is estimated internally by
f11gsf.
If
${\mathbf{iterm}}=2$,
anorm is not referenced.
Constraint:
if ${\mathbf{iterm}}=1$ and ${\mathbf{norm}}=\text{'2'}$, ${\mathbf{anorm}}>0.0$.
 11: $\mathbf{sigmax}$ – Real (Kind=nag_wp) Input

On entry: if
${\mathbf{sigmax}}>0.0$, the value of
${\sigma}_{1}\left(\bar{A}\right)={\Vert {E}^{-1}A{E}^{-\mathrm{H}}\Vert}_{2}$.
If
${\mathbf{sigmax}}\le 0.0$,
${\sigma}_{1}\left(\bar{A}\right)$ is estimated by
f11gsf when either
${\mathbf{sigcmp}}=\text{'S'}$ or termination criterion
(3) (
${\mathbf{iterm}}=2$) is employed, though it will be used only in the latter case.
Otherwise,
sigmax is not referenced.
 12: $\mathbf{sigtol}$ – Real (Kind=nag_wp) Input

On entry: the tolerance used in assessing the convergence of the estimate of
${\sigma}_{1}\left(\bar{A}\right)={\Vert \bar{A}\Vert}_{2}$ when the bisection method is used.
If ${\mathbf{sigtol}}\le 0.0$, the default value ${\mathbf{sigtol}}=0.01$ is used. The actual value used is $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{sigtol}},\epsilon \right)$.
If
${\mathbf{sigcmp}}=\text{'N'}$ or
${\mathbf{sigmax}}>0.0$,
sigtol is not referenced.
Suggested value:
${\mathbf{sigtol}}=0.01$ should be sufficient in most cases.
Constraint:
if ${\mathbf{sigcmp}}=\text{'S'}$ and ${\mathbf{sigmax}}\le 0.0$, ${\mathbf{sigtol}}<1.0$.
 13: $\mathbf{maxits}$ – Integer Input

On entry: the maximum iteration number
$k={\mathbf{maxits}}$ for which
${\sigma}_{1}\left({T}_{k}\right)$ is computed by bisection (see also
Section 3). If
${\mathbf{sigcmp}}=\text{'N'}$ or
${\mathbf{sigmax}}>0.0$,
maxits is not referenced.
Suggested value:
${\mathbf{maxits}}=\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(10,n\right)$ when
sigtol is of the order of its default value
$\left(0.01\right)$.
Constraint:
if ${\mathbf{sigcmp}}=\text{'S'}$ and ${\mathbf{sigmax}}\le 0.0$, $1\le {\mathbf{maxits}}\le {\mathbf{maxitn}}$.
 14: $\mathbf{monit}$ – Integer Input

On entry: if
${\mathbf{monit}}>0$, the frequency at which a monitoring step is executed by
f11gsf: the current solution and residual iterates will be returned by
f11gsf and a call to
f11gtf made possible every
monit iterations, starting from iteration number
monit. Otherwise, no monitoring takes place. There are some additional computational costs involved in monitoring the solution and residual vectors when the Lanczos method (SYMMLQ) is used.
Constraint:
${\mathbf{monit}}\le {\mathbf{maxitn}}$.
 15: $\mathbf{lwreq}$ – Integer Output

On exit: the minimum amount of workspace required by
f11gsf. (See also
Section 5 in
f11gsf.)
 16: $\mathbf{work}\left({\mathbf{lwork}}\right)$ – Complex (Kind=nag_wp) array Communication Array

On exit: the array
work is initialized by
f11grf. It must
not be modified before calling the next routine in the suite, namely
f11gsf.
 17: $\mathbf{lwork}$ – Integer Input

On entry: the dimension of the array
work as declared in the (sub)program from which
f11grf is called.
Constraint:
${\mathbf{lwork}}\ge 120$.
Note: although the minimum value of
lwork ensures the correct functioning of
f11grf, a larger value is required by the other routines in the suite, namely
f11gsf and
f11gtf. The required value is as follows:
Method    Requirements
CG        ${\mathbf{lwork}}=120+5n+p$
SYMMLQ    ${\mathbf{lwork}}=120+6n+p$
where
 $p=2\times \left({\mathbf{maxits}}+1\right)$, when an estimate of ${\sigma}_{1}\left(A\right)$ (sigmax) is computed;
 $p=0$, otherwise.
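The workspace requirement above can be evaluated before the arrays are allocated. The sketch below is illustrative only: the function name `required_lwork` and the flag `sigma_estimated` are hypothetical, and the value returned in lwreq by f11grf remains the authoritative figure.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Workspace length needed by the whole suite:
     CG     : 120 + 5n + p
     SYMMLQ : 120 + 6n + p
   with p = 2*(maxits+1) when an estimate of sigma_1 is computed, else p = 0.
   Returns -1 for an unrecognised method string. */
long required_lwork(const char *method, long n, long maxits,
                    bool sigma_estimated)
{
    assert(method != NULL && n > 0);
    long p = sigma_estimated ? 2 * (maxits + 1) : 0;
    if (strcmp(method, "CG") == 0)     return 120 + 5 * n + p;
    if (strcmp(method, "SYMMLQ") == 0) return 120 + 6 * n + p;
    return -1;
}
```

For $n=1000$ with the conjugate gradient method and no estimate of ${\sigma}_{1}$, this gives $120+5000=5120$ complex elements.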
 18: $\mathbf{ifail}$ – Integer Input/Output

On entry:
ifail must be set to
$0$,
$-1\text{ or }1$. If you are unfamiliar with this argument you should refer to
Section 3.4 in How to Use the NAG Library and its Documentation for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value
$-1\text{ or }1$ is recommended. If the output of error messages is undesirable, then the value
$1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is
$0$.
When the value $-\mathbf{1}\text{ or }\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit:
${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see
Section 6).
6
Error Indicators and Warnings
If on entry
${\mathbf{ifail}}=0$ or
$-1$, explanatory error messages are output on the current error message unit (as defined by
x04aaf).
Errors or warnings detected by the routine:
 ${\mathbf{ifail}}=-i$

On entry, the $i$th argument had an illegal value.
 ${\mathbf{ifail}}=1$

f11grf has been called out of sequence.
 ${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please
contact
NAG.
See
Section 3.9 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See
Section 3.8 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See
Section 3.7 in How to Use the NAG Library and its Documentation for further information.
7
Accuracy
Not applicable.
8
Parallelism and Performance
f11grf is not threaded in any implementation.
9
Further Comments
When
${\sigma}_{1}\left(\bar{A}\right)$ is not supplied (
${\mathbf{sigmax}}\le 0.0$) but it is required, it is estimated by
f11gsf using either of the two methods described in
Section 3, as specified by the argument
sigcmp. In particular, if
${\mathbf{sigcmp}}=\text{'S'}$, then the computation of
${\sigma}_{1}\left(\bar{A}\right)$ is deemed to have converged when three successive values of
${\sigma}_{1}\left({T}_{k}\right)$ differ, in a relative sense, by less than the tolerance
sigtol.
The computation of
${\sigma}_{1}\left(\bar{A}\right)$ is also terminated when the iteration count exceeds the maximum value allowed, i.e.,
$k\ge {\mathbf{maxits}}$.
Bisection is increasingly expensive with increasing iteration count. A reasonably large value of
sigtol, of the order of the suggested value, is recommended and an excessive value of
maxits should be avoided. Under these conditions,
${\sigma}_{1}\left(\bar{A}\right)$ usually converges within very few iterations.
10
Example
This example solves a complex Hermitian system of simultaneous linear equations using the conjugate gradient method, where the matrix of the coefficients
$A$, has a random sparsity pattern. An incomplete Cholesky preconditioner is used (
f11jaf and
f11jbf).
10.1
Program Text
Program Text (f11grfe.f90)
10.2
Program Data
Program Data (f11grfe.d)
10.3
Program Results
Program Results (f11grfe.r)