If $A$, a square matrix of order $n$, is nonsingular (has rank $n$), then its inverse $X$ exists and satisfies the equations $AX=XA=I$ (the identity or unit matrix).
It is worth noting that if $AX-I=R$, so that $R$ is the ‘residual’ matrix, then a bound on the relative error is given by $\Vert R\Vert $, i.e.,
A real matrix $A$ has no inverse if it is square ($n\times n$) and singular (has rank $\text{}<n$), or if it is of shape ($m\times n$) with $m\ne n$, but there is a Generalized or Pseudo-inverse${A}^{+}$ which satisfies the equations
where $Q$ is an $m\times m$ orthogonal matrix and $R$ is an $n\times n$, nonsingular, upper triangular matrix. The pseudo-inverse of $A$ is then given by
where $Q$ is an $n\times n$ orthogonal matrix and $R$ is an $m\times m$, nonsingular, upper triangular matrix. The pseudo-inverse of $A$ is then given by
$${A}^{+}=\stackrel{~}{Q}{R}^{-1}\text{,}$$
where $\stackrel{~}{Q}$ consists of the first $m$ columns of $Q$.
(c)if $m\ge n$ and $\mathrm{rank}\left(A\right)=r\le n$ then $A$ can be factorized using a $QR$ factorization, with column interchanges, as
where $Q$ is an $m\times m$ orthogonal matrix, $R$ is an $r\times n$ upper trapezoidal matrix and $P$ is an $n\times n$ permutation matrix. The pseudo-inverse of $A$ is then given by
where $\stackrel{~}{Q}$ consists of the first $r$ columns of $Q$.
(d)if $\mathrm{rank}\left(A\right)=r\le k=\mathrm{min}\phantom{\rule{0.125em}{0ex}}(m,n)$ then $A$ can be factorized as the singular value decomposition
$$A=U\Sigma {V}^{\mathrm{T}}\text{,}$$
where $U$ is an $m\times m$ orthogonal matrix, $V$ is an $n\times n$ orthogonal matrix and $\Sigma $ is an $m\times n$ diagonal matrix with non-negative diagonal elements $\sigma $. The first $k$ columns of $U$ and $V$ are the left- and right-hand singular vectors of $A$ respectively and the $k$ diagonal elements of $\Sigma $ are the singular values of $A$. $\Sigma $ may be chosen so that
If $\stackrel{~}{U}$ and $\stackrel{~}{V}$ consist of the first $r$ columns of $U$ and $V$ respectively and $\stackrel{~}{\Sigma}$ is an $r\times r$ diagonal matrix with diagonal elements ${\sigma}_{1},{\sigma}_{2},\dots ,{\sigma}_{r}$ then $A$ is given by
which is the classical eigenvalue (spectral) factorization of ${A}^{\mathrm{T}}A$.
(e)if $A$ is complex then the above relationships are still true if we use ‘unitary’ in place of ‘orthogonal’ and conjugate transpose in place of transpose. For example, the singular value decomposition of $A$ is
$$A=U\Sigma {V}^{\mathrm{H}}\text{,}$$
where $U$ and $V$ are unitary, ${V}^{\mathrm{H}}$ the conjugate transpose of $V$ and $\Sigma $ is as in (ii)(d) above.
2.2Matrix Factorizations
The routines in this section perform matrix factorizations which are required for the solution of systems of linear equations with various special structures. A few routines which perform associated computations are also included.
Other routines for matrix factorizations are to be found in Chapters F07, F08 and F11.
This section also contains a few routines associated with eigenvalue problems (see Chapter F02). (Historical note: this section used to contain many more such routines, but they have now been superseded by routines in Chapter F08.)
Finally, this section contains routines for computing non-negative matrix factorizations, which are used for dimensional reduction and classification in data analysis. Given a rectangular $m\times n$ matrix, $A$, with non-negative elements, a non-negative matrix factorization of $A$ is an approximate factorization of $A$ into the product of an $m\times k$ non-negative matrix $W$ and a $k\times n$ non-negative matrix $H$, so that $A\approx WH$. Typically $k$ is chosen so that $k\ll \mathrm{min}\phantom{\rule{0.125em}{0ex}}(m,n)$. The matrices $W$ and $H$ are then computed to minimize ${|A-WH|}_{F}$. The factorization is not unique.
2.3Matrix Arithmetic and Manipulation
The intention of routines in this section (f01c, f01d, f01v, and f01z) is to cater for some of the commonly occurring operations in matrix manipulation, i.e., transposing a matrix or adding part of one matrix to another, and for conversion between different storage formats,such as conversion between rectangular band matrix storage and packed band matrix storage. For vector or matrix-vector or matrix-matrix operations refer to Chapters F06 and F16.
2.4Matrix Functions
Given a square matrix $A$, the matrix function $f\left(A\right)$ is a matrix with the same dimensions as $A$ which provides a generalization of the scalar function $f$.
If $A$ has a full set of eigenvectors $V$ then $A$ can be factorized as
$$A=VD{V}^{-1}\text{,}$$
where $D$ is the diagonal matrix whose diagonal elements, ${d}_{i}$, are the eigenvalues of $A$. $f\left(A\right)$ is given by
where $f\left(D\right)$ is the diagonal matrix whose $i$th diagonal element is $f\left({d}_{i}\right)$.
In general, $A$ may not have a full set of eigenvectors. The matrix function can then be defined via a Cauchy integral. For $A\in {\u2102}^{n\times n}$,
where $\Gamma $ is a closed contour surrounding the eigenvalues of $A$, and $f$ is analytic within $\Gamma $.
Some matrix functions are defined implicitly. A matrix logarithm is a solution $X$ to the equation
$${e}^{X}=A\text{.}$$
In general, $X$ is not unique, but if $A$ has no eigenvalues on the closed negative real line then a unique principal logarithm exists whose eigenvalues have imaginary part between $\pi $ and $-\pi $. Similarly, a matrix square root is a solution $X$ to the equation
$${X}^{2}=A\text{.}$$
If $A$ has no eigenvalues on the closed negative real line then a unique principal square root exists with eigenvalues in the right half-plane. If $A$ has a vanishing eigenvalue then $\mathrm{log}\left(A\right)$ cannot be computed. If the vanishing eigenvalue is defective (its algebraic multiplicity exceeds its geometric multiplicity, or equivalently it occurs in a Jordan block of size greater than $1$) then the square root cannot be computed. If the vanishing eigenvalue is semisimple (its algebraic and geometric multiplicities are equal, or equivalently it occurs only in Jordan blocks of size $1$) then a square root can be computed.
Algorithms for computing matrix functions are usually tailored to a specific function. Currently, Chapter F01 contains routines for calculating the exponential, logarithm, sine, cosine, sinh, cosh, square root and the general real power of both real and complex matrices. In addition, there are routines to compute a general function of real symmetric and complex Hermitian matrices and a general function of general real and complex matrices.
The Fréchet derivative of a matrix function $f\left(A\right)$ in the direction of the matrix $E$ is the linear function mapping $E$ to ${L}_{f}(A,E)$ such that
The Fréchet derivative measures the first-order effect on $f\left(A\right)$ of perturbations in $A$. Chapter F01 contains routines for calculating the Fréchet derivative of the exponential, logarithm and real powers of both real and complex matrices.
The condition number of a matrix function is a measure of its sensitivity to perturbations in the data. The absolute condition number measures these perturbations in an absolute sense and is defined by
Chapter F01 contains routines for calculating the condition number of the matrix exponential, logarithm, sine, cosine, sinh, cosh, square root and the general real power of both real and complex matrices. It also contains routines for estimating the condition number of a general function of a real or complex matrix.
3Recommendations on Choice and Use of Available Routines
3.1Matrix Inversion
Note: before using any routine for matrix inversion, consider carefully whether it is needed.
Although the solution of a set of linear equations $Ax=b$ can be written as $x={A}^{-1}b$, the solution should never be computed by first inverting $A$ and then computing ${A}^{-1}b$; the routines in Chapters F04 or F07 should always be used to solve such sets of equations directly; they are faster in execution, and numerically more stable and accurate. Similar remarks apply to the solution of least squares problems which again should be solved by using the routines in
Chapters F04 and F08
rather than by computing a pseudo-inverse.
(a)Nonsingular square matrices of order $n$
This chapter describes techniques for inverting a general real matrix $A$ and matrices which are positive definite (have all eigenvalues positive) and are either real and symmetric or complex and Hermitian. It is wasteful and uneconomical not to use the appropriate routine when a matrix is known to have one of these special forms. A general routine must be used when the matrix is not known to be positive definite. In most routines, the inverse is computed by solving the linear equations $A{x}_{\mathit{i}}={e}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,n$, where ${e}_{i}$ is the $i$th column of the identity matrix.
Routines are given for calculating the approximate inverse, that is solving the linear equations just once, and also for obtaining the accurate inverse by successive iterative corrections of this first approximation. The latter, of course, are more costly in terms of time and storage, since each correction involves the solution of $n$ sets of linear equations and since the original $A$ and its $LU$ decomposition must be stored together with the first and successively corrected approximations to the inverse. In practice, the storage requirements for the ‘corrected’ inverse routines are about double those of the ‘approximate’ inverse routines, though the extra computer time is not prohibitive since the same matrix and the same $LU$ decomposition is used in every linear equation solution.
Despite the extra work of the ‘corrected’ inverse routines, they are superior to the ‘approximate’ inverse routines. A correction provides a means of estimating the number of accurate figures in the inverse or the number of ‘meaningful’ figures relating to the degree of uncertainty in the coefficients of the matrix.
The residual matrix $R=AX-I$, where $X$ is a computed inverse of $A$, conveys useful information. Firstly
$\Vert R\Vert $ is a bound on the relative error in $X$ and secondly $\Vert R\Vert <\frac{1}{2}$ guarantees the convergence of the iterative process in the ‘corrected’ inverse routines.
The decision trees for inversion show which routines in
Chapter F04 and
Chapter F07 should be used for the inversion of other special types of matrices not treated in the chapter.
(b)General real rectangular matrices
For real matrices f08aef and f01qjf return $QR$ and $RQ$ factorizations of $A$ respectively and f08bff returns the $QR$ factorization with column interchanges. The corresponding complex routines are f08asf, f01rjf and f08btf respectively. Routines are also provided to form the orthogonal matrices and transform by the orthogonal matrices following the use of the above routines. f01qgfandf01rgf form the $RQ$ factorization of an upper trapezoidal matrix for the real and complex cases respectively.
f01blf uses the $QR$ factorization as described in Section 2.1(ii)(a) and is the only routine that explicitly returns a pseudo-inverse. If $m\ge n$ then the routine will calculate the pseudo-inverse ${A}^{+}$ of the matrix $A$. If $m<n$ then the $n\times m$ matrix ${A}^{\mathrm{T}}$ should be used. The routine will calculate the pseudo-inverse $Z={\left({A}^{\mathrm{T}}\right)}^{+}={\left({A}^{+}\right)}^{\mathrm{T}}$ of ${A}^{\mathrm{T}}$ and the required pseudo-inverse will be ${Z}^{\mathrm{T}}$. The routine also attempts to calculate the rank, $r$, of the matrix given a tolerance to decide when elements can be regarded as zero. However, should this routine fail due to an incorrect determination of the rank, the singular value decomposition method (described below) should be used.
f08kbfandf08kpf
compute the singular value decomposition as described in Section 2 for real and complex matrices respectively. If $A$ has rank $r\le k=\mathrm{min}\phantom{\rule{0.125em}{0ex}}(m,n)$ then the $k-r$ smallest singular values will be negligible and the pseudo-inverse of $A$ can be obtained as ${A}^{+}=V{\Sigma}^{-1}{U}^{\mathrm{T}}$ as described in Section 2. If the rank of $A$ is not known in advance it can be estimated from the singular values (see Section 2.4 in the F04 Chapter Introduction).
In the real case with $m\ge n$, f08aef followed by f02wuf provide details of the $QR$ factorization or the singular value decomposition depending on whether or not $A$ is of full rank and for some problems provides an attractive alternative to f08kbf.
For large sparse matrices, leading terms in the singular value decomposition can be computed using routines from Chapter F12.
3.2Matrix Factorizations
Most of these routines serve a special purpose required for the solution of sets of simultaneous linear equations or the eigenvalue problem. For further details, you should consult Sections 3 or 4 in the F02 Chapter Introduction or Sections 3 or 4 in the F04 Chapter Introduction.
f01brfandf01bsf are provided for factorizing general real sparse matrices. A more recent algorithm for the same problem is available through f11mef. For factorizing real symmetric positive definite sparse matrices, see f11jaf. These routines should only be used when $A$ is not banded and when the total number of nonzero elements is less than 10% of the total number of elements. In all other cases, either the band routines or the general routines should be used.
f01mdfandf01mef compute the Cheng–Higham modified Cholesky factorization of a real symmetric matrix and the positive definite perturbed input matrix from the factors.
The routines f01saf (for dense matrices) and f01sbf (sparse matrices, using a reverse communication interface) are provided for computing non-negative matrix factorizations.
3.3Matrix Arithmetic and Manipulation
The routines in the f01c section are designed for the general handling of $m\times n$ matrices. Emphasis has been placed on flexibility in the argument specifications and on avoiding, where possible, the use of internally declared arrays. They are, therefore, suited for use with large matrices of variable row and column dimensions. Routines are included for the addition and subtraction of sub-matrices of larger matrices, as well as the standard manipulations of full matrices. Those routines involving matrix multiplication may use additional-precision arithmetic for the accumulation of inner products. See also Chapter F06.
The routines in the f01d section perform arithmetic operations on triangular matrices.
The routines in the f01v (LAPACK) and f01z section are designed to allow conversion between full storage format and one of the packed storage schemes required by some of the routines in Chapters F02, F04, F06, F07 and F08.
3.3.1NAG Names and LAPACK Names
Routines with NAG name beginning f01v may be called either by their NAG names or by their LAPACK names. When using the NAG Library, the double precision form of the LAPACK name must be used (beginning with D- or Z-).
References to Chapter F01 routines in the manual normally include the LAPACK double precision names, for example, f01vef.
The LAPACK routine names follow a simple scheme (which is similar to that used for the BLAS in Chapter F06). Most names have the structure XYYTZZ, where the components have the following meanings:
– the initial letter, X, indicates the data type (real or complex) and precision:
S – real, single precision (in Fortran, 4 byte length REAL)
D – real, double precision (in Fortran, 8 byte length REAL)
C – complex, single precision (in Fortran, 8 byte length COMPLEX)
Z – complex, double precision (in Fortran, 16 byte length COMPLEX)
– the fourth letter, T, indicates that the routine is performing a storage scheme transformation (conversion)
– the letters YY indicate the original storage scheme used to store a triangular part of the matrix $A$, while the letters ZZ indicate the target storage scheme of the conversion (YY cannot equal ZZ since this would do nothing):
TF – Rectangular Full Packed Format (RFP)
TP – Packed Format
TR – Full Format
3.4Matrix Functions
f01ecfandf01fcf compute the matrix exponential, ${e}^{A}$, of a real and complex square matrix $A$ respectively. If estimates of the condition number of the matrix exponential are required then f01jgfandf01kgf should be used. If Fréchet derivatives are required then f01jhfandf01khf should be used.
f01edfandf01fdf compute the matrix exponential, ${e}^{A}$, of a real symmetric and complex Hermitian matrix respectively. If the matrix is real symmetric, or complex Hermitian then it is recommended that f01edf, or f01fdf be used as they are more efficient and, in general, more accurate than f01ecfandf01fcf.
f01ejfandf01fjf compute the principal matrix logarithm, $\mathrm{log}\left(A\right)$, of a real and complex square matrix $A$ respectively. If estimates of the condition number of the matrix logarithm are required then f01jjfandf01kjf should be used. If Fréchet derivatives are required then f01jkfandf01kkf should be used.
f01ekfandf01fkf compute the matrix exponential, sine, cosine, sinh or cosh of a real and complex square matrix $A$ respectively. If the matrix exponential is required then it is recommended that f01ecforf01fcf be used as they are, in general, more accurate than f01ekfandf01fkf. If estimates of the condition number of the matrix function are required then f01jafandf01kaf should be used.
f01elfandf01emf compute the matrix function, $f\left(A\right)$, of a real square matrix.
f01flfandf01fmf compute the matrix function of a complex square matrix. The derivatives of $f$ are required for these computations. f01elfandf01flf use numerical differentiation to obtain the derivatives of $f$. f01emfandf01fmf use derivatives you have supplied. If estimates of the condition number are required but you are not supplying derivatives then f01jbfandf01kbf should be used.
If estimates of the condition number of the matrix function are required and you are supplying derivatives of $f$ then f01jcfandf01kcf should be used.
If the matrix $A$ is real symmetric or complex Hermitian then it is recommended that to compute the matrix function, $f\left(A\right)$, f01effandf01fff are used respectively as they are more efficient and, in general, more accurate than
f01elf,f01emf,f01flfandf01fmf.
f01gafandf01haf compute the matrix function ${e}^{tA}B$ for explicitly stored dense real and complex matrices $A$ and $B$ respectively while f01gbfandf01hbf compute the same using reverse communication. In the latter case, control is returned to you. You should calculate any required matrix-matrix products and then call the routine again. See Section 7 in How to Use the NAG Library for further information.
f01enfandf01fnf compute the principal square root ${A}^{1/2}$ of a real and complex square matrix $A$ respectively. If $A$ is complex and upper triangular then f01fpf should be used. If $A$ is real and upper quasi-triangular then f01epf should be used. If estimates of the condition number of the matrix square root are required then f01jdfandf01kdf should be used.
f01eqfandf01fqf compute the matrix power ${A}^{p}$, where $p\in \mathbb{R}$, of real and complex matrices respectively. If estimates of the condition number of the matrix power are required then f01jefandf01kef should be used. If Fréchet derivatives are required then f01jffandf01kff should be used.
4Decision Trees
The decision trees show the routines in this chapter and in Chapter F04, Chapter F07 and Chapter F08 that should be used for inverting matrices of various types. They also show which routine should be used to calculate various matrix functions.
Note 1: the inverse of a band matrix $A$ does not, in general, have the same shape as $A$, and no routines are provided specifically for finding such an inverse. The matrix must either be treated as a full matrix or the equations $AX=B$ must be solved, where $B$ has been initialized to the identity matrix $I$. In the latter case, see the decision trees in Section 4 in the F04 Chapter Introduction.
Note 2: by ‘guaranteed accuracy’ we mean that the accuracy of the inverse is improved by the use of the iterative refinement technique using additional precision.