If $A$, a square matrix of order $n$, is nonsingular (has rank $n$), then its inverse $X$ exists and satisfies the equations $AX=XA=I$ (the identity or unit matrix).
It is worth noting that if $AX-I=R$, so that $R$ is the ‘residual’ matrix, then a bound on the relative error is given by $\Vert R\Vert $, i.e.,
A real matrix $A$ has no inverse if it is square ($n\times n$) and singular (has rank $\text{}<n$), or if it is of shape ($m\times n$) with $m\ne n$, but there is a Generalized or Pseudo-inverse${A}^{+}$ which satisfies the equations
where $Q$ is an $m\times m$ orthogonal matrix and $R$ is an $n\times n$, nonsingular, upper triangular matrix. The pseudo-inverse of $A$ is then given by
where $Q$ is an $n\times n$ orthogonal matrix and $R$ is an $m\times m$, nonsingular, upper triangular matrix. The pseudo-inverse of $A$ is then given by
$${A}^{+}=\stackrel{~}{Q}{R}^{-1}\text{,}$$
where $\stackrel{~}{Q}$ consists of the first $m$ columns of $Q$.
(c)if $m\ge n$ and $\mathrm{rank}\left(A\right)=r\le n$ then $A$ can be factorized using a $QR$ factorization, with column interchanges, as
where $Q$ is an $m\times m$ orthogonal matrix, $R$ is an $r\times n$ upper trapezoidal matrix and $P$ is an $n\times n$ permutation matrix. The pseudo-inverse of $A$ is then given by
where $\stackrel{~}{Q}$ consists of the first $r$ columns of $Q$.
(d)if $\mathrm{rank}\left(A\right)=r\le k=\mathrm{min}\phantom{\rule{0.125em}{0ex}}(m,n)$ then $A$ can be factorized as the singular value decomposition
$$A=U\Sigma {V}^{\mathrm{T}}\text{,}$$
where $U$ is an $m\times m$ orthogonal matrix, $V$ is an $n\times n$ orthogonal matrix and $\Sigma $ is an $m\times n$ diagonal matrix with non-negative diagonal elements $\sigma $. The first $k$ columns of $U$ and $V$ are the left- and right-hand singular vectors of $A$ respectively and the $k$ diagonal elements of $\Sigma $ are the singular values of $A$. $\Sigma $ may be chosen so that
If $\stackrel{~}{U}$ and $\stackrel{~}{V}$ consist of the first $r$ columns of $U$ and $V$ respectively and $\stackrel{~}{\Sigma}$ is an $r\times r$ diagonal matrix with diagonal elements ${\sigma}_{1},{\sigma}_{2},\dots ,{\sigma}_{r}$ then $A$ is given by
which is the classical eigenvalue (spectral) factorization of ${A}^{\mathrm{T}}A$.
(e)if $A$ is complex then the above relationships are still true if we use ‘unitary’ in place of ‘orthogonal’ and conjugate transpose in place of transpose. For example, the singular value decomposition of $A$ is
$$A=U\Sigma {V}^{\mathrm{H}}\text{,}$$
where $U$ and $V$ are unitary, ${V}^{\mathrm{H}}$ the conjugate transpose of $V$ and $\Sigma $ is as in (ii)(d) above.
2.2Matrix Factorizations
The functions in this section perform matrix factorizations which are required for the solution of systems of linear equations with various special structures. A few functions which perform associated computations are also included.
Other functions for matrix factorizations are to be found in Chapters F07, F08 and F11.
This section also contains a few functions associated with eigenvalue problems (see Chapter F02). (Historical note: this section used to contain many more such functions, but they have now been superseded by functions in Chapter F08.)
Finally, this section contains functions for computing non-negative matrix factorizations, which are used for dimensional reduction and classification in data analysis. Given a rectangular $m\times n$ matrix, $A$, with non-negative elements, a non-negative matrix factorization of $A$ is an approximate factorization of $A$ into the product of an $m\times k$ non-negative matrix $W$ and a $k\times n$ non-negative matrix $H$, so that $A\approx WH$. Typically $k$ is chosen so that $k\ll \mathrm{min}\phantom{\rule{0.125em}{0ex}}(m,n)$. The matrices $W$ and $H$ are then computed to minimize ${|A-WH|}_{F}$. The factorization is not unique.
2.3Matrix Arithmetic and Manipulation
The intention of functions in this section (f01d and f01v) is to cater for some of the commonly occurring operations in matrix manipulation, i.e., conversion between different storage formats,such as conversion between rectangular band matrix storage and packed band matrix storage. For vector or matrix-vector or matrix-matrix operations refer to Chapters F06 and F16.
2.4Matrix Functions
Given a square matrix $A$, the matrix function $f\left(A\right)$ is a matrix with the same dimensions as $A$ which provides a generalization of the scalar function $f$.
If $A$ has a full set of eigenvectors $V$ then $A$ can be factorized as
$$A=VD{V}^{-1}\text{,}$$
where $D$ is the diagonal matrix whose diagonal elements, ${d}_{i}$, are the eigenvalues of $A$. $f\left(A\right)$ is given by
where $f\left(D\right)$ is the diagonal matrix whose $i$th diagonal element is $f\left({d}_{i}\right)$.
In general, $A$ may not have a full set of eigenvectors. The matrix function can then be defined via a Cauchy integral. For $A\in {\u2102}^{n\times n}$,
where $\Gamma $ is a closed contour surrounding the eigenvalues of $A$, and $f$ is analytic within $\Gamma $.
Some matrix functions are defined implicitly. A matrix logarithm is a solution $X$ to the equation
$${e}^{X}=A\text{.}$$
In general, $X$ is not unique, but if $A$ has no eigenvalues on the closed negative real line then a unique principal logarithm exists whose eigenvalues have imaginary part between $\pi $ and $-\pi $. Similarly, a matrix square root is a solution $X$ to the equation
$${X}^{2}=A\text{.}$$
If $A$ has no eigenvalues on the closed negative real line then a unique principal square root exists with eigenvalues in the right half-plane. If $A$ has a vanishing eigenvalue then $\mathrm{log}\left(A\right)$ cannot be computed. If the vanishing eigenvalue is defective (its algebraic multiplicity exceeds its geometric multiplicity, or equivalently it occurs in a Jordan block of size greater than $1$) then the square root cannot be computed. If the vanishing eigenvalue is semisimple (its algebraic and geometric multiplicities are equal, or equivalently it occurs only in Jordan blocks of size $1$) then a square root can be computed.
Algorithms for computing matrix functions are usually tailored to a specific function. Currently, Chapter F01 contains routines for calculating the exponential, logarithm, sine, cosine, sinh, cosh, square root and the general real power of both real and complex matrices. In addition, there are routines to compute a general function of real symmetric and complex Hermitian matrices and a general function of general real and complex matrices.
The Fréchet derivative of a matrix function $f\left(A\right)$ in the direction of the matrix $E$ is the linear function mapping $E$ to ${L}_{f}(A,E)$ such that
The Fréchet derivative measures the first-order effect on $f\left(A\right)$ of perturbations in $A$. Chapter F01 contains functions for calculating the Fréchet derivative of the exponential, logarithm and real powers of both real and complex matrices.
The condition number of a matrix function is a measure of its sensitivity to perturbations in the data. The absolute condition number measures these perturbations in an absolute sense and is defined by
Chapter F01 contains routines for calculating the condition number of the matrix exponential, logarithm, sine, cosine, sinh, cosh, square root and the general real power of both real and complex matrices. It also contains routines for estimating the condition number of a general function of a real or complex matrix.
3Recommendations on Choice and Use of Available Functions
3.1Matrix Inversion
Note: before using any function for matrix inversion, consider carefully whether it is needed.
Although the solution of a set of linear equations $Ax=b$ can be written as $x={A}^{-1}b$, the solution should never be computed by first inverting $A$ and then computing ${A}^{-1}b$; the functions in Chapters F04 or F07 should always be used to solve such sets of equations directly; they are faster in execution, and numerically more stable and accurate. Similar remarks apply to the solution of least squares problems which again should be solved by using the functions in
Chapters F04 and F08
rather than by computing a pseudo-inverse.
(a)Nonsingular square matrices of order $n$
This chapter describes techniques for inverting a general real matrix $A$ and matrices which are positive definite (have all eigenvalues positive) and are either real and symmetric or complex and Hermitian. It is wasteful and uneconomical not to use the appropriate function when a matrix is known to have one of these special forms. A general function must be used when the matrix is not known to be positive definite. In most functions, the inverse is computed by solving the linear equations $A{x}_{\mathit{i}}={e}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,n$, where ${e}_{i}$ is the $i$th column of the identity matrix.
The residual matrix $R=AX-I$, where $X$ is a computed inverse of $A$, conveys useful information. Firstly
$\Vert R\Vert $ is a bound on the relative error in $X$ and secondly $\Vert R\Vert <\frac{1}{2}$ guarantees the convergence of the iterative process in the ‘corrected’ inverse functions.
The decision trees for inversion show which functions in
Chapter F07 should be used for the inversion of other special types of matrices not treated in the chapter.
(b)General real rectangular matrices
For real matrices f08aec returns the $QR$ factorization of the matrix and f08bfc returns the $QR$ factorization with column interchanges. The corresponding complex functions are f08asc and f08btc respectively. Functions are also provided to form the orthogonal matrices and transform by the orthogonal matrices following the use of the above functions.
f08kbcandf08kpc
compute the singular value decomposition as described in Section 2 for real and complex matrices respectively. If $A$ has rank $r\le k=\mathrm{min}\phantom{\rule{0.125em}{0ex}}(m,n)$ then the $k-r$ smallest singular values will be negligible and the pseudo-inverse of $A$ can be obtained as ${A}^{+}=V{\Sigma}^{-1}{U}^{\mathrm{T}}$ as described in Section 2. If the rank of $A$ is not known in advance it can be estimated from the singular values (see Section 2.4 in the F04 Chapter Introduction).
For large sparse matrices, leading terms in the singular value decomposition can be computed using functions from Chapter F12.
3.2Matrix Factorizations
Most of these functions serve a special purpose required for the solution of sets of simultaneous linear equations or the eigenvalue problem. For further details, you should consult Sections 3 or 4 in the F02 Chapter Introduction or Sections 3 or 4 in the F04 Chapter Introduction.
f11dac is provided for factorizing general real sparse matrices. A more recent algorithm for the same problem is available through f11mec. For factorizing real symmetric positive definite sparse matrices, see f11jac. These functions should only be used when $A$ is not banded and when the total number of nonzero elements is less than 10% of the total number of elements. In all other cases, either the band functions or the general functions should be used.
f01mdcandf01mec compute the Cheng–Higham modified Cholesky factorization of a real symmetric matrix and the positive definite perturbed input matrix from the factors.
The functions f01sac (for dense matrices) and f01sbc (sparse matrices, using a reverse communication interface) are provided for computing non-negative matrix factorizations.
3.3Matrix Arithmetic and Manipulation
The functions in the f01d section perform arithmetic operations on triangular matrices.
The functions in the f01v (LAPACK) section are designed to allow conversion between full storage format and one of the packed storage schemes required by some of the functions in Chapters F02, F04, F06, F07 and F08.
3.3.1NAG Names and LAPACK Names
Functions with NAG short name beginning f01v may be called either by their NAG short names or by their NAG long names. The NAG long names for a function is simply the LAPACK name (in lower case) prepended by nag_, for example, nag_matop_dtrttf is the long name for f01vec.
When using the NAG Library, the double precision form of the LAPACK name must be used (beginning with d- or z-).
References to Chapter F01 functions in the manual normally include the LAPACK double precision names, for example, f01vec.
The LAPACK function names follow a simple scheme (which is similar to that used for the BLAS in Chapter F16). Most names have the structure nag_xyytzz, where the components have the following meanings:
– the initial letter, x, indicates the data type (real or complex) and precision:
d – real, double precision
z – complex, double precision
– the fourth letter, t, indicates that the function is performing a storage scheme transformation (conversion)
– the letters yy indicate the original storage scheme used to store a triangular part of the matrix $A$, while the letters zz indicate the target storage scheme of the conversion (yy cannot equal zz since this would do nothing):
tf – Rectangular Full Packed Format (RFP)
tp – Packed Format
tr – Full Format
3.4Matrix Functions
f01eccandf01fcc compute the matrix exponential, ${e}^{A}$, of a real and complex square matrix $A$ respectively. If estimates of the condition number of the matrix exponential are required then f01jgcandf01kgc should be used. If Fréchet derivatives are required then f01jhcandf01khc should be used.
f01edcandf01fdc compute the matrix exponential, ${e}^{A}$, of a real symmetric and complex Hermitian matrix respectively. If the matrix is real symmetric, or complex Hermitian then it is recommended that f01edc, or f01fdc be used as they are more efficient and, in general, more accurate than f01eccandf01fcc.
f01ejcandf01fjc compute the principal matrix logarithm, $\mathrm{log}\left(A\right)$, of a real and complex square matrix $A$ respectively. If estimates of the condition number of the matrix logarithm are required then f01jjcandf01kjc should be used. If Fréchet derivatives are required then f01jkcandf01kkc should be used.
f01ekcandf01fkc compute the matrix exponential, sine, cosine, sinh or cosh of a real and complex square matrix $A$ respectively. If the matrix exponential is required then it is recommended that f01eccorf01fcc be used as they are, in general, more accurate than f01ekcandf01fkc. If estimates of the condition number of the matrix function are required then f01jacandf01kac should be used.
f01elcandf01emc compute the matrix function, $f\left(A\right)$, of a real square matrix.
f01flcandf01fmc compute the matrix function of a complex square matrix. The derivatives of $f$ are required for these computations. f01elcandf01flc use numerical differentiation to obtain the derivatives of $f$. f01emcandf01fmc use derivatives you have supplied. If estimates of the condition number are required but you are not supplying derivatives then f01jbcandf01kbc should be used.
If estimates of the condition number of the matrix function are required and you are supplying derivatives of $f$ then f01jccandf01kcc should be used.
If the matrix $A$ is real symmetric or complex Hermitian then it is recommended that to compute the matrix function, $f\left(A\right)$, f01efcandf01ffc are used respectively as they are more efficient and, in general, more accurate than
f01elc,f01emc,f01flcandf01fmc.
f01gacandf01hac compute the matrix function ${e}^{tA}B$ for explicitly stored dense real and complex matrices $A$ and $B$ respectively while f01gbcandf01hbc compute the same using reverse communication. In the latter case, control is returned to you. You should calculate any required matrix-matrix products and then call the function again. See Section 7 in How to Use the NAG Library for further information.
f01encandf01fnc compute the principal square root ${A}^{1/2}$ of a real and complex square matrix $A$ respectively. If $A$ is complex and upper triangular then f01fpc should be used. If $A$ is real and upper quasi-triangular then f01epc should be used. If estimates of the condition number of the matrix square root are required then f01jdcandf01kdc should be used.
f01eqcandf01fqc compute the matrix power ${A}^{p}$, where $p\in \mathbb{R}$, of real and complex matrices respectively. If estimates of the condition number of the matrix power are required then f01jecandf01kec should be used. If Fréchet derivatives are required then f01jfcandf01kfc should be used.
4Decision Trees
The decision trees show the functions in this chapter and in Chapter F07 and Chapter F08 that should be used for inverting matrices of various types. They also show which function should be used to calculate various matrix functions.
Note 1: the inverse of a band matrix $A$ does not, in general, have the same shape as $A$, and no functions are provided specifically for finding such an inverse. The matrix must either be treated as a full matrix or the equations $AX=B$ must be solved, where $B$ has been initialized to the identity matrix $I$. In the latter case, see the decision trees in Section 4 in the F04 Chapter Introduction.
Note 2: by ‘guaranteed accuracy’ we mean that the accuracy of the inverse is improved by the use of the iterative refinement technique using additional precision.