where ${X}_{i}$ is the state vector of length $n$ at time $i$, ${Y}_{i}$ is the observation vector of length $m$ at time $i$ and ${W}_{i}$ of length $l$ and ${V}_{i}$ of length $m$ are the independent state noise and measurement noise respectively. The matrices $A,B$ and $C$ are time invariant.
The estimate of ${X}_{i}$ given observations ${Y}_{1}$ to ${Y}_{i-1}$ is denoted by ${\hat{X}}_{i\mid i-1}$ with state covariance matrix $\mathrm{Var}\left({\hat{X}}_{i\mid i-1}\right)={P}_{i\mid i-1}={S}_{i}{S}_{i}^{\mathrm{T}}$ while the estimate of ${X}_{i}$ given observations ${Y}_{1}$ to ${Y}_{i}$ is denoted by ${\hat{X}}_{i\mid i}$ with covariance matrix $\mathrm{Var}\left({\hat{X}}_{i\mid i}\right)={P}_{i\mid i}$. The update of the estimate, ${\hat{X}}_{i\mid i-1}$, from time $i$ to time $(i+1)$ is computed in two stages. First, the measurement-update is given by
$${\hat{X}}_{i\mid i}={\hat{X}}_{i\mid i-1}+{K}_{i}\left[{Y}_{i}-C{\hat{X}}_{i\mid i-1}\right],\qquad {P}_{i\mid i}=\left[I-{K}_{i}C\right]{P}_{i\mid i-1}$$
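where ${K}_{i}={P}_{i\mid i-1}{C}^{\mathrm{T}}{[C{P}_{i\mid i-1}{C}^{\mathrm{T}}+{R}_{i}]}^{\mathrm{-1}}$ is the Kalman gain matrix. The second stage is the time-update for $X$, which is given by
$${\hat{X}}_{i+1\mid i}=A{\hat{X}}_{i\mid i}+{D}_{i}{U}_{i},\qquad {P}_{i+1\mid i}=A{P}_{i\mid i}{A}^{\mathrm{T}}+B{Q}_{i}{B}^{\mathrm{T}}$$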
where ${D}_{i}{U}_{i}$ represents any deterministic control used.
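As an illustration of the two stages, the following is a minimal sketch for the scalar case $n=m=l=1$, written in plain Python rather than Fortran. The function `kalman_step` and its argument names are hypothetical and are not part of the g13ebf interface. Here `q` and `r` are the noise variances themselves (not Cholesky factors), and the gain is formed from the predicted variance ${P}_{i\mid i-1}$.

```python
def kalman_step(x_pred, p_pred, y, a, b, c, q, r, du=0.0):
    """One measurement-update followed by one time-update (scalar model).

    x_pred, p_pred : estimate and variance of X_i given Y_1..Y_{i-1}
    q, r           : state-noise and measurement-noise variances
    du             : deterministic control term D_i U_i (assumed zero by default)
    """
    # Measurement-update: fold the observation y into the prediction.
    h = c * p_pred * c + r                   # variance of the prediction residual
    k = p_pred * c / h                       # Kalman gain K_i
    x_filt = x_pred + k * (y - c * x_pred)   # estimate given Y_1..Y_i
    p_filt = (1.0 - k * c) * p_pred          # its variance
    # Time-update: propagate the filtered estimate to time i+1.
    x_next = a * x_filt + du
    p_next = a * p_filt * a + b * q * b
    return x_next, p_next, k, h
```

For example, `kalman_step(0.0, 1.0, 1.0, a=1.0, b=1.0, c=1.0, q=0.0, r=1.0)` gives a gain of 0.5 and halves the variance.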
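The square root covariance filter algorithm provides a stable method for computing the Kalman gain matrix and the state covariance matrix. The algorithm can be summarised as
$$\left(\begin{array}{ccc}{R}_{i}^{1/2}& C{S}_{i}& 0\\ 0& A{S}_{i}& B{Q}_{i}^{1/2}\end{array}\right)U=\left(\begin{array}{ccc}{H}_{i}^{1/2}& 0& 0\\ {G}_{i}& {S}_{i+1}& 0\end{array}\right)$$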
where $U$ is an orthogonal transformation triangularizing the left-hand pre-array to produce the right-hand post-array. The triangularization is carried out via Householder transformations exploiting the zero pattern of the pre-array. The relationship between the Kalman gain matrix ${K}_{i}$ and ${G}_{i}$ is given by
$$A{K}_{i}={G}_{i}{\left({H}_{i}^{1/2}\right)}^{-1}$$
In order to exploit the invariant parts of the model to simplify the computation of $U$ the results for the transformed state space ${U}^{*}X$ are computed where ${U}^{*}$ is the transformation that reduces the matrix pair $(A,C)$ to lower observer Hessenberg form. That is, the matrix ${U}^{*}$ is computed such that the compound matrix
$$\left(\begin{array}{c}C{U}^{*T}\\ {U}^{*}A{U}^{*T}\end{array}\right)$$
is a lower trapezoidal matrix. Further the matrix $B$ is transformed to ${U}^{*}B$. These transformations need only be computed once at the start of a series, and g13ebf will, optionally, compute them. g13ebf returns transformed matrices ${U}^{*}A{U}^{*T}$, ${U}^{*}B$, $C{U}^{*T}$ and ${U}^{*}A{K}_{i}$, the Cholesky factor of the updated transformed state covariance matrix ${S}_{i+1}^{*}$ (where ${U}^{*}{P}_{i+1\mid i}{U}^{*T}={S}_{i+1}^{*}{S}_{i+1}^{*T}$) and the matrix ${H}_{i}^{1/2}$, valid for both transformed and original models, which is used in the computation of the likelihood for the model. Note that the covariance matrices ${Q}_{i}$ and ${R}_{i}$ can be time-varying.
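The square root array update described above can be checked by hand in the scalar case $n=m=l=1$. The sketch below is an illustration only: it triangularizes the $2\times 3$ pre-array with two Givens rotations applied to columns (the routine itself is stated to use Householder transformations), and the names `rotate_columns` and `srcf_scalar` are hypothetical. The pre-array column order used here is one possible arrangement and is an assumption.

```python
import math

def rotate_columns(mat, j, k, row):
    """Rotate columns j and k (an orthogonal operation) so that mat[row][k] becomes zero."""
    x, y = mat[row][j], mat[row][k]
    rad = math.hypot(x, y)
    if rad == 0.0:
        return  # nothing to eliminate
    cs, sn = x / rad, y / rad
    for line in mat:
        line[j], line[k] = cs * line[j] + sn * line[k], -sn * line[j] + cs * line[k]

def srcf_scalar(s, a, b, c, q_half, r_half):
    """Return (H^{1/2}, G, S_{i+1}, post-array) for a scalar model.

    s, q_half and r_half are square roots of the state, state-noise and
    measurement-noise variances respectively.
    """
    pre = [[r_half, c * s, 0.0],
           [0.0,    a * s, b * q_half]]
    rotate_columns(pre, 0, 1, row=0)   # zero the (0, 1) entry
    rotate_columns(pre, 1, 2, row=1)   # zero the (1, 2) entry
    return pre[0][0], pre[1][0], pre[1][1], pre
```

Squaring the returned factors recovers the quantities of the conventional filter: the first squared is $C{P}_{i\mid i-1}{C}^{\mathrm{T}}+{R}_{i}$, the third squared is ${P}_{i+1\mid i}$, and the ratio of the second to the first is $A{K}_{i}$.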
4 References
Vanbegin M, van Dooren P and Verhaegen M H G (1989) Algorithm 675: FORTRAN subroutines for computing the square root covariance filter and square root information filter in dense or Hessenberg forms ACM Trans. Math. Software 15 243–256
Verhaegen M H G and van Dooren P (1986) Numerical aspects of different Kalman filter implementations IEEE Trans. Auto. Contr. AC-31 907–917
5 Arguments
1: $\mathbf{transf}$ – Character(1) Input
On entry: indicates whether to transform the input matrix pair $(A,C)$ to lower observer Hessenberg form. The transformation will only be required on the first call to g13ebf.
${\mathbf{transf}}=\text{'T'}$
The matrices in arrays a and c are transformed to lower observer Hessenberg form and the matrices in b and s are transformed as described in Section 3.
${\mathbf{transf}}=\text{'H'}$
The matrices in arrays a, c and b should be as returned from a previous call to g13ebf with ${\mathbf{transf}}=\text{'T'}$.
Constraint:
${\mathbf{transf}}=\text{'T'}$ or $\text{'H'}$.
2: $\mathbf{n}$ – Integer Input
On entry: $n$, the size of the state vector.
Constraint:
${\mathbf{n}}\ge 1$.
3: $\mathbf{m}$ – Integer Input
On entry: $m$, the size of the observation vector.
Constraint:
${\mathbf{m}}\ge 1$.
4: $\mathbf{l}$ – Integer Input
On entry: $l$, the dimension of the state noise.
Constraint:
${\mathbf{l}}\ge 1$.
5: $\mathbf{a}({\mathbf{lds}},{\mathbf{n}})$ – Real (Kind=nag_wp) array Input/Output
On entry: if ${\mathbf{transf}}=\text{'T'}$, the state transition matrix, $A$.
If ${\mathbf{transf}}=\text{'H'}$, the transformed matrix as returned by a previous call to g13ebf with ${\mathbf{transf}}=\text{'T'}$.
On exit: if ${\mathbf{transf}}=\text{'T'}$, the transformed matrix, ${U}^{*}A{U}^{*T}$, otherwise a is unchanged.
6: $\mathbf{lds}$ – Integer Input
On entry: the first dimension of the arrays a, b, s, k and u as declared in the (sub)program from which g13ebf is called.
Constraint:
${\mathbf{lds}}\ge {\mathbf{n}}$.
7: $\mathbf{b}({\mathbf{lds}},{\mathbf{l}})$ – Real (Kind=nag_wp) array Input/Output
On entry: if ${\mathbf{transf}}=\text{'T'}$, the noise coefficient matrix $B$.
If ${\mathbf{transf}}=\text{'H'}$, the transformed matrix as returned by a previous call to g13ebf with ${\mathbf{transf}}=\text{'T'}$.
On exit: if ${\mathbf{transf}}=\text{'T'}$, the transformed matrix, ${U}^{*}B$, otherwise b is unchanged.
8: $\mathbf{stq}$ – Logical Input
On entry: if ${\mathbf{stq}}=\mathrm{.TRUE.}$, the state noise covariance matrix ${Q}_{i}$ is assumed to be the identity matrix. Otherwise the lower triangular Cholesky factor, ${Q}_{i}^{1/2}$, must be provided in q.
9: $\mathbf{q}({\mathbf{ldq}},*)$ – Real (Kind=nag_wp) array Input
Note: the second dimension of the array q must be at least ${\mathbf{l}}$ if ${\mathbf{stq}}=\mathrm{.FALSE.}$.
On entry: if ${\mathbf{stq}}=\mathrm{.FALSE.}$, q must contain the lower triangular Cholesky factor of the state noise covariance matrix, ${Q}_{i}^{1/2}$. Otherwise q is not referenced.
10: $\mathbf{ldq}$ – Integer Input
On entry: the first dimension of the array q as declared in the (sub)program from which g13ebf is called.
Constraints:
if ${\mathbf{stq}}=\mathrm{.FALSE.}$, ${\mathbf{ldq}}\ge {\mathbf{l}}$;
otherwise ${\mathbf{ldq}}\ge 1$.
11: $\mathbf{c}({\mathbf{ldm}},{\mathbf{n}})$ – Real (Kind=nag_wp) array Input/Output
On entry: if ${\mathbf{transf}}=\text{'T'}$, the measurement coefficient matrix, $C$.
If ${\mathbf{transf}}=\text{'H'}$, the transformed matrix as returned by a previous call to g13ebf with ${\mathbf{transf}}=\text{'T'}$.
On exit: if ${\mathbf{transf}}=\text{'T'}$, the transformed matrix, $C{U}^{*T}$, otherwise c is unchanged.
12: $\mathbf{ldm}$ – Integer Input
On entry: the first dimension of the arrays c, r and h as declared in the (sub)program from which g13ebf is called.
Constraint:
${\mathbf{ldm}}\ge {\mathbf{m}}$.
13: $\mathbf{r}({\mathbf{ldm}},{\mathbf{m}})$ – Real (Kind=nag_wp) array Input
On entry: the lower triangular Cholesky factor of the measurement noise covariance matrix ${R}_{i}^{1/2}$.
14: $\mathbf{s}({\mathbf{lds}},{\mathbf{n}})$ – Real (Kind=nag_wp) array Input/Output
On entry: if ${\mathbf{transf}}=\text{'T'}$ the lower triangular Cholesky factor of the state covariance matrix, ${S}_{i}$.
If ${\mathbf{transf}}=\text{'H'}$ the lower triangular Cholesky factor of the covariance matrix of the transformed state vector ${S}_{i}^{*}$ as returned from a previous call to g13ebf with ${\mathbf{transf}}=\text{'T'}$.
On exit: the lower triangular Cholesky factor of the transformed state covariance matrix, ${S}_{i+1}^{*}$.
15: $\mathbf{k}({\mathbf{lds}},{\mathbf{m}})$ – Real (Kind=nag_wp) array Output
On exit: the Kalman gain matrix premultiplied by the transformed state transition matrix, ${U}^{*}A{K}_{i}$, as used to update the transformed state vector.
16: $\mathbf{h}({\mathbf{ldm}},{\mathbf{m}})$ – Real (Kind=nag_wp) array Output
On exit: the lower triangular matrix ${H}_{i}^{1/2}$.
17: $\mathbf{u}({\mathbf{lds}},*)$ – Real (Kind=nag_wp) array Output
Note: the second dimension of the array u must be at least ${\mathbf{n}}$ if ${\mathbf{transf}}=\text{'T'}$.
On exit: if ${\mathbf{transf}}=\text{'T'}$ the $n\times n$ transformation matrix ${U}^{*}$, otherwise u is not referenced.
18: $\mathbf{tol}$ – Real (Kind=nag_wp) Input
On entry: the tolerance used to test for the singularity of ${H}_{i}^{1/2}$. If $0.0\le {\mathbf{tol}}<{m}^{2}\times \text{machine precision}$, then ${m}^{2}\times \text{machine precision}$ is used instead. The inverse of the condition number of ${H}_{i}^{1/2}$ is estimated by a call to f07tgf. If this estimate is less than tol then ${H}_{i}^{1/2}$ is assumed to be singular.
19: $\mathbf{iwk}({\mathbf{m}})$ – Integer array Workspace
20: $\mathbf{wk}\left(({\mathbf{n}}+{\mathbf{m}})\times ({\mathbf{n}}+{\mathbf{m}}+{\mathbf{l}})\right)$ – Real (Kind=nag_wp) array Workspace
21: $\mathbf{ifail}$ – Integer Input/Output
On entry: ifail must be set to $0$, $\mathrm{-1}$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $\mathrm{-1}$ means that an error message is printed while a value of $1$ means that it is not.
If halting is not appropriate, the value $\mathrm{-1}$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).
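The rule given for tol (a supplied value below ${m}^{2}\times$ machine precision is replaced by that bound) can be sketched as follows. This is an illustration in plain Python with a hypothetical helper name, `effective_tol`. It uses `sys.float_info.epsilon` as a stand-in for machine precision, which is an assumption: the value used by the library may be defined differently.

```python
import sys

def effective_tol(tol, m, eps=sys.float_info.epsilon):
    """Tolerance actually applied in the singularity test, per the rule for tol.

    eps is a stand-in for the library's machine precision (an assumption).
    """
    if tol < 0.0:
        raise ValueError("tol must be >= 0.0")
    # Values smaller than m^2 * eps are raised to that floor.
    return max(tol, m * m * eps)
```

With this floor, passing `tol = 0.0` simply requests the default threshold for the given $m$.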
6 Error Indicators and Warnings
If on entry ${\mathbf{ifail}}=0$ or $\mathrm{-1}$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
${\mathbf{ifail}}=1$
On entry, ${\mathbf{l}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{l}}\ge 1$.
On entry, ${\mathbf{ldm}}=\langle \mathit{\text{value}}\rangle$ and ${\mathbf{m}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{ldm}}\ge {\mathbf{m}}$.
On entry, ${\mathbf{ldq}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{ldq}}\ge 1$.
On entry, ${\mathbf{ldq}}=\langle \mathit{\text{value}}\rangle$ and ${\mathbf{l}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{ldq}}\ge {\mathbf{l}}$.
On entry, ${\mathbf{lds}}=\langle \mathit{\text{value}}\rangle$ and ${\mathbf{n}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{lds}}\ge {\mathbf{n}}$.
On entry, ${\mathbf{m}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{m}}\ge 1$.
On entry, ${\mathbf{n}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{n}}\ge 1$.
On entry, ${\mathbf{tol}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{tol}}\ge 0.0$.
On entry, ${\mathbf{transf}}=\langle \mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{transf}}=\text{'T'}$ or $\text{'H'}$.
${\mathbf{ifail}}=2$
The matrix ${H}_{i}^{1/2}$ is singular.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
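The argument constraints reported under ${\mathbf{ifail}}=1$ can be checked before the call. The sketch below is a hypothetical pre-flight check in plain Python (the names `check_arguments` and `workspace_length` are not part of the library). It mirrors only the constraints listed in this section and assumes transf is compared in upper case.

```python
def check_arguments(transf, n, m, l, lds, ldm, ldq, stq, tol):
    """Return a list of violated constraints; an empty list means none were found."""
    problems = []
    if transf not in ("T", "H"):
        problems.append("transf must be 'T' or 'H'")
    for name, value in (("n", n), ("m", m), ("l", l)):
        if value < 1:
            problems.append(f"{name} must be >= 1")
    if lds < n:
        problems.append("lds must be >= n")
    if ldm < m:
        problems.append("ldm must be >= m")
    # q is only referenced when stq is .FALSE., so ldq >= l is needed only then.
    min_ldq = 1 if stq else l
    if ldq < min_ldq:
        problems.append(f"ldq must be >= {min_ldq}")
    if tol < 0.0:
        problems.append("tol must be >= 0.0")
    return problems

def workspace_length(n, m, l):
    """Length of the real workspace wk, (n + m) * (n + m + l)."""
    return (n + m) * (n + m + l)
```

A caller might refuse to proceed when the returned list is non-empty, which avoids relying on the error exit alone.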
7 Accuracy
The use of the square root algorithm improves the stability of the computations as compared with the direct coding of the Kalman filter. The accuracy will depend on the model.
8 Parallelism and Performance
g13ebf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g13ebf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
9 Further Comments
For models with time-varying $A,B$ and $C$, g13eaf can be used.
The initial estimate of the transformed state vector can be computed from the estimate of the original state vector ${\hat{X}}_{1\mid 0}$, say, by premultiplying it by ${U}^{*}$ as returned by g13ebf with ${\mathbf{transf}}=\text{'T'}$; that is, ${\hat{X}}_{1\mid 0}^{*}={U}^{*}{\hat{X}}_{1\mid 0}$. The estimate of the transformed state vector ${\hat{X}}_{i+1\mid i}^{*}$ can be computed from the previous value ${\hat{X}}_{i\mid i-1}^{*}$ by
$${\hat{X}}_{i+1\mid i}^{*}=\left({U}^{*}A{U}^{*T}\right){\hat{X}}_{i\mid i-1}^{*}+\left({U}^{*}A{K}_{i}\right){r}_{i}$$
where
$${r}_{i}={Y}_{i}-\left(C{U}^{*T}\right){\hat{X}}_{i\mid i-1}^{*}$$
are the independent one-step prediction residuals for both the transformed and original model. The estimate of the original state vector can be computed from the transformed state vector as ${U}^{*T}{\hat{X}}_{i+1\mid i}^{*}$. The required matrix-vector multiplications can be performed by f06paf.
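If ${W}_{i}$ and ${V}_{i}$ are independent multivariate Normal variates then the log-likelihood for observations $i=1,2,\dots ,t$ is given by
$$\kappa -\frac{1}{2}\sum _{i=1}^{t}\mathrm{ln}\left(\mathrm{det}\left({H}_{i}\right)\right)-\frac{1}{2}\sum _{i=1}^{t}{r}_{i}^{\mathrm{T}}{H}_{i}^{-1}{r}_{i}$$
where $\kappa$ is a constant and ${H}_{i}={H}_{i}^{1/2}{\left({H}_{i}^{1/2}\right)}^{\mathrm{T}}$.
Note that a model in which the state noise enters directly, ${X}_{i+1}=A{X}_{i}+{W}_{i}$ with $\mathrm{Var}\left({W}_{i}\right)=Q$, can be specified either with b set to the identity matrix and ${\mathbf{stq}}=\mathrm{.FALSE.}$ and the matrix ${Q}^{1/2}$ input in q or with ${\mathbf{stq}}=\mathrm{.TRUE.}$ and b set to ${Q}^{1/2}$.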
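The recursion for the transformed state and the accumulation of the deviance ($-2\times$ log-likelihood, ignoring the constant) can be sketched for a scalar model as below. This is an illustrative Python outline, not the library's example program: the function `filter_and_deviance` is hypothetical, and it assumes the per-step quantities ${U}^{*}A{K}_{i}$ and ${H}_{i}^{1/2}$ (as returned in k and h) have already been collected into lists. The deviance is taken to be the usual Gaussian prediction-error form, $\sum \mathrm{ln}\,{H}_{i}+\sum {r}_{i}^{2}/{H}_{i}$ in the scalar case.

```python
import math

def filter_and_deviance(x0, ys, a_t, c_t, ak_list, h_half_list):
    """Run the scalar state recursion and accumulate the deviance.

    x0          : initial transformed state estimate
    ys          : observations Y_1..Y_t
    a_t, c_t    : transformed transition and measurement coefficients
    ak_list     : values of U* A K_i, one per observation
    h_half_list : values of H_i^{1/2}, one per observation
    """
    x = x0
    deviance = 0.0
    residuals = []
    for y, ak, h_half in zip(ys, ak_list, h_half_list):
        r = y - c_t * x              # one-step prediction residual
        x = a_t * x + ak * r         # next predicted state
        h = h_half * h_half          # residual variance H_i
        deviance += math.log(h) + r * r / h
        residuals.append(r)
    return x, residuals, deviance
```

In the multivariate case the same loop applies with matrix-vector products, $\mathrm{ln}\,\mathrm{det}\left({H}_{i}\right)$ obtained from the diagonal of the triangular factor, and the quadratic form obtained by a triangular solve with ${H}_{i}^{1/2}$.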
The algorithm requires $\frac{1}{6}{n}^{3}+{n}^{2}(\frac{3}{2}m+l)+2n{m}^{2}+\frac{2}{3}{m}^{3}$ operations and is backward stable (see Verhaegen and van Dooren (1986)). The transformation to lower observer Hessenberg form requires $\mathit{O}\left((n+m){n}^{2}\right)$ operations.
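To get a feel for how the cost of one update scales, the operation count can be evaluated directly. The helper below is a hypothetical convenience, and it takes the final cubic term in the observation dimension $m$, which is an assumption about the notation.

```python
def srcf_operation_count(n, m, l):
    """Approximate operation count for one update, from the formula quoted above."""
    return n**3 / 6 + n**2 * (1.5 * m + l) + 2 * n * m**2 + 2 * m**3 / 3
```

For $n=6$, $m=3$ and $l=2$ the four terms are 36, 234, 108 and 18, so the cost is dominated by the ${n}^{2}(\frac{3}{2}m+l)$ term.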
10 Example
This example first inputs the number of updates to be computed and the problem sizes. The initial state vector and the Cholesky factor of the state covariance matrix are input followed by the model matrices $A,B,C,{R}^{1/2}$ and optionally ${Q}^{1/2}$ (the Cholesky factors of the covariance matrices being input). At the first update the matrices are transformed using the ${\mathbf{transf}}=\text{'T'}$ option and the initial value of the state vector is transformed. At each update the observed values are input, the residuals are computed and printed, and the estimate of the transformed state vector, ${U}^{*}{\hat{X}}_{i\mid i-1}$, and the deviance are updated. The deviance is $\mathrm{-2}\times \text{log-likelihood}$ ignoring the constant. After the final update the estimate of the state vector is computed from the transformed state vector and the state covariance matrix is computed from s and these are printed along with the value of the deviance.
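The final step, recovering the original state estimate and covariance from the transformed quantities, follows from ${U}^{*}P{U}^{*T}={S}^{*}{S}^{*T}$, which gives $P=\left({U}^{*T}{S}^{*}\right){\left({U}^{*T}{S}^{*}\right)}^{\mathrm{T}}$. A small sketch in plain Python, using nested lists and hypothetical helper names, might look as follows. It is not the example program itself, which would normally use library matrix routines.

```python
def transpose(x):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*x)]

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]

def original_state(u, x_star):
    """Original state estimate U*^T x* from the transformed estimate x*."""
    return [sum(u[k][j] * x_star[k] for k in range(len(x_star)))
            for j in range(len(u[0]))]

def original_covariance(u, s_star):
    """Original covariance P = (U*^T S*)(U*^T S*)^T from the transformed Cholesky factor."""
    half = matmul(transpose(u), s_star)
    return matmul(half, transpose(half))
```

For instance, with `u = [[0, 1], [-1, 0]]` and `s_star = [[2, 0], [1, 3]]`, `original_covariance(u, s_star)` returns `[[10, -2], [-2, 4]]`.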
The data is for a two-dimensional time series to which a VARMA$(1,1)$ has been fitted. For the specification of a VARMA model as a state space model see the G13 Chapter Introduction. The means of the two series are included as additional states that do not change over time. The initial value of $P$, ${P}_{0}$, is the solution to