g13ddc fits a vector autoregressive moving average (VARMA) model to an observed vector of time series using the method of Maximum Likelihood (ML). Standard errors of parameter estimates are computed along with their appropriate correlation matrix. The function also calculates estimates of the residual series.
The function may be called by the names: g13ddc, nag_tsa_multi_varma_estimate or nag_tsa_varma_estimate.
3 Description
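Let ${W}_{\mathit{t}}={({w}_{1\mathit{t}},{w}_{2\mathit{t}},\dots ,{w}_{\mathit{k}\mathit{t}})}^{\mathrm{T}}$, for $\mathit{t}=1,2,\dots ,n$, denote a vector of $k$ time series which is assumed to follow a multivariate ARMA model of the form
$${W}_{t}-\mu ={\varphi}_{1}({W}_{t-1}-\mu )+{\varphi}_{2}({W}_{t-2}-\mu )+\cdots +{\varphi}_{p}({W}_{t-p}-\mu )+{\epsilon}_{t}-{\theta}_{1}{\epsilon}_{t-1}-{\theta}_{2}{\epsilon}_{t-2}-\cdots -{\theta}_{q}{\epsilon}_{t-q}\qquad \text{(1)}$$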
where ${\epsilon}_{\mathit{t}}={({\epsilon}_{1\mathit{t}},{\epsilon}_{2\mathit{t}},\dots ,{\epsilon}_{k\mathit{t}})}^{\mathrm{T}}$, for $\mathit{t}=1,2,\dots ,n$, is a vector of $k$ residual series assumed to be Normally distributed with zero mean and positive definite covariance matrix $\Sigma $. The components of ${\epsilon}_{t}$ are assumed to be uncorrelated at non-simultaneous lags. The ${\varphi}_{i}$ and ${\theta}_{j}$ are $k\times k$ matrices of parameters. $\left\{{\varphi}_{\mathit{i}}\right\}$, for $\mathit{i}=1,2,\dots ,p$, are called the autoregressive (AR) parameter matrices, and $\left\{{\theta}_{\mathit{i}}\right\}$, for $\mathit{i}=1,2,\dots ,q$, the moving average (MA) parameter matrices. The parameters in the model are thus the $p$ ($k\times k$) $\varphi $-matrices, the $q$ ($k\times k$) $\theta $-matrices, the mean vector, $\mu $, and the residual error covariance matrix $\Sigma $. Let
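$$A\left(\varphi \right)={\begin{bmatrix}{\varphi}_{1}&I&0&\cdots &0\\ {\varphi}_{2}&0&I&\cdots &0\\ \vdots &\vdots &&\ddots &\vdots \\ {\varphi}_{p-1}&0&\cdots &0&I\\ {\varphi}_{p}&0&\cdots &0&0\end{bmatrix}}_{pk\times pk}\quad \text{and}\quad B\left(\theta \right)={\begin{bmatrix}{\theta}_{1}&I&0&\cdots &0\\ {\theta}_{2}&0&I&\cdots &0\\ \vdots &\vdots &&\ddots &\vdots \\ {\theta}_{q-1}&0&\cdots &0&I\\ {\theta}_{q}&0&\cdots &0&0\end{bmatrix}}_{qk\times qk}$$
where $I$ denotes the $k\times k$ identity matrix.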
The ARMA model (1) is said to be stationary if the eigenvalues of $A\left(\varphi \right)$ lie inside the unit circle. Similarly, the ARMA model (1) is said to be invertible if the eigenvalues of $B\left(\theta \right)$ lie inside the unit circle.
The method of computing the exact likelihood function (using a Kalman filter algorithm) is discussed in Shea (1987). A quasi-Newton algorithm (see Gill and Murray (1972)) is then used to search for the maximum of the log-likelihood function. Stationarity and invertibility are enforced on the model using the reparameterisation discussed in Ansley and Kohn (1986). Conditional on the maximum likelihood estimates being equal to their true values, the estimates of the residual series are uncorrelated with zero mean and constant variance $\Sigma $.
You have the option of setting the argument exact to Nag_FALSE so that g13ddc calculates conditional maximum likelihood estimates (conditional on ${W}_{0}={W}_{-1}=\cdots ={W}_{1-p}={\epsilon}_{0}={\epsilon}_{-1}=\cdots ={\epsilon}_{1-q}=0$). This may be useful if the exact maximum likelihood estimates are close to the boundary of the invertibility region.
You also have the option (see Section 5) of requesting g13ddc to constrain elements of the $\varphi $ and $\theta $ matrices and $\mu $ vector to have pre-specified values.
4 References
Ansley C F and Kohn R (1986) A note on reparameterising a vector autoregressive moving average model to enforce stationarity J. Statist. Comput. Simulation 24 99–106
Gill P E and Murray W (1972) Quasi-Newton methods for unconstrained optimization J. Inst. Math. Appl. 9 91–108
Shea B L (1987) Estimation of multivariate time series J. Time Ser. Anal. 8 95–110
5 Arguments
1: $\mathbf{k}$ – Integer Input
On entry: $k$, the number of observed time series.
Constraint:
${\mathbf{k}}\ge 1$.
2: $\mathbf{n}$ – Integer Input
On entry: $n$, the number of observations in each time series.
3: $\mathbf{ip}$ – Integer Input
On entry: $p$, the number of AR parameter matrices.
Constraint:
${\mathbf{ip}}\ge 0$.
4: $\mathbf{iq}$ – Integer Input
On entry: $q$, the number of MA parameter matrices.
Constraint:
${\mathbf{iq}}\ge 0$.
${\mathbf{ip}}={\mathbf{iq}}=0$ is not permitted.
5: $\mathbf{mean}$ – Nag_IncludeMean Input
On entry: ${\mathbf{mean}}=\mathrm{Nag\_MeanInclude}$, if components of $\mu $ are to be estimated and ${\mathbf{mean}}=\mathrm{Nag\_MeanZero}$, if all elements of $\mu $ are to be taken as zero.
Constraint:
${\mathbf{mean}}=\mathrm{Nag\_MeanInclude}$ or $\mathrm{Nag\_MeanZero}$.
On entry: initial parameter estimates read in row by row in the order ${\varphi}_{1},{\varphi}_{2},\dots ,{\varphi}_{p}$, ${\theta}_{1},{\theta}_{2},\dots ,{\theta}_{q},\mu $.
Thus,
if ${\mathbf{ip}}>0$,
${\mathbf{par}}\left[(\mathit{l}-1)\times k\times k+(\mathit{i}-1)\times k+\mathit{j}-1\right]$ must be set equal to an initial estimate of the $(\mathit{i},\mathit{j})$th element of ${\varphi}_{\mathit{l}}$, for $\mathit{l}=1,2,\dots ,p$, $\mathit{i}=1,2,\dots ,k$ and $\mathit{j}=1,2,\dots ,k$;
if ${\mathbf{iq}}>0$, ${\mathbf{par}}\left[p\times k\times k+(l-1)\times k\times k+(i-1)\times k+j-1\right]$ must be set equal to an initial estimate of the $(i,j)$th element of ${\theta}_{l}$, $l=1,2,\dots ,q$ and $i,j=1,2,\dots ,k$;
if ${\mathbf{mean}}=\mathrm{Nag\_MeanInclude}$, ${\mathbf{par}}\left[(p+q)\times k\times k+i-1\right]$ should be set equal to an initial estimate of the $i$th component of $\mu $ ($\mu \left(i\right)$). (If you set ${\mathbf{par}}\left[(p+q)\times k\times k+i-1\right]$ to $0.0$ then g13ddc will calculate the mean of the $i$th series and use this as an initial estimate of $\mu \left(i\right)$.)
The first $p\times k\times k$ elements of par must satisfy the stationarity condition and the next $q\times k\times k$ elements of par must satisfy the invertibility condition.
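The index arithmetic for par can be wrapped in a few helpers. The following is a minimal sketch that follows the layout described above; the function names are hypothetical and not part of the NAG Library, and the array length returned by `par_length` is simply what this layout implies.

```c
#include <assert.h>
#include <stddef.h>

/* l, i and j are 1-based, as in the text; the result is a 0-based C index. */
size_t par_index_ar(int k, int l, int i, int j)
{
    assert(k >= 1 && l >= 1 && i >= 1 && i <= k && j >= 1 && j <= k);
    return (size_t)(l - 1) * k * k + (size_t)(i - 1) * k + (size_t)(j - 1);
}

/* (i, j)th element of theta_l; the MA block starts after the p AR matrices. */
size_t par_index_ma(int k, int p, int l, int i, int j)
{
    assert(p >= 0);
    return (size_t)p * k * k + par_index_ar(k, l, i, j);
}

/* i-th component of mu; only meaningful when the mean is included. */
size_t par_index_mean(int k, int p, int q, int i)
{
    assert(k >= 1 && p >= 0 && q >= 0 && i >= 1 && i <= k);
    return (size_t)(p + q) * k * k + (size_t)(i - 1);
}

/* Number of elements of par implied by the layout above. */
size_t par_length(int k, int p, int q, int include_mean)
{
    assert(k >= 1 && p >= 0 && q >= 0);
    return (size_t)(p + q) * k * k + (include_mean ? (size_t)k : 0);
}
```

For a bivariate AR(1) model with the mean included, `par_length(2, 1, 0, 1)` gives 6 and `par_index_ar(2, 1, 2, 1)` gives 2, the position of ${\varphi}_{1}(2,1)$.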
On entry: ${\mathbf{qq}}\left[{\mathbf{kmax}}\times \left(\mathit{j}-1\right)+\mathit{i}-1\right]$ must be set equal to an initial estimate of the $(\mathit{i},\mathit{j})$th element of $\Sigma $. The lower triangle only is needed. qq must be positive definite. It is strongly recommended that on entry the elements of qq are of the same order of magnitude as at the solution point. If you set ${\mathbf{qq}}\left[{\mathbf{kmax}}\times \left(\mathit{j}-1\right)+\mathit{i}-1\right]=0.0$, for $\mathit{i}=1,2,\dots ,k$ and $\mathit{j}=1,2,\dots ,i$, then g13ddc will calculate the covariance matrix between the $k$ time series and use this as an initial estimate of $\Sigma $.
On exit: if ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_BOUND, NE_G13D_DERIV, NE_G13D_MAX_LOGLIK, NE_G13D_MAXCAL or NE_HESS_NOT_POS_DEF then ${\mathbf{qq}}\left[{\mathbf{kmax}}\times \left(j-1\right)+i-1\right]$ will contain the latest estimate of the $(i,j)$th element of $\Sigma $. The lower triangle only is returned.
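The storage scheme for qq places element $(i,j)$ of $\Sigma $ at offset ${\mathbf{kmax}}\times (j-1)+i-1$, with only the lower triangle ($i\ge j$) referenced. A minimal sketch of filling qq from a row-major matrix follows; the helper names and the row-major source matrix are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based offset in qq of the (i, j)th element of Sigma (1-based, i >= j). */
size_t qq_index(int kmax, int i, int j)
{
    assert(kmax >= 1 && j >= 1 && i >= j && i <= kmax);
    return (size_t)kmax * (size_t)(j - 1) + (size_t)(i - 1);
}

/* Copy the lower triangle of a row-major k x k matrix sigma into qq.
   Elements of qq above the diagonal are left untouched. */
void fill_qq_lower(double *qq, int kmax, int k, const double *sigma)
{
    assert(k >= 1 && kmax >= k);
    for (int i = 1; i <= k; ++i) {
        for (int j = 1; j <= i; ++j) {
            qq[qq_index(kmax, i, j)] = sigma[(size_t)(i - 1) * k + (size_t)(j - 1)];
        }
    }
}
```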
9: $\mathbf{kmax}$ – Integer Input
On entry: stride separating row elements in qq, w and v.
On entry: ${\mathbf{w}}\left[{\mathbf{kmax}}\times \left(\mathit{t}-1\right)+\mathit{i}-1\right]$ must be set equal to the $\mathit{i}$th component of ${W}_{\mathit{t}}$, for $\mathit{i}=1,2,\dots ,k$ and $\mathit{t}=1,2,\dots ,n$.
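In other words, w holds the observations time point by time point, with kmax elements reserved for each time point. The sketch below packs $k$ separately stored series into that layout; `w_index` and `pack_series` are hypothetical helpers, and the assumption that each series arrives as its own array of length $n$ is made only for the example.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based offset in w of the i-th component of W_t (i and t are 1-based). */
size_t w_index(int kmax, int t, int i)
{
    assert(kmax >= 1 && t >= 1 && i >= 1 && i <= kmax);
    return (size_t)kmax * (size_t)(t - 1) + (size_t)(i - 1);
}

/* series[i-1][t-1] is the t-th observation of the i-th series.
   w must have room for at least kmax * n elements. */
void pack_series(double *w, int kmax, int k, int n, const double *const *series)
{
    assert(k >= 1 && kmax >= k && n >= 1);
    for (int t = 1; t <= n; ++t) {
        for (int i = 1; i <= k; ++i) {
            w[w_index(kmax, t, i)] = series[i - 1][t - 1];
        }
    }
}
```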
On entry: ${\mathbf{parhld}}\left[\mathit{i}-1\right]$ must be set to Nag_TRUE if ${\mathbf{par}}\left[\mathit{i}-1\right]$ is to be held constant at its input value and Nag_FALSE if ${\mathbf{par}}\left[\mathit{i}-1\right]$ is a free parameter, for $\mathit{i}=1,2,\dots ,{\mathbf{npar}}$.
If in doubt try setting all elements of parhld to Nag_FALSE.
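As an illustration, the following sketch marks every parameter as free and then holds one AR element fixed, using the same index arithmetic as for par. It uses the C99 `bool` type as a stand-in for Nag_Boolean (an assumption made so that the fragment compiles without the NAG headers), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mark all npar parameters as free. */
void free_all(bool *parhld, size_t npar)
{
    for (size_t m = 0; m < npar; ++m) {
        parhld[m] = false;
    }
}

/* Hold the (i, j)th element of phi_l at its input value (l, i, j 1-based). */
void hold_ar_element(bool *parhld, size_t npar, int k, int l, int i, int j)
{
    size_t idx = (size_t)(l - 1) * k * k + (size_t)(i - 1) * k + (size_t)(j - 1);
    assert(idx < npar);
    parhld[idx] = true;
}
```

For a bivariate AR(1) model with the mean included (6 parameters), `hold_ar_element(parhld, 6, 2, 1, 2, 1)` fixes ${\varphi}_{1}(2,1)$; the corresponding element of par must then contain the value at which it is to be held, for example $0.0$.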
12: $\mathbf{exact}$ – Nag_Boolean Input
On entry: must be set equal to Nag_TRUE if you wish g13ddc to compute exact maximum likelihood estimates. exact must be set equal to Nag_FALSE if only conditional likelihood estimates are required.
13: $\mathbf{iprint}$ – Integer Input
On entry: the frequency with which the automatic monitoring function is to be called.
${\mathbf{iprint}}>0$
The ML search procedure is monitored once every iprint iterations and just before exit from the search function.
${\mathbf{iprint}}=0$
The search function is monitored once at the final point.
${\mathbf{iprint}}<0$
The search function is not monitored at all.
14: $\mathbf{cgetol}$ – double Input
On entry: the accuracy to which the solution in par and qq is required.
If cgetol is set to ${10}^{-l}$ and on exit ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_BOUND, NE_G13D_DERIV or NE_HESS_NOT_POS_DEF, then all the elements in par and qq should be accurate to approximately $l$ decimal places. For most practical purposes the value ${10}^{-4}$ should suffice. You should be wary of setting cgetol too small since the convergence criteria may then become too strict for the machine to handle.
If cgetol has been set to a value which is less than the machine precision, $\epsilon $, then g13ddc will use the value $10.0\times \sqrt{\epsilon}$ instead.
15: $\mathbf{maxcal}$ – Integer Input
On entry: the maximum number of likelihood evaluations to be permitted by the search procedure.
On entry: the name of a file to which diagnostic output will be directed. If outfile is NULL the diagnostic output will be directed to standard output.
On exit: if ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_BOUND, NE_G13D_DERIV, NE_G13D_MAX_LOGLIK, NE_G13D_MAXCAL or NE_HESS_NOT_POS_DEF then
${\mathbf{v}}\left[{\mathbf{kmax}}\times \left(\mathit{t}-1\right)+\mathit{i}-1\right]$ will contain an estimate of the $\mathit{i}$th component of ${\epsilon}_{\mathit{t}}$, for $\mathit{i}=1,2,\dots ,k$ and $\mathit{t}=1,2,\dots ,n$, corresponding to the final point held in par and qq.
On exit: if ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_BOUND, NE_G13D_DERIV, NE_G13D_MAX_LOGLIK, NE_G13D_MAXCAL or NE_HESS_NOT_POS_DEF then ${\mathbf{g}}\left[i-1\right]$ will contain the estimated first derivative of the log-likelihood function with respect to the $i$th element in the array par. If the gradient cannot be computed then all the elements of g are returned as zero.
On exit: if ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_BOUND, NE_G13D_DERIV, NE_G13D_MAX_LOGLIK, NE_G13D_MAXCAL or NE_HESS_NOT_POS_DEF then ${\mathbf{cm}}\left[{\mathbf{pdcm}}\times \left(j-1\right)+i-1\right]$ will contain an estimate of the correlation coefficient between the $i$th and $j$th elements in the par array for $1\le i\le {\mathbf{npar}}$, $1\le j\le {\mathbf{npar}}$. If $i=j$, then ${\mathbf{cm}}\left[{\mathbf{pdcm}}\times \left(j-1\right)+i-1\right]$ will contain the estimated standard error of ${\mathbf{par}}\left[i-1\right]$. If the $l$th component of par has been held constant, i.e., ${\mathbf{parhld}}\left[l-1\right]$ was set to Nag_TRUE, then the $l$th row and column of cm will be set to zero. If the second derivative matrix cannot be computed then all the elements of cm are returned as zero.
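Because cm mixes standard errors (on the diagonal) with correlation coefficients (off the diagonal), a covariance between two parameter estimates has to be reassembled from three entries. A minimal sketch with hypothetical helper names follows; it assumes cm was successfully computed, so that it is not identically zero.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j) of cm, with 1-based i and j. */
double cm_get(const double *cm, int pdcm, int i, int j)
{
    assert(pdcm >= 1 && i >= 1 && j >= 1 && i <= pdcm);
    return cm[(size_t)pdcm * (size_t)(j - 1) + (size_t)(i - 1)];
}

/* Estimated covariance between par[i-1] and par[j-1]:
   variance se_i^2 on the diagonal, corr_ij * se_i * se_j elsewhere. */
double par_covariance(const double *cm, int pdcm, int i, int j)
{
    double se_i = cm_get(cm, pdcm, i, i);
    double se_j = cm_get(cm, pdcm, j, j);
    if (i == j) {
        return se_i * se_i;
    }
    return cm_get(cm, pdcm, i, j) * se_i * se_j;
}
```

For a parameter held constant via parhld, the corresponding row and column of cm are zero, so both functions return zero for it.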
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
6 Error Indicators and Warnings
NE_ALLOC_FAIL
Dynamic memory allocation failed.
See Section 3.1.2 in the Introduction to the NAG Library CL Interface for further information.
NE_BAD_PARAM
On entry, argument $\langle \mathit{\text{value}}\rangle $ had an illegal value.
NE_G13D_AR
The initial AR parameter estimates are outside the stationarity region. To proceed you must try a different starting point.
NE_G13D_ARMA
On entry, ${\mathbf{ip}}=0$ and ${\mathbf{iq}}=0$.
NE_G13D_BOUND
The ML solution is so close to the boundary of either the stationarity region or the invertibility region that g13ddc cannot evaluate the Hessian matrix. The elements of cm are set to zero, as are the elements of g. All other output quantities are correct.
NE_G13D_DERIV
An estimate of the second derivative matrix and the gradient vector at the solution point was computed. Either the Hessian matrix was found to be too ill-conditioned to be evaluated accurately or the gradient vector could not be computed to an acceptable degree of accuracy. The elements of cm are set to zero, as are the elements of g. All other output quantities are correct.
NE_G13D_GRAD
The function cannot compute a sufficiently accurate estimate of the gradient vector at the user-supplied starting point. This usually occurs if either the initial parameter estimates are very close to the ML parameter estimates, or you have supplied a very poor estimate of $\Sigma $, or the starting point is very close to the boundary of the stationarity or invertibility region. To proceed you must try a different starting point.
NE_G13D_MA
The initial MA parameter estimates are outside the invertibility region. To proceed you must try a different starting point.
NE_G13D_MAX_LOGLIK
The conditions for a solution have not all been met, but a point at which the log-likelihood took a larger value could not be found.
Provided that the estimated first derivatives are sufficiently small, and that the estimated condition number of the second derivative (Hessian) matrix, as printed when ${\mathbf{iprint}}\ge 0$, is not too large, this error exit may simply mean that, although it has not been possible to satisfy the specified requirements, the algorithm has in fact found the solution as far as the accuracy of the machine permits.
Such a condition can arise, for instance, if cgetol has been set so small that rounding error in evaluating the likelihood function makes attainment of the convergence conditions impossible.
If the estimated condition number at the final point is large, it could be that the final point is a solution but that the smallest eigenvalue of the Hessian matrix is so close to zero at the solution that it is not possible to recognize it as a solution. Output quantities were computed at the final point held in par and qq, except that if g or cm could not be computed, they are set to zero.
NE_G13D_MAXCAL
There have been maxcal log-likelihood evaluations made in the function.
If steady increases in the log-likelihood function were monitored up to the point where this exit occurred, then the exit probably occurred simply because maxcal was set too small, so the calculations should be restarted from the final point held in par and qq. This type of exit may also indicate that there is no maximum to the likelihood surface. Output quantities were computed at the final point held in par and qq, except that if g or cm could not be computed, they are set to zero.
NE_G13D_START
The starting point is too close to the boundary of the admissibility region. To proceed you must try a different starting point.
NE_HESS_NOT_POS_DEF
The second-derivative matrix at the solution point is not positive definite. The elements of cm are set to zero. All other output quantities are correct.
NE_INT
On entry, ${\mathbf{ip}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{ip}}\ge 0$.
On entry, ${\mathbf{iq}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{iq}}\ge 0$.
On entry, ${\mathbf{ishow}}=\langle \mathit{\text{value}}\rangle $.
Constraint: $0\le {\mathbf{ishow}}\le 2$.
On entry, ${\mathbf{k}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{k}}\ge 1$.
On entry, ${\mathbf{maxcal}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{maxcal}}\ge 1$.
On entry, ${\mathbf{npar}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{npar}}=\langle \mathit{\text{value}}\rangle $.
On entry, ${\mathbf{npar}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{npar}}\ge 0$.
NE_INT_2
On entry, ${\mathbf{kmax}}=\langle \mathit{\text{value}}\rangle $ and ${\mathbf{k}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{kmax}}\ge {\mathbf{k}}$.
On entry, ${\mathbf{pdcm}}=\langle \mathit{\text{value}}\rangle $ and ${\mathbf{npar}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{pdcm}}\ge {\mathbf{npar}}$.
NE_INT_3
On entry, ${\mathbf{n}}=\langle \mathit{\text{value}}\rangle $, ${\mathbf{k}}=\langle \mathit{\text{value}}\rangle $ and ${\mathbf{npar}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{n}}\times {\mathbf{k}}>{\mathbf{npar}}+{\mathbf{k}}\times ({\mathbf{k}}+1)/2$.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
NE_NO_LICENCE
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library CL Interface for further information.
NE_NOT_CLOSE_FILE
Cannot close file $\langle \mathit{\text{value}}\rangle $.
NE_NOT_POS_DEF
The initial estimate of $\Sigma $ is not positive definite. To proceed you must try a different starting point.
NE_NOT_WRITE_FILE
Cannot open file $\langle \mathit{\text{value}}\rangle $ for writing.
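Several of the integer constraints listed in this section can be checked before the call. The sketch below collects them in one hypothetical function; the enumeration and its names are illustrative only and do not correspond to NAG error codes, and the checks cover only the constraints stated explicitly above.

```c
typedef enum {
    ARGS_OK,
    BAD_K,          /* k >= 1 violated */
    BAD_ORDER,      /* ip >= 0, iq >= 0, not both zero violated */
    BAD_MAXCAL,     /* maxcal >= 1 violated */
    BAD_NPAR,       /* npar >= 0 violated */
    BAD_KMAX,       /* kmax >= k violated */
    BAD_PDCM,       /* pdcm >= npar violated */
    TOO_FEW_OBS     /* n * k > npar + k * (k + 1) / 2 violated */
} arg_check;

arg_check check_args(int k, int n, int ip, int iq, int npar,
                     int kmax, int pdcm, int maxcal)
{
    if (k < 1) return BAD_K;
    if (ip < 0 || iq < 0 || (ip == 0 && iq == 0)) return BAD_ORDER;
    if (maxcal < 1) return BAD_MAXCAL;
    if (npar < 0) return BAD_NPAR;
    if (kmax < k) return BAD_KMAX;
    if (pdcm < npar) return BAD_PDCM;
    /* Use long arithmetic to reduce the risk of overflow in n * k. */
    if ((long)n * k <= (long)npar + (long)k * (k + 1) / 2) return TOO_FEW_OBS;
    return ARGS_OK;
}
```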
7 Accuracy
On exit from g13ddc, if ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_BOUND, NE_G13D_DERIV or NE_HESS_NOT_POS_DEF and cgetol has been set to ${10}^{-l}$, then all the parameters should be accurate to approximately $l$ decimal places. If cgetol was set equal to a value less than the machine precision, $\epsilon $, then all the parameters should be accurate to approximately $10.0\times \sqrt{\epsilon}$.
If ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_MAXCAL on exit (i.e., maxcal likelihood evaluations have been made but the convergence conditions of the search function have not been satisfied), then the elements in par and qq may still be good approximations to the ML estimates. Inspection of the elements of g may help you determine whether this is likely.
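One simple way to inspect g is to look at its largest absolute element. The sketch below is a hypothetical helper; what counts as "small" depends on the scaling of the problem and is left to the caller. Recall that g is returned as all zeros when the gradient could not be computed, so a result of exactly zero should be treated with caution.

```c
#include <math.h>
#include <stddef.h>

/* Largest absolute element of the gradient array g of length npar. */
double max_abs_gradient(const double *g, size_t npar)
{
    double largest = 0.0;
    for (size_t i = 0; i < npar; ++i) {
        double a = fabs(g[i]);
        if (a > largest) {
            largest = a;
        }
    }
    return largest;
}
```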
8 Parallelism and Performance
g13ddc is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g13ddc makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this function. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
9 Further Comments
9.1 Memory Usage
Let $r=\mathrm{max}\,({\mathbf{ip}},{\mathbf{iq}})$ and $s={\mathbf{npar}}+{\mathbf{k}}\times ({\mathbf{k}}+1)/2$. Local workspace arrays of fixed lengths are allocated internally by g13ddc. The total size of these arrays amounts to $s+{\mathbf{k}}\times r+52$ Integer elements and $2\times {s}^{2}+s\times (s-1)/2+15\times s+{{\mathbf{k}}}^{2}\times (2\times {\mathbf{ip}}+{\mathbf{iq}}+{(r+3)}^{2})+{\mathbf{k}}\times (2\times {r}^{2}+2\times r+3\times {\mathbf{n}}+4)+10$ double elements.
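The two workspace formulas can be evaluated directly, which may help when estimating memory requirements for larger models. This is a plain transcription of the expressions above into a hypothetical helper; the struct and function names are not part of the NAG Library.

```c
typedef struct {
    long n_integer;  /* number of Integer workspace elements */
    long n_double;   /* number of double workspace elements */
} workspace_size;

workspace_size varma_workspace_size(int k, int n, int ip, int iq, int npar)
{
    long r = (ip > iq) ? ip : iq;
    long s = (long)npar + (long)k * (k + 1) / 2;
    long kk = (long)k * k;
    workspace_size ws;

    ws.n_integer = s + (long)k * r + 52;
    ws.n_double = 2 * s * s + s * (s - 1) / 2 + 15 * s
                + kk * (2L * ip + iq + (r + 3) * (r + 3))
                + (long)k * (2 * r * r + 2 * r + 3L * n + 4)
                + 10;
    return ws;
}
```

For a bivariate AR(1) model with the mean included and 48 observations ($k=2$, $n=48$, $p=1$, $q=0$, npar $=6$), this gives $r=1$, $s=9$, 63 Integer elements and 719 double elements.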
9.2 Timing
The number of iterations required depends upon the number of parameters in the model and the distance of the user-supplied starting point from the solution.
9.3 Constraining for Stationarity and Invertibility
If the solution lies on the boundary of the admissibility region (stationarity and invertibility region) then g13ddc may get into difficulty and exit with ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_MAX_LOGLIK. If this exit occurs you are advised to either try a different starting point or a different setting for exact. If this still continues to occur then you are urged to try fitting a more parsimonious model.
9.4 Over-parameterisation
You are advised to avoid fitting models with an excessive number of parameters, since over-parameterisation can cause the maximization problem to become ill-conditioned.
9.5 Standardizing the Residual Series
The standardized estimates of the residual series ${\epsilon}_{t}$ (denoted by ${\hat{e}}_{t}$) can easily be calculated by forming the Cholesky decomposition of $\Sigma $, i.e., $\Sigma =G{G}^{\mathrm{T}}$, and setting ${\hat{e}}_{t}={G}^{-1}{\hat{\epsilon}}_{t}$. f07fdc may be used to calculate the factor $G$. The components of ${\hat{e}}_{t}$, which are now uncorrelated at all lags, can sometimes be more easily interpreted.
9.6 Assessing the Fit of the Model
If your time series model provides a good fit to the data then the residual series should be approximately white noise, i.e., exhibit no serial cross-correlation. An examination of the residual cross-correlation matrices should confirm whether this is likely to be so. You are advised to call g13dsc to provide information for diagnostic checking. g13dsc returns the residual cross-correlation matrices along with their asymptotic standard errors. g13dsc also computes a portmanteau statistic and its asymptotic significance level for testing model adequacy. If ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$NE_G13D_BOUND, NE_G13D_DERIV, NE_G13D_MAX_LOGLIK or NE_HESS_NOT_POS_DEF on exit from g13ddc, then the quantities k, n, v, kmax, ip, iq, par, parhld and qq will be suitable for input to g13dsc.
10 Example
This example shows how to fit a bivariate AR(1) model to two series each of length $48$. $\mu $ will be estimated and ${\varphi}_{1}(2,1)$ will be constrained to be zero.