g13aef fits a seasonal autoregressive integrated moving average (ARIMA) model to an observed time series, using a nonlinear least squares procedure incorporating backforecasting. Parameter estimates are obtained, together with appropriate standard errors. The residual series is returned, and the information needed to forecast the time series is produced for use by the routines g13agf and g13ahf.
The estimation procedure is iterative, starting with initial parameter values such as may be obtained using g13adf. It continues until a specified convergence criterion is satisfied, or until a specified number of iterations has been carried out. The progress of the procedure can be monitored by means of a user-supplied routine.
The routine may be called by the names g13aef or nagf_tsa_uni_arima_estim.
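The time series $x_1, x_2, \ldots, x_n$ supplied to g13aef is assumed to follow a seasonal autoregressive integrated moving average (ARIMA) model defined as follows:
$$\nabla^d \nabla_s^D x_t - c = w_t,$$
where $\nabla^d \nabla_s^D x_t$ is the result of applying non-seasonal differencing of order $d$ and seasonal differencing of seasonality $s$ and order $D$ to the series $x_t$, as outlined in the description of g13aaf. The differenced series is then of length $N = n - d'$, where $d' = d + D \times s$ is the generalized order of differencing. The scalar $c$ is the expected value of the differenced series, and the series $w_1, w_2, \ldots, w_N$ follows a zero-mean stationary autoregressive moving average (ARMA) model defined by a pair of recurrence equations. These express $w_t$ in terms of an uncorrelated series $a_t$, via an intermediate series $e_t$. The first equation describes the seasonal structure:
$$w_t = \Phi_1 w_{t-s} + \Phi_2 w_{t-2s} + \cdots + \Phi_P w_{t-Ps} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2s} - \cdots - \Theta_Q e_{t-Qs}.$$
The second equation describes the non-seasonal structure. If the model is purely non-seasonal the first equation is redundant and $e_t$ above is equated with $w_t$:
$$e_t = \phi_1 e_{t-1} + \phi_2 e_{t-2} + \cdots + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}.$$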
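Estimates of the model parameters defined by
$$\phi_1, \ldots, \phi_p, \quad \theta_1, \ldots, \theta_q, \quad \Phi_1, \ldots, \Phi_P, \quad \Theta_1, \ldots, \Theta_Q$$
and (optionally) $c$ are obtained by minimizing a quadratic form in the vector $w = (w_1, w_2, \ldots, w_N)^T$.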
This is $QF = w^T V^{-1} w$, where $V$ is the covariance matrix of $w$, and is a function of the model parameters. This matrix is not explicitly evaluated, since $QF$ may be expressed as a ‘sum of squares’ function. When moving average parameters $\theta_i$ or $\Theta_i$ are present, so that the generalized moving average order $q' = q + s \times Q$ is positive, backforecasts $w_{1-q'}, w_{2-q'}, \ldots, w_0$ are introduced as nuisance parameters. The ‘sum of squares’ function may then be written as
$$S(pm) = \sum_{t=1-q'}^{N} a_t^2 - \sum_{t=1-q'-p'}^{-q'} b_t^2,$$
where $pm$ is a combined vector of parameters, consisting of the backforecasts followed by the ARMA model parameters.
The terms $a_t$ correspond to the ARMA model residual series $a_t$, and $p' = p + s \times P$ is the generalized autoregressive order. The terms $b_t$ are only present if autoregressive parameters are in the model, and serve to correct for transient errors introduced at the start of the autoregression.
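As an illustration of how a residual series follows from the non-seasonal recurrence, the sketch below computes residuals from an intermediate series and sums their squares. It is a simplified stand-in, not the routine's method: the function names are hypothetical, and all values before the start of the series are assumed to be zero, whereas g13aef estimates backforecasts and includes the transient correction terms.

```python
def arma_residuals(e, phi, theta):
    """Residuals a_t from e_t = sum(phi_i e_{t-i}) + a_t - sum(theta_j a_{t-j}).

    Assumption of this sketch: values of e and a before the first
    observation are zero (no backforecasting).
    """
    a = []
    for t in range(len(e)):
        value = e[t]
        # Remove the autoregressive part.
        for i, ph in enumerate(phi, start=1):
            if t - i >= 0:
                value -= ph * e[t - i]
        # Add back the moving average part, using residuals already computed.
        for j, th in enumerate(theta, start=1):
            if t - j >= 0:
                value += th * a[t - j]
        a.append(value)
    return a


def sum_of_squares(e, phi, theta):
    """Sum of squared residuals for the given parameter values."""
    return sum(v * v for v in arma_residuals(e, phi, theta))
```

Calling `sum_of_squares(e, [0.5], [0.3])` evaluates the criterion for one autoregressive and one moving average parameter; an estimation procedure would repeat this for successive parameter iterates.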
The equations defining and are precisely:
For all four of these equations, the following conditions hold:
Minimization of $S$ with respect to $pm$ uses an extension of the algorithm of Marquardt (1963).
The first derivatives of $S$ with respect to the parameters are calculated as
$$2 \sum a_t \, a_{t,i} - 2 \sum b_t \, b_{t,i} = 2 G_i,$$
where $a_{t,i}$ and $b_{t,i}$ are derivatives of $a_t$ and $b_t$ with respect to the $i$th parameter.
The second derivative of $S$ is approximated by
$$2 \sum a_{t,i} \, a_{t,j} - 2 \sum b_{t,i} \, b_{t,j} = 2 H_{ij}.$$
Successive parameter iterates are obtained by calculating a vector of corrections $dpm$ by solving the equations
$$(H + \alpha D) \, dpm = -G,$$
where $G$ is a vector with elements $G_i$, $H$ is a matrix with elements $H_{ij}$, $\alpha$ is a scalar used to control the search and $D$ is the diagonal matrix of $H$.
The new parameter values are then $pm + dpm$.
The scalar $\alpha$ controls the step size, to which it is inversely related.
If a step results in new parameter values which give a reduced value of $S$, then $\alpha$ is reduced by a factor $\beta$. If a step results in new parameter values which give an increased value of $S$, or in ARMA model parameters which in any way contravene the stationarity and invertibility conditions, then the new parameters are rejected, $\alpha$ is increased by the factor $\beta$, and the revised equations are solved for a new parameter correction.
This action is repeated until either a reduced value of $S$ is obtained, or $\alpha$ reaches the limit of $10^9$, which is used to indicate a failure of the search procedure.
This failure may be due to a badly conditioned sum of squares function or to too strict a convergence criterion. Convergence is deemed to have occurred if the fractional reduction in the residual sum of squares in successive iterations is less than a value $\gamma$, while $\alpha < 1.0$.
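The search described above can be sketched as a damped Gauss-Newton loop. This is a minimal illustration under stated assumptions, not the routine's implementation: the names `marquardt`, `solve`, `alpha_limit` and `step` are hypothetical, the derivatives are taken by forward differences, only the residual terms enter the gradient and second-derivative approximation, the stationarity and invertibility checks are left out, and convergence is judged on the fractional reduction alone.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def marquardt(residuals, pm, alpha=0.001, beta=10.0, gamma=1e-7,
              alpha_limit=1e9, max_iter=50, step=1e-6):
    """Minimize the sum of squares of residuals(pm); returns (pm, S)."""
    def ssq(p):
        return sum(r * r for r in residuals(p))

    n = len(pm)
    s = ssq(pm)
    for _ in range(max_iter):
        a = residuals(pm)
        # Forward-difference derivative of every residual w.r.t. parameter i.
        jac = []
        for i in range(n):
            shifted = pm[:]
            shifted[i] += step
            jac.append([(u - v) / step for u, v in zip(residuals(shifted), a)])
        G = [sum(x * y for x, y in zip(a, jac[i])) for i in range(n)]
        H = [[sum(x * y for x, y in zip(jac[i], jac[j])) for j in range(n)]
             for i in range(n)]
        while True:
            # Solve (H + alpha * D) dpm = -G, with D the diagonal of H.
            A = [[H[i][j] + (alpha * H[i][i] if i == j else 0.0)
                  for j in range(n)] for i in range(n)]
            dpm = solve(A, [-g for g in G])
            trial = [p + d for p, d in zip(pm, dpm)]
            s_new = ssq(trial)
            if s_new < s:
                alpha /= beta          # accepted step: relax the damping
                break
            alpha *= beta              # rejected step: damp more strongly
            if alpha > alpha_limit:
                raise RuntimeError("search procedure failed")
        reduction = (s - s_new) / s
        pm, s = trial, s_new
        if reduction < gamma:
            break
    return pm, s
```

For example, with `residuals = lambda p: [p[0] + p[1] * x - y for x, y in data]` the loop fits a straight line to the pairs in `data`. Raising an exception when the search scalar passes its cap mirrors the failure condition in the text; a production routine would report it through an error flag instead.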
The stationarity and invertibility conditions are tested to within a specified tolerance multiple of machine accuracy. Upon convergence, or completion of the specified maximum number of iterations without convergence, statistical properties of the estimates are derived. In the latter case the sequence of iterates should be checked to ensure that convergence is adequate for practical purposes, otherwise these properties are not reliable.
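One standard way to test stationarity without finding polynomial roots is the step-down (inverse Durbin-Levinson) recursion: the autoregressive operator is stationary exactly when every partial autocorrelation it implies lies strictly inside (-1, 1). The sketch below is an assumption about how such a test could be written, not a description of the test used inside g13aef; the function name and the `tol` argument are hypothetical. Applying the same function to the moving average parameters tests invertibility.

```python
def is_stationary(phi, tol=0.0):
    """True if 1 - phi_1 B - ... - phi_p B^p has all roots outside the unit circle.

    tol shrinks the admissible region slightly, standing in for a
    tolerance on the boundary (hypothetical parameter).
    """
    coeffs = list(phi)
    while coeffs:
        k = coeffs[-1]                 # partial autocorrelation at this order
        if abs(k) >= 1.0 - tol:
            return False
        m = len(coeffs) - 1
        # Step down to the coefficients of the order-m operator.
        coeffs = [(coeffs[j] + k * coeffs[m - 1 - j]) / (1.0 - k * k)
                  for j in range(m)]
    return True
```

An empty parameter list is treated as trivially stationary, which matches the case where no parameter of that type is in the model.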
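The estimated residual variance is
$$erv = S_{\min} / df,$$
where $S_{\min}$ is the final value of $S$, and the residual number of degrees of freedom is given by
$$df = N - p - q - P - Q \quad (-1 \text{ if } c \text{ is estimated}).$$
The covariance matrix of the vector of estimates $pm$ is given by
$$erv \times H^{-1},$$
where $H$ is evaluated at the final parameter values.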
From this expression are derived the vector of standard deviations, and the correlation matrix for the whole parameter set. These are asymptotic approximations.
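The derivation of standard deviations and correlations from the covariance matrix can be sketched as follows. The sketch assumes the usual least squares relation in which the covariance matrix is the residual variance times the inverse of the approximate second-derivative matrix; the function names and the pure-Python inversion are illustrative choices, not part of the library.

```python
import math


def invert(H):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(H)
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(H)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]


def estimate_statistics(s_min, H, df):
    """Return (residual variance, standard deviations, correlation matrix)."""
    n = len(H)
    erv = s_min / df
    cov = [[erv * v for v in row] for row in invert(H)]
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    corr = [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]
    return erv, sd, corr
```

Because the correlations are ratios of covariance entries, the residual variance cancels out of them; only the standard deviations depend on it.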
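The differenced series $w_t$ (now uncorrected for the constant), intermediate series $e_t$ and residual series $a_t$ are all available upon completion of the iterations over the range (extended by backforecasts)
$$t = 1 - q', 2 - q', \ldots, N,$$
where $q'$ is the generalized moving average order.
The values $a_t$ can only properly be interpreted as residuals for $t \ge 1 + p' - q'$, as the earlier values are corrupted by transients if $p' > 0$, where $p'$ is the generalized autoregressive order.
In consequence of the manner in which differencing is implemented, the residual $a_t$ is the one step ahead forecast error for $x_{t+d'}$, where $d'$ is the generalized order of differencing.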
For convenient application in forecasting, the following quantities constitute the ‘state set’, which contains the minimum amount of time series information needed to construct forecasts:
(i) the differenced series $w_t$, for $(N - s \times P) < t \le N$,
(ii) the $d'$ values required to reconstitute the original series $x_t$ from the differenced series $w_t$,
(iii) the intermediate series $e_t$, for $(N - \max(p, Q \times s)) < t \le N$,
(iv) the residual series $a_t$, for $(N - q) < t \le N$.
This state set is available upon completion of the iterations. The routine may be used purely for the construction of this state set, given a previously estimated model and time series $x_t$, by requesting zero iterations. Backforecasts are estimated, but the model parameter values are unchanged. If later observations become available and it is desired to update the state set, g13agf can be used.
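The differenced series held in the state set comes from non-seasonal and seasonal differencing of the original series. The sketch below shows that operation and the resulting loss of leading values; the function and argument names are hypothetical, and the routine's own differencing (described under g13aaf) may be organised differently.

```python
def difference(x, d, big_d, s):
    """Apply d non-seasonal and big_d seasonal differences of period s."""
    w = list(x)
    for _ in range(d):
        # Each non-seasonal difference shortens the series by one value.
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    for _ in range(big_d):
        # Each seasonal difference shortens the series by s values.
        w = [w[t] - w[t - s] for t in range(s, len(w))]
    return w
```

A series of length n therefore yields n - d - big_d * s differenced values, which is why that many extra values must be kept in order to rebuild the original series from the differenced one.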
Box G E P and Jenkins G M (1976) Time Series Analysis: Forecasting and Control (Revised Edition) Holden–Day
Marquardt D W (1963) An algorithm for least squares estimation of nonlinear parameters J. Soc. Indust. Appl. Math. 11 431
1: mr(7) – Integer array Input
On entry: the orders vector $(p, d, q, P, D, Q, s)$ of the ARIMA model whose parameters are to be estimated. $p$, $q$, $P$ and $Q$ refer respectively to the number of autoregressive ($\phi$), moving average ($\theta$), seasonal autoregressive ($\Phi$) and seasonal moving average ($\Theta$) parameters. $d$, $D$ and $s$ refer respectively to the order of non-seasonal differencing, the order of seasonal differencing and the seasonal period.
, , , , , , ;
if , ;
if , ;
2: par – Real (Kind=nag_wp) array Input/Output
On entry: the initial estimates of the $p$ values of the $\phi$ parameters, the $q$ values of the $\theta$ parameters, the $P$ values of the $\Phi$ parameters and the $Q$ values of the $\Theta$ parameters, in that order.
On exit: the latest values of the estimates of these parameters.
3: – Integer Input
On entry: the total number of $\phi$, $\theta$, $\Phi$ and $\Theta$ parameters to be estimated.
4: c – Real (Kind=nag_wp) Input/Output
On entry: if , c must contain the expected value, , of the differenced series.
If the routine exits because of a faulty input parameter, the contents of ex will be indeterminate.
10: exr – Real (Kind=nag_wp) array Output
On exit: the values of the model residuals, made up of:
residuals corresponding to the backforecasts in the differenced series.
residuals corresponding to the actual values in the differenced series.
The remaining values contain zeros.
If the routine exits with ifail holding a value other than or , the contents of exr will be indeterminate.
11: al – Real (Kind=nag_wp) array Output
On exit: the intermediate series, made up of:
intermediate series values corresponding to the backforecasts in the differenced series.
intermediate series values corresponding to the actual values in the differenced series.
The remaining values contain zeros.
If the routine exits with , the contents of al will be indeterminate.
12: – Integer Input
On entry: the dimension of the arrays ex, exr and al as declared in the (sub)program from which g13aef is called.
, which is equivalent to the exit value of .
13: s – Real (Kind=nag_wp) Output
On exit: the residual sum of squares after the latest series of parameter estimates has been incorporated into the model. If the routine exits with a faulty input parameter, s contains zero.
14: g – Real (Kind=nag_wp) array Output
On exit: the latest value of the derivatives of $S$ with respect to each of the parameters being estimated (backforecasts, par parameters, and where relevant the constant – in that order). The contents of g will be indeterminate if the routine exits with a faulty input parameter.
15: – Integer Input
On entry: the dimension of the arrays g and sd and the second dimension of the arrays h and hc as declared in the (sub)program from which g13aef is called.
which is equivalent to the exit value of .
16: sd – Real (Kind=nag_wp) array Output
On exit: the standard deviations corresponding to each of the parameters being estimated (backforecasts, par parameters, and where relevant the constant, in that order).
If the routine exits with ifail containing a value other than or , or if the required number of iterations is zero, the contents of sd will be indeterminate.
17: h – Real (Kind=nag_wp) array Output
On exit: the second derivative of $S$ and correlation coefficients:
(a) the latest values of an approximation to the second derivative of $S$ with respect to each of the parameters being estimated (backforecasts, par parameters, and where relevant the constant – in that order), and
(b) the correlation coefficients relating to each pair of these parameters.
These are held in a matrix defined by the first rows and the first columns of h. (Note that contains the value of this expression.) The values of (a) are contained in the upper triangle, and the values of (b) in the strictly lower triangle.
These correlation coefficients are zero during intermediate printout using piv, and indeterminate if ifail contains on exit a value other than or .
All the contents of h are indeterminate if the required number of iterations is zero. The th row of h is used internally as workspace.
On entry: kpiv must be nonzero if the progress of the optimization is to be monitored using piv. Otherwise kpiv must contain 0.
24: – Integer Input
On entry: the maximum number of iterations to be performed.
25: – Integer Output
On exit: the number of iterations performed.
26: zsp – Real (Kind=nag_wp) array Input/Output
On entry: when , the first four elements of zsp must contain the four values used to guide the search procedure. These are as follows.
zsp(1) contains $\alpha$, the value used to constrain the magnitude of the search procedure steps.
zsp(2) contains $\beta$, the multiplier which regulates the value of $\alpha$.
zsp(3) contains $\delta$, the value of the stationarity and invertibility test tolerance factor.
zsp(4) contains $\gamma$, the value of the convergence criterion.
If on entry, default values for zsp are supplied by the routine.
These are , , and respectively.
On exit: zsp contains the values, default or otherwise, used by the routine.
27: – Integer Input
On entry: the value if the routine is to use the input values of zsp, and any other value if the default values of zsp are to be used.
28: isf – Integer array Output
On exit: isf contains success/failure indicators, one for each of the four types of parameter in the model (autoregressive, moving average, seasonal autoregressive, seasonal moving average), in that order.
Each indicator has the interpretation:
On entry parameters of this type have initial estimates which do not satisfy the stationarity or invertibility test conditions.
The search procedure has failed to converge because the latest set of parameter estimates of this type is invalid.
No parameter of this type is in the model.
Valid final estimates for parameters of this type have been obtained.
29: wa – Real (Kind=nag_wp) array Workspace
30: – Integer Input
On entry: the dimension of the array wa as declared in the (sub)program from which g13aef is called.
if , ;
if , , ;
if , , , ;
31: – Real (Kind=nag_wp) array Workspace
32: ifail – Integer Input/Output
On entry: ifail must be set to 0, -1 or 1 to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of 0 causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of -1 means that an error message is printed while a value of 1 means that it is not.
If halting is not appropriate, the value -1 or 1 is recommended. If message printing is undesirable, then the value 1 is recommended. Otherwise, the value -1 is recommended since useful values can be provided in some output arguments even when ifail is nonzero on exit. When the value -1 or 1 is used it is essential to test the value of ifail on exit.
On exit: ifail = 0 unless the routine detects an error or a warning has been flagged (see Section 6).
6 Error Indicators and Warnings
If on entry ifail = 0 or -1, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
Note: in some cases g13aef may return useful information.
The model is over-parameterised. The number of parameters in the model is greater than the number of terms in the differenced series, i.e., .
On entry, .
On entry, .
On entry, .
On entry, .
On entry, .
On entry, and the minimum size (the minimum size required is returned in nst).
On entry, and the minimum size . Constraint: .
On entry, .
On entry, .
On entry, .
A failure in the search procedure has occurred. Some output arguments may contain meaningful values.
Failure to invert . Some output arguments may contain meaningful values.
Unable to calculate the latest estimates of the backforecasts.
Some output arguments may contain meaningful values.
Satisfactory parameter estimates could not be obtained for all parameter types in the model. Inspect array isf for further information on the parameter type(s) in error.
An unexpected error has been triggered by this routine. Please
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
The computations are believed to be stable.
8 Parallelism and Performance
Background information to multithreading can be found in the Multithreading documentation.
g13aef is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g13aef makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
The time taken by g13aef is approximately proportional to .
The following program reads observations from a time series relating to the rate of the earth's rotation about its polar axis. Differencing of order is applied, and the number of non-seasonal parameters is , one autoregressive , and two moving average . No seasonal effects are taken into account.
The constant is estimated. Up to iterations are allowed.