NAG C Library Function Document
1 Purpose
nag_tsa_multi_inp_model_estim (g13bec) fits a time series model to one output series, relating it to any input series, with a choice of three different estimation criteria: nonlinear least-squares, exact likelihood and marginal likelihood. When no input series are present, nag_tsa_multi_inp_model_estim (g13bec) fits a univariate ARIMA model.
2 Specification
void nag_tsa_multi_inp_model_estim (Nag_ArimaOrder *arimav, Integer nseries, Nag_TransfOrder *transfv, double para[], Integer npara, Integer nxxy, const double xxy[], Integer tdxxy, double sd[], double *rss, double *objf, double *df, Nag_G13_Opt *options, NagError *fail)
3 Description
3.1 The Multi-input Model
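The output series $y_t$, for $t = 1, 2, \ldots, n$, is assumed to be the sum of (unobserved) components $z_{i,t}$ which are due respectively to the inputs $x_{i,t}$, for $i = 1, 2, \ldots, m$.
Thus $y_t = z_{1,t} + z_{2,t} + \cdots + z_{m,t} + n_t$, where $n_t$ is the error, or output noise component.
A typical component $z_t$ may be either:
- (a) A simple regression component, $z_t = \omega x_t$ (here $x_t$ is called a simple input), or
- (b) A transfer function model component which allows for the effect of lagged values of the variable, related to $x_t$ by
$$z_t = \delta_1 z_{t-1} + \delta_2 z_{t-2} + \cdots + \delta_p z_{t-p} + \omega_0 x_{t-b} - \omega_1 x_{t-b-1} - \cdots - \omega_q x_{t-b-q}.$$
The noise $n_t$ is assumed to follow a (possibly seasonal) ARIMA model, i.e., may be represented in terms of an uncorrelated series, $a_t$, by the hierarchy of equations:
- (c) $\nabla^d \nabla_s^D n_t = c + w_t$
- (d) $w_t = \Phi_1 w_{t-s} + \Phi_2 w_{t-2 \times s} + \cdots + \Phi_P w_{t-P \times s} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2 \times s} - \cdots - \Theta_Q e_{t-Q \times s}$
- (e) $e_t = \phi_1 e_{t-1} + \phi_2 e_{t-2} + \cdots + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$
Note: the orders $p$, $q$ appearing in each of the transfer function models and the ARIMA model are not necessarily the same; $\nabla^d \nabla_s^D n_t$ is the result of applying non-seasonal differencing of order $d$ and seasonal differencing of seasonality $s$ and order $D$ to the series $n_t$; the differenced series is then of length $N = n - d - s \times D$; the constant term parameter $c$ may optionally be held fixed at its initial value (usually, but not necessarily, zero) rather than being estimated.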
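The differencing described in the note above can be sketched in plain C. This is an illustrative helper, not part of the library: the names difference_once, difference_series and differenced_length are hypothetical, and the sketch assumes that the differencing is applied as $d$ passes at lag 1 followed by $D$ passes at lag $s$, each pass shortening the series by its lag.

```c
#include <assert.h>

/* Length left after d non-seasonal and bigd seasonal (period s)
   differences of a series of length n: N = n - d - s*bigd. */
long differenced_length(long n, long d, long bigd, long s)
{
    return n - d - s * bigd;
}

/* Difference x[0..n-1] in place at the given lag. The result occupies
   x[0..n-lag-1]; the new length is returned. */
long difference_once(double *x, long n, long lag)
{
    assert(lag > 0 && n > lag);
    for (long t = 0; t + lag < n; ++t)
        x[t] = x[t + lag] - x[t];   /* x[t + lag] is still unmodified here */
    return n - lag;
}

/* Apply d lag-1 differences, then bigd lag-s differences, in place. */
long difference_series(double *x, long n, long d, long bigd, long s)
{
    for (long k = 0; k < d; ++k)
        n = difference_once(x, n, 1);
    for (long k = 0; k < bigd; ++k)
        n = difference_once(x, n, s);
    return n;
}
```

With $n = 40$, $d = 0$, $D = 1$ and $s = 4$, differenced_length returns 36.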
For the purpose of defining an estimation criterion it is assumed that the series $a_t$ is a sequence of independent Normal variates having mean 0 and variance $\sigma_a^2$. An allowance has to be made for the effects of unobserved data prior to the observation period. For the noise component an allowance is always made using a form of backforecasting.
For each transfer function input, the user has to decide what values are to be assumed for the pre-period terms $z_0, z_{-1}, \ldots, z_{1-p}$ and $x_0, x_{-1}, \ldots, x_{1-b-q}$ which are in theory necessary to re-create the component series $z_1, z_2, \ldots, z_n$ during the estimation procedure.
The first choice is to assume that all these values are zero. In this case, in order to avoid undesirable transient distortion of the early values $z_1, z_2, \ldots$, the user is advised first to correct the input series by subtracting from all the terms $x_t$ a suitable constant to make the early values $x_1, x_2, \ldots$ close to zero. The series mean $\bar{x}$ is one possibility, but for a series with strong trend, the constant might be simply $x_1$.
The second choice is to treat the unknown pre-period terms as nuisance parameters and estimate them along with the other parameters. This choice should be used with caution. For example, if $p = 1$ and $b = q = 0$, it is equivalent to fitting to the data a decaying geometric curve of the form $A \delta^t$, for $t = 1, 2, 3, \ldots$, along with the other inputs, this being the form of the transient. If the output contains a strong trend of this form, which is not otherwise represented in the model, it will have a tendency to influence the estimate of $\delta$ away from the value appropriate to the transfer function model.
In most applications the first choice should be adequate, with the second option possibly being used as a refinement at the end of the modelling process. The number of nuisance parameters is then $\max(p, b + q)$, with a corresponding loss of degrees of freedom in the residuals. If the user aligns the input $x_t$ with the output by using in its place the shifted series $x_{t-b}$, then setting $b = 0$ in the transfer function model, there is some improvement in efficiency. On some occasions when the model contains two or more inputs, each with estimation of pre-period nuisance parameters, these parameters may be co-linear and lead to failure of the function. The option must then be ‘switched off’ for one or more inputs.
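A minimal sketch of the first choice, written in plain C rather than through the library, generates a component series from an input with every pre-period value taken as zero. The recursion assumed is $z_t = \delta_1 z_{t-1} + \cdots + \delta_p z_{t-p} + \omega_0 x_{t-b} - \omega_1 x_{t-b-1} - \cdots - \omega_q x_{t-b-q}$, the usual Box and Jenkins sign convention; the function name transfer_component and its argument layout are hypothetical.

```c
#include <assert.h>

/* Build z[0..n-1] from x[0..n-1] by the transfer function recursion.
   omega holds q+1 values (omega_0..omega_q), delta holds p values.
   Any term whose index falls before the first observation is zero. */
void transfer_component(const double *x, long n,
                        long b, long q, long p,
                        const double *omega, const double *delta,
                        double *z)
{
    assert(n > 0 && b >= 0 && q >= 0 && p >= 0);
    for (long t = 0; t < n; ++t) {
        double zt = 0.0;
        for (long k = 1; k <= p; ++k)          /* autoregressive part */
            if (t - k >= 0)
                zt += delta[k - 1] * z[t - k];
        for (long j = 0; j <= q; ++j) {        /* lagged input part */
            long lag = t - b - j;
            if (lag >= 0)
                zt += (j == 0 ? omega[0] : -omega[j]) * x[lag];
        }
        z[t] = zt;
    }
}
```

For a unit impulse input with $b = 1$, $q = 0$, $p = 1$, $\omega_0 = 2.0$ and $\delta_1 = 0.5$, the response is 0, 2, 1, 0.5, and so on: a geometric decay of the kind that the pre-period nuisance parameters would otherwise have to absorb.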
3.2 The Estimation Criterion
This is a measure of how well a proposed set of parameters in the transfer function and noise ARIMA models matches the data. The estimation function searches for parameter values which minimize this criterion. For a proposed set of parameter values it is derived by calculating:
- the components $z_{1,t}, z_{2,t}, \ldots, z_{m,t}$ as the responses to the input series $x_{1,t}, x_{2,t}, \ldots, x_{m,t}$ using the equations (a) or (b) above,
- the discrepancy between the output and the sum of these components, as the noise $n_t = y_t - (z_{1,t} + z_{2,t} + \cdots + z_{m,t})$,
- the residual series $a_t$ from $n_t$ by reversing the recursive equations (c), (d) and (e) above.
This last step again requires treatment of the effect of unknown pre-period values of $n_t$ and other terms in the equations regenerating $a_t$. One approach is to use a sum of squares function as the estimation criterion, which is equivalent to taking the infinite set of past values $n_0, n_{-1}, n_{-2}, \ldots$ as (linear) nuisance parameters. There is no loss of degrees of freedom however, because the sum of squares function $S$ may be expressed as including the corresponding set of past residuals; see Box and Jenkins (1976), page 273, who prove that
$$S = \sum_{t=-\infty}^{n} a_t^2.$$
The function $D = S$ is the first of the three possible criteria, and is quite adequate for moderate to long series with no seasonal parameters. The second is the exact likelihood criterion which considers the past set $n_0, n_{-1}, n_{-2}, \ldots$ not as simple nuisance parameters, but as unobserved random variables with known distribution. Calculation of the likelihood of the observed set $n_1, n_2, \ldots, n_n$ requires theoretical integration over the range of the past set. Fortunately this yields a criterion of the form $D = M \times S$ (whose minimization is equivalent to maximizing the exact likelihood of the data), where $S$ is exactly as before, and the multiplier $M$ is a function calculated from the ARIMA model parameters. The value of $M$ is always $\geq 1$, and $M$ tends to 1 for any fixed parameter set as the sample size $n$ tends to $\infty$. There is a moderate computational overhead in using this option, but its use avoids appreciable bias in the ARIMA model parameters and yields a better conditioned estimation problem.
The third criterion of marginal likelihood treats the coefficients of the simple inputs in a manner analogous to that given to the past set $n_0, n_{-1}, n_{-2}, \ldots$. These coefficients, together with the constant term $c$ used to represent the mean of $w_t$, are in effect treated as random variables with highly dispersed distributions. This leads to the criterion $D = M \times S$ again, but with a different value of $M$ which now depends on the simple input series values $x_t$. In the presence of a moderate to large number of simple inputs, the marginal likelihood criterion can counteract bias in the ARIMA model parameters which is caused by estimation of the simple inputs. This is particularly important in relatively short series.
nag_tsa_multi_inp_model_estim (g13bec) can be used with no input series present, to estimate a univariate ARIMA model for the output alone. The marginal likelihood criterion is then distinct from exact likelihood only if a constant term is being estimated in the model, because this is treated as an implicit simple input.
3.3 The Estimation Procedure
This is the minimization of the estimation criterion or objective function $D$ (for deviance). The function uses an extension of the algorithm of Marquardt (1963). The step size in the minimization is inversely related to a parameter $\alpha$, which is increased or decreased by a factor $\beta$ at successive iterations, depending on the progress of the minimization. Convergence is deemed to have occurred if the fractional reduction of $D$ in successive iterations is less than a value $\gamma$.
Certain model parameters (in fact all excluding the $\omega$'s) are subject to stability constraints which are checked throughout to within a specified tolerance multiple of machine accuracy. Using the least-squares criterion, the minimization may halt prematurely when some parameters ‘stick’ at a constraint boundary. This can happen particularly with short seasonal series (with a small number of whole seasons). It will not happen using the exact likelihood criterion, although convergence to a point on the boundary may sometimes be rather slow, because the criterion function may be very flat in such a region. There is also a smaller risk of a premature halt at a constraint boundary when marginal likelihood is used.
A positive or zero number of iterations can be specified. In either case, the value $D$ of the objective function at iteration zero is computed at the initial parameter values, except for the estimation of any pre-period terms for the input series, backforecasts for the noise series, and the coefficients of any simple inputs, and the constant term (unless this is held fixed).
At any later iteration, the value of $D$ is computed after re-estimation of the backforecasts to their optimal values, corresponding to the model parameters presented at that iteration. This is not true for any pre-period terms for the input series which, although they are updated from the previous iteration, may not be precisely optimal for the parameter values presented, unless convergence of those parameters has occurred. However, in the case of marginal likelihood being specified, the coefficients of the simple inputs and the constant term are also re-estimated together with the backforecasts at each iteration, to values which are optimal for the other parameter values presented.
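The damping and convergence rules described in this section can be illustrated on a toy problem. The sketch below is not the library's algorithm: it is a plain Marquardt-style iteration for a one-parameter least-squares fit $y_i \approx \theta x_i$, with hypothetical function names, in which alpha, beta and gamma stand in for the damping parameter, its adjustment factor and the convergence tolerance.

```c
#include <assert.h>

/* Sum of squared residuals D(theta) for the model y = theta * x. */
double deviance(const double *x, const double *y, long n, double theta)
{
    double d = 0.0;
    for (long i = 0; i < n; ++i) {
        double r = y[i] - theta * x[i];
        d += r * r;
    }
    return d;
}

/* Damped Gauss-Newton iteration for one parameter. A larger alpha gives
   a shorter step; alpha is divided by beta after an accepted step and
   multiplied by beta after a rejected one. The loop stops when the
   fractional reduction in D falls below gamma, or after max_iter
   iterations. Returns the number of iterations used. */
long toy_marquardt(const double *x, const double *y, long n,
                   double *theta, double alpha, double beta,
                   double gamma, long max_iter)
{
    assert(n > 0 && alpha > 0.0 && beta > 1.0);
    assert(gamma >= 0.0 && gamma < 1.0);
    double d_old = deviance(x, y, n, *theta);
    long iter = 0;
    while (iter < max_iter) {
        double jtj = 0.0, jtr = 0.0;
        for (long i = 0; i < n; ++i) {
            jtj += x[i] * x[i];                     /* J'J */
            jtr += x[i] * (y[i] - *theta * x[i]);   /* J'r */
        }
        assert(jtj > 0.0);
        double trial = *theta + jtr / (jtj * (1.0 + alpha));
        double d_new = deviance(x, y, n, trial);
        ++iter;
        if (d_new < d_old) {                        /* accept the step */
            double reduction = (d_old - d_new) / d_old;
            *theta = trial;
            d_old = d_new;
            alpha /= beta;
            if (reduction < gamma)
                break;
        } else {                                    /* reject, damp harder */
            alpha *= beta;
        }
    }
    return iter;
}
```

The asserted ranges for alpha, beta and gamma mirror the constraints reported in Section 6; the model, the data and the exact form of the damped step are assumptions made only for this illustration.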
3.4 Further Results
The residual variance is taken as $erv = S / df$, where $df = N - {}$(total number of parameters estimated) is the residual degrees of freedom (for the definition of $S$ see Section 3.2 and for the definition of $N$ see Section 3.1). The pre-period nuisance parameters for the input series are included in the reduction of $df$, as is the constant if it is estimated.
The covariance matrix of the vector of model parameter estimates is given by
$$erv \times H^{-1},$$
where $H$ is the linearised least-squares matrix taken from the final iteration of the algorithm of Marquardt. From this expression are derived the vector of standard deviations, and the correlation matrix of parameter estimates. These are approximations which are only valid asymptotically, and must be treated with great caution when the parameter estimates are close to their constraint boundaries.
The residual series $a_t$ is available upon completion of the iterations over the range $t = 1 + d + s \times D, \ldots, n$ corresponding to the differenced noise series $w_t$.
Because of the algorithm used for backforecasting, these are only true residuals for $t \geq 1 + q + s \times Q - p - s \times P - d - s \times D$, provided this is positive. Estimation of pre-period terms for the inputs will also tend to reduce the magnitude of the early residuals, sometimes severely.
The model component series $z_{1,t}, \ldots, z_{m,t}$ and $n_t$ may optionally be returned in order to assess the effects of the various inputs on the output.
4 References
Box G E P and Jenkins G M (1976) Time Series Analysis: Forecasting and Control (Revised Edition) Holden–Day
Marquardt D W (1963) An algorithm for least-squares estimation of nonlinear parameters J. Soc. Indust. Appl. Math. 11 431
5 Parameters
arimav – Nag_ArimaOrder *
Pointer to structure of type Nag_ArimaOrder with the following members:
- p – Integer Input
- d – Integer Input
- q – Integer Input
- bigp – Integer Input
- bigd – Integer Input
- bigq – Integer Input
- s – Integer Input
On entry: these seven members of arimav must specify the orders vector $(p, d, q, P, D, Q, s)$, respectively, of the ARIMA model for the output noise component.
$p$, $q$, $P$ and $Q$ refer, respectively, to the number of autoregressive ($\phi$), moving average ($\theta$), seasonal autoregressive ($\Phi$) and seasonal moving average ($\Theta$) parameters.
$d$, $D$ and $s$ refer, respectively, to the order of non-seasonal differencing, the order of seasonal differencing and the seasonal period.
nseries – Integer Input
On entry: the total number of input and output series. There may be any number of input series (including none), but always one output series.
Constraint: nseries $> 1$ if there are no parameters in the model (that is $p = q = P = Q = 0$ and options.cfixed = TRUE), otherwise nseries $\geq 1$.
transfv – Nag_TransfOrder *
Pointer to structure of type Nag_TransfOrder with the following members:
- b – Integer *Input/Output
- q – Integer *Input/Output
- p – Integer *Input/Output
- r – Integer *Input/Output
On entry: before use these member pointers must be allocated memory by calling nag_tsa_transf_orders (g13byc), which allocates nseries $- 1$ elements to each pointer. The memory allocated to these pointers must be given the transfer function model orders $b$, $q$ and $p$ of each of the input series. The order parameters for input series $i$ are held in the $i$th element of the allocated memory for each pointer. b$[i-1]$ holds the value $b_i$, q$[i-1]$ holds the value $q_i$ and p$[i-1]$ holds the value $p_i$. For a simple input, $b_i = q_i = p_i = 0$.
r$[i-1]$ holds the value $r_i$, where $r_i = 1$ for a simple input, $r_i = 2$ for a transfer function input for which no allowance is to be made for pre-observation period effects, and $r_i = 3$ for a transfer function input for which pre-observation period effects will be treated by estimation of appropriate nuisance parameters. When $r_i = 1$, any non-zero contents of the $i$th element of b, q and p are ignored.
The memory allocated to the members of transfv must be freed by a call to nag_tsa_trans_free (g13bzc).
para[npara] – double Input/Output
On entry: initial values of the multi-input model parameters. These are in order, firstly the ARIMA model parameters: $p$ values of $\phi$ parameters, $q$ values of $\theta$ parameters, $P$ values of $\Phi$ parameters and $Q$ values of $\Theta$ parameters. These are followed by initial values of the transfer function model parameters $\omega_0, \omega_1, \ldots, \omega_{q_1}, \delta_1, \delta_2, \ldots, \delta_{p_1}$ for the first of any input series and similarly for each subsequent input series. The final component of para is the initial value of the constant $c$, whether it is fixed or is to be estimated.
On exit: the latest values of the estimates of these parameters.
npara – Integer Input
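On entry: the exact number of $\phi$, $\theta$, $\Phi$, $\Theta$, $\omega$, $\delta$ and $c$ parameters.
Constraint: npara $= p + q + P + Q + {}$nseries${} + \sum (p_i + q_i)$, the summation being over all the input series. ($c$ must be included, whether fixed or estimated.)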
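The parameter count can be checked before the call with a few lines of plain C. The function name count_para is hypothetical, and the formula assumes that each input series $i$ contributes $q_i + 1$ $\omega$ parameters and $p_i$ $\delta$ parameters, with one further entry for the constant $c$.

```c
#include <assert.h>

/* Number of entries expected in para:
   p + q + bigp + bigq ARIMA parameters,
   (q_i + 1) omegas and p_i deltas for each of the nseries - 1 inputs,
   and one constant term. tq[i] and tp[i] hold q_i and p_i for input i. */
long count_para(long p, long q, long bigp, long bigq,
                long nseries, const long *tq, const long *tp)
{
    assert(nseries >= 1);
    long total = p + q + bigp + bigq + 1;   /* ARIMA terms + constant */
    for (long i = 0; i < nseries - 1; ++i)
        total += tq[i] + 1 + tp[i];         /* omegas + deltas */
    return total;
}
```

For a model with one autoregressive parameter, one seasonal moving average parameter and a single input having one $\omega$ and one $\delta$, the count is 5 once the constant is included.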
nxxy – Integer Input
On entry: the (common) length of the original, undifferenced input and output time series.
xxy[nxxy][tdxxy] – const double Input
On entry: the columns of xxy must contain the nxxy original, undifferenced values of each of the input series, $x_t$, and the output series, $y_t$, in that order.
tdxxy – Integer Input
On entry: the second dimension of the array xxy as declared in the function from which nag_tsa_multi_inp_model_estim (g13bec) is called.
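Because xxy holds nxxy rows of tdxxy values each, packing separate series into it is a matter of index arithmetic. The helper below is an illustrative sketch with a hypothetical name (pack_xxy); it assumes row-major storage, so that observation t of series j sits at xxy[t*tdxxy + j], with the input series in the leading columns, the output series in column nseries - 1, and tdxxy at least nseries.

```c
#include <assert.h>

/* Copy nseries - 1 input series and one output series, each of length
   nxxy, into the flat array xxy. inputs[j] points to input series j. */
void pack_xxy(const double *const *inputs, const double *output,
              long nseries, long nxxy, long tdxxy, double *xxy)
{
    assert(nseries >= 1 && nxxy > 0 && tdxxy >= nseries);
    for (long t = 0; t < nxxy; ++t) {
        for (long j = 0; j < nseries - 1; ++j)
            xxy[t * tdxxy + j] = inputs[j][t];        /* input columns */
        xxy[t * tdxxy + (nseries - 1)] = output[t];   /* output column */
    }
}
```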
sd[npara] – double Output
On exit: the npara values of the standard deviations corresponding to each of the parameters in para. When the constant is fixed its standard deviation is returned as zero. When the values of para are valid, the values of sd are usually also valid unless the function fails to invert the second derivative matrix, in which case fail will have an exit value of NE_MAT_NOT_POS_DEF.
rss – double *Output
On exit: the residual sum of squares, $S$, at the latest set of valid parameter estimates.
objf – double *Output
On exit: the objective function, $D$, at the latest set of valid parameter estimates.
df – double *Output
On exit: the degrees of freedom associated with $S$.
options – Nag_G13_Opt *Input/Output
On entry/on exit: a pointer to a structure of type Nag_G13_Opt whose members are optional parameters for nag_tsa_multi_inp_model_estim (g13bec). If the optional parameters are not required, then the null pointer, G13_DEFAULT, can be used in the function call to nag_tsa_multi_inp_model_estim (g13bec). Details of the optional parameters and their types are given below in Section 10.2.
fail – NagError *Input/Output
The NAG error parameter, see the Essential Introduction.
6 Error Indicators and Warnings
On entry, the option structure, options, has not been initialised using nag_tsa_options_init (g13bxc).
given to transfv
not valid. Correct range for elements of transfv
On entry, the orders array structure, transfv, has not been successfully initialised using function nag_tsa_transf_orders (g13byc).
On entry, parameter options.cfixed had an illegal value.
On entry, parameter options.criteria had an illegal value.
On entry, parameter options.print_level had an illegal value.
On entry, nseries must not be less than 1: nseries = <value>.
On entry, options.max_iter must not be less than 0: options.max_iter = <value>.
On entry while . These parameters must satisfy .
On entry, options.alpha must not be less than or equal to 0.0: options.alpha = <value>.
On entry, options.beta must not be less than or equal to 1.0: options.beta = <value>.
On entry, options.delta must not be less than 1.0: options.delta = <value>.
On entry, options.gamma must not be less than 0.0: options.gamma = <value>.
On entry, options.gamma must not be greater than or equal to 1.0: options.gamma = <value>.
Memory allocation failed.
On entry, nseries = 1 and there are no parameters in the model, i.e., ($p = q = P = Q = 0$ and options.cfixed = TRUE).
Value of nseries passed to nag_tsa_transf_orders (g13byc) was <value>, which is not equal to the value <value> passed in this function.
On entry, there is inconsistency between npara on the one hand and the elements in the orders structures, arimav and transfv, on the other.
On entry, or during execution, one or more sets of parameters do not satisfy the stationarity or invertibility test conditions.
Iterative refinement has failed to improve the solution of the equations giving the latest estimates of the parameters. This occurred because the matrix of the set of equations is too ill-conditioned.
Attempt to invert the second derivative matrix needed in the calculation of the covariance matrix of the parameter estimates has failed. The matrix is not positive-definite, possibly due to rounding errors.
On entry, or during execution, one or more sets of the ARIMA ($\phi$, $\theta$, $\Phi$ or $\Theta$) parameters do not satisfy the stationarity or invertibility test conditions.
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please consult NAG for assistance.
The function has failed to converge after options.max_iter iterations. If steady decreases in the objective function, $D$, were monitored up to the point where this exit occurred (see the optional parameter options.print_level), then options.max_iter was probably set too small. If so the calculations should be restarted from the final point held in para.
The orders of differencing specified in the structure arimav must satisfy
If the intermediate results of optimization are written to a file using the optional parameter options.outfile, then the following errors could also occur.
Cannot open file <string> for appending.
Error occurred when writing to file <string>.
Cannot close file <string>.
7 Accuracy
The computation used is believed to be stable.
8 Further Comments
The time taken by nag_tsa_multi_inp_model_estim (g13bec) is approximately proportional to .
9 Example
There is one example program file, the main program of which calls both examples. The main program is given below. An example showing the use of optional parameters is given in Section 11.
9.1 Example 1
This example illustrates the use of the default option G13_DEFAULT in a call to nag_tsa_multi_inp_model_estim (g13bec).
The data in the example relate to 40 observations of an output time series and of a single input time series. The noise series has one autoregressive ($\phi$) and one seasonal moving average ($\Theta$) parameter (both of which are initially set to zero) for which the seasonal period is 4. The input series is defined by orders $b_1 = 1$, $q_1 = 0$, $p_1 = 1$, $r_1 = 3$, so that it has one $\omega$ (initially set to 2.0) and one $\delta$ (initially set to 0.5), and allows for pre-observation period effects.
After the successful call to nag_tsa_multi_inp_model_estim (g13bec), the following are computed and printed out: the number of full iterations required to obtain satisfactory results, the final values of the para parameters and their standard errors sd, the residual sum of squares rss, the objective function objf and the degrees of freedom.
9.1.1 Program Text
9.1.2 Program Data
9.1.3 Program Results
10 Optional Parameters
A number of optional input and output parameters to nag_tsa_multi_inp_model_estim (g13bec) are available through the structure argument options of type Nag_G13_Opt. A parameter may be selected by assigning an appropriate value to the relevant structure member. Those parameters not selected will be assigned default values. If no use is to be made of any of the optional parameters the user should use the null pointer, G13_DEFAULT, in place of options when calling nag_tsa_multi_inp_model_estim (g13bec); the default settings will then be used for all parameters.
Before assigning values to options the structure must be initialised by a call to the function nag_tsa_options_init (g13bxc). Values may then be assigned directly to the structure members in the normal C manner.
Options selected are checked within nag_tsa_multi_inp_model_estim (g13bec) for being within the required range; if a value is outside the range, an error message is generated.
When all calls to nag_tsa_multi_inp_model_estim (g13bec) have been completed and the results contained in the options structure are no longer required, nag_tsa_free (g13xzc) should be called to free the NAG allocated memory from options.
10.1 Optional Parameters Checklist and Default Values
For easy reference, the following list shows the input and output members of options which are valid for nag_tsa_multi_inp_model_estim (g13bec) together with their default values where relevant.
is the machine precision
10.2 Description of Optional Parameters
On entry: if options.list = TRUE then the parameter settings which are used in the call to nag_tsa_multi_inp_model_estim (g13bec) will be printed.
print_level – Nag_PrintType Input
On entry: the level of results produced by nag_tsa_multi_inp_model_estim (g13bec). The following values are available.
- Nag_Soln: the final solution.
- Nag_Iter: one line of output for each iteration.
- Nag_Soln_Iter: the final solution and one line of output for each iteration.
- Nag_Soln_Iter_Full: the final solution and detailed printout at each iteration.
Details of each level of results printout are described in Section 10.3.
Constraint: options.print_level = Nag_PrintNotSet, Nag_Soln, Nag_Iter, Nag_Soln_Iter or Nag_Soln_Iter_Full.
On entry: name of file to which the results of monitoring the course of the optimization should be printed. If then the stdout stream is used.
print_fun – pointer to function Input
On entry: printing function defined by the user; the prototype of print_fun is
void (*print_fun)(const Nag_UserPrintFun *bfx, Nag_Comm *Comm);
See Section 10.3.1 below for further details.
On entry: options.cfixed must be set to TRUE if the constant $c$ is to remain fixed at its initial value, and to FALSE if it is to be estimated.
criteria – Nag_Likelihood Input
On entry: indicates the likelihood option for the estimation criterion. criteria must be set to Nag_LeastSquares, Nag_Exact or Nag_Marginal to select the least-squares, exact or marginal likelihood, respectively.
Constraint: options.criteria = Nag_LeastSquares, Nag_Exact or Nag_Marginal.
On entry: the maximum required number of iterations. If options.max_iter = 0, no change is made to any of the model parameters in array para except that the constant $c$ (if options.cfixed = FALSE) and any $\omega$ relating to simple input series are estimated. (Apart from these, estimates are always derived for the nuisance parameters relating to any backforecasts and any pre-observation period effects for transfer function inputs.)
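On entry: options.alpha holds $\alpha$, the value used to constrain the magnitude of the search procedure steps (see Section 3.3).
On entry: options.beta holds $\beta$, the multiplier which regulates the value of $\alpha$ (see Section 3.3).
On entry: options.delta holds the value of the stationarity and invertibility test tolerance factor (see Section 3.3).
On entry: options.gamma holds $\gamma$, the convergence criterion (see Section 3.3).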
On exit: the number of iterations carried out. A value of options.iter $= -1$ on exit indicates that the only estimates obtained up to this point have been for the nuisance parameters relating to backforecasts, unless the marginal likelihood option is used, in which case estimates have also been obtained for simple input coefficients $\omega$ and for the constant $c$ (if options.cfixed = FALSE). This value of iter usually indicates a failure in a consequent step of estimating transfer function input pre-observation period nuisance parameters. A value of options.iter $= 0$ on exit indicates that estimates have been obtained up to this point for the constant $c$ (if options.cfixed = FALSE), for simple input coefficients $\omega$ and for the nuisance parameters relating to the backforecasts and to transfer function input pre-observation period effects.
On exit: this pointer is allocated memory internally with npara $\times$ npara elements corresponding to npara rows by npara columns. The npara rows and columns of cm contain the correlation coefficients relating to each pair of parameters in para. All coefficients relating to the constant will be zero if the constant is fixed. However, if the function fails to invert the second derivative matrix, in which case fail will have an exit value of NE_MAT_NOT_POS_DEF, then the contents of cm will be indeterminate.
On exit: the values of the residuals relating to the differenced values of the output series. This pointer is allocated memory internally with options.lenres elements.
On exit: the length of options.res.
On exit: this pointer is allocated memory internally with nxxy $\times$ (nseries $- 1$) elements corresponding to nxxy rows by nseries $- 1$ columns. The columns of zt hold the values of the input component series $z_t$.
On exit: this pointer is allocated memory internally with nxxy elements. It holds the output noise component $n_t$.
10.3 Description of Printed Output
The level of printed output can be controlled by the user with the structure members options.list and options.print_level, see Section 10.2. If options.list = TRUE then the parameter values to nag_tsa_multi_inp_model_estim (g13bec) are listed, whereas the printout of results is governed by the value of options.print_level. The default of Nag_Soln provides a printout of the final solution. This section describes all of the possible levels of results printout available from nag_tsa_multi_inp_model_estim (g13bec). When options.print_level = Nag_Iter or Nag_Soln_Iter, a single line of output is produced at each iteration; this gives the following values.
- the current iteration number, options.iter.
- the residual sum of squares, rss.
- the objective function at the latest set of parameter estimates.
When options.print_level = Nag_Soln_Iter_Full, a description and value for each of the parameters in the para array is also output at each iteration. The descriptions are phi for $\phi$, theta for $\theta$, sphi for $\Phi$, stheta for $\Theta$, omega/si for $\omega$ in a simple input, omega for $\omega$ in a transfer function input, delta for $\delta$ and constant for $c$. In addition series 1, series 2, etc., indicate the input series relevant to the omega and delta parameters.
If options.print_level = Nag_Soln, Nag_Soln_Iter or Nag_Soln_Iter_Full, the final solution is printed out. This consists of:
- the parameter number.
- the values of the parameter.
- the standard deviations.
- the number of iterations carried out.
- the residual sum of squares.
- the objective function.
- the degrees of freedom.
If options.print_level = Nag_NoPrint then printout will be suppressed; the user can print the final solution when nag_tsa_multi_inp_model_estim (g13bec) returns to the calling program.
10.3.1 Output of results via a user defined printing function
The user may also specify their own print function for output of iteration results and the final solution by use of the options.print_fun function pointer, which has prototype
void (*print_fun) (const Nag_UserPrintFun *bfx, Nag_Comm *Comm);
The rest of this section can be skipped if the default printing facilities provide the required functionality. When a user-defined function is assigned to options.print_fun this will be called in preference to the internal print function of nag_tsa_multi_inp_model_estim (g13bec). Calls to the user-defined function are again controlled by means of the options.print_level member. Information is provided through two structure arguments to options.print_fun. The structure of type Nag_UserPrintFun contains the following members relevant to nag_tsa_multi_inp_model_estim (g13bec):
- itc – Integer *
the number of the particular iteration being monitored.
- rss – double
the residual sum of squares, $S$, at the latest set of valid parameter estimates.
- objf – double
the objective function, $D$, at the latest set of valid parameter estimates.
- para – double
the pointer to memory containing the npara latest values of the estimates of the multi-input model parameters.
- npara – Integer *
the exact number of $\phi$, $\theta$, $\Phi$, $\Theta$, $\omega$, $\delta$ and $c$ parameters.
- npe – Integer *
the number of ARIMA (
parameters being estimated.
- mtyp – Integer *
- mser – Integer *
the pointers to memory, each with npe
elements. The value of each element in mtyp
corresponds to the description of each parameter estimated in para
. The following should be read in conjunction with the description of the parameter print
. The relevant description for the value of para
for . For the phi, theta, sphi, stheta and constant parameters, .
- sd – double
the pointer to memory containing the npara values of the standard deviations.
- df – double
the number of degrees of freedom associated with S.
11 Example 2
This example illustrates the use of the options parameter in a call to nag_tsa_multi_inp_model_estim (g13bec).
The data in the example relate to the same 40 observations of an output time series and of a single input time series as in Example 1. The noise series has one autoregressive ($\phi$) and one seasonal moving average ($\Theta$) parameter (both of which are initially set to zero) for which the seasonal period is 4. The input series is defined by orders $b_1 = 1$, $q_1 = 0$, $p_1 = 1$, $r_1 = 3$, so that it has one $\omega$ (initially set to 2.0) and one $\delta$ (initially set to 0.5), and allows for pre-observation period effects. The constant (initially set to zero) is to be estimated so that the flag for the constant, options.cfixed, remains unchanged from its default value of FALSE. Default values of zsp are used. Up to 20 iterations are allowed so that options.max_iter is set to 20, and the progress of these is monitored and solution output by setting options.print_level to Nag_Soln_Iter_Full. Marginal likelihood is the chosen estimation criterion so that options.criteria is set to Nag_Marginal.
After the successful call to nag_tsa_multi_inp_model_estim (g13bec), the following are computed and printed out: the correlation matrix, the residuals for the 36 differenced values and the values of $z_t$ and $n_t$.
11.1 Program Text
11.2 Program Data
11.3 Program Results
© The Numerical Algorithms Group Ltd, Oxford, UK. 2004