The routine may be called by the names g02dcf or nagf_correg_linregm_obs_edit.
3 Description
g02daf fits a general linear regression model to a dataset. You may wish to change the model by either adding an observation to, or deleting an observation from, the dataset. g02dcf takes the results from g02daf and makes the required changes to the vector $c$ and the upper triangular matrix $R$ produced by g02daf. The regression coefficients, standard errors and the variance-covariance matrix of the regression coefficients can be obtained from g02ddf after all required changes to the dataset have been made.
g02daf performs a $QR$ decomposition on the (weighted) $X$ matrix of independent variables. To add a new observation to a model with $p$ parameters, the upper triangular matrix $R$ and vector ${c}_{1}$ (the first $p$ elements of $c$) are augmented by the new observation on independent variables in ${x}^{\mathrm{T}}$ and dependent variable ${y}_{\text{new}}$. Givens rotations are then used to restore the upper triangular form.
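As a rough illustration of the update described above, the following Python sketch folds one unweighted observation into $R$ and ${c}_{1}$ with Givens rotations and carries the residual sum of squares along. The function names `add_observation` and `back_substitute` are hypothetical and are not part of the NAG Library; the code shows the textbook technique under these assumptions and is not a description of how g02dcf is implemented internally.

```python
import math


def add_observation(R, c1, rss, x, y):
    """Fold one new row (x, y) into upper triangular R and vector c1.

    R is a p-by-p list of lists; c1 and x have length p. Returns updated
    copies (R, c1, rss); the inputs are left untouched.
    """
    p = len(c1)
    R = [row[:] for row in R]
    c1 = c1[:]
    x = x[:]
    for k in range(p):
        if R[k][k] == 0.0:
            # Mirrors the documented ifail = 3 condition (zero diagonal in R).
            raise ValueError("R has a zero diagonal element")
        if x[k] == 0.0:
            continue  # nothing to eliminate in this column
        r = math.hypot(R[k][k], x[k])
        cs, sn = R[k][k] / r, x[k] / r
        # Rotate row k of R against the new row so that x[k] becomes zero.
        for j in range(k, p):
            rkj, xj = R[k][j], x[j]
            R[k][j] = cs * rkj + sn * xj
            x[j] = -sn * rkj + cs * xj
        ck = c1[k]
        c1[k] = cs * ck + sn * y
        y = -sn * ck + cs * y
    # The part of y that is left after all rotations adds to the
    # residual sum of squares.
    return R, c1, rss + y * y


def back_substitute(R, c1):
    """Solve R b = c1 for upper triangular R with non-zero diagonal."""
    p = len(c1)
    b = [0.0] * p
    for i in reversed(range(p)):
        s = c1[i] - sum(R[i][j] * b[j] for j in range(i + 1, p))
        b[i] = s / R[i][i]
    return b


# R and c1 for the exact fit y = 1 + 2 t through t = 0, 1, 2 (rss = 0),
# with a column of ones for the mean followed by t.
R0 = [[math.sqrt(3.0), math.sqrt(3.0)], [0.0, math.sqrt(2.0)]]
c0 = [3.0 * math.sqrt(3.0), 2.0 * math.sqrt(2.0)]

# Add the observation t = 3, y = 8.
R, c1, rss = add_observation(R0, c0, 0.0, [1.0, 3.0], 8.0)
b = back_substitute(R, c1)
```

For these four points the updated fit has an intercept of approximately 0.8, a slope of approximately 2.3 and a residual sum of squares of approximately 0.3, which agrees with refitting the line from scratch.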
On entry: if ${\mathbf{isx}}\left(\mathit{j}\right)$ is greater than $0$, the value contained in ${\mathbf{x}}\left((\mathit{j}-1)\times {\mathbf{ix}}+1\right)$ is to be included as a value of ${x}^{\mathrm{T}}$, for $\mathit{j}=1,2,\dots ,{\mathbf{m}}$.
Constraint:
if ${\mathbf{mean}}=\text{'M'}$, exactly ${\mathbf{ip}}-1$ elements of isx must be $>0$ and if ${\mathbf{mean}}=\text{'Z'}$, exactly ip elements of isx must be $>0$.
6: $\mathbf{q}({\mathbf{ldq}},{\mathbf{ip}}+1)$ – Real (Kind=nag_wp) array Input/Output
On exit: the first ip elements of the first column of q will contain ${c}_{1}^{*}$, the upper triangular part of columns $2$ to ${\mathbf{ip}}+1$ will contain ${R}^{*}$, and the remainder is unchanged.
7: $\mathbf{ldq}$ – Integer Input
On entry: the first dimension of the array q as declared in the (sub)program from which g02dcf is called.
Constraint:
${\mathbf{ldq}}\ge {\mathbf{ip}}$.
8: $\mathbf{ip}$ – Integer Input
On entry: the number of linear terms in the general linear regression model (including the mean if there is one).
Constraint:
${\mathbf{ip}}\ge 1$.
9: $\mathbf{x}\left(*\right)$ – Real (Kind=nag_wp) array Input
Note: the dimension of the array x must be at least $({\mathbf{m}}-1)\times {\mathbf{ix}}+1$.
On entry: the ip values for the independent variables of the new observation, ${x}^{\mathrm{T}}$. The positions will depend on the value of ix.
10: $\mathbf{ix}$ – Integer Input
On entry: the increment for elements of x. Two situations are common:
${\mathbf{ix}}=1$
The values of $x$ are to be chosen from consecutive locations in x, i.e., ${\mathbf{x}}\left(1\right),{\mathbf{x}}\left(2\right),\dots ,{\mathbf{x}}\left({\mathbf{m}}\right)$.
${\mathbf{ix}}={\mathbf{ldx}}$
The values of $x$ are to be chosen from a row of a two-dimensional array with first dimension ldx, i.e., ${\mathbf{x}}\left(1\right),{\mathbf{x}}\left({\mathbf{ldx}}+1\right),\dots ,{\mathbf{x}}\left(({\mathbf{m}}-1){\mathbf{ldx}}+1\right)$.
Constraint:
${\mathbf{ix}}\ge 1$.
11: $\mathbf{y}$ – Real (Kind=nag_wp) Input
On entry: the value of the dependent variable for the new observation, ${y}_{\text{new}}$.
12: $\mathbf{wt}$ – Real (Kind=nag_wp) Input
On entry: if ${\mathbf{weight}}=\text{'W'}$, wt must contain the weight to be used with the new observation.
If ${\mathbf{wt}}=0.0$, the observation is not included in the model.
If ${\mathbf{weight}}=\text{'U'}$, wt is not referenced.
Constraint:
if ${\mathbf{weight}}=\text{'W'}$, ${\mathbf{wt}}\ge 0.0$.
13: $\mathbf{rss}$ – Real (Kind=nag_wp) Input/Output
On entry: the value of the residual sum of squares for the original set of observations.
Constraint:
${\mathbf{rss}}\ge 0.0$.
On exit: the updated value of the residual sum of squares.
Note: this will only be valid if the model is of full rank.
14: $\mathbf{wk}\left(3\times {\mathbf{ip}}\right)$ – Real (Kind=nag_wp) array Workspace
15: $\mathbf{ifail}$ – Integer Input/Output
On entry: ifail must be set to $0$, $\mathrm{-1}$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $\mathrm{-1}$ means that an error message is printed while a value of $1$ means that it is not.
If halting is not appropriate, the value $\mathrm{-1}$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).
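The indexing conventions for x, ix, isx and q described above can be sketched in Python as follows. The helper names `gather_x`, `check_isx` and `unpack_q` are hypothetical and only illustrate the documented layout; they are not NAG Library calls. The sketch assumes 0-based Python lists, so the 1-based Fortran index $(j-1)\times {\mathbf{ix}}+1$ becomes the offset $(j-1)\times {\mathbf{ix}}$, and q is held as a list of rows.

```python
def gather_x(x, ix, isx):
    """Pick the values of the new observation out of a flat array.

    Follows the reference x((j-1)*ix+1) for j = 1..m, keeping only the
    positions where isx(j) > 0.
    """
    m = len(isx)
    if ix < 1:
        raise ValueError("ix must be at least 1")
    if len(x) < (m - 1) * ix + 1:
        raise ValueError("x is too short for the given m and ix")
    return [x[j * ix] for j in range(m) if isx[j] > 0]


def check_isx(isx, ip, mean):
    """Check the documented count of positive isx entries for 'M' or 'Z'."""
    if mean not in ("M", "Z"):
        raise ValueError("mean must be 'M' or 'Z'")
    expected = ip - 1 if mean == "M" else ip
    return sum(1 for v in isx if v > 0) == expected


def unpack_q(q, ip):
    """Split q (rows by ip+1 columns) into c1 and the upper triangle R."""
    c1 = [q[i][0] for i in range(ip)]
    R = [[q[i][j + 1] if j >= i else 0.0 for j in range(ip)]
         for i in range(ip)]
    return c1, R


# A 3-by-4 Fortran array stored column by column: with ix = ldx = 3 the
# stride walks along one row of that array.
flat = list(range(12))
first_row = gather_x(flat, 3, [1, 1, 1, 1])
```

With ix equal to the leading dimension, passing the flat storage starting at a later element (for example `flat[1:]`) selects a later row in the same way, which corresponds to passing a different starting element of the two-dimensional array in Fortran.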
6 Error Indicators and Warnings
If on entry ${\mathbf{ifail}}=0$ or $\mathrm{-1}$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
${\mathbf{ifail}}=1$
On entry, $\langle\mathit{\text{value}}\rangle$ elements of ${\mathbf{isx}}>0$ instead of ${\mathbf{ip}}=\langle\mathit{\text{value}}\rangle$.
On entry, $\langle\mathit{\text{value}}\rangle$ elements of ${\mathbf{isx}}>0$ instead of ${\mathbf{ip}}-1$ (for mean) $=\langle\mathit{\text{value}}\rangle$.
On entry, ${\mathbf{ip}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{ip}}\ge 1$.
On entry, ${\mathbf{ix}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{ix}}\ge 1$.
On entry, ${\mathbf{ldq}}=\langle\mathit{\text{value}}\rangle$ and ${\mathbf{ip}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{ldq}}\ge {\mathbf{ip}}$.
On entry, ${\mathbf{mean}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{mean}}=\text{'M'}$ or $\text{'Z'}$.
On entry, ${\mathbf{rss}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{rss}}\ge 0.0$.
On entry, ${\mathbf{update}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{update}}=\text{'A'}$ or $\text{'D'}$.
On entry, ${\mathbf{weight}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{weight}}=\text{'U'}$ or $\text{'W'}$.
${\mathbf{ifail}}=2$
On entry, ${\mathbf{wt}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{wt}}\ge 0.0$.
${\mathbf{ifail}}=3$
The $R$ matrix could not be updated. This may occur if an attempt is made to delete an observation which was not in the original dataset or to add an observation to an $R$ matrix with a zero diagonal element. This error is also possible when removing an observation which reduces the rank of the design matrix. In such cases the model should be recomputed using g02daf.
${\mathbf{ifail}}=4$
The residual sum of squares cannot be updated. This will occur if the input residual sum of squares is less than the calculated decrease in the residual sum of squares when the observation is deleted.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
7 Accuracy
Higher accuracy is achieved by updating the $R$ matrix rather than the traditional methods of updating ${X}^{\prime}X$.
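One common way to motivate this statement, which the source does not spell out, is through condition numbers. Assuming $X$ has full column rank and $X=QR$ with $Q$ having orthonormal columns, the $2$-norm condition numbers satisfy

${\kappa }_{2}\left({X}^{\prime}X\right)={\kappa }_{2}\left({R}^{\prime}R\right)={\kappa }_{2}{\left(R\right)}^{2}={\kappa }_{2}{\left(X\right)}^{2},$

so working with $R$ avoids the squaring of the condition number that forming ${X}^{\prime}X$ introduces.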
8 Parallelism and Performance
Background information on multithreading can be found in the Multithreading documentation.
g02dcf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
9 Further Comments
Care should be taken with the use of g02dcf.
(a) It is possible to delete observations which were not included in the original model.
(b) If several additions/deletions have been performed you are advised to recompute the regression using g02daf.
(c) Adding or deleting observations can alter the rank of the model. Such changes will only be detected when a call to g02ddf has been made. g02ddf should also be used to compute the new residual sum of squares when the model is not of full rank.
A dataset consisting of $12$ observations with four independent variables is read in, a general linear regression model is fitted by g02daf, and the parameter estimates are printed. The last observation is then dropped and the parameter estimates are recalculated, using g02ddf, and printed. Finally, a new observation is added and the new parameter estimates are computed and printed.