
# NAG Toolbox: nag_correg_linregm_fit_newvar (g02dg)

## Purpose

nag_correg_linregm_fit_newvar (g02dg) calculates the estimates of the arguments of a general linear regression model for a new dependent variable after a call to nag_correg_linregm_fit (g02da).

## Syntax

[rss, covar, q, b, se, res, ifail] = g02dg(rss, ip, irank, covar, q, svd, p, y, wk, 'n', n, 'wt', wt)
[rss, covar, q, b, se, res, ifail] = nag_correg_linregm_fit_newvar(rss, ip, irank, covar, q, svd, p, y, wk, 'n', n, 'wt', wt)
Note: the interface to this routine has changed since earlier releases of the toolbox:
 At Mark 23: weight was removed from the interface; wt was made optional

## Description

nag_correg_linregm_fit_newvar (g02dg) uses the results given by nag_correg_linregm_fit (g02da) to fit the same set of independent variables to a new dependent variable.
nag_correg_linregm_fit (g02da) computes a $QR$ decomposition of the matrix of $p$ independent variables and also, if the model is not of full rank, a singular value decomposition (SVD). These results can be used to compute estimates of the arguments for a general linear model with a new dependent variable. The $QR$ decomposition leads to the formation of an upper triangular $p$ by $p$ matrix $R$ and an $n$ by $n$ orthogonal matrix $Q$. In addition the vector $c={Q}^{\mathrm{T}}y$ (or ${Q}^{\mathrm{T}}{W}^{1/2}y$) is computed. For a new dependent variable, ${y}_{\mathrm{new}}$, nag_correg_linregm_fit_newvar (g02dg) computes a new value of $c={Q}^{\mathrm{T}}{y}_{\text{new}}$ or ${Q}^{\mathrm{T}}{W}^{1/2}{y}_{\text{new}}$.
If $R$ is of full rank, then the least squares parameter estimates, $\stackrel{^}{\beta }$, are the solution to
 $R\stackrel{^}{\beta }={c}_{1},$
where ${c}_{1}$ is the first $p$ elements of $c$.
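The full-rank case can be sketched in plain MATLAB. This is only an illustration of the algebra using the built-in `qr`, not the routine's internal code; the data `X` and `y_new` are hypothetical.

```matlab
% Hypothetical data: n = 6 observations, p = 2 independent variables (mean + slope)
X     = [ones(6,1), (1:6)'];
y_new = [2.1; 3.9; 6.2; 7.8; 10.1; 12.0];
p     = size(X, 2);

% Full QR decomposition: Q is n-by-n orthogonal, R is n-by-p upper trapezoidal
[Q, R] = qr(X);

% c = Q' * y_new; only its first p elements enter the triangular system
c  = Q' * y_new;
c1 = c(1:p);

% Solve R(1:p,1:p) * beta_hat = c1
beta_hat = R(1:p, 1:p) \ c1;

% The remaining n-p elements of c carry the residual sum of squares
rss_new = sum(c(p+1:end).^2);

% Cross-check against a direct least squares solve
beta_ref = X \ y_new;
```

Because only `c` depends on the dependent variable, the decomposition of `X` can be reused unchanged for each new `y_new`.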
If $R$ is not of full rank, then nag_correg_linregm_fit (g02da) will have computed an SVD of $R$,
 $R={Q}_{*}\left(\begin{array}{cc}D& 0\\ 0& 0\end{array}\right){P}^{\mathrm{T}},$
where $D$ is a $k$ by $k$ diagonal matrix with nonzero diagonal elements, $k$ being the rank of $R$, and ${Q}_{*}$ and $P$ are $p$ by $p$ orthogonal matrices. This gives the solution
 $\stackrel{^}{\beta }={P}_{1}{D}^{-1}{Q}_{{*}_{1}}^{\mathrm{T}}{c}_{1},$
${P}_{1}$ being the first $k$ columns of $P$, i.e., $P=\left({P}_{1}{P}_{0}\right)$, and ${Q}_{{*}_{1}}$ being the first $k$ columns of ${Q}_{*}$. Details of the SVD are made available by nag_correg_linregm_fit (g02da) in the form of the matrix ${P}^{*}$:
 ${P}^{*}=\left(\begin{array}{c}{D}^{-1}{P}_{1}^{\mathrm{T}}\\ {P}_{0}^{\mathrm{T}}\end{array}\right).$
The matrix ${Q}_{*}$ is made available through the workspace of nag_correg_linregm_fit (g02da).
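The rank-deficient solution can be sketched with the built-in `qr`, `svd` and `pinv`. The data, the variable names and the rank tolerance below are assumptions made for illustration; the routine itself reuses the decomposition stored by nag_correg_linregm_fit (g02da) rather than recomputing it.

```matlab
% Hypothetical rank-deficient design: a mean column plus two indicator columns that sum to it
X     = [1 1 0; 1 1 0; 1 0 1; 1 0 1; 1 1 0; 1 0 1];
y_new = [3; 4; 7; 8; 5; 9];
p     = size(X, 2);

% QR decomposition of X and the first p elements of c = Q' * y_new
[Q, Rfull] = qr(X);
R  = Rfull(1:p, 1:p);
c  = Q' * y_new;
c1 = c(1:p);

% SVD of R: R = Qs * S * P'
[Qs, S, P] = svd(R);
s = diag(S);

% Numerical rank k (the relative tolerance 1e-10 is an arbitrary illustrative choice)
k = sum(s > 1e-10 * max(s));

% Keep the first k columns of P and Qs and the k nonzero singular values
P1  = P(:, 1:k);
Qs1 = Qs(:, 1:k);
D   = diag(s(1:k));

% beta_hat = P1 * inv(D) * Qs1' * c1
beta_hat = P1 * (D \ (Qs1' * c1));

% Cross-check: this is the minimum-norm least squares solution
beta_ref = pinv(X) * y_new;
```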
In addition to the parameter estimates, the new residuals are computed, and the variance-covariance matrix of the parameter estimates is found by scaling the variance-covariance matrix for the original regression.
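The scaling works because the variance-covariance matrix is the residual mean square times a matrix that depends only on the independent variables, so changing the dependent variable changes only the residual sum of squares. The following sketch, with hypothetical data and variable names, checks this in plain MATLAB for a full-rank model; it is not the routine's own code.

```matlab
% Hypothetical full-rank design with an original and a new dependent variable
X      = [ones(6,1), (1:6)'];
y_old  = [1.0; 2.2; 2.9; 4.1; 5.2; 5.8];
y_new  = [2.1; 3.9; 6.2; 7.8; 10.1; 12.0];
[n, p] = size(X);
idf    = n - p;                      % residual degrees of freedom, same for both fits

XtX_inv = inv(X' * X);               % depends on X only

% Original regression
rss_old   = sum((y_old - X * (X \ y_old)).^2);
covar_old = (rss_old / idf) * XtX_inv;

% New regression: rescale the old covariance matrix by the ratio of the two RSS values
rss_new      = sum((y_new - X * (X \ y_new)).^2);
covar_scaled = covar_old * (rss_new / rss_old);

% Direct computation for comparison
covar_direct = (rss_new / idf) * XtX_inv;
se_new       = sqrt(diag(covar_scaled));
```

The example results at the end of this document are consistent with this: the standard errors change by the factor 4.0000e-01/3.8494e-01, which is approximately the square root of 2.4000e+01/2.2227e+01 (about 1.039).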

## References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25
Searle S R (1971) Linear Models Wiley

## Parameters

### Compulsory Input Parameters

1:     $\mathrm{rss}$ – double scalar
The residual sum of squares for the original dependent variable.
Constraint: ${\mathbf{rss}}>0.0$.
2:     $\mathrm{ip}$ – int64/int32/nag_int scalar
$p$, the number of independent variables (including the mean if fitted).
Constraint: $1\le {\mathbf{ip}}\le {\mathbf{n}}$.
3:     $\mathrm{irank}$ – int64/int32/nag_int scalar
The rank of the independent variables, as given by nag_correg_linregm_fit (g02da).
Constraint: ${\mathbf{irank}}>0$, and if ${\mathbf{svd}}=\mathit{false}$, then ${\mathbf{irank}}={\mathbf{ip}}$, else ${\mathbf{irank}}\le {\mathbf{ip}}$.
4:     $\mathrm{covar}\left({\mathbf{ip}}×\left({\mathbf{ip}}+1\right)/2\right)$ – double array
The covariance matrix of the parameter estimates as given by nag_correg_linregm_fit (g02da).
5:     $\mathrm{q}\left(\mathit{ldq},{\mathbf{ip}}+1\right)$ – double array
ldq, the first dimension of the array, must satisfy the constraint $\mathit{ldq}\ge {\mathbf{n}}$.
The results of the $QR$ decomposition as returned by nag_correg_linregm_fit (g02da).
6:     $\mathrm{svd}$ – logical scalar
Indicates if a singular value decomposition was used by nag_correg_linregm_fit (g02da).
${\mathbf{svd}}=\mathit{true}$
A singular value decomposition was used by nag_correg_linregm_fit (g02da).
${\mathbf{svd}}=\mathit{false}$
A singular value decomposition was not used by nag_correg_linregm_fit (g02da).
7:     $\mathrm{p}\left(:\right)$ – double array
The dimension of the array p must be at least ${\mathbf{ip}}$ if ${\mathbf{svd}}=\mathit{false}$, and at least ${\mathbf{ip}}×{\mathbf{ip}}+2×{\mathbf{ip}}$ otherwise.
Details of the $QR$ decomposition and SVD, if used, as returned in array p by nag_correg_linregm_fit (g02da).
If ${\mathbf{svd}}=\mathit{false}$, only the first ip elements of p are used; these contain the zeta values for the $QR$ decomposition (see nag_lapack_dgeqrf (f08ae) for details).
If ${\mathbf{svd}}=\mathit{true}$, the first ip elements of p contain the zeta values for the $QR$ decomposition (see nag_lapack_dgeqrf (f08ae) for details) and the next ${\mathbf{ip}}×{\mathbf{ip}}+{\mathbf{ip}}$ elements of p contain details of the singular value decomposition.
8:     $\mathrm{y}\left({\mathbf{n}}\right)$ – double array
The new dependent variable, ${y}_{\text{new}}$.
9:     $\mathrm{wk}\left(5×\left({\mathbf{ip}}-1\right)+{\mathbf{ip}}×{\mathbf{ip}}\right)$ – double array
If ${\mathbf{svd}}=\mathit{true}$, wk must be unaltered from the previous call to nag_correg_linregm_fit (g02da) or nag_correg_linregm_fit_newvar (g02dg).
If ${\mathbf{svd}}=\mathit{false}$, wk is used as workspace.

### Optional Input Parameters

1:     $\mathrm{n}$ – int64/int32/nag_int scalar
Default: the dimension of the array y and the first dimension of the array q. (An error is raised if these dimensions are not equal.)
$n$, the number of observations.
Constraint: ${\mathbf{n}}\ge {\mathbf{ip}}$.
2:     $\mathrm{wt}\left(:\right)$ – double array
The dimension of the array wt must be at least ${\mathbf{n}}$ if $\mathit{weight}=\text{'W'}$, and at least $1$ otherwise.
If provided, wt must contain the weights to be used in the weighted regression.
If ${\mathbf{wt}}\left(i\right)=0.0$, the $i$th observation is not included in the model, in which case the effective number of observations is the number of observations with nonzero weights.
If wt is not provided the effective number of observations is $n$.
Constraint: if $\mathit{weight}=\text{'W'}$, ${\mathbf{wt}}\left(\mathit{i}\right)\ge 0.0$, for $\mathit{i}=1,2,\dots ,n$.
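The effect of the weights can be sketched in plain MATLAB: each row of the design matrix and each element of the dependent variable is scaled by the square root of its weight, so a zero weight removes that observation from the fit. The data and variable names are hypothetical, and this is not the routine's own code.

```matlab
% Hypothetical data; the fifth observation has zero weight
X  = [ones(5,1), (1:5)'];
y  = [1.1; 1.9; 3.2; 3.9; 50.0];
wt = [1; 2; 1; 2; 0];

% Scale rows by W^(1/2), then solve an ordinary least squares problem
sw     = sqrt(wt);
Xw     = diag(sw) * X;
yw     = sw .* y;
beta_w = Xw \ yw;

% Effective number of observations: those with nonzero weight
n_eff = nnz(wt);

% Cross-check: dropping the zero-weight observation gives the same estimates
keep      = wt > 0;
beta_drop = (diag(sw(keep)) * X(keep, :)) \ (sw(keep) .* y(keep));
```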

### Output Parameters

1:     $\mathrm{rss}$ – double scalar
The residual sum of squares for the new dependent variable.
2:     $\mathrm{covar}\left({\mathbf{ip}}×\left({\mathbf{ip}}+1\right)/2\right)$ – double array
The upper triangular part of the variance-covariance matrix of the ip parameter estimates given in b. They are stored packed by column, i.e., the covariance between the parameter estimate given in ${\mathbf{b}}\left(i\right)$ and the parameter estimate given in ${\mathbf{b}}\left(j\right)$, $j\ge i$, is stored in ${\mathbf{covar}}\left(\left(j×\left(j-1\right)/2+i\right)\right)$.
3:     $\mathrm{q}\left(\mathit{ldq},{\mathbf{ip}}+1\right)$ – double array
The first column of q contains the new values of $c$, the remainder of q will be unchanged.
4:     $\mathrm{b}\left({\mathbf{ip}}\right)$ – double array
The least squares estimates of the parameters of the regression model, $\stackrel{^}{\beta }$.
5:     $\mathrm{se}\left({\mathbf{ip}}\right)$ – double array
The standard error of the estimates of the parameters.
6:     $\mathrm{res}\left({\mathbf{n}}\right)$ – double array
The residuals for the new regression model.
7:     $\mathrm{ifail}$ – int64/int32/nag_int scalar
${\mathbf{ifail}}={\mathbf{0}}$ unless the function detects an error (see Error Indicators and Warnings).
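The packed storage of covar described above can be expanded into a full symmetric matrix with the index formula $j×\left(j-1\right)/2+i$. The sketch below uses hypothetical values for covar; taking the square roots of the diagonal is the usual relationship between a variance-covariance matrix and standard errors, and is stated here as an assumption about how se relates to covar.

```matlab
% Hypothetical packed upper triangle for ip = 3, stored by column:
% (1,1), (1,2), (2,2), (1,3), (2,3), (3,3)
ip    = 3;
covar = [1.0; 0.2; 2.0; 0.3; 0.4; 3.0];

% Unpack into a full symmetric ip-by-ip matrix
C = zeros(ip);
for j = 1:ip
    for i = 1:j
        C(i, j) = covar(j*(j-1)/2 + i);   % element (i,j) with j >= i
        C(j, i) = C(i, j);                % mirror into the lower triangle
    end
end

% Standard errors as square roots of the diagonal (assumed relationship)
se_from_covar = sqrt(diag(C));
```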

## Error Indicators and Warnings

Errors or warnings detected by the function:
${\mathbf{ifail}}=1$
 On entry, ${\mathbf{ip}}<1$, or ${\mathbf{n}}<{\mathbf{ip}}$, or ${\mathbf{irank}}\le 0$, or ${\mathbf{svd}}=\mathit{false}$ and ${\mathbf{irank}}\ne {\mathbf{ip}}$, or ${\mathbf{svd}}=\mathit{true}$ and ${\mathbf{irank}}>{\mathbf{ip}}$, or $\mathit{ldq}<{\mathbf{n}}$, or ${\mathbf{rss}}\le 0.0$, or $\mathit{weight}\ne \text{'U'}$ or $\text{'W'}$.
${\mathbf{ifail}}=2$
 On entry, $\mathit{weight}=\text{'W'}$ and a value of ${\mathbf{wt}}<0.0$.
${\mathbf{ifail}}=-99$
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.

## Accuracy

The same accuracy as nag_correg_linregm_fit (g02da) is obtained.

The values of the leverages, ${h}_{i}$, are unaltered by a change in the dependent variable, so a call to nag_correg_linregm_stat_resinf (g02fa) can be made using the value of h returned by nag_correg_linregm_fit (g02da).
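The leverages are the diagonal elements of the hat matrix, which is built from the independent variables alone. A minimal sketch in plain MATLAB, with a hypothetical full-rank design and variable names:

```matlab
% Hypothetical full-rank design; no dependent variable is needed to compute leverages
X = [ones(6,1), (1:6)'];
p = size(X, 2);

% First p columns of Q span the column space of X
[Q, R] = qr(X);
Q1 = Q(:, 1:p);

% Leverages: diagonal of the hat matrix Q1 * Q1'
h = sum(Q1.^2, 2);

% Cross-check against the textbook formula diag(X * inv(X'X) * X')
h_ref = diag(X * ((X' * X) \ X'));
```

For a full-rank model the leverages sum to the number of parameters, which gives a quick sanity check on `h`.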

## Example

A dataset consisting of $12$ observations with four independent variables and two dependent variables is read in. A model with all four independent variables is fitted to the first dependent variable by nag_correg_linregm_fit (g02da) and the results are printed. The model is then fitted to the second dependent variable by nag_correg_linregm_fit_newvar (g02dg) and those results are printed.
```matlab
function g02dg_example

fprintf('g02dg example results\n\n');

x = [1, 0, 0, 0;
0, 0, 0, 1;
0, 1, 0, 0;
0, 0, 1, 0;
0, 0, 0, 1;
0, 1, 0, 0;
0, 0, 0, 1;
1, 0, 0, 0;
0, 0, 1, 0;
1, 0, 0, 0;
0, 0, 1, 0;
0, 1, 0, 0];
y = [33.63;     39.62;     38.18;     41.46;     38.02;     35.83;
35.99;     36.58;     42.92;     37.80;     40.43;     37.89];
ynew = [63; 69; 68; 71; 68; 65; 65; 66; 72; 67; 70; 67];

[n,m]  = size(x);
isx    = ones(m,1,'int64');
mean_p = 'M';
ip     = int64(m+1);

% Fit general linear regression model to y
[rss, idf, b, se, covar, res, h, q, svd, irank, p, wk, ifail] = ...
g02da(mean_p, x, isx, ip, y);

% Display results for y
fprintf('Results for original y-variable using g02da\n\n');
if svd
fprintf('Model not of full rank\n\n');
end
fprintf('Residual sum of squares = %12.4e\n', rss);
fprintf('Degrees of freedom      = %4d\n', idf);
fprintf('\nVariable   Parameter estimate   Standard error\n\n');
ivar = double([1:ip]');
fprintf('%6d%20.4e%20.4e\n',[ivar b se]');

% Fit same model to ynew
[rss, covar, q, b, se, res, ifail] = ...
g02dg( ...
rss, ip, irank, covar, q, svd, p, ynew, wk);

% Display results for ynew
fprintf('\nResults for second y-variable using g02dg\n\n');
fprintf('Residual sum of squares = %12.4e\n', rss);
fprintf('Degrees of freedom      = %4d\n', idf);
fprintf('\nVariable   Parameter estimate   Standard error\n\n');
ivar = double([1:ip]');
fprintf('%6d%20.4e%20.4e\n',[ivar b se]');

```
```
g02dg example results

Results for original y-variable using g02da

Model not of full rank

Residual sum of squares =   2.2227e+01
Degrees of freedom      =    8

Variable   Parameter estimate   Standard error

1          3.0557e+01          3.8494e-01
2          5.4467e+00          8.3896e-01
3          6.7433e+00          8.3896e-01
4          1.1047e+01          8.3896e-01
5          7.3200e+00          8.3896e-01

Results for second y-variable using g02dg

Residual sum of squares =   2.4000e+01
Degrees of freedom      =    8

Variable   Parameter estimate   Standard error

1          5.4067e+01          4.0000e-01
2          1.1267e+01          8.7178e-01
3          1.2600e+01          8.7178e-01
4          1.6933e+01          8.7178e-01
5          1.3267e+01          8.7178e-01
```
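The results for the second dependent variable can be cross-checked without the NAG Toolbox. The sketch below assumes that the SVD-based solution coincides with the minimum-norm least squares solution returned by the built-in `pinv`, and that the variance-covariance matrix is the residual mean square times the pseudo-inverse of $X^{\mathrm{T}}X$; the variable names are hypothetical.

```matlab
% Data from the example above
x = [1,0,0,0; 0,0,0,1; 0,1,0,0; 0,0,1,0; 0,0,0,1; 0,1,0,0;
     0,0,0,1; 1,0,0,0; 0,0,1,0; 1,0,0,0; 0,0,1,0; 0,1,0,0];
ynew = [63; 69; 68; 71; 68; 65; 65; 66; 72; 67; 70; 67];

% Design matrix with a mean column; it has rank 4, not 5
X = [ones(size(x,1), 1), x];

% Minimum-norm least squares estimates (assumed to match the SVD solution)
b = pinv(X) * ynew;

% Residual sum of squares and residual degrees of freedom
res = ynew - X * b;
rss = sum(res.^2);
idf = size(X, 1) - rank(X);

% Standard errors from the generalized-inverse covariance matrix
se = sqrt(diag((rss / idf) * pinv(X' * X)));

fprintf('Residual sum of squares = %12.4e\n', rss);
fprintf('Degrees of freedom      = %4d\n', idf);
fprintf('%6d%20.4e%20.4e\n', [(1:numel(b))' b se]');
```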