# NAG FL Interface g02dgf (linregm_fit_newvar)

## 1 Purpose

g02dgf calculates the estimates of the parameters of a general linear regression model for a new dependent variable after a call to g02daf.

## 2 Specification

Fortran Interface
 Subroutine g02dgf (weight, n, wt, rss, ip, irank, cov, q, ldq, svd, p, y, b, se, res, wk, ifail)
 Integer, Intent (In) :: n, ip, irank, ldq
 Integer, Intent (Inout) :: ifail
 Real (Kind=nag_wp), Intent (In) :: wt(*), p(*), y(n), wk(5*(ip-1)+ip*ip)
 Real (Kind=nag_wp), Intent (Inout) :: rss, cov(ip*(ip+1)/2), q(ldq,ip+1)
 Real (Kind=nag_wp), Intent (Out) :: b(ip), se(ip), res(n)
 Logical, Intent (In) :: svd
 Character (1), Intent (In) :: weight
C Header Interface
#include <nag.h>
 void g02dgf_ (const char *weight, const Integer *n, const double wt[], double *rss, const Integer *ip, const Integer *irank, double cov[], double q[], const Integer *ldq, const logical *svd, const double p[], const double y[], double b[], double se[], double res[], const double wk[], Integer *ifail, const Charlen length_weight)
The routine may be called by the names g02dgf or nagf_correg_linregm_fit_newvar.

## 3 Description

g02dgf uses the results given by g02daf to fit the same set of independent variables to a new dependent variable.
g02daf computes a $QR$ decomposition of the matrix of $p$ independent variables and also, if the model is not of full rank, a singular value decomposition (SVD). These results can be used to compute estimates of the parameters for a general linear model with a new dependent variable. The $QR$ decomposition leads to the formation of an upper triangular $p$ by $p$ matrix $R$ and an $n$ by $n$ orthogonal matrix $Q$. In addition the vector $c={Q}^{\mathrm{T}}y$ (or ${Q}^{\mathrm{T}}{W}^{1/2}y$) is computed. For a new dependent variable, ${y}_{\mathrm{new}}$, g02dgf computes a new value of $c={Q}^{\mathrm{T}}{y}_{\text{new}}$ or ${Q}^{\mathrm{T}}{W}^{1/2}{y}_{\text{new}}$.
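
The weighted form ${Q}^{\mathrm{T}}{W}^{1/2}{y}_{\text{new}}$ amounts to scaling each observation by the square root of its weight before ${Q}^{\mathrm{T}}$ is applied. The following is a minimal Python sketch of that scaling step only, assuming $W$ is the diagonal matrix of the weights. The helper name `apply_sqrt_weights` is hypothetical and is not part of the NAG Library.

```python
import math

def apply_sqrt_weights(y, wt):
    """Return W^(1/2) y and the number of observations with nonzero weight.

    Assumption: W = diag(wt). This is an illustration, not NAG code.
    """
    if any(w < 0.0 for w in wt):
        raise ValueError("weights must be non-negative")
    scaled = [math.sqrt(w) * v for w, v in zip(wt, y)]
    n_effective = sum(1 for w in wt if w != 0.0)
    return scaled, n_effective

# The second observation has zero weight, so it contributes nothing.
scaled, n_eff = apply_sqrt_weights([2.0, 3.0, 5.0], [4.0, 0.0, 1.0])
```

A zero weight maps the observation to zero, which is one way to see why it drops out of the fit.
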
If $R$ is of full rank, then the least squares parameter estimates, $\stackrel{^}{\beta }$, are the solution to
 $R\stackrel{^}{\beta }={c}_{1},$
where ${c}_{1}$ consists of the first $p$ elements of $c$.
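
The full-rank case can be sketched in plain Python: factorize the independent variables once, then reuse the factors for each new dependent variable by forming ${c}_{1}$ and back-substituting through $R$. This is only an illustration under stated assumptions. A modified Gram-Schmidt factorization is swapped in for brevity (it is not the representation that g02daf stores), and the names `qr_gram_schmidt` and `fit_from_qr` are hypothetical.

```python
import math

def qr_gram_schmidt(X):
    """Thin QR of an n-by-p matrix (list of rows); full column rank assumed."""
    n, p = len(X), len(X[0])
    Q = [[0.0] * p for _ in range(n)]
    R = [[0.0] * p for _ in range(p)]
    for j in range(p):
        v = [X[i][j] for i in range(n)]
        for k in range(j):
            # Remove the component of column j along the k-th orthonormal column.
            R[k][j] = sum(Q[i][k] * v[i] for i in range(n))
            v = [v[i] - R[k][j] * Q[i][k] for i in range(n)]
        R[j][j] = math.sqrt(sum(t * t for t in v))
        for i in range(n):
            Q[i][j] = v[i] / R[j][j]
    return Q, R

def fit_from_qr(Q, R, y):
    """Form c1 = Q1^T y, then solve R beta = c1 by back-substitution."""
    n, p = len(Q), len(R)
    c1 = [sum(Q[i][j] * y[i] for i in range(n)) for j in range(p)]
    beta = [0.0] * p
    for j in reversed(range(p)):
        s = c1[j] - sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = s / R[j][j]
    return beta

# Intercept plus one regressor; the factorization is computed once.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Q, R = qr_gram_schmidt(X)
beta_old = fit_from_qr(Q, R, [1.0, 3.0, 5.0, 7.0])    # data lie on y = 1 + 2x
beta_new = fit_from_qr(Q, R, [2.0, 1.0, 0.0, -1.0])   # data lie on y = 2 - x
```

Only `fit_from_qr` is repeated for the second dependent variable, which is the saving this routine is designed to provide.
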
If $R$ is not of full rank, then g02daf will have computed an SVD of $R$,
 $R={Q}_{*}\left(\begin{array}{cc}D& 0\\ 0& 0\end{array}\right){P}^{\mathrm{T}},$
where $D$ is a $k$ by $k$ diagonal matrix with nonzero diagonal elements, $k$ being the rank of $R$, and ${Q}_{*}$ and $P$ are $p$ by $p$ orthogonal matrices. This gives the solution
 $\stackrel{^}{\beta }={P}_{1}{D}^{-1}{Q}_{{*}_{1}}^{\mathrm{T}}{c}_{1},$
${P}_{1}$ being the first $k$ columns of $P$, i.e., $P=\left({P}_{1}{P}_{0}\right)$, and ${Q}_{{*}_{1}}$ being the first $k$ columns of ${Q}_{*}$. Details of the SVD are made available by g02daf in the form of the matrix ${P}^{*}$:
 ${P}^{*}=\left(\begin{array}{c}{D}^{-1}{P}_{1}^{\mathrm{T}}\\ {P}_{0}^{\mathrm{T}}\end{array}\right).$
The matrix ${Q}_{*}$ is made available through the workspace of g02daf.
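
The rank-deficient formula can be checked on a small case worked by hand. The sketch below assumes a hypothetical $2$ by $2$ upper triangular $R$ of rank $k=1$ whose SVD is known in closed form. All values and names are illustrative and do not come from g02daf.

```python
import math

# Hypothetical rank-deficient R with p = 2 and rank k = 1 (second row is zero).
R = [[3.0, 4.0],
     [0.0, 0.0]]

# Its SVD written out by hand: R = Q_star * [[d, 0], [0, 0]] * P^T.
d = math.hypot(3.0, 4.0)      # the single nonzero singular value (5.0)
q_star_1 = [1.0, 0.0]         # first k columns of Q_star (Q_star is the identity here)
p_1 = [3.0 / d, 4.0 / d]      # first k columns of P
p_0 = [-4.0 / d, 3.0 / d]     # remaining column of P

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_rank_deficient(c1):
    """beta = P1 * D^(-1) * Q_star_1^T * c1 for the k = 1 case above."""
    scale = dot(q_star_1, c1) / d
    return [scale * t for t in p_1]

c1 = [10.0, 7.0]              # arbitrary illustrative values
beta = solve_rank_deficient(c1)
fitted = [dot(row, beta) for row in R]
```

Here `fitted` reproduces the component of `c1` that lies in the column space of `R`, and `beta` has no component along `p_0`, so it is the minimum-norm solution among those giving the same fit.
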
In addition to the parameter estimates, the new residuals are computed, and the variance-covariance matrix of the parameter estimates is found by scaling the variance-covariance matrix for the original regression.
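
One plausible reading of this scaling, stated here as an assumption rather than as the library's implementation: the variance-covariance matrix is the estimated residual variance times a matrix that depends only on the independent variables, and the degrees of freedom are the same for both fits, so the new matrix is the old one multiplied by the ratio of the new to the old residual sum of squares. A minimal Python sketch with the hypothetical helper `rescale_cov`:

```python
def rescale_cov(cov_old, rss_old, rss_new):
    """Scale a packed covariance array by rss_new / rss_old.

    Assumption: both fits share the same design matrix and degrees of
    freedom, so only the residual variance estimate changes.
    """
    if rss_old <= 0.0:
        raise ValueError("original residual sum of squares must be positive")
    factor = rss_new / rss_old
    return [factor * v for v in cov_old]

# Illustrative numbers only: the new fit has a quarter of the original rss.
cov_new = rescale_cov([4.0, 1.0, 9.0], rss_old=8.0, rss_new=2.0)
```

Under this reading, a zero original residual sum of squares would leave the scale factor undefined.
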

## 4 References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25
Searle S R (1971) Linear Models Wiley

## 5 Arguments

1: $\mathbf{weight}$ Character(1) Input
On entry: indicates if weights are to be used.
${\mathbf{weight}}=\text{'U'}$
Least squares estimation is used.
${\mathbf{weight}}=\text{'W'}$
Weighted least squares is used and weights must be supplied in array wt.
Constraint: ${\mathbf{weight}}=\text{'U'}$ or $\text{'W'}$.
2: $\mathbf{n}$ Integer Input
On entry: $n$, the number of observations.
Constraint: ${\mathbf{n}}\ge {\mathbf{ip}}$.
3: $\mathbf{wt}\left(*\right)$ Real (Kind=nag_wp) array Input
Note: the dimension of the array wt must be at least ${\mathbf{n}}$ if ${\mathbf{weight}}=\text{'W'}$.
On entry: if ${\mathbf{weight}}=\text{'W'}$, wt must contain the weights to be used with the model.
If ${\mathbf{wt}}\left(i\right)=0.0$, the $i$th observation is not included in the model, in which case the effective number of observations is the number of observations with nonzero weights.
If ${\mathbf{weight}}=\text{'U'}$, wt is not referenced and the effective number of observations is $n$.
Constraint: if ${\mathbf{weight}}=\text{'W'}$, ${\mathbf{wt}}\left(\mathit{i}\right)\ge 0.0$, for $\mathit{i}=1,2,\dots ,n$.
4: $\mathbf{rss}$ Real (Kind=nag_wp) Input/Output
On entry: the residual sum of squares for the original dependent variable.
On exit: the residual sum of squares for the new dependent variable.
Constraint: ${\mathbf{rss}}>0.0$.
5: $\mathbf{ip}$ Integer Input
On entry: $p$, the number of independent variables (including the mean if fitted).
Constraint: $1\le {\mathbf{ip}}\le {\mathbf{n}}$.
6: $\mathbf{irank}$ Integer Input
On entry: the rank of the independent variables, as given by g02daf.
Constraint: ${\mathbf{irank}}>0$, and if ${\mathbf{svd}}=\mathrm{.FALSE.}$, then ${\mathbf{irank}}={\mathbf{ip}}$, else ${\mathbf{irank}}\le {\mathbf{ip}}$.
7: $\mathbf{cov}\left({\mathbf{ip}}×\left({\mathbf{ip}}+1\right)/2\right)$ Real (Kind=nag_wp) array Input/Output
On entry: the covariance matrix of the parameter estimates as given by g02daf.
On exit: the upper triangular part of the variance-covariance matrix of the ip parameter estimates given in b. They are stored packed by column, i.e., the covariance between the parameter estimate given in ${\mathbf{b}}\left(i\right)$ and the parameter estimate given in ${\mathbf{b}}\left(j\right)$, $j\ge i$, is stored in ${\mathbf{cov}}\left(\left(j×\left(j-1\right)/2+i\right)\right)$.
8: $\mathbf{q}\left({\mathbf{ldq}},{\mathbf{ip}}+1\right)$ Real (Kind=nag_wp) array Input/Output
On entry: the results of the $QR$ decomposition as returned by g02daf.
On exit: the first column of q contains the new values of $c$; the remainder of q is unchanged.
9: $\mathbf{ldq}$ Integer Input
On entry: the first dimension of the array q as declared in the (sub)program from which g02dgf is called.
Constraint: ${\mathbf{ldq}}\ge {\mathbf{n}}$.
10: $\mathbf{svd}$ Logical Input
On entry: indicates if a singular value decomposition was used by g02daf.
${\mathbf{svd}}=\mathrm{.TRUE.}$
A singular value decomposition was used by g02daf.
${\mathbf{svd}}=\mathrm{.FALSE.}$
A singular value decomposition was not used by g02daf.
11: $\mathbf{p}\left(*\right)$ Real (Kind=nag_wp) array Input
Note: the dimension of the array p must be at least ${\mathbf{ip}}$ if ${\mathbf{svd}}=\mathrm{.FALSE.}$, and at least ${\mathbf{ip}}×{\mathbf{ip}}+2×{\mathbf{ip}}$ otherwise.
On entry: details of the $QR$ decomposition and SVD, if used, as returned in array p by g02daf.
If ${\mathbf{svd}}=\mathrm{.FALSE.}$, only the first ip elements of p are used; these contain the zeta values for the $QR$ decomposition (see f08aef for details).
If ${\mathbf{svd}}=\mathrm{.TRUE.}$, the first ip elements of p contain the zeta values for the $QR$ decomposition (see f08aef for details) and the next ${\mathbf{ip}}×{\mathbf{ip}}+{\mathbf{ip}}$ elements of p contain details of the singular value decomposition.
12: $\mathbf{y}\left({\mathbf{n}}\right)$ Real (Kind=nag_wp) array Input
On entry: the new dependent variable, ${y}_{\text{new}}$.
13: $\mathbf{b}\left({\mathbf{ip}}\right)$ Real (Kind=nag_wp) array Output
On exit: the least squares estimates of the parameters of the regression model, $\stackrel{^}{\beta }$.
14: $\mathbf{se}\left({\mathbf{ip}}\right)$ Real (Kind=nag_wp) array Output
On exit: the standard errors of the estimates of the parameters.
15: $\mathbf{res}\left({\mathbf{n}}\right)$ Real (Kind=nag_wp) array Output
On exit: the residuals for the new regression model.
16: $\mathbf{wk}\left(5×\left({\mathbf{ip}}-1\right)+{\mathbf{ip}}×{\mathbf{ip}}\right)$ Real (Kind=nag_wp) array Input
On entry: if ${\mathbf{svd}}=\mathrm{.TRUE.}$, wk must be unaltered from the previous call to g02daf or g02dgf.
If ${\mathbf{svd}}=\mathrm{.FALSE.}$, wk is used as workspace.
17: $\mathbf{ifail}$ Integer Input/Output
On entry: ifail must be set to $0$, $-1$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $-1$ means that an error message is printed while a value of $1$ means that it is not.
If halting is not appropriate, the value $-1$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).
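
The array dimensions and the packed storage rule for cov given in the argument descriptions above can be expressed as small helpers. The Python sketch below only restates those formulas. The function names `packed_index` and `required_lengths` are hypothetical, and the indices are 1-based to match the Fortran convention used in this document.

```python
def packed_index(i, j):
    """1-based position in cov of the covariance between b(i) and b(j).

    The upper triangle is packed by column: element (i, j) with j >= i is
    stored at j*(j-1)/2 + i. The matrix is symmetric, so the arguments are
    swapped when i > j.
    """
    if i > j:
        i, j = j, i
    return j * (j - 1) // 2 + i

def required_lengths(ip, svd):
    """Minimum array lengths stated in the argument descriptions."""
    return {
        "cov": ip * (ip + 1) // 2,
        "p": ip * ip + 2 * ip if svd else ip,
        "wk": 5 * (ip - 1) + ip * ip,
    }

# For ip = 3 the packed order is (1,1), (1,2), (2,2), (1,3), (2,3), (3,3).
order = [packed_index(i, j) for j in range(1, 4) for i in range(1, j + 1)]
sizes = required_lengths(4, svd=True)
```

The variance of the parameter estimate in b(j) sits at position `packed_index(j, j)`.
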

## 6 Error Indicators and Warnings

If on entry ${\mathbf{ifail}}=0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
${\mathbf{ifail}}=1$
On entry, ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ip}}\ge 1$.
On entry, ${\mathbf{irank}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{irank}}>0$.
On entry, ${\mathbf{irank}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: if ${\mathbf{svd}}=\mathrm{.FALSE.}$, ${\mathbf{irank}}={\mathbf{ip}}$.
On entry, ${\mathbf{irank}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: if ${\mathbf{svd}}=\mathrm{.TRUE.}$, ${\mathbf{irank}}\le {\mathbf{ip}}$.
On entry, ${\mathbf{ldq}}=〈\mathit{\text{value}}〉$ and ${\mathbf{n}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ldq}}\ge {\mathbf{n}}$.
On entry, ${\mathbf{n}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{n}}\ge {\mathbf{ip}}$.
On entry, ${\mathbf{rss}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{rss}}>0.0$.
On entry, ${\mathbf{weight}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{weight}}=\text{'W'}$ or $\text{'U'}$.
${\mathbf{ifail}}=2$
On entry, ${\mathbf{wt}}\left(〈\mathit{\text{value}}〉\right)<0.0$.
Constraint: ${\mathbf{wt}}\left(i\right)\ge 0.0$, for $i=1,2,\dots ,n$.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
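
The input constraints behind ${\mathbf{ifail}}=1$ and ${\mathbf{ifail}}=2$ can be mirrored in a pre-call check. The Python sketch below is hypothetical: the function `check_arguments` is not part of the NAG Library, the order in which the library tests the constraints is an assumption, and an exact upper-case match on weight is assumed.

```python
def check_arguments(weight, n, wt, rss, ip, irank, ldq, svd):
    """Return 0, 1 or 2, mirroring the documented ifail = 1 and ifail = 2 cases.

    Assumption: checks are applied in the order below; the library may differ.
    """
    if weight not in ("U", "W"):
        return 1
    if ip < 1 or n < ip or ldq < n:
        return 1
    if rss <= 0.0:
        return 1
    if irank <= 0:
        return 1
    if svd and irank > ip:
        return 1
    if not svd and irank != ip:
        return 1
    if weight == "W" and any(w < 0.0 for w in wt[:n]):
        return 2
    return 0

ok = check_arguments("U", 12, [], 1.5, 4, 4, 12, False)
bad_rank = check_arguments("U", 12, [], 1.5, 4, 3, 12, False)
bad_weight = check_arguments("W", 3, [1.0, -1.0, 2.0], 1.5, 2, 2, 3, False)
```

Note that wt is inspected only when weighted least squares is requested, in line with the statement that wt is not referenced otherwise.
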

## 7 Accuracy

The same accuracy as g02daf is obtained.

## 8 Parallelism and Performance

g02dgf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g02dgf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

## 9 Further Comments

The values of the leverages, ${h}_{i}$, are unaltered by a change in the dependent variable, so a call to g02faf can be made using the value of h from g02daf.

## 10 Example

A dataset consisting of $12$ observations with four independent variables and two dependent variables is read in. A model with all four independent variables is fitted to the first dependent variable by g02daf and the results are printed. The model is then fitted to the second dependent variable by g02dgf and those results are printed.

### 10.1 Program Text

Program Text (g02dgfe.f90)

### 10.2 Program Data

Program Data (g02dgfe.d)

### 10.3 Program Results

Program Results (g02dgfe.r)