# NAG Toolbox: nag_linsys_real_gen_lsq_covmat (f04ya)

## Purpose

nag_linsys_real_gen_lsq_covmat (f04ya) returns elements of the estimated variance-covariance matrix of the sample regression coefficients for the solution of a linear least squares problem.
The function can be used to find the estimated variances of the sample regression coefficients.

## Syntax

[a, cj, ifail] = f04ya(job, sigma, a, svd, irank, sv, 'p', p)
[a, cj, ifail] = nag_linsys_real_gen_lsq_covmat(job, sigma, a, svd, irank, sv, 'p', p)

## Description

The estimated variance-covariance matrix $C$ of the sample regression coefficients is given by
 $C={\sigma }^{2}{\left({X}^{\mathrm{T}}X\right)}^{-1},\quad {X}^{\mathrm{T}}X\text{ nonsingular,}$
where ${X}^{\mathrm{T}}X$ is the normal matrix for the linear least squares regression problem
 $\mathrm{min}\;{\|y-Xb\|}_{2},$ (1)
${\sigma }^{2}$ is the estimated variance of the residual vector $r=y-Xb$, and $X$ is an $n$ by $p$ observation matrix.
When ${X}^{\mathrm{T}}X$ is singular, $C$ is taken to be
 $C={\sigma }^{2}{\left({X}^{\mathrm{T}}X\right)}^{†},$
where ${\left({X}^{\mathrm{T}}X\right)}^{†}$ is the pseudo-inverse of ${X}^{\mathrm{T}}X$; this assumes that the minimal least squares solution of (1) has been found.
The diagonal elements of $C$ are the estimated variances of the sample regression coefficients, $b$.
The function can be used to find either the diagonal elements of $C$, or the elements of the $j$th column of $C$, or the upper triangular part of $C$.
This function must be preceded by a function that returns either the upper triangular matrix $U$ of the $QU$ factorization of $X$ or of the Cholesky factorization of ${X}^{\mathrm{T}}X$, or the singular values and right singular vectors of $X$. In particular, it can be preceded by nag_linsys_real_gen_solve (f04jg) or nag_lapack_dgelss (f08ka), which return the arguments irank, sigma, a and sv in the required form. nag_linsys_real_gen_solve (f04jg) also returns the argument svd; when this function follows nag_lapack_dgelss (f08ka), svd should be set to true. The argument p of this function corresponds to the argument n in nag_linsys_real_gen_solve (f04jg) and nag_lapack_dgelss (f08ka).
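The two routes described above can be checked against each other in base MATLAB. The following is a minimal sketch, not the NAG implementation: it uses the built-in `qr` and `svd` rather than f04jg or f08ka, assumes $X$ has full column rank (so $k=p$), and all variable names are local to the sketch. The data are those of the Example section.

```matlab
% Illustrative check in base MATLAB (no NAG calls); names are local to this sketch.
X = [ 0.6  1.2  3.9;
      5.0  4.0  2.5;
      1.0 -4.0 -5.5;
     -1.0 -2.0 -6.5;
     -4.2 -8.4 -4.8];
y = [3; 4; -1; -5; -1];
[n, p] = size(X);

% QU route: X = Q*U with U upper triangular (full column rank assumed, k = p).
[Q, U] = qr(X, 0);
b = U \ (Q' * y);                  % least squares solution
r = y - X * b;                     % residual vector
sigma = sqrt((r' * r) / (n - p));  % standard error of the residuals
Uinv = U \ eye(p);
C_qu = sigma^2 * (Uinv * Uinv');   % sigma^2 * inv(X'*X) = sigma^2 * inv(U) * inv(U)'

% SVD route: X = P*S*V' gives inv(X'*X) = V * S^(-2) * V'.
[~, S, V] = svd(X, 0);
s = diag(S);
C_svd = sigma^2 * (V * diag(1 ./ s.^2) * V');

disp(diag(C_qu)');                 % estimated variances of the coefficients
```

Both routes produce the same matrix to rounding error, and its diagonal holds the estimated variances of the regression coefficients.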

## References

Anderson T W (1958) An Introduction to Multivariate Statistical Analysis Wiley
Lawson C L and Hanson R J (1974) Solving Least Squares Problems Prentice–Hall

## Parameters

### Compulsory Input Parameters

1:     $\mathrm{job}$ – int64, int32 or nag_int scalar
Specifies which elements of $C$ are required.
${\mathbf{job}}=-1$
The upper triangular part of $C$ is required.
${\mathbf{job}}=0$
The diagonal elements of $C$ are required.
${\mathbf{job}}>0$
The elements of column job of $C$ are required.
Constraint: $-1\le {\mathbf{job}}\le {\mathbf{p}}$.
2:     $\mathrm{sigma}$ – double scalar
$\sigma$, the standard error of the residual vector given by
 $\sigma =\begin{cases}\sqrt{{r}^{\mathrm{T}}r/\left(n-k\right)},& n>k\\ 0,& n=k,\end{cases}$
where $k$ is the rank of $X$.
Constraint: ${\mathbf{sigma}}\ge 0.0$.
3:     $\mathrm{a}\left(\mathit{lda},{\mathbf{p}}\right)$ – double array
lda, the first dimension of the array, must satisfy the constraint
• if ${\mathbf{svd}}=\mathit{false}$ or ${\mathbf{job}}=-1$, $\mathit{lda}\ge {\mathbf{p}}$;
• if ${\mathbf{svd}}=\mathit{true}$ and ${\mathbf{job}}\ge 0$, $\mathit{lda}\ge \mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(1,{\mathbf{irank}}\right)$.
If ${\mathbf{svd}}=\mathit{false}$, a must contain the upper triangular matrix $U$ of the $QU$ factorization of $X$, or of the Cholesky factorization of ${X}^{\mathrm{T}}X$; elements of the array below the diagonal need not be set.
If ${\mathbf{svd}}=\mathit{true}$, a must contain the first $k$ rows of the matrix ${V}^{\mathrm{T}}$, where $k$ is the rank of $X$ and $V$ is the right-hand orthogonal matrix of the singular value decomposition of $X$. Thus the $i$th row must contain the $i$th right-hand singular vector of $X$.
4:     $\mathrm{svd}$ – logical scalar
Must be true if the least squares solution was obtained from a singular value decomposition of $X$. svd must be false if the least squares solution was obtained from either a $QU$ factorization of $X$ or a Cholesky factorization of ${X}^{\mathrm{T}}X$. When svd is false, the rank of $X$ is assumed to be $p$, so this setting is applicable only to full rank problems with $n\ge p$.
5:     $\mathrm{irank}$ – int64, int32 or nag_int scalar
If ${\mathbf{svd}}=\mathit{true}$, irank must specify the rank $k$ of the matrix $X$.
If ${\mathbf{svd}}=\mathit{false}$, irank is not referenced and the rank of $X$ is assumed to be $p$.
Constraint: $0<{\mathbf{irank}}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(n,{\mathbf{p}}\right)$.
6:     $\mathrm{sv}\left({\mathbf{p}}\right)$ – double array
If ${\mathbf{svd}}=\mathit{true}$, sv must contain the first irank singular values of $X$.
If ${\mathbf{svd}}=\mathit{false}$, sv is not referenced.
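To illustrate the layout expected when ${\mathbf{svd}}=\mathit{true}$, the sketch below builds arrays shaped like a, sv and irank from MATLAB's built-in `svd` for a hypothetical rank-deficient matrix. The data, the rank threshold and the value of sigma are assumptions made for illustration; in practice these quantities would come from nag_linsys_real_gen_solve (f04jg) or nag_lapack_dgelss (f08ka).

```matlab
% Sketch only: base MATLAB, hypothetical rank-deficient data (columns 2 and 3 coincide).
X = [1 2 2;
     3 4 4;
     5 6 6;
     7 8 8];
[n, p] = size(X);

[~, S, V] = svd(X, 0);
s = diag(S);
irank = sum(s > 1e-10 * s(1));   % numerical rank k; the threshold is an assumption
sv = s;                          % length p; only sv(1:irank) would be referenced
a = V(:, 1:irank)';              % irank-by-p: row i is the i-th right singular vector

% What these inputs encode: the pseudo-inverse of X'*X is
%   a' * diag(1 ./ sv(1:irank).^2) * a
G_pinv = a' * diag(1 ./ sv(1:irank).^2) * a;
sigma = 0.5;                     % placeholder value; normally returned by the solver
C = sigma^2 * G_pinv;            % covariance matrix in the singular case
```

Here a has only irank rows, which is consistent with the constraint $\mathit{lda}\ge \mathrm{max}\left(1,{\mathbf{irank}}\right)$ for ${\mathbf{job}}\ge 0$.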

### Optional Input Parameters

1:     $\mathrm{p}$ – int64, int32 or nag_int scalar
Default: the dimension of the array sv and the second dimension of the array a. (An error is raised if these dimensions are not equal.)
$p$, the order of the variance-covariance matrix $C$.
Constraint: ${\mathbf{p}}\ge 1$.

### Output Parameters

1:     $\mathrm{a}\left(\mathit{lda},{\mathbf{p}}\right)$ – double array
If ${\mathbf{job}}\ge 0$, a is unchanged.
If ${\mathbf{job}}=-1$, a contains the upper triangle of the symmetric matrix $C$.
If ${\mathbf{svd}}=\mathit{true}$, elements of the array below the diagonal are used as workspace.
If ${\mathbf{svd}}=\mathit{false}$, they are unchanged.
2:     $\mathrm{cj}\left({\mathbf{p}}\right)$ – double array
If ${\mathbf{job}}=0$, cj returns the diagonal elements of $C$.
If ${\mathbf{job}}=j>0$, cj returns the $j$th column of $C$.
If ${\mathbf{job}}=-1$, cj is not referenced.
3:     $\mathrm{ifail}$ – int64, int32 or nag_int scalar
${\mathbf{ifail}}={\mathbf{0}}$ unless the function detects an error (see Error Indicators and Warnings).
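As a small illustration of which entries each value of job refers to, the sketch below picks them out of a hypothetical symmetric matrix in base MATLAB. The matrix and variable names are assumptions for illustration only; the function itself computes these entries from the factorization without forming $C$ from ${X}^{\mathrm{T}}X$ as done here.

```matlab
% Sketch: which entries of a p-by-p covariance matrix C each job value asks for.
C = [4   2   1;
     2   3   0.5;
     1   0.5 2];                % hypothetical symmetric matrix, illustration only

upper = triu(C);                % job = -1: upper triangle of C (returned in a)
variances = diag(C)';           % job =  0: diagonal of C (returned in cj)
j = 2;
column_j = C(:, j)';            % job =  j > 0: j-th column of C (returned in cj)
```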

## Error Indicators and Warnings

Errors or warnings detected by the function:
${\mathbf{ifail}}=1$
 On entry, one of the following holds:
• ${\mathbf{p}}<1$;
• ${\mathbf{sigma}}<0.0$;
• ${\mathbf{job}}<-1$ or ${\mathbf{job}}>{\mathbf{p}}$;
• ${\mathbf{svd}}=\mathit{true}$ and (${\mathbf{irank}}<0$, or ${\mathbf{irank}}>{\mathbf{p}}$, or (${\mathbf{job}}\ge 0$ and $\mathit{lda}<\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(1,{\mathbf{irank}}\right)$), or (${\mathbf{job}}=-1$ and $\mathit{lda}<{\mathbf{p}}$));
• ${\mathbf{svd}}=\mathit{false}$ and $\mathit{lda}<{\mathbf{p}}$.
${\mathbf{ifail}}=2$
On entry, ${\mathbf{svd}}=\mathit{true}$ and ${\mathbf{irank}}=0$.
${\mathbf{ifail}}=3$
On entry, ${\mathbf{svd}}=\mathit{false}$ and overflow would occur in computing an element of $C$. The upper triangular matrix $U$ must be very nearly singular.
${\mathbf{ifail}}=4$
On entry, ${\mathbf{svd}}=\mathit{true}$ and one of the first irank singular values is zero. Either the first irank singular values or irank must be incorrect.
$\mathbf{\text{overflow}}$
If overflow occurs then either an element of $C$ is very large, or more likely, either the rank, or the upper triangular matrix, or the singular values or vectors have been incorrectly supplied.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
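The entry conditions behind ${\mathbf{ifail}}=1$ and ${\mathbf{ifail}}=2$ depend only on the argument values and array shapes, so they can be tested before the call. The following is a hypothetical pre-check in base MATLAB that mirrors the conditions listed above; it is not part of the NAG Toolbox, the variable names are assumptions, and plain doubles are used instead of integer types for simplicity. The conditions for ${\mathbf{ifail}}=3$ and ${\mathbf{ifail}}=4$ depend on the data and are not covered.

```matlab
% Hypothetical pre-check of the documented entry conditions; not a NAG routine.
job = 0; sigma = 0.412; use_svd = true; irank = 0;
a = zeros(3, 3);
[lda, p] = size(a);

% Conditions reported as ifail = 1.
bad = (p < 1) || (sigma < 0) || (job < -1) || (job > p);
if use_svd
    bad = bad || (irank < 0) || (irank > p) ...
              || (job >= 0 && lda < max(1, irank)) ...
              || (job == -1 && lda < p);
else
    bad = bad || (lda < p);
end

if bad
    expected_ifail = 1;
elseif use_svd && irank == 0
    expected_ifail = 2;          % zero rank supplied with svd = true
else
    expected_ifail = 0;
end
fprintf('expected ifail = %d\n', expected_ifail);
```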

## Accuracy

The computed elements of $C$ will be the exact covariances of a closely neighbouring least squares problem, so long as a numerically stable method has been used in the solution of the least squares problem.

## Further Comments

When ${\mathbf{job}}=-1$ the time taken by nag_linsys_real_gen_lsq_covmat (f04ya) is approximately proportional to $p{k}^{2}$, where $k$ is the rank of $X$. When ${\mathbf{job}}=0$ and ${\mathbf{svd}}=\mathit{false}$, the time taken by the function is approximately proportional to $p{k}^{2}$, otherwise the time taken is approximately proportional to $pk$.

## Example

This example finds the estimated variances of the sample regression coefficients (the diagonal elements of $C$) for the linear least squares problem
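 $\mathrm{min}\;{r}^{\mathrm{T}}r,\quad \text{where}\quad r=y-Xb\quad \text{and}$
 $X=\begin{pmatrix}0.6& 1.2& 3.9\\ 5.0& 4.0& 2.5\\ 1.0& -4.0& -5.5\\ -1.0& -2.0& -6.5\\ -4.2& -8.4& -4.8\end{pmatrix},\quad y=\begin{pmatrix}3.0\\ 4.0\\ -1.0\\ -5.0\\ -1.0\end{pmatrix},$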
following a solution obtained by nag_linsys_real_gen_solve (f04jg). See the function document for nag_linsys_real_gen_solve (f04jg) for further information.
```
function f04ya_example

fprintf('f04ya example results\n\n');

% Solve the linear least squares problem min ||b - a*x|| for general a
a = [ 0.6  1.2  3.9;
      5.0  4.0  2.5;
      1.0 -4.0 -5.5;
     -1.0 -2.0 -6.5;
     -4.2 -8.4 -4.8];
b = [3; 4; -1; -5; -1];
[n, p] = size(a);
tol = 0.0005;
lwork = int64(32);
[a, x, svd, sigma, irank, work, ifail] = ...
    f04jg(a, b, tol, lwork);

fprintf('Standard error = %6.3f, rank = %2d\n\n', sigma, irank);
if svd
  fprintf('Solution obtained from SVD of A\n');
else
  fprintf('Solution obtained from QU factorization of A\n');
end
disp(x(1:p)');

% job = 0 requests the diagonal elements of C;
% work(1:p) is passed as sv, which is referenced only when svd is true
job = int64(0);
[a, cj, ifail] = f04ya( ...
    job, sigma, a, svd, irank, work(1:p));
disp('Estimated variances of regression coefficients');
disp(cj');

```
```
f04ya example results

Standard error =  0.412, rank =  3

Solution obtained from QU factorization of A
0.9533   -0.8433    0.9067

Estimated variances of regression coefficients
0.0106    0.0093    0.0045

```
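As a hand check of the figures above (a worked sketch, not part of the original example program), the normal matrix for this $X$ is a multiple of a small integer matrix, so the solution, the standard error and the variances can be verified directly with $n=5$ and $k=p=3$:

```latex
\begin{aligned}
X^{\mathrm{T}}X &= 9\begin{pmatrix} 5 & 6 & 4\\ 6 & 12 & 10\\ 4 & 10 & 13 \end{pmatrix},
\qquad
\left(X^{\mathrm{T}}X\right)^{-1} = \frac{1}{900}\begin{pmatrix} 56 & -38 & 12\\ -38 & 49 & -26\\ 12 & -26 & 24 \end{pmatrix},
\qquad
X^{\mathrm{T}}y = \begin{pmatrix} 30\\ 42\\ 64.5 \end{pmatrix},\\[4pt]
b &= \left(X^{\mathrm{T}}X\right)^{-1}X^{\mathrm{T}}y
   = \frac{1}{900}\begin{pmatrix} 858\\ -759\\ 816 \end{pmatrix}
   \approx \begin{pmatrix} 0.9533\\ -0.8433\\ 0.9067 \end{pmatrix},\\[4pt]
r^{\mathrm{T}}r &= y^{\mathrm{T}}y - b^{\mathrm{T}}X^{\mathrm{T}}y = 52 - 51.66 = 0.34,
\qquad
\sigma^{2} = \frac{0.34}{5-3} = 0.17,
\qquad
\sigma \approx 0.412,\\[4pt]
\operatorname{diag}(C) &= \frac{0.17}{900}\,(56,\ 49,\ 24) \approx (0.0106,\ 0.0093,\ 0.0045).
\end{aligned}
```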

