# NAG Toolbox: nag_correg_lars_param (g02mc)

## Purpose

nag_correg_lars_param (g02mc) calculates additional parameter estimates following Least Angle Regression (LARS), forward stagewise linear regression or Least Absolute Shrinkage and Selection Operator (LASSO) as performed by nag_correg_lars (g02ma) and nag_correg_lars_xtx (g02mb).

## Syntax

[nb, ifail] = g02mc(b, fitsum, ktype, nk, 'nstep', nstep, 'ip', ip, 'lnk', lnk)
[nb, ifail] = nag_correg_lars_param(b, fitsum, ktype, nk, 'nstep', nstep, 'ip', ip, 'lnk', lnk)

## Description

nag_correg_lars (g02ma) and nag_correg_lars_xtx (g02mb) fit either a LARS, forward stagewise linear regression, LASSO or positive LASSO model to a vector of $n$ observed values, $y=\left\{{y}_{i}:i=1,2,\dots ,n\right\}$, and an $n\times p$ design matrix $X$, where the $j$th column of $X$ is given by the $j$th independent variable ${x}_{j}$. The models are fitted using the LARS algorithm of Efron et al. (2004).
Figure 1
The full solution path for all four of these models follows a similar pattern, in which the parameter estimate for a given variable is piecewise linear. One such path, for a LARS model with six variables $\left(p=6\right)$, can be seen in Figure 1. Both nag_correg_lars (g02ma) and nag_correg_lars_xtx (g02mb) return the vector of $p$ parameter estimates, ${\beta }_{k}$, at $K$ points along this path (so $k=1,2,\dots ,K$). Each point corresponds to a step of the LARS algorithm. The number of steps taken depends on the model being fitted. In the case of a LARS model, $K=p$ and each step corresponds to a new variable being included in the model. In the case of the LASSO models, each step corresponds to either a new variable being included in the model or an existing variable being removed from the model; the value of $K$ is therefore no longer bounded by the number of parameters. For forward stagewise linear regression, each step no longer corresponds to the addition or removal of a variable; therefore the number of possible steps is often markedly greater than for a corresponding LASSO model.
nag_correg_lars_param (g02mc) uses the piecewise linear nature of the solution path to predict the parameter estimates, $\stackrel{~}{\beta }$, at a different point on this path. The location of the solution can either be defined in terms of a (fractional) step number or a function of the ${L}_{1}$ norm of the parameter estimates.
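The fractional-step case can be illustrated with plain linear interpolation between consecutive steps. The following is a minimal Python sketch of that idea, not the routine's actual implementation: the function name `interpolate_path` is hypothetical, and treating step 0 as the all-zero vector is an assumption of the sketch. The two rows of input are steps 1 and 2 of the results listed in the Example section.

```python
def interpolate_path(beta_steps, target):
    """Linearly interpolate parameter estimates at a fractional step.

    beta_steps[k - 1] holds the estimates at step k, for k = 1..K.
    Step 0 is assumed (for this sketch) to be the all-zero vector.
    """
    K = len(beta_steps)
    if K == 0:
        raise ValueError("at least one step is required")
    if not 0.0 <= target <= K:
        raise ValueError("target step must lie in [0, K]")
    p = len(beta_steps[0])
    # Prepend the assumed starting point so that path[k] is step k.
    path = [[0.0] * p] + [list(row) for row in beta_steps]
    lower = min(int(target), K - 1)   # index of the segment's left end
    frac = target - lower             # position within the segment
    return [a + frac * (b - a) for a, b in zip(path[lower], path[lower + 1])]


# Steps 1 and 2 as printed in the Example section.
beta_steps = [
    [0.000, 0.000, 3.125, 0.000, 0.000, 0.000],
    [0.000, 0.000, 3.792, 0.000, 0.000, -0.713],
]
estimate = interpolate_path(beta_steps, 1.2)
print([round(v, 3) for v in estimate])  # prints [0.0, 0.0, 3.258, 0.0, 0.0, -0.143]
```

The printed values agree with the row for a step number of 1.2 in the Example results.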

## References

Efron B, Hastie T, Johnstone I and Tibshirani R (2004) Least Angle Regression The Annals of Statistics (Volume 32) 2 407–499
Hastie T, Tibshirani R and Friedman J (2001) The Elements of Statistical Learning: Data Mining, Inference and Prediction Springer (New York)
Tibshirani R (1996) Regression Shrinkage and Selection via the Lasso Journal of the Royal Statistical Society, Series B (Methodological) (Volume 58) 1 267–288
Weisberg S (1985) Applied Linear Regression Wiley

## Parameters

### Compulsory Input Parameters

1:     $\mathrm{b}\left(\mathit{ldb},:\right)$ – double array
The first dimension of the array b must be at least ${\mathbf{ip}}$.
The second dimension of the array b must be at least ${\mathbf{nstep}}+1$.
$\beta$, the parameter estimates, as returned by nag_correg_lars (g02ma) and nag_correg_lars_xtx (g02mb), with ${\mathbf{b}}\left(\mathit{j},k\right)={\beta }_{k\mathit{j}}$, the parameter estimate for the $\mathit{j}$th variable, for $\mathit{j}=1,2,\dots ,p$, at the $k$th step of the model fitting process.
Constraint: b should be unchanged since the last call to nag_correg_lars (g02ma) or nag_correg_lars_xtx (g02mb).
2:     $\mathrm{fitsum}\left(6,{\mathbf{nstep}}+1\right)$ – double array
Summaries of the model fitting process, as returned by nag_correg_lars (g02ma) and nag_correg_lars_xtx (g02mb).
Constraint: fitsum should be unchanged since the last call to nag_correg_lars (g02ma) or nag_correg_lars_xtx (g02mb).
3:     $\mathrm{ktype}$ – int64, int32 or nag_int scalar
Indicates what target values are held in nk.
${\mathbf{ktype}}=1$
nk holds (fractional) LARS step numbers.
${\mathbf{ktype}}=2$
nk holds values for the ${L}_{1}$ norm of the (scaled) parameters.
${\mathbf{ktype}}=3$
nk holds ratios with respect to the largest (scaled) ${L}_{1}$ norm.
${\mathbf{ktype}}=4$
nk holds values for the ${L}_{1}$ norm of the (unscaled) parameters.
${\mathbf{ktype}}=5$
nk holds ratios with respect to the largest (unscaled) ${L}_{1}$ norm.
If nag_correg_lars (g02ma) was called with ${\mathbf{pred}}=0$ or $1$ or nag_correg_lars_xtx (g02mb) was called with ${\mathbf{pred}}=0$ then the model fitting routine did not rescale the independent variables, $X$, prior to fitting the model and therefore there is no difference between ${\mathbf{ktype}}=2$ or $3$ and ${\mathbf{ktype}}=4$ or $5$.
Constraint: ${\mathbf{ktype}}=1$, $2$, $3$, $4$ or $5$.
4:     $\mathrm{nk}\left({\mathbf{lnk}}\right)$ – double array
Target values used for predicting the new set of parameter estimates.
Constraints:
• if ${\mathbf{ktype}}=1$, $0\le {\mathbf{nk}}\left(\mathit{i}\right)\le {\mathbf{nstep}}$, for $\mathit{i}=1,2,\dots ,{\mathbf{lnk}}$;
• if ${\mathbf{ktype}}=2$, $0\le {\mathbf{nk}}\left(\mathit{i}\right)\le {\mathbf{fitsum}}\left(1,{\mathbf{nstep}}\right)$, for $\mathit{i}=1,2,\dots ,{\mathbf{lnk}}$;
• if ${\mathbf{ktype}}=3$ or $5$, $0\le {\mathbf{nk}}\left(\mathit{i}\right)\le 1$, for $\mathit{i}=1,2,\dots ,{\mathbf{lnk}}$;
• if ${\mathbf{ktype}}=4$, $0\le {\mathbf{nk}}\left(\mathit{i}\right)\le {‖{\beta }_{K}‖}_{1}$, for $\mathit{i}=1,2,\dots ,{\mathbf{lnk}}$.
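When the target is given as an ${L}_{1}$ norm or as a ratio of the largest norm, it has to be located on the path before any estimate can be read off. The Python sketch below shows one plausible way to do this and is an assumption, not a description of the routine's internals: it supposes that the ${L}_{1}$ norm is non-decreasing from step to step and changes linearly within each segment (which holds when no estimate changes sign inside a segment), it takes step 0 to have norm zero, and it ignores the scaled versus unscaled distinction. The names `l1_norm` and `step_from_l1` are hypothetical.

```python
def l1_norm(beta):
    """Sum of absolute values of one vector of parameter estimates."""
    return sum(abs(v) for v in beta)


def step_from_l1(beta_steps, target_norm):
    """Convert a target L1 norm into a fractional step number.

    beta_steps[k - 1] holds the estimates at step k, for k = 1..K.
    Assumes the norm is zero at step 0, non-decreasing across steps
    and linear within each segment.
    """
    norms = [0.0] + [l1_norm(b) for b in beta_steps]
    if not 0.0 <= target_norm <= norms[-1]:
        raise ValueError("target norm must lie in [0, largest norm]")
    for k in range(len(norms) - 1):
        lo, hi = norms[k], norms[k + 1]
        if lo <= target_norm <= hi:
            if hi == lo:              # flat segment: return its left end
                return float(k)
            return k + (target_norm - lo) / (hi - lo)
    raise ValueError("norms are not non-decreasing along the path")


# Steps 1 and 2 as printed in the Example section.
beta_steps = [
    [0.000, 0.000, 3.125, 0.000, 0.000, 0.000],
    [0.000, 0.000, 3.792, 0.000, 0.000, -0.713],
]

# Target given directly as an L1 norm.
print(step_from_l1(beta_steps, 3.401))

# Target given as a ratio of the largest L1 norm on the path.
ratio = 0.5
print(step_from_l1(beta_steps, ratio * l1_norm(beta_steps[-1])))
```

Once a fractional step number is available, the estimates themselves follow by interpolating between the two neighbouring steps.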

### Optional Input Parameters

1:     $\mathrm{nstep}$ – int64, int32 or nag_int scalar
Default:
$K$, the number of steps carried out in the model fitting process.
Constraint: ${\mathbf{nstep}}\ge 0$.
2:     $\mathrm{ip}$ – int64, int32 or nag_int scalar
Default: the first dimension of the array b.
$p$, number of parameter estimates.
Constraint: ${\mathbf{ip}}\ge 1$.
3:     $\mathrm{lnk}$ – int64, int32 or nag_int scalar
Default: the dimension of the array nk.
Number of values supplied in nk.
Constraint: ${\mathbf{lnk}}\ge 1$.

### Output Parameters

1:     $\mathrm{nb}\left(\mathit{ldnb},:\right)$ – double array
The first dimension of the array nb will be ${\mathbf{ip}}$.
The second dimension of the array nb will be ${\mathbf{lnk}}$.
$\stackrel{~}{\beta }$, the predicted parameter estimates, with ${\mathbf{nb}}\left(j,i\right)={\stackrel{~}{\beta }}_{ij}$, the parameter estimate for variable $j$, $j=1,2,\dots ,p$, at the point in the fitting process associated with ${\mathbf{nk}}\left(i\right)$, $i=1,2,\dots ,{\mathbf{lnk}}$.
2:     $\mathrm{ifail}$ – int64, int32 or nag_int scalar
${\mathbf{ifail}}={\mathbf{0}}$ unless the function detects an error (see Error Indicators and Warnings).

## Error Indicators and Warnings

Note: nag_correg_lars_param (g02mc) may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the function:
${\mathbf{ifail}}=11$
Constraint: ${\mathbf{nstep}}\ge 0$.
${\mathbf{ifail}}=21$
Constraint: ${\mathbf{ip}}\ge 1$.
${\mathbf{ifail}}=31$
b has been corrupted since the last call to nag_correg_lars (g02ma) or nag_correg_lars_xtx (g02mb).
${\mathbf{ifail}}=41$
Constraint: $\mathit{ldb}\ge {\mathbf{ip}}$.
${\mathbf{ifail}}=51$
fitsum has been corrupted since the last call to nag_correg_lars (g02ma) or nag_correg_lars_xtx (g02mb).
${\mathbf{ifail}}=61$
Constraint: ${\mathbf{ktype}}=1$, $2$, $3$, $4$ or $5$.
${\mathbf{ifail}}=71$
Constraint: $0\le {\mathbf{nk}}\left(i\right)\le {\mathbf{nstep}}$ for all $i$.
${\mathbf{ifail}}=72$
Constraint: $0\le {\mathbf{nk}}\left(i\right)\le {\mathbf{fitsum}}\left(1,{\mathbf{nstep}}\right)$ for all $i$.
${\mathbf{ifail}}=73$
Constraint: $0\le {\mathbf{nk}}\left(i\right)\le 1$ for all $i$.
${\mathbf{ifail}}=74$
Constraint: $0\le {\mathbf{nk}}\left(i\right)\le {‖{\beta }_{K}‖}_{1}$ for all $i$.
${\mathbf{ifail}}=81$
Constraint: ${\mathbf{lnk}}\ge 1$.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.

## Accuracy

Not applicable.

## Further Comments

None.

## Example

This example fits a LARS model to a simulated dataset with $20$ observations and $6$ independent variables.
Additional parameter estimates are obtained corresponding to LARS step numbers of $0.2$, $1.2$, $3.2$, $4.5$ and $5.2$, where, for example, $4.5$ corresponds to the solution halfway between that obtained at step $4$ and that obtained at step $5$.
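As a worked check of the halfway case, take the estimates for the first variable at steps $4$ and $5$ from the results listed at the end of this section ($-0.628$ and $-1.060$). Assuming plain linear interpolation between the two steps, which is consistent with the printed results:

$${\stackrel{~}{\beta }}_{1}={\beta }_{4,1}+0.5\left({\beta }_{5,1}-{\beta }_{4,1}\right)=-0.628+0.5\left(-1.060+0.628\right)=-0.844$$

This matches the first entry of the row for a step number of $4.5$ in the results.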
```
function g02mc_example

fprintf('g02mc example results\n\n');

% Fit a LARS model via g02ma, with g02ma mean centring y and
% normalising X about the mean
mtype = int64(1);
pred = int64(3);
prey = int64(1);

% Independent variables
d = [10.28  1.77  9.69 15.58  8.23 10.44;
9.08  8.99 11.53  6.57 15.89 12.58;
17.98 13.10  1.04 10.45 10.12 16.68;
14.82 13.79 12.23  7.00  8.14  7.79;
17.53  9.41  6.24  3.75 13.12 17.08;
7.78 10.38  9.83  2.58 10.13  4.25;
11.95 21.71  8.83 11.00 12.59 10.52;
14.60 10.09 -2.70  9.89 14.67  6.49;
3.63  9.07 12.59 14.09  9.06  8.19;
6.35  9.79  9.40 12.79  8.38 16.79;
4.66  3.55 16.82 13.83 21.39 13.88;
8.32 14.04 17.17  7.93  7.39 -1.09;
10.86 13.68  5.75 10.44 10.36 10.06;
4.76  4.92 17.83  2.90  7.58 11.97;
5.05 10.41  9.89  9.04  7.90 13.12;
5.41  9.32  5.27 15.53  5.06 19.84;
9.77  2.37  9.54 20.23  9.33  8.82;
14.28  4.34 14.23 14.95 18.16 11.03;
10.17  6.80  3.17  8.57 16.07 15.93;
5.39  2.67  6.37 13.56 10.68  7.35];

% Dependent variable
y = [-46.47; -35.80; -129.22;  -42.44; -73.51;
-26.61; -63.90;  -76.73;  -32.64; -83.29;
-16.31;  -5.82;  -47.75;   18.38; -54.71;
-55.62; -45.28;  -22.76; -104.32; -55.94];

% g02ma can issue warnings, but return sensible results,
% so save current warning state and turn warnings on
warn_state = nag_issue_warnings();
nag_issue_warnings(true);

% Call the model fitting routine
[b,fitsum,ifail] = g02ma(mtype,d,y);

% Reset the warning state to its initial value
nag_issue_warnings(warn_state);

% Set how the additional estimates will be specified

% Location of additional parameter estimates (as defined by the
% LARS step number)
ktype = int64(1);
nk = [0.2; 1.2; 3.2; 4.5; 5.2];

% Calculate the additional parameter estimates
[nb,ifail] = g02mc(b,fitsum,ktype,nk);

% Print the results
ip = size(b,1);
K = size(b,2) - 2;
lnk = size(nk,1);

fprintf(' Parameter Estimates from g02ma\n');
fprintf('  Step %s Parameter Estimate\n ',repmat(' ',1,max(ip-2,0)*5));
fprintf(repmat('-',1,5+ip*10));
fprintf('\n');
for k = 1:K
  fprintf('  %3d',k);
  for j = 1:ip
    fprintf(' %9.3f',b(j,k));
  end
  fprintf('\n');
end
fprintf('\n');

fprintf(' Additional Parameter Estimates from g02mc\n');
fprintf('   nk  %s Parameter Estimate\n ',repmat(' ',1,max(ip-2,0)*5));
fprintf(repmat('-',1,5+ip*10));
fprintf('\n');
for k = 1:lnk
  fprintf('  %4.1f',nk(k));
  for j = 1:ip
    fprintf(' %9.3f',nb(j,k));
  end
  fprintf('\n');
end

```
```
g02mc example results

Parameter Estimates from g02ma
Step                      Parameter Estimate
-----------------------------------------------------------------
1     0.000     0.000     3.125     0.000     0.000     0.000
2     0.000     0.000     3.792     0.000     0.000    -0.713
3    -0.446     0.000     3.998     0.000     0.000    -1.151
4    -0.628    -0.295     4.098     0.000     0.000    -1.466
5    -1.060    -1.056     4.110    -0.864     0.000    -1.948
6    -1.073    -1.132     4.118    -0.935    -0.059    -1.981

Additional Parameter Estimates from g02mc
nk                       Parameter Estimate
-----------------------------------------------------------------
0.2     0.000     0.000     0.625     0.000     0.000     0.000
1.2     0.000     0.000     3.258     0.000     0.000    -0.143
3.2    -0.483    -0.059     4.018     0.000     0.000    -1.214
4.5    -0.844    -0.676     4.104    -0.432     0.000    -1.707
5.2    -1.062    -1.071     4.112    -0.878    -0.012    -1.955
```