# NAG Library Routine Document

## 1 Purpose

g02lcf calculates parameter estimates for a given number of factors given the output from an orthogonal scores PLS regression (g02laf or g02lbf).

## 2 Specification

Fortran Interface
 Subroutine g02lcf ( ip, my, maxfac, nfact, p, ldp, c, ldc, w, ldw, rcond, b, ldb, orig, xbar, ybar, iscale, xstd, ystd, ob, ldob, vipopt, ycv, ldycv, vip, ldvip, ifail)
 Integer, Intent (In) :: ip, my, maxfac, nfact, ldp, ldc, ldw, ldb, orig, iscale, ldob, vipopt, ldycv, ldvip
 Integer, Intent (Inout) :: ifail
 Real (Kind=nag_wp), Intent (In) :: p(ldp,maxfac), c(ldc,maxfac), w(ldw,maxfac), rcond, xbar(ip), ybar(my), xstd(ip), ystd(my), ycv(ldycv,my)
 Real (Kind=nag_wp), Intent (Inout) :: b(ldb,my), ob(ldob,my), vip(ldvip,vipopt)
C Header Interface
 #include <nagmk26.h>
 void g02lcf_ (const Integer *ip, const Integer *my, const Integer *maxfac, const Integer *nfact, const double p[], const Integer *ldp, const double c[], const Integer *ldc, const double w[], const Integer *ldw, const double *rcond, double b[], const Integer *ldb, const Integer *orig, const double xbar[], const double ybar[], const Integer *iscale, const double xstd[], const double ystd[], double ob[], const Integer *ldob, const Integer *vipopt, const double ycv[], const Integer *ldycv, double vip[], const Integer *ldvip, Integer *ifail)

## 3 Description

The parameter estimates $B$ for an $l$-factor orthogonal scores PLS model with $m$ predictor variables and $r$ response variables are given by
 $B=W{\left({P}^{\mathrm{T}}W\right)}^{-1}{C}^{\mathrm{T}}\text{, }B\in {\mathbb{R}}^{m\times r}\text{,}$
where $W$ is the $m$ by $k$ ($\ge l$) matrix of $x$-weights; $P$ is the $m$ by $k$ matrix of $x$-loadings; and $C$ is the $r$ by $k$ matrix of $y$-loadings for a fitted PLS model.
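The formula above can be illustrated with a minimal pure-Python sketch. It assumes that the $l$-factor model uses the first $l$ columns of $W$, $P$ and $C$, and it solves the linear system directly rather than through a singular value decomposition, so it is not a substitute for g02lcf. The helper names (`transpose`, `matmul`, `solve`, `pls_coefficients`) are hypothetical and not part of the NAG Library.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, rhs):
    """Solve a * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(a[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pls_coefficients(w, p, c, nfact):
    """Return B = W (P^T W)^{-1} C^T using the first nfact columns (assumed)."""
    wl = [row[:nfact] for row in w]   # m by l
    pl = [row[:nfact] for row in p]   # m by l
    cl = [row[:nfact] for row in c]   # r by l
    ptw = matmul(transpose(pl), wl)   # l by l
    # (P^T W)^{-1} C^T is obtained by solving (P^T W) Z = C^T.
    z = solve(ptw, transpose(cl))     # l by r
    return matmul(wl, z)              # m by r


# Toy data: m = 3 predictors, r = 1 response, k = 2 available factors.
w = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
p = [[2.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
c = [[3.0, 8.0]]
b = pls_coefficients(w, p, c, nfact=2)   # [[1.5], [2.0], [0.0]]
```

Solving the system instead of forming the inverse keeps the sketch short; it fails on a singular ${P}^{\mathrm{T}}W$, which is the case the rcond argument is there to handle.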
The parameter estimates $B$ are for centred, and possibly scaled, predictor data ${X}_{1}$ and response data ${Y}_{1}$. Parameter estimates may also be given for the predictor data $X$ and response data $Y$.
Optionally, g02lcf will calculate variable influence on projection (VIP) statistics, see Wold (1994).
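This document does not state the VIP formula. The sketch below uses a commonly quoted form, in which the squared VIP of predictor $i$ is $m$ times the average of its squared normalized $x$-weights, weighted by the response variance explained by each factor. Whether g02lcf uses exactly this normalization is an assumption, and the function names are hypothetical. The explained variance per factor is taken as the difference between successive cumulative percentages, matching the layout of ycv (rows are factors, columns are responses).

```python
import math


def vip_scores(w, ycv_column, nfact):
    """VIP for one column of cumulative explained-variance percentages."""
    m = len(w)
    # Variance explained by each factor alone: differences of cumulative values.
    gains = [ycv_column[a] - (ycv_column[a - 1] if a > 0 else 0.0)
             for a in range(nfact)]
    total = sum(gains)
    # Squared Euclidean norm of each weight vector (column of w).
    norms2 = [sum(w[i][a] ** 2 for i in range(m)) for a in range(nfact)]
    return [
        math.sqrt(m * sum(gains[a] * w[i][a] ** 2 / norms2[a]
                          for a in range(nfact)) / total)
        for i in range(m)
    ]


def mean_ycv(ycv, nfact):
    """Average the cumulative percentages over the response variables."""
    return [sum(ycv[a]) / len(ycv[a]) for a in range(nfact)]


# Toy data: m = 2 predictors, 2 factors, 2 responses.
w = [[0.6, 0.8], [0.8, -0.6]]
ycv = [[50.0, 30.0], [80.0, 90.0]]
# One VIP per predictor from the mean explained variance (as for vipopt = 1).
vip_all = vip_scores(w, mean_ycv(ycv, 2), 2)
# VIP for the first response only (one column, as for vipopt = my).
vip_first = vip_scores(w, [row[0] for row in ycv], 2)
```

With this normalization the squared VIP values sum to $m$, so values above 1 mark predictors with above-average influence.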

## 4 References

Wold S (1994) PLS for multivariate linear modelling. QSAR: Chemometric Methods in Molecular Design, Methods and Principles in Medicinal Chemistry (ed van de Waterbeemd H) Verlag-Chemie

## 5 Arguments

1:     $\mathbf{ip}$ – Integer, Input
On entry: $m$, the number of predictor variables in the fitted model.
Constraint: ${\mathbf{ip}}>1$.
2:     $\mathbf{my}$ – Integer, Input
On entry: $r$, the number of response variables.
Constraint: ${\mathbf{my}}\ge 1$.
3:     $\mathbf{maxfac}$ – Integer, Input
On entry: $k$, the number of factors available in the PLS model.
Constraint: $1\le {\mathbf{maxfac}}\le {\mathbf{ip}}$.
4:     $\mathbf{nfact}$ – Integer, Input
On entry: $l$, the number of factors to include in the calculation of parameter estimates.
Constraint: $1\le {\mathbf{nfact}}\le {\mathbf{maxfac}}$.
5:     $\mathbf{p}\left({\mathbf{ldp}},{\mathbf{maxfac}}\right)$ – Real (Kind=nag_wp) array, Input
On entry: $x$-loadings as returned from g02laf and g02lbf.
6:     $\mathbf{ldp}$ – Integer, Input
On entry: the first dimension of the array p as declared in the (sub)program from which g02lcf is called.
Constraint: ${\mathbf{ldp}}\ge {\mathbf{ip}}$.
7:     $\mathbf{c}\left({\mathbf{ldc}},{\mathbf{maxfac}}\right)$ – Real (Kind=nag_wp) array, Input
On entry: $y$-loadings as returned from g02laf and g02lbf.
8:     $\mathbf{ldc}$ – Integer, Input
On entry: the first dimension of the array c as declared in the (sub)program from which g02lcf is called.
Constraint: ${\mathbf{ldc}}\ge {\mathbf{my}}$.
9:     $\mathbf{w}\left({\mathbf{ldw}},{\mathbf{maxfac}}\right)$ – Real (Kind=nag_wp) array, Input
On entry: $x$-weights as returned from g02laf and g02lbf.
10:   $\mathbf{ldw}$ – Integer, Input
On entry: the first dimension of the array w as declared in the (sub)program from which g02lcf is called.
Constraint: ${\mathbf{ldw}}\ge {\mathbf{ip}}$.
11:   $\mathbf{rcond}$ – Real (Kind=nag_wp), Input
On entry: singular values of ${P}^{\mathrm{T}}W$ less than rcond times the maximum singular value are treated as zero when calculating parameter estimates. If rcond is negative, a value of $0.005$ is used.
12:   $\mathbf{b}\left({\mathbf{ldb}},{\mathbf{my}}\right)$ – Real (Kind=nag_wp) array, Output
On exit: ${\mathbf{b}}\left(\mathit{i},\mathit{j}\right)$ contains the parameter estimate for the $\mathit{i}$th predictor variable in the model for the $\mathit{j}$th response variable, for $\mathit{i}=1,2,\dots ,{\mathbf{ip}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{my}}$.
13:   $\mathbf{ldb}$ – Integer, Input
On entry: the first dimension of the array b as declared in the (sub)program from which g02lcf is called.
Constraint: ${\mathbf{ldb}}\ge {\mathbf{ip}}$.
14:   $\mathbf{orig}$ – Integer, Input
On entry: indicates how parameter estimates are calculated.
${\mathbf{orig}}=-1$
Parameter estimates for the centred, and possibly scaled, data.
${\mathbf{orig}}=1$
Parameter estimates for the original data.
Constraint: ${\mathbf{orig}}=-1$ or $1$.
15:   $\mathbf{xbar}\left({\mathbf{ip}}\right)$ – Real (Kind=nag_wp) array, Input
On entry: if ${\mathbf{orig}}=1$, mean values of predictor variables in the model; otherwise xbar is not referenced.
16:   $\mathbf{ybar}\left({\mathbf{my}}\right)$ – Real (Kind=nag_wp) array, Input
On entry: if ${\mathbf{orig}}=1$, mean value of each response variable in the model; otherwise ybar is not referenced.
17:   $\mathbf{iscale}$ – Integer, Input
On entry: if ${\mathbf{orig}}=1$, iscale must take the value supplied to either g02laf or g02lbf; otherwise iscale is not referenced.
Constraint: if ${\mathbf{orig}}=1$, ${\mathbf{iscale}}=-1$, $1$ or $2$.
18:   $\mathbf{xstd}\left({\mathbf{ip}}\right)$ – Real (Kind=nag_wp) array, Input
On entry: if ${\mathbf{orig}}=1$ and ${\mathbf{iscale}}\ne -1$, the scalings of predictor variables in the model as returned from either g02laf or g02lbf; otherwise xstd is not referenced.
19:   $\mathbf{ystd}\left({\mathbf{my}}\right)$ – Real (Kind=nag_wp) array, Input
On entry: if ${\mathbf{orig}}=1$ and ${\mathbf{iscale}}\ne -1$, the scalings of response variables as returned from either g02laf or g02lbf; otherwise ystd is not referenced.
20:   $\mathbf{ob}\left({\mathbf{ldob}},{\mathbf{my}}\right)$ – Real (Kind=nag_wp) array, Output
On exit: if ${\mathbf{orig}}=1$, ${\mathbf{ob}}\left(1,\mathit{j}\right)$ contains the intercept value for the $\mathit{j}$th response variable, and ${\mathbf{ob}}\left(\mathit{i}+1,\mathit{j}\right)$ contains the parameter estimate on the original scale for the $\mathit{i}$th predictor variable in the model, for $\mathit{i}=1,2,\dots ,{\mathbf{ip}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{my}}$. Otherwise ob is not referenced.
21:   $\mathbf{ldob}$ – Integer, Input
On entry: the first dimension of the array ob as declared in the (sub)program from which g02lcf is called.
Constraints:
• if ${\mathbf{orig}}=1$, ${\mathbf{ldob}}\ge {\mathbf{ip}}+1$;
• otherwise ${\mathbf{ldob}}\ge 1$.
22:   $\mathbf{vipopt}$ – Integer, Input
On entry: a flag that determines variable influence on projections (VIP) options.
${\mathbf{vipopt}}=0$
VIP are not calculated.
${\mathbf{vipopt}}=1$
VIP are calculated for predictor variables using the mean explained variance in responses.
${\mathbf{vipopt}}={\mathbf{my}}$
VIP are calculated for predictor variables for each response variable in the model.
Note that setting ${\mathbf{vipopt}}={\mathbf{my}}$ when ${\mathbf{my}}=1$ gives the same result as setting ${\mathbf{vipopt}}=1$ directly.
Constraint: ${\mathbf{vipopt}}=0$, $1$ or ${\mathbf{my}}$.
23:   $\mathbf{ycv}\left({\mathbf{ldycv}},{\mathbf{my}}\right)$ – Real (Kind=nag_wp) array, Input
On entry: if ${\mathbf{vipopt}}\ne 0$, ${\mathbf{ycv}}\left(\mathit{i},\mathit{j}\right)$ is the cumulative percentage of variance of the $\mathit{j}$th response variable explained by the first $\mathit{i}$ factors, for $\mathit{i}=1,2,\dots ,{\mathbf{nfact}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{my}}$; otherwise ycv is not referenced.
24:   $\mathbf{ldycv}$ – Integer, Input
On entry: the first dimension of the array ycv as declared in the (sub)program from which g02lcf is called.
Constraint: if ${\mathbf{vipopt}}\ne 0$, ${\mathbf{ldycv}}\ge {\mathbf{nfact}}$.
25:   $\mathbf{vip}\left({\mathbf{ldvip}},{\mathbf{vipopt}}\right)$ – Real (Kind=nag_wp) array, Output
On exit: if ${\mathbf{vipopt}}=1$, ${\mathbf{vip}}\left(\mathit{i},1\right)$ contains the VIP statistic for the $\mathit{i}$th predictor variable in the model for all response variables, for $\mathit{i}=1,2,\dots ,{\mathbf{ip}}$.
If ${\mathbf{vipopt}}={\mathbf{my}}$, ${\mathbf{vip}}\left(\mathit{i},\mathit{j}\right)$ contains the VIP statistic for the $\mathit{i}$th predictor variable in the model for the $\mathit{j}$th response variable, for $\mathit{i}=1,2,\dots ,{\mathbf{ip}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{my}}$.
Otherwise vip is not referenced.
26:   $\mathbf{ldvip}$ – Integer, Input
On entry: the first dimension of the array vip as declared in the (sub)program from which g02lcf is called.
Constraint: if ${\mathbf{vipopt}}\ne 0$, ${\mathbf{ldvip}}\ge {\mathbf{ip}}$.
27:   $\mathbf{ifail}$ – Integer, Input/Output
On entry: ifail must be set to $0$, $-1$ or $1$. If you are unfamiliar with this argument you should refer to Section 3.4 in How to Use the NAG Library and its Documentation for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is $0$. When the value $-1$ or $1$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).
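The relationship between b and ob in the argument list can be sketched as follows. The sketch assumes that the centred and scaled data are formed as ${X}_{1}=(X-\bar{x})/{s}_{x}$ and ${Y}_{1}=(Y-\bar{y})/{s}_{y}$, with xstd and ystd holding the divisors; the document does not spell this out, so treat the algebra as an illustration rather than a description of the routine's internals. The function name `to_original_scale` is hypothetical. The returned layout follows ob: the first row holds the intercepts and the remaining rows hold the parameter estimates on the original scale.

```python
def to_original_scale(b, xbar, ybar, xstd=None, ystd=None):
    """Return ob with ob[0][j] the intercept and ob[i + 1][j] the slopes."""
    m, r = len(b), len(b[0])
    # With centring only (iscale = -1 in the document) every scale factor is 1.
    xs = xstd if xstd is not None else [1.0] * m
    ys = ystd if ystd is not None else [1.0] * r
    # Undo the scaling: slope on the original scale is b * ystd / xstd (assumed).
    slopes = [[b[i][j] * ys[j] / xs[i] for j in range(r)] for i in range(m)]
    # Undo the centring: the fitted surface passes through the means.
    intercepts = [ybar[j] - sum(xbar[i] * slopes[i][j] for i in range(m))
                  for j in range(r)]
    return [intercepts] + slopes


# One predictor, one response: b = 2 on the scaled data.
ob = to_original_scale([[2.0]], xbar=[1.0], ybar=[10.0], xstd=[2.0], ystd=[4.0])
# ob == [[6.0], [4.0]]: intercept 6, slope 4 on the original scale.
```

As a check, the predictor value 3 maps to a scaled value of 1, a scaled prediction of 2 and hence a response of $10+4\times 2=18$, which matches $6+4\times 3$ from ob.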

## 6 Error Indicators and Warnings

If on entry ${\mathbf{ifail}}=0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
${\mathbf{ifail}}=1$
On entry, ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ip}}>1$.
On entry, ${\mathbf{iscale}}=〈\mathit{\text{value}}〉$.
Constraint: if ${\mathbf{orig}}=1$, ${\mathbf{iscale}}=-1$, $1$ or $2$.
On entry, ${\mathbf{my}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{my}}\ge 1$.
On entry, ${\mathbf{orig}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{orig}}=-1$ or $1$.
On entry, ${\mathbf{vipopt}}=〈\mathit{\text{value}}〉$ and ${\mathbf{my}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{vipopt}}=0$, $1$ or ${\mathbf{my}}$.
${\mathbf{ifail}}=2$
On entry, ${\mathbf{ldb}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ldb}}\ge {\mathbf{ip}}$.
On entry, ${\mathbf{ldc}}=〈\mathit{\text{value}}〉$ and ${\mathbf{my}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ldc}}\ge {\mathbf{my}}$.
On entry, ${\mathbf{ldob}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: if ${\mathbf{orig}}=1$, ${\mathbf{ldob}}\ge {\mathbf{ip}}+1$.
On entry, ${\mathbf{ldp}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ldp}}\ge {\mathbf{ip}}$.
On entry, ${\mathbf{ldvip}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: if ${\mathbf{vipopt}}\ne 0$, ${\mathbf{ldvip}}\ge {\mathbf{ip}}$.
On entry, ${\mathbf{ldw}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ldw}}\ge {\mathbf{ip}}$.
On entry, ${\mathbf{ldycv}}=〈\mathit{\text{value}}〉$ and ${\mathbf{nfact}}=〈\mathit{\text{value}}〉$.
Constraint: if ${\mathbf{vipopt}}\ne 0$, ${\mathbf{ldycv}}\ge {\mathbf{nfact}}$.
On entry, ${\mathbf{maxfac}}=〈\mathit{\text{value}}〉$ and ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: $1\le {\mathbf{maxfac}}\le {\mathbf{ip}}$.
On entry, ${\mathbf{nfact}}=〈\mathit{\text{value}}〉$ and ${\mathbf{maxfac}}=〈\mathit{\text{value}}〉$.
Constraint: $1\le {\mathbf{nfact}}\le {\mathbf{maxfac}}$.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 3.9 in How to Use the NAG Library and its Documentation for further information.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See Section 3.8 in How to Use the NAG Library and its Documentation for further information.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See Section 3.7 in How to Use the NAG Library and its Documentation for further information.
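The argument constraints behind ${\mathbf{ifail}}=1$ and ${\mathbf{ifail}}=2$ can be checked before calling the routine. The sketch below mirrors the constraints as listed in Section 5 (including ${\mathbf{iscale}}=-1$, $1$ or $2$). The function name `check_arguments` is hypothetical, and the order in which g02lcf itself tests the constraints is an assumption.

```python
def check_arguments(ip, my, maxfac, nfact, ldp, ldc, ldw, ldb, orig, iscale,
                    ldob, vipopt, ldycv, ldvip):
    """Return 0 if all documented constraints hold, else the ifail code."""
    # Constraints reported with ifail = 1.
    if not ip > 1 or not my >= 1:
        return 1
    if orig not in (-1, 1):
        return 1
    if orig == 1 and iscale not in (-1, 1, 2):
        return 1
    if vipopt not in (0, 1, my):
        return 1
    # Constraints reported with ifail = 2.
    if not 1 <= maxfac <= ip or not 1 <= nfact <= maxfac:
        return 2
    if ldp < ip or ldw < ip or ldb < ip or ldc < my:
        return 2
    if ldob < (ip + 1 if orig == 1 else 1):
        return 2
    if vipopt != 0 and (ldycv < nfact or ldvip < ip):
        return 2
    return 0


# A consistent set of dimensions for 3 predictors, 2 responses and 2 factors.
args = dict(ip=3, my=2, maxfac=2, nfact=2, ldp=3, ldc=2, ldw=3, ldb=3,
            orig=1, iscale=1, ldob=4, vipopt=1, ldycv=2, ldvip=3)
status = check_arguments(**args)   # 0
```

A pre-check of this kind is mainly useful when the leading dimensions are computed at run time, since a mismatch is then reported before any array is passed to the library.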

## 7 Accuracy

The calculations are based on the singular value decomposition of ${P}^{\mathrm{T}}W$.
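The role of rcond in this decomposition can be sketched on the singular values alone. The sketch assumes that "treated as zero" means the corresponding reciprocal is dropped when the inverse of ${P}^{\mathrm{T}}W$ is formed, as in a truncated pseudo-inverse; the document does not state this explicitly. The function name `truncated_reciprocals` is hypothetical.

```python
def truncated_reciprocals(singular_values, rcond):
    """Reciprocals of singular values, with negligible ones mapped to zero."""
    # A negative rcond selects the documented default of 0.005.
    tol = 0.005 if rcond < 0.0 else rcond
    cutoff = tol * max(singular_values)
    # Singular values below rcond times the largest one are treated as zero.
    return [0.0 if s < cutoff else 1.0 / s for s in singular_values]


# With the default tolerance the cutoff is 0.005 * 10 = 0.05,
# so the smallest singular value is discarded.
inv = truncated_reciprocals([10.0, 1.0, 0.01], rcond=-1.0)   # [0.1, 1.0, 0.0]
```

Discarding small singular values trades a little bias for stability: a near-singular ${P}^{\mathrm{T}}W$ would otherwise inflate the parameter estimates.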

## 8 Parallelism and Performance

g02lcf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g02lcf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

## 9 Further Comments

g02lcf allocates internally $l\left(l+r+4\right)+\mathrm{max}\left(2l,r\right)$ elements of real storage.
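As a quick arithmetic check of the storage expression, with $l$ taken as nfact and $r$ as my, the count can be evaluated directly. The function name `real_workspace` is hypothetical.

```python
def real_workspace(nfact, my):
    """Real elements allocated internally: l(l + r + 4) + max(2l, r)."""
    return nfact * (nfact + my + 4) + max(2 * nfact, my)


# Two factors and one response: 2 * (2 + 1 + 4) + max(4, 1) = 18 elements.
n_small = real_workspace(2, 1)
# One factor and five responses: 1 * (1 + 5 + 4) + max(2, 5) = 15 elements.
n_wide = real_workspace(1, 5)
```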

## 10 Example

This example reads in details of a PLS model and calculates a set of parameter estimates along with their VIP statistics.

### 10.1 Program Text

Program Text (g02lcfe.f90)

### 10.2 Program Data

Program Data (g02lcfe.d)

### 10.3 Program Results

Program Results (g02lcfe.r)

© The Numerical Algorithms Group Ltd, Oxford, UK. 2017