NAG FL Interface
g02laf (pls_svd)

1 Purpose

g02laf fits an orthogonal scores partial least squares (PLS) regression by using singular value decomposition.

2 Specification

Fortran Interface
Subroutine g02laf ( n, mx, x, ldx, isx, ip, my, y, ldy, xbar, ybar, iscale, xstd, ystd, maxfac, xres, ldxres, yres, ldyres, w, ldw, p, ldp, t, ldt, c, ldc, u, ldu, xcv, ycv, ldycv, ifail)
Integer, Intent (In) :: n, mx, ldx, isx(mx), ip, my, ldy, iscale, maxfac, ldxres, ldyres, ldw, ldp, ldt, ldc, ldu, ldycv
Integer, Intent (Inout) :: ifail
Real (Kind=nag_wp), Intent (In) :: x(ldx,mx), y(ldy,my)
Real (Kind=nag_wp), Intent (Inout) :: xstd(ip), ystd(my), xres(ldxres,ip), yres(ldyres,my), w(ldw,maxfac), p(ldp,maxfac), t(ldt,maxfac), c(ldc,maxfac), u(ldu,maxfac), ycv(ldycv,my)
Real (Kind=nag_wp), Intent (Out) :: xbar(ip), ybar(my), xcv(maxfac)
C Header Interface
#include <nag.h>
void  g02laf_ (const Integer *n, const Integer *mx, const double x[], const Integer *ldx, const Integer isx[], const Integer *ip, const Integer *my, const double y[], const Integer *ldy, double xbar[], double ybar[], const Integer *iscale, double xstd[], double ystd[], const Integer *maxfac, double xres[], const Integer *ldxres, double yres[], const Integer *ldyres, double w[], const Integer *ldw, double p[], const Integer *ldp, double t[], const Integer *ldt, double c[], const Integer *ldc, double u[], const Integer *ldu, double xcv[], double ycv[], const Integer *ldycv, Integer *ifail)
The routine may be called by the names g02laf or nagf_correg_pls_svd.
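When calling through the C header interface, each two-dimensional array is passed as a flat double buffer together with its leading dimension, so the data must be laid out in Fortran (column-major) order. The following is a minimal sketch of that indexing; the helper names fl_offset and pack_column_major are hypothetical and are not part of the NAG Library.

```c
#include <stddef.h>

/* Hypothetical helper: offset of the Fortran element a(i,j) (1-based indices)
   in a flat column-major buffer with leading dimension ld. */
static size_t fl_offset(int i, int j, int ld)
{
    return (size_t)(j - 1) * (size_t)ld + (size_t)(i - 1);
}

/* Hypothetical helper: copy a row-major n x m C matrix into a column-major
   buffer with leading dimension ld (ld >= n), as expected for x and y. */
static void pack_column_major(const double *row_major, int n, int m,
                              double *col_major, int ld)
{
    for (int j = 1; j <= m; ++j) {
        for (int i = 1; i <= n; ++i) {
            col_major[fl_offset(i, j, ld)] =
                row_major[(size_t)(i - 1) * (size_t)m + (size_t)(j - 1)];
        }
    }
}
```

With this layout, x(i,j) in Section 5 corresponds to x[fl_offset(i, j, ldx)] on the C side.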

3 Description

Let $X_1$ be the mean-centred $n \times m$ data matrix $X$ of $n$ observations on $m$ predictor variables. Let $Y_1$ be the mean-centred $n \times r$ data matrix $Y$ of $n$ observations on $r$ response variables.
The first of the $k$ factors PLS methods extract from the data predicts both $X_1$ and $Y_1$ by regressing on $t_1$, a column vector of $n$ scores:
$$\hat{X}_1 = t_1 p_1^T, \qquad \hat{Y}_1 = t_1 c_1^T, \qquad \text{with } t_1^T t_1 = 1,$$
where the column vectors of $m$ x-loadings $p_1$ and $r$ y-loadings $c_1$ are calculated in the least squares sense:
$$p_1^T = t_1^T X_1, \qquad c_1^T = t_1^T Y_1.$$
The x-score vector $t_1 = X_1 w_1$ is the linear combination of predictor data $X_1$ that has maximum covariance with the y-scores $u_1 = Y_1 c_1$, where the x-weights vector $w_1$ is the normalised first left singular vector of $X_1^T Y_1$.
The method extracts subsequent PLS factors by repeating the above process with the residual matrices:
$$X_i = X_{i-1} - \hat{X}_{i-1}, \qquad Y_i = Y_{i-1} - \hat{Y}_{i-1}, \qquad i = 2, 3, \ldots, k,$$
and with orthogonal scores:
$$t_i^T t_j = 0, \qquad j = 1, 2, \ldots, i-1.$$
Optionally, in addition to being mean-centred, the data matrices $X_1$ and $Y_1$ may be scaled by standard deviations of the variables. If data are supplied mean-centred, the calculations are not affected within numerical accuracy.
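One way to see why the scores come out orthogonal is to follow the residual update through a single step. The derivation below uses only the definitions in this section, with subscripts denoting the factor index and $w_2$ standing for the x-weights vector of the second factor, so that $t_2 = X_2 w_2$. It is a sketch for the first two factors; the same argument extends to later factors by induction.

```latex
\begin{align*}
t_1^T X_2 &= t_1^T \left( X_1 - t_1 p_1^T \right) \\
          &= t_1^T X_1 - \left( t_1^T t_1 \right) p_1^T \\
          &= p_1^T - p_1^T = 0, \\
t_1^T t_2 &= t_1^T X_2 w_2 = 0 .
\end{align*}
```

The third line uses $p_1^T = t_1^T X_1$ and $t_1^T t_1 = 1$.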

4 References

None.

5 Arguments

1: n Integer Input
On entry: n, the number of observations.
Constraint: n>1.
2: mx Integer Input
On entry: the number of predictor variables.
Constraint: mx>1.
3: x(ldx,mx) Real (Kind=nag_wp) array Input
On entry: x(i,j) must contain the ith observation on the jth predictor variable, for i=1,2,…,n and j=1,2,…,mx.
4: ldx Integer Input
On entry: the first dimension of the array x as declared in the (sub)program from which g02laf is called.
Constraint: ldx≥n.
5: isx(mx) Integer array Input
On entry: indicates which predictor variables are to be included in the model.
isx(j)=1
The jth predictor variable (with variates in the jth column of X) is included in the model.
isx(j)=0
Otherwise.
Constraint: the sum of elements in isx must equal ip.
6: ip Integer Input
On entry: m, the number of predictor variables in the model.
Constraint: 1<ip≤mx.
7: my Integer Input
On entry: r, the number of response variables.
Constraint: my≥1.
8: y(ldy,my) Real (Kind=nag_wp) array Input
On entry: y(i,j) must contain the ith observation for the jth response variable, for i=1,2,…,n and j=1,2,…,my.
9: ldy Integer Input
On entry: the first dimension of the array y as declared in the (sub)program from which g02laf is called.
Constraint: ldy≥n.
10: xbar(ip) Real (Kind=nag_wp) array Output
On exit: mean values of predictor variables in the model.
11: ybar(my) Real (Kind=nag_wp) array Output
On exit: the mean value of each response variable.
12: iscale Integer Input
On entry: indicates how predictor variables are scaled.
iscale=1
Data are scaled by the standard deviation of variables.
iscale=2
Data are scaled by user-supplied scalings.
iscale=-1
No scaling.
Constraint: iscale=-1, 1 or 2.
13: xstd(ip) Real (Kind=nag_wp) array Input/Output
On entry: if iscale=2, xstd(j) must contain the user-supplied scaling for the jth predictor variable in the model, for j=1,2,…,ip. Otherwise xstd need not be set.
On exit: if iscale=1, standard deviations of predictor variables in the model. Otherwise xstd is not changed.
14: ystd(my) Real (Kind=nag_wp) array Input/Output
On entry: if iscale=2, ystd(j) must contain the user-supplied scaling for the jth response variable in the model, for j=1,2,…,my. Otherwise ystd need not be set.
On exit: if iscale=1, the standard deviation of each response variable. Otherwise ystd is not changed.
15: maxfac Integer Input
On entry: k, the number of latent variables to calculate.
Constraint: 1≤maxfac≤ip.
16: xres(ldxres,ip) Real (Kind=nag_wp) array Output
On exit: the predictor variables' residual matrix $X_k$.
17: ldxres Integer Input
On entry: the first dimension of the array xres as declared in the (sub)program from which g02laf is called.
Constraint: ldxres≥n.
18: yres(ldyres,my) Real (Kind=nag_wp) array Output
On exit: the residuals for each response variable, $Y_k$.
19: ldyres Integer Input
On entry: the first dimension of the array yres as declared in the (sub)program from which g02laf is called.
Constraint: ldyres≥n.
20: w(ldw,maxfac) Real (Kind=nag_wp) array Output
On exit: the jth column of W contains the x-weights $w_j$, for j=1,2,…,maxfac.
21: ldw Integer Input
On entry: the first dimension of the array w as declared in the (sub)program from which g02laf is called.
Constraint: ldw≥ip.
22: p(ldp,maxfac) Real (Kind=nag_wp) array Output
On exit: the jth column of P contains the x-loadings $p_j$, for j=1,2,…,maxfac.
23: ldp Integer Input
On entry: the first dimension of the array p as declared in the (sub)program from which g02laf is called.
Constraint: ldp≥ip.
24: t(ldt,maxfac) Real (Kind=nag_wp) array Output
On exit: the jth column of T contains the x-scores $t_j$, for j=1,2,…,maxfac.
25: ldt Integer Input
On entry: the first dimension of the array t as declared in the (sub)program from which g02laf is called.
Constraint: ldt≥n.
26: c(ldc,maxfac) Real (Kind=nag_wp) array Output
On exit: the jth column of C contains the y-loadings $c_j$, for j=1,2,…,maxfac.
27: ldc Integer Input
On entry: the first dimension of the array c as declared in the (sub)program from which g02laf is called.
Constraint: ldc≥my.
28: u(ldu,maxfac) Real (Kind=nag_wp) array Output
On exit: the jth column of U contains the y-scores $u_j$, for j=1,2,…,maxfac.
29: ldu Integer Input
On entry: the first dimension of the array u as declared in the (sub)program from which g02laf is called.
Constraint: ldu≥n.
30: xcv(maxfac) Real (Kind=nag_wp) array Output
On exit: xcv(j) contains the cumulative percentage of variance in the predictor variables explained by the first j factors, for j=1,2,…,maxfac.
31: ycv(ldycv,my) Real (Kind=nag_wp) array Output
On exit: ycv(i,j) is the cumulative percentage of variance of the jth response variable explained by the first i factors, for i=1,2,…,maxfac and j=1,2,…,my.
32: ldycv Integer Input
On entry: the first dimension of the array ycv as declared in the (sub)program from which g02laf is called.
Constraint: ldycv≥maxfac.
33: ifail Integer Input/Output
On entry: ifail must be set to 0, -1 or 1 to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of 0 causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of -1 means that an error message is printed while a value of 1 means that it is not.
If halting is not appropriate, the value -1 or 1 is recommended. If message printing is undesirable, then the value 1 is recommended. Otherwise, the value 0 is recommended. When the value -1 or 1 is used it is essential to test the value of ifail on exit.
On exit: ifail=0 unless the routine detects an error or a warning has been flagged (see Section 6).
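The relationships between isx, ip and the array dimensions listed above can be worked out before allocating storage. The sketch below derives ip from isx and computes the smallest element count for each array written by the routine, taking every leading dimension at its documented minimum. The names count_selected, pls_output_sizes and min_output_sizes are hypothetical, and plain int stands in for the NAG Integer type, which is an assumption about the build in use.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the NAG Library): returns the number of
   predictors flagged with isx[j] == 1, or -1 if any entry is not 0 or 1. */
static int count_selected(const int *isx, int mx)
{
    int ip = 0;
    for (int j = 0; j < mx; ++j) {
        if (isx[j] != 0 && isx[j] != 1)
            return -1;
        ip += isx[j];
    }
    return ip;
}

/* Hypothetical record of minimum element counts for the arrays the routine
   writes to, assuming each leading dimension equals its lower bound. */
typedef struct {
    size_t xbar, ybar, xstd, ystd;
    size_t xres, yres, w, p, t, c, u, xcv, ycv;
} pls_output_sizes;

static pls_output_sizes min_output_sizes(int n, int ip, int my, int maxfac)
{
    pls_output_sizes s;
    s.xbar = (size_t)ip;                  /* xbar(ip)                        */
    s.ybar = (size_t)my;                  /* ybar(my)                        */
    s.xstd = (size_t)ip;                  /* xstd(ip)                        */
    s.ystd = (size_t)my;                  /* ystd(my)                        */
    s.xres = (size_t)n * (size_t)ip;      /* xres(ldxres,ip), ldxres = n     */
    s.yres = (size_t)n * (size_t)my;      /* yres(ldyres,my), ldyres = n     */
    s.w = (size_t)ip * (size_t)maxfac;    /* w(ldw,maxfac),   ldw = ip       */
    s.p = (size_t)ip * (size_t)maxfac;    /* p(ldp,maxfac),   ldp = ip       */
    s.t = (size_t)n * (size_t)maxfac;     /* t(ldt,maxfac),   ldt = n        */
    s.c = (size_t)my * (size_t)maxfac;    /* c(ldc,maxfac),   ldc = my       */
    s.u = (size_t)n * (size_t)maxfac;     /* u(ldu,maxfac),   ldu = n        */
    s.xcv = (size_t)maxfac;               /* xcv(maxfac)                     */
    s.ycv = (size_t)maxfac * (size_t)my;  /* ycv(ldycv,my),   ldycv = maxfac */
    return s;
}
```

Passing the value returned by count_selected as ip keeps the two arguments consistent with the constraint that the sum of the elements of isx equals ip.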

6 Error Indicators and Warnings

If on entry ifail=0 or -1, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
ifail=1
On entry, iscale=value.
Constraint: iscale=-1, 1 or 2.
On entry, isx(value) is invalid.
Constraint: isx(j)=0 or 1, for all j.
On entry, mx=value.
Constraint: mx>1.
On entry, my=value.
Constraint: my≥1.
On entry, n=value.
Constraint: n>1.
ifail=2
On entry, ip=value and mx=value.
Constraint: 1<ip≤mx.
On entry, ldc=value and my=value.
Constraint: ldc≥my.
On entry, ldp=value and ip=value.
Constraint: ldp≥ip.
On entry, ldt=value and n=value.
Constraint: ldt≥n.
On entry, ldu=value and n=value.
Constraint: ldu≥n.
On entry, ldw=value and ip=value.
Constraint: ldw≥ip.
On entry, ldx=value and n=value.
Constraint: ldx≥n.
On entry, ldxres=value and n=value.
Constraint: ldxres≥n.
On entry, ldy=value and n=value.
Constraint: ldy≥n.
On entry, ldycv=value and maxfac=value.
Constraint: ldycv≥maxfac.
On entry, ldyres=value and n=value.
Constraint: ldyres≥n.
On entry, maxfac=value and ip=value.
Constraint: 1≤maxfac≤ip.
ifail=3
On entry, ip=value and sum(isx)=value.
Constraint: the sum of elements in isx must equal ip.
ifail=-99
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
ifail=-399
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
ifail=-999
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
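A caller that prefers to catch argument errors before the call can mirror the groupings above. The sketch below returns the ifail value that this section associates with each kind of violation, using the constraints stated in Section 5 (so iscale=2 is accepted). The names pls_args and g02laf_precheck are hypothetical, plain int stands in for the NAG Integer type, and the order in which the library itself performs these checks is not documented here, so the ordering is an assumption.

```c
/* Hypothetical bundle of the integer arguments of g02laf. */
typedef struct {
    int n, mx, ip, my, iscale, maxfac;
    int ldx, ldy, ldxres, ldyres, ldw, ldp, ldt, ldc, ldu, ldycv;
} pls_args;

/* Hypothetical pre-check: returns 0 if the documented constraints hold,
   otherwise the ifail group (1, 2 or 3) listed for the first violation found. */
static int g02laf_precheck(const pls_args *a, const int *isx)
{
    int sum = 0;

    /* ifail = 1: scalar range checks and isx entries */
    if (a->iscale != -1 && a->iscale != 1 && a->iscale != 2)
        return 1;
    if (a->mx <= 1 || a->my < 1 || a->n <= 1)
        return 1;
    for (int j = 0; j < a->mx; ++j) {
        if (isx[j] != 0 && isx[j] != 1)
            return 1;
        sum += isx[j];
    }

    /* ifail = 2: relationships between sizes and leading dimensions */
    if (a->ip <= 1 || a->ip > a->mx)
        return 2;
    if (a->maxfac < 1 || a->maxfac > a->ip)
        return 2;
    if (a->ldx < a->n || a->ldy < a->n || a->ldxres < a->n ||
        a->ldyres < a->n || a->ldt < a->n || a->ldu < a->n)
        return 2;
    if (a->ldw < a->ip || a->ldp < a->ip || a->ldc < a->my ||
        a->ldycv < a->maxfac)
        return 2;

    /* ifail = 3: isx must select exactly ip predictors */
    if (sum != a->ip)
        return 3;

    return 0;
}
```

Such a check does not replace testing ifail on exit, since the negative values listed above can only be reported by the routine itself.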

7 Accuracy

The computed singular value decomposition is nearly the exact singular value decomposition for a nearby matrix $(A+E)$, where
$$\|E\|_2 = O(\epsilon) \, \|A\|_2,$$
and $\epsilon$ is the machine precision.

8 Parallelism and Performance

g02laf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

g02laf allocates internally 2mr + A + max(3(A+B),5A) + r elements of real storage, where A=min(m,r) and B=max(m,r).

10 Example

This example reads in data from an experiment measuring the biological activity of a chemical compound and estimates a PLS model.

10.1 Program Text

Program Text (g02lafe.f90)

10.2 Program Data

Program Data (g02lafe.d)

10.3 Program Results

Program Results (g02lafe.r)