nag_pls_orth_scores_wold (g02lbc) (PDF version)
g02 Chapter Contents
g02 Chapter Introduction
NAG C Library Manual

NAG Library Function Document

nag_pls_orth_scores_wold (g02lbc)


1  Purpose

nag_pls_orth_scores_wold (g02lbc) fits an orthogonal scores partial least squares (PLS) regression by using Wold's iterative method.

2  Specification

#include <nag.h>
#include <nagg02.h>
void  nag_pls_orth_scores_wold (Nag_OrderType order, Integer n, Integer mx, const double x[], Integer pdx, const Integer isx[], Integer ip, Integer my, const double y[], Integer pdy, double xbar[], double ybar[], Nag_ScalePredictor iscale, double xstd[], double ystd[], Integer maxfac, Integer maxit, double tau, double xres[], Integer pdxres, double yres[], Integer pdyres, double w[], Integer pdw, double p[], Integer pdp, double t[], Integer pdt, double c[], Integer pdc, double u[], Integer pdu, double xcv[], double ycv[], Integer pdycv, NagError *fail)

3  Description

Let $X_1$ be the mean-centred $n$ by $m$ data matrix $X$ of $n$ observations on $m$ predictor variables. Let $Y_1$ be the mean-centred $n$ by $r$ data matrix $Y$ of $n$ observations on $r$ response variables.
The first of the $k$ factors PLS methods extract from the data predicts both $X_1$ and $Y_1$ by regressing on a $t_1$ column vector of $n$ scores:
$$\hat{X}_1 = t_1 p_1^T, \qquad \hat{Y}_1 = t_1 c_1^T, \qquad \text{with } t_1^T t_1 = 1,$$
where the column vectors of $m$ x-loadings $p_1$ and $r$ y-loadings $c_1$ are calculated in the least squares sense:
$$p_1^T = t_1^T X_1, \qquad c_1^T = t_1^T Y_1.$$
The x-score vector $t_1 = X_1 w_1$ is the linear combination of predictor data $X_1$ that has maximum covariance with the y-scores $u_1 = Y_1 c_1$, where the x-weights vector $w_1$ is the normalised first left singular vector of $X_1^T Y_1$.
The method extracts subsequent PLS factors by repeating the above process with the residual matrices:
$$X_i = X_{i-1} - \hat{X}_{i-1}, \qquad Y_i = Y_{i-1} - \hat{Y}_{i-1}, \qquad i = 2, 3, \ldots, k,$$
and with orthogonal scores:
$$t_i^T t_j = 0, \qquad j = 1, 2, \ldots, i-1.$$
Optionally, in addition to being mean-centred, the data matrices $X_1$ and $Y_1$ may be scaled by standard deviations of the variables. If data are supplied mean-centred, the calculations are not affected within numerical accuracy.

4  References

Wold H (1966) Estimation of principal components and related models by iterative least-squares In: Multivariate Analysis (ed P R Krishnaiah) 391–420 Academic Press NY

5  Arguments

1:     order  Nag_OrderType  Input
On entry: the order argument specifies the two-dimensional storage scheme being used, i.e., row-major ordering or column-major ordering. C language defined storage is specified by order=Nag_RowMajor. See Section 3.2.1.3 in the Essential Introduction for a more detailed explanation of the use of this argument.
Constraint: order=Nag_RowMajor or Nag_ColMajor.
2:     n  Integer  Input
On entry: n, the number of observations.
Constraint: n>1.
3:     mx  Integer  Input
On entry: the number of predictor variables.
Constraint: mx>1.
4:     x[dim]  const double  Input
Note: the dimension, dim, of the array x must be at least
  • max(1,pdx×mx) when order=Nag_ColMajor;
  • max(1,n×pdx) when order=Nag_RowMajor.
Where X(i,j) appears in this document, it refers to the array element
  • x[(j-1)×pdx+i-1] when order=Nag_ColMajor;
  • x[(i-1)×pdx+j-1] when order=Nag_RowMajor.
On entry: X(i,j) must contain the ith observation on the jth predictor variable, for i=1,2,…,n and j=1,2,…,mx.
5:     pdx  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array x.
Constraints:
  • if order=Nag_ColMajor, pdx≥n;
  • if order=Nag_RowMajor, pdx≥mx.
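The storage rules for x and pdx can be sketched in C as follows. The enum storage_order and both function names are hypothetical stand-ins used only for this illustration; real code would use Nag_OrderType and Integer from the NAG headers.

```c
#include <stddef.h>

/* Hypothetical stand-in for Nag_OrderType. */
typedef enum { ROW_MAJOR, COL_MAJOR } storage_order;

/* Zero-based array offset of X(i, j); i and j are one-based, as in the text. */
size_t x_offset(storage_order order, size_t i, size_t j, size_t pdx)
{
    return order == COL_MAJOR ? (j - 1) * pdx + (i - 1)
                              : (i - 1) * pdx + (j - 1);
}

/* Minimum length of x for an n by mx matrix stored with stride pdx. */
size_t x_min_dim(storage_order order, size_t n, size_t mx, size_t pdx)
{
    size_t len = order == COL_MAJOR ? pdx * mx : n * pdx;
    return len > 1 ? len : 1;
}
```

The same pattern applies to the other two-dimensional arrays in this document, each with its own stride argument.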
6:     isx[mx]  const Integer  Input
On entry: indicates which predictor variables are to be included in the model.
isx[j-1]=1
The jth predictor variable (with variates in the jth column of X) is included in the model.
isx[j-1]=0
Otherwise.
Constraint: the sum of elements in isx must equal ip.
7:     ip  Integer  Input
On entry: m, the number of predictor variables in the model.
Constraint: 1<ip≤mx.
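One way to obtain a value of ip that satisfies the constraint on isx is to count the selected variables before the call. The helper below is a minimal sketch; count_included is a hypothetical name, and long is used as a stand-in for the NAG Integer type.

```c
/* Returns the number of predictors flagged for inclusion,
 * or -1 if any element of isx is neither 0 nor 1. */
long count_included(const long isx[], long mx)
{
    long ip = 0;
    for (long j = 0; j < mx; j++) {
        if (isx[j] != 0 && isx[j] != 1) return -1;
        ip += isx[j];
    }
    return ip;
}
```

The returned count would still need to satisfy 1<ip≤mx before being passed as ip.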
8:     my  Integer  Input
On entry: r, the number of response variables.
Constraint: my≥1.
9:     y[dim]  const double  Input
Note: the dimension, dim, of the array y must be at least
  • max(1,pdy×my) when order=Nag_ColMajor;
  • max(1,n×pdy) when order=Nag_RowMajor.
Where Y(i,j) appears in this document, it refers to the array element
  • y[(j-1)×pdy+i-1] when order=Nag_ColMajor;
  • y[(i-1)×pdy+j-1] when order=Nag_RowMajor.
On entry: Y(i,j) must contain the ith observation for the jth response variable, for i=1,2,…,n and j=1,2,…,my.
10:   pdy  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array y.
Constraints:
  • if order=Nag_ColMajor, pdy≥n;
  • if order=Nag_RowMajor, pdy≥my.
11:   xbar[ip]  double  Output
On exit: mean values of predictor variables in the model.
12:   ybar[my]  double  Output
On exit: the mean value of each response variable.
13:   iscale  Nag_ScalePredictor  Input
On entry: indicates how predictor variables are scaled.
iscale=Nag_PredStdScale
Data are scaled by the standard deviation of variables.
iscale=Nag_PredUserScale
Data are scaled by user-supplied scalings.
iscale=Nag_PredNoScale
No scaling.
Constraint: iscale=Nag_PredNoScale, Nag_PredStdScale or Nag_PredUserScale.
14:   xstd[ip]  double  Input/Output
On entry: if iscale=Nag_PredUserScale, xstd[j-1] must contain the user-supplied scaling for the jth predictor variable in the model, for j=1,2,…,ip. Otherwise xstd need not be set.
On exit: if iscale=Nag_PredStdScale, standard deviations of predictor variables in the model. Otherwise xstd is not changed.
15:   ystd[my]  double  Input/Output
On entry: if iscale=Nag_PredUserScale, ystd[j-1] must contain the user-supplied scaling for the jth response variable in the model, for j=1,2,…,my. Otherwise ystd need not be set.
On exit: if iscale=Nag_PredStdScale, the standard deviation of each response variable. Otherwise ystd is not changed.
16:   maxfac  Integer  Input
On entry: k, the number of latent variables to calculate.
Constraint: 1≤maxfac≤ip.
17:   maxit  Integer  Input
On entry: if my=1, maxit is not referenced; otherwise the maximum number of iterations used to calculate the x-weights.
Suggested value: maxit=200.
Constraint: if my>1, maxit>1.
18:   tau  double  Input
On entry: if my=1, tau is not referenced; otherwise the iterative procedure used to calculate the x-weights will halt if the Euclidean distance between two subsequent estimates is less than or equal to tau.
Suggested value: tau=1.0e−4.
Constraint: if my>1, tau>0.0.
19:   xres[dim]  double  Output
Note: the dimension, dim, of the array xres must be at least
  • max(1,pdxres×ip) when order=Nag_ColMajor;
  • max(1,n×pdxres) when order=Nag_RowMajor.
The (i,j)th element of the matrix is stored in
  • xres[(j-1)×pdxres+i-1] when order=Nag_ColMajor;
  • xres[(i-1)×pdxres+j-1] when order=Nag_RowMajor.
On exit: the predictor variables' residual matrix $X_k$.
20:   pdxres  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array xres.
Constraints:
  • if order=Nag_ColMajor, pdxres≥n;
  • if order=Nag_RowMajor, pdxres≥ip.
21:   yres[dim]  double  Output
Note: the dimension, dim, of the array yres must be at least
  • max(1,pdyres×my) when order=Nag_ColMajor;
  • max(1,n×pdyres) when order=Nag_RowMajor.
The (i,j)th element of the matrix is stored in
  • yres[(j-1)×pdyres+i-1] when order=Nag_ColMajor;
  • yres[(i-1)×pdyres+j-1] when order=Nag_RowMajor.
On exit: the residuals for each response variable, $Y_k$.
22:   pdyres  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array yres.
Constraints:
  • if order=Nag_ColMajor, pdyres≥n;
  • if order=Nag_RowMajor, pdyres≥my.
23:   w[dim]  double  Output
Note: the dimension, dim, of the array w must be at least
  • max(1,pdw×maxfac) when order=Nag_ColMajor;
  • max(1,ip×pdw) when order=Nag_RowMajor.
The (i,j)th element of the matrix W is stored in
  • w[(j-1)×pdw+i-1] when order=Nag_ColMajor;
  • w[(i-1)×pdw+j-1] when order=Nag_RowMajor.
On exit: the jth column of W contains the x-weights $w_j$, for j=1,2,…,maxfac.
24:   pdw  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array w.
Constraints:
  • if order=Nag_ColMajor, pdw≥ip;
  • if order=Nag_RowMajor, pdw≥maxfac.
25:   p[dim]  double  Output
Note: the dimension, dim, of the array p must be at least
  • max(1,pdp×maxfac) when order=Nag_ColMajor;
  • max(1,ip×pdp) when order=Nag_RowMajor.
The (i,j)th element of the matrix P is stored in
  • p[(j-1)×pdp+i-1] when order=Nag_ColMajor;
  • p[(i-1)×pdp+j-1] when order=Nag_RowMajor.
On exit: the jth column of P contains the x-loadings $p_j$, for j=1,2,…,maxfac.
26:   pdp  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array p.
Constraints:
  • if order=Nag_ColMajor, pdp≥ip;
  • if order=Nag_RowMajor, pdp≥maxfac.
27:   t[dim]  double  Output
Note: the dimension, dim, of the array t must be at least
  • max(1,pdt×maxfac) when order=Nag_ColMajor;
  • max(1,n×pdt) when order=Nag_RowMajor.
The (i,j)th element of the matrix T is stored in
  • t[(j-1)×pdt+i-1] when order=Nag_ColMajor;
  • t[(i-1)×pdt+j-1] when order=Nag_RowMajor.
On exit: the jth column of T contains the x-scores $t_j$, for j=1,2,…,maxfac.
28:   pdt  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array t.
Constraints:
  • if order=Nag_ColMajor, pdt≥n;
  • if order=Nag_RowMajor, pdt≥maxfac.
29:   c[dim]  double  Output
Note: the dimension, dim, of the array c must be at least
  • max(1,pdc×maxfac) when order=Nag_ColMajor;
  • max(1,my×pdc) when order=Nag_RowMajor.
The (i,j)th element of the matrix C is stored in
  • c[(j-1)×pdc+i-1] when order=Nag_ColMajor;
  • c[(i-1)×pdc+j-1] when order=Nag_RowMajor.
On exit: the jth column of C contains the y-loadings $c_j$, for j=1,2,…,maxfac.
30:   pdc  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array c.
Constraints:
  • if order=Nag_ColMajor, pdc≥my;
  • if order=Nag_RowMajor, pdc≥maxfac.
31:   u[dim]  double  Output
Note: the dimension, dim, of the array u must be at least
  • max(1,pdu×maxfac) when order=Nag_ColMajor;
  • max(1,n×pdu) when order=Nag_RowMajor.
The (i,j)th element of the matrix U is stored in
  • u[(j-1)×pdu+i-1] when order=Nag_ColMajor;
  • u[(i-1)×pdu+j-1] when order=Nag_RowMajor.
On exit: the jth column of U contains the y-scores $u_j$, for j=1,2,…,maxfac.
32:   pdu  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array u.
Constraints:
  • if order=Nag_ColMajor, pdu≥n;
  • if order=Nag_RowMajor, pdu≥maxfac.
33:   xcv[maxfac]  double  Output
On exit: xcv[j-1] contains the cumulative percentage of variance in the predictor variables explained by the first j factors, for j=1,2,…,maxfac.
34:   ycv[dim]  double  Output
Note: the dimension, dim, of the array ycv must be at least
  • max(1,pdycv×my) when order=Nag_ColMajor;
  • max(1,maxfac×pdycv) when order=Nag_RowMajor.
Where YCV(i,j) appears in this document, it refers to the array element
  • ycv[(j-1)×pdycv+i-1] when order=Nag_ColMajor;
  • ycv[(i-1)×pdycv+j-1] when order=Nag_RowMajor.
On exit: YCV(i,j) is the cumulative percentage of variance of the jth response variable explained by the first i factors, for i=1,2,…,maxfac and j=1,2,…,my.
35:   pdycv  Integer  Input
On entry: the stride separating row or column elements (depending on the value of order) in the array ycv.
Constraints:
  • if order=Nag_ColMajor, pdycv≥maxfac;
  • if order=Nag_RowMajor, pdycv≥my.
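The following sketch shows one plausible way such cumulative percentages could be formed from the y-loadings, and how they would be laid out in row-major order. It rests on an assumption not stated in this document: because the scores are orthonormal, the sum of squares of response j explained by factor l is taken to be the square of that response's loading on factor l. The function name cumulative_pct_y and the argument total_ss (the sum of squares of each centred response column) are hypothetical.

```c
/* c: row-major my x maxfac loadings with stride pdc.
 * total_ss[j]: sum of squares of the jth centred response column.
 * ycv: row-major maxfac x my output with stride pdycv. */
void cumulative_pct_y(int maxfac, int my, const double c[], int pdc,
                      const double total_ss[], double ycv[], int pdycv)
{
    for (int j = 0; j < my; j++) {
        double acc = 0.0;  /* explained sum of squares so far */
        for (int f = 0; f < maxfac; f++) {
            double cjf = c[j * pdc + f];
            acc += cjf * cjf;
            ycv[f * pdycv + j] = 100.0 * acc / total_ss[j];
        }
    }
}
```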
36:   fail  NagError *  Input/Output
The NAG error argument (see Section 3.6 in the Essential Introduction).

6  Error Indicators and Warnings

NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, argument value had an illegal value.
NE_INT
On entry, mx=value.
Constraint: mx>1.
On entry, my=value.
Constraint: my≥1.
On entry, n=value.
Constraint: n>1.
On entry, pdc=value.
Constraint: pdc>0.
On entry, pdp=value.
Constraint: pdp>0.
On entry, pdt=value.
Constraint: pdt>0.
On entry, pdu=value.
Constraint: pdu>0.
On entry, pdw=value.
Constraint: pdw>0.
On entry, pdx=value.
Constraint: pdx>0.
On entry, pdxres=value.
Constraint: pdxres>0.
On entry, pdy=value.
Constraint: pdy>0.
On entry, pdycv=value.
Constraint: pdycv>0.
On entry, pdyres=value.
Constraint: pdyres>0.
NE_INT_2
On entry, ip=value and mx=value.
Constraint: 1<ip≤mx.
On entry, maxfac=value and ip=value.
Constraint: 1≤maxfac≤ip.
On entry, my=value and maxit=value.
Constraint: if my>1, maxit>1.
On entry, pdc=value and maxfac=value.
Constraint: pdc≥maxfac.
On entry, pdc=value and my=value.
Constraint: pdc≥my.
On entry, pdp=value and ip=value.
Constraint: pdp≥ip.
On entry, pdp=value and maxfac=value.
Constraint: pdp≥maxfac.
On entry, pdt=value and maxfac=value.
Constraint: pdt≥maxfac.
On entry, pdt=value and n=value.
Constraint: pdt≥n.
On entry, pdu=value and maxfac=value.
Constraint: pdu≥maxfac.
On entry, pdu=value and n=value.
Constraint: pdu≥n.
On entry, pdw=value and ip=value.
Constraint: pdw≥ip.
On entry, pdw=value and maxfac=value.
Constraint: pdw≥maxfac.
On entry, pdx=value and mx=value.
Constraint: pdx≥mx.
On entry, pdx=value and n=value.
Constraint: pdx≥n.
On entry, pdxres=value and ip=value.
Constraint: pdxres≥ip.
On entry, pdxres=value and n=value.
Constraint: pdxres≥n.
On entry, pdy=value and my=value.
Constraint: pdy≥my.
On entry, pdy=value and n=value.
Constraint: pdy≥n.
On entry, pdycv=value and maxfac=value.
Constraint: pdycv≥maxfac.
On entry, pdycv=value and my=value.
Constraint: pdycv≥my.
On entry, pdyres=value and my=value.
Constraint: pdyres≥my.
On entry, pdyres=value and n=value.
Constraint: pdyres≥n.
NE_INT_ARG_CONS
On entry, ip is not equal to the sum of isx elements: ip=value, sumisx=value.
NE_INT_ARRAY_VAL_1_OR_2
On entry, element value of isx is invalid.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_REAL
On entry, tau=value.
Constraint: if my>1, tau>0.0.

7  Accuracy

In general, the iterative method used in the calculations is less accurate (but faster) than the singular value decomposition approach adopted by nag_pls_orth_scores_svd (g02lac).

8  Further Comments

nag_pls_orth_scores_wold (g02lbc) allocates internally (n+r) elements of double storage.

9  Example

This example reads in data from an experiment to measure the biological activity in a chemical compound, and a PLS model is estimated.

9.1  Program Text

Program Text (g02lbce.c)

9.2  Program Data

Program Data (g02lbce.d)

9.3  Program Results

Program Results (g02lbce.r)



© The Numerical Algorithms Group Ltd, Oxford, UK. 2012