hide long namesshow long names
hide short namesshow short names
Integer type:  int32  int64  nag_int  show int32  show int32  show int64  show int64  show nag_int  show nag_int

PDF version (NAG web site, 64-bit version, 64-bit version)
Chapter Contents
Chapter Introduction
NAG Toolbox

NAG Toolbox: nag_correg_mixeff_reml (g02ja)

Purpose

nag_correg_mixeff_reml (g02ja) fits a linear mixed effects regression model using restricted maximum likelihood (REML).

Syntax

[nff, nrf, df, reml, b, se, gamma, warn, ifail] = g02ja(nvpr, levels, yvid, fvid, rvid, svid, cwid, vpr, dat, fint, rint, lb, gamma, 'n', n, 'ncol', ncol, 'nfv', nfv, 'nrv', nrv, 'maxit', maxit, 'tol', tol)
[nff, nrf, df, reml, b, se, gamma, warn, ifail] = nag_correg_mixeff_reml(nvpr, levels, yvid, fvid, rvid, svid, cwid, vpr, dat, fint, rint, lb, gamma, 'n', n, 'ncol', ncol, 'nfv', nfv, 'nrv', nrv, 'maxit', maxit, 'tol', tol)
Note: the interface to this routine has changed since earlier releases of the toolbox:
Mark 22: n has been made optional
Mark 23: maxit, tol optional
.

Description

nag_correg_mixeff_reml (g02ja) fits a model of the form:
y = Xβ + Zν + ε
y=Xβ+Zν+ε
where and
Both ν ν  and ε ε  are assumed to have a Gaussian distribution with expectation zero and
Var
[ ν ε ]
=
[ G 0 0 R ]
Var[ ν ε ] = [ G 0 0 R ]
where R = σR2 I R= σ R 2 I , I I  is the n × n n×n  identity matrix and G G  is a diagonal matrix. It is assumed that the random variables, Z Z , can be subdivided into g q g q  groups with each group being identically distributed with expectations zero and variance σi2 σi2 . The diagonal elements of matrix G G  therefore take one of the values {σi2 : i = 1,2,,g} { σi2 : i=1,2,,g } , depending on which group the associated random variable belongs to.
The model therefore contains three sets of unknowns, the fixed effects, β β , the random effects ν ν  and a vector of g + 1 g+1  variance components, γ γ , where γ = {σ12,σ22,,σg12,σg2,σR2} γ = {σ12,σ22,, σ g-1 2 ,σg2,σR2} . Rather than working directly with γ γ , nag_correg_mixeff_reml (g02ja) uses an iterative process to estimate γ* = { σ12 / σR2 , σ22 / σR2 ,, σg12 / σR2 , σg2 / σR2 ,1} γ* = { σ12 / σR2 , σ22 / σR2 ,, σg-12 / σR2 , σg2 / σR2 ,1} . Due to the iterative nature of the estimation a set of initial values, γ0 γ0 , for γ* γ*  is required. nag_correg_mixeff_reml (g02ja) allows these initial values either to be supplied by you or calculated from the data using the minimum variance quadratic unbiased estimators (MIVQUE0) suggested by Rao (1972).
nag_correg_mixeff_reml (g02ja) fits the model using a quasi-Newton algorithm to maximize the restricted log-likelihood function:
2 lR = log(|V|) + (np) log(rV1r) + log|XV1X| + (np) (1 + log(2π / (np)))
-2 l R = log( |V| ) + ( n-p ) log( r V-1 r ) + log| X V-1 X | + ( n-p ) ( 1+ log( 2 π / ( n-p ) ) )
where
V = ZG Z + R,   r = yXb   and   b = (XV1X)1 X V1 y .
V = ZG Z + R,   r=y-Xb   and   b = ( X V-1 X ) -1 X V-1 y .
Once the final estimates for γ* γ *  have been obtained, the value of σR2 σR2  is given by:
σR2 = (rV1r) / (np) .
σR2 = ( r V-1 r ) / ( n - p ) .
Case weights, Wc Wc , can be incorporated into the model by replacing XX XX  and ZZ ZZ  with XWcX XWcX  and ZWcZ ZWcZ  respectively, for a diagonal weight matrix Wc Wc .
The log-likelihood, lRlR, is calculated using the sweep algorithm detailed in Wolfinger et al. (1994).

References

Goodnight J H (1979) A tutorial on the SWEEP operator The American Statistician 33(3) 149–158
Harville D A (1977) Maximum likelihood approaches to variance component estimation and to related problems JASA 72 320–340
Rao C R (1972) Estimation of variance and covariance components in a linear model J. Am. Stat. Assoc. 67 112–115
Stroup W W (1989) Predictable functions and prediction space in the mixed model procedure Applications of Mixed Models in Agriculture and Related Disciplines Southern Cooperative Series Bulletin No. 343 39–48
Wolfinger R, Tobias R and Sall J (1994) Computing Gaussian likelihoods and their derivatives for general linear mixed models SIAM Sci. Statist. Comput. 15 1294–1310

Parameters

Compulsory Input Parameters

1:     nvpr – int64int32nag_int scalar
If rint = 1rint=1 and svid0svid0, nvpr is the number of variance components being estimated2estimated-2, (g1g-1), else nvpr = gnvpr=g.
If nrv = 0nrv=0, nvprnvpr is not referenced.
Constraint: if nrv0nrv0, 1nvprnrv1nvprnrv.
2:     levels(ncol) – int64int32nag_int array
ncol, the dimension of the array, must satisfy the constraint ncol1ncol1.
levels(i)levelsi contains the number of levels associated with the iith variable of the data matrix dat. If this variable is continuous or binary (i.e., only takes the values zero or one) then levels(i)levelsi should be 11; if the variable is discrete then levels(i)levelsi is the number of levels associated with it and DAT(j,i)DATji is assumed to take the values 11 to levels(i)levelsi, for j = 1,2,,nj=1,2,,n.
Constraint: levels(i)1levelsi1, for i = 1,2,,ncoli=1,2,,ncol.
3:     yvid – int64int32nag_int scalar
The column of dat holding the dependent, yy, variable.
Constraint: 1yvidncol1yvidncol.
4:     fvid(nfv) – int64int32nag_int array
nfv, the dimension of the array, must satisfy the constraint 0nfv < ncol0nfv<ncol.
The columns of the data matrix dat holding the fixed independent variables with fvid(i) fvidi  holding the column number corresponding to the i i th fixed variable.
Constraint: 1fvid(i)ncol1fvidincol, for i = 1,2,,nfvi=1,2,,nfv.
5:     rvid(nrv) – int64int32nag_int array
nrv, the dimension of the array, must satisfy the constraint
  • 0nrv < ncol0nrv<ncol
  • nrv + rint > 0nrv+rint>0
  • .
    The columns of the data matrix DATDAT holding the random independent variables with rvid(i) rvidi  holding the column number corresponding to the i i th random variable.
    Constraint: 1rvid(i)ncol1rvidincol, for i = 1,2,,nrvi=1,2,,nrv.
    6:     svid – int64int32nag_int scalar
    The column of dat holding the subject variable.
    If svid = 0svid=0, no subject variable is used.
    Specifying a subject variable is equivalent to specifying the interaction between that variable and all of the random-effects. Letting the notation Z1 × ZS Z1 × ZS  denote the interaction between variables Z1 Z1  and ZS ZS , fitting a model with rint = 0 rint = 0 , random-effects Z1 + Z2 Z1 + Z2  and subject variable ZS ZS  is equivalent to fitting a model with random-effects Z1 × ZS + Z2 × ZS Z1 × ZS + Z2 × ZS  and no subject variable. If rint = 1 rint = 1  the model is equivalent to fitting ZS + Z1 × ZS + Z2 × ZS ZS + Z1 × ZS + Z2 × ZS  and no subject variable.
    Constraint: 0svidncol0svidncol.
    7:     cwid – int64int32nag_int scalar
    The column of dat holding the case weights.
    If cwid = 0cwid=0, no weights are used.
    Constraint: 0cwidncol0cwidncol.
    8:     vpr(nrv) – int64int32nag_int array
    nrv, the dimension of the array, must satisfy the constraint
  • 0nrv < ncol0nrv<ncol
  • nrv + rint > 0nrv+rint>0
  • .
    vpr(i) vpri  holds a flag indicating the variance of the i i th random variable. The variance of the i i th random variable is σj2 σ j 2 , where j = vpr(i) + 1 j = vpri + 1  if rint = 1rint=1 and svid0svid0 and j = vpr(i) j = vpri  otherwise. Random variables with the same value of jj are assumed to be taken from the same distribution.
    Constraint: 1vpr(i)nvpr1vprinvpr, for i = 1,2,,nrvi=1,2,,nrv.
    9:     dat(lddat,ncol) – double array
    lddat, the first dimension of the array, must satisfy the constraint lddatnlddatn.
    Array containing all of the data. For the iith observation:
    • DAT(i,yvid)DATiyvid holds the dependent variable, yy;
    • if cwid0cwid0, DAT(i,cwid)DATicwid holds the case weights;
    • if svid0svid0, DAT(i,svid)DATisvid holds the subject variable.
    The remaining columns hold the values of the independent variables.
    Constraints:
    • if cwid0cwid0, DAT(i,cwid)0.0DATicwid0.0;
    • if levels(j)1levelsj1, 1DAT(i,j)levels(j)1DATijlevelsj.
    10:   fint – int64int32nag_int scalar
    Flag indicating whether a fixed intercept is included (fint = 1fint=1).
    Constraint: fint = 0fint=0 or 11.
    11:   rint – int64int32nag_int scalar
    Flag indicating whether a random intercept is included (rint = 1rint=1).
    If svid = 0svid=0, rint is not referenced.
    Constraint: rint = 0rint=0 or 11.
    12:   lb – int64int32nag_int scalar
    The size of the array b.
    Constraint: lb fint + i = 1nfv max (levels(fvid(i))1,1) + LS × (rint + i = 1nrvlevels(rvid(i))) lb fint + i=1 nfv max(levelsfvidi-1,1) + LS × ( rint + i=1 nrv levelsrvidi )  where LS = levels(svid) LS = levelssvid  if svid0svid0 and 11 otherwise.
    13:   gamma(nvpr + 2nvpr+2) – double array
    Holds the initial values of the variance components, γ0 γ0 , with gamma(i)gammai the initial value for σi2 / σR2σi2/σR2, for i = 1,2,,gi=1,2,,g. If rint = 1rint=1 and svid0svid0, g = nvpr + 1g=nvpr+1, else g = nvprg=nvpr.
    If gamma(1) = 1.0gamma1=-1.0, the remaining elements of gamma are ignored and the initial values for the variance components are estimated from the data using MIVQUE0.
    Constraint: gamma(1) = 1.0 ​ or ​ gamma(i)0.0gamma1=-1.0 ​ or ​ gammai0.0, for i = 1,2,,gi=1,2,,g.

    Optional Input Parameters

    1:     n – int64int32nag_int scalar
    Default: The first dimension of the array dat.
    nn, the number of observations.
    Constraint: n1n1.
    2:     ncol – int64int32nag_int scalar
    Default: The dimension of the array levels and the second dimension of the array dat. (An error is raised if these dimensions are not equal.)
    The number of columns in the data matrix, dat.
    Constraint: ncol1ncol1.
    3:     nfv – int64int32nag_int scalar
    Default: The dimension of the array fvid.
    The number of independent variables in the model which are to be treated as being fixed.
    Constraint: 0nfv < ncol0nfv<ncol.
    4:     nrv – int64int32nag_int scalar
    Default: The dimension of the arrays rvid, vpr. (An error is raised if these dimensions are not equal.)
    The number of independent variables in the model which are to be treated as being random.
    Constraints:
    5:     maxit – int64int32nag_int scalar
    The maximum number of iterations.
    If maxit < 0 maxit < 0 , the default value of 100 100  is used.
    If maxit = 0maxit=0, the parameter estimates (β,ν) (β,ν)  and corresponding standard errors are calculated based on the value of γ0 γ0  supplied in gamma.
    Default: -1-1
    6:     tol – double scalar
    The tolerance used to assess convergence.
    If tol0.0tol0.0, the default value of ε0.7ε0.7 is used, where εε is the machine precision.
    Default: 00

    Input Parameters Omitted from the MATLAB Interface

    lddat

    Output Parameters

    1:     nff – int64int32nag_int scalar
    The number of fixed effects estimated (i.e., the number of columns, pp, in the design matrix XX).
    2:     nrf – int64int32nag_int scalar
    The number of random effects estimated (i.e., the number of columns, qq, in the design matrix ZZ).
    3:     df – int64int32nag_int scalar
    The degrees of freedom.
    4:     reml – double scalar
    2 lR (γ̂) - 2 lR (γ^) where lR lR is the log of the restricted maximum likelihood calculated at γ̂ γ^ , the estimated variance components returned in gamma.
    5:     b(lb) – double array
    The parameter estimates, (β,ν)(β,ν), with the first nff elements of b containing the fixed effect parameter estimates, ββ and the next nrf elements of b containing the random effect parameter estimates, ν ν.
    Fixed effects
    If fint = 1fint=1, b(1)b1 contains the estimate of the fixed intercept. Let Li Li denote the number of levels associated with the iith fixed variable, that is Li = levels(fvid(i)) Li = levels fvidi . Define
    • if fint = 1fint=1, F1 = 2 F1 = 2 else if fint = 0fint=0, F1 = 1 F1=1 ;
    • Fi + 1 = Fi + max (Li1,1) F i+1 = Fi + max(Li-1,1) , i1 i1 .
    Then for i = 1,2,,nfvi=1,2,,nfv:
    • if Li > 1 Li > 1 , b( Fi + j2 ) b Fi+j-2 contains the parameter estimate for the jjth level of the iith fixed variable, for j = 2,3,,Lij=2,3,,Li;
    • if Li 1 Li 1 , b(Fi) bFi contains the parameter estimate for the iith fixed variable.
    Random effects
    Redefining Li Li to denote the number of levels associated with the iith random variable, that is Li = levels(rvid(i)) Li = levels rvidi . Define
    • if rint = 1rint=1, R1 = 2 R1 = 2 else if rint = 0rint=0, R1 = 1 R1=1 ;
      Ri + 1 = Ri + Li R i+1 = Ri + Li , i1 i1 .
    Then for i = 1 , 2 , , nrv i = 1 , 2 , , nrv :
    • if svid = 0svid=0,
      • if Li > 1 Li > 1 , b( nff + Ri + j1 ) b nff + Ri +j-1 contains the parameter estimate for the jjth level of the iith random variable, for j = 1,2,,Lij=1,2,,Li;
      • if Li 1 Li 1 , b( nff + Ri ) b nff + Ri contains the parameter estimate for the iith random variable;
    • if svid 0 svid 0 ,
      • let LS LS denote the number of levels associated with the subject variable, that is LS = levels(svid) LS = levels svid ;
      • if Li > 1 Li > 1 , b( nff + (s1) LS + Ri + j 1 ) b nff + ( s-1 ) LS + Ri + j - 1 contains the parameter estimate for the interaction between the ssth level of the subject variable and the jjth level of the iith random variable, for s = 1,2,,LSs=1,2,,LS and j = 1,2,,Lij=1,2,,Li;
      • if Li 1 Li 1 , b( nff + (s1) LS + Ri ) b nff + ( s-1 ) LS + Ri contains the parameter estimate for the interaction between the ssth level of the subject variable and the iith random variable, for s = 1,2,,LSs=1,2,,LS;
      • if rint = 1rint=1, b (nff + 1) b ( nff+1 ) contains the estimate of the random intercept.
    6:     se(lb) – double array
    The standard errors of the parameter estimates given in b.
    7:     gamma(nvpr + 2nvpr+2) – double array
    gamma(i)gammai, for i = 1,2,,gi=1,2,,g, holds the final estimate of σi2σi2 and gamma(g + 1)gammag+1 holds the final estimate for σR2σR2.
    8:     warn – int64int32nag_int scalar
    Is set to 1 1 if a variance component was estimated to be a negative value during the fitting process. Otherwise warn is set to 0 0 .
    If warn = 1warn=1, the negative estimate is set to zero and the estimation process allowed to continue.
    9:     ifail – int64int32nag_int scalar
    ifail = 0ifail=0 unless the function detects an error (see [Error Indicators and Warnings]).

    Error Indicators and Warnings

    Errors or warnings detected by the function:
      ifail = 1ifail=1
    On entry,n < 2n<2,
    orncol < 1ncol<1,
    orlddat < nlddat<n,
    oryvid < 1yvid<1 or yvid > ncolyvid>ncol,
    orcwid < 0cwid<0 or cwid > ncolcwid>ncol,
    ornfv < 0nfv<0 or nfvncolnfvncol,
    orfint0fint0 and fint1fint1,
    or nrv < 0nrv<0 or nrv > ncolnrv>ncol or nrv + rint0nrv+rint0,
    ornvpr0nvpr0 or nvpr > nrvnvpr>nrv,
    orrint0rint0 and rint1rint1,
    orsvid < 0svid<0 or svid > ncolsvid>ncol,
    orlb is too small.
      ifail = 2ifail=2
    On entry,levels(i) < 1levelsi<1, for at least one ii,
    orfvid(i) < 1fvidi<1, or fvid(i) > ncolfvidi>ncol, for at least one ii,
    orrvid(i) < 1rvidi<1, or rvid(i) > ncolrvidi>ncol, for at least one ii,
    orvpr(i) < 1vpri<1 or vpr(i) > nvprvpri>nvpr, for at least one ii,
    orat least one discrete variable in array dat has a value less than 1.01.0 or greater than that specified in levels,
    orgamma(i) < 0.0gammai<0.0, for at least one ii, and gamma(1)1.0gamma1-1.0.
      ifail = 3ifail=3
    Degrees of freedom < 1<1. The number of parameters exceed the effective number of observations.
      ifail = 4ifail=4
    The function failed to converge to the specified tolerance in maxit iterations. See Section [Further Comments] for advice.

    Accuracy

    The accuracy of the results can be adjusted through the use of the tol parameter.

    Further Comments

    Wherever possible any block structure present in the design matrix ZZ should be modelled through a subject variable, specified via svid, rather than being explicitly entered into dat.
    nag_correg_mixeff_reml (g02ja) uses an iterative process to fit the specified model and for some problems this process may fail to converge (see ifail = 4ifail=4). If the function fails to converge then the maximum number of iterations (see maxit) or tolerance (see tol) may require increasing; try a different starting estimate in gamma. Alternatively, the model can be fit using maximum likelihood (see nag_correg_mixeff_ml (g02jb)) or using the noniterative MIVQUE0.
    To fit the model just using MIVQUE0, the first element of gamma should be set to 1.0-1.0 and maxit should be set to zero.
    Although the quasi-Newton algorithm used in nag_correg_mixeff_reml (g02ja) tends to require more iterations before converging compared to the Newton–Raphson algorithm recommended by Wolfinger et al. (1994), it does not require the second derivatives of the likelihood function to be calculated and consequentially takes significantly less time per iteration.

    Example

    function nag_correg_mixeff_reml_example
    nvpr = int64(1);
    levels = [int64(1);4;3;2;3];
    yvid = int64(1);
    fvid = [int64(3);4;5];
    rvid = [int64(3)];
    svid = int64(2);
    cwid = int64(0);
    vpr = [int64(1)];
    dat = [56, 1, 1, 1, 1;
         50, 1, 2, 1, 1;
         39, 1, 3, 1, 1;
         30, 2, 1, 1, 1;
         36, 2, 2, 1, 1;
         33, 2, 3, 1, 1;
         32, 3, 1, 1, 1;
         31, 3, 2, 1, 1;
         15, 3, 3, 1, 1;
         30, 4, 1, 1, 1;
         35, 4, 2, 1, 1;
         17, 4, 3, 1, 1;
         41, 1, 1, 2, 1;
         36, 1, 2, 2, 2;
         35, 1, 3, 2, 3;
         25, 2, 1, 2, 1;
         28, 2, 2, 2, 2;
         30, 2, 3, 2, 3;
         24, 3, 1, 2, 1;
         27, 3, 2, 2, 2;
         19, 3, 3, 2, 3;
         25, 4, 1, 2, 1;
         30, 4, 2, 2, 2;
         18, 4, 3, 2, 3];
    fint = int64(1);
    rint = int64(1);
    lb = int64(25);
    gamma = [1; 1; 0];
    [nff, nrf, df, reml, b, se, gammaOut, warn, ifail] = ...
         nag_correg_mixeff_reml(nvpr, levels, yvid, fvid, rvid, svid, cwid, vpr, dat, fint, rint, lb, gamma)
    
     
    
    nff =
    
                        6
    
    
    nrf =
    
                       16
    
    
    df =
    
                       16
    
    
    reml =
    
      119.7618
    
    
    b =
    
       37.0000
        1.0000
      -11.0000
       -8.2500
        0.5000
        7.7500
       10.7631
        3.7276
       -1.4476
        0.3733
       -0.5269
       -3.7171
       -1.2253
        4.8125
       -5.6450
        0.5903
        0.3987
       -2.3806
       -4.5912
       -0.6009
        2.2742
       -2.8052
             0
             0
             0
    
    
    se =
    
        4.6674
        3.5173
        3.5173
        2.1635
        3.0596
        3.0596
        4.4865
        3.0331
        3.0331
        3.0331
        4.4865
        3.0331
        3.0331
        3.0331
        4.4865
        3.0331
        3.0331
        3.0331
        4.4865
        3.0331
        3.0331
        3.0331
             0
             0
             0
    
    
    gammaOut =
    
       62.3958
       15.3819
        9.3611
    
    
    warn =
    
                        0
    
    
    ifail =
    
                        0
    
    
    
    function g02ja_example
    nvpr = int64(1);
    levels = [int64(1);4;3;2;3];
    yvid = int64(1);
    fvid = [int64(3);4;5];
    rvid = [int64(3)];
    svid = int64(2);
    cwid = int64(0);
    vpr = [int64(1)];
    dat = [56, 1, 1, 1, 1;
         50, 1, 2, 1, 1;
         39, 1, 3, 1, 1;
         30, 2, 1, 1, 1;
         36, 2, 2, 1, 1;
         33, 2, 3, 1, 1;
         32, 3, 1, 1, 1;
         31, 3, 2, 1, 1;
         15, 3, 3, 1, 1;
         30, 4, 1, 1, 1;
         35, 4, 2, 1, 1;
         17, 4, 3, 1, 1;
         41, 1, 1, 2, 1;
         36, 1, 2, 2, 2;
         35, 1, 3, 2, 3;
         25, 2, 1, 2, 1;
         28, 2, 2, 2, 2;
         30, 2, 3, 2, 3;
         24, 3, 1, 2, 1;
         27, 3, 2, 2, 2;
         19, 3, 3, 2, 3;
         25, 4, 1, 2, 1;
         30, 4, 2, 2, 2;
         18, 4, 3, 2, 3];
    fint = int64(1);
    rint = int64(1);
    lb = int64(25);
    gamma = [1; 1; 0];
    [nff, nrf, df, reml, b, se, gammaOut, warn, ifail] = ...
         g02ja(nvpr, levels, yvid, fvid, rvid, svid, cwid, vpr, dat, fint, rint, lb, gamma)
    
     
    
    nff =
    
                        6
    
    
    nrf =
    
                       16
    
    
    df =
    
                       16
    
    
    reml =
    
      119.7618
    
    
    b =
    
       37.0000
        1.0000
      -11.0000
       -8.2500
        0.5000
        7.7500
       10.7631
        3.7276
       -1.4476
        0.3733
       -0.5269
       -3.7171
       -1.2253
        4.8125
       -5.6450
        0.5903
        0.3987
       -2.3806
       -4.5912
       -0.6009
        2.2742
       -2.8052
             0
             0
             0
    
    
    se =
    
        4.6674
        3.5173
        3.5173
        2.1635
        3.0596
        3.0596
        4.4865
        3.0331
        3.0331
        3.0331
        4.4865
        3.0331
        3.0331
        3.0331
        4.4865
        3.0331
        3.0331
        3.0331
        4.4865
        3.0331
        3.0331
        3.0331
             0
             0
             0
    
    
    gammaOut =
    
       62.3958
       15.3819
        9.3611
    
    
    warn =
    
                        0
    
    
    ifail =
    
                        0
    
    
    

    PDF version (NAG web site, 64-bit version, 64-bit version)
    Chapter Contents
    Chapter Introduction
    NAG Toolbox

    © The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013