g02ja fits a linear mixed effects regression model using restricted maximum likelihood (REML).

Syntax

C#
public static void g02ja(
	int n,
	int ncol,
	double[,] dat,
	int[] levels,
	int yvid,
	int cwid,
	int nfv,
	int[] fvid,
	int fint,
	int nrv,
	int[] rvid,
	int nvpr,
	int[] vpr,
	int rint,
	int svid,
	double[] gamma,
	out int nff,
	out int nrf,
	out int df,
	out double reml,
	double[] b,
	double[] se,
	int maxit,
	double tol,
	out int warn,
	out int ifail
)
Visual Basic
Public Shared Sub g02ja ( _
	n As Integer, _
	ncol As Integer, _
	dat As Double(,), _
	levels As Integer(), _
	yvid As Integer, _
	cwid As Integer, _
	nfv As Integer, _
	fvid As Integer(), _
	fint As Integer, _
	nrv As Integer, _
	rvid As Integer(), _
	nvpr As Integer, _
	vpr As Integer(), _
	rint As Integer, _
	svid As Integer, _
	gamma As Double(), _
	<OutAttribute> ByRef nff As Integer, _
	<OutAttribute> ByRef nrf As Integer, _
	<OutAttribute> ByRef df As Integer, _
	<OutAttribute> ByRef reml As Double, _
	b As Double(), _
	se As Double(), _
	maxit As Integer, _
	tol As Double, _
	<OutAttribute> ByRef warn As Integer, _
	<OutAttribute> ByRef ifail As Integer _
)
Visual C++
public:
static void g02ja(
	int n, 
	int ncol, 
	array<double,2>^ dat, 
	array<int>^ levels, 
	int yvid, 
	int cwid, 
	int nfv, 
	array<int>^ fvid, 
	int fint, 
	int nrv, 
	array<int>^ rvid, 
	int nvpr, 
	array<int>^ vpr, 
	int rint, 
	int svid, 
	array<double>^ gamma, 
	[OutAttribute] int% nff, 
	[OutAttribute] int% nrf, 
	[OutAttribute] int% df, 
	[OutAttribute] double% reml, 
	array<double>^ b, 
	array<double>^ se, 
	int maxit, 
	double tol, 
	[OutAttribute] int% warn, 
	[OutAttribute] int% ifail
)
F#
static member g02ja : 
        n : int * 
        ncol : int * 
        dat : float[,] * 
        levels : int[] * 
        yvid : int * 
        cwid : int * 
        nfv : int * 
        fvid : int[] * 
        fint : int * 
        nrv : int * 
        rvid : int[] * 
        nvpr : int * 
        vpr : int[] * 
        rint : int * 
        svid : int * 
        gamma : float[] * 
        nff : int byref * 
        nrf : int byref * 
        df : int byref * 
        reml : float byref * 
        b : float[] * 
        se : float[] * 
        maxit : int * 
        tol : float * 
        warn : int byref * 
        ifail : int byref -> unit 

Parameters

n
Type: System.Int32
On entry: n, the number of observations.
Constraint: n≥1.
ncol
Type: System.Int32
On entry: the number of columns in the data matrix, dat.
Constraint: ncol≥1.
dat
Type: System.Double[,]
An array of size [dim1, dim2]
Note: dim1 must satisfy the constraint: dim1≥n
Note: the second dimension of the array dat must be at least ncol if _sorder=1, and at least n otherwise.
On entry: array containing all of the data. For the ith observation:
  • dat[i-1,yvid-1] holds the dependent variable, y;
  • if cwid≠0, dat[i-1,cwid-1] holds the case weights;
  • if svid≠0, dat[i-1,svid-1] holds the subject variable.
The remaining columns hold the values of the independent variables.
Constraints:
  • if cwid≠0, dat[i-1,cwid-1]≥0.0;
  • if levels[j-1]≠1, 1≤dat[i-1,j-1]≤levels[j-1].
levels
Type: System.Int32[]
An array of size [ncol]
On entry: levels[i-1] contains the number of levels associated with the ith variable of the data matrix dat. If this variable is continuous or binary (i.e., only takes the values zero or one) then levels[i-1] should be 1; if the variable is discrete then levels[i-1] is the number of levels associated with it and dat[j-1,i-1] is assumed to take the values 1 to levels[i-1], for j=1,2,…,n.
Constraint: levels[i-1]≥1, for i=1,2,…,ncol.
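The range rule for discrete columns can be checked before calling g02ja. The following is a minimal sketch, assuming the data are already in a double[,] laid out as described for dat; the class LevelCheck and the method FirstBadLevel are hypothetical helpers written for this illustration and are not part of the library.

```csharp
using System;

// Hypothetical helper, not part of the NAG library.
public static class LevelCheck
{
    // Returns the 1-based (row, column) of the first entry of a discrete
    // column (levels[j] != 1) that lies outside 1..levels[j], or null if
    // every discrete column is in range. Columns with levels[j] == 1
    // (continuous or binary) are skipped.
    public static Tuple<int, int> FirstBadLevel(double[,] dat, int[] levels, int n)
    {
        for (int j = 0; j < levels.Length; j++)
        {
            if (levels[j] == 1) continue;
            for (int i = 0; i < n; i++)
            {
                double v = dat[i, j];
                if (v < 1.0 || v > levels[j])
                    return Tuple.Create(i + 1, j + 1);
            }
        }
        return null;
    }
}
```

A null result means no discrete value falls outside its declared range; the sketch does not check that the values are whole numbers.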
yvid
Type: System.Int32
On entry: the column of dat holding the dependent variable, y.
Constraint: 1≤yvid≤ncol.
cwid
Type: System.Int32
On entry: the column of dat holding the case weights.
If cwid=0, no weights are used.
Constraint: 0≤cwid≤ncol.
nfv
Type: System.Int32
On entry: the number of independent variables in the model which are to be treated as being fixed.
Constraint: 0≤nfv<ncol.
fvid
Type: System.Int32[]
An array of size [nfv]
On entry: the columns of the data matrix dat holding the fixed independent variables with fvid[i-1] holding the column number corresponding to the ith fixed variable.
Constraint: 1≤fvid[i-1]≤ncol, for i=1,2,…,nfv.
fint
Type: System.Int32
On entry: flag indicating whether a fixed intercept is included (fint=1).
Constraint: fint=0 or 1.
nrv
Type: System.Int32
On entry: the number of independent variables in the model which are to be treated as being random.
Constraints:
  • 0≤nrv<ncol;
  • nrv+rint>0.
rvid
Type: System.Int32[]
An array of size [nrv]
On entry: the columns of the data matrix dat holding the random independent variables with rvid[i-1] holding the column number corresponding to the ith random variable.
Constraint: 1≤rvid[i-1]≤ncol, for i=1,2,…,nrv.
nvpr
Type: System.Int32
On entry: if rint=1 and svid≠0, nvpr is the number of variance components being estimated minus 2, that is g-1; otherwise nvpr=g.
If nrv=0, nvpr is not referenced.
Constraint: if nrv≠0, 1≤nvpr≤nrv.
vpr
Type: System.Int32[]
An array of size [nrv]
On entry: vpr[i-1] holds a flag indicating the variance of the ith random variable. The variance of the ith random variable is σ_j^2, where j=vpr[i-1]+1 if rint=1 and svid≠0 and j=vpr[i-1] otherwise. Random variables with the same value of j are assumed to be taken from the same distribution.
Constraint: 1≤vpr[i-1]≤nvpr, for i=1,2,…,nrv.
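The index shift applied when a random intercept is combined with a subject variable is easy to get wrong, so it may help to see it written out. The sketch below only restates the two rules given for nvpr and vpr; the class VarianceGroups and its methods are hypothetical names introduced here, not library functions.

```csharp
using System;

// Hypothetical helper, not part of the NAG library.
public static class VarianceGroups
{
    // g: the number of variance components excluding the residual variance.
    // g = nvpr + 1 when rint = 1 and svid != 0, otherwise g = nvpr.
    public static int GroupCount(int nvpr, int rint, int svid)
    {
        return (rint == 1 && svid != 0) ? nvpr + 1 : nvpr;
    }

    // j: the index of the variance sigma_j^2 assigned to the ith random
    // variable (i is 1-based, as in the text above).
    public static int VarianceIndex(int[] vpr, int i, int rint, int svid)
    {
        int shift = (rint == 1 && svid != 0) ? 1 : 0;
        return vpr[i - 1] + shift;
    }
}
```

For example, with nvpr=1, rint=1 and a nonzero svid, GroupCount gives g=2 and a random variable flagged with vpr[0]=1 is assigned σ_2^2.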
rint
Type: System.Int32
On entry: flag indicating whether a random intercept is included (rint=1).
If svid=0, rint is not referenced.
Constraint: rint=0 or 1.
svid
Type: System.Int32
On entry: the column of dat holding the subject variable.
If svid=0, no subject variable is used.
Specifying a subject variable is equivalent to specifying the interaction between that variable and all of the random effects. Letting the notation Z_1×Z_S denote the interaction between variables Z_1 and Z_S, fitting a model with rint=0, random effects Z_1+Z_2 and subject variable Z_S is equivalent to fitting a model with random effects Z_1×Z_S+Z_2×Z_S and no subject variable. If rint=1 the model is equivalent to fitting Z_S+Z_1×Z_S+Z_2×Z_S and no subject variable.
Constraint: 0≤svid≤ncol.
gamma
Type: System.Double[]
An array of size [nvpr+2]
On entry: holds the initial values of the variance components, γ_0, with gamma[i-1] the initial value for σ_i^2/σ_R^2, for i=1,2,…,g. If rint=1 and svid≠0, g=nvpr+1, else g=nvpr.
If gamma[0]=-1.0, the remaining elements of gamma are ignored and the initial values for the variance components are estimated from the data using MIVQUE0.
On exit: gamma[i-1], for i=1,2,…,g, holds the final estimate of σ_i^2 and gamma[g] holds the final estimate for σ_R^2.
Constraint: gamma[0]=-1.0 or gamma[i-1]≥0.0, for i=1,2,…,g.
nff
Type: System.Int32%
On exit: the number of fixed effects estimated (i.e., the number of columns, p, in the design matrix X).
nrf
Type: System.Int32%
On exit: the number of random effects estimated (i.e., the number of columns, q, in the design matrix Z).
df
Type: System.Int32%
On exit: the degrees of freedom.
reml
Type: System.Double%
On exit: -2l_R(γ̂), where l_R is the log of the restricted maximum likelihood calculated at γ̂, the estimated variance components returned in gamma.
b
Type: System.Double[]
An array of size [dim1]
Note: dim1 must satisfy the constraint: dim1≥fint+Σ_{i=1}^{nfv} max(levels[fvid[i-1]-1]-1, 1)+L_S×(rint+Σ_{i=1}^{nrv} levels[rvid[i-1]-1]), where L_S=levels[svid-1] if svid≠0 and 1 otherwise.
On exit: the parameter estimates, (β,ν), with the first nff elements of b containing the fixed effect parameter estimates, β, and the next nrf elements of b containing the random effect parameter estimates, ν.
Fixed effects
If fint=1, b[0] contains the estimate of the fixed intercept. Let L_i denote the number of levels associated with the ith fixed variable, that is L_i=levels[fvid[i-1]-1]. Define
  • if fint=1, F_1=2 else if fint=0, F_1=1;
  • F_{i+1}=F_i+max(L_i-1,1), i≥1.
Then for i=1,2,…,nfv:
  • if L_i>1, b[F_i+j-3] contains the parameter estimate for the jth level of the ith fixed variable, for j=2,3,…,L_i;
  • if L_i≤1, b[F_i-1] contains the parameter estimate for the ith fixed variable.
Random effects
Redefine L_i to denote the number of levels associated with the ith random variable, that is L_i=levels[rvid[i-1]-1]. Define
  • if rint=1, R_1=2 else if rint=0, R_1=1;
  • R_{i+1}=R_i+L_i, i≥1.
Then for i=1,2,…,nrv:
  • if svid=0,
    • if L_i>1, b[nff+R_i+j-2] contains the parameter estimate for the jth level of the ith random variable, for j=1,2,…,L_i;
    • if L_i≤1, b[nff+R_i-1] contains the parameter estimate for the ith random variable;
  • if svid≠0,
    • let L_S denote the number of levels associated with the subject variable, that is L_S=levels[svid-1];
    • if L_i>1, b[nff+(s-1)L_S+R_i+j-2] contains the parameter estimate for the interaction between the sth level of the subject variable and the jth level of the ith random variable, for s=1,2,…,L_S and j=1,2,…,L_i;
    • if L_i≤1, b[nff+(s-1)L_S+R_i-1] contains the parameter estimate for the interaction between the sth level of the subject variable and the ith random variable, for s=1,2,…,L_S;
    • if rint=1, b[nff] contains the estimate of the random intercept.
se
Type: System.Double[]
An array of size [dim1]
Note: dim1 must satisfy the constraint: dim1≥fint+Σ_{i=1}^{nfv} max(levels[fvid[i-1]-1]-1, 1)+L_S×(rint+Σ_{i=1}^{nrv} levels[rvid[i-1]-1]), where L_S=levels[svid-1] if svid≠0 and 1 otherwise.
On exit: the standard errors of the parameter estimates given in b.
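Since b and se share the same dimension constraint, it can be convenient to compute the minimum length once and allocate both arrays from it. The sketch below assumes the constraint is read as fint + Σ max(L_i-1, 1) over the fixed variables, plus L_S × (rint + Σ L_i over the random variables); that grouping is an interpretation of the note above. CoefficientSize and MinLength are hypothetical names, not part of the library.

```csharp
using System;

// Hypothetical helper, not part of the NAG library.
public static class CoefficientSize
{
    // Minimum length of b and se. fvid, rvid and svid are 1-based column
    // numbers, exactly as they would be passed to g02ja.
    public static int MinLength(int[] levels, int fint, int[] fvid,
                                int rint, int[] rvid, int svid)
    {
        // Fixed part: the intercept plus max(L - 1, 1) per fixed variable.
        int fixedCount = fint;
        foreach (int col in fvid)
            fixedCount += Math.Max(levels[col - 1] - 1, 1);

        // Random part per subject: the intercept plus L per random variable.
        int randomPerSubject = rint;
        foreach (int col in rvid)
            randomPerSubject += levels[col - 1];

        // L_S is the number of subject levels, or 1 without a subject variable.
        int ls = (svid != 0) ? levels[svid - 1] : 1;
        return fixedCount + ls * randomPerSubject;
    }
}
```

With a four-level subject variable, a random intercept and one three-level random factor, the random part is 4×(1+3)=16 elements; the fixed part is added on top of that.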
maxit
Type: System.Int32
On entry: the maximum number of iterations.
If maxit<0, the default value of 100 is used.
If maxit=0, the parameter estimates (β,ν) and corresponding standard errors are calculated based on the value of γ_0 supplied in gamma.
tol
Type: System.Double
On entry: the tolerance used to assess convergence.
If tol≤0.0, the default value of ε^0.7 is used, where ε is the machine precision.
warn
Type: System.Int32%
On exit: is set to 1 if a variance component was estimated to be a negative value during the fitting process. Otherwise warn is set to 0.
If warn=1, the negative estimate is set to zero and the estimation process allowed to continue.
ifail
Type: System.Int32%
On exit: ifail=0 unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).

Description

g02ja fits a model of the form:
y=Xβ+Zν+ε
where
  • y is a vector of n observations on the dependent variable,
  • X is a known n by p design matrix for the fixed independent variables,
  • β is a vector of length p of unknown fixed effects,
  • Z is a known n by q design matrix for the random independent variables,
  • ν is a vector of length q of unknown random effects,
and
  • ε is a vector of length n of unknown random errors.
Both ν and ε are assumed to have a Gaussian distribution with expectation zero and
$$\mathrm{Var}\begin{bmatrix}\nu\\ \varepsilon\end{bmatrix}=\begin{bmatrix}G&0\\ 0&R\end{bmatrix}$$
where R=σ_R^2 I, I is the n×n identity matrix and G is a diagonal matrix. It is assumed that the random variables, Z, can be subdivided into g≤q groups with each group being identically distributed with expectation zero and variance σ_i^2. The diagonal elements of matrix G therefore take one of the values {σ_i^2 : i=1,2,…,g}, depending on which group the associated random variable belongs to.
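As a small illustration of this grouping (the sizes are chosen for the illustration only), take q=4 random effects in g=2 groups, with the first effect in group 1 and the remaining three in group 2. Under those assumptions G would be:

```latex
G=\operatorname{diag}\left(\sigma_1^2,\ \sigma_2^2,\ \sigma_2^2,\ \sigma_2^2\right)
```

so only two variances, together with the residual variance, need to be estimated however many columns Z has.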
The model therefore contains three sets of unknowns, the fixed effects, β, the random effects, ν, and a vector of g+1 variance components, γ, where γ={σ_1^2, σ_2^2, …, σ_{g-1}^2, σ_g^2, σ_R^2}. Rather than working directly with γ, g02ja uses an iterative process to estimate γ*={σ_1^2/σ_R^2, σ_2^2/σ_R^2, …, σ_{g-1}^2/σ_R^2, σ_g^2/σ_R^2, 1}. Due to the iterative nature of the estimation a set of initial values, γ_0, for γ* is required. g02ja allows these initial values either to be supplied by you or calculated from the data using the minimum variance quadratic unbiased estimators (MIVQUE0) suggested by Rao (1972).
g02ja fits the model using a quasi-Newton algorithm to maximize the restricted log-likelihood function:
$$-2l_R=\log\left(|V|\right)+(n-p)\log\left(r^{T}V^{-1}r\right)+\log\left|X^{T}V^{-1}X\right|+(n-p)\left(1+\log\left(\frac{2\pi}{n-p}\right)\right)$$
where
$$V=ZGZ^{T}+R,\quad r=y-Xb\quad\text{and}\quad b=\left(X^{T}V^{-1}X\right)^{-1}X^{T}V^{-1}y.$$
Once the final estimates for γ* have been obtained, the value of σ_R^2 is given by:
$$\sigma_R^2=\frac{r^{T}V^{-1}r}{n-p}.$$
Case weights, W_c, can be incorporated into the model by replacing X^T X and Z^T Z with X^T W_c X and Z^T W_c Z respectively, for a diagonal weight matrix W_c.
The log-likelihood, l_R, is calculated using the sweep algorithm detailed in Wolfinger et al. (1994).
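Because γ* holds ratios to the residual variance, the individual variance components follow by rescaling once σ_R^2 is known. The step below is a restatement using only the definitions in this section, with hats marking estimates:

```latex
\hat\sigma_i^2=\hat\gamma^{*}_i\,\hat\sigma_R^2,\qquad i=1,2,\ldots,g,
\qquad\text{with}\qquad
\hat\sigma_R^2=\frac{r^{T}V^{-1}r}{n-p}.
```

These rescaled values correspond to what gamma is documented to hold on exit.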

References

Goodnight J H (1979) A tutorial on the SWEEP operator The American Statistician 33(3) 149–158
Harville D A (1977) Maximum likelihood approaches to variance component estimation and to related problems J. Am. Stat. Assoc. 72 320–340
Rao C R (1972) Estimation of variance and covariance components in a linear model J. Am. Stat. Assoc. 67 112–115
Stroup W W (1989) Predictable functions and prediction space in the mixed model procedure Applications of Mixed Models in Agriculture and Related Disciplines Southern Cooperative Series Bulletin No. 343 39–48
Wolfinger R, Tobias R and Sall J (1994) Computing Gaussian likelihoods and their derivatives for general linear mixed models SIAM Sci. Statist. Comput. 15 1294–1310

Error Indicators and Warnings

Errors or warnings detected by the method:
Some error messages may refer to parameters that are dropped from this interface (LDDAT, LB). In these cases, an error in another parameter has usually caused an incorrect value to be inferred.
ifail=1
On entry, n<2,
or ncol<1,
or yvid<1 or yvid>ncol,
or cwid<0 or cwid>ncol,
or nfv<0 or nfv≥ncol,
or fint≠0 and fint≠1,
or nrv<0 or nrv>ncol or nrv+rint≤0,
or nvpr≤0 or nvpr>nrv,
or rint≠0 and rint≠1,
or svid<0 or svid>ncol.
ifail=2
On entry, levels[i-1]<1, for at least one i,
or fvid[i-1]<1, or fvid[i-1]>ncol, for at least one i,
or rvid[i-1]<1, or rvid[i-1]>ncol, for at least one i,
or vpr[i-1]<1 or vpr[i-1]>nvpr, for at least one i,
or at least one discrete variable in array dat has a value less than 1.0 or greater than that specified in levels,
or gamma[i-1]<0.0, for at least one i, and gamma[0]≠-1.0.
ifail=3
Degrees of freedom <1. The number of parameters exceeds the effective number of observations.
ifail=4
The method failed to converge to the specified tolerance in maxit iterations. See [Further Comments] for advice.
ifail=-9000
An error occurred, see message report.
ifail=-6000
Invalid Parameters value
ifail=-4000
Invalid dimension for array value
ifail=-8000
Negative dimension for array value

Accuracy

The accuracy of the results can be adjusted through the use of the tol parameter.

Parallelism and Performance

None.

Further Comments

Wherever possible any block structure present in the design matrix Z should be modelled through a subject variable, specified via svid, rather than being explicitly entered into dat.
g02ja uses an iterative process to fit the specified model and for some problems this process may fail to converge (see ifail=4). If the method fails to converge, the maximum number of iterations (see maxit) or the tolerance (see tol) may need increasing, or a different starting estimate may need to be supplied in gamma. Alternatively, the model can be fitted using maximum likelihood (see g02jb) or using the noniterative MIVQUE0.
To fit the model just using MIVQUE0, the first element of gamma should be set to -1.0 and maxit should be set to zero.
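The two settings for a MIVQUE0-only fit can be prepared as follows. This is a minimal sketch: MivqueSetup, MivqueGamma and MivqueOnlyMaxit are hypothetical names introduced for the illustration, and the call to g02ja itself is left out because it needs the library and a full set of arguments.

```csharp
using System;

// Hypothetical helper, not part of the NAG library.
public static class MivqueSetup
{
    // maxit = 0 requests no iterations, so the estimates are based on the
    // starting values alone.
    public const int MivqueOnlyMaxit = 0;

    // Allocates gamma with the documented size nvpr + 2 and sets the flag
    // that asks for starting values to be estimated by MIVQUE0.
    public static double[] MivqueGamma(int nvpr)
    {
        double[] gamma = new double[nvpr + 2];
        gamma[0] = -1.0; // remaining elements are ignored on entry
        return gamma;
    }
}
```

The returned array and MivqueOnlyMaxit would then be passed as the gamma and maxit arguments.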
Although the quasi-Newton algorithm used in g02ja tends to require more iterations before converging compared to the Newton–Raphson algorithm recommended by Wolfinger et al. (1994), it does not require the second derivatives of the likelihood function to be calculated and consequently takes significantly less time per iteration.

Example

The following dataset is taken from Stroup (1989) and arises from a balanced split-plot design with the whole plots arranged in a randomized complete block design.
In this example the full design matrix for the random independent variable, Z, is given by:
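$$
Z=\left(\begin{array}{cccccccccccccccc}
1&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&0&0&1&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&1&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&0&1&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&0&0&1&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&1&1&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&1&0&1&0&0&0&0&0\\
0&0&0&0&0&0&0&0&1&0&0&1&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&1&0&1&0\\
0&0&0&0&0&0&0&0&0&0&0&0&1&0&0&1\\
1&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&0&0&1&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&1&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&0&1&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&0&0&1&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&1&1&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&1&0&1&0&0&0&0&0\\
0&0&0&0&0&0&0&0&1&0&0&1&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&1&0&1&0\\
0&0&0&0&0&0&0&0&0&0&0&0&1&0&0&1
\end{array}\right)
=\begin{pmatrix}
A&0&0&0\\
0&A&0&0\\
0&0&A&0\\
0&0&0&A\\
A&0&0&0\\
0&A&0&0\\
0&0&A&0\\
0&0&0&A
\end{pmatrix},\tag{1}
$$
where
$$A=\begin{pmatrix}1&1&0&0\\ 1&0&1&0\\ 1&0&0&1\end{pmatrix}.$$
The block structure evident in (1) is modelled by specifying a four-level subject variable, taking the values 1,1,1,2,2,2,3,3,3,4,4,4,1,1,1,2,2,2,3,3,3,4,4,4. The first column of 1s is added to A by setting rint=1. The remaining columns of A are specified by a three-level factor, taking the values 1,2,3,1,2,3,1,….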
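To see how the subject variable, the random intercept and the three-level factor combine into the matrix in (1), the construction can be written out directly. The sketch below is for illustration only: ExampleDesign and its methods are hypothetical names, the factor is assumed to keep cycling 1,2,3 for all 24 observations, and the column ordering is chosen to reproduce the matrix shown, which says nothing about how g02ja stores Z internally.

```csharp
using System;

// Hypothetical helper, not part of the NAG library.
public static class ExampleDesign
{
    // Subject variable of the example: 1,1,1,2,2,2,3,3,3,4,4,4, twice.
    public static int[] ExampleSubject()
    {
        int[] s = new int[24];
        for (int i = 0; i < s.Length; i++) s[i] = (i / 3) % 4 + 1;
        return s;
    }

    // Three-level factor of the example: 1,2,3,1,2,3,...
    public static int[] ExampleFactor()
    {
        int[] f = new int[24];
        for (int i = 0; i < f.Length; i++) f[i] = i % 3 + 1;
        return f;
    }

    // Builds Z for a random intercept (rint=1) and one random factor, both
    // crossed with the subject variable. Columns are grouped by subject:
    // the intercept column first, then one column per factor level.
    public static int[,] BuildZ(int[] subject, int[] factor,
                                int subjectLevels, int factorLevels)
    {
        int n = subject.Length;
        int width = 1 + factorLevels;            // columns of the block A
        int[,] z = new int[n, subjectLevels * width];
        for (int i = 0; i < n; i++)
        {
            int offset = (subject[i] - 1) * width;
            z[i, offset] = 1;                    // intercept for this subject
            z[i, offset + factor[i]] = 1;        // factor level within subject
        }
        return z;
    }
}
```

Calling BuildZ(ExampleSubject(), ExampleFactor(), 4, 3) gives a 24 by 16 array with two 1s in every row, laid out in the block pattern of (1).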

Example program (C#): g02jae.cs

Example program data: g02jae.d

Example program results: g02jae.r
