NAG Toolbox: nag_anova_random (g04bb)

Purpose

nag_anova_random (g04bb) computes the analysis of variance and treatment means and standard errors for a randomized block or completely randomized design.

Syntax

[gmean, bmean, tmean, tabl, c, irep, r, ef, ifail] = g04bb(y, iblock, nt, it, tol, irdf, 'n', n)
[gmean, bmean, tmean, tabl, c, irep, r, ef, ifail] = nag_anova_random(y, iblock, nt, it, tol, irdf, 'n', n)

Description

In a completely randomized design, experimental material is divided into a number of units, or plots, to which a treatment can be applied. In a randomized block design the units are grouped into blocks so that the variation within blocks is less than the variation between blocks. If every treatment is applied to one plot in each block it is a complete block design. If there are fewer plots per block than treatments then the design will be an incomplete block design and may be balanced or partially balanced.
For a completely randomized design, with $t$ treatments and $n_j$ plots for the $j$th treatment, the linear model is
$$y_{ij} = \mu + \tau_j + e_{ij}, \quad j = 1,2,\ldots,t \text{ and } i = 1,2,\ldots,n_j,$$
where $y_{ij}$ is the $i$th observation for the $j$th treatment, $\mu$ is the overall mean, $\tau_j$ is the effect of the $j$th treatment and $e_{ij}$ is the random error term. For a randomized block design, with $t$ treatments and $b$ blocks of $k$ plots, the linear model is
$$y_{ij(l)} = \mu + \beta_i + \tau_l + e_{ij}, \quad i = 1,2,\ldots,b,\ j = 1,2,\ldots,k \text{ and } l = 1,2,\ldots,t,$$
where $\beta_i$ is the effect of the $i$th block and the $ij(l)$ notation indicates that the $l$th treatment is applied to the $j$th plot in the $i$th block.
The completely randomized design gives rise to a one-way analysis of variance. The treatments do not have to be equally replicated, i.e., do not have to occur the same number of times. First the overall mean, $\hat{\mu}$, is computed and subtracted from the observations to give $y'_{ij} = y_{ij} - \hat{\mu}$. The estimated treatment effects, $\hat{\tau}_j$, are then computed as the treatment means of the mean adjusted observations, $y'_{ij}$, and the treatment sum of squares can be computed from the sum of squares of the treatment totals of the $y'_{ij}$ divided by the number of observations per treatment total, $n_j$. The final residuals are computed as $r_{ij} = y'_{ij} - \hat{\tau}_j$, and, from the residuals, the residual sum of squares is calculated.
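As a rough illustration of the one-way computation described above, the following sketch uses base MATLAB only and makes no NAG call. The observations in y and the treatment labels in it are made-up toy values, and all variable names are chosen for illustration.

```matlab
% Toy data (hypothetical): 9 plots, 3 treatments with unequal replication.
y  = [4 6 5 10 12 1 3 2 2]';
it = [1 1 1  2  2 3 3 3 3]';
t  = max(it);                       % number of treatments
nj = accumarray(it, 1);             % replication of each treatment

gmean   = mean(y);                  % overall mean
yp      = y - gmean;                % mean adjusted observations
tot     = accumarray(it, yp);       % treatment totals of the adjusted values
tau     = tot ./ nj;                % estimated treatment effects
ssTreat = sum(tot.^2 ./ nj);        % treatment sum of squares
r       = yp - tau(it);             % final residuals
rss     = sum(r.^2);                % residual sum of squares

% F-statistic: treatment mean square over residual mean square.
F = (ssTreat / (t - 1)) / (rss / (numel(y) - t));
```

For these toy values the treatment and residual sums of squares (108 and 6) add up to the total sum of squares of the mean adjusted observations (114).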
For the randomized block design the mean is computed and subtracted from the observations to give $y'_{ij(l)} = y_{ij(l)} - \hat{\mu}$. The estimated block effects, ignoring treatment effects, $\hat{\beta}_i$, are then computed using the block means of the $y'_{ij(l)}$, and the unadjusted block sum of squares is computed as the sum of squared block totals for the $y'_{ij(l)}$ divided by the number of plots per block, $k$. The block adjusted observations are then computed as $y''_{ij(l)} = y'_{ij(l)} - \hat{\beta}_i$. In the case of the complete block design, with the same replication for each treatment within each block, the blocks and treatments are orthogonal, and so the treatment effects are estimated as the treatment means of the block adjusted observations, $y''_{ij(l)}$. The treatment sum of squares is computed as the sum of squared treatment totals of the $y''_{ij(l)}$ divided by the number of replicates of each treatment, $r = bk/t$. Finally the residuals, and hence the residual sum of squares, are given by $r_{ij(l)} = y''_{ij(l)} - \hat{\tau}_l$.
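The complete block case can be traced step by step in a few lines of base MATLAB. This is only a sketch on made-up data (the matrix Y below is hypothetical, with one plot per treatment in each block) and it does not call the NAG function.

```matlab
% Toy complete block design: rows of Y are blocks, columns are treatments.
Y = [5 7  9;
     6 9 12;
     5 8 11];
[b, t] = size(Y);
k = t;                                  % plots per block

Yp      = Y - mean(Y(:));               % mean adjusted observations
beta    = mean(Yp, 2);                  % unadjusted block effects
ssBlock = sum(sum(Yp, 2).^2) / k;       % unadjusted block sum of squares
Ypp     = Yp - repmat(beta, 1, t);      % block adjusted observations

rep     = b * k / t;                    % replicates of each treatment
tau     = mean(Ypp, 1);                 % treatment effects
ssTreat = sum(sum(Ypp, 1).^2) / rep;    % treatment sum of squares
Res     = Ypp - repmat(tau, b, 1);      % residuals
rss     = sum(Res(:).^2);               % residual sum of squares
```

Because blocks and treatments are orthogonal here, ssBlock + ssTreat + rss equals the total sum of squares of the mean adjusted observations.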
For a design without the same replication for each treatment within each block the treatments and the blocks will not be orthogonal, so the treatments adjusted for blocks need to be computed. The adjusted treatment effects are found as the solution to the equations
$$(R - NN^{T}/k)\hat{\tau} = q,$$
where $q$ is the vector of the treatment totals for the block adjusted observations, $y''_{ij(l)}$, $R$ is a diagonal matrix with $R_{ll}$ equal to the number of times the $l$th treatment is replicated, and $N$ is the $t$ by $b$ incidence matrix, with $N_{lj}$ equal to the number of times treatment $l$ occurs in block $j$. The solution to the equations can be written as
$$\hat{\tau} = \Omega q,$$
where $\Omega$ is a generalized inverse of $(R - NN^{T}/k)$. The solution is found from the eigenvalue decomposition of $(R - NN^{T}/k)$. The residuals are first calculated by subtracting the estimated treatment effects from the block adjusted observations to give $r'_{ij(l)} = y''_{ij(l)} - \hat{\tau}_l$. However, since only the unadjusted block effects have been removed and blocks and treatments are not orthogonal, the block means of the $r'_{ij(l)}$ have to be subtracted to give the correct residuals, $r_{ij(l)}$, and the residual sum of squares.
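The non-orthogonal case can be followed numerically with the data from the Example section. The sketch below uses base MATLAB only; it is not the library's implementation, and the way the tolerance is applied (an absolute threshold on the eigenvalues) is an assumption made for illustration.

```matlab
% Data and treatment labels copied from the Example section:
% 10 blocks of k = 3 plots, entered block by block, 6 treatments.
y  = [1 5 4 5 10 6 2 9 3 4 8 6 2 4 7 6 7 5 5 7 2 7 2 4 8 4 2 10 8 7]';
it = [1 2 3 1 2 4 1 3 5 1 4 6 1 5 6 2 3 6 2 4 5 2 5 6 3 4 5 3 4 6]';
b = 10;  t = 6;  k = numel(y) / b;
blk = ceil((1:numel(y))' / k);           % block index of every plot

bmean = accumarray(blk, y) / k;          % unadjusted block means
ypp   = y - bmean(blk);                  % block adjusted observations
q     = accumarray(it, ypp, [t 1]);      % treatment totals of ypp

N = accumarray([it blk], 1, [t b]);      % incidence matrix, t by b
A = diag(sum(N, 2)) - N * N' / k;        % R - N*N'/k
[V, D] = eig((A + A') / 2);              % symmetrised to guard against rounding
lam  = diag(D);
tol  = 1e-5;                             % assumed absolute threshold for "zero"
keep = lam > tol;
Omega = V(:, keep) * diag(1 ./ lam(keep)) * V(:, keep)';  % generalized inverse

tau   = Omega * q;                       % adjusted treatment effects
tmean = mean(y - tau(it)) + tau;         % adjusted treatment means
rp    = ypp - tau(it);                   % first-stage residuals
rpBlk = accumarray(blk, rp) / k;         % block means of first-stage residuals
r     = rp - rpBlk(blk);                 % final residuals
```

With these inputs tmean and r agree with the tmean and r values printed in the Example section to the four decimals shown there.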
The mean squares are computed as the sum of squares divided by the degrees of freedom. The degrees of freedom for the unadjusted blocks is $b-1$. For the completely randomized and the complete block designs the degrees of freedom for the treatments is $t-1$; in the general case it is the rank of the matrix $\Omega$. The $F$-statistic, given by the ratio of the treatment mean square to the residual mean square, tests the hypothesis
$$H_0 : \tau_1 = \tau_2 = \cdots = \tau_t = 0.$$
The standard errors for the difference in treatment effects, or treatment means, for the completely randomized or the complete block designs, are given by
$$\mathrm{se}(\hat{\tau}_j - \hat{\tau}_{j^*}) = \sqrt{\left(\frac{1}{n_j} + \frac{1}{n_{j^*}}\right) s^2},$$
where $s^2$ is the residual mean square and $n_j = n_{j^*} = b$ in the complete block design. In the general case the variances of the treatment effects are given by
$$\mathrm{var}(\hat{\tau}) = \Omega s^2,$$
from which the appropriate standard errors of the difference between treatment effects or the difference between adjusted means can be calculated.
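To make the general-case formula concrete, the sketch below rebuilds the variance-covariance matrix and the standard errors of differences for the design used in the Example section. It assumes that the Moore-Penrose inverse returned by pinv is an acceptable choice of generalized inverse, and it takes the residual sum of squares and degrees of freedom from the printed analysis of variance table rather than recomputing them.

```matlab
% Treatment labels from the Example section (10 blocks of 3 plots, by blocks).
it = [1 2 3 1 2 4 1 3 5 1 4 6 1 5 6 2 3 6 2 4 5 2 5 6 3 4 5 3 4 6]';
b = 10;  t = 6;  k = 3;
blk = ceil((1:numel(it))' / k);          % block index of every plot

N = accumarray([it blk], 1, [t b]);      % incidence matrix
A = diag(sum(N, 2)) - N * N' / k;        % R - N*N'/k
Omega = pinv(A);                         % assumed generalized inverse

s2     = 20.8889 / 15;                   % residual mean square from the table
covTau = Omega * s2;                     % var(tau-hat)
d      = diag(covTau);
% se of a difference: sqrt(var_i + var_j - 2*cov_ij)
seDiff = sqrt(bsxfun(@plus, d, d') - 2 * covTau);

% Same layout as the output c: covariances on and above the diagonal,
% standard errors of differences strictly below it.
c = triu(covTau) + tril(seDiff, -1);
```

The entries of c obtained this way match the values 0.2901, -0.0580 and 0.8344 shown for c in the Example section.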
In the complete block design all the information on the treatment effects is given by the within block analysis. In other designs there may be a loss of information due to the non-orthogonality of treatments and blocks. The efficiency of the within block analysis in these cases is given by the (canonical) efficiency factors. These are the nonzero eigenvalues of the matrix $(R - NN^{T}/k)$, divided by the number of replicates in the case of equal replication, or by the mean of the number of replicates in the unequally replicated case; see John (1987). If more than one eigenvalue is zero then the design is said to be disconnected and some treatments can only be compared using a between block analysis.
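The efficiency factors can be checked directly for the design in the Example section. This is a sketch in base MATLAB; the absolute tolerance used to treat an eigenvalue as zero is an assumption.

```matlab
% Treatment labels from the Example section (10 blocks of 3 plots, by blocks).
it = [1 2 3 1 2 4 1 3 5 1 4 6 1 5 6 2 3 6 2 4 5 2 5 6 3 4 5 3 4 6]';
b = 10;  t = 6;  k = 3;
blk = ceil((1:numel(it))' / k);          % block index of every plot

N    = accumarray([it blk], 1, [t b]);   % incidence matrix
reps = sum(N, 2);                        % treatment replications
A    = diag(reps) - N * N' / k;          % R - N*N'/k

lam = sort(eig((A + A') / 2));           % eigenvalues in ascending order
tol = 1e-5;                              % assumed threshold for a zero eigenvalue
ef  = lam / mean(reps);                  % divide by the (mean) replication
ef(lam <= tol) = 0;                      % report zero eigenvalues as exactly 0
```

The result is one zero followed by five values of 0.8, as in the ef output of the Example section. Exactly one zero eigenvalue means the design is connected.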

References

Cochran W G and Cox G M (1957) Experimental Designs Wiley
Davis O L (1978) The Design and Analysis of Industrial Experiments Longman
John J A (1987) Cyclic Designs Chapman and Hall
John J A and Quenouille M H (1977) Experiments: Design and Analysis Griffin
Searle S R (1971) Linear Models Wiley

Parameters

Compulsory Input Parameters

1:     y(n) – double array
n, the dimension of the array, must satisfy the constraint n ≥ 2 and, if abs(iblock) ≥ 2, n must be a multiple of abs(iblock).
The observations, in the order described by iblock and nt.
2:     iblock – int64/int32/nag_int scalar
Indicates the block structure.
abs(iblock) ≤ 1
There are no blocks, i.e., it is a completely randomized design.
iblock ≥ 2
There are iblock blocks and the data should be input by blocks, i.e., y must contain the observations for block 1 followed by the observations for block 2, etc.
iblock ≤ -2
There are abs(iblock) blocks and the data is input in parallel with respect to blocks, i.e., y(1) must contain the first observation for block 1, y(2) must contain the first observation for block 2, …, y(abs(iblock)) must contain the first observation for block abs(iblock), y(abs(iblock)+1) must contain the second observation for block 1, etc.
Constraint: iblock = 1, iblock ≥ 2 or iblock ≤ -2.
3:     nt – int64/int32/nag_int scalar
The number of treatments. If only blocks are required in the analysis then set nt = 1.
Constraints:
  • if abs(iblock) ≥ 2, nt ≥ 1;
  • otherwise nt ≥ 2.
4:     it(:) – int64/int32/nag_int array
Note: the dimension of the array it must be at least n if nt ≥ 2, and at least 1 otherwise.
it(i) indicates which of the nt treatments plot i received, for i = 1,2,…,n.
If nt = 1, it is not referenced.
Constraint: 1 ≤ it(i) ≤ nt, for i = 1,2,…,n.
5:     tol – double scalar
The tolerance value used to check for zero eigenvalues of the matrix $\Omega$. If tol = 0.0 a default value of $10^{-5}$ is used.
Constraint: tol ≥ 0.0.
6:     irdf – int64/int32/nag_int scalar
An adjustment to the degrees of freedom for the residual and total.
irdf ≥ 1
The degrees of freedom for the total is set to n - irdf and the residual degrees of freedom adjusted accordingly.
irdf = 0
The degrees of freedom for the total is set to n - 1, as usual.
Constraint: irdf ≥ 0.
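The two data orderings selected by the sign of iblock can be illustrated with a small layout. In this sketch the matrices Y and T are hypothetical helpers, not part of the interface: column i holds the observations and treatment labels of block i. The values are the first three blocks of the data in the Example section, used here only to show the ordering.

```matlab
% Y(j, i) is the jth observation of block i; T(j, i) is its treatment label.
Y = [1  5 2;
     5 10 9;
     4  6 3];
T = [1 1 1;
     2 2 3;
     3 4 5];
b = size(Y, 2);                      % number of blocks

% iblock >= 2: all of block 1, then all of block 2, and so on.
yByBlocks  = Y(:);
itByBlocks = T(:);
iblockByBlocks = b;

% iblock <= -2: first observation of every block, then the second, and so on.
yParallel  = reshape(Y.', [], 1);
itParallel = reshape(T.', [], 1);
iblockParallel = -b;
```

Whichever ordering is used, it must be permuted in exactly the same way as y so that it(i) still labels the plot stored in y(i).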

Optional Input Parameters

1:     n – int64/int32/nag_int scalar
Default: The dimension of the array y.
The number of observations.
Constraint: n ≥ 2 and, if abs(iblock) ≥ 2, n must be a multiple of abs(iblock).

Input Parameters Omitted from the MATLAB Interface

ldtabl, ldc, wk

Output Parameters

1:     gmean – double scalar
The grand mean, $\hat{\mu}$.
2:     bmean(abs(iblock)) – double array
If abs(iblock) ≥ 2, bmean(j) contains the mean for the $j$th block, $\hat{\beta}_j$, for j = 1,2,…,b.
3:     tmean(nt) – double array
If nt ≥ 2, tmean(l) contains the (adjusted) mean for the $l$th treatment, $\hat{\mu}^* + \hat{\tau}_l$, for l = 1,2,…,t, where $\hat{\mu}^*$ is the mean of the treatment adjusted observations, $y_{ij(l)} - \hat{\tau}_l$.
4:     tabl(ldtabl,5) – double array
ldtabl ≥ 4.
The analysis of variance table. Column 1 contains the degrees of freedom, column 2 the sum of squares, and where appropriate, column 3 the mean squares, column 4 the $F$-statistic and column 5 the significance level of the $F$-statistic. Row 1 is for Blocks, row 2 for Treatments, row 3 for Residual and row 4 for Total. Mean squares are computed for all but the Total row; $F$-statistics and significance are computed for Treatments and Blocks, if present. Any unfilled cells are set to zero.
5:     c(ldc,nt) – double array
ldc ≥ nt.
If nt ≥ 2, the upper triangular part of c contains the variance-covariance matrix of the treatment effects, and the strictly lower triangular part contains the standard errors of the difference between two treatment effects (means), i.e., c(i,j) contains the covariance of treatments i and j if j ≥ i and the standard error of the difference between treatments i and j if j < i, for i = 1,2,…,t and j = 1,2,…,t.
6:     irep(nt) – int64/int32/nag_int array
If nt ≥ 2, the treatment replications, $R_{ll}$, for l = 1,2,…,nt.
7:     r(n) – double array
The residuals, $r_i$, for i = 1,2,…,n.
8:     ef(nt) – double array
If nt ≥ 2, the canonical efficiency factors.
9:     ifail – int64/int32/nag_int scalar
ifail = 0 unless the function detects an error (see [Error Indicators and Warnings]).
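The relationship between the columns of tabl can be checked by hand. The sketch below starts from the degrees of freedom and sums of squares printed in the Example section and rebuilds columns 3 to 5. It uses the base MATLAB function betainc for the upper tail of the F distribution; that this is how the library evaluates the significance level is an assumption.

```matlab
% Columns 1 and 2 of the table printed in the Example section
% (rows: Blocks, Treatments, Residual, Total).
df = [9; 5; 15; 29];
ss = [60.0000; 101.7778; 20.8889; 182.6667];

ms = ss(1:3) ./ df(1:3);             % mean squares (not formed for Total)
F  = ms(1:2) / ms(3);                % Blocks and Treatments against Residual

% Upper tail probability of F(d1, d2) at f, written with the regularized
% incomplete beta function: I_x(d2/2, d1/2) with x = d2 / (d2 + d1*f).
d1 = df(1:2);
d2 = df(3);
p  = betainc(d2 ./ (d2 + d1 .* F), d2 / 2, d1 / 2);
```

The values of ms, F and p reproduce columns 3, 4 and 5 of the printed table to the precision shown there.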

Error Indicators and Warnings

Note: nag_anova_random (g04bb) may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

  ifail = 1
On entry, n < 2,
or nt ≤ 0,
or nt = 1 and abs(iblock) ≤ 1,
or ldtabl < 4,
or ldc < nt,
or tol < 0.0,
or irdf < 0.
  ifail = 2
On entry, abs(iblock) ≥ 2 and n is not a multiple of abs(iblock).
  ifail = 3
On entry, it(i) < 1 or it(i) > nt for some i when nt ≥ 2,
or no value of it = j for some j = 1,2,…,nt, when nt ≥ 2.
  ifail = 4
On entry, the values of y are constant.
  ifail = 5
A computed standard error is zero due to rounding errors, or the eigenvalue computation failed to converge. Both are unlikely error exits.
W ifail = 6
The treatments are totally confounded with blocks, so the treatment sum of squares and degrees of freedom are zero. The analysis of variance table is not computed, except for block and total sums of squares and degrees of freedom.
W ifail = 7
The residual degrees of freedom or the residual sum of squares are zero. Columns 3, 4 and 5 of the analysis of variance table will not be computed and the matrix of standard errors and covariances, c, will not be scaled by $s$ or $s^2$.
W ifail = 8
The design is disconnected; the standard errors may not be valid. The design may be nested.

Accuracy

The algorithm used by nag_anova_random (g04bb), described in Section [Description], achieves greater accuracy than the traditional algorithms based on the subtraction of sums of squares.

Further Comments

To estimate missing values the Healy and Westmacott procedure or its derivatives may be used, see John and Quenouille (1977). This is an iterative procedure in which estimates of the missing values are adjusted by subtracting the corresponding values of the residuals. The new estimates are then used in the analysis of variance. This process is repeated until convergence. A suitable initial value may be the grand mean $\hat{\mu}$. When using this procedure irdf should be set to the number of missing values plus one to obtain the correct degrees of freedom for the residual sum of squares.
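The iteration described above can be sketched on a small complete block design. The data are made up, the starting value is taken as the mean of the observed values, and the residuals are computed in place with row and column means instead of being read from the output r; in practice each pass would be a call to the function with the current estimates inserted in y.

```matlab
% Toy complete block design (rows = blocks, columns = treatments),
% with one missing plot marked by NaN.
Y = [5 7   9;
     6 9 NaN;
     5 8  11];
miss = isnan(Y);
Y(miss) = mean(Y(~miss));                 % initial estimate

for iter = 1:200
    rowm = repmat(mean(Y, 2), 1, size(Y, 2));
    colm = repmat(mean(Y, 1), size(Y, 1), 1);
    Res  = Y - rowm - colm + mean(Y(:));  % residuals of the complete block fit
    Y(miss) = Y(miss) - Res(miss);        % subtract residuals at missing plots
    if max(abs(Res(miss))) < 1e-10
        break;                            % estimates have stopped changing
    end
end

irdf = nnz(miss) + 1;                     % value to pass as irdf
```

For this layout the estimate converges to 11.25, which is the value at which the residual of the missing plot is zero.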
For designs such as Graeco–Latin squares one or more of the blocking factors has to be removed in a preliminary analysis before the final analysis using calls to nag_anova_random (g04bb) or nag_anova_rowcol (g04bc). The residuals from the preliminary analysis are then input to nag_anova_random (g04bb). In these cases irdf should be set to the difference between n and the residual degrees of freedom from the preliminary analysis. Care should be taken when using this approach as there is no check on the orthogonality of the two analyses.
For analysis of covariance the residuals are obtained from an analysis of variance of both the response variable and the covariates. The residuals from the response variable are then regressed on the residuals from the covariates using, say, nag_correg_linregs_noconst (g02cb) or nag_correg_linregm_fit (g02da). The results from those functions can be used to test for the significance of the covariates. To test the significance of the treatment effects after fitting the covariate, the residual sum of squares from the regression should be compared with the residual sum of squares obtained from the equivalent regression but using the residuals from fitting blocks only.

Example

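function nag_anova_random_example
% Balanced incomplete block design: 6 treatments in 10 blocks of 3 plots.
% Observations are entered block by block, one block per line.
y = [ 1;  5;  4; ...
      5; 10;  6; ...
      2;  9;  3; ...
      4;  8;  6; ...
      2;  4;  7; ...
      6;  7;  5; ...
      5;  7;  2; ...
      7;  2;  4; ...
      8;  4;  2; ...
     10;  8;  7];
iblock = int64(10);   % 10 blocks, data input by blocks
nt = int64(6);        % number of treatments
% it(i) is the treatment applied to plot i, in the same order as y
it = int64([1;2;3; 1;2;4; 1;3;5; 1;4;6; 1;5;6; ...
            2;3;6; 2;4;5; 2;5;6; 3;4;5; 3;4;6]);
tol = 5e-06;          % tolerance for zero eigenvalues
irdf = int64(0);      % total degrees of freedom left at n - 1
[gmean, bmean, tmean, table, c, irep, r, ef, ifail] = ...
    nag_anova_random(y, iblock, nt, it, tol, irdf)
 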

gmean =

    5.3333


bmean =

    3.3333
    7.0000
    4.6667
    6.0000
    4.3333
    6.0000
    4.6667
    4.3333
    4.6667
    8.3333


tmean =

    2.5000
    7.2500
    8.0833
    5.9167
    2.9167
    5.3333


table =

    9.0000   60.0000    6.6667    4.7872    0.0039
    5.0000  101.7778   20.3556   14.6170    0.0000
   15.0000   20.8889    1.3926         0         0
   29.0000  182.6667         0         0         0


c =

    0.2901   -0.0580   -0.0580   -0.0580   -0.0580   -0.0580
    0.8344    0.2901   -0.0580   -0.0580   -0.0580   -0.0580
    0.8344    0.8344    0.2901   -0.0580   -0.0580   -0.0580
    0.8344    0.8344    0.8344    0.2901   -0.0580   -0.0580
    0.8344    0.8344    0.8344    0.8344    0.2901   -0.0580
    0.8344    0.8344    0.8344    0.8344    0.8344    0.2901


irep =

                    5
                    5
                    5
                    5
                    5
                    5


r =

    1.1111
    0.3611
   -1.4722
    0.7222
    0.9722
   -1.6944
   -0.6667
    0.7500
   -0.0833
    0.0833
    0.6667
   -0.7500
   -1.2500
    0.3333
    0.9167
   -0.3611
   -0.1944
    0.5556
   -1.5556
    1.7778
   -0.2222
    0.5833
   -0.0833
   -0.5000
    0.8889
   -0.9444
    0.0556
    0.0278
    0.1944
   -0.2222


ef =

         0
    0.8000
    0.8000
    0.8000
    0.8000
    0.8000


ifail =

                    0




© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013