NAG Toolbox: nag_anova_rowcol (g04bc)

Purpose

nag_anova_rowcol (g04bc) computes the analysis of variance for a general row and column design together with the treatment means and standard errors.

Syntax

[gmean, tmean, tabl, c, irep, rpmean, rmean, cmean, r, ef, ifail] = g04bc(nrep, nrow, ncol, y, nt, it, tol, irdf)
[gmean, tmean, tabl, c, irep, rpmean, rmean, cmean, r, ef, ifail] = nag_anova_rowcol(nrep, nrow, ncol, y, nt, it, tol, irdf)

Description

In a row and column design the experimental material can be characterised by a two-way classification, nominally called rows and columns. Each experimental unit can be considered as being located in a particular row and column. It is assumed that all rows are of the same length and all columns are of the same length. Sets of equal numbers of rows/columns can be grouped together to form replicates, sometimes known as squares or rectangles, as appropriate.
If for a replicate, the number of rows, the number of columns and the number of treatments are equal and every treatment occurs once in each row and each column then the design is a Latin square. If this is not the case the treatments will be non-orthogonal to rows and columns. For example in the case of a lattice square each treatment occurs only once in each square.
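The Latin square condition above can be checked mechanically. The following is a minimal sketch, assuming a single replicate whose treatment allocation is stored by columns within rows; the allocation is the one used in the Example section, and the names `IT`, `rowsOk`, `colsOk` and `isLatin` are illustrative only and not part of the toolbox.

```matlab
% Treatment allocation from the Example (one 5-by-5 replicate),
% stored by columns within rows.
it = [5 4 1 3 2  2 5 4 1 3  3 2 5 4 1  1 3 2 5 4  4 1 3 2 5]';
nrow = 5; ncol = 5; nt = 5;

% reshape fills column-wise, so transpose to obtain IT(row, column).
IT = reshape(it, ncol, nrow)';

% A Latin square needs nrow == ncol == nt and each treatment exactly once
% per row and per column, i.e. every sorted row and column equals 1:nt.
rowsOk  = isequal(sort(IT, 2), repmat(1:nt, nrow, 1));
colsOk  = isequal(sort(IT, 1), repmat((1:nt)', 1, ncol));
isLatin = (nrow == nt) && (ncol == nt) && rowsOk && colsOk;
```

If `isLatin` is false the treatments are not orthogonal to rows and columns, and the adjusted analysis described later in this section applies.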
For a row and column design with $t$ treatments in $r$ rows and $c$ columns and $b$ replicates or squares, giving $n = brc$ observations, the linear model is:
$$y_{ijk(l)} = \mu + \beta_i + \rho_j + \gamma_k + \tau_l + e_{ijk}$$
for $i = 1,2,\ldots,b$, $j = 1,2,\ldots,r$, $k = 1,2,\ldots,c$ and $l = 1,2,\ldots,t$, where $\beta_i$ is the effect of the $i$th replicate, $\rho_j$ is the effect of the $j$th row, $\gamma_k$ is the effect of the $k$th column and the $ijk(l)$ notation indicates that the $l$th treatment is applied to the unit in row $j$, column $k$ of replicate $i$.
To compute the analysis of variance for a row and column design the mean is computed and subtracted from the observations to give $y'_{ijk(l)} = y_{ijk(l)} - \hat{\mu}$. Since the replicates, rows and columns are orthogonal the estimated effects, ignoring treatment effects, $\hat{\beta}_i$, $\hat{\rho}_j$, $\hat{\gamma}_k$, can be computed using the appropriate means of the $y'_{ijk(l)}$, and the unadjusted sum of squares computed as the appropriate sum of squared totals for the $y'_{ijk(l)}$ divided by the number of units per total. The observations adjusted for replicates, rows and columns can then be computed by subtracting the estimated effects from $y'_{ijk(l)}$ to give $y''_{ijk(l)}$.
In the case of a Latin square design the treatments are orthogonal to replicates, rows and columns and so the treatment effects, $\hat{\tau}_l$, can be estimated as the treatment means of the adjusted observations, $y''_{ijk(l)}$. The treatment sum of squares is computed as the sum of squared treatment totals of the $y''_{ijk(l)}$ divided by the number of times each treatment is replicated. Finally the residuals, and hence the residual sum of squares, are given by $r_{ijk(l)} = y''_{ijk(l)} - \hat{\tau}_l$.
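The Latin square computation just described can be followed step by step in plain MATLAB. This is a minimal sketch for a single replicate, using the data from the Example section; it is not the algorithm used inside nag_anova_rowcol (g04bc), and the names `rowid`, `colid`, `yc`, `ya`, `rho`, `gam`, `tau` and the `ss...` variables are illustrative only.

```matlab
% Data and treatment allocation from the Example (nrep = 1, 5-by-5),
% stored by columns within rows.
y = [6.67 7.15 8.29 8.95 9.62  5.40 4.77 5.40 7.54 6.93 ...
     7.32 8.53 8.50 9.99 9.68  4.92 5.00 7.29 7.85 7.08 ...
     4.88 6.16 7.83 5.38 8.51]';
it = [5 4 1 3 2  2 5 4 1 3  3 2 5 4 1  1 3 2 5 4  4 1 3 2 5]';
nrow = 5; ncol = 5; n = nrow*ncol;

rowid = ceil((1:n)' / ncol);        % row index of each unit
colid = mod((0:n-1)', ncol) + 1;    % column index of each unit

gmean = mean(y);
yc = y - gmean;                     % mean-corrected observations

% Row and column effects are the row and column means of yc.
rho   = accumarray(rowid, yc) / ncol;
gam   = accumarray(colid, yc) / nrow;
ssRow = ncol * sum(rho.^2);
ssCol = nrow * sum(gam.^2);

% Observations adjusted for rows and columns.
ya = yc - rho(rowid) - gam(colid);

% Treatments are orthogonal here, so the treatment effects are the
% treatment means of the adjusted observations.
irep  = accumarray(it, 1);          % replication of each treatment
ttot  = accumarray(it, ya);         % treatment totals
tau   = ttot ./ irep;
ssTrt = sum(ttot.^2 ./ irep);

res   = ya - tau(it);               % residuals
ssRes = sum(res.^2);
tmean = gmean + tau;                % treatment means
```

The quantities `gmean`, `tmean`, `res` and the sums of squares can be compared with the values printed in the Example section.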
For a design which is not orthogonal, for example a lattice square or an incomplete Latin square, the treatment effects adjusted for replicates, rows and columns need to be computed. The adjusted treatment effects are found as the solution to the equations:
$$A\hat{\tau} = \left(R - N_b N_b^T/(rc) - N_r N_r^T/(bc) - N_c N_c^T/(br)\right)\hat{\tau} = q$$
where $q$ is the vector of the treatment totals of the observations adjusted for replicates, rows and columns, $y''_{ijk(l)}$, $R$ is a diagonal matrix with $R_{ll}$ equal to the number of times the $l$th treatment is replicated, and $N_b$ is the $t$ by $b$ incidence matrix, with $N_{l,i}$ equal to the number of times treatment $l$ occurs in replicate $i$, with $N_r$ and $N_c$ being similarly defined for rows and columns. The solution to the equations can be written as:
$$\hat{\tau} = \Omega q$$
where $\Omega$ is a generalized inverse of $A$. The solution is found from the eigenvalue decomposition of $A$. The residuals are first calculated by subtracting the estimated adjusted treatment effects from the adjusted observations to give $r'_{ijk(l)} = y''_{ijk(l)} - \hat{\tau}_l$. However, since only the unadjusted replicate, row and column effects have been removed and they are not orthogonal to treatments, the replicate, row and column means of the $r'_{ijk(l)}$ have to be subtracted to give the correct residuals, $r_{ijk(l)}$, and residual sum of squares.
Given the sums of squares, the mean squares are computed as the sums of squares divided by the degrees of freedom. The degrees of freedom for the unadjusted replicates, rows and columns are $b-1$, $r-1$ and $c-1$ respectively and for the Latin square designs the degrees of freedom for the treatments is $t-1$. In the general case the degrees of freedom for treatments is the rank of the matrix $\Omega$. The $F$-statistic given by the ratio of the treatment mean square to the residual mean square tests the hypothesis:
$$H_0 : \tau_1 = \tau_2 = \cdots = \tau_t = 0.$$
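As a worked illustration of the mean squares and $F$-statistics, the sketch below recomputes them from the sums of squares and degrees of freedom printed in the Example section. It assumes that the significance level reported by the function is the upper-tail probability of the $F$ distribution (an assumption, although it agrees with the printed values), and evaluates it with the base MATLAB function `betainc`. The variable names are illustrative only.

```matlab
% Sums of squares and degrees of freedom as printed in the Example
% (table rows 2 to 5: rows, columns, treatments, residual).
ss = [29.4231; 22.9950; 0.5423; 9.7788];
df = [4; 4; 4; 12];

ms = ss ./ df;                 % mean squares
s2 = ms(4);                    % residual mean square
F  = ms(1:3) / s2;             % F-statistics: rows, columns, treatments

% Upper-tail probability of an F(d1, d2) variate via the regularised
% incomplete beta function: P(F > f) = I_x(d2/2, d1/2), x = d2/(d2 + d1*f).
d1 = df(1:3);
d2 = df(4);
x  = d2 ./ (d2 + d1 .* F);
p  = betainc(x, d2/2, d1/2);
```

A large value of `p(3)` gives no evidence against the hypothesis of equal treatment effects.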
The standard errors for the difference in treatment effects, or treatment means, for Latin square designs, are given by:
$$\mathrm{se}(\hat{\tau}_j - \hat{\tau}_{j^*}) = \sqrt{2s^2/(bt)}$$
where $s^2$ is the residual mean square. In the general case the variances of the treatment effects are given by:
$$\mathrm{Var}(\hat{\tau}) = \Omega s^2$$
from which the appropriate standard errors of the difference between treatment effects or the difference between adjusted means can be calculated.
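To illustrate how a standard error of a difference follows from the variance-covariance matrix, the sketch below uses the standard identity $\mathrm{Var}(\hat{\tau}_i - \hat{\tau}_j) = \mathrm{Var}(\hat{\tau}_i) + \mathrm{Var}(\hat{\tau}_j) - 2\,\mathrm{Cov}(\hat{\tau}_i, \hat{\tau}_j)$ on the matrix `c` printed in the Example section (rounded to four decimal places, so agreement is only approximate). The names `V`, `SE`, `seAll` and `seLatin` are illustrative only.

```matlab
% Matrix c as printed in the Example: upper triangle (with diagonal) holds
% variances and covariances, strictly lower triangle holds standard errors.
c = [0.1304 -0.0326 -0.0326 -0.0326 -0.0326
     0.5709  0.1304 -0.0326 -0.0326 -0.0326
     0.5709  0.5709  0.1304 -0.0326 -0.0326
     0.5709  0.5709  0.5709  0.1304 -0.0326
     0.5709  0.5709  0.5709  0.5709  0.1304];

V  = triu(c) + triu(c, 1)';       % symmetric variance-covariance matrix
SE = tril(c, -1) + tril(c, -1)';  % symmetric matrix of standard errors

% Standard errors of all pairwise differences recomputed from V.
d     = diag(V);
seAll = sqrt(bsxfun(@plus, d, d') - 2*V);

% Latin square shortcut sqrt(2*s2/(b*t)) with b = 1, t = 5 and the
% residual mean square s2 taken from the Example table.
seLatin = sqrt(2 * 0.8149 / (1 * 5));
```

Both routes reproduce the printed value 0.5709 to within the rounding of the displayed output.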
The analysis of a row-column design can be considered as consisting of different strata: the replicate stratum, the rows within replicate and the columns within replicate strata and the units stratum. In the Latin square design all the information on the treatment effects is given at the units stratum. In other designs there may be a loss of information due to the non-orthogonality of treatments and replicates, rows and columns, and information on treatments may be available in higher strata. The efficiency of the estimation at the units stratum is given by the (canonical) efficiency factors. These are the nonzero eigenvalues of the matrix $A$, divided by the number of replicates in the case of equal replication, or by the mean of the number of replicates in the unequally replicated case; see John (1987). If more than one eigenvalue is zero then the design is said to be disconnected and information on some treatment comparisons can only be obtained from higher strata.
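The definition of the efficiency factors can be sketched as follows. The matrix `A` below is an assumption: it is the form $5(I - J/5)$ one would expect for an orthogonal design with five treatments each replicated five times, and is not output of the function. Treating the tolerance as an absolute threshold on the eigenvalues is also an assumption, since the text does not say how `tol` is applied. All variable names are illustrative.

```matlab
t   = 5;                         % number of treatments
rep = 5;                         % equal replication of every treatment
A   = rep*eye(t) - ones(t);      % assumed A for an orthogonal design
tol = 1e-5;                      % the documented default tolerance

ev      = sort(eig(A), 'descend');
nonzero = abs(ev) > tol;         % eigenvalues treated as nonzero

ef    = ev(nonzero) / rep;       % canonical efficiency factors
nZero = sum(~nonzero);           % number of zero eigenvalues

% More than one zero eigenvalue indicates a disconnected design.
disconnected = nZero > 1;
```

For this orthogonal case every efficiency factor equals one, which corresponds to no loss of information at the units stratum.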

References

Cochran W G and Cox G M (1957) Experimental Designs Wiley
Davis O L (1978) The Design and Analysis of Industrial Experiments Longman
John J A (1987) Cyclic Designs Chapman and Hall
John J A and Quenouille M H (1977) Experiments: Design and Analysis Griffin
Searle S R (1971) Linear Models Wiley

Parameters

Compulsory Input Parameters

1:     nrep – int64/int32/nag_int scalar
$b$, the number of replicates.
Constraint: nrep ≥ 1.
2:     nrow – int64/int32/nag_int scalar
$r$, the number of rows per replicate.
Constraint: nrow ≥ 2.
3:     ncol – int64/int32/nag_int scalar
$c$, the number of columns per replicate.
Constraint: ncol ≥ 2.
4:     y(nrep × nrow × ncol) – double array
The $n = brc$ observations ordered by columns within rows within replicates. That is, $y(rc(i-1) + c(j-1) + k)$ contains the observation from the $k$th column of the $j$th row of the $i$th replicate, for $i = 1,2,\ldots,b$, $j = 1,2,\ldots,r$ and $k = 1,2,\ldots,c$.
5:     nt – int64/int32/nag_int scalar
The number of treatments. If only replicates, rows and columns are required in the analysis then set nt = 1.
Constraint: nt ≥ 1.
6:     it(:) – int64/int32/nag_int array
Note: the dimension of the array it must be at least nrep × nrow × ncol if nt > 1, and at least 1 otherwise.
If nt > 1, it(i) indicates which of the nt treatments unit $i$ received, for $i = 1,2,\ldots,n$.
If nt = 1, it is not referenced.
Constraint: if nt ≥ 2, 1 ≤ it(i) ≤ nt, for $i = 1,2,\ldots,n$.
7:     tol – double scalar
The tolerance value used to check for zero eigenvalues of the matrix $\Omega$. If tol = 0.0 a default value of 0.00001 is used.
Constraint: tol ≥ 0.0.
8:     irdf – int64/int32/nag_int scalar
An adjustment to the degrees of freedom for the residual and total.
irdf ≥ 1
The degrees of freedom for the total is set to $n - \mathrm{irdf}$ and the residual degrees of freedom adjusted accordingly.
irdf = 0
The degrees of freedom for the total is set to $n - 1$, as usual.
Constraint: irdf ≥ 0.
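The storage order of y and the constraints on it can be illustrated with a short sketch. It assumes that "columns within rows within replicates" means each row occupies ncol consecutive entries of y, which agrees with the row and column means printed in the Example section. The sizes, the placeholder data and the names `pos`, `Y`, `inRange` and `allUsed` are hypothetical.

```matlab
nrep = 2; nrow = 3; ncol = 4; nt = 4;     % hypothetical sizes
n = nrep*nrow*ncol;
y = (1:n)';                                % placeholder data: y(p) = p

% Position in y of the unit in replicate i, row j, column k.
pos = @(i, j, k) nrow*ncol*(i-1) + ncol*(j-1) + k;

% Equivalent 3-D view: Y(j, k, i) is row j, column k of replicate i.
Y = permute(reshape(y, ncol, nrow, nrep), [2 1 3]);

% Hypothetical allocation: treatments 1..nt in order along every row.
it = repmat((1:nt)', nrep*nrow, 1);

% Checks mirroring the documented constraints on it when nt >= 2:
% every entry lies in 1..nt and every treatment occurs at least once.
inRange = all(it >= 1 & it <= nt);
allUsed = numel(unique(it)) == nt;
```

Running such checks before the call gives an early indication of the conditions reported as ifail = 2.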

Optional Input Parameters

None.

Input Parameters Omitted from the MATLAB Interface

ldtabl ldc wk

Output Parameters

1:     gmean – double scalar
The grand mean, $\hat{\mu}$.
2:     tmean(nt) – double array
If nt ≥ 2, tmean(l) contains the (adjusted) mean for the $l$th treatment, $\hat{\mu}^* + \hat{\tau}_l$, for $l = 1,2,\ldots,t$, where $\hat{\mu}^*$ is the mean of the treatment adjusted observations $y_{ijk(l)} - \hat{\tau}_l$. Otherwise tmean is not referenced.
3:     tabl(ldtabl,5) – double array
ldtabl ≥ 6.
The analysis of variance table. Column 1 contains the degrees of freedom, column 2 the sum of squares, and where appropriate, column 3 the mean squares, column 4 the $F$-statistic and column 5 the significance level of the $F$-statistic. Row 1 is for replicates, row 2 for rows, row 3 for columns, row 4 for treatments (if nt > 1), row 5 for residual and row 6 for total. Mean squares are computed for all but the total row, $F$-statistics and significance are computed for treatments, replicates, rows and columns. Any unfilled cells are set to zero.
4:     c(ldc,nt) – double array
ldc ≥ nt.
The upper triangular part of c contains the variance-covariance matrix of the treatment effects, the strictly lower triangular part contains the standard errors of the difference between two treatment effects (means), i.e., c(i,j) contains the covariance of treatment $i$ and $j$ if $j \geq i$ and the standard error of the difference between treatment $i$ and $j$ if $j < i$, for $i = 1,2,\ldots,t$ and $j = 1,2,\ldots,t$.
5:     irep(nt) – int64/int32/nag_int array
If nt > 1, the treatment replications, $R_{ll}$, for $l = 1,2,\ldots,\mathrm{nt}$. Otherwise irep is not referenced.
6:     rpmean(nrep) – double array
If nrep > 1, rpmean(i) contains the mean for the $i$th replicate, $\hat{\mu} + \hat{\beta}_i$, for $i = 1,2,\ldots,b$. Otherwise rpmean is not referenced.
7:     rmean(nrep × nrow) – double array
rmean(j) contains the mean for the $j$th row, $\hat{\mu} + \hat{\rho}_j$, for $j = 1,2,\ldots,r$.
8:     cmean(nrep × ncol) – double array
cmean(k) contains the mean for the $k$th column, $\hat{\mu} + \hat{\gamma}_k$, for $k = 1,2,\ldots,c$.
9:     r(nrep × nrow × ncol) – double array
The residuals, $r_i$, for $i = 1,2,\ldots,n$.
10:   ef(nt) – double array
If nt ≥ 2, the canonical efficiency factors. Otherwise ef is not referenced.
11:   ifail – int64/int32/nag_int scalar
ifail = 0 unless the function detects an error (see [Error Indicators and Warnings]).

Error Indicators and Warnings

Note: nag_anova_rowcol (g04bc) may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

  ifail = 1
On entry, nrep < 1,
or nrow < 2,
or ncol < 2,
or nt < 1,
or ldtabl < 6,
or ldc < nt,
or tol < 0.0,
or irdf < 0.
  ifail = 2
On entry, it(i) < 1 or it(i) > nt for some $i$ when nt ≥ 2,
or no value of it = $j$ for some $j = 1,2,\ldots,\mathrm{nt}$, when nt ≥ 2.
  ifail = 3
On entry, the values of y are constant.
  ifail = 4
A computed standard error is zero due to rounding errors, or the eigenvalue computation failed to converge. Both are unlikely error exits.
W ifail = 5
The treatments are totally confounded with replicates, rows and columns, so the treatment sum of squares and degrees of freedom are zero. The analysis of variance table is not computed, except for replicate, row, column and total sums of squares and degrees of freedom.
W ifail = 6
The residual degrees of freedom or the residual sum of squares are zero, columns 3, 4 and 5 of the analysis of variance table will not be computed and the matrix of standard errors and covariances, c, will not be scaled by $s$ or $s^2$.
W ifail = 7
The design is disconnected, the standard errors may not be valid. The design may have a nested structure.

Accuracy

The algorithm used in nag_anova_rowcol (g04bc), described in Section [Description], achieves greater accuracy than the traditional algorithms based on the subtraction of sums of squares.

Further Comments

To estimate missing values the Healy and Westmacott procedure or its derivatives may be used, see John and Quenouille (1977). This is an iterative procedure in which estimates of the missing values are adjusted by subtracting the corresponding values of the residuals. The new estimates are then used in the analysis of variance. This process is repeated until convergence. A suitable initial value may be the grand mean. When using this procedure irdf should be set to the number of missing values plus one to obtain the correct degrees of freedom for the residual sum of squares.
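The iteration described above can be sketched for the simplest case. The example below treats one unit of the Latin square from the Example section as missing (a hypothetical choice) and, instead of calling nag_anova_rowcol (g04bc) inside the loop, recomputes the residuals with a hand-written Latin square fit so that the sketch runs in base MATLAB. In practice the residuals r returned by the function would take the place of `res`. All variable names are illustrative.

```matlab
% Data and treatment allocation from the Example (5-by-5 Latin square).
y = [6.67 7.15 8.29 8.95 9.62  5.40 4.77 5.40 7.54 6.93 ...
     7.32 8.53 8.50 9.99 9.68  4.92 5.00 7.29 7.85 7.08 ...
     4.88 6.16 7.83 5.38 8.51]';
it = [5 4 1 3 2  2 5 4 1 3  3 2 5 4 1  1 3 2 5 4  4 1 3 2 5]';
nrow = 5; ncol = 5; n = nrow*ncol;
rowid = ceil((1:n)' / ncol);          % row index of each unit
colid = mod((0:n-1)', ncol) + 1;      % column index of each unit

miss = 24;                            % hypothetical: unit 24 is missing
obs  = true(n, 1);
obs(miss) = false;
y(miss) = mean(y(obs));               % initial value: mean of observed data

for iter = 1:200
    % Latin square fit: residual = y - row mean - column mean
    %                              - treatment mean + 2 * grand mean.
    g   = mean(y);
    rm  = accumarray(rowid, y) / ncol;
    cm  = accumarray(colid, y) / nrow;
    tm  = accumarray(it, y) ./ accumarray(it, 1);
    res = y - rm(rowid) - cm(colid) - tm(it) + 2*g;

    if abs(res(miss)) < 1e-10         % converged: residual is zero
        break;
    end
    y(miss) = y(miss) - res(miss);    % Healy and Westmacott update
end

% Degrees-of-freedom adjustment for the final analysis:
% number of missing values plus one.
irdf = numel(miss) + 1;
```

At convergence the residual of the estimated unit is zero, so the unit contributes nothing to the residual sum of squares, and `irdf` removes the corresponding degree of freedom.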
For analysis of covariance the residuals are obtained from an analysis of variance of both the response variable and the covariates. The residuals from the response variable are then regressed on the residuals from the covariates using, say, nag_correg_linregs_noconst (g02cb) or nag_correg_linregm_fit (g02da). The results from those functions can be used to test for the significance of the covariates. To test the significance of the treatment effects after fitting the covariate, the residual sum of squares from the regression should be compared with the residual sum of squares obtained from the equivalent regression but using the residuals from fitting replicates, rows and columns only.

Example

function nag_anova_rowcol_example
nrep = int64(1);
nrow = int64(5);
ncol = int64(5);
y = [6.67;
     7.15;
     8.29;
     8.95;
     9.62;
     5.4;
     4.77;
     5.4;
     7.54;
     6.93;
     7.32;
     8.53;
     8.5;
     9.99;
     9.68;
     4.92;
     5;
     7.29;
     7.85;
     7.08;
     4.88;
     6.16;
     7.83;
     5.38;
     8.51];
nt = int64(5);
it = [int64(5);4;1;3;2;2;5;4;1;3;3;2;5;4;1;1;3;2;5;4;4;1;3;2;5];
tol = 1e-05;
irdf = int64(0);
[gmean, tmean, table, c, irep, rpmean, rmean, cmean, r, ef, ifail] = ...
    nag_anova_rowcol(nrep, nrow, ncol, y, nt, it, tol, irdf)
 

gmean =

    7.1856


tmean =

    7.3180
    7.2440
    7.2060
    6.9000
    7.2600


table =

         0         0         0         0         0
    4.0000   29.4231    7.3558    9.0266    0.0013
    4.0000   22.9950    5.7487    7.0545    0.0037
    4.0000    0.5423    0.1356    0.1664    0.9514
   12.0000    9.7788    0.8149         0         0
   24.0000   62.7392         0         0         0


c =

    0.1304   -0.0326   -0.0326   -0.0326   -0.0326
    0.5709    0.1304   -0.0326   -0.0326   -0.0326
    0.5709    0.5709    0.1304   -0.0326   -0.0326
    0.5709    0.5709    0.5709    0.1304   -0.0326
    0.5709    0.5709    0.5709    0.5709    0.1304


irep =

                    5
                    5
                    5
                    5
                    5


rpmean =

     0


rmean =

    8.1360
    6.0080
    8.8040
    6.4280
    6.5520


cmean =

    5.8380
    6.3220
    7.4620
    7.9420
    8.3640


r =

   -0.1928
    0.1632
   -0.2548
    0.0372
    0.2472
    0.6812
   -0.4488
   -0.5988
    0.6432
   -0.2768
   -0.1568
    0.5312
   -0.6548
    0.7152
   -0.4348
   -0.2928
   -0.5848
    0.5272
    0.5912
   -0.2408
   -0.0388
    0.3392
    0.9812
   -1.9868
    0.7052


ef =

     1
     1
     1
     1
     1


ifail =

                    0


© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013