NAG Library Routine Document
g11baf (tabulate_stat)
1
Purpose
g11baf computes a table from a set of classification factors using a selected statistic.
2
Specification
Fortran Interface
Subroutine g11baf (stat, update, weight, n, nfac, isf, lfac, ifac, ldf, y, wt, table, maxt, ncells, ndim, idim, icount, auxt, iwk, ifail)
Integer, Intent (In) :: n, nfac, isf(nfac), lfac(nfac), ifac(ldf,nfac), ldf, maxt
Integer, Intent (Inout) :: ncells, icount(maxt), ifail
Integer, Intent (Out) :: ndim, idim(nfac), iwk(2*nfac)
Real (Kind=nag_wp), Intent (In) :: y(n), wt(*)
Real (Kind=nag_wp), Intent (Inout) :: table(maxt), auxt(*)
Character (1), Intent (In) :: stat, update, weight

C Header Interface
#include <nagmk26.h>
void g11baf_ (const char *stat, const char *update, const char *weight, const Integer *n, const Integer *nfac, const Integer isf[], const Integer lfac[], const Integer ifac[], const Integer *ldf, const double y[], const double wt[], double table[], const Integer *maxt, Integer *ncells, Integer *ndim, Integer idim[], Integer icount[], double auxt[], Integer iwk[], Integer *ifail, const Charlen length_stat, const Charlen length_update, const Charlen length_weight)

3
Description
A dataset may include both classification variables and general variables. The classification variables, known as factors, take a small number of values known as levels. For example, the factor sex would have the levels male and female. These can be coded as $1$ and $2$ respectively. Given several factors, a multiway table can be constructed such that each cell of the table represents one level from each factor. For example, the two factors sex and habitat, habitat having three levels (inner city, suburban and rural), define the $2\times 3$ contingency table

                  Habitat
Sex       Inner city   Suburban   Rural
Male
Female

For each cell, statistics can be computed. If a third variable in the dataset was age, then for each cell the average age could be computed:

                  Habitat
Sex       Inner city   Suburban   Rural
Male         25.5        30.3     35.6
Female       23.2        29.1     30.4

That is, the average age over all observations for males living in rural areas is $35.6$. Other statistics can also be computed: the number of observations, the total, the variance, the largest value and the smallest value.
g11baf computes a table for one selected statistic. The factors have to be coded with levels $1,2,\dots$. Weights can be used to eliminate values from the calculations, e.g., if they represent ‘missing values’. There is also the facility to update an existing table with the addition of new observations.
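As an illustration of the tabulation described above, the following C sketch computes weighted cell means for a two-factor table and skips observations whose weight is zero. The function name cell_means and its argument list are hypothetical and are not part of the NAG Library; g11baf itself handles any number of factors and the other statistics.

```c
#include <stddef.h>

/* Hypothetical helper, not a NAG routine: weighted mean of y for each cell
   of a two-factor table. Factor codes are 1-based; cells are stored with
   the second factor changing faster. */
void cell_means(size_t n, const int f1[], const int f2[], int levels2,
                const double y[], const double wt[],
                int ncells, double mean[], double sumw[])
{
    for (int c = 0; c < ncells; ++c) {
        mean[c] = 0.0;
        sumw[c] = 0.0;
    }
    for (size_t i = 0; i < n; ++i) {
        if (wt[i] <= 0.0) {
            continue;                 /* zero weight: observation is dropped */
        }
        int c = (f1[i] - 1) * levels2 + (f2[i] - 1);
        mean[c] += wt[i] * y[i];      /* accumulate the weighted total first */
        sumw[c] += wt[i];
    }
    for (int c = 0; c < ncells; ++c) {
        if (sumw[c] > 0.0) {
            mean[c] /= sumw[c];       /* empty cells are left at zero */
        }
    }
}
```

With sex as the first factor and habitat as the second, the mean for males in rural areas would be found in the third cell (zero-based position 2).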
4
References
John J A and Quenouille M H (1977) Experiments: Design and Analysis Griffin
Kendall M G and Stuart A (1969) The Advanced Theory of Statistics (Volume 1) (3rd Edition) Griffin
West D H D (1979) Updating mean and variance estimates: An improved method Comm. ACM 22 532–535
5
Arguments
 1: $\mathbf{stat}$ – Character(1), Input

On entry: indicates which statistic is to be computed for the table cells.
 ${\mathbf{stat}}=\text{'N'}$
 The number of observations for each cell.
 ${\mathbf{stat}}=\text{'T'}$
 The total for the variable in y for each cell.
 ${\mathbf{stat}}=\text{'A'}$
 The average (mean) for the variable in y for each cell.
 ${\mathbf{stat}}=\text{'V'}$
 The variance for the variable in y for each cell.
 ${\mathbf{stat}}=\text{'L'}$
 The largest value for the variable in y for each cell.
 ${\mathbf{stat}}=\text{'S'}$
 The smallest value for the variable in y for each cell.
Constraint: ${\mathbf{stat}}=\text{'N'}$, $\text{'T'}$, $\text{'A'}$, $\text{'V'}$, $\text{'L'}$ or $\text{'S'}$.
 2: $\mathbf{update}$ – Character(1), Input

On entry: indicates if an existing table is to be updated by further observations.
 ${\mathbf{update}}=\text{'I'}$
 The table cells will be initialized to zero before tabulations take place.
 ${\mathbf{update}}=\text{'U'}$
 The table input in table will be updated. The arguments ncells, table, icount and auxt must remain unchanged from the previous call to g11baf.
Constraint: ${\mathbf{update}}=\text{'I'}$ or $\text{'U'}$.
 3: $\mathbf{weight}$ – Character(1), Input

On entry: indicates if weights are to be used.
 ${\mathbf{weight}}=\text{'U'}$
 Weights are not used and unit weights are assumed.
 ${\mathbf{weight}}=\text{'W'}$ or $\text{'V'}$
 Weights are used and must be supplied in wt. The only difference between ${\mathbf{weight}}=\text{'W'}$ and ${\mathbf{weight}}=\text{'V'}$ arises when the variance is computed. If ${\mathbf{weight}}=\text{'W'}$, the divisor for the variance is the sum of the weights minus one, and if ${\mathbf{weight}}=\text{'V'}$, the divisor is the number of observations with nonzero weights minus one. The former is useful if the weights represent the frequency of the observed values.
If ${\mathbf{stat}}=\text{'T'}$ or $\text{'A'}$, the weighted total or mean is computed respectively.
If ${\mathbf{stat}}=\text{'N'}$, $\text{'L'}$ or $\text{'S'}$, the only effect of weights is to eliminate values with zero weights from the computations.
Constraint: ${\mathbf{weight}}=\text{'U'}$, $\text{'V'}$ or $\text{'W'}$.
 4: $\mathbf{n}$ – Integer, Input

On entry: the number of observations.
Constraint: ${\mathbf{n}}\ge 2$.
 5: $\mathbf{nfac}$ – Integer, Input

On entry: the number of classifying factors in ifac.
Constraint: ${\mathbf{nfac}}\ge 1$.
 6: $\mathbf{isf}\left({\mathbf{nfac}}\right)$ – Integer array, Input

On entry: indicates which factors in ifac are to be used in the tabulation.
If ${\mathbf{isf}}\left(i\right)>0$ the $i$th factor in ifac is included in the tabulation.
Note that if ${\mathbf{isf}}\left(\mathit{i}\right)\le 0$ for all $\mathit{i}=1,2,\dots ,{\mathbf{nfac}}$, then the statistic for the whole sample is calculated and returned in a $1\times 1$ table.
 7: $\mathbf{lfac}\left({\mathbf{nfac}}\right)$ – Integer array, Input

On entry: the number of levels of the classifying factors in ifac.
Constraint: if ${\mathbf{isf}}\left(\mathit{i}\right)>0$, ${\mathbf{lfac}}\left(\mathit{i}\right)\ge 2$, for $\mathit{i}=1,2,\dots ,{\mathbf{nfac}}$.
 8: $\mathbf{ifac}\left({\mathbf{ldf}},{\mathbf{nfac}}\right)$ – Integer array, Input

On entry: the nfac coded classification factors for the n observations.
Constraint: $1\le {\mathbf{ifac}}\left(\mathit{i},\mathit{j}\right)\le {\mathbf{lfac}}\left(\mathit{j}\right)$, for $\mathit{i}=1,2,\dots ,{\mathbf{n}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{nfac}}$.
 9: $\mathbf{ldf}$ – Integer, Input

On entry: the first dimension of the array ifac as declared in the (sub)program from which g11baf is called.
Constraint: ${\mathbf{ldf}}\ge {\mathbf{n}}$.
 10: $\mathbf{y}\left({\mathbf{n}}\right)$ – Real (Kind=nag_wp) array, Input

On entry: the variable to be tabulated. If ${\mathbf{stat}}=\text{'N'}$, y is not referenced.
 11: $\mathbf{wt}\left(*\right)$ – Real (Kind=nag_wp) array, Input

Note: the dimension of the array wt must be at least ${\mathbf{n}}$ if ${\mathbf{weight}}=\text{'W'}$ or $\text{'V'}$, and at least $1$ otherwise.
On entry: if ${\mathbf{weight}}=\text{'W'}$ or $\text{'V'}$, wt must contain the n weights. Otherwise wt is not referenced.
Constraint: if ${\mathbf{weight}}=\text{'W'}$ or $\text{'V'}$, ${\mathbf{wt}}\left(\mathit{i}\right)\ge 0.0$, for $\mathit{i}=1,2,\dots ,{\mathbf{n}}$.
 12: $\mathbf{table}\left({\mathbf{maxt}}\right)$ – Real (Kind=nag_wp) array, Input/Output

On entry: if ${\mathbf{update}}=\text{'U'}$, table must be unchanged from the previous call to g11baf, otherwise table need not be set.
On exit: the computed table. The ncells cells of the table are stored so that for any two factors the index relating to the factor referred to later in lfac and ifac changes faster. For further details see Section 9.
 13: $\mathbf{maxt}$ – Integer, Input

On entry: the maximum size of the table to be computed.
Constraint: ${\mathbf{maxt}}\ge$ the product of the levels of the factors included in the tabulation.
 14: $\mathbf{ncells}$ – Integer, Input/Output

On entry: if ${\mathbf{update}}=\text{'U'}$, ncells must be unchanged from the previous call to g11baf, otherwise ncells need not be set.
On exit: the number of cells in the table.
 15: $\mathbf{ndim}$ – Integer, Output

On exit: the number of factors defining the table.
 16: $\mathbf{idim}\left({\mathbf{nfac}}\right)$ – Integer array, Output

On exit: the first ndim elements contain the number of levels for the factors defining the table.
 17: $\mathbf{icount}\left({\mathbf{maxt}}\right)$ – Integer array, Input/Output

On entry: if ${\mathbf{update}}=\text{'U'}$, icount must be unchanged from the previous call to g11baf, otherwise icount need not be set.
On exit: a table containing the number of observations contributing to each cell of the table, stored identically to table. Note that if ${\mathbf{stat}}=\text{'N'}$ this is the same as is returned in table.
 18: $\mathbf{auxt}\left(*\right)$ – Real (Kind=nag_wp) array, Input/Output

Note: the dimension of the array auxt must be at least ${\mathbf{ncells}}$ if ${\mathbf{stat}}=\text{'A'}$, $2\times {\mathbf{ncells}}$ if ${\mathbf{stat}}=\text{'V'}$, and at least $1$ otherwise.
On entry: if ${\mathbf{update}}=\text{'U'}$, auxt must be unchanged from the previous call to g11baf, otherwise auxt need not be set.
On exit: if ${\mathbf{stat}}=\text{'A'}$ or $\text{'V'}$, the first ncells values hold the table containing the sum of the weights for the observations contributing to each cell, stored identically to table.
If ${\mathbf{stat}}=\text{'V'}$, the second set of ncells values hold the table of cell means. Otherwise auxt is not referenced.
 19: $\mathbf{iwk}\left(2\times {\mathbf{nfac}}\right)$ – Integer array, Workspace

 20: $\mathbf{ifail}$ – Integer, Input/Output

On entry: ifail must be set to $0$, $-1$ or $1$. If you are unfamiliar with this argument you should refer to Section 3.4 in How to Use the NAG Library and its Documentation for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is $0$. When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).
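The constraint on maxt can be checked before the call. The following C sketch computes the smallest admissible value from isf and lfac; the function name required_maxt is hypothetical and is not part of the NAG Library. When no factor is selected the product is empty and the result is $1$, matching the $1\times 1$ table described under isf.

```c
/* Hypothetical helper, not a NAG routine: product of the numbers of levels
   of the factors selected by isf, i.e. the smallest admissible maxt. */
long required_maxt(int nfac, const int isf[], const int lfac[])
{
    long cells = 1;                   /* no factor selected: one cell */
    for (int i = 0; i < nfac; ++i) {
        if (isf[i] > 0) {
            cells *= lfac[i];         /* factor i contributes lfac(i) levels */
        }
    }
    return cells;
}
```

For three factors with $3$, $3$ and $6$ levels, of which only the last two are selected, the sketch gives $18$.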
6
Error Indicators and Warnings
If on entry ${\mathbf{ifail}}=0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
 ${\mathbf{ifail}}=1$

On entry,  ${\mathbf{n}}<2$, 
or  ${\mathbf{nfac}}<1$, 
or  ${\mathbf{ldf}}<{\mathbf{n}}$, 
or  ${\mathbf{update}}\ne \text{'I'}$ or $\text{'U'}$, 
or  ${\mathbf{weight}}\ne \text{'U'}$, $\text{'W'}$ or $\text{'V'}$, 
or  ${\mathbf{stat}}\ne \text{'N'}$, $\text{'T'}$, $\text{'A'}$, $\text{'V'}$, $\text{'L'}$ or $\text{'S'}$. 
 ${\mathbf{ifail}}=2$

On entry,  ${\mathbf{isf}}\left(i\right)>0$ and ${\mathbf{lfac}}\left(i\right)<2$, for some $i$, 
or  ${\mathbf{ifac}}\left(i,j\right)<1$, for some $i,j$, 
or  ${\mathbf{ifac}}\left(i,j\right)>{\mathbf{lfac}}\left(j\right)$ for some $i,j$, 
or  maxt is too small, 
or  ${\mathbf{weight}}=\text{'W'}$ or $\text{'V'}$ and ${\mathbf{wt}}\left(i\right)<0.0$, for some $i$. 
 ${\mathbf{ifail}}=3$

${\mathbf{stat}}=\text{'V'}$ and the divisor for the variance is $\le 0.0$.
 ${\mathbf{ifail}}=4$

${\mathbf{update}}=\text{'U'}$ and at least one of ncells, table, auxt or icount has been changed since the previous call to g11baf.
 ${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 3.9 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See Section 3.8 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See Section 3.7 in How to Use the NAG Library and its Documentation for further information.
7
Accuracy
Only applicable when ${\mathbf{stat}}=\text{'V'}$. In this case a one-pass algorithm is used as described by West (1979).
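A one-pass update of this kind can be sketched in C as follows. This is an illustrative version of the weighted updating recurrence commonly attributed to West (1979), not the routine's actual source; the type cell_stats and the functions cell_update and cell_variance are hypothetical names. The running sum of weights and the running mean play the role of the quantities held per cell in auxt when ${\mathbf{stat}}=\text{'V'}$.

```c
/* Hypothetical per-cell accumulator, not a NAG data structure. */
typedef struct {
    double sumw;   /* running sum of weights */
    double mean;   /* running weighted mean */
    double ssd;    /* running weighted sum of squared deviations */
    long   count;  /* number of observations with nonzero weight */
} cell_stats;

/* Adds one observation x with weight w without revisiting earlier data. */
void cell_update(cell_stats *s, double x, double w)
{
    if (w <= 0.0) {
        return;                       /* zero weight: observation is ignored */
    }
    double q = x - s->mean;           /* deviation from the old mean */
    double total = s->sumw + w;
    double r = q * w / total;         /* shift applied to the mean */
    s->mean += r;
    s->ssd += s->sumw * q * r;
    s->sumw = total;
    s->count += 1;
}

/* Divisor: count - 1 for weight 'V', otherwise sum of weights - 1.
   Returns -1.0 when the divisor is not positive (compare ifail = 3). */
double cell_variance(const cell_stats *s, char weight)
{
    double divisor = (weight == 'V') ? (double)s->count - 1.0
                                     : s->sumw - 1.0;
    return (divisor > 0.0) ? s->ssd / divisor : -1.0;
}
```

Because each call only touches the stored sums, the same accumulator can be carried over to a later batch of observations, which is the idea behind ${\mathbf{update}}=\text{'U'}$. With unit weights the two divisors coincide.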
8
Parallelism and Performance
g11baf is not threaded in any implementation.
9
Further Comments
The tables created by g11baf and stored in table, icount and, depending on stat, also in auxt are stored in the following way. Let there be $n$ factors defining the table with factor $k$ having ${l}_{k}$ levels, then the cell defined by the levels ${i}_{1}$, ${i}_{2},\dots ,{i}_{n}$ of the factors is stored in the $m$th cell given by
$$m=1+\sum _{j=1}^{n}\left[\left({i}_{j}-1\right){c}_{j}\right],$$
where ${c}_{j}={\displaystyle \prod _{k=j+1}^{n}}{l}_{k}$, for $j=1,2,\dots ,n-1$ and ${c}_{n}=1$.
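The storage rule can be sketched in C as follows, assuming the cell position is $m=1+\sum _{j=1}^{n}\left({i}_{j}-1\right){c}_{j}$, which is what the definition of ${c}_{j}$ and the statement that the later factor changes faster imply. The function name cell_index is hypothetical and is not part of the NAG Library.

```c
/* Hypothetical helper, not a NAG routine: 1-based position m of the cell
   with 1-based levels i[0..n-1], given the numbers of levels l[0..n-1]. */
long cell_index(int n, const int l[], const int i[])
{
    long m = 1;
    long c = 1;                       /* c_n = 1 for the last factor */
    for (int j = n - 1; j >= 0; --j) {
        m += (long)(i[j] - 1) * c;
        c *= l[j];                    /* stride for the next earlier factor */
    }
    return m;
}
```

For a $3\times 6$ table, level $2$ of the first factor and level $4$ of the second give $m=1+1\times 6+3\times 1=10$.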
10
Example
The data, given by John and Quenouille (1977), is for a $3\times 6$ factorial experiment in $3$ blocks of $18$ units. The data is input in the order: blocks, factor with $3$ levels, factor with $6$ levels, yield. The $3\times 6$ table of treatment means for yield over blocks is computed and printed.
10.1
Program Text
Program Text (g11bafe.f90)
10.2
Program Data
Program Data (g11bafe.d)
10.3
Program Results
Program Results (g11bafe.r)