

NAG Toolbox: nag_contab_tabulate_stat (g11ba)

Purpose

nag_contab_tabulate_stat (g11ba) computes a table from a set of classification factors using a selected statistic.

Syntax

[table, ncells, ndim, idim, icount, auxt, ifail] = g11ba(stat, update, weight, isf, lfac, ifac, y, wt, table, ncells, icount, auxt, 'n', n, 'nfac', nfac, 'maxt', maxt)
[table, ncells, ndim, idim, icount, auxt, ifail] = nag_contab_tabulate_stat(stat, update, weight, isf, lfac, ifac, y, wt, table, ncells, icount, auxt, 'n', n, 'nfac', nfac, 'maxt', maxt)

Description

A dataset may include both classification variables and general variables. The classification variables, known as factors, take a small number of values known as levels. For example, the factor sex would have the levels male and female. These can be coded as 1 and 2 respectively. Given several factors, a multi-way table can be constructed such that each cell of the table represents one level from each factor. For example, the two factors sex and habitat, habitat having three levels (inner-city, suburban and rural), define the 2 × 3 contingency table
Sex      Habitat
         Inner-city  Suburban  Rural
Male
Female
For each cell statistics can be computed. If a third variable in the dataset was age, then for each cell the average age could be computed:
Sex      Habitat
         Inner-city  Suburban  Rural
Male     25.5        30.3      35.6
Female   23.2        29.1      30.4
That is, the average age for all observations for males living in rural areas is 35.6. Other statistics can also be computed: the number of observations, the total, the variance, the largest value and the smallest value.
nag_contab_tabulate_stat (g11ba) computes a table for one of the selected statistics. The factors have to be coded with levels 1, 2, …. Weights can be used to eliminate values from the calculations, e.g., if they represent ‘missing values’. There is also the facility to update an existing table with the addition of new observations.
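
As an informal illustration of the tabulation described above, the following plain-MATLAB sketch builds a 2 × 3 table of mean ages with accumarray rather than with nag_contab_tabulate_stat (g11ba). The data values and the variable names sex, habitat and age are invented for illustration; they are not the data behind the table above.

```matlab
% Hypothetical data: six observations, two factors coded 1, 2, ...
sex     = [1; 1; 2; 2; 1; 2];        % 1 = male, 2 = female
habitat = [1; 3; 2; 1; 3; 2];        % 1 = inner-city, 2 = suburban, 3 = rural
age     = [24; 36; 29; 22; 34; 31];  % the variable to be tabulated

% One cell per (sex, habitat) combination.
counts = accumarray([sex habitat], 1,   [2 3]);   % observations per cell
totals = accumarray([sex habitat], age, [2 3]);   % total age per cell
means  = totals ./ counts;                        % 0/0 gives NaN for empty cells

disp(counts)
disp(means)
```

Cells that receive no observations come out as NaN in this sketch; how the library routine reports empty cells is not covered here.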

References

John J A and Quenouille M H (1977) Experiments: Design and Analysis Griffin
Kendall M G and Stuart A (1969) The Advanced Theory of Statistics (Volume 1) (3rd Edition) Griffin
West D H D (1979) Updating mean and variance estimates: An improved method Comm. ACM 22 532–535

Parameters

Compulsory Input Parameters

1:     stat – string (length ≥ 1)
Indicates which statistic is to be computed for the table cells.
stat = 'N'
The number of observations for each cell.
stat = 'T'
The total for the variable in y for each cell.
stat = 'A'
The average (mean) for the variable in y for each cell.
stat = 'V'
The variance for the variable in y for each cell.
stat = 'L'
The largest value for the variable in y for each cell.
stat = 'S'
The smallest value for the variable in y for each cell.
Constraint: stat = 'N', 'T', 'A', 'V', 'L' or 'S'.
2:     update – string (length ≥ 1)
Indicates if an existing table is to be updated with further observations.
update = 'I'
The table cells will be initialized to zero before tabulations take place.
update = 'U'
The table input in table will be updated. The parameters ncells, table, icount and auxt must remain unchanged from the previous call to nag_contab_tabulate_stat (g11ba).
Constraint: update = 'I' or 'U'.
3:     weight – string (length ≥ 1)
Indicates if weights are to be used.
weight = 'U'
Weights are not used and unit weights are assumed.
weight = 'W' or 'V'
Weights are used and must be supplied in wt. The only difference between weight = 'W' and weight = 'V' is in how the variance is computed.
If weight = 'W', the divisor for the variance is the sum of the weights minus one; if weight = 'V', the divisor is the number of observations with nonzero weights minus one. The former is useful if the weights represent the frequency of the observed values.
If stat = 'T' or 'A', the weighted total or mean is computed respectively.
If stat = 'N', 'L' or 'S', the only effect of weights is to eliminate values with zero weights from the computations.
Constraint: weight = 'U', 'V' or 'W'.
4:     isf(nfac) – int64/int32/nag_int array
nfac, the dimension of the array, must satisfy the constraint nfac ≥ 1.
Indicates which factors in ifac are to be used in the tabulation.
If isf(i) > 0, the ith factor in ifac is included in the tabulation.
Note that if isf(i) ≤ 0, for i = 1, 2, …, nfac, then the statistic for the whole sample is calculated and returned in a 1 × 1 table.
5:     lfac(nfac) – int64/int32/nag_int array
nfac, the dimension of the array, must satisfy the constraint nfac ≥ 1.
The number of levels of the classifying factors in ifac.
Constraint: if isf(i) > 0, lfac(i) ≥ 2, for i = 1, 2, …, nfac.
6:     ifac(ldf,nfac) – int64/int32/nag_int array
ldf, the first dimension of the array, must satisfy the constraint ldf ≥ n.
The nfac coded classification factors for the n observations.
Constraint: 1 ≤ ifac(i,j) ≤ lfac(j), for i = 1, 2, …, n and j = 1, 2, …, nfac.
7:     y(n) – double array
n, the dimension of the array, must satisfy the constraint n ≥ 2.
The variable to be tabulated. If stat = 'N', y is not referenced.
8:     wt(:) – double array
Note: the dimension of the array wt must be at least n if weight = 'W' or 'V', and at least 1 otherwise.
If weight = 'W' or 'V', wt must contain the n weights. Otherwise wt is not referenced.
Constraint: if weight = 'W' or 'V', wt(i) ≥ 0.0, for i = 1, 2, …, n.
9:     table(maxt) – double array
maxt, the dimension of the array, must satisfy the constraint maxt ≥ product of the levels of the factors included in the tabulation.
If update = 'U', table must be unchanged from the previous call to nag_contab_tabulate_stat (g11ba); otherwise table need not be set.
10:   ncells – int64/int32/nag_int scalar
If update = 'U', ncells must be unchanged from the previous call to nag_contab_tabulate_stat (g11ba); otherwise ncells need not be set.
11:   icount(maxt) – int64/int32/nag_int array
maxt, the dimension of the array, must satisfy the constraint maxt ≥ product of the levels of the factors included in the tabulation.
If update = 'U', icount must be unchanged from the previous call to nag_contab_tabulate_stat (g11ba); otherwise icount need not be set.
12:   auxt(:) – double array
Note: the dimension of the array auxt must be at least ncells if stat = 'A', 2 × ncells if stat = 'V', and at least 1 otherwise.
If update = 'U', auxt must be unchanged from the previous call to nag_contab_tabulate_stat (g11ba); otherwise auxt need not be set.
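
The two variance divisors described under weight can be compared on a small case. The sketch below is plain MATLAB and does not call the library; it assumes the numerator is the usual weighted sum of squares about the weighted mean, and the data values are invented for illustration.

```matlab
% Hypothetical data for a single cell.
y  = [1; 3; 5; 9];
wt = [2; 1; 1; 0];                 % the zero weight drops the last value

used  = wt > 0;                    % observations that take part
sumw  = sum(wt(used));             % sum of the weights
wmean = sum(wt(used) .* y(used)) / sumw;
ss    = sum(wt(used) .* (y(used) - wmean).^2);

varW = ss / (sumw - 1);            % divisor as described for weight = 'W'
varV = ss / (nnz(used) - 1);       % divisor as described for weight = 'V'

fprintf('W: %.4f   V: %.4f\n', varW, varV);
```

With these weights read as frequencies, the 'W' divisor gives the same value as the ordinary sample variance of the expanded data [1; 1; 3; 5], which is the situation the text describes as the useful case for 'W'.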

Optional Input Parameters

1:     n – int64/int32/nag_int scalar
Default: The dimension of the array y and the first dimension of the array ifac. (An error is raised if these dimensions are not equal.)
The number of observations.
Constraint: n ≥ 2.
2:     nfac – int64/int32/nag_int scalar
Default: The dimension of the arrays isf, lfac and the second dimension of the array ifac. (An error is raised if these dimensions are not equal.)
The number of classifying factors in ifac.
Constraint: nfac ≥ 1.
3:     maxt – int64/int32/nag_int scalar
Default: The dimension of the arrays table, icount. (An error is raised if these dimensions are not equal.)
The maximum size of the table to be computed.
Constraint: maxt ≥ product of the levels of the factors included in the tabulation.

Input Parameters Omitted from the MATLAB Interface

ldf, iwk

Output Parameters

1:     table(maxt) – double array
The computed table. The ncells cells of the table are stored so that for any two factors the index relating to the factor referred to later in lfac and ifac changes faster. For further details see Section [Further Comments].
2:     ncells – int64/int32/nag_int scalar
The number of cells in the table.
3:     ndim – int64/int32/nag_int scalar
The number of factors defining the table.
4:     idim(nfac) – int64/int32/nag_int array
The first ndim elements contain the number of levels for the factors defining the table.
5:     icount(maxt) – int64/int32/nag_int array
A table containing the number of observations contributing to each cell of the table, stored identically to table. Note that if stat = 'N' this is the same as is returned in table.
6:     auxt(:) – double array
Note: the dimension of the array auxt must be at least ncells if stat = 'A', 2 × ncells if stat = 'V', and at least 1 otherwise.
If stat = 'A' or 'V', the first ncells values hold the table containing the sum of the weights for the observations contributing to each cell, stored identically to table.
If stat = 'V', the second set of ncells values holds the table of cell means. Otherwise auxt is not referenced.
7:     ifail – int64/int32/nag_int scalar
ifail = 0 unless the function detects an error (see [Error Indicators and Warnings]).
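
One way to see why the sum of weights in auxt has to be carried between calls when update = 'U' is that a cell mean can be updated from the old mean and the old sum of weights alone. The plain-MATLAB sketch below shows this for a single cell with unit weights; it illustrates the idea only and is not a description of the routine's internal code. The variable names and the split of the data into two batches are assumptions.

```matlab
% First batch of observations for one cell (unit weights assumed).
y1      = [274; 350; 82];
meanOld = mean(y1);               % what table would hold for this cell
sumwOld = numel(y1);              % what auxt would hold for this cell

% A second batch arrives later.
y2 = [325; 397];
w2 = ones(size(y2));

% Update using only the stored mean and the stored sum of weights.
sumwNew = sumwOld + sum(w2);
meanNew = meanOld + sum(w2 .* (y2 - meanOld)) / sumwNew;

% The result agrees with pooling all observations in a single pass.
fprintf('updated: %.4f   pooled: %.4f\n', meanNew, mean([y1; y2]));
```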

Error Indicators and Warnings

Errors or warnings detected by the function:
  ifail = 1
On entry, n < 2,
or nfac < 1,
or ldf < n,
or update ≠ 'I' or 'U',
or weight ≠ 'U', 'W' or 'V',
or stat ≠ 'N', 'T', 'A', 'V', 'L' or 'S'.
  ifail = 2
On entry, isf(i) > 0 and lfac(i) < 2, for some i,
or ifac(i,j) < 1, for some i, j,
or ifac(i,j) > lfac(j), for some i, j,
or maxt is too small,
or weight = 'W' or 'V' and wt(i) < 0.0, for some i.
  ifail = 3
stat = 'V' and the divisor for the variance is ≤ 0.0.
  ifail = 4
update = 'U' and at least one of ncells, table, auxt or icount has been changed since the previous call to nag_contab_tabulate_stat (g11ba).

Accuracy

Only applicable when stat = 'V'. In this case a one-pass algorithm is used as described by West (1979).
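
For orientation, a one-pass weighted mean and variance update in the style of the recurrences given by West (1979) can be sketched in plain MATLAB as follows. This is an illustrative sketch only: the variable names and data are invented, and it is not claimed to reproduce the library's code line for line.

```matlab
% Hypothetical data for a single cell.
y  = [1; 3; 5; 9];
wt = [2; 1; 1; 0];

sumw = 0;                          % running sum of weights
m    = 0;                          % running weighted mean
t    = 0;                          % running weighted sum of squares about the mean

for i = 1:numel(y)
    if wt(i) > 0                   % zero weights are skipped
        q    = y(i) - m;           % deviation from the current mean
        tmp  = sumw + wt(i);
        r    = q * wt(i) / tmp;
        m    = m + r;              % updated mean
        t    = t + r * sumw * q;   % updated sum of squares
        sumw = tmp;
    end
end

variance = t / (sumw - 1);         % divisor as described for weight = 'W'
fprintf('mean %.4f   variance %.4f\n', m, variance);
```

Each observation is visited once and no second pass over the data is needed, which is what makes an update of an existing table with new observations possible.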

Further Comments

The tables created by nag_contab_tabulate_stat (g11ba) and stored in table, icount and, depending on stat, also in auxt are stored in the following way. Let there be $n$ factors defining the table with factor $k$ having $l_k$ levels, then the cell defined by the levels $i_1, i_2, \ldots, i_n$ of the factors is stored in the $m$th cell given by
$$m = 1 + \sum_{k=1}^{n} \left[ (i_k - 1) c_k \right],$$
where $c_j = \prod_{k=j+1}^{n} l_k$, for $j = 1, 2, \ldots, n-1$ and $c_n = 1$.
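
The storage rule can be checked with a few lines of plain MATLAB. The sketch below uses the level counts 3 and 6 that define the table in the Example section; the variable names l, lev, c and m are chosen here for illustration.

```matlab
l   = [3 6];                       % levels of the factors defining the table
lev = [2 4];                       % cell of interest: level 2 of the first
                                   % factor, level 4 of the second

n = numel(l);
c = ones(1, n);                    % c(n) = 1
for j = n-1:-1:1
    c(j) = c(j+1) * l(j+1);        % c(j) = l(j+1) * ... * l(n)
end

m = 1 + sum((lev - 1) .* c);       % position of the cell in table
fprintf('cell (%d,%d) is stored at position %d\n', lev(1), lev(2), m);
```

Because the later factor varies fastest, the same position is given by sub2ind(fliplr(l), lev(2), lev(1)), and for a two-factor table reshape(table, fliplr(l)).' lays the cells out with the first factor indexing the rows.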

Example

function nag_contab_tabulate_stat_example
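stat = 'A';                  % tabulate cell means
update = 'I';                % start a new table
weight = 'U';                % unit weights
isf = [int64(0);1;1];        % exclude factor 1; tabulate over factors 2 and 3
lfac = [int64(3);3;6];       % number of levels of each factor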
ifac = [int64(1),1,1; ...
             1,2,1; ...
             1,3,1; ...
             1,1,2; ...
             1,2,2; ...
             1,3,2; ...
             1,1,3; ...
             1,2,3; ...
             1,3,3; ...
             1,1,4; ...
             1,2,4; ...
             1,3,4; ...
             1,1,5; ...
             1,2,5; ...
             1,3,5; ...
             1,1,6; ...
             1,2,6; ...
             1,3,6; ...
             2,1,1; ...
             2,2,1; ...
             2,3,1; ...
             2,1,2; ...
             2,2,2; ...
             2,3,2; ...
             2,1,3; ...
             2,2,3; ...
             2,3,3; ...
             2,1,4; ...
             2,2,4; ...
             2,3,4; ...
             2,1,5; ...
             2,2,5; ...
             2,3,5; ...
             2,1,6; ...
             2,2,6; ...
             2,3,6; ...
             3,1,1; ...
             3,2,1; ...
             3,3,1; ...
             3,1,2; ...
             3,2,2; ...
             3,3,2; ...
             3,1,3; ...
             3,2,3; ...
             3,3,3; ...
             3,1,4; ...
             3,2,4; ...
             3,3,4; ...
             3,1,5; ...
             3,2,5; ...
             3,3,5; ...
             3,1,6; ...
             3,2,6; ...
             3,3,6];
y = [274;
     361;
     253;
     325;
     317;
     339;
     326;
     402;
     336;
     379;
     345;
     361;
     352;
     334;
     318;
     339;
     393;
     358;
     350;
     340;
     203;
     397;
     356;
     298;
     382;
     376;
     355;
     418;
     387;
     379;
     432;
     339;
     293;
     322;
     417;
     342;
     82;
     297;
     133;
     306;
     352;
     361;
     220;
     333;
     270;
     388;
     379;
     274;
     336;
     307;
     266;
     389;
     333;
     353];
wt = [];                     % not referenced because weight = 'U'
% With update = 'I' the arrays below need not be set on entry; they only
% have to be large enough (3*6 = 18 cells for table and icount).
table = zeros(18,1);
ncells = int64(0);
icount = zeros(18,1,'int64');
auxt = zeros(36,1);
[tableOut, ncellsOut, ndim, idim, icountOut, auxtOut, ifail] = ...
     nag_contab_tabulate_stat(stat, update, weight, isf, lfac, ifac, y, wt, ...
     table, ncells, icount, auxt)
 

tableOut =

  235.3333
  342.6667
  309.3333
  395.0000
  373.3333
  350.0000
  332.6667
  341.6667
  370.3333
  370.3333
  326.6667
  381.0000
  196.3333
  332.6667
  320.3333
  338.0000
  292.3333
  351.0000


ncellsOut =

                   18


ndim =

                    2


idim =

                    3
                    6
                    0


icountOut =

                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3
                    3


auxtOut =

     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     3
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0


ifail =

                    0



© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013