NAG Toolbox: nag_stat_summary_onevar (g01at)

Purpose

nag_stat_summary_onevar (g01at) calculates the mean, standard deviation, coefficients of skewness and kurtosis, and the maximum and minimum values for a set of (optionally weighted) data. The input data can be split into arbitrarily sized blocks, allowing large datasets to be summarised.

Syntax

[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = g01at(x, 'nb', nb, 'wt', wt, 'pn', pn, 'rcomm', rcomm)
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = nag_stat_summary_onevar(x, 'nb', nb, 'wt', wt, 'pn', pn, 'rcomm', rcomm)

Description

Given a sample of $n$ observations, denoted by $x = \{x_i : i = 1,2,\ldots,n\}$, and a set of non-negative weights, $w = \{w_i : i = 1,2,\ldots,n\}$, nag_stat_summary_onevar (g01at) calculates a number of quantities:
(a) Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{W}, \quad \text{where} \quad W = \sum_{i=1}^{n} w_i.$$
(b) Standard deviation
$$s_2 = \sqrt{\frac{\sum_{i=1}^{n} w_i (x_i - \bar{x})^2}{d}}, \quad \text{where} \quad d = W - \frac{\sum_{i=1}^{n} w_i^2}{W}.$$
(c) Coefficient of skewness
$$s_3 = \frac{\sum_{i=1}^{n} w_i (x_i - \bar{x})^3}{d \, s_2^3}.$$
(d) Coefficient of kurtosis
$$s_4 = \frac{\sum_{i=1}^{n} w_i (x_i - \bar{x})^4}{d \, s_2^4} - 3.$$
(e) Maximum and minimum elements, with $w_i \neq 0$.
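As a cross-check of definitions (a) to (e), the following Python sketch evaluates them directly in two passes over the data. It is not the NAG implementation; the function name `weighted_summary` and the dictionary keys are hypothetical, and degenerate inputs (for example a single valid observation, where $d = 0$) are not handled.

```python
import math

def weighted_summary(x, w=None):
    """Direct (two-pass) evaluation of quantities (a) to (e).

    Hypothetical helper for illustration only; not part of the NAG Toolbox.
    """
    if w is None:
        w = [1.0] * len(x)                      # unweighted data: every weight is 1
    # Only observations with non-zero weight are treated as valid.
    pairs = [(xi, wi) for xi, wi in zip(x, w) if wi != 0]
    if not pairs:
        raise ValueError("no valid observations")
    W = sum(wi for _, wi in pairs)              # W = sum of weights
    mean = sum(wi * xi for xi, wi in pairs) / W
    d = W - sum(wi * wi for _, wi in pairs) / W
    m2 = sum(wi * (xi - mean) ** 2 for xi, wi in pairs)
    m3 = sum(wi * (xi - mean) ** 3 for xi, wi in pairs)
    m4 = sum(wi * (xi - mean) ** 4 for xi, wi in pairs)
    sd = math.sqrt(m2 / d)                      # s_2
    return {
        "mean": mean,
        "sd": sd,
        "skew": m3 / (d * sd ** 3),             # s_3
        "kurt": m4 / (d * sd ** 4) - 3.0,       # s_4
        "min": min(xi for xi, _ in pairs),
        "max": max(xi for xi, _ in pairs),
    }
```

For the unweighted sample 1, 2, 3, 4 this gives a mean of 2.5 and $d = 3$, so the standard deviation is $\sqrt{5/3}$, the usual sample value.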
These quantities are calculated using the one pass algorithm of West (1979).
For large datasets, or where not all the data is available at the same time, $x$ and $w$ can be split into arbitrarily sized blocks and nag_stat_summary_onevar (g01at) called multiple times.
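The block-wise idea can be illustrated with a West-style one-pass update of the weighted mean and sum of squared deviations. The Python sketch below is an assumption about the general technique, not a transcription of the NAG code; `EMPTY`, `update` and `finish` are hypothetical names, and only the mean and standard deviation are tracked.

```python
import math

# Running state: (sum of weights, sum of squared weights, mean, m2),
# where m2 = sum of w_i * (x_i - mean)^2 over the data seen so far.
EMPTY = (0.0, 0.0, 0.0, 0.0)

def update(state, x_block, w_block=None):
    """Fold one block of observations into the running state."""
    sum_w, sum_w2, mean, m2 = state
    if w_block is None:
        w_block = [1.0] * len(x_block)
    for xi, wi in zip(x_block, w_block):
        if wi == 0:
            continue                      # zero-weight observations are ignored
        new_sum_w = sum_w + wi
        delta = xi - mean
        r = delta * wi / new_sum_w
        mean += r                         # updated weighted mean
        m2 += sum_w * delta * r           # updated sum of squared deviations
        sum_w = new_sum_w
        sum_w2 += wi * wi
    return (sum_w, sum_w2, mean, m2)

def finish(state):
    """Return (mean, standard deviation) using d = W - sum(w^2) / W."""
    sum_w, sum_w2, mean, m2 = state
    d = sum_w - sum_w2 / sum_w
    return mean, math.sqrt(m2 / d)
```

Calling `update` once on the whole dataset or several times on consecutive blocks leaves the same state up to rounding, which is what makes the block-wise calling pattern possible.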

References

West D H D (1979) Updating mean and variance estimates: An improved method Comm. ACM 22 532–535

Parameters

Compulsory Input Parameters

1:     x(nb) – double array
nb, the dimension of the array, must satisfy the constraint $\mathrm{nb} \geq 0$.
The current block of observations, corresponding to $x_i$, for $i = k+1,\ldots,k+b$, where $k$ is the number of observations processed so far and $b$ is the size of the current block of data.

Optional Input Parameters

1:     nb – int64/int32/nag_int scalar
Default: The dimension of the array x.
$b$, the number of observations in the current block of data. The size of the block of data supplied in x and wt can vary; therefore nb can change between calls to nag_stat_summary_onevar (g01at).
Constraint: $\mathrm{nb} \geq 0$.
2:     wt(:) – double array
Note: the dimension of the array wt must be at least $\mathrm{nb}$ if $\mathrm{iwt} = 1$.
If $\mathrm{iwt} = 1$, wt must contain the user-supplied weights corresponding to the block of data supplied in x, that is $w_i$, for $i = k+1,\ldots,k+b$.
Constraint: if $\mathrm{iwt} = 1$, $\mathrm{wt}(i) \geq 0$, for $i = 1,2,\ldots,\mathrm{nb}$.
3:     pn – int64/int32/nag_int scalar
The number of valid observations processed so far, that is the number of observations with $w_i > 0$, for $i = 1,2,\ldots,k$. On the first call to nag_stat_summary_onevar (g01at), or when starting to summarise a new dataset, pn must be set to $0$.
If $\mathrm{pn} \neq 0$, it must be the same value as returned by the last call to nag_stat_summary_onevar (g01at).
Default: $0$
4:     rcomm(20) – double array
Communication array, used to store information between calls to nag_stat_summary_onevar (g01at). If $\mathrm{pn} = 0$, rcomm need not be initialized; otherwise it must be unchanged since the last call to this function.

Input Parameters Omitted from the MATLAB Interface

iwt

Output Parameters

1:     pn – int64/int32/nag_int scalar
Default: $0$
The updated number of valid observations processed, that is the number of observations with $w_i > 0$, for $i = 1,2,\ldots,k+b$.
2:     xmean – double scalar
$\bar{x}$, the mean of the first $k+b$ observations.
3:     xsd – double scalar
$s_2$, the standard deviation of the first $k+b$ observations.
4:     xskew – double scalar
$s_3$, the coefficient of skewness for the first $k+b$ observations.
5:     xkurt – double scalar
$s_4$, the coefficient of kurtosis for the first $k+b$ observations.
6:     xmin – double scalar
The smallest value in the first $k+b$ observations.
7:     xmax – double scalar
The largest value in the first $k+b$ observations.
8:     rcomm(20) – double array
The updated communication array. The first five elements of rcomm hold information that may be of interest, with
$$\begin{aligned}
\mathrm{rcomm}(1) &= \sum_{i=1}^{k+b} w_i \\
\mathrm{rcomm}(2) &= \left( \sum_{i=1}^{k+b} w_i \right)^2 - \sum_{i=1}^{k+b} w_i^2 \\
\mathrm{rcomm}(3) &= \sum_{i=1}^{k+b} w_i (x_i - \bar{x})^2 \\
\mathrm{rcomm}(4) &= \sum_{i=1}^{k+b} w_i (x_i - \bar{x})^3 \\
\mathrm{rcomm}(5) &= \sum_{i=1}^{k+b} w_i (x_i - \bar{x})^4
\end{aligned}$$
The remaining elements of rcomm are used for workspace and so are undefined.
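Combining these elements with the definitions in the Description gives $d = W - \sum w_i^2 / W = \mathrm{rcomm}(2) / \mathrm{rcomm}(1)$, so the standard deviation, skewness and kurtosis can be recovered from the first five elements alone. The Python sketch below illustrates that arithmetic; `stats_from_rcomm` is a hypothetical helper, it uses 0-based indexing (so `rcomm[0]` corresponds to rcomm(1)), and it assumes the elements hold exactly the sums documented above.

```python
import math

def stats_from_rcomm(rcomm):
    """Recover (sd, skew, kurt) from the first five communication values.

    Hypothetical helper for illustration; rcomm[0] here is rcomm(1) in MATLAB.
    """
    sum_w, w_sq_diff, m2, m3, m4 = rcomm[:5]
    d = w_sq_diff / sum_w                 # d = W - sum(w^2)/W = rcomm(2)/rcomm(1)
    sd = math.sqrt(m2 / d)                # s_2
    skew = m3 / (d * sd ** 3)             # s_3
    kurt = m4 / (d * sd ** 4) - 3.0       # s_4
    return sd, skew, kurt
```

For the unweighted sample 1, 2, 3, 4 the five sums are 4, 12, 5, 0 and 10.25, giving $d = 3$ and a standard deviation of $\sqrt{5/3}$.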
9:     ifail – int64/int32/nag_int scalar
$\mathrm{ifail} = 0$ unless the function detects an error (see Error Indicators and Warnings).

Error Indicators and Warnings

Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

  ifail = 11
Constraint: $\mathrm{nb} \geq 0$.
  ifail = 31
Constraint: $\mathrm{iwt} = 0$ or $1$.
  ifail = 41
Constraint: if $\mathrm{iwt} = 1$ then $\mathrm{wt}(i) \geq 0$, for $i = 1,2,\ldots,\mathrm{nb}$.
  ifail = 51
Constraint: $\mathrm{pn} \geq 0$.
  ifail = 52
Constraint: if $\mathrm{pn} > 0$, pn must be unchanged since previous call.
W ifail = 53
On entry, the number of valid observations is zero.
W ifail = 71
On exit we were unable to calculate xskew or xkurt. A value of 0 has been returned.
W ifail = 72
On exit we were unable to calculate xsd, xskew or xkurt. A value of 0 has been returned.
  ifail = 121
rcomm has been corrupted between calls.
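Calling code usually needs to distinguish the warning exits (53, 71, 72), where some outputs are still usable, from the hard errors. The Python sketch below shows one way to tabulate the codes listed above; the names `WARNINGS`, `ERRORS` and `classify_ifail` are hypothetical, and the messages are paraphrased from this section.

```python
# Warning exits: results are returned, but some are unavailable.
WARNINGS = {
    53: "number of valid observations is zero",
    71: "xskew and xkurt could not be calculated (returned as 0)",
    72: "xsd, xskew and xkurt could not be calculated (returned as 0)",
}

# Error exits: an input constraint was violated or state was corrupted.
ERRORS = {
    11: "constraint nb >= 0 violated",
    31: "constraint iwt = 0 or 1 violated",
    41: "constraint wt(i) >= 0 violated",
    51: "constraint pn >= 0 violated",
    52: "pn changed since the previous call",
    121: "rcomm corrupted between calls",
}

def classify_ifail(ifail):
    """Return (kind, message) for an ifail value; hypothetical helper."""
    if ifail == 0:
        return ("ok", "")
    if ifail in WARNINGS:
        return ("warning", WARNINGS[ifail])
    if ifail in ERRORS:
        return ("error", ERRORS[ifail])
    return ("unknown", "")
```

For example, `classify_ifail(71)` reports a warning, signalling that the mean and standard deviation remain usable while skewness and kurtosis do not.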

Accuracy

Not applicable.

Further Comments

Both nag_stat_summary_onevar (g01at) and nag_stat_summary_onevar_combine (g01au) consolidate results from multiple summaries. Whereas the former can only be used to combine summaries calculated sequentially, the latter combines summaries calculated in an arbitrary order allowing, for example, summaries calculated on different processing units to be combined.
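To illustrate why summaries computed independently can be merged in any order, the Python sketch below combines two partial summaries (sum of weights, mean, and sum of squared deviations) with the standard pairwise update formula. This is an assumption about the general technique and is not taken from nag_stat_summary_onevar_combine (g01au); `summarise` and `combine` are hypothetical names.

```python
def summarise(x, w=None):
    """Return (W, mean, m2) for one block, computed directly."""
    if w is None:
        w = [1.0] * len(x)
    W = sum(w)
    if W == 0:
        return (0.0, 0.0, 0.0)            # empty summary
    mean = sum(wi * xi for xi, wi in zip(x, w)) / W
    m2 = sum(wi * (xi - mean) ** 2 for xi, wi in zip(x, w))
    return (W, mean, m2)

def combine(a, b):
    """Merge two summaries; the result does not depend on their order."""
    Wa, mean_a, m2a = a
    Wb, mean_b, m2b = b
    W = Wa + Wb
    if W == 0:
        return (0.0, 0.0, 0.0)
    delta = mean_b - mean_a
    mean = mean_a + delta * Wb / W
    m2 = m2a + m2b + delta * delta * Wa * Wb / W
    return (W, mean, m2)
```

Merging the summaries of 1, 2 and 3, 4 in either order reproduces the summary of 1, 2, 3, 4 (sum of weights 4, mean 2.5, sum of squared deviations 5).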

Example

function nag_stat_summary_onevar_example
x1 = [-0.62; -1.92; -1.72; -6.35; 2.00; 7.65; 6.15; 3.81; 4.87; -0.51; ...
       6.88; -5.85; -0.72; 0.66; 2.23; -1.61; -0.15; -1.15; -8.74; -3.94; 3.61];
wt1 = [4.91; 0.25; 3.90; 3.75; 1.17; 3.19; 2.66; 0.02; 3.59; 3.63; 4.83; ...
       3.72; 1.72; 0.78; 4.74; 1.72; 3.94; 1.33; 0.51; 2.40; 3.90];
x2 = [-0.66; -2.39; -6.25; 1.23; 2.27; -2.27; 10.12; 8.29; -2.99; 8.71; ...
      -0.74; 0.02; 1.22; 1.70; 4.30; 2.99; -0.83; -1.00; 6.57; 2.32; -3.47; ...
      -1.41; -5.26; 0.53; 1.80; 4.79; -3.04; 1.20; -3.21; -3.75; 0.86; ...
       1.27; -5.95; -5.27; 1.63; 3.59; -0.01; -1.38; -4.71; -4.82; 3.55; ...
       0.46; 2.57; 1.76; -4.05; 1.23; -1.99; 3.20; -0.65; 8.42; -6.01];
x3 = [1.13; -8.86; 5.92; -1.71; -3.99; 6.57; -2.01; -2.29; -1.11; 7.14; ...
      4.84; -4.44; -3.32; 10.25; -2.11; 8.02; -7.31; 2.80; -1.20; 1.01; ...
      1.37; -2.28; 1.28; -3.95; 3.43; -0.61; 4.85; -0.11];
rcomm = zeros(20,1);

% First block (weighted): pn is omitted, so it takes its default of 0 and
% rcomm need not be supplied; later blocks pass pn and rcomm back in
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
    nag_stat_summary_onevar(x1, 'wt', wt1);
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
    nag_stat_summary_onevar(x2, 'pn', pn, 'rcomm', rcomm);
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
    nag_stat_summary_onevar(x3, 'pn', pn, 'rcomm', rcomm);

% Display the results
fprintf('\nData supplied in 3 blocks\n');
if (ifail==53)
  fprintf('No valid observations supplied. All weights are zero.\n')
else
  fprintf('%d valid observations\n', pn);
  fprintf('Mean          %13.2f\n', xmean);
  if (ifail==72)
    fprintf('Unable to calculate the standard deviation, skewness or kurtosis\n');
  else
    fprintf('Std devn      %13.2f\n', xsd);
    if (ifail==71)
      fprintf('Unable to calculate the skewness or kurtosis\n');
    else
      fprintf('Skewness      %13.2f\n', xskew);
      fprintf('Kurtosis      %13.2f\n', xkurt);
    end
  end
  fprintf('Minimum       %13.2f\n', xmin);
  fprintf('Maximum       %13.2f\n', xmax);
end
 

Data supplied in 3 blocks
100 valid observations
Mean                   0.51
Std devn               4.24
Skewness               0.18
Kurtosis              -0.59
Minimum               -8.86
Maximum               10.25

function g01at_example
x1 = [-0.62; -1.92; -1.72; -6.35; 2.00; 7.65; 6.15; 3.81; 4.87; -0.51; ...
       6.88; -5.85; -0.72; 0.66; 2.23; -1.61; -0.15; -1.15; -8.74; -3.94; 3.61];
wt1 = [4.91; 0.25; 3.90; 3.75; 1.17; 3.19; 2.66; 0.02; 3.59; 3.63; 4.83; ...
       3.72; 1.72; 0.78; 4.74; 1.72; 3.94; 1.33; 0.51; 2.40; 3.90];
x2 = [-0.66; -2.39; -6.25; 1.23; 2.27; -2.27; 10.12; 8.29; -2.99; 8.71; ...
      -0.74; 0.02; 1.22; 1.70; 4.30; 2.99; -0.83; -1.00; 6.57; 2.32; -3.47; ...
      -1.41; -5.26; 0.53; 1.80; 4.79; -3.04; 1.20; -3.21; -3.75; 0.86; ...
       1.27; -5.95; -5.27; 1.63; 3.59; -0.01; -1.38; -4.71; -4.82; 3.55; ...
       0.46; 2.57; 1.76; -4.05; 1.23; -1.99; 3.20; -0.65; 8.42; -6.01];
x3 = [1.13; -8.86; 5.92; -1.71; -3.99; 6.57; -2.01; -2.29; -1.11; 7.14; ...
      4.84; -4.44; -3.32; 10.25; -2.11; 8.02; -7.31; 2.80; -1.20; 1.01; ...
      1.37; -2.28; 1.28; -3.95; 3.43; -0.61; 4.85; -0.11];
rcomm = zeros(20,1);

% First block (weighted): pn is omitted, so it takes its default of 0 and
% rcomm need not be supplied; later blocks pass pn and rcomm back in
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
    g01at(x1, 'wt', wt1);
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
    g01at(x2, 'pn', pn, 'rcomm', rcomm);
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
    g01at(x3, 'pn', pn, 'rcomm', rcomm);

% Display the results
fprintf('\nData supplied in 3 blocks\n');
if (ifail==53)
  fprintf('No valid observations supplied. All weights are zero.\n')
else
  fprintf('%d valid observations\n', pn);
  fprintf('Mean          %13.2f\n', xmean);
  if (ifail==72)
    fprintf('Unable to calculate the standard deviation, skewness or kurtosis\n');
  else
    fprintf('Std devn      %13.2f\n', xsd);
    if (ifail==71)
      fprintf('Unable to calculate the skewness or kurtosis\n');
    else
      fprintf('Skewness      %13.2f\n', xskew);
      fprintf('Kurtosis      %13.2f\n', xkurt);
    end
  end
  fprintf('Minimum       %13.2f\n', xmin);
  fprintf('Maximum       %13.2f\n', xmax);
end
 

Data supplied in 3 blocks
100 valid observations
Mean                   0.51
Std devn               4.24
Skewness               0.18
Kurtosis              -0.59
Minimum               -8.86
Maximum               10.25



© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013