# NAG Toolbox: nag_stat_summary_onevar (g01at)

## Purpose

nag_stat_summary_onevar (g01at) calculates the mean, standard deviation, coefficients of skewness and kurtosis, and the maximum and minimum values for a set of (optionally weighted) data. The input data can be split into arbitrarily sized blocks, allowing large datasets to be summarised.

## Syntax

[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = g01at(x, 'nb', nb, 'wt', wt, 'pn', pn, 'rcomm', rcomm)
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = nag_stat_summary_onevar(x, 'nb', nb, 'wt', wt, 'pn', pn, 'rcomm', rcomm)

## Description

Given a sample of $n$ observations, denoted by $x=\left\{x_i : i=1,2,\dots,n\right\}$, and a set of non-negative weights, $w=\left\{w_i : i=1,2,\dots,n\right\}$, nag_stat_summary_onevar (g01at) calculates a number of quantities:
(a) Mean
$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{W}, \quad \text{where} \quad W = \sum_{i=1}^{n} w_i .$
(b) Standard deviation
$s_2 = \sqrt{\frac{\sum_{i=1}^{n} w_i \left(x_i - \bar{x}\right)^2}{d}}, \quad \text{where} \quad d = W - \frac{\sum_{i=1}^{n} w_i^2}{W} .$
(c) Coefficient of skewness
$s_3 = \frac{\sum_{i=1}^{n} w_i \left(x_i - \bar{x}\right)^3}{d \, s_2^3} .$
(d) Coefficient of kurtosis
$s_4 = \frac{\sum_{i=1}^{n} w_i \left(x_i - \bar{x}\right)^4}{d \, s_2^4} - 3 .$
(e) Maximum and minimum elements, with $w_i \ne 0$.
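As a minimal sketch of definitions (a) to (e), the following plain MATLAB script evaluates them directly on a small weighted sample. The data values and variable names are illustrative assumptions, and the script makes two passes over the data, so it shows the definitions rather than the method the function itself uses.

```matlab
% Direct (two-pass) evaluation of definitions (a)-(e).
% Illustrative values only; not taken from the NAG example.
x = [1; 2; 3; 4];
w = [1; 1; 2; 0];          % the last observation has zero weight

W     = sum(w);                              % total weight
xmean = sum(w .* x) / W;                     % (a) weighted mean
d     = W - sum(w.^2) / W;                   % denominator d
dev   = x - xmean;                           % deviations from the mean
xsd   = sqrt(sum(w .* dev.^2) / d);          % (b) standard deviation
xskew = sum(w .* dev.^3) / (d * xsd^3);      % (c) coefficient of skewness
xkurt = sum(w .* dev.^4) / (d * xsd^4) - 3;  % (d) coefficient of kurtosis
valid = (w ~= 0);                            % (e) ignores zero weights
xmin  = min(x(valid));
xmax  = max(x(valid));

fprintf('mean %.4f  sd %.4f  skew %.4f  kurt %.4f  min %g  max %g\n', ...
        xmean, xsd, xskew, xkurt, xmin, xmax);
```

The zero-weight observation contributes nothing to the sums and is excluded from the minimum and maximum, which matches the restriction to $w_i \ne 0$ in (e).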
These quantities are calculated using the one-pass algorithm of West (1979).
For large datasets, or where not all of the data is available at the same time, $x$ and $w$ can be split into arbitrarily sized blocks and nag_stat_summary_onevar (g01at) called multiple times.
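To illustrate why block-wise processing is possible, here is a hedged sketch of a West-style one-pass update for the weighted mean and the weighted sum of squared deviations, applied to data arriving in three blocks. It is a plain MATLAB illustration under assumed data and variable names. It is not the function's implementation, and the running variables are only a hypothetical stand-in for the role that pn and rcomm play.

```matlab
% Blocks of a weighted sample (illustrative values only).
xblocks = {[1; 2], [3; 4; 5], [6]};
wblocks = {[1; 2], [1; 0; 3], [2]};

% Running state carried from one block to the next.
W  = 0;   % sum of weights so far
W2 = 0;   % sum of squared weights so far
m  = 0;   % running weighted mean
T  = 0;   % running sum of w_i * (x_i - mean)^2

for b = 1:numel(xblocks)
    xb = xblocks{b};
    wb = wblocks{b};
    for i = 1:numel(xb)
        if wb(i) == 0
            continue;             % zero-weight observations are skipped
        end
        q    = xb(i) - m;         % deviation from the current mean
        Wnew = W + wb(i);         % updated total weight
        r    = q * wb(i) / Wnew;  % change in the mean
        m    = m + r;
        T    = T + r * W * q;     % uses the total weight before this point
        W    = Wnew;
        W2   = W2 + wb(i)^2;
    end
end

d   = W - W2 / W;
xsd = sqrt(T / d);

% Cross-check against the direct two-pass formulas on the full sample.
x = vertcat(xblocks{:});
w = vertcat(wblocks{:});
mref  = sum(w .* x) / sum(w);
sdref = sqrt(sum(w .* (x - mref).^2) / (sum(w) - sum(w.^2) / sum(w)));
fprintf('one-pass: mean %.6f  sd %.6f\n', m, xsd);
fprintf('two-pass: mean %.6f  sd %.6f\n', mref, sdref);
```

Because only a handful of running totals are kept, the block boundaries have no effect on the result; the same idea extends to the third and fourth moments.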

## References

West D H D (1979) Updating mean and variance estimates: An improved method Comm. ACM 22 532–535

## Parameters

### Compulsory Input Parameters

1:     x(nb) – double array
nb, the dimension of the array, must satisfy the constraint $\mathbf{nb} \ge 0$.
The current block of observations, corresponding to $x_i$, for $i=k+1,\dots,k+b$, where $k$ is the number of observations processed so far and $b$ is the size of the current block of data.

### Optional Input Parameters

1:     nb – int64, int32 or nag_int scalar
Default: The dimension of the array x.
$b$, the number of observations in the current block of data. The size of the block of data supplied in x and wt can vary; therefore nb can change between calls to nag_stat_summary_onevar (g01at).
Constraint: $\mathbf{nb} \ge 0$.
2:     wt(:) – double array
Note: the dimension of the array wt must be at least $\mathbf{nb}$ if $\mathit{iwt}=1$.
If $\mathit{iwt}=1$, wt must contain the user-supplied weights corresponding to the block of data supplied in x, that is $w_i$, for $i=k+1,\dots,k+b$.
Constraint: if $\mathit{iwt}=1$, $\mathbf{wt}(i) \ge 0$, for $i=1,2,\dots,\mathbf{nb}$.
3:     pn – int64, int32 or nag_int scalar
The number of valid observations processed so far, that is the number of observations with $w_i > 0$, for $i=1,2,\dots,k$. On the first call to nag_stat_summary_onevar (g01at), or when starting to summarise a new dataset, pn must be set to $0$.
If $\mathbf{pn} \ne 0$, it must be the same value as returned by the last call to nag_stat_summary_onevar (g01at).
Default: $0$
4:     rcomm($20$) – double array
Communication array, used to store information between calls to nag_stat_summary_onevar (g01at). If $\mathbf{pn}=0$, rcomm need not be initialized; otherwise it must be unchanged since the last call to this function.

### Input Parameters Omitted from the MATLAB Interface

iwt

### Output Parameters

1:     pn – int64, int32 or nag_int scalar
The updated number of valid observations processed, that is the number of observations with $w_i > 0$, for $i=1,2,\dots,k+b$.
2:     xmean – double scalar
$\bar{x}$, the mean of the first $k+b$ observations.
3:     xsd – double scalar
$s_2$, the standard deviation of the first $k+b$ observations.
4:     xskew – double scalar
$s_3$, the coefficient of skewness for the first $k+b$ observations.
5:     xkurt – double scalar
$s_4$, the coefficient of kurtosis for the first $k+b$ observations.
6:     xmin – double scalar
The smallest value in the first $k+b$ observations.
7:     xmax – double scalar
The largest value in the first $k+b$ observations.
8:     rcomm($20$) – double array
The updated communication array. The first five elements of rcomm hold information that may be of interest, with
$\mathrm{rcomm}(1) = \sum_{i=1}^{k+b} w_i$
$\mathrm{rcomm}(2) = \left(\sum_{i=1}^{k+b} w_i\right)^2 - \sum_{i=1}^{k+b} w_i^2$
$\mathrm{rcomm}(3) = \sum_{i=1}^{k+b} w_i \left(x_i - \bar{x}\right)^2$
$\mathrm{rcomm}(4) = \sum_{i=1}^{k+b} w_i \left(x_i - \bar{x}\right)^3$
$\mathrm{rcomm}(5) = \sum_{i=1}^{k+b} w_i \left(x_i - \bar{x}\right)^4$
The remaining elements of rcomm are used for workspace and so are undefined.
9:     ifail – int64, int32 or nag_int scalar
$\mathbf{ifail}=0$ unless the function detects an error (see Error Indicators and Warnings).
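Returning to the rcomm output: the five documented elements relate directly to the quantities defined in the Description. The short worked identity below follows purely from those definitions, with the sums taken over $i=1,\dots,k+b$. It is offered as a reading aid and makes no claim about how the function evaluates the results internally.

```latex
d = W - \frac{\sum w_i^2}{W}
  = \frac{W^2 - \sum w_i^2}{W}
  = \frac{\mathrm{rcomm}(2)}{\mathrm{rcomm}(1)},
\qquad
s_2 = \sqrt{\frac{\mathrm{rcomm}(3)}{d}}, \quad
s_3 = \frac{\mathrm{rcomm}(4)}{d \, s_2^3}, \quad
s_4 = \frac{\mathrm{rcomm}(5)}{d \, s_2^4} - 3 .
```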

## Error Indicators and Warnings

Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

$\mathbf{ifail}=11$
Constraint: $\mathbf{nb} \ge 0$.
$\mathbf{ifail}=31$
Constraint: $\mathit{iwt}=0$ or $1$.
$\mathbf{ifail}=41$
Constraint: if $\mathit{iwt}=1$ then $\mathbf{wt}(i) \ge 0$, for $i=1,2,\dots,\mathbf{nb}$.
$\mathbf{ifail}=51$
Constraint: $\mathbf{pn} \ge 0$.
$\mathbf{ifail}=52$
Constraint: if $\mathbf{pn} > 0$, pn must be unchanged since previous call.
W $\mathbf{ifail}=53$
On entry, the number of valid observations is zero.
W $\mathbf{ifail}=71$
On exit we were unable to calculate xskew or xkurt. A value of $0$ has been returned.
W $\mathbf{ifail}=72$
On exit we were unable to calculate xsd, xskew or xkurt. A value of $0$ has been returned.
$\mathbf{ifail}=121$
rcomm has been corrupted between calls.

## Accuracy

Not applicable.

## Further Comments

Both nag_stat_summary_onevar (g01at) and nag_stat_summary_onevar_combine (g01au) consolidate results from multiple summaries. The former can only combine summaries calculated sequentially, whereas the latter combines summaries calculated in an arbitrary order, allowing, for example, summaries calculated on different processing units to be combined.
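To make the idea of order-independent combination concrete, the sketch below merges two independently computed summaries (total weight, weighted mean, and weighted sum of squared deviations) using the standard pairwise update. The data and variable names are illustrative assumptions, and this is not claimed to be what nag_stat_summary_onevar_combine (g01au) does internally.

```matlab
% Two parts of a weighted sample, summarised independently
% (illustrative values only).
xa = [1; 2; 3];   wa = [1; 2; 1];
xb = [4; 5; 6];   wb = [2; 3; 2];

% Per-part summaries: total weight, weighted mean, and weighted sum of
% squared deviations about that part's own mean.
Wa = sum(wa);  ma = sum(wa .* xa) / Wa;  Ta = sum(wa .* (xa - ma).^2);
Wb = sum(wb);  mb = sum(wb .* xb) / Wb;  Tb = sum(wb .* (xb - mb).^2);

% Pairwise merge; swapping the roles of a and b gives the same result.
W     = Wa + Wb;
delta = mb - ma;
m     = ma + delta * Wb / W;
T     = Ta + Tb + delta^2 * Wa * Wb / W;

% Cross-check against a direct calculation on the pooled data.
x = [xa; xb];
w = [wa; wb];
mref = sum(w .* x) / sum(w);
Tref = sum(w .* (x - mref).^2);
fprintf('merged: mean %.6f  T %.6f\n', m, T);
fprintf('pooled: mean %.6f  T %.6f\n', mref, Tref);
```

Since the merge needs only the summaries and not the raw data, each part can be summarised on a separate processing unit and the results combined afterwards.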

## Example

```
function nag_stat_summary_onevar_example
x1 = [-0.62; -1.92; -1.72; -6.35; 2.00; 7.65; 6.15; 3.81; 4.87; -0.51; ...
6.88; -5.85; -0.72; 0.66; 2.23; -1.61; -0.15; -1.15; -8.74; -3.94; 3.61];
wt1 = [4.91; 0.25; 3.90; 3.75; 1.17; 3.19; 2.66; 0.02; 3.59; 3.63; 4.83; ...
3.72; 1.72; 0.78; 4.74; 1.72; 3.94; 1.33; 0.51; 2.40; 3.90];
x2 = [-0.66; -2.39; -6.25; 1.23; 2.27; -2.27; 10.12; 8.29; -2.99; 8.71; ...
-0.74; 0.02; 1.22; 1.70; 4.30; 2.99; -0.83; -1.00; 6.57; 2.32; -3.47; ...
-1.41; -5.26; 0.53; 1.80; 4.79; -3.04; 1.20; -3.21; -3.75; 0.86; ...
1.27; -5.95; -5.27; 1.63; 3.59; -0.01; -1.38; -4.71; -4.82; 3.55; ...
0.46; 2.57; 1.76; -4.05; 1.23; -1.99; 3.20; -0.65; 8.42; -6.01];
x3 = [1.13; -8.86; 5.92; -1.71; -3.99; 6.57; -2.01; -2.29; -1.11; 7.14; ...
4.84; -4.44; -3.32; 10.25; -2.11; 8.02; -7.31; 2.80; -1.20; 1.01; ...
1.37; -2.28; 1.28; -3.95; 3.43; -0.61; 4.85; -0.11];
rcomm = zeros(20,1);

% Process the three blocks in turn. pn defaults to 0 on the first call;
% later calls pass back the pn and rcomm returned by the previous call.
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
nag_stat_summary_onevar(x1, 'wt', wt1);
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
nag_stat_summary_onevar(x2, 'pn', pn, 'rcomm', rcomm);
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
nag_stat_summary_onevar(x3, 'pn', pn, 'rcomm', rcomm);

% Display the results
fprintf('\nData supplied in 3 blocks\n');
if (ifail==53)
  fprintf('No valid observations supplied. All weights are zero.\n');
else
  fprintf('%d valid observations\n', pn);
  fprintf('Mean          %13.2f\n', xmean);
  if (ifail==72)
    fprintf('Unable to calculate the standard deviation, skewness or kurtosis\n');
  else
    fprintf('Std devn      %13.2f\n', xsd);
    if (ifail==71)
      fprintf('Unable to calculate the skewness or kurtosis\n');
    else
      fprintf('Skewness      %13.2f\n', xskew);
      fprintf('Kurtosis      %13.2f\n', xkurt);
    end
  end
  fprintf('Minimum       %13.2f\n', xmin);
  fprintf('Maximum       %13.2f\n', xmax);
end
```
```

Data supplied in 3 blocks
100 valid observations
Mean                   0.51
Std devn               4.24
Skewness               0.18
Kurtosis              -0.59
Minimum               -8.86
Maximum               10.25

```
```
function g01at_example
x1 = [-0.62; -1.92; -1.72; -6.35; 2.00; 7.65; 6.15; 3.81; 4.87; -0.51; ...
6.88; -5.85; -0.72; 0.66; 2.23; -1.61; -0.15; -1.15; -8.74; -3.94; 3.61];
wt1 = [4.91; 0.25; 3.90; 3.75; 1.17; 3.19; 2.66; 0.02; 3.59; 3.63; 4.83; ...
3.72; 1.72; 0.78; 4.74; 1.72; 3.94; 1.33; 0.51; 2.40; 3.90];
x2 = [-0.66; -2.39; -6.25; 1.23; 2.27; -2.27; 10.12; 8.29; -2.99; 8.71; ...
-0.74; 0.02; 1.22; 1.70; 4.30; 2.99; -0.83; -1.00; 6.57; 2.32; -3.47; ...
-1.41; -5.26; 0.53; 1.80; 4.79; -3.04; 1.20; -3.21; -3.75; 0.86; ...
1.27; -5.95; -5.27; 1.63; 3.59; -0.01; -1.38; -4.71; -4.82; 3.55; ...
0.46; 2.57; 1.76; -4.05; 1.23; -1.99; 3.20; -0.65; 8.42; -6.01];
x3 = [1.13; -8.86; 5.92; -1.71; -3.99; 6.57; -2.01; -2.29; -1.11; 7.14; ...
4.84; -4.44; -3.32; 10.25; -2.11; 8.02; -7.31; 2.80; -1.20; 1.01; ...
1.37; -2.28; 1.28; -3.95; 3.43; -0.61; 4.85; -0.11];
rcomm = zeros(20,1);

% Process the three blocks in turn. pn defaults to 0 on the first call;
% later calls pass back the pn and rcomm returned by the previous call.
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
g01at(x1, 'wt', wt1);
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
g01at(x2, 'pn', pn, 'rcomm', rcomm);
[pn, xmean, xsd, xskew, xkurt, xmin, xmax, rcomm, ifail] = ...
g01at(x3, 'pn', pn, 'rcomm', rcomm);

% Display the results
fprintf('\nData supplied in 3 blocks\n');
if (ifail==53)
  fprintf('No valid observations supplied. All weights are zero.\n');
else
  fprintf('%d valid observations\n', pn);
  fprintf('Mean          %13.2f\n', xmean);
  if (ifail==72)
    fprintf('Unable to calculate the standard deviation, skewness or kurtosis\n');
  else
    fprintf('Std devn      %13.2f\n', xsd);
    if (ifail==71)
      fprintf('Unable to calculate the skewness or kurtosis\n');
    else
      fprintf('Skewness      %13.2f\n', xskew);
      fprintf('Kurtosis      %13.2f\n', xkurt);
    end
  end
  fprintf('Minimum       %13.2f\n', xmin);
  fprintf('Maximum       %13.2f\n', xmax);
end
```
```

Data supplied in 3 blocks
100 valid observations
Mean                   0.51
Std devn               4.24
Skewness               0.18
Kurtosis              -0.59
Minimum               -8.86
Maximum               10.25

```