# NAG Toolbox: nag_stat_frequency_table (g01ae)

## Purpose

nag_stat_frequency_table (g01ae) constructs a frequency distribution of a variable, using class boundary values that are either user-supplied or function-calculated.

## Syntax

[cb, ifreq, xmin, xmax, ifail] = g01ae(k, x, 'n', n, 'cb', cb)
[cb, ifreq, xmin, xmax, ifail] = nag_stat_frequency_table(k, x, 'n', n, 'cb', cb)
Note: the interface to this routine has changed since earlier releases of the toolbox:
Mark 23: iclass no longer an input parameter, cb now optional, k now a compulsory input parameter.

## Description

The data consists of a sample of $n$ observations of a continuous variable, denoted by $x_i$, for $i = 1, 2, \dots, n$. Let $a = \min(x_1, \dots, x_n)$ and $b = \max(x_1, \dots, x_n)$.
nag_stat_frequency_table (g01ae) constructs a frequency distribution with $k$ ($> 1$) classes denoted by $f_i$, for $i = 1, 2, \dots, k$.
The boundary values may be either user-supplied or function-calculated, and are denoted by $y_j$, for $j = 1, 2, \dots, k-1$.
If the boundary values of the classes are to be function-calculated, then they are determined in one of the following ways:
 (a) if $k > 2$, the range of $x$ values is divided into $k-2$ intervals of equal length, and two extreme intervals, defined by the class boundary values $y_1, y_2, \dots, y_{k-1}$;
 (b) if $k = 2$, $y_1 = \frac{1}{2}(a + b)$.
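
The construction in (a) and (b) can be sketched in plain MATLAB. This is only an illustration under stated assumptions, not the routine's own code: the sample, the value of `k`, and in particular the margin `delta` that places the outer boundaries just outside the sample range are hypothetical, since the size of that adjustment is not given in this document.

```
% Illustrative sample and number of classes (made up for this sketch).
x = [2.0 3.5 4.0 6.0 8.0];
k = 6;

a = min(x);   % smallest observation
b = max(x);   % largest observation

if k > 2
    % ASSUMPTION: delta is a hypothetical small margin; the document only
    % says that the outer boundaries lie just outside [a, b].
    delta = 1e-3 * (b - a);
    % k-1 equally spaced boundaries delimit k-2 equal-length intervals.
    y = linspace(a - delta, b + delta, k - 1);
else
    % k = 2: a single boundary at the midpoint of the range.
    y = (a + b) / 2;
end
```

With `k = 6` this gives five boundaries and four equal-length inner intervals; the two extreme classes lie outside the first and last boundary.
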
However formed, the values $y_1, \dots, y_{k-1}$ are assumed to be in ascending order. The class frequencies are formed with
• $f_1 =$ the number of $x$ values in the interval $(-\infty, y_1)$
• $f_i =$ the number of $x$ values in the interval $[y_{i-1}, y_i)$, $i = 2, \dots, k-1$
• $f_k =$ the number of $x$ values in the interval $[y_{k-1}, \infty)$,
where [ means inclusive, and ) means exclusive. If the class boundary values are function-calculated and $k > 2$, then $f_1 = f_k = 0$, and $y_1$ and $y_{k-1}$ are chosen so that $y_1 < a$ and $y_{k-1} > b$.
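
The counting rule can be written out directly in plain MATLAB. The sketch below is an illustration of the half-open interval convention only; the data and boundary values are made up, and it does not call the toolbox.

```
% Illustrative data and ascending class boundaries (made up for this sketch).
x = [1 2 2 3 5 7];
y = [2 4 6];              % k-1 boundaries
k = numel(y) + 1;         % number of classes, including both extremes

f = zeros(1, k);
f(1) = sum(x < y(1));                     % (-Inf, y(1))
for i = 2:k-1
    f(i) = sum(x >= y(i-1) & x < y(i));   % [y(i-1), y(i))
end
f(k) = sum(x >= y(k-1));                  % [y(k-1), Inf)

% f is now [1 3 1 1]: a value equal to a boundary falls in the class
% whose lower end is that boundary.
```

Both observations equal to 2 are counted in the second class rather than the first, because each class includes its lower boundary and excludes its upper one.
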
If a frequency distribution is required for a discrete variable, then it is suggested that you supply the class boundary values; function-calculated boundary values may be slightly imprecise (due to the adjustment of $y_1$ and $y_{k-1}$ outlined above) and cause values very close to a class boundary to be assigned to the wrong class.
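
For a discrete variable, one way to follow this advice is to place the boundaries midway between adjacent values, so that no observation can sit near a boundary. The sketch below assumes integer-valued data from 1 to 5 (made up for illustration) and reproduces the counting rule in plain MATLAB rather than calling the toolbox.

```
% Illustrative discrete sample taking the values 1..5 (made up).
x = [1 3 3 2 5 4 3 1];

% User-chosen boundaries halfway between the possible values.
y = [1.5 2.5 3.5 4.5];
k = numel(y) + 1;          % five classes, one per value

% Same half-open counting rule as in the description above.
edges = [-Inf, y, Inf];
f = zeros(1, k);
for i = 1:k
    f(i) = sum(x >= edges(i) & x < edges(i+1));
end

% f is now [2 1 3 1 1], the number of 1s, 2s, 3s, 4s and 5s.
```

According to the Syntax section, boundaries of this kind would be passed to the routine through the optional `'cb'` argument, with the first `k-1` elements of `cb` holding the values of `y`.
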

## References

None.

## Parameters

### Compulsory Input Parameters

1:     k – int64/int32/nag_int scalar
$k$, the number of classes desired in the frequency distribution. Whether or not class boundary values are user-supplied, k must include the two extreme classes which stretch to $\pm\infty$.
Constraint: $\mathbf{k} \ge 2$.
2:     x(n) – double array
n, the dimension of the array, must satisfy the constraint $\mathbf{n} \ge 1$.
The sample of observations of the variable for which the frequency distribution is required, $x_i$, for $i = 1, 2, \dots, n$. The values may be in any order.

### Optional Input Parameters

1:     n – int64/int32/nag_int scalar
Default: The dimension of the array x.
$n$, the number of observations.
Constraint: $\mathbf{n} \ge 1$.
2:     cb(k) – double array
If cb is not supplied, nag_stat_frequency_table (g01ae) calculates $k-1$ class boundary values.
If cb is supplied, the first $k-1$ elements of cb must contain the class boundary values you supplied, in ascending order.
Constraint: $\mathbf{cb}(i) < \mathbf{cb}(i+1)$, for $i = 1, 2, \dots, k-2$.

### Input Parameters Omitted from the MATLAB Interface

None.

### Output Parameters

1:     cb(k) – double array
The first $k-1$ elements of cb contain the class boundary values in ascending order.
2:     ifreq(k) – int64/int32/nag_int array
The elements of ifreq contain the frequencies in each class, $f_i$, for $i = 1, 2, \dots, k$. In particular $\mathbf{ifreq}(1)$ contains the frequency of the class below $\mathbf{cb}(1)$, $f_1$, and $\mathbf{ifreq}(k)$ contains the frequency of the class greater than or equal to $\mathbf{cb}(k-1)$, $f_k$.
3:     xmin – double scalar
The smallest value in the sample, $a$.
4:     xmax – double scalar
The largest value in the sample, $b$.
5:     ifail – int64/int32/nag_int scalar
$\mathbf{ifail} = 0$ unless the function detects an error (see Error Indicators and Warnings).
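
As an illustration of how the outputs fit together, the sketch below tabulates each class interval with its frequency and relative frequency in plain MATLAB. The values of `cb` and `ifreq` are copied from the output shown in the Example section; the variable names `edges` and `relfreq` are introduced here and are not part of the toolbox.

```
% Outputs copied from the Example section (k = 7).
cb    = [20.6986 21.2791 21.8597 22.4403 23.0209 23.6015 0];
ifreq = [0 6 16 21 14 13 0];
k     = numel(ifreq);

% Only the first k-1 elements of cb are class boundaries.
edges = [-Inf, cb(1:k-1), Inf];

% Relative frequency of each class.
relfreq = double(ifreq) / sum(double(ifreq));

% One row per class: lower edge, upper edge, count, relative frequency.
for i = 1:k
    fprintf('%9.4f to %9.4f  %3d  %6.4f\n', ...
            edges(i), edges(i+1), ifreq(i), relfreq(i));
end
```

The trailing element of `cb` is not a boundary and is dropped when the class edges are built.
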

## Error Indicators and Warnings

Errors or warnings detected by the function:
$\mathbf{ifail} = 1$
 On entry, $\mathbf{k} < 2$.
$\mathbf{ifail} = 2$
 On entry, $\mathbf{n} < 1$.
$\mathbf{ifail} = 3$
 On entry, the user-supplied class boundary values are not in ascending order.
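
The three conditions can be checked before calling the routine. The following plain-MATLAB sketch mirrors them with a local variable named `ifail` for readability; the order in which the checks are made and the input values are assumptions for illustration, not a description of the routine's internals.

```
% Illustrative inputs (made up); the boundaries are deliberately out of order.
k  = 4;
x  = [1 2 3];
cb = [2 6 4];

ifail = 0;
if k < 2
    ifail = 1;                          % too few classes
elseif numel(x) < 1
    ifail = 2;                          % empty sample
elseif any(diff(cb(1:k-1)) <= 0)
    ifail = 3;                          % boundaries not strictly ascending
end

% Here ifail is 3 because cb(2) > cb(3).
```
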

## Accuracy

The method used is believed to be stable.

## Further Comments

The time taken by nag_stat_frequency_table (g01ae) increases with k and n. It also depends on the distribution of the sample observations.

## Example

```
function nag_stat_frequency_table_example
% Number of classes, including the two extreme classes.
k = int64(7);
% Sample of 70 observations, stored as a column vector.
x = [22.3; 21.6; 22.6; 22.4; 22.4; 22.4; 22.1; 21.9; 23.1; 23.4; 23.4;
22.6; 22.5; 22.5; 22.1; 22.6; 22.3; 22.4; 21.8; 22.3; 22.1; 23.6;
20.8; 22.2; 23.1; 21.1; 21.7; 21.4; 21.6; 22.5; 21.2; 22.6; 22.2;
22.2; 21.4; 21.7; 23.2; 23.1; 22.3; 22.3; 21.1; 21.4; 21.5; 21.8;
22.8; 21.4; 20.7; 21.6; 23.2; 23.6; 22.7; 21.7; 23;   21.9; 22.6;
22.1; 22.2; 23.4; 21.5; 23;   22.8; 21.4; 23.2; 21.8; 21.2; 22;
22.4; 22.8; 23.2; 23.6];
% cb is not supplied, so the class boundaries are function-calculated.
[cb, ifreq, xmin, xmax, ifail] = nag_stat_frequency_table(k, x)
```
```

cb =

20.6986
21.2791
21.8597
22.4403
23.0209
23.6015
0

ifreq =

0
6
16
21
14
13
0

xmin =

20.7000

xmax =

23.6000

ifail =

0

```
```
function g01ae_example
% Number of classes, including the two extreme classes.
k = int64(7);
% Sample of 70 observations, stored as a column vector.
x = [22.3; 21.6; 22.6; 22.4; 22.4; 22.4; 22.1; 21.9; 23.1; 23.4; 23.4;
22.6; 22.5; 22.5; 22.1; 22.6; 22.3; 22.4; 21.8; 22.3; 22.1; 23.6;
20.8; 22.2; 23.1; 21.1; 21.7; 21.4; 21.6; 22.5; 21.2; 22.6; 22.2;
22.2; 21.4; 21.7; 23.2; 23.1; 22.3; 22.3; 21.1; 21.4; 21.5; 21.8;
22.8; 21.4; 20.7; 21.6; 23.2; 23.6; 22.7; 21.7; 23;   21.9; 22.6;
22.1; 22.2; 23.4; 21.5; 23;   22.8; 21.4; 23.2; 21.8; 21.2; 22;
22.4; 22.8; 23.2; 23.6];
% Same call through the short name; cb is again function-calculated.
[cb, ifreq, xmin, xmax, ifail] = g01ae(k, x)
```
```

cb =

20.6986
21.2791
21.8597
22.4403
23.0209
23.6015
0

ifreq =

0
6
16
21
14
13
0

xmin =

20.7000

xmax =

23.6000

ifail =

0

```