

# NAG Toolbox: nag_stat_prob_kolmogorov2 (g01ez)

## Purpose

nag_stat_prob_kolmogorov2 (g01ez) returns the probability associated with the upper tail of the Kolmogorov–Smirnov two sample distribution.

## Syntax

[result, ifail] = g01ez(n1, n2, d)
[result, ifail] = nag_stat_prob_kolmogorov2(n1, n2, d)

## Description

Let $F_{n_1}(x)$ and $G_{n_2}(x)$ denote the empirical cumulative distribution functions for the two samples, where $n_1$ and $n_2$ are the sizes of the first and second samples respectively.
The function nag_stat_prob_kolmogorov2 (g01ez) computes the upper tail probability for the Kolmogorov–Smirnov two sample two-sided test statistic $D_{n_1,n_2}$, where
 $D_{n_1,n_2} = \sup_x \left| F_{n_1}(x) - G_{n_2}(x) \right|.$
The probability is computed exactly if $n_1, n_2 \le 10000$ and $\max(n_1, n_2) \le 2500$ using a method given by Kim and Jenrich (1973). For the case where $\min(n_1, n_2) \le 10\%$ of $\max(n_1, n_2)$ and $\min(n_1, n_2) \le 80$, the Smirnov approximation is used. For all other cases the Kolmogorov approximation is used. These two approximations are discussed in Kim and Jenrich (1973).
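
For small samples, an exact tail probability can be cross-checked with a lattice-path count. Assuming no ties, every ordering of the pooled sample corresponds to a monotone path from $(0,0)$ to $(n_1,n_2)$, all such paths are equally likely under the null hypothesis, and $D_{n_1,n_2} \ge d$ exactly when the path reaches a point $(i,j)$ with $|i/n_1 - j/n_2| \ge d$. The plain MATLAB sketch below illustrates that counting argument only. It is not claimed to be the routine's implementation, and the variable names and the rounding tolerance `1e-9` are assumptions made for the illustration.

```matlab
% Exact P(D >= d) by counting lattice paths that stay strictly inside the band.
n1 = 5; n2 = 10; d = 0.5;

% Integer form of the band: |i*n2 - j*n1| < d*n1*n2.
% The small tolerance guards against rounding in d (illustrative choice).
bound = d * n1 * n2 - 1e-9;

A = zeros(n1 + 1, n2 + 1);        % A(i+1, j+1) = number of admissible paths to (i, j)
for i = 0:n1
    for j = 0:n2
        if abs(i * n2 - j * n1) >= bound
            continue              % point on or outside the boundary: no admissible path
        end
        if i == 0 && j == 0
            A(1, 1) = 1;          % the empty path
        else
            if i > 0, A(i + 1, j + 1) = A(i + 1, j + 1) + A(i, j + 1); end
            if j > 0, A(i + 1, j + 1) = A(i + 1, j + 1) + A(i + 1, j); end
        end
    end
end

% Fraction of paths that leave the band at least once.
p = 1 - A(n1 + 1, n2 + 1) / nchoosek(n1 + n2, n1);
fprintf('P(D >= %.2f) = %.4f\n', d, p);
```

For $n_1 = 5$, $n_2 = 10$ and $d = 0.5$ the count gives $p = 1053/3003 \approx 0.3506$, which agrees with the value shown in the Example section. The table `A` has $(n_1+1)(n_2+1)$ entries, so this brute-force form is only practical for small samples.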

## References

Conover W J (1980) Practical Nonparametric Statistics Wiley
Feller W (1948) On the Kolmogorov–Smirnov limit theorems for empirical distributions Ann. Math. Statist. 19 179–181
Kendall M G and Stuart A (1973) The Advanced Theory of Statistics (Volume 2) (3rd Edition) Griffin
Kim P J and Jenrich R I (1973) Tables of exact sampling distribution of the two sample Kolmogorov–Smirnov criterion $D_{mn}(m < n)$ Selected Tables in Mathematical Statistics 1 80–129 American Mathematical Society
Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill
Smirnov N (1948) Table for estimating the goodness of fit of empirical distributions Ann. Math. Statist. 19 279–281

## Parameters

### Compulsory Input Parameters

1:     n1 – int64/int32/nag_int scalar
The number of observations in the first sample, $n_1$.
Constraint: $\mathbf{n1} \ge 1$.
2:     n2 – int64/int32/nag_int scalar
The number of observations in the second sample, $n_2$.
Constraint: $\mathbf{n2} \ge 1$.
3:     d – double scalar
The test statistic $D_{n_1,n_2}$ for the two sample Kolmogorov–Smirnov goodness-of-fit test, that is, the maximum difference between the empirical cumulative distribution functions (CDFs) of the two samples.
Constraint: $0.0 \le \mathbf{d} \le 1.0$.
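
When only the raw samples are available, the three inputs can be derived from the definition of the statistic. Because both empirical CDFs are step functions that change only at observed values, it is enough to compare them at the pooled sample points. The following is a minimal sketch in plain MATLAB; the sample values `x` and `y` are made up for illustration and do not come from this document.

```matlab
% Hypothetical samples (illustrative values only).
x = [1 3 5 7 9];
y = [2 4 6 8 10 12 14 16 18 20];

% Evaluate both empirical CDFs at every pooled observation.
pts = sort([x(:); y(:)]);
F = arrayfun(@(t) mean(x <= t), pts);   % empirical CDF of the first sample
G = arrayfun(@(t) mean(y <= t), pts);   % empirical CDF of the second sample

% Two-sided statistic: largest absolute gap between the two CDFs.
d  = max(abs(F - G));
n1 = int64(numel(x));
n2 = int64(numel(y));
fprintf('n1 = %d, n2 = %d, d = %.2f\n', n1, n2, d);
```

For these values the largest gap occurs at the pooled point 9, giving `d = 0.6`. The resulting `n1`, `n2` and `d` satisfy the constraints listed above.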

### Optional Input Parameters

None.

### Input Parameters Omitted from the MATLAB Interface

None.
### Output Parameters

1:     result – double scalar
The upper tail probability associated with the two-sided test statistic d.
2:     ifail – int64/int32/nag_int scalar
${\mathrm{ifail}}={\mathbf{0}}$ unless the function detects an error (see [Error Indicators and Warnings]).

## Error Indicators and Warnings

Errors or warnings detected by the function:
$\mathbf{ifail} = 1$
 On entry, $\mathbf{n1} < 1$, or $\mathbf{n2} < 1$.
$\mathbf{ifail} = 2$
 On entry, $\mathbf{d} < 0.0$, or $\mathbf{d} > 1.0$.
$\mathbf{ifail} = 3$
 The approximate solution did not converge in $500$ iterations. A tail probability of $1.0$ is returned by nag_stat_prob_kolmogorov2 (g01ez).

## Accuracy

The large sample distributions used as approximations to the exact distribution should have a relative error of less than 5% for most cases.
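
For orientation, the classical large-sample limit for the two-sided statistic is $Q(\lambda) = 2\sum_{k \ge 1} (-1)^{k-1} e^{-2k^2\lambda^2}$ with $\lambda = d\sqrt{n_1 n_2/(n_1+n_2)}$. The sketch below evaluates this series in plain MATLAB. It is the uncorrected textbook form, shown as an assumption about what a Kolmogorov-type approximation looks like, not the routine's actual code. The inputs, the tolerance, and the cap of 500 terms (chosen to mirror the iteration limit reported under ifail = 3) are illustrative.

```matlab
% Hypothetical large-sample inputs (max(n1, n2) > 2500, min(n1, n2) > 80).
n1 = 3000; n2 = 4000; d = 0.03;

lambda = d * sqrt(n1 * n2 / (n1 + n2));   % scaled statistic

q = 0;
maxTerms = 500;          % illustrative cap on the number of series terms
tol = 1e-12;             % illustrative stopping tolerance
converged = false;
for k = 1:maxTerms
    term = 2 * (-1)^(k - 1) * exp(-2 * k^2 * lambda^2);
    q = q + term;
    if abs(term) < tol
        converged = true;
        break
    end
end

q = min(max(q, 0), 1);   % clamp to a valid probability
fprintf('lambda = %.4f, approximate tail probability = %.4f\n', lambda, q);
```

The terms decay very quickly once $\lambda$ is moderately large, so only a few are needed here. As $\lambda \to 0$ the terms stop decaying and the alternating series no longer settles, which is why a sketch like this needs an iteration cap.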

## Further Comments

The upper tail probability for the one-sided statistics, $D_{n_1,n_2}^{+}$ or $D_{n_1,n_2}^{-}$, can be approximated by halving the two-sided upper tail probability returned by nag_stat_prob_kolmogorov2 (g01ez), that is $p/2$. This approximation to the upper tail probability for either $D_{n_1,n_2}^{+}$ or $D_{n_1,n_2}^{-}$ is good for small probabilities (e.g., $p \le 0.10$) but becomes poor for larger probabilities.
The time taken by the function increases with $n_1$ and $n_2$, until $n_1 n_2 > 10000$ or $\max(n_1, n_2) \ge 2500$. At this point one of the approximations is used and the time decreases significantly. The time then increases again modestly with $n_1$ and $n_2$.
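
For small samples the halving rule can be compared against an exact one-sided value by counting lattice paths: each ordering of the pooled sample (assuming no ties) is a monotone path from $(0,0)$ to $(n_1,n_2)$, and a path contributes to the upper tail when it reaches a point where $i/n_1 - j/n_2 \ge d$ (one-sided) or $|i/n_1 - j/n_2| \ge d$ (two-sided). The plain MATLAB sketch below is an illustration under those assumptions, not the routine's method. The variable names and the tolerance `1e-9` are hypothetical choices.

```matlab
n1 = 5; n2 = 10; d = 0.5;
bound = d * n1 * n2 - 1e-9;       % integer-scaled threshold with rounding guard

counts = zeros(1, 2);             % admissible paths: [two-sided, one-sided D+]
for mode = 1:2
    A = zeros(n1 + 1, n2 + 1);    % A(i+1, j+1) = admissible paths to (i, j)
    for i = 0:n1
        for j = 0:n2
            dev = i * n2 - j * n1;            % n1*n2 times (F - G) at this point
            if mode == 1, dev = abs(dev); end % two-sided uses the absolute gap
            if dev >= bound, continue; end    % boundary reached: excluded
            if i == 0 && j == 0
                A(1, 1) = 1;
            else
                if i > 0, A(i + 1, j + 1) = A(i + 1, j + 1) + A(i, j + 1); end
                if j > 0, A(i + 1, j + 1) = A(i + 1, j + 1) + A(i + 1, j); end
            end
        end
    end
    counts(mode) = A(n1 + 1, n2 + 1);
end

total  = nchoosek(n1 + n2, n1);
p_two  = 1 - counts(1) / total;   % exact two-sided tail probability
p_plus = 1 - counts(2) / total;   % exact one-sided tail probability for D+
p_half = p_two / 2;               % halving approximation
fprintf('p/2 = %.4f, exact one-sided = %.4f\n', p_half, p_plus);
```

With these inputs the halved two-sided value is $1053/6006 \approx 0.1753$ and the exact one-sided value is $527/3003 \approx 0.1755$. Other choices of $n_1$, $n_2$ and $d$ can be substituted to see how the gap behaves as the probability grows.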

## Example

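```
function nag_stat_prob_kolmogorov2_example
% Upper tail probability of the two sample Kolmogorov-Smirnov statistic.
n1 = int64(5);    % size of the first sample
n2 = int64(10);   % size of the second sample
d = 0.5;          % observed two-sided test statistic
[result, ifail] = nag_stat_prob_kolmogorov2(n1, n2, d)
```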
```

result =

0.3506

ifail =

0

```
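```
function g01ez_example
% Same computation using the short function name.
n1 = int64(5);    % size of the first sample
n2 = int64(10);   % size of the second sample
d = 0.5;          % observed two-sided test statistic
[result, ifail] = g01ez(n1, n2, d)
```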
```

result =

0.3506

ifail =

0

```


© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013