the Smirnov approximation is used. For all other cases the Kolmogorov approximation is used. These two approximations are discussed in Kim and Jenrich (1973).
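
For orientation, the classical Kolmogorov limiting form of the two-sided tail probability can be sketched as below. This is the textbook series $2 \sum_{k \geq 1} (-1)^{k-1} e^{-2 k^{2} z^{2}}$ with $z = d \sqrt{n_{1} n_{2} / (n_{1} + n_{2})}$; whether g01ezf evaluates exactly this series, or a corrected variant from Kim and Jenrich (1973), is an assumption here. The function name `kolmogorov_tail` and the cut-off at $z < 0.2$ are illustrative choices, not part of the NAG Library.

```python
import math


def kolmogorov_tail(d, n1, n2, terms=100):
    """Asymptotic two-sided upper tail probability for the two-sample statistic.

    Illustrative only: the textbook Kolmogorov limit, not NAG's implementation.
    """
    # Scale the statistic by the effective sample size.
    z = d * math.sqrt(n1 * n2 / (n1 + n2))
    # For very small z the alternating series converges slowly, and the
    # tail probability equals 1 to many significant digits.
    if z < 0.2:
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        term = math.exp(-2.0 * (k * z) ** 2)
        # Terms alternate in sign, starting with a positive one.
        total += term if k % 2 == 1 else -term
        if term < 1e-16:
            break
    # Clamp against rounding so the result is a valid probability.
    return min(1.0, max(0.0, 2.0 * total))
```

With $n_{1} = n_{2} = 200$ and $d = 0.136$ the scaled statistic is $z = 1.36$, the familiar large-sample 5% critical value.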

4 References

Conover W J (1980) Practical Nonparametric Statistics Wiley

Feller W (1948) On the Kolmogorov–Smirnov limit theorems for empirical distributions Ann. Math. Statist. 19 179–181

Kendall M G and Stuart A (1973) The Advanced Theory of Statistics (Volume 2) (3rd Edition) Griffin

Kim P J and Jenrich R I (1973) Tables of exact sampling distribution of the two sample Kolmogorov–Smirnov criterion $D_{m n}$ $(m < n)$ Selected Tables in Mathematical Statistics 1 80–129 American Mathematical Society

Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill

Smirnov N (1948) Table for estimating the goodness of fit of empirical distributions Ann. Math. Statist. 19 279–281

5 Arguments

1: $n1$ – Integer Input: On entry: the number of observations in the first sample, $n_{1}$.

Constraint: $n1 \geq 1$.
2: $n2$ – Integer Input: On entry: the number of observations in the second sample, $n_{2}$.

Constraint: $n2 \geq 1$.
3: $d$ – Real (Kind=nag_wp) Input: On entry: the test statistic $D_{n_{1}, n_{2}}$, for the two sample Kolmogorov–Smirnov goodness-of-fit test, that is the maximum difference between the empirical cumulative distribution functions (CDFs) of the two samples.

Constraint: $0.0 \leq d \leq 1.0$.
4: $ifail$ – Integer Input/Output: On entry: $ifail$ must be set to $0$, $-1$ or $1$. If you are unfamiliar with this argument you should refer to Section 4 in the Introduction to the NAG Library FL Interface for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is $0$. When the value $-1$ or $1$ is used it is essential to test the value of $ifail$ on exit.

On exit: $ifail = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).
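
The three data arguments can be obtained from raw samples as in the sketch below, which evaluates the maximum absolute difference between the two empirical CDFs at every observed value. The helper name `two_sample_d` is hypothetical and the code is a standalone illustration, not a NAG routine.

```python
from bisect import bisect_right


def two_sample_d(x, y):
    """Return (n1, n2, d) for two samples, with d the two-sided KS statistic."""
    if not x or not y:
        raise ValueError("both samples must contain at least one observation")
    xs, ys = sorted(x), sorted(y)
    n1, n2 = len(xs), len(ys)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at each point of the pooled sample.
    for v in xs + ys:
        f1 = bisect_right(xs, v) / n1  # proportion of x values <= v
        f2 = bisect_right(ys, v) / n2  # proportion of y values <= v
        d = max(d, abs(f1 - f2))
    return n1, n2, d
```

The returned values satisfy the constraints above by construction: both sample sizes are at least $1$ and $d$ lies in $[0, 1]$.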

6 Error Indicators and Warnings

If on entry $ifail = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

$ifail = 1$: On entry, $n1 = 〈value〉$ and $n2 = 〈value〉$.
Constraint: $n1 \geq 1$ and $n2 \geq 1$.

$ifail = 2$: On entry, $d < 0.0$ or $d > 1.0$: $d = 〈value〉$.

$ifail = 3$: The Smirnov approximation used for large samples did not converge in $200$ iterations. The probability is set to $1.0$.

$ifail = - 99$: An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.

$ifail = - 399$: Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.

$ifail = - 999$: Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
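
The input checks behind $ifail = 1$ and $ifail = 2$ can be mirrored before a call, as in the following sketch. The function `check_inputs` is a hypothetical helper that reproduces the documented constraints only; the order in which the checks are applied is an assumption, and the convergence failure ($ifail = 3$) and the negative codes can only be raised by the routine itself.

```python
def check_inputs(n1, n2, d):
    """Return the ifail value the documented input constraints would imply.

    0 means the inputs satisfy all constraints listed for n1, n2 and d.
    """
    # ifail = 1: both sample sizes must be at least 1.
    if n1 < 1 or n2 < 1:
        return 1
    # ifail = 2: the statistic must lie in the closed interval [0, 1].
    if d < 0.0 or d > 1.0:
        return 2
    return 0
```

A caller that sets $ifail$ to $-1$ or $1$ on entry still has to test the value returned by the routine, since a pre-check of this kind cannot detect a failure to converge.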

7 Accuracy

The large sample distributions used as approximations to the exact distribution should have a relative error of less than 5% for most cases.

8 Parallelism and Performance

g01ezf is not threaded in any implementation.

9 Further Comments

The upper tail probability for the one-sided statistics, $D_{n_{1}, n_{2}}^{+}$ or $D_{n_{1}, n_{2}}^{-}$, can be approximated by halving the two-sided upper tail probability returned by g01ezf, that is $p / 2$. This approximation to the upper tail probability for either $D_{n_{1}, n_{2}}^{+}$ or $D_{n_{1}, n_{2}}^{-}$ is good for small probabilities (e.g., $p \leq 0.10$) but becomes poor for larger probabilities.
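
The halving rule can be wrapped so that the caller is told when the result falls outside the range where the approximation is described as good. In this sketch `one_sided_tail` is a hypothetical helper, and the threshold of $0.10$ is applied to the two-sided probability $p$, which is one reading of the guidance above.

```python
def one_sided_tail(p_two_sided, reliable_below=0.10):
    """Approximate the one-sided tail probability as p / 2.

    Returns (approximation, reliable), where reliable is False when the
    two-sided probability exceeds the illustrative threshold.
    """
    if not 0.0 <= p_two_sided <= 1.0:
        raise ValueError("p_two_sided must be a probability in [0, 1]")
    approx = p_two_sided / 2.0
    # The approximation degrades as the two-sided probability grows.
    reliable = p_two_sided <= reliable_below
    return approx, reliable
```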

The time taken by the routine increases with $n_{1}$ and $n_{2}$, until $n_{1} n_{2} > 10000$ or $\max (n_{1}, n_{2}) \geq 2500$. At this point one of the approximations is used and the time decreases significantly. The time then increases again modestly with $n_{1}$ and $n_{2}$.
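
The switch-over point can be expressed as a small predicate, which may help when estimating run time for a batch of cases. `uses_approximation` is a hypothetical name, and the thresholds are taken from the paragraph above rather than from the routine's source.

```python
def uses_approximation(n1, n2):
    """True when the sample sizes fall in the regime where an approximation is used."""
    # Either a large product of sample sizes or one large sample triggers
    # the change from the exact calculation to an approximation.
    return n1 * n2 > 10000 or max(n1, n2) >= 2500
```

For example, sample sizes of $100$ and $100$ stay in the exact regime, while $101$ and $100$ do not.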

10 Example

The following example reads in $10$ different sample sizes and values for the test statistic $D_{n_{1}, n_{2}}$. The upper tail probability is computed and printed for each case.


NAG FL Interface g01ezf (prob_kolmogorov2)
