NAG FL Interface
g01eyf (prob_kolmogorov1)
1
Purpose
g01eyf returns the upper tail probability associated with the one sample Kolmogorov–Smirnov distribution.
2
Specification
Fortran Interface
Real (Kind=nag_wp) :: g01eyf
Integer, Intent (In) :: n
Integer, Intent (Inout) :: ifail
Real (Kind=nag_wp), Intent (In) :: d

C Header Interface
#include <nag.h>
double g01eyf_ (const Integer *n, const double *d, Integer *ifail)

C++ Header Interface
#include <nag.h>
extern "C" {
double g01eyf_ (const Integer &n, const double &d, Integer &ifail)
}

The routine may be called by the names g01eyf or nagf_stat_prob_kolmogorov1.
3
Description
Let ${S}_{n}\left(x\right)$ be the sample cumulative distribution function and ${F}_{0}\left(x\right)$ the hypothesised theoretical distribution function.
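g01eyf returns the upper tail probability,
$p$, associated with the one-sided Kolmogorov–Smirnov test statistic
${D}_{n}^{+}$ or
${D}_{n}^{-}$, where these one-sided statistics are defined as follows:
$${D}_{n}^{+}=\sup_{x}\left[{S}_{n}\left(x\right)-{F}_{0}\left(x\right)\right],$$
$${D}_{n}^{-}=\sup_{x}\left[{F}_{0}\left(x\right)-{S}_{n}\left(x\right)\right].$$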
If
$n\le 100$ an exact method is used; for the details see
Conover (1980). Otherwise a large sample approximation derived by Smirnov is used; see
Feller (1948),
Kendall and Stuart (1973) or
Smirnov (1948).
4
References
Conover W J (1980) Practical Nonparametric Statistics Wiley
Feller W (1948) On the Kolmogorov–Smirnov limit theorems for empirical distributions Ann. Math. Statist. 19 179–181
Kendall M G and Stuart A (1973) The Advanced Theory of Statistics (Volume 2) (3rd Edition) Griffin
Siegel S (1956) Nonparametric Statistics for the Behavioral Sciences McGraw–Hill
Smirnov N (1948) Table for estimating the goodness of fit of empirical distributions Ann. Math. Statist. 19 279–281
5
Arguments

1:
$\mathbf{n}$ – Integer
Input

On entry: $n$, the number of observations in the sample.
Constraint:
${\mathbf{n}}\ge 1$.

2:
$\mathbf{d}$ – Real (Kind=nag_wp)
Input

On entry: contains the test statistic, ${D}_{n}^{+}$ or ${D}_{n}^{-}$.
Constraint:
$0.0\le {\mathbf{d}}\le 1.0$.

3:
$\mathbf{ifail}$ – Integer
Input/Output

On entry:
ifail must be set to
$0$,
$-1$ or $1$. If you are unfamiliar with this argument you should refer to
Section 4 in the Introduction to the NAG Library FL Interface for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value
$-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value
$1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is
$0$.
When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit:
${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see
Section 6).
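The value supplied in d is typically computed from the data before calling the routine. The following C sketch shows one way this might be done, assuming the usual definitions ${D}_{n}^{+}=\sup_{x}\left[{S}_{n}\left(x\right)-{F}_{0}\left(x\right)\right]$ and ${D}_{n}^{-}=\sup_{x}\left[{F}_{0}\left(x\right)-{S}_{n}\left(x\right)\right]$, and assuming the caller has already sorted the sample and evaluated ${F}_{0}$ at each ordered observation. The function name and argument layout are hypothetical and are not part of the NAG Library.

```c
#include <stddef.h>

/* Hypothetical helper, not a NAG routine.
   u[0..n-1] must hold F_0(x_(1)) <= ... <= F_0(x_(n)), i.e. the hypothesised
   distribution function evaluated at the sample sorted in ascending order.
   On return, *d_plus = D_n^+ and *d_minus = D_n^-. */
void ks1_one_sided_stats(const double *u, size_t n, double *d_plus, double *d_minus)
{
    double dp = 0.0, dm = 0.0;   /* both suprema are non-negative */
    for (size_t i = 0; i < n; ++i) {
        /* S_n jumps from i/n to (i+1)/n at the (i+1)-th ordered observation. */
        double above = (double)(i + 1) / (double)n - u[i];  /* S_n - F_0 just after the jump */
        double below = u[i] - (double)i / (double)n;        /* F_0 - S_n just before the jump */
        if (above > dp) dp = above;
        if (below > dm) dm = below;
    }
    *d_plus = dp;
    *d_minus = dm;
}
```

Either result lies in the range $0.0$ to $1.0$ for valid input, so it satisfies the constraint on d.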
6
Error Indicators and Warnings
If on entry
${\mathbf{ifail}}=0$ or
$-1$, explanatory error messages are output on the current error message unit (as defined by
x04aaf).
Errors or warnings detected by the routine:
 ${\mathbf{ifail}}=1$

On entry, ${\mathbf{n}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{n}}\ge 1$.
 ${\mathbf{ifail}}=2$

On entry, ${\mathbf{d}}<0.0$ or ${\mathbf{d}}>1.0$: ${\mathbf{d}}=\langle\mathit{\text{value}}\rangle$.
 ${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please
contact
NAG.
See
Section 7 in the Introduction to the NAG Library FL Interface for further information.
 ${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See
Section 8 in the Introduction to the NAG Library FL Interface for further information.
 ${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See
Section 9 in the Introduction to the NAG Library FL Interface for further information.
7
Accuracy
The large sample distribution used as an approximation to the exact distribution should have a relative error of less than $2.5$% for most cases.
8
Parallelism and Performance
g01eyf is not threaded in any implementation.
9
Further Comments
The upper tail probability for the two-sided statistic, ${D}_{n}=\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({D}_{n}^{+},{D}_{n}^{-}\right)$, can be approximated by twice the probability returned via g01eyf, that is $2p$. (Note that if the probability from g01eyf is greater than $0.5$ then the two-sided probability should be truncated to $1.0$.) This approximation to the tail probability for ${D}_{n}$ is good for small probabilities (e.g., $p\le 0.10$) but becomes very poor for larger probabilities.
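The doubling rule described above can be sketched in C as follows. The function name is hypothetical and is not part of the NAG Library; the argument p is assumed to be the one-sided probability already returned by g01eyf.

```c
/* Hypothetical helper, not a NAG routine.
   Approximates the two-sided upper tail probability as 2p,
   truncated to 1.0 when the one-sided probability p exceeds 0.5. */
double ks_two_sided_from_one_sided(double p)
{
    double q = 2.0 * p;
    return (q > 1.0) ? 1.0 : q;
}
```

For instance, a one-sided probability of $0.03$ gives a two-sided estimate of $0.06$, while any one-sided probability above $0.5$ gives $1.0$, a region where the approximation should not be relied upon.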
The time taken by the routine increases with $n$, until $n>100$. At this point the approximation is used and the time decreases significantly. The time then increases again modestly with $n$.
10
Example
The following example reads in $10$ different sample sizes and values for the test statistic ${D}_{n}$. The upper tail probability is computed and printed for each case.