NAG Library Function Document
1 Purpose
nag_prob_2_sample_ks (g01ezc) returns the probability associated with the upper tail of the Kolmogorov–Smirnov two sample distribution.
2 Specification
double nag_prob_2_sample_ks (Integer n1, Integer n2, double d, NagError *fail)
3 Description
Let $F_{n_1}(x)$ and $G_{n_2}(x)$ denote the empirical cumulative distribution functions for the two samples, where $n_1$ and $n_2$ are the sizes of the first and second samples respectively.
The function nag_prob_2_sample_ks (g01ezc) computes the upper tail probability for the Kolmogorov–Smirnov two sample two-sided test statistic $D_{n_1,n_2}$, where
$D_{n_1,n_2} = \sup_x \left| F_{n_1}(x) - G_{n_2}(x) \right|$.
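The statistic itself is an input to this function, not something it computes. As an illustration of the definition above, the following minimal sketch evaluates the maximum absolute difference between the two empirical CDFs from raw samples. The name `ks2_statistic` is hypothetical and is not part of the NAG Library; the return value of -1.0 for invalid input is likewise an assumption of this sketch.

```c
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: ascending order of doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper (not a NAG routine): two-sided two-sample
 * Kolmogorov-Smirnov statistic, i.e. the largest absolute difference
 * between the empirical CDFs of x (n1 values) and y (n2 values).
 * Returns -1.0 on invalid input or allocation failure.
 */
double ks2_statistic(const double *x, int n1, const double *y, int n2)
{
    if (x == NULL || y == NULL || n1 < 1 || n2 < 1)
        return -1.0;

    double *xs = malloc((size_t)n1 * sizeof *xs);
    double *ys = malloc((size_t)n2 * sizeof *ys);
    if (xs == NULL || ys == NULL) {
        free(xs);
        free(ys);
        return -1.0;
    }
    memcpy(xs, x, (size_t)n1 * sizeof *xs);
    memcpy(ys, y, (size_t)n2 * sizeof *ys);
    qsort(xs, (size_t)n1, sizeof *xs, cmp_double);
    qsort(ys, (size_t)n2, sizeof *ys, cmp_double);

    int i = 0, j = 0;
    double dmax = 0.0;
    /* Walk both sorted samples; once one is exhausted the
       difference can only shrink, so the loop may stop. */
    while (i < n1 && j < n2) {
        double t = (xs[i] <= ys[j]) ? xs[i] : ys[j];
        while (i < n1 && xs[i] <= t) i++;   /* i = count of x <= t */
        while (j < n2 && ys[j] <= t) j++;   /* j = count of y <= t */
        double diff = fabs((double)i / n1 - (double)j / n2);
        if (diff > dmax)
            dmax = diff;
    }

    free(xs);
    free(ys);
    return dmax;
}
```

The value returned by such a helper could then be passed as the argument d, together with the two sample sizes.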
The probability is computed exactly, using a method given by Kim and Jenrich (1973), provided that $n_1$ and $n_2$ are sufficiently small. When the exact method is not used, the Smirnov approximation is applied in one particular case and the Kolmogorov approximation in all other cases. These two approximations are discussed in Kim and Jenrich (1973).
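To make the idea of an exact computation concrete, the sketch below uses the standard lattice-path recursion for the two-sample statistic: every ordering of the pooled sample is a path from (0,0) to (n1,n2), and the probability that the statistic stays below d is the fraction of paths that never leave the band |i/n1 - j/n2| < d. This is offered as an illustration only; it is not claimed to match the method of Kim and Jenrich (1973) in detail, and the name `ks2_exact_upper_tail`, the tolerance, and the -1.0 error return are assumptions of the sketch.

```c
#include <math.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a NAG routine): exact P(D >= d) for sample
 * sizes n1 and n2 by lattice-path counting, assuming no ties.
 * u[j] holds, for the current row i, the fraction of paths from (0,0)
 * to (i,j) that stay strictly inside the band |i/n1 - j/n2| < d.
 * Returns -1.0 on invalid input or allocation failure.
 */
double ks2_exact_upper_tail(int n1, int n2, double d)
{
    if (n1 < 1 || n2 < 1 || d < 0.0 || d > 1.0)
        return -1.0;

    double *u = malloc((size_t)(n2 + 1) * sizeof *u);
    if (u == NULL)
        return -1.0;

    /* |i/n1 - j/n2| < d  <=>  |i*n2 - j*n1| < d*n1*n2.  The small
       tolerance makes a d lying exactly on the lattice count as D >= d;
       it is only adequate for moderate n1*n2. */
    double bound = d * (double)n1 * (double)n2 - 1e-9;

    for (int i = 0; i <= n1; i++) {
        for (int j = 0; j <= n2; j++) {
            double dist = fabs((double)i * n2 - (double)j * n1);
            if (!(dist < bound)) {
                u[j] = 0.0;                 /* path left the band */
            } else if (i == 0 && j == 0) {
                u[j] = 1.0;                 /* starting point */
            } else {
                /* u[j-1] is already the row-i value; u[j] is still row i-1. */
                double left  = (j > 0) ? u[j - 1] * j / (double)(i + j) : 0.0;
                double below = (i > 0) ? u[j] * i / (double)(i + j) : 0.0;
                u[j] = left + below;
            }
        }
    }

    double p = 1.0 - u[n2];
    free(u);
    return p;
}
```

For two samples of size 2, only two of the six possible orderings give a statistic of 1, so `ks2_exact_upper_tail(2, 2, 0.75)` evaluates to 1/3. The work grows with the product of the sample sizes, which is why an approximation becomes attractive for large samples.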
4 References
Conover W J (1980) Practical Nonparametric Statistics Wiley
Feller W (1948) On the Kolmogorov–Smirnov limit theorems for empirical distributions Ann. Math. Statist. 19 179–181
Kendall M G and Stuart A (1973) The Advanced Theory of Statistics (Volume 2) (3rd Edition) Griffin
Kim P J and Jenrich R I (1973) Tables of exact sampling distribution of the two sample Kolmogorov–Smirnov criterion Selected Tables in Mathematical Statistics 1 80–129 American Mathematical Society
Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill
Smirnov N (1948) Table for estimating the goodness of fit of empirical distributions Ann. Math. Statist. 19 279–281
5 Arguments
n1 – Integer (Input)
On entry: the number of observations in the first sample, $n_1$.
n2 – Integer (Input)
On entry: the number of observations in the second sample, $n_2$.
d – double (Input)
On entry: the test statistic $D_{n_1,n_2}$ for the two sample Kolmogorov–Smirnov goodness-of-fit test, that is, the maximum difference between the empirical cumulative distribution functions (CDFs) of the two samples.
fail – NagError * (Input/Output)
The NAG error argument (see Section 3.6 in the Essential Introduction).
6 Error Indicators and Warnings
The Smirnov approximation used for large samples did not converge in iterations. The probability is set to .
On entry, n1 = ⟨value⟩ and n2 = ⟨value⟩.
Constraint: n1 ≥ 1 and n2 ≥ 1.
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG.
On entry, d < 0.0 or d > 1.0: d = ⟨value⟩.
7 Accuracy
The large sample distributions used as approximations to the exact distribution should have a relative error of less than 5% for most cases.
8 Parallelism and Performance
9 Further Comments
The upper tail probability for the one-sided statistics, $D^{+}_{n_1,n_2}$ or $D^{-}_{n_1,n_2}$, can be approximated by halving the two-sided upper tail probability returned by nag_prob_2_sample_ks (g01ezc), that is $p/2$, where $p$ is the returned probability. This approximation to the upper tail probability for either $D^{+}_{n_1,n_2}$ or $D^{-}_{n_1,n_2}$ is good for small probabilities but becomes poor for larger probabilities.
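The large-sample limiting forms give a rough sense of why halving works well only for small probabilities. The expressions below are the standard asymptotic results, assumed here purely for illustration; they are not a description of what the function computes.

```latex
\begin{align*}
P\left(D_{n_1,n_2} \ge d\right) &\approx 2\sum_{k=1}^{\infty} (-1)^{k-1} e^{-2k^2\lambda^2},
\qquad \lambda = d\sqrt{\frac{n_1 n_2}{n_1 + n_2}},\\
P\left(D^{+}_{n_1,n_2} \ge d\right) &\approx e^{-2\lambda^2},\\
\tfrac{1}{2}\,P\left(D_{n_1,n_2} \ge d\right) - P\left(D^{+}_{n_1,n_2} \ge d\right)
&\approx -e^{-8\lambda^2} + e^{-18\lambda^2} - \cdots
\end{align*}
```

The leading discrepancy, $e^{-8\lambda^2}$, is the fourth power of the one-sided probability $e^{-2\lambda^2}$. For a one-sided probability of 0.05 it is about $6 \times 10^{-6}$ and negligible, whereas for a one-sided probability of 0.5 it is about 0.06 and no longer small.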
The time taken by the function increases with $n_1$ and $n_2$ until the sample sizes become large enough for one of the approximations to be used, at which point the time decreases significantly. The time then increases again modestly with $n_1$ and $n_2$.
10 Example
The following example reads in different sample sizes and values for the test statistic $D_{n_1,n_2}$. The upper tail probability is computed and printed for each case.
10.1 Program Text
Program Text (g01ezce.c)
10.2 Program Data
Program Data (g01ezce.d)
10.3 Program Results
Program Results (g01ezce.r)