# NAG Toolbox: nag_nonpar_gofstat_anddar (g08ch)

## Purpose

nag_nonpar_gofstat_anddar (g08ch) calculates the Anderson–Darling goodness-of-fit test statistic.

## Syntax

[result, y, ifail] = g08ch(issort, y, 'n', n)
[result, y, ifail] = nag_nonpar_gofstat_anddar(issort, y, 'n', n)

## Description

Denote by $A^2$ the Anderson–Darling test statistic for $n$ observations $y_1, y_2, \dots, y_n$ of a variable $Y$ assumed to be standard uniform and sorted in ascending order. Then
$$A^2 = -n - S,$$
where
$$S = \sum_{i=1}^{n} \frac{2i-1}{n} \left[ \ln y_i + \ln\left(1 - y_{n-i+1}\right) \right].$$
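As a minimal sketch of the formula for the statistic in plain MATLAB (no NAG call), the sum can be evaluated directly on sorted values in the open interval (0,1). The sample values and variable names below are illustrative assumptions, not data from this document.

```matlab
% Illustrative standard-uniform sample (assumed values), sorted ascending
y = sort([0.62, 0.11, 0.87, 0.35, 0.49]);
n = numel(y);
i = 1:n;

% S = sum_{i=1}^{n} (2i-1)/n * [ln(y_i) + ln(1 - y_{n-i+1})]
% y(n:-1:1) reverses the sorted sample, giving y_{n-i+1} at position i
S = sum((2*i - 1) / n .* (log(y) + log(1 - y(n:-1:1))));

% A^2 = -n - S
A2 = -n - S;
fprintf('A-squared: %8.4f\n', A2);
```

Reversing the sorted vector avoids an explicit loop over the index pair $(i, n-i+1)$.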
When observations of a random variable $X$ are not uniformly distributed, the probability integral transformation (PIT)
$$Y = F(X),$$
where $F$ is the cumulative distribution function of the distribution of interest, yields a uniformly distributed random variable $Y$. The PIT holds exactly only if all parameters of the distribution are known rather than estimated; otherwise the transformed values are only approximately uniform.
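The transformation can be sketched in plain MATLAB for an exponential distribution, whose CDF with mean $\beta$ is $F(x) = 1 - e^{-x/\beta}$. The data are the first six values from the Example section of this document. Because the mean is estimated from the data, the result is only approximately uniform, as noted above.

```matlab
% First six observations from the example data in this document
x = [0.4782745, 1.2858962, 1.1163891, 2.0410619, 2.2648109, 0.0833660];

% Mean estimated from the data, so the PIT is only approximate here
beta = mean(x);

% Exponential CDF with mean beta: F(x) = 1 - exp(-x/beta)
y = 1 - exp(-x / beta);

% Every transformed value should fall strictly inside (0,1)
inUnitInterval = all(y > 0 & y < 1);
disp(inUnitInterval);
```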

## References

Anderson T W and Darling D A (1952) Asymptotic theory of certain ‘goodness-of-fit’ criteria based on stochastic processes Annals of Mathematical Statistics 23 193–212

## Parameters

### Compulsory Input Parameters

1:     issort – logical scalar
Set $\mathbf{issort} = \mathbf{true}$ if the observations are sorted in ascending order; otherwise the function will sort the observations.
2:     y(n) – double array
n, the dimension of the array, must satisfy the constraint $\mathbf{n} > 1$.
$y_i$, for $i = 1, 2, \dots, n$, the $n$ observations.
Constraint: if $\mathbf{issort} = \mathbf{true}$, the values must be sorted in ascending order. Each $y_i$ must lie in the interval $(0,1)$.

### Optional Input Parameters

1:     n – int64/int32/nag_int scalar
Default: The dimension of the array y.
$n$, the number of observations.
Constraint: $\mathbf{n} > 1$.

### Output Parameters

1:     result – double scalar
The Anderson–Darling test statistic, $A^2$.
2:     y(n) – double array
If $\mathbf{issort} = \mathbf{false}$, the data sorted in ascending order; otherwise the array is unchanged.
3:     ifail – int64/int32/nag_int scalar
${\mathrm{ifail}}={\mathbf{0}}$ unless the function detects an error (see [Error Indicators and Warnings]).

## Error Indicators and Warnings

Errors or warnings detected by the function:
$\mathbf{ifail} = 1$
Constraint: $\mathbf{n} > 1$.
$\mathbf{ifail} = 3$
$\mathbf{issort} = \mathbf{true}$ and the data in y is not sorted in ascending order.
$\mathbf{ifail} = 9$
The data in y must lie in the interval $(0,1)$.
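The conditions above can be checked before calling the function. The following is a hypothetical pre-check in plain MATLAB that mirrors the three documented constraints; it is not part of the NAG Toolbox. The order in which the library tests the conditions is an assumption, as is treating tied values as validly sorted.

```matlab
% Hypothetical pre-check mirroring the documented constraints
% (not a NAG routine; check order and handling of ties are assumptions)
y = [0.12, 0.80, 0.35];   % illustrative values, deliberately out of order
issort = true;            % caller claims the data are already sorted

code = 0;
if numel(y) <= 1
    code = 1;             % constraint n > 1 violated
elseif issort && any(diff(y) < 0)
    code = 3;             % issort = true but data not in ascending order
elseif any(y <= 0 | y >= 1)
    code = 9;             % some value outside the open interval (0,1)
end

fprintf('pre-check code: %d\n', code);
```

With these inputs the second branch fires, because 0.35 follows 0.80 while `issort` is `true`.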

## Example

```
function nag_nonpar_gofstat_anddar_example
n = 26;
x = [0.4782745, 1.2858962, 1.1163891, 2.0410619, 2.2648109, 0.0833660, ...
1.2527554, 0.4031288, 0.7808981, 0.1977674, 3.2539440, 1.8113504, ...
1.2279834, 3.9178773, 1.4494309, 0.1358438, 1.8061778, 6.0441929, ...
0.9671624, 3.2035042, 0.8067364, 0.4179364, 3.5351774, 0.3975414, ...
0.6120960, 0.1332589];
% Maximum likelihood estimate of mean
beta = mean(x);
% PIT, using exponential CDF with mean beta
y = 1 - exp(-x/beta);
% Let nag_nonpar_gofstat_anddar sort the (approximately) uniform variates
issort = false;

% Calculate a-squared
[a2, y, ifail] = nag_nonpar_gofstat_anddar(issort, y);
aa2 = (1+0.6/numel(y))*a2;

% Number of simulations
nsim = 888;
seed = [int64(206033)];
[state, ifail] = nag_rand_init_repeat(int64(1), int64(-1), seed);
[state, xsim, ifail] = nag_rand_dist_exp(int64(n*nsim), beta, state);

% Simulations loop
nupper = 0;
for j = 1:nsim
  k = (j-1)*n;
  % Maximum likelihood estimate of mean
  sbeta = mean(xsim(k+1:k+n));
  % PIT
  y = 1 - exp(-xsim(k+1:k+n)/sbeta);
  % Calculate a-squared
  [sa2, y, ifail] = nag_nonpar_gofstat_anddar(issort, y);
  if sa2 > aa2
    nupper = nupper + 1;
  end
end

% Simulated upper tail probability value
p = nupper/(nsim+1);

% Results
fprintf('\nH0: data from exponential distribution with mean %10.4e\n', beta);
fprintf('Test statistic, A-squared: %8.4f\n', a2);
fprintf('Upper tail probability:    %8.4f\n', p);
```
```

H0: data from exponential distribution with mean 1.5240e+00
Test statistic, A-squared:   0.1616
Upper tail probability:      0.9798

```
