NAG C Library Function Document

nag_chi_sq_goodness_of_fit_test (g08cgc)

1
Purpose

nag_chi_sq_goodness_of_fit_test (g08cgc) computes the test statistic for the χ² goodness-of-fit test for data with a chosen number of class intervals.

2
Specification

#include <nag.h>
#include <nagg08.h>
void  nag_chi_sq_goodness_of_fit_test (Integer nclass, const Integer ifreq[], const double cint[], Nag_Distributions dist, const double par[], Integer npest, const double prob[], double *chisq, double *p, Integer *ndf, double eval[], double chisqi[], NagError *fail)

3
Description

The χ² goodness-of-fit test performed by nag_chi_sq_goodness_of_fit_test (g08cgc) is used to test the null hypothesis that a random sample arises from a specified distribution against the alternative hypothesis that the sample does not arise from the specified distribution.
Given a sample of size n, denoted by x_1, x_2, …, x_n, drawn from a random variable X, and that the data have been grouped into k classes,
x \le c_1, \quad c_{i-1} < x \le c_i, \; i = 2, 3, \ldots, k-1, \quad x > c_{k-1},
then the χ² goodness-of-fit test statistic is defined by:
X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
where O_i is the observed frequency of the ith class, and E_i is the expected frequency of the ith class.
The expected frequencies are computed as
E_i = p_i \times n,
where p_i is the probability that X lies in the ith class, that is
p_1 = P(X \le c_1), \quad p_i = P(c_{i-1} < X \le c_i), \; i = 2, 3, \ldots, k-1, \quad p_k = P(X > c_{k-1}).
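These probabilities are either taken from a common probability distribution or are supplied by you. The available probability distributions within this function are:
  • the Normal distribution with mean μ and variance σ²;
  • the uniform distribution on the interval [a, b];
  • the exponential distribution with parameter λ;
  • the χ² distribution with a specified number of degrees of freedom;
  • the gamma distribution with parameters α and β.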
You must supply the frequencies and classes. Given a set of data and classes the frequencies may be calculated using nag_frequency_table (g01aec).
nag_chi_sq_goodness_of_fit_test (g08cgc) returns the χ² test statistic, X², together with its degrees of freedom and the upper tail probability from the χ² distribution associated with the test statistic. Note that the use of the χ² distribution as an approximation to the distribution of the test statistic improves as the expected values in each class increase.
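The definitions above can be sketched in plain C, independently of the Library. The helper name chi_sq_statistic and its argument layout are hypothetical and chosen only for illustration; the sketch assumes the class probabilities are already known and that every expected frequency is positive. It is not the Library's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the NAG Library).
 * Given observed frequencies obs[0..k-1] and class probabilities
 * prob[0..k-1], fills
 *   expected[i] = prob[i] * n              (E_i = p_i * n)
 *   contrib[i]  = (obs[i] - expected[i])^2 / expected[i]
 * and returns X^2, the sum of the contributions. */
double chi_sq_statistic(size_t k, const long obs[], const double prob[],
                        double expected[], double contrib[])
{
    long n = 0;
    double x2 = 0.0;
    size_t i;

    assert(k >= 2);

    /* The sample size n is the total of the observed frequencies. */
    for (i = 0; i < k; i++)
        n += obs[i];

    for (i = 0; i < k; i++) {
        double d;
        expected[i] = prob[i] * (double)n;
        assert(expected[i] > 0.0); /* assumption of this sketch */
        d = (double)obs[i] - expected[i];
        contrib[i] = d * d / expected[i];
        x2 += contrib[i];
    }
    return x2;
}
```

For instance, with observed frequencies 18, 22, 20, 25, 15 and five equally likely classes, each expected frequency is 20 and the contributions are 0.2, 0.2, 0, 1.25 and 1.25, giving X² = 2.9.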

4
References

Conover W J (1980) Practical Nonparametric Statistics Wiley
Kendall M G and Stuart A (1973) The Advanced Theory of Statistics (Volume 2) (3rd Edition) Griffin
Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill

5
Arguments

1:     nclass  Integer  Input
On entry: the number of classes, k, into which the data is divided.
Constraint: nclass ≥ 2.
2:     ifreq[nclass]  const Integer  Input
On entry: ifreq[i-1] must specify the frequency of the ith class, O_i, for i=1,2,…,k.
Constraint: ifreq[i-1] ≥ 0, for i=1,2,…,k.
3:     cint[nclass-1]  const double  Input
On entry: cint[i-1] must specify the upper boundary value for the ith class, for i=1,2,…,k-1.
Constraints:
  • cint[0] < cint[1] < … < cint[nclass-2];
  • for the exponential, gamma and χ² distributions, cint[0] ≥ 0.0.
4:     dist  Nag_Distributions  Input
On entry: indicates for which distribution the test is to be carried out.
dist=Nag_Normal
The Normal distribution is used.
dist=Nag_Uniform
The uniform distribution is used.
dist=Nag_Exponential
The exponential distribution is used.
dist=Nag_ChiSquare
The χ² distribution is used.
dist=Nag_Gamma
The gamma distribution is used.
dist=Nag_UserProb
You must supply the class probabilities in the array prob.
Constraint: dist=Nag_Normal, Nag_Uniform, Nag_Exponential, Nag_ChiSquare, Nag_Gamma or Nag_UserProb.
5:     par[2]  const double  Input
On entry: par must contain the parameters of the distribution which is being tested. If you supply the probabilities (i.e., dist=Nag_UserProb) the array par is not referenced.
If a Normal distribution is used then par[0] and par[1] must contain the mean, μ, and the variance, σ², respectively.
If a uniform distribution is used then par[0] and par[1] must contain the boundaries a and b respectively.
If an exponential distribution is used then par[0] must contain the parameter λ. par[1] is not used.
If a χ² distribution is used then par[0] must contain the number of degrees of freedom. par[1] is not used.
If a gamma distribution is used then par[0] and par[1] must contain the parameters α and β respectively.
Constraints:
  • if dist=Nag_Normal, par[1] > 0.0;
  • if dist=Nag_Uniform, par[0] < par[1], par[0] ≤ cint[0] and par[1] ≥ cint[nclass-2];
  • if dist=Nag_Exponential, par[0] > 0.0;
  • if dist=Nag_ChiSquare, par[0] > 0.0;
  • if dist=Nag_Gamma, par[0] > 0.0 and par[1] > 0.0.
6:     npest  Integer  Input
On entry: the number of estimated parameters of the distribution.
Constraint: 0 ≤ npest < nclass - 1.
7:     prob[nclass]  const double  Input
On entry: if you are supplying the probability distribution (i.e., dist=Nag_UserProb) then prob[i-1] must contain the probability that X lies in the ith class.
If dist≠Nag_UserProb, prob is not referenced.
Constraint: if dist=Nag_UserProb, prob[i-1] > 0.0, for i=1,2,…,k, and ∑_{i=1}^{k} prob[i-1] = 1.0.
8:     chisq  double *  Output
On exit: the test statistic, X², for the χ² goodness-of-fit test.
9:     p  double *  Output
On exit: the upper tail probability from the χ² distribution associated with the test statistic, X², and the number of degrees of freedom.
10:   ndf  Integer *  Output
On exit: contains nclass - 1 - npest, the degrees of freedom associated with the test.
11:   eval[nclass]  double  Output
On exit: eval[i-1] contains the expected frequency for the ith class, E_i, for i=1,2,…,k.
12:   chisqi[nclass]  double  Output
On exit: chisqi[i-1] contains the contribution from the ith class to the test statistic, that is (O_i - E_i)² / E_i, for i=1,2,…,k.
13:   fail  NagError *  Input/Output
The NAG error argument (see Section 3.7 in How to Use the NAG Library and its Documentation).

6
Error Indicators and Warnings

NE_ARRAY_CONS
The contents of array prob are not valid.
Constraint: sum of prob[i-1] = 1, for i=1,2,…,nclass, when dist=Nag_UserProb.
NE_ARRAY_INPUT
On entry, the values provided in par are invalid.
NE_BAD_PARAM
On entry, argument dist had an illegal value.
NE_G08CG_CLASS_VAL
This is a warning that expected values for certain classes are less than 1.0. This implies that one cannot be confident that the χ² distribution is a good approximation to the distribution of the test statistic.
NE_G08CG_CONV
The solution obtained when calculating the probability for a certain class for the gamma or χ² distribution did not converge in 600 iterations. The solution may be an adequate approximation.
NE_G08CG_FREQ
An expected frequency is equal to zero when the observed frequency is not.
NE_INT_2
On entry, npest=value, nclass=value.
Constraint: 0 ≤ npest < nclass - 1.
NE_INT_ARG_LT
On entry, nclass=value.
Constraint: nclass ≥ 2.
NE_INT_ARRAY_CONS
On entry, ifreq[value] = value.
Constraint: ifreq[i-1] ≥ 0, for i=1,2,…,nclass.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_NOT_STRICTLY_INCREASING
The sequence cint is not strictly increasing: cint[value] = value, cint[value-1] = value.
NE_REAL_ARRAY_CONS
On entry, prob[value] = value.
Constraint: prob[i-1] > 0, for i=1,2,…,nclass, when dist=Nag_UserProb.
NE_REAL_ARRAY_ELEM_CONS
On entry, cint[0] = value.
Constraint: cint[0] ≥ 0.0, if dist=Nag_Exponential, Nag_ChiSquare or Nag_Gamma.
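The two conditions on expected frequencies reported through NE_G08CG_CLASS_VAL and NE_G08CG_FREQ can be mimicked in plain C, for example to screen a set of expected frequencies before relying on the χ² approximation. The enumeration and function below are hypothetical names introduced for this sketch; they are not Library identifiers, and the logic is an assumption based only on the descriptions above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch (not NAG error codes). */
typedef enum {
    GOF_OK,                  /* all expected frequencies are >= 1.0       */
    GOF_WARN_SMALL_EXPECTED, /* some expected frequency is below 1.0      */
    GOF_ERR_ZERO_EXPECTED    /* expected is zero but observed is non-zero */
} gof_check;

/* Hypothetical helper: inspects expected[0..k-1] against obs[0..k-1]. */
gof_check check_expected(size_t k, const long obs[], const double expected[])
{
    gof_check status = GOF_OK;
    size_t i;

    assert(k >= 2);

    for (i = 0; i < k; i++) {
        /* A zero expected frequency with a non-zero count is an error. */
        if (expected[i] == 0.0 && obs[i] != 0)
            return GOF_ERR_ZERO_EXPECTED;
        /* Small expected values only weaken the approximation. */
        if (expected[i] < 1.0)
            status = GOF_WARN_SMALL_EXPECTED;
    }
    return status;
}
```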

7
Accuracy

The computations are believed to be stable.

8
Parallelism and Performance

nag_chi_sq_goodness_of_fit_test (g08cgc) is not threaded in any implementation.

9
Further Comments

The time taken by nag_chi_sq_goodness_of_fit_test (g08cgc) depends on both the distribution chosen and the number of classes, k.

10
Example

The example program applies the χ² goodness-of-fit test to assess whether there is evidence that a sample of 100 observations generated by nag_rand_uniform (g05sqc) does not arise from a uniform distribution U(0,1). The class intervals are calculated such that the interval (0,1) is divided into five equal classes. The frequencies for each class are calculated using nag_frequency_table (g01aec).
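The class construction used in the example can be sketched in plain C without calling the Library. The two helpers below are hypothetical stand-ins written for illustration only: the first produces the upper boundaries of k equal classes on an interval, and the second counts frequencies assuming classes that are closed on the right, as in Section 3. They are not the implementation of nag_frequency_table (g01aec).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill cint[0..k-2] with the upper boundaries
 * of k equal classes on the interval (a, b). */
void equal_class_boundaries(size_t k, double a, double b, double cint[])
{
    size_t i;

    assert(k >= 2 && a < b);
    for (i = 0; i + 1 < k; i++)
        cint[i] = a + (b - a) * (double)(i + 1) / (double)k;
}

/* Hypothetical helper: count observations per class, where
 *   class 0   holds x <= cint[0],
 *   class i   holds cint[i-1] < x <= cint[i],
 *   class k-1 holds x > cint[k-2]. */
void class_frequencies(size_t n, const double x[], size_t k,
                       const double cint[], long freq[])
{
    size_t i, j;

    for (i = 0; i < k; i++)
        freq[i] = 0;

    for (j = 0; j < n; j++) {
        /* Advance until x[j] no longer exceeds the class boundary. */
        i = 0;
        while (i + 1 < k && x[j] > cint[i])
            i++;
        freq[i]++;
    }
}
```

For five classes on (0,1) the boundaries are 0.2, 0.4, 0.6 and 0.8, which correspond to the cint argument in the example, and the resulting counts correspond to ifreq.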

10.1
Program Text

Program Text (g08cgce.c)

10.2
Program Data

Program Data (g08cgce.d)

10.3
Program Results

Program Results (g08cgce.r)