# NAG Toolbox: nag_stat_prob_hypergeom (g01bl)

## Purpose

nag_stat_prob_hypergeom (g01bl) returns the lower tail, upper tail and point probabilities associated with a hypergeometric distribution.

## Syntax

[plek, pgtk, peqk, ifail] = g01bl(n, l, m, k)
[plek, pgtk, peqk, ifail] = nag_stat_prob_hypergeom(n, l, m, k)

## Description

Let $X$ denote a random variable having a hypergeometric distribution with parameters $n$, $l$ and $m$ ($n\ge l\ge 0$, $n\ge m\ge 0$). Then
 $\mathrm{Prob}\left\{X=k\right\}=\frac{\binom{m}{k}\binom{n-m}{l-k}}{\binom{n}{l}},$
where $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(0,l-\left(n-m\right)\right)\le k\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(l,m\right)$, $0\le l\le n$ and $0\le m\le n$.
The hypergeometric distribution may arise if in a population of size $n$ a number $m$ are marked. From this population a sample of size $l$ is drawn and of these $k$ are observed to be marked.
The mean of the distribution is $\frac{lm}{n}$, and the variance is $\frac{lm\left(n-l\right)\left(n-m\right)}{{n}^{2}\left(n-1\right)}$.
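As a quick illustration of these formulas, the following minimal sketch evaluates the point probability, mean and variance directly in base MATLAB for the first row of the example data ($n=10$, $l=2$, $m=5$, $k=1$). It does not call the NAG function; the variable names `mu` and `sigma2` are chosen here for illustration only, and `nchoosek` is practical only for small arguments.

```matlab
% Illustrative values taken from the first row of the example data.
n = 10; l = 2; m = 5; k = 1;

% Point probability Prob{X = k} from the binomial-coefficient formula.
peqk = nchoosek(m, k) * nchoosek(n - m, l - k) / nchoosek(n, l);

% Mean and variance as given in the text (mu and sigma2 are illustrative names).
mu     = l * m / n;
sigma2 = l * m * (n - l) * (n - m) / (n^2 * (n - 1));

fprintf('peqk = %.5f, mean = %.4f, variance = %.4f\n', peqk, mu, sigma2);
```

For these values the point probability is $25/45\approx 0.55556$, which agrees with the `peqk` value printed for the first row in the Example section.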
nag_stat_prob_hypergeom (g01bl) computes for given $n$, $l$, $m$ and $k$ the probabilities:
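 $\begin{array}{lcl}{\mathbf{plek}}&=&\mathrm{Prob}\left\{X\le k\right\}\\ {\mathbf{pgtk}}&=&\mathrm{Prob}\left\{X>k\right\}\\ {\mathbf{peqk}}&=&\mathrm{Prob}\left\{X=k\right\}\text{.}\end{array}$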
The method is similar to the method for the Poisson distribution described in Knüsel (1986).
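For cross-checking small cases, the three probabilities can also be obtained by brute-force summation of the point probabilities over the whole support. The sketch below is not the method used by the function (which follows Knüsel's approach); it is an assumed, straightforward alternative in base MATLAB that uses `gammaln` for the log binomial coefficients. The helper `logC` and the vector `j` are illustrative names.

```matlab
% Illustrative values taken from the second row of the example data.
n = 40; l = 10; m = 3; k = 2;

% Log of the binomial coefficient C(a, b), via gammaln to avoid overflow.
logC = @(a, b) gammaln(a + 1) - gammaln(b + 1) - gammaln(a - b + 1);

% Support of X: max(0, l-(n-m)) <= j <= min(l, m).
j = max(0, l - (n - m)):min(l, m);

% Point probabilities over the whole support.
p = exp(logC(m, j) + logC(n - m, l - j) - logC(n, l));

peqk = p(j == k);        % Prob{X = k}
plek = sum(p(j <= k));   % Prob{X <= k}
pgtk = sum(p(j > k));    % Prob{X > k}

fprintf('%10.5f%10.5f%10.5f\n', plek, pgtk, peqk);
```

The cost of this direct sum grows with the size of the support, so it is suitable only as a check on modest problem sizes.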

## References

Knüsel L (1986) Computation of the chi-square and Poisson distribution SIAM J. Sci. Statist. Comput. 7 1022–1036

## Parameters

### Compulsory Input Parameters

1:     $\mathrm{n}$ – int64/int32/nag_int scalar
The parameter $n$ of the hypergeometric distribution.
Constraint: ${\mathbf{n}}\ge 0$.
2:     $\mathrm{l}$ – int64/int32/nag_int scalar
The parameter $l$ of the hypergeometric distribution.
Constraint: $0\le {\mathbf{l}}\le {\mathbf{n}}$.
3:     $\mathrm{m}$ – int64/int32/nag_int scalar
The parameter $m$ of the hypergeometric distribution.
Constraint: $0\le {\mathbf{m}}\le {\mathbf{n}}$.
4:     $\mathrm{k}$ – int64/int32/nag_int scalar
The integer $k$ which defines the required probabilities.
Constraint: $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(0,{\mathbf{l}}-\left({\mathbf{n}}-{\mathbf{m}}\right)\right)\le {\mathbf{k}}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{l}},{\mathbf{m}}\right)$.

### Optional Input Parameters

None.

### Output Parameters

1:     $\mathrm{plek}$ – double scalar
The lower tail probability, $\mathrm{Prob}\left\{X\le k\right\}$.
2:     $\mathrm{pgtk}$ – double scalar
The upper tail probability, $\mathrm{Prob}\left\{X>k\right\}$.
3:     $\mathrm{peqk}$ – double scalar
The point probability, $\mathrm{Prob}\left\{X=k\right\}$.
4:     $\mathrm{ifail}$ – int64/int32/nag_int scalar
${\mathbf{ifail}}={\mathbf{0}}$ unless the function detects an error (see Error Indicators and Warnings).

## Error Indicators and Warnings

Errors or warnings detected by the function:
${\mathbf{ifail}}=1$
 On entry, ${\mathbf{n}}<0$.
${\mathbf{ifail}}=2$
 On entry, ${\mathbf{l}}<0$, or ${\mathbf{l}}>{\mathbf{n}}$.
${\mathbf{ifail}}=3$
 On entry, ${\mathbf{m}}<0$, or ${\mathbf{m}}>{\mathbf{n}}$.
${\mathbf{ifail}}=4$
 On entry, ${\mathbf{k}}<0$, or ${\mathbf{k}}>{\mathbf{l}}$, or ${\mathbf{k}}>{\mathbf{m}}$, or ${\mathbf{k}}<{\mathbf{l}}+{\mathbf{m}}-{\mathbf{n}}$.
${\mathbf{ifail}}=5$
 On entry, ${\mathbf{n}}$ is too large to be represented exactly as a double precision number.
${\mathbf{ifail}}=6$
 On entry, the variance (see Description) exceeds ${10}^{6}$.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
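The argument constraints behind ${\mathbf{ifail}}=1$ to $4$ and $6$ can be checked before calling the function. The following is a minimal sketch in base MATLAB that mirrors the documented conditions; the order of the tests and the variable `code` are assumptions made for illustration and do not describe the function's internal logic. The ${\mathbf{ifail}}=5$ condition is omitted because its exact threshold is not stated here.

```matlab
% Hypothetical inputs; k = 4 violates the constraint k <= min(l, m).
n = 40; l = 10; m = 3; k = 4;

code = 0;  % 0 means all of the checked constraints hold
if n < 0
    code = 1;                                   % n < 0
elseif l < 0 || l > n
    code = 2;                                   % l outside [0, n]
elseif m < 0 || m > n
    code = 3;                                   % m outside [0, n]
elseif k < 0 || k > l || k > m || k < l + m - n
    code = 4;                                   % k outside its valid range
elseif l * m * (n - l) * (n - m) / (n^2 * (n - 1)) > 1e6
    code = 6;                                   % variance exceeds 10^6
end

fprintf('expected error code: %d\n', code);
```

For $n=0$ or $n=1$ the variance expression evaluates to NaN in this sketch, so the final comparison is false and no variance error is flagged.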

## Accuracy

Results are correct to a relative accuracy of at least ${10}^{-6}$ on machines with a precision of $9$ or more decimal digits, and to a relative accuracy of at least ${10}^{-3}$ on machines of lower precision (provided that the results do not underflow to zero).

## Further Comments

The time taken by nag_stat_prob_hypergeom (g01bl) depends on the variance (see Description) and on $k$. For a given variance, the time is greatest when $k\approx lm/n$ (the mean), and is then approximately proportional to the square root of the variance.
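To get a rough feel for how the cost might scale across problem sizes, the square root of the variance can be tabulated for the parameter sets used in the Example section. This is only a sketch of the proportionality stated above, computed in base MATLAB; it is not a timing measurement, and the variable names are illustrative.

```matlab
% Parameter sets from the example data (one case per column position).
n = [10 40 155 1000];
l = [ 2 10  35  444];
m = [ 5  3 122  500];

% Variance of each distribution, using the formula from the Description.
variance = l .* m .* (n - l) .* (n - m) ./ (n.^2 .* (n - 1));

% sqrt(variance) serves as a relative cost indicator when k is near the mean.
fprintf('    n    variance  sqrt(var)\n');
fprintf('%5d%12.4f%11.4f\n', [n; variance; sqrt(variance)]);
```

All four variances are far below the ${10}^{6}$ limit associated with ${\mathbf{ifail}}=6$.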

## Example

This example evaluates the probabilities for four sets of values of $n$, $l$, $m$ and $k$, and prints the results.
```matlab
function g01bl_example

fprintf('g01bl example results\n\n');

% Each position i in these vectors defines one test case (n, l, m, k).
n = int64([10 40 155 1000]);
l = int64([ 2 10  35  444]);
m = int64([ 5  3 122  500]);
k = int64([ 1  2  22  220]);

fprintf('    n   l   m   k     plek      pgtk      peqk\n');
for i = 1:4
  % Lower tail, upper tail and point probabilities for case i.
  [plek, pgtk, peqk, ifail] = ...
    g01bl(n(i), l(i), m(i), k(i));

  fprintf('%5d%4d%4d%4d%10.5f%10.5f%10.5f\n', n(i), l(i), m(i), k(i), ...
          plek, pgtk, peqk);
end
```
```
g01bl example results

    n   l   m   k     plek      pgtk      peqk
   10   2   5   1   0.77778   0.22222   0.55556
   40  10   3   2   0.98785   0.01215   0.13664
  155  35 122  22   0.01101   0.98899   0.00779
 1000 444 500 220   0.42429   0.57571   0.04913
```