
# NAG Toolbox: nag_stat_prob_hypergeom_vector (g01sl)

## Purpose

nag_stat_prob_hypergeom_vector (g01sl) returns a number of lower tail, upper tail and point probabilities for the hypergeometric distribution.

## Syntax

[plek, pgtk, peqk, ivalid, ifail] = g01sl(n, l, m, k, 'ln', ln, 'll', ll, 'lm', lm, 'lk', lk)
[plek, pgtk, peqk, ivalid, ifail] = nag_stat_prob_hypergeom_vector(n, l, m, k, 'ln', ln, 'll', ll, 'lm', lm, 'lk', lk)

## Description

Let $X=\left\{{X}_{i}:i=1,2,\dots ,r\right\}$ denote a vector of random variables having a hypergeometric distribution with parameters ${n}_{i}$, ${l}_{i}$ and ${m}_{i}$. Then
 $\mathrm{Prob}\left\{{X}_{i}={k}_{i}\right\}=\frac{\binom{{m}_{i}}{{k}_{i}}\binom{{n}_{i}-{m}_{i}}{{l}_{i}-{k}_{i}}}{\binom{{n}_{i}}{{l}_{i}}},$
where $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(0,{l}_{i}+{m}_{i}-{n}_{i}\right)\le {k}_{i}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({l}_{i},{m}_{i}\right)$, $0\le {l}_{i}\le {n}_{i}$ and $0\le {m}_{i}\le {n}_{i}$.
The hypergeometric distribution may arise if in a population of size ${n}_{i}$ a number ${m}_{i}$ are marked. From this population a sample of size ${l}_{i}$ is drawn and of these ${k}_{i}$ are observed to be marked.
The mean of the distribution is $\frac{{l}_{i}{m}_{i}}{{n}_{i}}$, and the variance is $\frac{{l}_{i}{m}_{i}\left({n}_{i}-{l}_{i}\right)\left({n}_{i}-{m}_{i}\right)}{{{n}_{i}}^{2}\left({n}_{i}-1\right)}$.
nag_stat_prob_hypergeom_vector (g01sl) computes for given ${n}_{i}$, ${l}_{i}$, ${m}_{i}$ and ${k}_{i}$ the probabilities: $\mathrm{Prob}\left\{{X}_{i}\le {k}_{i}\right\}$, $\mathrm{Prob}\left\{{X}_{i}>{k}_{i}\right\}$ and $\mathrm{Prob}\left\{{X}_{i}={k}_{i}\right\}$ using an algorithm similar to that described in Knüsel (1986) for the Poisson distribution.
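The three probabilities can be checked by hand for small arguments. The following is a minimal sketch, not the algorithm used by g01sl: it evaluates the binomial-coefficient formula directly with the base MATLAB function `nchoosek`, for the illustrative values $n=10$, $l=2$, $m=5$, $k=1$ (the first row of the Example). The names `pmf`, `klo`, `mu` and `sigma2` are chosen here for illustration only.

```matlab
% Hand check of the formulas above without any NAG call.
n = 10; l = 2; m = 5; k = 1;

% Point probability from the binomial-coefficient formula.
pmf = @(j) nchoosek(m, j) * nchoosek(n - m, l - j) / nchoosek(n, l);

klo  = max(0, l + m - n);            % smallest attainable value of X
peqk = pmf(k);                       % Prob{X = k}
plek = sum(arrayfun(pmf, klo:k));    % Prob{X <= k}
pgtk = 1 - plek;                     % Prob{X > k}

mu     = l * m / n;                                     % mean
sigma2 = l * m * (n - l) * (n - m) / (n^2 * (n - 1));   % variance

fprintf('peqk = %.5f, plek = %.5f, pgtk = %.5f\n', peqk, plek, pgtk);
fprintf('mean = %.5f, variance = %.5f\n', mu, sigma2);
```

Here `peqk` is $25/45$, `plek` is $35/45$, the mean is $1$ and the variance is $4/9$. Direct use of `nchoosek` is only suitable for small arguments, since the coefficients grow quickly with ${n}_{i}$.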
The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See Vectorized Routines in the G01 Chapter Introduction for further information.
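The re-use of shorter arrays can be pictured as cyclic indexing, which is also how the Example below indexes its inputs when printing. The following sketch assumes that cyclic interpretation and expands four arrays of unequal length to a common length; the input values and the helper name `cyc` are hypothetical, and no NAG function is called.

```matlab
% Illustrative inputs of unequal length (hypothetical values).
n = [10; 40];       % ln = 2
l = [ 2; 10];       % ll = 2
m = 5;              % lm = 1, re-used for every evaluation
k = [1; 2; 0; 1];   % lk = 4

r = max([numel(n), numel(l), numel(m), numel(k)]);   % number of evaluations

% Expand an array to length r by cycling through its elements:
% element i of the result is v(mod(i-1, numel(v)) + 1).
cyc = @(v) reshape(v(mod(0:r-1, numel(v)) + 1), [], 1);

N = cyc(n);  L = cyc(l);  M = cyc(m);  K = cyc(k);
disp([N, L, M, K]);
```

With these inputs the four evaluations use $(n,l,m,k)$ = $(10,2,5,1)$, $(40,10,5,2)$, $(10,2,5,0)$ and $(40,10,5,1)$.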

## References

Knüsel L (1986) Computation of the chi-square and Poisson distribution SIAM J. Sci. Statist. Comput. 7 1022–1036

## Parameters

### Compulsory Input Parameters

1:     $\mathrm{n}\left({\mathbf{ln}}\right)$ – int64/int32/nag_int array
${n}_{i}$, the population size parameter of the hypergeometric distribution, with ${n}_{i}={\mathbf{n}}\left(j\right)$, $j=\left(\left(i-1\right)\,\mathrm{mod}\,{\mathbf{ln}}\right)+1$, for $i=1,2,\dots ,\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{ln}},{\mathbf{ll}},{\mathbf{lm}},{\mathbf{lk}}\right)$.
Constraint: ${\mathbf{n}}\left(\mathit{j}\right)\ge 0$, for $\mathit{j}=1,2,\dots ,{\mathbf{ln}}$.
2:     $\mathrm{l}\left({\mathbf{ll}}\right)$ – int64/int32/nag_int array
${l}_{i}$, the sample size parameter of the hypergeometric distribution, with ${l}_{i}={\mathbf{l}}\left(j\right)$, $j=\left(\left(i-1\right)\,\mathrm{mod}\,{\mathbf{ll}}\right)+1$.
Constraint: $0\le {l}_{i}\le {n}_{i}$.
3:     $\mathrm{m}\left({\mathbf{lm}}\right)$ – int64/int32/nag_int array
${m}_{i}$, the number of marked items parameter of the hypergeometric distribution, with ${m}_{i}={\mathbf{m}}\left(j\right)$, $j=\left(\left(i-1\right)\,\mathrm{mod}\,{\mathbf{lm}}\right)+1$.
Constraint: $0\le {m}_{i}\le {n}_{i}$.
4:     $\mathrm{k}\left({\mathbf{lk}}\right)$ – int64/int32/nag_int array
${k}_{i}$, the integer which defines the required probabilities, with ${k}_{i}={\mathbf{k}}\left(j\right)$, $j=\left(\left(i-1\right)\,\mathrm{mod}\,{\mathbf{lk}}\right)+1$.
Constraint: $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(0,{l}_{i}+{m}_{i}-{n}_{i}\right)\le {k}_{i}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({l}_{i},{m}_{i}\right)$.

### Optional Input Parameters

1:     $\mathrm{ln}$ – int64/int32/nag_int scalar
Default: the dimension of the array n.
The length of the array n.
Constraint: ${\mathbf{ln}}>0$.
2:     $\mathrm{ll}$ – int64/int32/nag_int scalar
Default: the dimension of the array l.
The length of the array l.
Constraint: ${\mathbf{ll}}>0$.
3:     $\mathrm{lm}$ – int64/int32/nag_int scalar
Default: the dimension of the array m.
The length of the array m.
Constraint: ${\mathbf{lm}}>0$.
4:     $\mathrm{lk}$ – int64/int32/nag_int scalar
Default: the dimension of the array k.
The length of the array k.
Constraint: ${\mathbf{lk}}>0$.

### Output Parameters

1:     $\mathrm{plek}\left(:\right)$ – double array
The dimension of the array plek will be $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{ln}},{\mathbf{ll}},{\mathbf{lm}},{\mathbf{lk}}\right)$.
$\mathrm{Prob}\left\{{X}_{i}\le {k}_{i}\right\}$, the lower tail probabilities.
2:     $\mathrm{pgtk}\left(:\right)$ – double array
The dimension of the array pgtk will be $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{ln}},{\mathbf{ll}},{\mathbf{lm}},{\mathbf{lk}}\right)$.
$\mathrm{Prob}\left\{{X}_{i}>{k}_{i}\right\}$, the upper tail probabilities.
3:     $\mathrm{peqk}\left(:\right)$ – double array
The dimension of the array peqk will be $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{ln}},{\mathbf{ll}},{\mathbf{lm}},{\mathbf{lk}}\right)$.
$\mathrm{Prob}\left\{{X}_{i}={k}_{i}\right\}$, the point probabilities.
4:     $\mathrm{ivalid}\left(:\right)$ – int64/int32/nag_int array
The dimension of the array ivalid will be $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{ln}},{\mathbf{ll}},{\mathbf{lm}},{\mathbf{lk}}\right)$.
${\mathbf{ivalid}}\left(i\right)$ indicates any errors with the input arguments, with
${\mathbf{ivalid}}\left(i\right)=0$
No error.
${\mathbf{ivalid}}\left(i\right)=1$
 On entry, ${n}_{i}<0$.
${\mathbf{ivalid}}\left(i\right)=2$
 On entry, ${l}_{i}<0$, or ${l}_{i}>{n}_{i}$.
${\mathbf{ivalid}}\left(i\right)=3$
 On entry, ${m}_{i}<0$, or ${m}_{i}>{n}_{i}$.
${\mathbf{ivalid}}\left(i\right)=4$
 On entry, ${k}_{i}<0$, or ${k}_{i}>{l}_{i}$, or ${k}_{i}>{m}_{i}$, or ${k}_{i}<{l}_{i}+{m}_{i}-{n}_{i}$.
${\mathbf{ivalid}}\left(i\right)=5$
 On entry, ${n}_{i}$ is too large to be represented exactly as a real number.
${\mathbf{ivalid}}\left(i\right)=6$
 On entry, the variance (see Description) exceeds ${10}^{6}$.
5:     $\mathrm{ifail}$ – int64/int32/nag_int scalar
${\mathbf{ifail}}={\mathbf{0}}$ unless the function detects an error (see Error Indicators and Warnings).
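The conditions behind the ivalid values 1 to 4 can be tested before calling the function. The sketch below is a hypothetical pre-check in base MATLAB, not part of the NAG Toolbox: it assumes the inputs have already been expanded to a common length, and it assumes that when several conditions fail the smallest code is reported, which the text above does not state. The values 5 and 6 are not covered.

```matlab
% Hypothetical pre-check mirroring the documented ivalid codes 1 to 4.
n = [10; 40; 155; 5];
l = [ 2; 10;  35; 7];   % last entry violates l <= n
m = [ 5;  3; 122; 2];
k = [ 1;  4;  22; 1];   % second entry violates k <= min(l, m)

code = zeros(size(n));
% Apply the checks from the largest code to the smallest, so that the
% smallest applicable code overwrites the others (an assumption).
code(k < 0 | k > l | k > m | k < l + m - n) = 4;
code(m < 0 | m > n) = 3;
code(l < 0 | l > n) = 2;
code(n < 0) = 1;

disp(code');
```

For these inputs `code` is `[0; 4; 0; 2]`: the second set fails only the constraint on ${k}_{i}$, while the fourth fails both the ${l}_{i}$ and ${k}_{i}$ constraints and is given the smaller code under the stated assumption.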

## Error Indicators and Warnings

Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

W  ${\mathbf{ifail}}=1$
On entry, at least one value of n, l, m or k was invalid, or the variance was too large.
${\mathbf{ifail}}=2$
Constraint: ${\mathbf{ln}}>0$.
${\mathbf{ifail}}=3$
Constraint: ${\mathbf{ll}}>0$.
${\mathbf{ifail}}=4$
Constraint: ${\mathbf{lm}}>0$.
${\mathbf{ifail}}=5$
Constraint: ${\mathbf{lk}}>0$.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.

## Accuracy

Results are correct to a relative accuracy of at least ${10}^{-6}$ on machines with a precision of $9$ or more decimal digits (provided that the results do not underflow to zero).

## Further Comments

The time taken by nag_stat_prob_hypergeom_vector (g01sl) to calculate each probability depends on the variance (see Description) and on ${k}_{i}$. For given variance, the time is greatest when ${k}_{i}\approx {l}_{i}{m}_{i}/{n}_{i}$ ($=$ the mean), and is then approximately proportional to the square-root of the variance.

## Example

This example reads a vector of values for $n$, $l$, $m$ and $k$, and prints the corresponding probabilities.
```matlab
function g01sl_example

fprintf('g01sl example results\n\n');

% Input vectors; concatenating with an int64 value makes each array int64.
n = [int64(10); 40; 155; 1000];
l = [int64( 2); 10;  35;  444];
m = [int64( 5);  3; 122;  500];
k = [int64( 1);  2;  22;  220];

[plek, pgtk, peqk, ivalid, ifail] = ...
  g01sl(n, l, m, k);

fprintf('   n   l   m   k     plek      pgtk      peqk\n');
ln  = numel(n);
ll  = numel(l);
lm  = numel(m);
lk  = numel(k);
len = max ([ln, ll, lm, lk]);
% Cycle through each input with mod so that shorter arrays are re-used.
for i=0:len-1
  fprintf('%4d%4d%4d%4d%10.5f%10.5f%10.5f\n', n(mod(i,ln)+1), ...
          l(mod(i,ll)+1), m(mod(i,lm)+1), k(mod(i,lk)+1), plek(i+1), ...
          pgtk(i+1), peqk(i+1));
end

```
```
g01sl example results

   n   l   m   k     plek      pgtk      peqk
  10   2   5   1   0.77778   0.22222   0.55556
  40  10   3   2   0.98785   0.01215   0.13664
 155  35 122  22   0.01101   0.98899   0.00779
1000 444 500 220   0.42429   0.57571   0.04913
```
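The figures above can be cross-checked without the NAG Toolbox. The sketch below is an independent check and not the method used by g01sl: it evaluates the point probability formula from the Description on the log scale with the base MATLAB function `gammaln`, then sums over the support to get the lower tail. The helper names `logchoose`, `pmf` and `res` are introduced here for illustration.

```matlab
% Log binomial coefficient via gammaln, which avoids overflow for large n.
logchoose = @(a, b) gammaln(a + 1) - gammaln(b + 1) - gammaln(a - b + 1);

% Point probabilities for a vector j, evaluated on the log scale.
pmf = @(n, l, m, j) exp(logchoose(m, j) + logchoose(n - m, l - j) ...
                        - logchoose(n, l));

% Same inputs as the example, held as doubles because gammaln needs them.
n = [10; 40; 155; 1000];
l = [ 2; 10;  35;  444];
m = [ 5;  3; 122;  500];
k = [ 1;  2;  22;  220];

res = zeros(numel(n), 3);                       % columns: plek, pgtk, peqk
for i = 1:numel(n)
  klo  = max(0, l(i) + m(i) - n(i));            % lower end of the support
  plek = sum(pmf(n(i), l(i), m(i), klo:k(i)));  % Prob{X <= k}
  res(i, :) = [plek, 1 - plek, pmf(n(i), l(i), m(i), k(i))];
  fprintf('%4d%4d%4d%4d%10.5f%10.5f%10.5f\n', ...
          n(i), l(i), m(i), k(i), res(i, 1), res(i, 2), res(i, 3));
end
```

Forming the upper tail as `1 - plek` loses relative accuracy when the lower tail is close to 1, so this check is meant for comparison at the printed precision only.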