The function may be called by the names: g05nfc or nag_rand_resample.
3 Description
Given a vector $V$ of $n$ integer values, g05nfc selects $m$ elements with the probability of selecting ${V}_{j}$ proportional to a user-supplied weight, ${w}_{j}$. The sampling is done with replacement, so that each value, ${V}_{j}$, may appear more than once in the sample.
The most common use case for g05nfc is where $V$ was obtained using some other sampling method, for example, importance sampling. In such a case, this function is used to perform resampling.
Several methods of calculating ${N}_{\mathit{j}}$, the number of times ${V}_{j}$ appears in the sample, are available:
•Multinomial Resampling:
The vector of counts, $\{{N}_{j}:j=1,2,\dots ,n\}$, is drawn from a multinomial distribution with probabilities given by the normalised weights, $\stackrel{~}{w}$.
In a further method, ${N}_{\mathit{j}}$ is the number of shifted and scaled uniform variates ${u}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,m$, contained in the bin defined by the partial sums ${\sum}_{k=1}^{j-1}{\stackrel{~}{w}}_{k}$ and ${\sum}_{k=1}^{j}{\stackrel{~}{w}}_{k}$ of the normalised weights, $\stackrel{~}{w}$.
Here ${u}_{1}\sim U(0,\frac{1}{m})$, ${u}_{\mathit{i}}={u}_{1}+\frac{\mathit{i}-1}{m}$, for $\mathit{i}=2,3,\dots ,m$, and ${\sum}_{k=1}^{0}{\stackrel{~}{w}}_{k}$ is defined to be zero.
•Residual Resampling:
${N}_{j}={N}_{j}^{S}+{N}_{j}^{R}$,
where ${N}_{j}^{S}=\lfloor m{\stackrel{~}{w}}_{j}\rfloor $ (i.e. the integer part of $m{\stackrel{~}{w}}_{j}$), and the vector of residual counts, $\{{N}_{j}^{R}:j=1,2,\dots ,n\}$ is drawn from a multinomial distribution with probabilities given by $\{{\stackrel{~}{w}}_{j}-\frac{{N}_{j}^{S}}{m}:j=1,2,\dots ,n\}$.
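The counting rules described above can be sketched in plain C. This is an illustration only, not the library's implementation: the function names `grid_counts` and `residual_split` are hypothetical, `int` stands in for the NAG Integer type, the bins are assumed to be half-open, and the weights are assumed to be already normalised. The first function counts the grid points ${u}_{i}$ falling between consecutive partial sums; the second computes the deterministic counts ${N}_{j}^{S}$ and the residual terms used by residual resampling.

```c
#include <assert.h>

/* Count how many grid points u_i = u1 + (i-1)/m, i = 1..m, fall in each bin
 * [S_{j-1}, S_j), where S_j is the j-th partial sum of the normalised weights.
 * Half-open bins are an assumption; u1 is expected to lie in [0, 1/m). */
void grid_counts(int n, const double w_norm[], int m, double u1, int counts[])
{
    assert(n > 0 && m >= 0);
    double upper = 0.0; /* S_j, the upper edge of the current bin */
    int i = 0;          /* zero-based index of the next unassigned grid point */
    for (int j = 0; j < n; ++j) {
        counts[j] = 0;
        upper += w_norm[j];
        /* The last bin takes any points left over by rounding in the sums. */
        while (i < m && (j == n - 1 || u1 + (double)i / m < upper)) {
            ++counts[j];
            ++i;
        }
    }
}

/* Residual resampling, deterministic part: n_det[j] = floor(m * w_norm[j]) and
 * p_res[j] = w_norm[j] - n_det[j] / m. The p_res values sum to (return value)/m,
 * so they would need rescaling before use as multinomial probabilities.
 * Returns the number of draws left for the random (multinomial) step. */
int residual_split(int n, const double w_norm[], int m, int n_det[], double p_res[])
{
    assert(n > 0 && m > 0);
    int assigned = 0;
    for (int j = 0; j < n; ++j) {
        n_det[j] = (int)(m * w_norm[j]); /* truncation equals floor for values >= 0 */
        p_res[j] = w_norm[j] - (double)n_det[j] / m;
        assigned += n_det[j];
    }
    return m - assigned;
}
```

In both cases the counts sum to $m$ once the random step, if any, has been carried out; drawing ${u}_{1}$ and the multinomial residual counts is left to a random number generator and is not shown.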
See g05tgc for more information on the multinomial distribution and Douc et al. (2005) for more details on the resampling methods.
If multiple samples are requested (${\mathbf{nrs}}>1$) then the chosen resampling method is performed independently for each sample.
One of the initialization functions g05kfc (for a repeatable sequence if computed sequentially) or g05kgc (for a non-repeatable sequence) must be called prior to the first call to g05nfc.
4 References
Douc R, Cappe O, and Moulines E (2005) Comparison of resampling schemes for particle filtering. Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 64–69. https://dx.doi.org/10.1109/ISPA.2005.195385
Li T, Bolic M, and Djuric P M (2015) Resampling Methods for Particle Filtering: Classification, implementation, and strategies. IEEE Signal Processing Magazine, vol. 32, no. 3, 70–86. https://dx.doi.org/10.1109/MSP.2014.2330626
5 Arguments
1: $\mathbf{rtype}$ – Integer Input
On entry: a flag indicating the resampling method to use.
Constraint: ${\mathbf{rtype}}=1$, $2$ or $3$.
Note: the dimension, dim, of the array ipop must be at least ${\mathbf{n}}$ when ${\mathbf{ipop}}$ is not NULL; otherwise ${\mathbf{ipop}}$ is not referenced and may be NULL.
On entry: $V$, the vector to be sampled from. If ${\mathbf{ipop}}$ is NULL then $V$ is assumed to be the set of values $(1,2,\dots ,{\mathbf{n}})$.
Elements of ipop with the same value are not combined. Therefore, if ${\mathbf{wt}}\left[i\right]\ne 0$, ${\mathbf{wt}}\left[j\right]\ne 0$ and $i\ne j$, there is a nonzero probability that the sample will contain both ${\mathbf{ipop}}\left[i\right]$ and ${\mathbf{ipop}}\left[j\right]$, irrespective of their values.
If the values to be returned in isampl are counts, i.e., ${\mathbf{otype}}=2$, then ipop is not referenced.
5: $\mathbf{m}$ – Integer Input
On entry: $m$, the size of the sample required.
Constraint: ${\mathbf{m}}\ge 0$.
6: $\mathbf{nrs}$ – Integer Input
On entry: the number of times to resample.
Constraint: ${\mathbf{nrs}}\ge 0$.
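7: $\mathbf{otype}$ – Integer Input
On entry: a flag indicating what is returned in isampl.
${\mathbf{otype}}=1$
The values returned in isampl are taken from the population.
${\mathbf{otype}}=2$
The values returned in isampl are counts of the number of times each element of the population appears in the sample.
Constraint: ${\mathbf{otype}}=1$ or $2$.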
Note: where ${\mathbf{ISAMPL}}(j,k)$ appears in this document, it refers to the array element ${\mathbf{isampl}}\left[(k-1)\times {\mathbf{pdisampl}}+j-1\right]$.
On exit: the selected samples.
If ${\mathbf{otype}}=1$ then each column of ISAMPL contains the $m$ values from $V$ that make up the sample. If ${\mathbf{otype}}=2$ then ${\mathbf{ISAMPL}}(j,k)$ contains the number of times that ${V}_{j}$ appears in the $k$th sample.
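The indexing note and the two output layouts above can be illustrated with a short C sketch. It is not part of the library: the helper names `isampl_index` and `counts_to_values` are hypothetical and `int` stands in for the NAG Integer type. The first helper maps the one-based pair $(j,k)$ to the zero-based offset given in the note; the second expands the counts held in column $k$ (the ${\mathbf{otype}}=2$ layout) into a list of values, treating a NULL ipop as the values $(1,2,\dots ,{\mathbf{n}})$.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based offset of ISAMPL(j,k); j and k are one-based, as in the text. */
size_t isampl_index(int j, int k, int pdisampl)
{
    assert(j >= 1 && k >= 1 && pdisampl >= 1);
    return (size_t)(k - 1) * (size_t)pdisampl + (size_t)(j - 1);
}

/* Expand the counts stored in column k into the values they represent.
 * Writes at most out_cap values to out and returns the number written,
 * or -1 if out is too small. The values come out in population order. */
int counts_to_values(const int isampl[], int pdisampl, int k, int n,
                     const int ipop[], int out[], int out_cap)
{
    int written = 0;
    for (int j = 1; j <= n; ++j) {
        int count = isampl[isampl_index(j, k, pdisampl)];
        int value = ipop ? ipop[j - 1] : j; /* NULL ipop means V = (1,...,n) */
        for (int r = 0; r < count; ++r) {
            if (written == out_cap)
                return -1;
            out[written++] = value;
        }
    }
    return written;
}
```

For example, with ${\mathbf{pdisampl}}=3$ the element ${\mathbf{ISAMPL}}(2,2)$ is stored at offset 4 of isampl.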
9: $\mathbf{pdisampl}$ – Integer Input
On entry: the stride separating matrix row elements in the array isampl.
Constraints:
if ${\mathbf{otype}}=1$, ${\mathbf{pdisampl}}\ge {\mathbf{m}}$;
if ${\mathbf{otype}}=2$, ${\mathbf{pdisampl}}\ge {\mathbf{n}}$.
Note: the dimension, $\mathit{dim}$, of this array is dictated by the requirements of associated functions that must have been previously called. This array MUST be the same array passed as argument state in the previous call to nag_rand_init_repeatable (g05kfc) or nag_rand_init_nonrepeatable (g05kgc).
On entry: contains information on the selected base generator and its current state.
On exit: contains updated information on the state of the generator.
11: $\mathbf{fail}$ – NagError * Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
6 Error Indicators and Warnings
NE_ALLOC_FAIL
Dynamic memory allocation failed.
See Section 3.1.2 in the Introduction to the NAG Library CL Interface for further information.
NE_ARRAY_SIZE
On entry, ${\mathbf{m}}=\langle\mathit{\text{value}}\rangle$, ${\mathbf{otype}}=1$ and ${\mathbf{pdisampl}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{pdisampl}}\ge {\mathbf{m}}$.
On entry, ${\mathbf{n}}=\langle\mathit{\text{value}}\rangle$, ${\mathbf{otype}}=2$ and ${\mathbf{pdisampl}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{pdisampl}}\ge {\mathbf{n}}$.
NE_BAD_PARAM
On entry, argument $\langle\mathit{\text{value}}\rangle$ had an illegal value.
NE_INT
On entry, ${\mathbf{m}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{m}}\ge 0$.
On entry, ${\mathbf{n}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{n}}\ge 0$.
On entry, ${\mathbf{nrs}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{nrs}}\ge 0$.
On entry, ${\mathbf{otype}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{otype}}=1$ or $2$.
On entry, ${\mathbf{rtype}}=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{rtype}}=1$, $2$ or $3$.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
NE_INVALID_STATE
On entry, state vector has been corrupted or not initialized.
NE_NEG_WEIGHT
On entry, $i=\langle\mathit{\text{value}}\rangle$ and ${\mathbf{wt}}\left[i-1\right]=\langle\mathit{\text{value}}\rangle$.
Constraint: ${\mathbf{wt}}\left[i-1\right]\ge 0.0$.
NE_NO_LICENCE
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library CL Interface for further information.
NE_NON_ZERO_WEIGHTS
On entry, all the weights are zero.
Constraint: at least one weight must be nonzero.
NW_POTENTIAL_PROBLEM
There was no random component to the sample. Check that the sample size and weights are as expected.
Specifically, check that more than one weight is nonzero.
If ${\mathbf{rtype}}=3$, also check the combination of m and weights.
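The weight-related conditions in this section can be checked before calling the function. The following C sketch is illustrative only: the names `check_weights`, `residual_random_draws` and the `weight_status` codes are hypothetical and are not NAG error codes. The second function reflects one plausible reading of the advice for ${\mathbf{rtype}}=3$, namely that the sample has no random component when the integer parts of $m{\stackrel{~}{w}}_{j}$ already account for all $m$ draws; that interpretation is an assumption.

```c
#include <assert.h>

/* Hypothetical status codes for this sketch; they are not NAG error codes. */
enum weight_status { WEIGHTS_OK, WEIGHTS_NEGATIVE, WEIGHTS_ALL_ZERO };

/* Check that every weight is nonnegative and that at least one is nonzero.
 * On a negative weight, *bad_index receives its one-based position.
 * *n_nonzero receives the number of strictly positive weights seen. */
enum weight_status check_weights(int n, const double wt[],
                                 int *bad_index, int *n_nonzero)
{
    assert(n >= 0);
    *bad_index = 0;
    *n_nonzero = 0;
    for (int i = 0; i < n; ++i) {
        if (wt[i] < 0.0) {
            *bad_index = i + 1;
            return WEIGHTS_NEGATIVE;
        }
        if (wt[i] > 0.0)
            ++*n_nonzero;
    }
    return *n_nonzero > 0 ? WEIGHTS_OK : WEIGHTS_ALL_ZERO;
}

/* Number of draws left to the random step of residual resampling:
 * m minus the sum of the integer parts of m * (normalised weight).
 * A result of zero means the counts are fully determined by m and wt.
 * Products that are integers in exact arithmetic may be truncated one
 * too low in floating point, so treat borderline cases with care. */
int residual_random_draws(int n, const double wt[], int m)
{
    double total = 0.0;
    int assigned = 0;
    for (int i = 0; i < n; ++i)
        total += wt[i];
    assert(total > 0.0);
    for (int j = 0; j < n; ++j)
        assigned += (int)(m * (wt[j] / total));
    return m - assigned;
}
```

A value of `n_nonzero` equal to one corresponds to the case where only a single element can ever be selected, which is the situation the warning asks you to rule out first.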
7 Accuracy
Not applicable.
8 Parallelism and Performance
Background information to multithreading can be found in the Multithreading documentation.
g05nfc is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this function. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
9 Further Comments
It should be noted that whilst a given sample is a random selection from $V$, the ordering of the sample within isampl may not be. For example, when ${\mathbf{otype}}=1$ the values returned are likely to be in the same order that the values appear in $V$. If it is important that the returned values represent a random sample from $V$ rather than an ordered random sample, then each sample should be randomly permuted via a subsequent call to g05ncc. The same applies to the order in which multiple samples are returned. One consequence of this is that if you call g05nfc once with ${\mathbf{nrs}}=1$, say, and then again (using the same initial values for state) with ${\mathbf{nrs}}=2$, the first column of ISAMPL may not be the same in both cases since, on the second call, the sample from the first call may be returned in the second column rather than the first.
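To make the idea of permuting a returned sample concrete, here is a minimal Fisher-Yates shuffle in plain C. It is a sketch only and is not how g05ncc works internally: the name `shuffle_sample` is hypothetical, `int` stands in for the NAG Integer type, and the standard C `rand` function is used in place of the NAG generator state, so results are not reproducible across platforms and carry the usual small modulo bias.

```c
#include <assert.h>
#include <stdlib.h>

/* Randomly permute the m values of one sample in place (Fisher-Yates).
 * Uses rand() from the C standard library as a stand-in generator. */
void shuffle_sample(int sample[], int m)
{
    assert(m >= 0);
    for (int i = m - 1; i > 0; --i) {
        int j = rand() % (i + 1); /* index in 0..i; slight modulo bias */
        int tmp = sample[i];
        sample[i] = sample[j];
        sample[j] = tmp;
    }
}
```

Applied to one column of ISAMPL when ${\mathbf{otype}}=1$, this changes only the order of the values, not which values were selected or how often each appears.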