NAG Library Function Document
nag_rank_ci_1var (g07eac)
1 Purpose
nag_rank_ci_1var (g07eac) computes a rank-based (nonparametric) estimate and confidence interval for the location parameter of a single population.
2 Specification
#include <nag.h> 
#include <nagg07.h> 
void 
nag_rank_ci_1var (Nag_RCIMethod method,
Integer n,
const double x[],
double clevel,
double *theta,
double *thetal,
double *thetau,
double *estcl,
double *wlower,
double *wupper,
NagError *fail) 

3 Description
Consider a vector of independent observations,
$x={\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)}^{\mathrm{T}}$ with unknown common symmetric density
$f\left({x}_{i}-\theta \right)$. nag_rank_ci_1var (g07eac) computes the Hodges–Lehmann location estimator (see
Lehmann (1975)) of the centre of symmetry
$\theta $, together with an associated confidence interval. The Hodges–Lehmann estimate is defined as
$\hat{\theta}=\mathrm{median}\left\{\frac{{x}_{i}+{x}_{j}}{2},\ 1\le i\le j\le n\right\}$.
Let
$m=\left(n\left(n+1\right)\right)/2$ and let
${a}_{\mathit{k}}$, for
$\mathit{k}=1,2,\dots ,m$ denote the
$m$ ordered averages
$\left({x}_{i}+{x}_{j}\right)/2$ for
$1\le i\le j\le n$. Then
 if $m$ is odd, $\hat{\theta}={a}_{k}$ where $k=\left(m+1\right)/2$;
 if $m$ is even, $\hat{\theta}=\left({a}_{k}+{a}_{k+1}\right)/2$ where $k=m/2$.
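The definition above can be checked directly on small samples. The following is a minimal brute-force sketch, not the algorithm used by nag_rank_ci_1var (g07eac): it forms and sorts all $m$ averages, which is exactly the cost the library methods are designed to avoid. The function names are hypothetical and chosen only for this illustration.

```c
#include <stdlib.h>

/* Comparison callback for qsort on doubles. */
static int cmp_double(const void *p, const void *q)
{
    double a = *(const double *)p, b = *(const double *)q;
    return (a > b) - (a < b);
}

/* Brute-force Hodges-Lehmann estimate (hypothetical helper): the median of
 * the m = n(n+1)/2 averages (x[i] + x[j]) / 2 with i <= j.
 * Uses O(n^2) storage, so it is only suitable for small n.
 * Returns 0 on success, -1 on invalid n or allocation failure. */
int hodges_lehmann_naive(const double x[], size_t n, double *theta)
{
    size_t m = n * (n + 1) / 2, k = 0;
    double *a;

    if (n == 0)
        return -1;
    a = malloc(m * sizeof *a);
    if (a == NULL)
        return -1;

    for (size_t i = 0; i < n; i++)
        for (size_t j = i; j < n; j++)
            a[k++] = (x[i] + x[j]) / 2.0;

    qsort(a, m, sizeof *a, cmp_double);

    /* With 0-based indexing, a[m/2] is a_{(m+1)/2} when m is odd, and
     * a[m/2 - 1], a[m/2] are a_{m/2}, a_{m/2+1} when m is even. */
    *theta = (m % 2 == 1) ? a[m / 2] : (a[m / 2 - 1] + a[m / 2]) / 2.0;

    free(a);
    return 0;
}
```

For the sample $(1,2,3)$ the six ordered averages are $1, 1.5, 2, 2, 2.5, 3$, so $m$ is even and the estimate is $(2+2)/2=2$.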
This estimator arises from inverting the one-sample Wilcoxon signed-rank test statistic,
$W\left(x-{\theta}_{0}\right)$, for testing the hypothesis that
$\theta ={\theta}_{0}$. Effectively
$W\left(x-{\theta}_{0}\right)$ is a monotonically decreasing step function of
${\theta}_{0}$ with
$\mathrm{mean}\left(W\right)=\mu =\frac{n\left(n+1\right)}{4}$,
$\mathrm{var}\left(W\right)={\sigma}^{2}=\frac{n\left(n+1\right)\left(2n+1\right)}{24}$.
The estimate
$\hat{\theta}$ is the solution to the equation
$W\left(x-\hat{\theta}\right)=\mu $; two methods are available for solving this equation. These methods avoid the computation of all the ordered averages
${a}_{k}$, because for large
$n$ both the storage requirements and the computation time would be excessive.
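To see why the statistic is a decreasing step function, it helps to use the standard identity that, when no average equals ${\theta}_{0}$, the signed-rank statistic of the shifted sample equals the number of pairs $i\le j$ whose average exceeds ${\theta}_{0}$. The sketch below evaluates that count without storing the averages. The function name is hypothetical, and the strict inequality used for averages equal to ${\theta}_{0}$ is an assumption of this illustration, not a statement about the library's tie handling.

```c
#include <stddef.h>

/* Hypothetical helper: counts the pairs i <= j whose average
 * (x[i] + x[j]) / 2 is strictly greater than theta0.
 * When no average equals theta0, this count is the Wilcoxon signed-rank
 * statistic of the shifted sample x - theta0.
 * O(n^2) time, O(1) extra storage. */
long signed_rank_count(const double x[], size_t n, double theta0)
{
    long w = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = i; j < n; j++)
            if (x[i] + x[j] > 2.0 * theta0)
                w++;
    return w;
}
```

As ${\theta}_{0}$ increases past each ordered average the count drops by one, from $m$ down to $0$, which is the step behaviour described above.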
The first is an exact method based on a set partitioning procedure on the set of all ordered averages
$\left({x}_{i}+{x}_{j}\right)/2$ for
$i\le j$. This is based on the algorithm proposed by
Monahan (1984).
The second is an iterative algorithm based on the Illinois method, which is a modification of the
regula falsi method; see
McKean and Ryan (1977). This algorithm has proved suitable for the function
$W\left(x-{\theta}_{0}\right)$, which is asymptotically linear as a function of
${\theta}_{0}$.
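For readers unfamiliar with the Illinois method, the following is a generic sketch of the iteration applied to a smooth test function. It is not the library's implementation: the function names, the stopping rule and the test function are assumptions made for this illustration, and applying the method to the step function $W$ needs extra care that is not shown here.

```c
/* Example function with a single root near 2.0945 in [2, 3]. */
double example_cubic(double t)
{
    return t * t * t - 2.0 * t - 5.0;
}

/* Illinois variant of regula falsi (illustrative sketch).
 * Requires f(a) and f(b) to have opposite signs. Whenever the same end
 * point is retained on two consecutive steps, its stored function value is
 * halved, which prevents the one-sided stagnation of plain regula falsi. */
double illinois_root(double (*f)(double), double a, double b,
                     double tol, int max_iter)
{
    double fa = f(a), fb = f(b), c = a;
    int side = 0;   /* -1: b was moved last, +1: a was moved last */

    for (int it = 0; it < max_iter; it++) {
        double width = (b > a) ? b - a : a - b;
        if (width <= tol)
            break;

        /* False-position point: root of the secant through (a,fa), (b,fb). */
        c = (fa * b - fb * a) / (fa - fb);
        double fc = f(c);
        if (fc == 0.0)
            break;

        if ((fc > 0.0) == (fb > 0.0)) {
            /* fc has the sign of fb: the root lies between a and c. */
            b = c;
            fb = fc;
            if (side == -1)
                fa *= 0.5;   /* a retained twice in a row */
            side = -1;
        } else {
            /* fc has the sign of fa: the root lies between c and b. */
            a = c;
            fa = fc;
            if (side == +1)
                fb *= 0.5;   /* b retained twice in a row */
            side = +1;
        }
    }
    return c;
}
```

Calling `illinois_root(example_cubic, 2.0, 3.0, 1e-10, 100)` brackets the root from both sides, whereas plain regula falsi on this convex function would keep the end point at 3 fixed.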
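The confidence interval limits are also based on the inversion of the Wilcoxon test statistic.
Given a desired percentage for the confidence interval,
$1-\alpha $, expressed as a proportion between
$0$ and
$1$, initial estimates for the lower and upper confidence limits of the Wilcoxon statistic are found from
${W}_{l}=\mu -0.5+\left(\sigma {\Phi}^{-1}\left(\alpha /2\right)\right)$
and
${W}_{u}=\mu +0.5+\left(\sigma {\Phi}^{-1}\left(1-\alpha /2\right)\right)$,
where $\mu $ and $\sigma $ are the mean and standard deviation of $W$ and
${\Phi}^{-1}$ is the inverse cumulative Normal distribution function.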
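${W}_{l}$ and
${W}_{u}$ are rounded to the nearest integer values. These estimates are then refined using an exact method if
$n\le 80$, and a Normal approximation otherwise, to find
${W}_{l}$ and
${W}_{u}$ satisfying
$P\left(W\le {W}_{l}\right)\le \alpha /2$, $P\left(W\le {W}_{l}+1\right)>\alpha /2$
and
$P\left(W\ge {W}_{u}\right)\le \alpha /2$, $P\left(W\ge {W}_{u}-1\right)>\alpha /2$.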
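The exact refinement step needs tail probabilities of $W$ under the null hypothesis. This document does not say how they are computed, so the following is only one standard approach, shown as a sketch: with no ties, each rank $1,\dots ,n$ contributes to $W$ independently with probability $1/2$, and the distribution can be built up one rank at a time. The function name is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: exact lower-tail probability P(W <= w) of the
 * Wilcoxon signed-rank statistic for sample size n under the null
 * hypothesis, assuming no ties.
 * Returns -1.0 if n < 1 or if memory allocation fails. */
double signed_rank_cdf(int n, long w)
{
    long m;
    double *p, total = 0.0;

    if (n < 1)
        return -1.0;
    m = (long)n * (n + 1) / 2;      /* largest possible value of W */
    if (w < 0)
        return 0.0;
    if (w >= m)
        return 1.0;

    p = calloc((size_t)m + 1, sizeof *p);
    if (p == NULL)
        return -1.0;

    p[0] = 1.0;                     /* before any rank is added, W = 0 */
    for (int r = 1; r <= n; r++) {
        /* Descending s so that p[s - r] still holds the previous stage. */
        for (long s = m; s >= 0; s--) {
            double with_r = (s >= r) ? 0.5 * p[s - r] : 0.0;
            p[s] = 0.5 * p[s] + with_r;   /* rank r excluded or included */
        }
    }

    for (long s = 0; s <= w; s++)
        total += p[s];

    free(p);
    return total;
}
```

For $n=3$ the possible values $0,\dots ,6$ have probabilities $1,1,1,2,1,1,1$ eighths, so $P\left(W\le 0\right)=0.125$ and $P\left(W\le 3\right)=0.625$. For $n=80$ the work array has only $3241$ entries, so this approach remains cheap at that size.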
Let
${W}_{u}=m-k$; then
${\theta}_{l}={a}_{k+1}$. This is the largest value
${\theta}_{l}$ such that
$W\left(x-{\theta}_{l}\right)={W}_{u}$.
Let ${W}_{l}=k$; then ${\theta}_{u}={a}_{m-k}$. This is the smallest value ${\theta}_{u}$ such that $W\left(x-{\theta}_{u}\right)={W}_{l}$.
As in the case of $\hat{\theta}$, these equations may be solved using either the exact or the iterative methods to find the values ${\theta}_{l}$ and ${\theta}_{u}$.
Then $\left({\theta}_{l},{\theta}_{u}\right)$ is the confidence interval for $\theta $. The confidence interval is thus defined by those values of ${\theta}_{0}$ such that the null hypothesis, $\theta ={\theta}_{0}$, is not rejected by the Wilcoxon signed-rank test at the $\left(100\times \alpha \right)\%$ level.
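The index arithmetic linking the Wilcoxon limits to the ordered averages is easy to get wrong by one, so a small sketch may help. It assumes the relations ${W}_{l}=k$ and ${W}_{u}=m-k$, and that the $m$ ordered averages are already available in a 0-based array, which the library itself avoids for large $n$. The function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: given the m ordered averages a[0..m-1] (so a[k-1]
 * is a_k in the notation of the text) and k = W_l, with W_u = m - k,
 * select theta_l = a_{k+1} and theta_u = a_{m-k}.
 * Returns 0 on success, -1 if k is too large to give theta_l <= theta_u. */
int ci_from_order_stats(const double a[], size_t m, size_t k,
                        double *theta_l, double *theta_u)
{
    if (2 * k + 1 > m)
        return -1;

    *theta_l = a[k];           /* a_{k+1} in 1-based notation */
    *theta_u = a[m - k - 1];   /* a_{m-k} in 1-based notation */
    return 0;
}
```

For the ordered averages $1, 1.5, 2, 2, 2.5, 3$ of the sample $(1,2,3)$ and $k=1$, this selects ${\theta}_{l}={a}_{2}=1.5$ and ${\theta}_{u}={a}_{5}=2.5$.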
4 References
Lehmann E L (1975) Nonparametrics: Statistical Methods Based on Ranks Holden–Day
Marazzi A (1987) Subroutines for robust estimation of location and scale in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 1 Institut Universitaire de Médecine Sociale et Préventive, Lausanne
McKean J W and Ryan T A (1977) Algorithm 516: An algorithm for obtaining confidence intervals and point estimates based on ranks in the two-sample location problem ACM Trans. Math. Software 3 183–185
Monahan J F (1984) Algorithm 616: Fast computation of the Hodges–Lehmann location estimator ACM Trans. Math. Software 10 265–270
5 Arguments
 1:
method – Nag_RCIMethod Input
On entry: specifies the method to be used.
 ${\mathbf{method}}=\mathrm{Nag\_RCI\_Exact}$
 The exact algorithm is used.
 ${\mathbf{method}}=\mathrm{Nag\_RCI\_Approx}$
 The iterative algorithm is used.
Constraint:
${\mathbf{method}}=\mathrm{Nag\_RCI\_Exact}$ or $\mathrm{Nag\_RCI\_Approx}$.
 2:
n – Integer Input
On entry:
$n$, the sample size.
Constraint:
${\mathbf{n}}\ge 2$.
 3:
x[n] – const double Input
On entry: the sample observations, ${x}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,n$.
 4:
clevel – double Input
On entry: the confidence level required for the interval, expressed as a proportion.
For example, for a $95\%$ confidence interval set ${\mathbf{clevel}}=0.95$.
Constraint:
$0.0<{\mathbf{clevel}}<1.0$.
 5:
theta – double *Output
On exit: the estimate of the location, $\hat{\theta}$.
 6:
thetal – double *Output

On exit: the estimate of the lower limit of the confidence interval, ${\theta}_{l}$.
 7:
thetau – double *Output

On exit: the estimate of the upper limit of the confidence interval, ${\theta}_{u}$.
 8:
estcl – double *Output

On exit: an estimate of the actual confidence level of the interval found, expressed as a proportion in the interval $\left(0.0,1.0\right)$.
 9:
wlower – double *Output
On exit: the upper value of the Wilcoxon test statistic, ${W}_{u}$, corresponding to the lower limit of the confidence interval.
 10:
wupper – double *Output
On exit: the lower value of the Wilcoxon test statistic, ${W}_{l}$, corresponding to the upper limit of the confidence interval.
 11:
fail – NagError *Input/Output

The NAG error argument (see
Section 3.6 in the Essential Introduction).
6 Error Indicators and Warnings
 NE_ALLOC_FAIL
Dynamic memory allocation failed.
 NE_BAD_PARAM
On entry, argument $\langle \mathit{\text{value}}\rangle $ had an illegal value.
 NE_CONVERGENCE
Warning. The iterative procedure to find an estimate of the lower confidence point had not converged in $100$ iterations.
Warning. The iterative procedure to find an estimate of Theta had not converged in $100$ iterations.
Warning. The iterative procedure to find an estimate of the upper confidence point had not converged in $100$ iterations.
 NE_INT
On entry, ${\mathbf{n}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{n}}\ge 2$.
 NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact
NAG for assistance.
 NE_REAL
On entry, clevel is out of range: ${\mathbf{clevel}}=\langle \mathit{\text{value}}\rangle $.
 NE_SAMPLE_IDEN
Not enough information to compute an interval estimate since the whole sample is identical. The common value is returned in
theta,
thetal and
thetau.
7 Accuracy
nag_rank_ci_1var (g07eac) should produce results accurate to five significant figures in the width of the confidence interval; that is, the error for any one of the three estimates should be less than $0.00001\times \left({\mathbf{thetau}}-{\mathbf{thetal}}\right)$.
8 Further Comments
The time taken increases with the sample size $n$.
9 Example
The following program calculates a 95% confidence interval for $\theta $, the centre of symmetry, from a sample of $50$ observations.
9.1 Program Text
Program Text (g07eace.c)
9.2 Program Data
Program Data (g07eace.d)
9.3 Program Results
Program Results (g07eace.r)