naginterfaces.library.nonpar.rank_regsn

naginterfaces.library.nonpar.rank_regsn(nv, y, x, idist, nmax, tol)

rank_regsn calculates the parameter estimates, score statistics and their variance-covariance matrices for the linear model using a likelihood based on the ranks of the observations.

For full information please refer to the NAG Library document for g08ra

https://support.nag.com/numeric/nl/nagdoc_30/flhtml/g08/g08raf.html

Parameters

nv : int, array-like, shape $(ns)$

The number of observations in the $i$th sample, for $i = 1, 2, \dots, ns$.

y : float, array-like, shape $(\mathrm{sum}(nv))$

The observations in each sample. Specifically, $y[\sum_{k=1}^{i-1} nv[k-1] + j - 1]$ must contain the $j$th observation in the $i$th sample.

x : float, array-like, shape $(nsum, ip)$

The design matrices for each sample. Specifically, $x[\sum_{k=1}^{i-1} nv[k-1] + j - 1, l - 1]$ must contain the value of the $l$th explanatory variable for the $j$th observation in the $i$th sample.

idist : int

The error distribution to be used in the analysis.

$idist = 1$

Normal.

$idist = 2$

Logistic.

$idist = 3$

Extreme value.

$idist = 4$

Double-exponential.

nmax : int

The value of the largest sample size.

tol : float

The tolerance for judging whether two observations are tied. Thus, observations $Y_{i}$ and $Y_{j}$ are adjudged to be tied if $|Y_{i} - Y_{j}| < tol$.
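The input layout described above can be sketched in plain Python. The helpers `pack_samples` and `flat_index` are hypothetical names introduced only for illustration (they are not part of naginterfaces), and the data values are made up.

```python
def pack_samples(samples):
    """Flatten per-sample data into the nv, y, x layout described above.

    `samples` is a list of (observations, design_rows) pairs, one per sample.
    Illustrative helper only; not part of naginterfaces.
    """
    nv, y, x = [], [], []
    for obs, rows in samples:
        if len(obs) != len(rows):
            raise ValueError("each observation needs one design-matrix row")
        nv.append(len(obs))
        y.extend(obs)
        x.extend(list(r) for r in rows)
    return nv, y, x


def flat_index(nv, i, j):
    """0-based position in y (and row of x) of the j-th observation in the
    i-th sample, with i and j counted from 1 as in the text."""
    return sum(nv[: i - 1]) + j - 1


# Two made-up samples, each with ip = 2 explanatory variables.
samples = [
    ([1.0, 3.5, 2.2], [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]),
    ([0.7, 4.1], [[1.5, 1.0], [2.5, 0.0]]),
]
nv, y, x = pack_samples(samples)
nmax = max(nv)  # nmax must equal the largest sample size
nsum = len(y)   # nsum equals sum(nv)
```

With these values, the arguments would be passed in the order given by the signature, `rank_regsn(nv, y, x, idist, nmax, tol)`.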

Returns

prvr : float, ndarray, shape $(ip + 1, ip)$

The variance-covariance matrices of the score statistics and the parameter estimates, the former being stored in the upper triangle and the latter in the lower triangle. Thus for $1 \leq i \leq j \leq ip$, $prvr[i - 1, j - 1]$ contains an estimate of the covariance between the $i$th and $j$th score statistics. For $1 \leq j \leq i \leq ip$, $prvr[i, j - 1]$ contains an estimate of the covariance between the $i$th and $j$th parameter estimates.

irank : int, ndarray, shape $(nmax)$

For the one sample case, $irank$ contains the ranks of the observations.

zin : float, ndarray, shape $(nmax)$

For the one sample case, $zin$ contains the expected values of the function $g(\cdot)$ of the order statistics.

eta : float, ndarray, shape $(nmax)$

For the one sample case, $eta$ contains the expected values of the function $g'(\cdot)$ of the order statistics.

vapvec : float, ndarray, shape $(nmax \times (nmax + 1) / 2)$

For the one sample case, $vapvec$ contains the upper triangle of the variance-covariance matrix of the function $g(\cdot)$ of the order statistics stored column-wise.

parest : float, ndarray, shape $(4 \times ip + 1)$

The statistics calculated by the function.

The first $ip$ components of $parest$ contain the score statistics.

The next $ip$ elements contain the parameter estimates.

$parest[2 \times ip]$ contains the value of the $χ^{2}$ statistic.

The next $ip$ elements of $parest$ contain the standard errors of the parameter estimates.

Finally, the remaining $ip$ elements of $parest$ contain the $z$-statistics.
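The packed layouts of the returned arrays can be unpacked as in the following sketch. The helper names `split_prvr`, `unpack_vapvec` and `split_parest` are hypothetical and not part of naginterfaces. The sketch reads the prvr layout as holding the parameter-estimate covariances in rows 1 to ip, and the input values are placeholders chosen only to make the indexing visible.

```python
def split_prvr(prvr, ip):
    """Return (score_cov, param_cov) as full symmetric ip-by-ip nested lists.

    Rows 0..ip-1 of prvr hold the upper triangle of the score covariance;
    rows 1..ip hold the lower triangle of the parameter-estimate covariance.
    """
    score_cov = [[0.0] * ip for _ in range(ip)]
    param_cov = [[0.0] * ip for _ in range(ip)]
    for a in range(ip):
        for b in range(a, ip):  # a <= b
            score_cov[a][b] = score_cov[b][a] = prvr[a][b]
            param_cov[b][a] = param_cov[a][b] = prvr[b + 1][a]
    return score_cov, param_cov


def unpack_vapvec(vapvec, n):
    """Expand a column-wise packed upper triangle into a full n-by-n list."""
    full = [[0.0] * n for _ in range(n)]
    k = 0
    for col in range(n):
        for row in range(col + 1):
            full[row][col] = full[col][row] = vapvec[k]
            k += 1
    return full


def split_parest(parest, ip):
    """Split parest into its five documented parts."""
    return {
        "score": list(parest[:ip]),
        "estimates": list(parest[ip:2 * ip]),
        "chi2": parest[2 * ip],
        "std_errors": list(parest[2 * ip + 1:3 * ip + 1]),
        "z": list(parest[3 * ip + 1:4 * ip + 1]),
    }


# Placeholder arrays for ip = 2 and nmax = 3 (not real output).
score_cov, param_cov = split_prvr([[1, 2], [10, 3], [20, 30]], ip=2)
g_cov = unpack_vapvec([1, 2, 3, 4, 5, 6], n=3)
parts = split_parest(list(range(9)), ip=2)
```

The functions use only integer indexing and slicing, so they should accept either nested lists or NumPy arrays.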

Raises

NagValueError

(errno $1$)

On entry, $\max_{i}(nv[i]) = ⟨value⟩$ and $nmax = ⟨value⟩$.

Constraint: $\max_{i}(nv[i]) = nmax$.

(errno $1$)

On entry, $\sum_{i}(nv[i]) = ⟨value⟩$ and $nsum = ⟨value⟩$.

Constraint: $\sum_{i}(nv[i]) = nsum$.

(errno $1$)

On entry, $⟨value⟩$ elements of $nv$ are $< 1$.

Constraint: $nv[i] \geq 1$.

(errno $1$)

On entry, $nmax = ⟨value⟩$ and $ip = ⟨value⟩$.

Constraint: $nmax > ip$.

(errno $1$)

On entry, $ip = ⟨value⟩$.

Constraint: $ip \geq 1$.

(errno $1$)

On entry, $tol = ⟨value⟩$.

Constraint: $tol > 0.0$.

(errno $1$)

On entry, $ns = ⟨value⟩$.

Constraint: $ns \geq 1$.

(errno $2$)

On entry, $idist = ⟨value⟩$.

Constraint: $idist = 1$, $2$, $3$ or $4$.

(errno $3$ )

On entry, all the observations were adjudged to be tied.

(errno $4$ )

The matrix $X^{T} (B - A) X$ is either ill-conditioned or not positive definite.

(errno $5$)

On entry, for $j = ⟨value⟩$, $x[i, j - 1] = ⟨value⟩$ for all $i$.

Constraint: $x[i, j - 1] \neq x[i + 1, j - 1]$ for at least one $i$.
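The "On entry" constraints listed above can be checked before calling the routine. The following is a minimal sketch in plain Python: `check_inputs` is a hypothetical helper, not part of naginterfaces, and it raises the built-in `ValueError` rather than `NagValueError`. It does not attempt to reproduce the conditions for errno 3 or errno 4, which depend on the analysis itself.

```python
def check_inputs(nv, y, x, idist, nmax, tol):
    """Raise ValueError if a documented 'On entry' constraint is violated.

    Hypothetical pre-check; rank_regsn performs its own validation.
    """
    if len(nv) < 1:
        raise ValueError("ns >= 1 is required")
    if any(n < 1 for n in nv):
        raise ValueError("every nv[i] must be >= 1")
    if max(nv) != nmax:
        raise ValueError("nmax must equal max(nv)")
    if sum(nv) != len(y) or len(x) != len(y):
        raise ValueError("y and x must each hold sum(nv) observations")
    ip = len(x[0])
    if ip < 1:
        raise ValueError("ip >= 1 is required")
    if nmax <= ip:
        raise ValueError("nmax > ip is required")
    if tol <= 0.0:
        raise ValueError("tol > 0.0 is required")
    if idist not in (1, 2, 3, 4):
        raise ValueError("idist must be 1, 2, 3 or 4")
    for j in range(ip):
        # errno 5: a column of x must not be constant.
        if all(row[j] == x[0][j] for row in x):
            raise ValueError(f"column {j + 1} of x is constant")


# Made-up data: two samples of sizes 3 and 2, ip = 2.
nv = [3, 2]
y = [1.0, 3.5, 2.2, 0.7, 4.1]
x = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [1.5, 1.0], [2.5, 0.0]]
check_inputs(nv, y, x, idist=1, nmax=3, tol=1e-5)  # passes silently
```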

Notes

Analysis of data can be made by replacing observations by their ranks. The analysis produces inference for regression parameters arising from the following model.

For random variables $Y_{1}, Y_{2}, \dots, Y_{n}$ we assume that, after an arbitrary monotone increasing differentiable transformation, $h(\cdot)$, the model

$$h(Y_{i}) = x_{i}^{T} β + ϵ_{i} \qquad (1)$$

holds, where $x_{i}$ is a known vector of explanatory variables and $β$ is a vector of $p$ unknown regression coefficients. The $ϵ_{i}$ are random variables assumed to be independent and identically distributed with a completely known distribution which can be one of the following: Normal, logistic, extreme value or double-exponential. In Pettitt (1982) an estimate for $β$ is proposed as $\hat{β} = M X^{T} a$ with estimated variance-covariance matrix $M$. The statistics $a$ and $M$ depend on the ranks $r_{i}$ of the observations $Y_{i}$ and the density chosen for $ϵ_{i}$.

The matrix $X$ is the $n \times p$ matrix of explanatory variables. It is assumed that $X$ is of rank $p$ and that no column or linear combination of columns of $X$ is equal to the column vector of ones or a multiple of it. This means that a constant term cannot be included in the model (1). The statistics $a$ and $M$ are found as follows. Let $ϵ_{i}$ have pdf $f(ϵ)$ and let $g = -f' / f$. Let $W_{1}, W_{2}, \dots, W_{n}$ be order statistics for a random sample of size $n$ with the density $f(\cdot)$. Define $Z_{i} = g(W_{i})$, then $a_{i} = E(Z_{r_{i}})$. To define $M$ we need $M^{-1} = X^{T} (B - A) X$, where $B$ is an $n \times n$ diagonal matrix with $B_{ii} = E(g'(W_{r_{i}}))$ and $A$ is a symmetric matrix with $A_{ij} = \mathrm{cov}(Z_{r_{i}}, Z_{r_{j}})$. In the case of the Normal distribution, the $Z_{1} < \dots < Z_{n}$ are standard Normal order statistics and $E(g'(W_{i})) = 1$, for $i = 1, 2, \dots, n$.

The analysis can also deal with ties in the data. Two observations are adjudged to be tied if $|Y_{i} - Y_{j}| < tol$, where $tol$ is a user-supplied tolerance level.
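The tie criterion can be illustrated with a short sketch. `tied_pairs` is a hypothetical helper that only applies the pairwise test stated above; it makes no claim about how the library groups or ranks tied observations internally. The data are made up.

```python
def tied_pairs(y, tol):
    """Return index pairs (i, j), i < j, with |y[i] - y[j]| < tol."""
    return [
        (i, j)
        for i in range(len(y))
        for j in range(i + 1, len(y))
        if abs(y[i] - y[j]) < tol
    ]


y = [1.00, 1.004, 2.50, 2.50]
loose = tied_pairs(y, tol=0.01)    # both close pairs count as ties
strict = tied_pairs(y, tol=0.001)  # only the exactly equal pair remains
```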

Various statistics can be found from the analysis:

(a) The score statistic $X^{T} a$. This statistic is used to test the hypothesis $H_{0} : β = 0$, see (e).
(b) The estimated variance-covariance matrix $X^{T} (B - A) X$ of the score statistic in (a).
(c) The estimate $\hat{β} = M X^{T} a$.
(d) The estimated variance-covariance matrix $M = {(X^{T} (B - A) X)}^{-1}$ of the estimate $\hat{β}$.
(e) The $χ^{2}$ statistic $Q = \hat{β}^{T} M^{-1} \hat{β} = a^{T} X {(X^{T} (B - A) X)}^{-1} X^{T} a$ used to test $H_{0} : β = 0$. Under $H_{0}$, $Q$ has an approximate $χ^{2}$-distribution with $p$ degrees of freedom.
(f) The standard errors $M_{ii}^{1/2}$ of the estimates given in (c).
(g) Approximate $z$-statistics, i.e., $Z_{i} = \hat{β}_{i} / se(\hat{β}_{i})$ for testing $H_{0} : β_{i} = 0$. For $i = 1, 2, \dots, p$, $Z_{i}$ has an approximate $N(0, 1)$ distribution.
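The statistics listed above follow from $X$, $a$, $B$ and $A$ by direct linear algebra. The sketch below spells out that arithmetic in plain Python. The names `solve` and `rank_statistics` are hypothetical, and the values of $a$, $B$ and $A$ in the example are made up for illustration; they are not the expected order-statistic quantities the library computes. It is not the library's implementation.

```python
import math


def solve(mat, rhs):
    """Solve mat @ sol = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(mat)
    aug = [[float(v) for v in mat[r]] + [float(rhs[r])] for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [aug[r][n] / aug[r][r] for r in range(n)]


def rank_statistics(X, a, B_diag, A):
    n, p = len(X), len(X[0])
    # (a) score statistic X^T a
    score = [sum(X[i][l] * a[i] for i in range(n)) for l in range(p)]
    # (b) X^T (B - A) X, with B diagonal
    W = [[(B_diag[i] if i == j else 0.0) - A[i][j] for j in range(n)]
         for i in range(n)]
    info = [[sum(X[i][l] * W[i][j] * X[j][m]
                 for i in range(n) for j in range(n))
             for m in range(p)] for l in range(p)]
    # (c) beta_hat = M X^T a, obtained by solving info @ beta_hat = score
    beta = solve(info, score)
    # (d) M = info^{-1}, built column by column
    cols = [solve(info, [1.0 if r == c else 0.0 for r in range(p)])
            for c in range(p)]
    M = [[cols[c][r] for c in range(p)] for r in range(p)]
    # (e) Q = beta_hat^T M^{-1} beta_hat, which equals score^T beta_hat
    Q = sum(s * b for s, b in zip(score, beta))
    # (f) standard errors and (g) z-statistics
    se = [math.sqrt(M[i][i]) for i in range(p)]
    z = [beta[i] / se[i] for i in range(p)]
    return score, info, beta, M, Q, se, z


# Made-up inputs: n = 3 observations, p = 2 explanatory variables.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a = [-1.0, 0.0, 1.0]
B_diag = [1.0, 1.0, 1.0]
A = [[0.0] * 3 for _ in range(3)]
score, info, beta, M, Q, se, z = rank_statistics(X, a, B_diag, A)
```

A zero pivot in `solve` would surface as a `ZeroDivisionError`; this loosely corresponds to the situation reported as errno 4, where $X^{T} (B - A) X$ is ill-conditioned or not positive definite.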

In many situations, more than one sample of observations will be available. In this case we assume the model

$$h_{k}(Y_{k}) = X_{k}^{T} β + e_{k}, \quad k = 1, 2, \dots, ns,$$

where $ns$ is the number of samples. In an obvious manner, $Y_{k}$ and $X_{k}$ are the vector of observations and the design matrix for the $k$th sample respectively. Note that the arbitrary transformation $h_{k}$ can be assumed different for each sample since observations are ranked within the sample.

The earlier analysis can be extended to give a combined estimate of $β$ as $\hat{β} = D d$, where

$$D^{-1} = \sum_{k=1}^{ns} X_{k}^{T} (B_{k} - A_{k}) X_{k}$$

and

$$d = \sum_{k=1}^{ns} X_{k}^{T} a_{k},$$

with $a_{k}$, $B_{k}$ and $A_{k}$ defined as $a$, $B$ and $A$ above but for the $k$th sample.

The remaining statistics are calculated as for the one sample case.
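The multi-sample combination amounts to summing the per-sample matrices and score vectors before solving for the estimate. A minimal sketch follows, with a hypothetical helper `combine_samples` and made-up per-sample inputs; the example uses a single explanatory variable so that the final solve is a scalar division.

```python
def combine_samples(samples):
    """Accumulate D^{-1} and d over samples.

    Each entry of `samples` is (X_k, a_k, B_k_diag, A_k), with X_k of shape
    n_k-by-p, B_k given by its diagonal and A_k an n_k-by-n_k nested list.
    """
    p = len(samples[0][0][0])
    D_inv = [[0.0] * p for _ in range(p)]
    d = [0.0] * p
    for X, a, B_diag, A in samples:
        n = len(X)
        for l in range(p):
            # d accumulates X_k^T a_k
            d[l] += sum(X[i][l] * a[i] for i in range(n))
            for m in range(p):
                # D_inv accumulates X_k^T (B_k - A_k) X_k
                D_inv[l][m] += sum(
                    X[i][l]
                    * ((B_diag[i] if i == j else 0.0) - A[i][j])
                    * X[j][m]
                    for i in range(n)
                    for j in range(n)
                )
    return D_inv, d


def zeros(n):
    return [[0.0] * n for _ in range(n)]


# Two made-up samples with p = 1 (values are illustrative only).
sample_1 = ([[1.0], [2.0], [3.0]], [-1.0, 0.0, 1.0], [1.0, 1.0, 1.0], zeros(3))
sample_2 = ([[1.0], [3.0]], [-0.5, 0.5], [1.0, 1.0], zeros(2))
D_inv, d = combine_samples([sample_1, sample_2])
beta_hat = d[0] / D_inv[0][0]  # scalar case of beta_hat = D d
```

For $p > 1$ the last line would be replaced by solving the linear system $D^{-1} \hat{β} = d$.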

References

Pettitt, A N, 1982, Inference for the linear model using a likelihood based on ranks, J. Roy. Statist. Soc. Ser. B (44), 234–243
