nag_rank_regsn (g08rac) (PDF version)
g08 Chapter Contents
g08 Chapter Introduction
NAG Library Manual

NAG Library Function Document

nag_rank_regsn (g08rac)

1  Purpose

nag_rank_regsn (g08rac) calculates the parameter estimates, score statistics and their variance-covariance matrices for the linear model using a likelihood based on the ranks of the observations.

2  Specification

#include <nag.h>
#include <nagg08.h>
void  nag_rank_regsn (Nag_OrderType order, Integer ns, const Integer nv[], const double y[], Integer p, const double x[], Integer pdx, Integer idist, Integer nmax, double tol, double prvr[], Integer pdparvar, Integer irank[], double zin[], double eta[], double vapvec[], double parest[], NagError *fail)

3  Description

Analysis of data can be made by replacing observations by their ranks. The analysis produces inference for regression parameters arising from the following model.
For random variables $Y_1, Y_2, \ldots, Y_n$ we assume that, after an arbitrary monotone increasing differentiable transformation, $h(\cdot)$, the model
$$h(Y_i) = x_i^T \beta + \varepsilon_i \qquad (1)$$
holds, where $x_i$ is a known vector of explanatory variables and $\beta$ is a vector of $p$ unknown regression coefficients. The $\varepsilon_i$ are random variables assumed to be independent and identically distributed with a completely known distribution which can be one of the following: Normal, logistic, extreme value or double-exponential. In Pettitt (1982) an estimate for $\beta$ is proposed as $\hat{\beta} = M X^T a$ with estimated variance-covariance matrix $M$. The statistics $a$ and $M$ depend on the ranks $r_i$ of the observations $Y_i$ and the density chosen for $\varepsilon_i$.
The matrix $X$ is the $n$ by $p$ matrix of explanatory variables. It is assumed that $X$ is of rank $p$ and that a column or a linear combination of columns of $X$ is not equal to the column vector of 1 or a multiple of it. This means that a constant term cannot be included in the model (1). The statistics $a$ and $M$ are found as follows. Let $\varepsilon_i$ have pdf $f(\varepsilon)$ and let $g = -f'/f$. Let $W_1, W_2, \ldots, W_n$ be order statistics for a random sample of size $n$ with the density $f(\cdot)$. Define $Z_i = g(W_i)$, then $a_i = E(Z_{r_i})$. To define $M$ we need $M^{-1} = X^T (B - A) X$, where $B$ is an $n$ by $n$ diagonal matrix with $B_{ii} = E(g'(W_{r_i}))$ and $A$ is a symmetric matrix with $A_{ij} = \mathrm{cov}(Z_{r_i}, Z_{r_j})$. In the case of the Normal distribution, the $Z_1 < \cdots < Z_n$ are standard Normal order statistics and $E(g'(W_i)) = 1$, for $i = 1, 2, \ldots, n$.
The analysis can also deal with ties in the data. Two observations are adjudged to be tied if $|Y_i - Y_j| < tol$, where tol is a user-supplied tolerance level.
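The tie rule can be illustrated with a minimal C sketch. It only applies the criterion that two observations are tied when the absolute difference is less than tol; the helper names is_tied and count_tied_pairs are hypothetical and are not part of the NAG Library, and the library's internal handling of ties may differ.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if two observations are adjudged tied
 * under the rule |yi - yj| < tol, and 0 otherwise. */
static int is_tied(double yi, double yj, double tol)
{
    assert(tol > 0.0); /* mirrors the documented constraint tol > 0.0 */
    return fabs(yi - yj) < tol;
}

/* Hypothetical helper: counts the tied pairs (i < j) among the n
 * observations of one sample. */
static size_t count_tied_pairs(const double *y, size_t n, double tol)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        for (size_t j = i + 1; j < n; ++j)
            if (is_tied(y[i], y[j], tol))
                ++count;
    return count;
}
```

A larger tol merges more observations into ties, so it is worth choosing it relative to the measurement precision of the data.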
Various statistics can be found from the analysis:
(a) The score statistic $X^T a$. This statistic is used to test the hypothesis $H_0: \beta = 0$, see (e).
(b) The estimated variance-covariance matrix $X^T (B - A) X$ of the score statistic in (a).
(c) The estimate $\hat{\beta} = M X^T a$.
(d) The estimated variance-covariance matrix $M = (X^T (B - A) X)^{-1}$ of the estimate $\hat{\beta}$.
(e) The $\chi^2$ statistic $Q = \hat{\beta}^T M^{-1} \hat{\beta} = a^T X (X^T (B - A) X)^{-1} X^T a$ used to test $H_0: \beta = 0$. Under $H_0$, $Q$ has an approximate $\chi^2$-distribution with $p$ degrees of freedom.
(f) The standard errors $M_{ii}^{1/2}$ of the estimates given in (c).
(g) Approximate $z$-statistics, i.e., $Z_i = \hat{\beta}_i / se(\hat{\beta}_i)$ for testing $H_0: \beta_i = 0$. For $i = 1, 2, \ldots, p$, $Z_i$ has an approximate $N(0,1)$ distribution.
In many situations, more than one sample of observations will be available. In this case we assume the model
$$h_k(Y_k) = X_k^T \beta + e_k, \qquad k = 1, 2, \ldots, ns,$$
where $ns$ is the number of samples. In an obvious manner, $Y_k$ and $X_k$ are the vector of observations and the design matrix for the $k$th sample respectively. Note that the arbitrary transformation $h_k$ can be assumed different for each sample since observations are ranked within the sample.
The earlier analysis can be extended to give a combined estimate of $\beta$ as $\hat{\beta} = D d$, where
$$D^{-1} = \sum_{k=1}^{ns} X_k^T (B_k - A_k) X_k$$
and
$$d = \sum_{k=1}^{ns} X_k^T a_k,$$
with $a_k$, $B_k$ and $A_k$ defined as $a$, $B$ and $A$ above but for the $k$th sample.
The remaining statistics are calculated as for the one sample case.
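The pooling step can be sketched for the special case $p = 2$. This is a sketch under stated assumptions rather than the library's algorithm: the per-sample quantities $X_k^T (B_k - A_k) X_k$ and $X_k^T a_k$ are assumed to have been computed already, the type SampleSummary and the function combine_samples are hypothetical, and the 2 by 2 system is solved in closed form purely to keep the example short.

```c
#include <assert.h>

/* Hypothetical per-sample summary for p = 2: info holds the 2 by 2 matrix
 * X_k^T (B_k - A_k) X_k in row-major order, score holds X_k^T a_k. */
typedef struct {
    double info[4];
    double score[2];
} SampleSummary;

/* Sums the per-sample matrices into D^{-1} and the scores into d, then
 * solves D^{-1} beta = d, which is equivalent to beta = D d.
 * Returns 0 on success, or -1 if the pooled matrix is not positive
 * definite (compare the NE_MAT_ILL_DEFINED condition). */
static int combine_samples(const SampleSummary *s, int ns, double beta[2])
{
    assert(ns >= 1);
    double dinv[4] = {0.0, 0.0, 0.0, 0.0};
    double d[2] = {0.0, 0.0};
    for (int k = 0; k < ns; ++k) {
        for (int i = 0; i < 4; ++i)
            dinv[i] += s[k].info[i];
        for (int i = 0; i < 2; ++i)
            d[i] += s[k].score[i];
    }
    double det = dinv[0] * dinv[3] - dinv[1] * dinv[2];
    if (dinv[0] <= 0.0 || det <= 0.0)
        return -1;
    /* Closed-form inverse of a 2 by 2 matrix applied to d. */
    beta[0] = (dinv[3] * d[0] - dinv[1] * d[1]) / det;
    beta[1] = (dinv[0] * d[1] - dinv[2] * d[0]) / det;
    return 0;
}
```

For general $p$ a Cholesky factorization of the pooled matrix would be the more usual choice than an explicit inverse.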

4  References

Pettitt A N (1982) Inference for the linear model using a likelihood based on ranks J. Roy. Statist. Soc. Ser. B 44 234–243

5  Arguments

1:     order     Nag_OrderType     Input
On entry: the order argument specifies the two-dimensional storage scheme being used, i.e., row-major ordering or column-major ordering. C language defined storage is specified by order=Nag_RowMajor. See Section 3.2.1.3 in the Essential Introduction for a more detailed explanation of the use of this argument.
Constraint: order=Nag_RowMajor or Nag_ColMajor.
2:     ns     Integer     Input
On entry: the number of samples.
Constraint: $ns \ge 1$.
3:     nv[ns]     const Integer     Input
On entry: the number of observations in the $i$th sample, for $i = 1, 2, \ldots, ns$.
Constraint: $nv[i] \ge 1$, for $i = 0, 1, \ldots, ns-1$.
4:     y[dim]     const double     Input
Note: the dimension, dim, of the array y must be at least $\sum_{i=1}^{ns} nv[i-1]$.
On entry: the observations in each sample. Specifically, $y[\sum_{k=1}^{i-1} nv[k-1] + j - 1]$ must contain the $j$th observation in the $i$th sample.
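The layout of y amounts to concatenating the samples one after another. A minimal sketch of the index calculation follows; the helper y_index is hypothetical, and plain int stands in for the NAG Integer type so that the fragment compiles without the NAG headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: position in y of the j-th observation of the i-th
 * sample, with i and j 1-based as in the text, i.e.
 * sum_{k=1}^{i-1} nv[k-1] + j - 1. */
static size_t y_index(const int *nv, int ns, int i, int j)
{
    assert(i >= 1 && i <= ns);
    assert(j >= 1 && j <= nv[i - 1]);
    size_t offset = 0;
    for (int k = 1; k < i; ++k)
        offset += (size_t)nv[k - 1]; /* skip the earlier samples */
    return offset + (size_t)(j - 1);
}
```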
5:     p     Integer     Input
On entry: the number of parameters to be fitted.
Constraint: $p \ge 1$.
6:     x[dim]     const double     Input
Note: the dimension, dim, of the array x must be at least
  • $\max(1, pdx \times p)$ when order=Nag_ColMajor;
  • $\max(1, \sum_{i=1}^{ns} nv[i-1] \times pdx)$ when order=Nag_RowMajor.
Where $X(i,j)$ appears in this document, it refers to the array element
  • $x[(j-1) \times pdx + i - 1]$ when order=Nag_ColMajor;
  • $x[(i-1) \times pdx + j - 1]$ when order=Nag_RowMajor.
On entry: the design matrices for each sample. Specifically, $X(\sum_{k=1}^{i-1} nv[k-1] + j, l)$ must contain the value of the $l$th explanatory variable for the $j$th observation in the $i$th sample.
Constraint: x must not contain a column with all elements equal.
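The two storage schemes for x differ only in which index is multiplied by the stride pdx. The sketch below shows the mapping from the 1-based pair $(i,j)$ to an array offset; the enum StorageOrder is a hypothetical stand-in for Nag_OrderType and x_index is a hypothetical helper, both introduced so the fragment compiles without the NAG headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Nag_OrderType. */
typedef enum { ROW_MAJOR, COL_MAJOR } StorageOrder;

/* Hypothetical helper: offset in x of element X(i,j), with i the 1-based
 * observation index over all samples and j the 1-based variable index. */
static size_t x_index(StorageOrder order, int pdx, int i, int j)
{
    assert(i >= 1 && j >= 1 && pdx >= 1);
    return order == COL_MAJOR
        ? (size_t)(j - 1) * (size_t)pdx + (size_t)(i - 1)
        : (size_t)(i - 1) * (size_t)pdx + (size_t)(j - 1);
}
```

With 5 observations and 2 variables, pdx would be at least 5 for column-major storage and at least 2 for row-major storage.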
7:     pdx     Integer     Input
On entry: the stride separating row or column elements (depending on the value of order) in the array x.
Constraints:
  • if order=Nag_ColMajor, $pdx \ge \sum_{i=1}^{ns} nv[i-1]$;
  • if order=Nag_RowMajor, $pdx \ge p$.
8:     idist     Integer     Input
On entry: the error distribution to be used in the analysis.
idist=1
Normal.
idist=2
Logistic.
idist=3
Extreme value.
idist=4
Double-exponential.
Constraint: $1 \le idist \le 4$.
9:     nmax     Integer     Input
On entry: the value of the largest sample size.
Constraint: $nmax = \max_{1 \le i \le ns} nv[i-1]$ and $nmax > p$.
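Because nmax must equal the largest entry of nv exactly, it is usually safest to derive it from nv rather than set it by hand. A minimal sketch, with hypothetical helper names and plain int standing in for the NAG Integer type:

```c
#include <assert.h>

/* Hypothetical helper: largest sample size, the value required for nmax. */
static int largest_sample(const int *nv, int ns)
{
    assert(ns >= 1);
    int nmax = nv[0];
    for (int i = 1; i < ns; ++i)
        if (nv[i] > nmax)
            nmax = nv[i];
    return nmax;
}

/* Hypothetical helper: total number of observations, the minimum
 * dimension of y. */
static int total_observations(const int *nv, int ns)
{
    assert(ns >= 1);
    int total = 0;
    for (int i = 0; i < ns; ++i)
        total += nv[i];
    return total;
}
```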
10:   tol     double     Input
On entry: the tolerance for judging whether two observations are tied. Thus, observations $Y_i$ and $Y_j$ are adjudged to be tied if $|Y_i - Y_j| < tol$.
Constraint: $tol > 0.0$.
11:   prvr[dim]     double     Output
Note: the dimension, dim, of the array prvr must be at least
  • $\max(1, pdparvar \times p)$ when order=Nag_ColMajor;
  • $\max(1, (p+1) \times pdparvar)$ when order=Nag_RowMajor.
Where $PRVR(i,j)$ appears in this document, it refers to the array element
  • $prvr[(j-1) \times pdparvar + i - 1]$ when order=Nag_ColMajor;
  • $prvr[(i-1) \times pdparvar + j - 1]$ when order=Nag_RowMajor.
On exit: the variance-covariance matrices of the score statistics and the parameter estimates, the former being stored in the upper triangle and the latter in the lower triangle. Thus for $1 \le i \le j \le p$, $PRVR(i,j)$ contains an estimate of the covariance between the $i$th and $j$th score statistics. For $1 \le j \le i \le p-1$, $PRVR(i+1,j)$ contains an estimate of the covariance between the $i$th and $j$th parameter estimates.
12:   pdparvar     Integer     Input
On entry: the stride separating row or column elements (depending on the value of order) in the array prvr.
Constraints:
  • if order=Nag_ColMajor, $pdparvar \ge p+1$;
  • if order=Nag_RowMajor, $pdparvar \ge p$.
13:   irank[nmax]     Integer     Output
On exit: for the one sample case, irank contains the ranks of the observations.
14:   zin[nmax]     double     Output
On exit: for the one sample case, zin contains the expected values of the function $g(\cdot)$ of the order statistics.
15:   eta[nmax]     double     Output
On exit: for the one sample case, eta contains the expected values of the function $g'(\cdot)$ of the order statistics.
16:   vapvec[nmax×(nmax+1)/2]     double     Output
On exit: for the one sample case, vapvec contains the upper triangle of the variance-covariance matrix of the function $g(\cdot)$ of the order statistics stored column-wise.
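Storing an upper triangle column-wise means that column $j$ contributes its first $j$ entries in turn. The sketch below gives the conventional packed-storage index for that scheme; it assumes vapvec follows this standard layout, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in the packed upper triangle of an n by n matrix. */
static size_t packed_length(size_t n)
{
    return n * (n + 1) / 2;
}

/* Hypothetical helper: offset of element (i,j), 1-based with i <= j, in an
 * upper triangle packed column by column. Columns 1..j-1 occupy the first
 * j(j-1)/2 positions, and row i is the (i-1)-th entry within column j. */
static size_t packed_upper_index(size_t i, size_t j)
{
    assert(i >= 1 && i <= j);
    return (i - 1) + j * (j - 1) / 2;
}
```

For $i > j$ the symmetric element $(j,i)$ would be read instead.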
17:   parest[4×p+1]     double     Output
On exit: the statistics calculated by the function.
The first p components of parest contain the score statistics.
The next p elements contain the parameter estimates.
parest[2×p] contains the value of the $\chi^2$ statistic.
The next p elements of parest contain the standard errors of the parameter estimates.
Finally, the remaining p elements of parest contain the z-statistics.
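The five blocks of parest can be unpacked with simple pointer offsets. The following sketch mirrors the layout just described; the type ParestView and the function view_parest are hypothetical conveniences, not part of the NAG Library.

```c
#include <assert.h>

/* Hypothetical read-only view onto the blocks of parest. */
typedef struct {
    const double *score;    /* parest[0 .. p-1]:      score statistics      */
    const double *estimate; /* parest[p .. 2p-1]:     parameter estimates   */
    double chi2;            /* parest[2p]:            chi-square statistic  */
    const double *se;       /* parest[2p+1 .. 3p]:    standard errors       */
    const double *z;        /* parest[3p+1 .. 4p]:    z-statistics          */
} ParestView;

/* Splits a parest array of length 4*p + 1 into its named blocks. */
static ParestView view_parest(const double *parest, int p)
{
    assert(p >= 1);
    ParestView v;
    v.score = parest;
    v.estimate = parest + p;
    v.chi2 = parest[2 * p];
    v.se = parest + 2 * p + 1;
    v.z = parest + 3 * p + 1;
    return v;
}
```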
18:   fail     NagError *     Input/Output
The NAG error argument (see Section 3.6 in the Essential Introduction).

6  Error Indicators and Warnings

NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, argument value had an illegal value.
NE_INT
On entry, idist is outside the range 1 to 4: idist=value.
On entry, ns=value.
Constraint: $ns \ge 1$.
On entry, p=value.
Constraint: $p \ge 1$.
On entry, pdparvar=value.
Constraint: $pdparvar > 0$.
On entry, pdx=value.
Constraint: $pdx > 0$.
NE_INT_2
On entry, nmax=value and p=value.
Constraint: $nmax > p$.
On entry, pdparvar=value and p=value.
Constraint: $pdparvar \ge p$.
On entry, pdparvar=value and p=value.
Constraint: $pdparvar \ge p+1$.
On entry, pdx=value and p=value.
Constraint: $pdx \ge p$.
On entry, pdx=value and sum nv[i-1]=value.
Constraint: $pdx \ge$ the sum of nv[i-1].
NE_INT_ARRAY
On entry, nv[i]=value and ns=value.
Constraint: $nv[i] \ge 1$, for $i = 0, 1, \ldots, ns-1$.
NE_INT_ARRAY_ELEM_CONS
On entry M=value.
Constraint: M elements of array nv>0.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_MAT_ILL_DEFINED
The matrix $X^T (B - A) X$ is either singular or not positive definite.
NE_OBSERVATIONS
All the observations were adjudged to be tied.
NE_REAL
On entry, tol=value.
Constraint: tol>0.0.
NE_REAL_ARRAY_ELEM_CONS
On entry, all elements in column value of x are equal to value.
NE_SAMPLE
The largest sample size is value which is not equal to nmax, nmax=value.

7  Accuracy

The computations are believed to be stable.

8  Parallelism and Performance

nag_rank_regsn (g08rac) is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
nag_rank_regsn (g08rac) makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the Users' Note for your implementation for any additional implementation-specific information.

9  Further Comments

The time taken by nag_rank_regsn (g08rac) depends on the number of samples, the total number of observations and the number of parameters fitted.
In extreme cases the parameter estimates for certain models can be infinite, although this is unlikely to occur in practice. See Pettitt (1982) for further details.

10  Example

This example fits a regression model to a single sample of 20 observations using two explanatory variables. The error distribution is taken to be logistic.

10.1  Program Text

Program Text (g08race.c)

10.2  Program Data

Program Data (g08race.d)

10.3  Program Results

Program Results (g08race.r)



© The Numerical Algorithms Group Ltd, Oxford, UK. 2014