NAG FL Interface
g08rbf (rank_regsn_censored)

1 Purpose

g08rbf calculates the parameter estimates, score statistics and their variance-covariance matrices for the linear model using a likelihood based on the ranks of the observations when some of the observations may be right-censored.

2 Specification

Fortran Interface
Subroutine g08rbf ( ns, nv, nsum, y, ip, x, ldx, icen, gamma, nmax, tol, prvr, ldprvr, irank, zin, eta, vapvec, parest, work, lwork, iwa, ifail)
Integer, Intent (In) :: ns, nv(ns), nsum, ip, ldx, icen(nsum), nmax, ldprvr, lwork
Integer, Intent (Inout) :: ifail
Integer, Intent (Out) :: irank(nmax), iwa(0)
Real (Kind=nag_wp), Intent (In) :: y(nsum), x(ldx,ip), gamma, tol
Real (Kind=nag_wp), Intent (Inout) :: prvr(ldprvr,ip)
Real (Kind=nag_wp), Intent (Out) :: zin(nmax), eta(nmax), vapvec(nmax*(nmax+1)/2), parest(4*ip+1), work(0)
C Header Interface
#include <nag.h>
void  g08rbf_ (const Integer *ns, const Integer nv[], const Integer *nsum, const double y[], const Integer *ip, const double x[], const Integer *ldx, const Integer icen[], const double *gamma, const Integer *nmax, const double *tol, double prvr[], const Integer *ldprvr, Integer irank[], double zin[], double eta[], double vapvec[], double parest[], double work[], const Integer *lwork, Integer iwa[], Integer *ifail)
The routine may be called by the names g08rbf or nagf_nonpar_rank_regsn_censored.

3 Description

Analysis of data can be made by replacing observations by their ranks. The analysis produces inference for the regression model where the location parameters of the observations, $\theta_i$, for $i=1,2,\ldots,n$, are related by $\theta = X\beta$. Here $X$ is an $n \times p$ matrix of explanatory variables and $\beta$ is a vector of $p$ unknown regression parameters. The observations are replaced by their ranks and an approximation, based on Taylor's series expansion, made to the rank marginal likelihood. For details of the approximation see Pettitt (1982).
An observation is said to be right-censored if we can only observe $Y_j^*$ with $Y_j^* \le Y_j$. We rank censored and uncensored observations as follows. Suppose we can observe $Y_j$, for $j=1,2,\ldots,n$, directly but $Y_j^*$, for $j=n+1,\ldots,q$ and $n \le q$, are censored on the right. We define the rank $r_j$ of $Y_j$, for $j=1,2,\ldots,n$, in the usual way; $r_j$ equals $i$ if and only if $Y_j$ is the $i$th smallest amongst the $Y_1,Y_2,\ldots,Y_n$. The right-censored $Y_j^*$, for $j=n+1,n+2,\ldots,q$, has rank $r_j$ if and only if $Y_j^*$ lies in the interval $[Y_{(r_j)},Y_{(r_j+1)}]$, with $Y_{(0)}=-\infty$, $Y_{(n+1)}=+\infty$ and $Y_{(1)}<\cdots<Y_{(n)}$ the ordered $Y_j$, for $j=1,2,\ldots,n$.
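The ranking rule above can be illustrated with a short Python sketch. It is not the algorithm used inside g08rbf: the function name `censored_ranks` is hypothetical, ties are assumed absent (g08rbf handles them through tol), and a censored value equal to an uncensored one is assigned to the interval starting at that value.

```python
from bisect import bisect_right

def censored_ranks(y, censored):
    """Ranks of uncensored values and interval ranks of censored values.

    y        : list of observed values
    censored : list of flags, True where the value is right-censored
    Assumes no ties among the uncensored values (illustration only).
    """
    # Ordered uncensored observations Y_(1) < ... < Y_(n).
    ordered = sorted(v for v, c in zip(y, censored) if not c)
    # bisect_right counts the uncensored values <= v. For an uncensored v this
    # is its usual rank; for a censored v it is the r with Y_(r) <= v < Y_(r+1),
    # where r = 0 means v lies below every uncensored value.
    return [bisect_right(ordered, v) for v in y]

y = [2.1, 3.5, 0.7, 2.8, 4.0]
censored = [False, False, False, True, True]
print(censored_ranks(y, censored))  # → [2, 3, 1, 2, 3]
```

Here the censored value 2.8 falls between the second and third ordered uncensored values, so it receives rank 2, and 4.0 lies above all uncensored values, so it receives rank 3.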
The distribution of the $Y$ is assumed to be of the following form. Let $F_L(y)=e^y/(1+e^y)$, the logistic distribution function, and consider the distribution function $F_\gamma(y)$ defined by $1-F_\gamma(y)=[1-F_L(y)]^{1/\gamma}$. This distribution function can be thought of as either the distribution function of the minimum, $X_{1,\gamma}$, of a random sample of size $\gamma^{-1}$ from the logistic distribution, or as the $F_\gamma(y-\log\gamma)$ being the distribution function of a random variable having the $F$-distribution with $2$ and $2\gamma^{-1}$ degrees of freedom. This family of generalized logistic distribution functions $[F_\gamma(.);\,0\le\gamma<\infty]$ naturally links the symmetric logistic distribution ($\gamma=1$) with the skew extreme value distribution ($\lim\gamma\to 0$) and with the limiting negative exponential distribution ($\lim\gamma\to\infty$). For this family explicit results are available for right-censored data. See Pettitt (1983) for details.
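As a minimal sketch of the definition above, the following Python functions evaluate the logistic distribution function and the generalized logistic distribution function for a strictly positive parameter. The function names are hypothetical, the limiting case of the parameter tending to zero is deliberately not covered, and no attempt is made to guard against overflow for extreme arguments.

```python
import math

def logistic_cdf(y):
    """F_L(y) = e^y / (1 + e^y), written in the equivalent form 1 / (1 + e^-y)."""
    return 1.0 / (1.0 + math.exp(-y))

def generalized_logistic_cdf(y, gamma):
    """F_gamma(y) defined by 1 - F_gamma(y) = [1 - F_L(y)] ** (1 / gamma)."""
    if gamma <= 0.0:
        # The limit gamma -> 0 needs separate treatment and is not sketched here.
        raise ValueError("this sketch only covers gamma > 0")
    survival = 1.0 - logistic_cdf(y)
    return 1.0 - survival ** (1.0 / gamma)

# gamma = 1 recovers the logistic distribution itself.
print(generalized_logistic_cdf(0.0, 1.0))
# gamma = 0.5 is the minimum of a sample of size 2: 1 - 0.5**2.
print(generalized_logistic_cdf(0.0, 0.5))
```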
Let $l_R$ denote the logarithm of the rank marginal likelihood of the observations and define the $q\times 1$ vector $a$ by $a=l_R'(\theta=0)$, and let the $q\times q$ diagonal matrix $B$ and $q\times q$ symmetric matrix $A$ be given by $B-A=-l_R''(\theta=0)$. Then various statistics can be found from the analysis.
  (a) The score statistic $X^Ta$. This statistic is used to test the hypothesis $H_0:\beta=0$ (see (e)).
  (b) The estimated variance-covariance matrix of the score statistic in (a).
  (c) The estimate $\hat\beta_R=MX^Ta$.
  (d) The estimated variance-covariance matrix $M=(X^T(B-A)X)^{-1}$ of the estimate $\hat\beta_R$.
  (e) The $\chi^2$ statistic $Q=\hat\beta_R^TM^{-1}\hat\beta_R=a^TX(X^T(B-A)X)^{-1}X^Ta$, used to test $H_0:\beta=0$. Under $H_0$, $Q$ has an approximate $\chi^2$-distribution with $p$ degrees of freedom.
  (f) The standard errors $M_{ii}^{1/2}$ of the estimates given in (c).
  (g) Approximate $z$-statistics, i.e., $Z_i=\hat\beta_{R_i}/se(\hat\beta_{R_i})$ for testing $H_0:\beta_i=0$. For $i=1,2,\ldots,n$, $Z_i$ has an approximate $N(0,1)$ distribution.
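For a single explanatory variable ($p=1$) the quantities in (a) to (g) reduce to scalars, which the following Python sketch computes. It assumes that the vector $a$ and the matrix $B-A$ have already been obtained; it does not derive them from the ranks. The function name and the numerical inputs are hypothetical and chosen only for illustration.

```python
import math

def rank_statistics_one_covariate(x, a, b_minus_a):
    """Statistics (a)-(g) when X has a single column (p = 1).

    x         : the q covariate values (the single column of X)
    a         : the q-vector a, assumed already computed
    b_minus_a : the q-by-q matrix B - A as nested lists, assumed already computed
    """
    q = len(x)
    score = sum(x[i] * a[i] for i in range(q))              # (a) X^T a
    info = sum(x[i] * b_minus_a[i][j] * x[j]
               for i in range(q) for j in range(q))         # X^T (B - A) X
    if info <= 0.0:
        raise ValueError("X^T (B - A) X is not positive definite")
    m = 1.0 / info                                          # (d) M
    beta = m * score                                        # (c) M X^T a
    chi2 = score * m * score                                # (e) Q
    se = math.sqrt(m)                                       # (f) M_11 ** (1/2)
    z = beta / se                                           # (g) z-statistic
    return {"score": score, "info": info, "M": m,
            "beta": beta, "chi2": chi2, "se": se, "z": z}

# Hypothetical inputs, not taken from any real data set.
x = [1.0, 2.0, 3.0]
a = [-0.5, 0.1, 0.4]
b_minus_a = [[0.5, -0.1, 0.0],
             [-0.1, 0.6, -0.1],
             [0.0, -0.1, 0.5]]
stats = rank_statistics_one_covariate(x, a, b_minus_a)
print(stats["beta"], stats["chi2"], stats["z"])
```

With one parameter the $\chi^2$ statistic equals the square of the $z$-statistic, which gives a quick consistency check on the output.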
In many situations, more than one sample of observations will be available. In this case we assume the model,
$$h_k(Y_k) = X_k^T\beta + e_k, \quad k=1,2,\ldots,\mathit{ns},$$
where ns is the number of samples. In an obvious manner, $Y_k$ and $X_k$ are the vector of observations and the design matrix for the $k$th sample respectively. Note that the arbitrary transformation $h_k$ can be assumed different for each sample since observations are ranked within the sample.
The earlier analysis can be extended to give a combined estimate of $\beta$ as $\hat\beta=Dd$, where
$$D^{-1} = \sum_{k=1}^{\mathit{ns}} X_k^T(B_k-A_k)X_k$$
and
$$d = \sum_{k=1}^{\mathit{ns}} X_k^T a_k,$$
with $a_k$, $B_k$ and $A_k$ defined as $a$, $B$ and $A$ above but for the $k$th sample.
The remaining statistics are calculated as for the one sample case.
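The combination over samples can be sketched in Python for the case of one explanatory variable, where $D^{-1}$ and $d$ are scalars. As before, the per-sample quantities $a_k$ and $B_k-A_k$ are assumed to be available already, and the function name and inputs are hypothetical.

```python
def combine_samples(samples):
    """Combined estimate of beta over several samples for p = 1.

    samples : iterable of (x_k, a_k, bma_k), holding the covariate values,
              the vector a_k and the matrix B_k - A_k of the k-th sample.
    Returns the combined estimate D * d together with D.
    """
    d_inv = 0.0  # accumulates sum_k X_k^T (B_k - A_k) X_k
    d = 0.0      # accumulates sum_k X_k^T a_k
    for x, a, bma in samples:
        n = len(x)
        d_inv += sum(x[i] * bma[i][j] * x[j]
                     for i in range(n) for j in range(n))
        d += sum(x[i] * a[i] for i in range(n))
    if d_inv <= 0.0:
        raise ValueError("the accumulated matrix is not positive definite")
    big_d = 1.0 / d_inv
    return big_d * d, big_d

# Two hypothetical samples of sizes 2 and 1.
samples = [
    ([1.0, 2.0], [0.3, -0.1], [[1.0, 0.0], [0.0, 1.0]]),
    ([1.0], [0.4], [[2.0]]),
]
beta_hat, big_d = combine_samples(samples)
print(beta_hat, big_d)
```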

4 References

Kalbfleisch J D and Prentice R L (1980) The Statistical Analysis of Failure Time Data Wiley
Pettitt A N (1982) Inference for the linear model using a likelihood based on ranks J. Roy. Statist. Soc. Ser. B 44 234–243
Pettitt A N (1983) Approximate methods using ranks for regression with censored data Biometrika 70 121–132

5 Arguments

1: ns Integer Input
On entry: the number of samples.
Constraint: ns ≥ 1.
2: nv(ns) Integer array Input
On entry: the number of observations in the ith sample, for i=1,2,…,ns.
Constraint: nv(i) ≥ 1, for i=1,2,…,ns.
3: nsum Integer Input
On entry: the total number of observations.
Constraint: $\mathrm{nsum}=\sum_{i=1}^{\mathrm{ns}}\mathrm{nv}(i)$.
4: y(nsum) Real (Kind=nag_wp) array Input
On entry: the observations in each sample. Specifically, $\mathrm{y}\left(\sum_{k=1}^{i-1}\mathrm{nv}(k)+j\right)$ must contain the jth observation in the ith sample.
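The storage rule for y amounts to concatenating the samples in order. A minimal Python sketch of the index calculation follows; the helper name `flat_index` is hypothetical, and it returns the 1-based position used in the Fortran interface.

```python
def flat_index(nv, i, j):
    """1-based position in y of the j-th observation of the i-th sample.

    nv : list of sample sizes, nv[0] being the size of the first sample
    i  : sample number, counted from 1
    j  : observation number within the sample, counted from 1
    """
    if not (1 <= i <= len(nv)) or not (1 <= j <= nv[i - 1]):
        raise IndexError("sample or observation number out of range")
    # Skip all observations of samples 1, ..., i-1, then move j places on.
    return sum(nv[: i - 1]) + j

nv = [3, 2, 4]
print(flat_index(nv, 2, 1))  # → 4
print(flat_index(nv, 3, 4))  # → 9
```

The same row index applies to the design matrix x, whose rows follow the ordering of y.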
5: ip Integer Input
On entry: the number of parameters to be fitted.
Constraint: ip ≥ 1.
6: x(ldx,ip) Real (Kind=nag_wp) array Input
On entry: the design matrices for each sample. Specifically, $\mathrm{x}\left(\sum_{k=1}^{i-1}\mathrm{nv}(k)+j,\,l\right)$ must contain the value of the lth explanatory variable for the jth observation in the ith sample.
Constraint: x must not contain a column with all elements equal.
7: ldx Integer Input
On entry: the first dimension of the array x as declared in the (sub)program from which g08rbf is called.
Constraint: ldx ≥ nsum.
8: icen(nsum) Integer array Input
On entry: defines the censoring variable for the observations in y.
icen(i)=0
If y(i) is uncensored.
icen(i)=1
If y(i) is censored.
Constraint: icen(i)=0 or 1, for i=1,2,…,nsum.
9: gamma Real (Kind=nag_wp) Input
On entry: the value of the parameter defining the generalized logistic distribution. For gamma ≤ 0.0001, the limiting extreme value distribution is assumed.
Constraint: gamma ≥ 0.0.
10: nmax Integer Input
On entry: the value of the largest sample size.
Constraint: $\mathrm{nmax}=\max_{1\le i\le \mathrm{ns}}(\mathrm{nv}(i))$ and nmax > ip.
11: tol Real (Kind=nag_wp) Input
On entry: the tolerance for judging whether two observations are tied. Thus, observations $Y_i$ and $Y_j$ are adjudged to be tied if $|Y_i-Y_j|<\mathrm{tol}$.
Constraint: tol>0.0.
12: prvr(ldprvr,ip) Real (Kind=nag_wp) array Output
On exit: the variance-covariance matrices of the score statistics and the parameter estimates, the former being stored in the upper triangle and the latter in the lower triangle. Thus for 1 ≤ i ≤ j ≤ ip, prvr(i,j) contains an estimate of the covariance between the ith and jth score statistics. For 1 ≤ j ≤ i ≤ ip−1, prvr(i+1,j) contains an estimate of the covariance between the ith and jth parameter estimates.
13: ldprvr Integer Input
On entry: the first dimension of the array prvr as declared in the (sub)program from which g08rbf is called.
Constraint: ldprvr ≥ ip+1.
14: irank(nmax) Integer array Output
On exit: for the one sample case, irank contains the ranks of the observations.
15: zin(nmax) Real (Kind=nag_wp) array Output
On exit: for the one sample case, zin contains the expected values of the function g(.) of the order statistics.
16: eta(nmax) Real (Kind=nag_wp) array Output
On exit: for the one sample case, eta contains the expected values of the function g′(.) of the order statistics.
17: vapvec(nmax×(nmax+1)/2) Real (Kind=nag_wp) array Output
On exit: for the one sample case, vapvec contains the upper triangle of the variance-covariance matrix of the function g(.) of the order statistics stored column-wise.
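Assuming the conventional column-wise packed layout for an upper triangle, element (i,j) with i ≤ j sits at 1-based position i + j(j−1)/2 of vapvec. The Python sketch below applies that assumption to rebuild the full symmetric matrix; the function names are hypothetical.

```python
def vapvec_index(i, j):
    """1-based position in vapvec of element (i, j) of the symmetric matrix."""
    if i > j:
        i, j = j, i  # use symmetry to reach the stored upper triangle
    return i + j * (j - 1) // 2

def unpack_upper(vapvec, n):
    """Expand a column-wise packed upper triangle into a full n-by-n matrix."""
    if len(vapvec) != n * (n + 1) // 2:
        raise ValueError("vapvec must have n*(n+1)/2 elements")
    full = [[0.0] * n for _ in range(n)]
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            value = vapvec[vapvec_index(i, j) - 1]
            full[i - 1][j - 1] = value
            full[j - 1][i - 1] = value
    return full

print(unpack_upper([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3))
# → [[1.0, 2.0, 4.0], [2.0, 3.0, 5.0], [4.0, 5.0, 6.0]]
```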
18: parest(4×ip+1) Real (Kind=nag_wp) array Output
On exit: the statistics calculated by the routine.
The first ip components of parest contain the score statistics.
The next ip elements contain the parameter estimates.
parest(2×ip+1) contains the value of the χ2 statistic.
The next ip elements of parest contain the standard errors of the parameter estimates.
Finally, the remaining ip elements of parest contain the z-statistics.
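The layout of parest described above can be unpacked as in the following Python sketch. The function name and dictionary keys are hypothetical; only the ordering of the 4×ip+1 elements is taken from the description.

```python
def split_parest(parest, ip):
    """Split the flat parest array into its documented components."""
    if len(parest) != 4 * ip + 1:
        raise ValueError("parest must have 4*ip+1 elements")
    return {
        "score": parest[0:ip],                          # parest(1 .. ip)
        "estimates": parest[ip:2 * ip],                 # parest(ip+1 .. 2*ip)
        "chi2": parest[2 * ip],                         # parest(2*ip+1)
        "std_errors": parest[2 * ip + 1:3 * ip + 1],    # parest(2*ip+2 .. 3*ip+1)
        "z": parest[3 * ip + 1:4 * ip + 1],             # parest(3*ip+2 .. 4*ip+1)
    }

# Placeholder values 1..9 stand in for real output with ip = 2.
parts = split_parest(list(range(1, 10)), 2)
print(parts["chi2"])  # → 5
```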
19: work(0) Real (Kind=nag_wp) array Output
20: lwork Integer Input
21: iwa(0) Integer array Output
work, lwork and iwa are no longer required by g08rbf but are retained for backwards compatibility.
22: ifail Integer Input/Output
On entry: ifail must be set to 0, −1 or 1 to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of 0 causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of −1 means that an error message is printed while a value of 1 means that it is not.
If halting is not appropriate, the value −1 or 1 is recommended. If message printing is undesirable, then the value 1 is recommended. Otherwise, the value 0 is recommended. When the value -1 or 1 is used it is essential to test the value of ifail on exit.
On exit: ifail=0 unless the routine detects an error or a warning has been flagged (see Section 6).
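The scalar and integer-array constraints listed in this section can be checked before the call. The Python sketch below is a hypothetical pre-check, not part of the NAG Library; it covers only the constraints stated above for ns, nv, nsum, ip, ldx, icen, gamma, nmax, tol and ldprvr, and does not examine the columns of x.

```python
def check_g08rbf_inputs(ns, nv, nsum, ip, ldx, icen, gamma, nmax, tol, ldprvr):
    """Return the documented constraints violated by the given inputs."""
    violated = []
    if ns < 1:
        violated.append("ns >= 1")
    if any(n < 1 for n in nv):
        violated.append("nv(i) >= 1")
    if nsum != sum(nv):
        violated.append("nsum = sum of nv(i)")
    if ip < 1:
        violated.append("ip >= 1")
    if ldx < nsum:
        violated.append("ldx >= nsum")
    if any(c not in (0, 1) for c in icen):
        violated.append("icen(i) = 0 or 1")
    if gamma < 0.0:
        violated.append("gamma >= 0.0")
    if not nv or nmax != max(nv):
        violated.append("nmax = max nv(i)")
    if nmax <= ip:
        violated.append("nmax > ip")
    if tol <= 0.0:
        violated.append("tol > 0.0")
    if ldprvr < ip + 1:
        violated.append("ldprvr >= ip + 1")
    return violated

# One sample of 40 observations and one explanatory variable;
# the gamma and tol values here are illustrative only.
print(check_g08rbf_inputs(1, [40], 40, 1, 40, [0] * 40, 0.00001, 40, 0.00001, 2))
```

An empty list means none of the checked constraints is violated; the routine itself remains the authority on input validity through ifail.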

6 Error Indicators and Warnings

If on entry ifail=0 or −1, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
ifail=1
On entry, value elements of nv are <1.
Constraint: nv(i) ≥ 1.
On entry, gamma=value.
Constraint: gamma ≥ 0.0.
On entry, ip=value.
Constraint: ip ≥ 1.
On entry, ldprvr=value and ip=value.
Constraint: ldprvr ≥ ip+1.
On entry, ldx=value and nsum=value.
Constraint: ldx ≥ nsum.
On entry, $\max_i(\mathrm{nv}(i))$=value and nmax=value.
Constraint: $\max_i(\mathrm{nv}(i))$=nmax.
On entry, nmax=value and ip=value.
Constraint: nmax>ip.
On entry, ns=value.
Constraint: ns ≥ 1.
On entry, tol=value.
Constraint: tol>0.0.
On entry, $\sum_i\mathrm{nv}(i)$=value and nsum=value.
Constraint: $\sum_i\mathrm{nv}(i)$=nsum.
ifail=2
On entry, value elements of icen are out of range.
Constraint: icen(i)=0 or 1, for all i.
ifail=3
On entry, all the observations were adjudged to be tied. You are advised to check the value supplied for tol.
ifail=4
The matrix $X^T(B-A)X$ is either ill-conditioned or not positive definite. This error should only occur with extreme rankings of the data.
ifail=5
On entry, for j=value, x(i,j)=value for all i.
Constraint: x(i,j) ≠ x(i+1,j) for at least one i.
ifail=-99
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
ifail=-399
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
ifail=-999
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.

7 Accuracy

The computations are believed to be stable.

8 Parallelism and Performance

g08rbf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g08rbf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

The time taken by g08rbf depends on the number of samples, the total number of observations and the number of parameters fitted.
In extreme cases the parameter estimates for certain models can be infinite, although this is unlikely to occur in practice. See Pettitt (1982) for further details.

10 Example

This example fits a regression model to a single sample of 40 observations using just one explanatory variable.

10.1 Program Text

Program Text (g08rbfe.f90)

10.2 Program Data

Program Data (g08rbfe.d)

10.3 Program Results

Program Results (g08rbfe.r)