NAG FL Interface g02qff (quantile_linreg_easy)

1 Purpose

g02qff performs a multiple linear quantile regression, returning the parameter estimates and associated confidence limits based on an assumption of Normal, independent, identically distributed errors. g02qff is a simplified version of g02qgf.

2 Specification

Fortran Interface
 Subroutine g02qff ( n, m, x, y, ntau, tau, df, b, bl, bu, info, ifail)
 Integer, Intent (In) :: n, m, ntau
 Integer, Intent (Inout) :: ifail
 Integer, Intent (Out) :: info(ntau)
 Real (Kind=nag_wp), Intent (In) :: x(n,m), y(n), tau(ntau)
 Real (Kind=nag_wp), Intent (Out) :: df, b(m,ntau), bl(m,ntau), bu(m,ntau)
C Header Interface
 #include <nag.h>
 void g02qff_ (const Integer *n, const Integer *m, const double x[], const double y[], const Integer *ntau, const double tau[], double *df, double b[], double bl[], double bu[], Integer info[], Integer *ifail)
The routine may be called by the names g02qff or nagf_correg_quantile_linreg_easy.

3 Description

Given a vector of $n$ observed values, $y=\left\{{y}_{i}:i=1,2,\dots ,n\right\}$, an $n\times p$ design matrix $X$, a column vector, ${x}_{i}$, of length $p$ holding the $i$th row of $X$ and a quantile $\tau \in \left(0,1\right)$, g02qff estimates the $p$-element vector $\beta$ as the solution to
 $\underset{\beta \in {\mathbb{R}}^{p}}{\mathrm{minimize}}\ \sum_{i=1}^{n}{\rho }_{\tau }\left({y}_{i}-{x}_{i}^{\mathrm{T}}\beta \right)$ (1)
where ${\rho }_{\tau }$ is the piecewise linear loss function ${\rho }_{\tau }\left(z\right)=z\left(\tau -I\left(z<0\right)\right)$, and $I\left(z<0\right)$ is an indicator function taking the value $1$ if $z<0$ and $0$ otherwise.
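The loss function and objective (1) can be sketched in a few lines of Python. This is an illustration only, not the algorithm used by g02qff: it treats the intercept-only case (where $x_i^{\mathrm{T}}\beta$ is a single constant), and the function names and data values are made up for the example.

```python
def rho(z, tau):
    """Check loss rho_tau(z) = z * (tau - I(z < 0))."""
    indicator = 1.0 if z < 0 else 0.0
    return z * (tau - indicator)


def objective(beta0, y, tau):
    """Objective (1) for an intercept-only model, where x_i^T beta = beta0."""
    return sum(rho(yi - beta0, tau) for yi in y)


# Hypothetical data, chosen only for illustration.
y = [1.0, 2.0, 3.0, 4.0, 10.0]

# For an intercept-only model a minimizer can always be found among the
# observations, so a search over y is enough for this small sketch.
best_median = min(y, key=lambda c: objective(c, y, 0.5))
best_upper = min(y, key=lambda c: objective(c, y, 0.9))
```

With $\tau = 0.5$ the loss is proportional to the absolute residual, so the minimizer is a sample median; larger values of $\tau$ penalize positive residuals more heavily and pull the fit towards the upper tail.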
g02qff assumes Normal, independent, identically distributed (IID) errors and calculates the asymptotic covariance matrix from
 $\Sigma =\frac{\tau \left(1-\tau \right)}{n}{\left(s\left(\tau \right)\right)}^{2}{\left({X}^{\mathrm{T}}X\right)}^{-1}$
where $s$ is the sparsity function, which is estimated from the residuals, ${r}_{i}={y}_{i}-{x}_{i}^{\mathrm{T}}\hat{\beta }$ (see Koenker (2005)).
Given an estimate of the covariance matrix, $\hat{\Sigma }$, lower, ${\hat{\beta }}_{L}$, and upper, ${\hat{\beta }}_{U}$, limits for a $95\%$ confidence interval are calculated for each of the $p$ parameters, via
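 ${\hat{\beta }}_{{L}_{i}}={\hat{\beta }}_{i}-{t}_{n-p,0.975}\sqrt{{\hat{\Sigma }}_{ii}},\quad {\hat{\beta }}_{{U}_{i}}={\hat{\beta }}_{i}+{t}_{n-p,0.975}\sqrt{{\hat{\Sigma }}_{ii}}$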
where ${t}_{n-p,0.975}$ is the $97.5$th percentile of the Student's $t$ distribution with $n-k$ degrees of freedom, where $k$ is the rank of the cross-product matrix ${X}^{\mathrm{T}}X$ (so that $k=p$ when $X$ has full column rank).
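As a rough sketch of the interval calculation, the following Python takes the parameter estimates, the diagonal of the estimated covariance matrix and the $t$ critical value as inputs, and returns limits of the form estimate minus or plus the critical value times the standard error. The helper name `confidence_limits` and all numerical inputs are hypothetical; the critical value is supplied by the caller because the Python standard library has no Student's $t$ quantile function.

```python
import math


def confidence_limits(beta_hat, sigma_diag, t_crit):
    """Return (lower, upper) lists: beta_i -/+ t_crit * sqrt(Sigma_ii)."""
    lower, upper = [], []
    for b, var in zip(beta_hat, sigma_diag):
        half_width = t_crit * math.sqrt(var)
        lower.append(b - half_width)
        upper.append(b + half_width)
    return lower, upper


# Hypothetical inputs, not taken from any real fit.
beta_hat = [1.0, 0.5]
sigma_diag = [0.04, 0.01]   # diagonal of the estimated covariance matrix
t_crit = 2.228              # approx. 97.5th percentile of t with 10 d.f.

bl, bu = confidence_limits(beta_hat, sigma_diag, t_crit)
```

The intervals are symmetric about each estimate, which is a consequence of the IID assumption and the use of a single $t$ quantile for both limits.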
Further details of the algorithms used by g02qff can be found in the documentation for g02qgf.

4 References

Koenker R (2005) Quantile Regression Econometric Society Monographs, Cambridge University Press, New York

5 Arguments

1: $\mathbf{n}$ Integer Input
On entry: $n$, the number of observations in the dataset.
Constraint: ${\mathbf{n}}\ge 2$.
2: $\mathbf{m}$ Integer Input
On entry: $p$, the number of variates in the model.
Constraint: $1\le {\mathbf{m}}<{\mathbf{n}}$.
3: $\mathbf{x}\left({\mathbf{n}},{\mathbf{m}}\right)$ Real (Kind=nag_wp) array Input
On entry: $X$, the design matrix, with the $\mathit{i}$th value for the $\mathit{j}$th variate supplied in ${\mathbf{x}}\left(\mathit{i},\mathit{j}\right)$, for $\mathit{i}=1,2,\dots ,{\mathbf{n}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{m}}$.
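For callers of the C header interface, x is passed as a flat array, and because Fortran stores two-dimensional arrays in column-major order the element ${\mathbf{x}}(i,j)$ sits at 0-based offset $(j-1)\times {\mathbf{n}}+(i-1)$. The sketch below illustrates that layout in Python; the helper names and the matrix values are hypothetical.

```python
def to_column_major(rows):
    """Flatten a list of n rows (each of length m) into column-major order."""
    n, m = len(rows), len(rows[0])
    return [rows[i][j] for j in range(m) for i in range(n)]


def offset(i, j, n):
    """0-based offset of the 1-based element x(i, j) in the flat array."""
    return (j - 1) * n + (i - 1)


rows = [[1.0, 10.0], [1.0, 20.0], [1.0, 30.0]]   # hypothetical 3 x 2 matrix
flat = to_column_major(rows)
```

The same layout applies to the output arrays b, bl and bu, with m in place of n as the leading dimension.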
4: $\mathbf{y}\left({\mathbf{n}}\right)$ Real (Kind=nag_wp) array Input
On entry: $y$, the observations on the dependent variable.
5: $\mathbf{ntau}$ Integer Input
On entry: the number of quantiles of interest.
Constraint: ${\mathbf{ntau}}\ge 1$.
6: $\mathbf{tau}\left({\mathbf{ntau}}\right)$ Real (Kind=nag_wp) array Input
On entry: the vector of quantiles of interest. A separate model is fitted to each quantile.
Constraint: $\sqrt{\epsilon }<{\mathbf{tau}}\left(\mathit{l}\right)<1-\sqrt{\epsilon }$ where $\epsilon$ is the machine precision returned by x02ajf, for $\mathit{l}=1,2,\dots ,{\mathbf{ntau}}$.
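The constraint on tau can be checked before calling the routine. The following Python is a minimal sketch under one loud assumption: it uses `sys.float_info.epsilon` as a stand-in for the machine precision, which may not equal the value returned by x02ajf (the two definitions can differ by a factor of two). The function names are hypothetical.

```python
import math
import sys


def valid_tau(tau, eps=sys.float_info.epsilon):
    """True when sqrt(eps) < tau < 1 - sqrt(eps).

    eps defaults to Python's float epsilon, used here only as an
    assumed stand-in for the value returned by x02ajf.
    """
    bound = math.sqrt(eps)
    return bound < tau < 1.0 - bound


def first_invalid(taus, eps=sys.float_info.epsilon):
    """Return the 1-based index of the first invalid quantile, or 0 if all pass."""
    for l, tau in enumerate(taus, start=1):
        if not valid_tau(tau, eps):
            return l
    return 0
```

In double precision the bound is of the order of $10^{-8}$, so the constraint only excludes quantiles extremely close to $0$ or $1$.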
7: $\mathbf{df}$ Real (Kind=nag_wp) Output
On exit: the degrees of freedom given by $n-k$, where $n$ is the number of observations and $k$ is the rank of the cross-product matrix ${X}^{\mathrm{T}}X$.
8: $\mathbf{b}\left({\mathbf{m}},{\mathbf{ntau}}\right)$ Real (Kind=nag_wp) array Output
On exit: $\hat{\beta }$, the estimates of the parameters of the regression model, with ${\mathbf{b}}\left(j,l\right)$ containing the coefficient for the variable in column $j$ of x, estimated for $\tau ={\mathbf{tau}}\left(l\right)$.
9: $\mathbf{bl}\left({\mathbf{m}},{\mathbf{ntau}}\right)$ Real (Kind=nag_wp) array Output
On exit: ${\hat{\beta }}_{L}$, the lower limit of a $95\%$ confidence interval for $\hat{\beta }$, with ${\mathbf{bl}}\left(j,l\right)$ holding the lower limit associated with ${\mathbf{b}}\left(j,l\right)$.
10: $\mathbf{bu}\left({\mathbf{m}},{\mathbf{ntau}}\right)$ Real (Kind=nag_wp) array Output
On exit: ${\hat{\beta }}_{U}$, the upper limit of a $95\%$ confidence interval for $\hat{\beta }$, with ${\mathbf{bu}}\left(j,l\right)$ holding the upper limit associated with ${\mathbf{b}}\left(j,l\right)$.
11: $\mathbf{info}\left({\mathbf{ntau}}\right)$ Integer array Output
On exit: ${\mathbf{info}}\left(l\right)$ holds additional information concerning the model fitting and confidence limit calculations when $\tau ={\mathbf{tau}}\left(l\right)$.
Code Warning
$0$ Model fitted and confidence limits calculated successfully.
$1$ The routine did not converge whilst calculating the parameter estimates. The returned values are based on the estimate at the last iteration.
$2$ A singular matrix was encountered during the optimization. The model was not fitted for this value of $\tau$.
$8$ The routine did not converge whilst calculating the confidence limits. The returned limits are based on the estimate at the last iteration.
$16$ Confidence limits for this value of $\tau$ could not be calculated. The returned upper and lower limits are set to a large positive and large negative value respectively.
It is possible for multiple warnings to be applicable to a single model. In these cases the value returned in info is the sum of the corresponding individual nonzero warning codes.
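Because the nonzero warning codes ($1$, $2$, $8$ and $16$) are distinct powers of two, a summed info value can be split back into its components with bitwise tests. The following Python sketch shows one way to do this; the name `decode_info` and the abbreviated descriptions are illustrative and not part of the library.

```python
# Abbreviated descriptions of the warning codes listed above.
INFO_FLAGS = {
    1: "parameter estimates did not converge",
    2: "singular matrix encountered; model not fitted",
    8: "confidence limits did not converge",
    16: "confidence limits could not be calculated",
}


def decode_info(code):
    """Split an info value into the individual warning codes it sums."""
    return [flag for flag in sorted(INFO_FLAGS) if code & flag]
```

For example, an info value of $9$ decodes to the codes $1$ and $8$: neither the parameter estimates nor the confidence limits converged for that quantile.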
12: $\mathbf{ifail}$ Integer Input/Output
On entry: ifail must be set to $0$, $-1$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $-1$ means that an error message is printed while a value of $1$ means that it is not.
If halting is not appropriate, the value $-1$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry ${\mathbf{ifail}}=0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
${\mathbf{ifail}}=11$
On entry, ${\mathbf{n}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{n}}\ge 2$.
${\mathbf{ifail}}=21$
On entry, ${\mathbf{m}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{n}}=⟨\mathit{\text{value}}⟩$.
Constraint: $1\le {\mathbf{m}}<{\mathbf{n}}$.
${\mathbf{ifail}}=51$
On entry, ${\mathbf{ntau}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{ntau}}\ge 1$.
${\mathbf{ifail}}=61$
On entry, ${\mathbf{tau}}\left(⟨\mathit{\text{value}}⟩\right)=⟨\mathit{\text{value}}⟩$.
Constraint: $\sqrt{\epsilon }<{\mathbf{tau}}\left(\mathit{l}\right)<1-\sqrt{\epsilon }$ where $\epsilon$ is the machine precision returned by x02ajf, for $\mathit{l}=1,2,\dots ,{\mathbf{ntau}}$.
${\mathbf{ifail}}=111$
A potential problem occurred whilst fitting the model(s).
Additional information has been returned in info.
${\mathbf{ifail}}=-99$
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.

7 Accuracy

Not applicable.

8 Parallelism and Performance

g02qff is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g02qff makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

Calling g02qff is equivalent to calling g02qgf with
• ${\mathbf{sorder}}=1$,
• ${\mathbf{intcpt}}=\text{'N'}$,
• ${\mathbf{weight}}=\text{'U'}$,
• ${\mathbf{lddat}}={\mathbf{n}}$,
• setting each element of isx to $1$,
• ${\mathbf{ip}}={\mathbf{m}}$,
• ${\mathbf{Interval Method}}=\mathrm{IID}$, and
• ${\mathbf{Significance Level}}=0.95$.

10 Example

A quantile regression model is fitted to Engel's 1857 study of household expenditure on food. The model regresses the dependent variable, household food expenditure, against household income. An intercept is included in the model by augmenting the dataset with a column of ones.
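Since g02qff does not add an intercept itself, the column of ones has to be part of the design matrix supplied in x, and it counts towards m. The following Python sketch shows the augmentation step only; the income values are invented and are not the Engel data, and placing the ones in the first column is an assumption made for the illustration.

```python
def add_intercept(income):
    """Return design-matrix rows [1, income_i], one row per household."""
    return [[1.0, value] for value in income]


income = [420.0, 541.0, 901.0]   # hypothetical values, not the Engel data
x = add_intercept(income)
m = len(x[0])                    # number of variates, including the intercept
```

With this layout the first row of b would hold the intercept and the second the income coefficient, one column per quantile.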

10.1 Program Text

Program Text (g02qffe.f90)

10.2 Program Data

Program Data (g02qffe.d)

10.3 Program Results

Program Results (g02qffe.r)