NAG CL Interface
g02gpc (glm_​predict)


1 Purpose

g02gpc allows prediction from a generalized linear model fit via g02gac, g02gbc, g02gcc or g02gdc or a linear model fit via g02dac.

2 Specification

#include <nag.h>
void  g02gpc (Nag_Distributions errfn, Nag_Link link, Nag_IncludeMean mean, Integer n, const double x[], Integer tdx, Integer m, const Integer sx[], Integer ip, const double binom_t[], const double offset[], const double wt[], double scale, double ex_power, const double b[], const double cov[], Nag_Boolean vfobs, double eta[], double seeta[], double pred[], double sepred[], NagError *fail)
The function may be called by the names: g02gpc, nag_correg_glm_predict or nag_glm_predict.

3 Description

A generalized linear model consists of the following elements:
  (i) A suitable distribution for the dependent variable y.
  (ii) A linear model, with linear predictor η=Xβ, where X is a matrix of independent variables and β a column vector of p parameters.
  (iii) A link function g(.) between the expected value of y and the linear predictor, that is g(μ)=η with μ=E(y), so that μ=g⁻¹(η).
To predict from a generalized linear model, that is, to estimate a value for the dependent variable, y, given a set of independent variables X, the matrix X must be supplied, along with values for the parameters β and their associated variance-covariance matrix, C. Suitable values for β and C are usually estimated by first fitting the prediction model to a training dataset with known responses, using for example g02gac, g02gbc, g02gcc or g02gdc. The predicted value and its standard error can then be obtained from:
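\[ \hat{y} = g^{-1}(\eta), \qquad \mathrm{se}(\hat{y}) = \sqrt{ \left( \left. \frac{\partial g^{-1}(x)}{\partial x} \right|_{\eta} \mathrm{se}(\eta) \right)^{2} + I_{\mathrm{fobs}} \, \mathrm{Var}(y) } \]
where
\[ \eta = o + X\beta, \qquad \mathrm{se}(\eta) = \sqrt{ \mathrm{diag}\left( X C X^{T} \right) }, \]
o is a vector of offsets and I_fobs=0 if the variance of future observations is not taken into account, and 1 otherwise. Here diag(A) indicates the diagonal elements of matrix A.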
If required, the variance for the ith future observation, Var(yi), can be calculated as:
\[ \mathrm{Var}(y_i) = \frac{\phi \, V(\theta)}{w_i} \]
where wi is a weight, ϕ is the scale (or dispersion) parameter, and V(θ) is the variance function. Both the scale parameter and the variance function depend on the distribution used for y, with:
  Poisson    V(θ)=μi, ϕ=1
  binomial   V(θ)=μi(ti-μi)/ti, ϕ=1
  Normal     V(θ)=1
  gamma      V(θ)=μi²
In the cases of a Normal and gamma error structure, the scale parameter, ϕ, is supplied by you. This value is usually obtained from the function used to fit the prediction model. In many cases, for a Normal error structure, ϕ=σ̂², i.e., the estimated variance.
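As a rough illustration of the variance of a future observation, the sketch below evaluates ϕV(θ)/wi for the four distributions listed above. The enum and function names are hypothetical and are not the NAG types; the caller is assumed to pass a positive weight and, for the binomial case, a positive denominator t.

```c
#include <math.h>

/* Hypothetical enum for this sketch; the NAG Library's own
 * Nag_Distributions type is not used here. */
typedef enum { DIST_POISSON, DIST_BINOMIAL, DIST_NORMAL, DIST_GAMMA } dist_kind;

/* Var(y_i) = phi * V / w_i.
 * mu    : predicted mean for the observation
 * t     : binomial denominator (only read for DIST_BINOMIAL)
 * w     : weight, assumed > 0
 * scale : user-supplied phi, only used for Normal and gamma errors */
double future_obs_variance(dist_kind dist, double mu, double t,
                           double w, double scale)
{
    double v, phi;
    switch (dist) {
    case DIST_POISSON:  v = mu;                phi = 1.0;   break;
    case DIST_BINOMIAL: v = mu * (t - mu) / t; phi = 1.0;   break;
    case DIST_NORMAL:   v = 1.0;               phi = scale; break;
    case DIST_GAMMA:    v = mu * mu;           phi = scale; break;
    default:            return NAN;            /* unknown distribution */
    }
    return phi * v / w;
}
```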

4 References

McCullagh P and Nelder J A (1983) Generalized Linear Models Chapman and Hall

5 Arguments

1: errfn Nag_Distributions Input
On entry: indicates the distribution used to model the dependent variable, y.
errfn=Nag_Binomial
The binomial distribution is used.
errfn=Nag_Gamma
The gamma distribution is used.
errfn=Nag_Normal
The Normal (Gaussian) distribution is used.
errfn=Nag_Poisson
The Poisson distribution is used.
Constraint: errfn=Nag_Binomial, Nag_Gamma, Nag_Normal or Nag_Poisson.
2: link Nag_Link Input
On entry: indicates which link function is to be used.
link=Nag_Compl
A complementary log-log link is used.
link=Nag_Expo
An exponent link is used.
link=Nag_Logistic
A logistic link is used.
link=Nag_Iden
An identity link is used.
link=Nag_Log
A log link is used.
link=Nag_Probit
A probit link is used.
link=Nag_Reci
A reciprocal link is used.
link=Nag_Sqrt
A square root link is used.
Details on the functional form of the different links can be found in the G02 Chapter Introduction.
Constraints:
  • if errfn=Nag_Binomial, link=Nag_Compl, Nag_Logistic or Nag_Probit;
  • otherwise link=Nag_Expo, Nag_Iden, Nag_Log, Nag_Reci or Nag_Sqrt.
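The constraint on the distribution and link combination can be checked before a call. The following is a minimal sketch using stand-in enums; `glm_dist`, `glm_link` and `link_is_valid` are hypothetical names for illustration and do not replace the NAG types Nag_Distributions and Nag_Link.

```c
#include <stdbool.h>

/* Stand-in enums for this sketch only. */
typedef enum { DIST_BINOMIAL, DIST_GAMMA, DIST_NORMAL, DIST_POISSON } glm_dist;
typedef enum { LINK_COMPL, LINK_EXPO, LINK_LOGISTIC, LINK_IDEN,
               LINK_LOG, LINK_PROBIT, LINK_RECI, LINK_SQRT } glm_link;

/* Returns true when the pair satisfies the documented constraint:
 * binomial errors take the complementary log-log, logistic or probit link;
 * every other distribution takes one of the remaining five links. */
bool link_is_valid(glm_dist errfn, glm_link link)
{
    bool binomial_link = (link == LINK_COMPL ||
                          link == LINK_LOGISTIC ||
                          link == LINK_PROBIT);
    return (errfn == DIST_BINOMIAL) ? binomial_link : !binomial_link;
}
```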
3: mean Nag_IncludeMean Input
On entry: indicates if a mean term is to be included.
mean=Nag_MeanInclude
A mean term, intercept, will be included in the model.
mean=Nag_MeanZero
The model will pass through the origin, zero-point.
Constraint: mean=Nag_MeanInclude or Nag_MeanZero.
4: n Integer Input
On entry: n, the number of observations.
Constraint: n≥1.
5: x[dim] const double Input
Note: the dimension, dim, of the array x must be at least n×tdx.
On entry: x[(i-1)×tdx+j-1] must contain the ith observation for the jth independent variable, for i=1,2,…,n and j=1,2,…,m.
6: tdx Integer Input
On entry: the stride separating matrix column elements in the array x.
Constraint: tdx≥m.
7: m Integer Input
On entry: m, the total number of independent variables.
Constraint: m≥1.
8: sx[m] const Integer Input
On entry: indicates which independent variables are to be included in the model.
If sx[j-1]>0, the variable contained in the jth column of x is included in the regression model.
Constraints:
  • sx[j]≥0, for j=0,1,…,m-1;
  • if mean=Nag_MeanInclude, exactly ip-1 values of sx must be >0;
  • if mean=Nag_MeanZero, exactly ip values of sx must be >0.
9: ip Integer Input
On entry: the number of independent variables in the model, including the mean or intercept if present.
Constraint: ip>0.
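To make the relationship between x, tdx, sx, mean and ip concrete, the sketch below builds the row of model terms for one observation: an optional leading 1 for the mean, followed by the columns of x whose sx entry is positive. The function name and the use of `int` for sx are assumptions made for this illustration (the library itself uses its own Integer type), and the observation index is 0-based here.

```c
#include <stddef.h>

/* Hypothetical helper: copies the model terms for observation `obs`
 * (0-based) into row_out, which must have room for ip values.
 * Returns ip on success, or -1 when the number of positive sx entries
 * does not match ip (ip-1 of them when a mean term is included). */
int build_design_row(const double *x, size_t tdx, size_t m, const int *sx,
                     int include_mean, size_t obs, size_t ip,
                     double *row_out)
{
    size_t selected = 0;
    for (size_t j = 0; j < m; ++j)
        if (sx[j] > 0)
            ++selected;
    if (selected + (include_mean ? 1u : 0u) != ip)
        return -1;

    size_t k = 0;
    if (include_mean)
        row_out[k++] = 1.0;                 /* intercept term */
    for (size_t j = 0; j < m; ++j)
        if (sx[j] > 0)
            row_out[k++] = x[obs * tdx + j]; /* jth variable, row `obs` */
    return (int)k;
}
```

The order of the values written to row_out matches the order in which the corresponding coefficients are expected in b.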
10: binom_t[dim] const double Input
Note: the dimension, dim, of the array binom_t must be at least
  • n, when errfn=Nag_Binomial.
On entry: if errfn=Nag_Binomial, binom_t[i-1] must contain the binomial denominator, ti, for the ith observation.
Otherwise binom_t is not referenced and may be NULL.
Constraint: if errfn=Nag_Binomial, binom_t[i-1]≥0.0, for i=1,2,…,n.
11: offset[dim] const double Input
Note: the dimension, dim, of the array offset must be at least
  • n, when offset is not NULL.
On entry: if an offset is required, then offset[i-1] must contain the value of the offset oi , for the ith observation. Otherwise offset must be NULL.
12: wt[dim] const double Input
Note: the dimension, dim, of the array wt must be at least
  • n, when wt is not NULL and vfobs=Nag_TRUE.
On entry: if weighted estimates are required then wt[i-1] must contain the weight, ωi, for the ith observation. Otherwise wt must be supplied as NULL.
If wt[i-1]=0.0, the ith observation is not included in the model, in which case the effective number of observations is the number of observations with positive weights.
If wt=NULL, the effective number of observations is n.
If the variance of future observations is not included in the standard error of the predicted variable, wt is not referenced.
Constraint: if wt is not NULL and vfobs=Nag_TRUE, wt[i-1]≥0.0, for i=1,2,…,n.
13: scale double Input
On entry: if errfn=Nag_Normal or Nag_Gamma and vfobs=Nag_TRUE, the scale parameter, ϕ.
Otherwise scale is not referenced and ϕ=1.
Constraint: if errfn=Nag_Normal or Nag_Gamma and vfobs=Nag_TRUE, scale>0.0.
14: ex_power double Input
On entry: if link=Nag_Expo, ex_power must contain the power of the exponential.
If link≠Nag_Expo, ex_power is not referenced.
Constraint: if link=Nag_Expo, ex_power≠0.0.
15: b[ip] const double Input
On entry: the model parameters, β.
If mean=Nag_MeanInclude, b[0] must contain the mean parameter and b[i] the coefficient of the variable contained in the jth column of x, where sx[j-1] is the ith positive value in the array sx.
If mean=Nag_MeanZero, b[i-1] must contain the coefficient of the variable contained in the jth column of x, where sx[j-1] is the ith positive value in the array sx.
16: cov[ip×(ip+1)/2] const double Input
On entry: the upper triangular part of the variance-covariance matrix, C, of the model parameters. This matrix should be supplied packed by column, i.e., the covariance between parameters βi and βj, that is the values stored in b[i-1] and b[j-1], should be supplied in cov[j×(j-1)/2+i-1], for i=1,2,…,ip and j=i,…,ip.
Constraint: the matrix represented in cov must be a valid variance-covariance matrix.
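The packed layout can be produced from a full symmetric matrix with a short loop. The sketch below assumes the full matrix is stored row by row in an ip×ip array; the function name is hypothetical. With 0-based i≤j, the documented position cov[j×(j-1)/2+i-1] becomes cov[j*(j+1)/2+i].

```c
#include <stddef.h>

/* Hypothetical helper: packs the upper triangle of a full ip-by-ip
 * symmetric matrix (row-major) by column into cov, which must hold
 * ip*(ip+1)/2 values. */
void pack_upper_by_column(size_t ip, const double *full, double *cov)
{
    for (size_t j = 0; j < ip; ++j)
        for (size_t i = 0; i <= j; ++i)
            cov[j * (j + 1) / 2 + i] = full[i * ip + j];
}
```

For a 3×3 matrix the packed order is C11, C12, C22, C13, C23, C33.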
17: vfobs Nag_Boolean Input
On entry: if vfobs=Nag_TRUE, the variance of future observations is included in the standard error of the predicted variable (i.e., I_fobs=1), otherwise I_fobs=0.
18: eta[n] double Output
On exit: the linear predictor, η.
19: seeta[n] double Output
On exit: the standard error of the linear predictor, se(η).
20: pred[n] double Output
On exit: the predicted value, y^.
21: sepred[n] double Output
On exit: the standard error of the predicted value, se(y^). If pred[i-1] could not be calculated, g02gpc returns fail.code=NE_INVALID_PRED, and sepred[i-1] is set to -99.0.
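After a call that reports NE_INVALID_PRED, the affected observations can be located by scanning sepred for the sentinel. The sketch below is an assumption-laden illustration: it treats any negative entry as flagged, on the grounds that a genuine standard error cannot be negative, and the function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: counts entries of sepred that carry the -99.0
 * marker. Any negative value is treated as the marker, since a real
 * standard error is never negative. */
size_t count_invalid_predictions(size_t n, const double *sepred)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i)
        if (sepred[i] < 0.0)
            ++bad;
    return bad;
}
```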
22: fail NagError * Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).

6 Error Indicators and Warnings

NE_ALLOC_FAIL
Dynamic memory allocation failed.
See Section 3.1.2 in the Introduction to the NAG Library CL Interface for further information.
NE_BAD_PARAM
On entry, argument value had an illegal value.
On entry, the error type and link function combination supplied is invalid.
NE_CHARACTER
On entry, link=value.
Constraint: if errfn=Nag_Binomial, link=Nag_Compl, Nag_Logistic or Nag_Probit,
otherwise, link=Nag_Expo, Nag_Iden, Nag_Log, Nag_Reci or Nag_Sqrt.
NE_INT
On entry, ip=value.
Constraint: ip>0.
On entry, m=value.
Constraint: m≥1.
On entry, n=value.
Constraint: n≥1.
NE_INT_2
On entry, tdx=value and m=value.
Constraint: tdx≥m.
NE_INT_ARRAY_CONS
On entry, sx[value]<0.
Constraint: sx[j-1]≥0, for j=1,2,…,m.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
NE_INVALID_PRED
At least one predicted value could not be calculated as required. sepred is set to -99.0 for affected predicted values.
NE_NO_LICENCE
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library CL Interface for further information.
NE_REAL
On entry, ex_power=0.0.
Constraint: if link=Nag_Expo, ex_power≠0.0.
On entry, scale=value.
Constraint: scale>0.0.
NE_REAL_ARRAY_CONS
On entry, binom_t[value]=value.
Constraint: binom_t[i-1]≥0.0, for all i.
On entry, cov[value]=value.
Constraint: cov[i]≥0.0 for at least one diagonal element.
On entry, wt[value]=value.
Constraint: wt[i-1]≥0.0, for all i.

7 Accuracy

Not applicable.

8 Parallelism and Performance

g02gpc makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this function. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

When using g02gpc following a call to g02dac you should set errfn=Nag_Normal, link=Nag_Iden, offset=NULL and scale=rss/idf.

10 Example

The model
\[ y = \frac{1}{\beta_1 + \beta_2 x} + \varepsilon \]
is fitted to a training dataset with five observations. The resulting model is then used to predict the response for two new observations.
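For a model of this kind, and assuming it is fitted with a reciprocal link so that the mean is 1/(β1+β2x), the prediction step for one new observation can be sketched as follows. Here g⁻¹(η)=1/η with derivative -1/η², so the standard error of the prediction (without the variance of future observations) is se(η)/η². The function name and all numerical values used with it are made up for illustration and are not the results of the example program.

```c
#include <math.h>

/* Hypothetical helper: prediction under a reciprocal link for a model
 * with an intercept b1 and one slope b2. seeta is the standard error of
 * the linear predictor, supplied by the caller.
 * Returns 0 on success, -1 if eta is zero and 1/eta is undefined. */
int predict_reciprocal(double b1, double b2, double x, double seeta,
                       double *pred, double *sepred)
{
    double eta = b1 + b2 * x;
    if (eta == 0.0)
        return -1;
    *pred = 1.0 / eta;
    *sepred = seeta / (eta * eta);   /* |d(1/eta)/d eta| * se(eta) */
    return 0;
}
```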

10.1 Program Text

Program Text (g02gpce.c)

10.2 Program Data

Program Data (g02gpce.d)

10.3 Program Results

Program Results (g02gpce.r)