Note: please be advised that this function is classed as ‘experimental’ and its interface may be developed further in the future. Please see Section 4 in How to Use the NAG Library for further information.
g22ydc produces labels for the columns of a design matrix, model parameters and a vector of column inclusion flags suitable for use with functions in Chapter G02, thus allowing submodels to be fitted using the same design matrix.
The function may be called by the names: g22ydc or nag_blgm_lm_submodel.
g22ydc is a utility function for use with g22yac, g22ybc and g22ycc. It can be used to construct labels for the columns of a design matrix, $X$, created by g22ycc and to return additional input vectors and flags required by a number of NAG Library model fitting functions.
Many of the analysis functions that require a design matrix to be supplied allow submodels to be defined through the use of a vector of ones or zeros indicating whether a column of the design matrix should be included in or excluded from the analysis (see, for example, sx in g02dac or g02gac). This allows nested models to be fitted without having to reconstruct the design matrix for each analysis.
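To illustrate what a vector of inclusion flags expresses, the following is a minimal sketch in plain C. The helper `select_columns` and the row-major storage layout are assumptions made for illustration only; they are not part of the NAG Library, and the analysis functions apply the flags internally, so no such copy is needed in practice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a NAG function): copy the columns of the
 * row-major n-by-m matrix x whose flag in isx is nonzero into xsub.
 * xsub must have room for n * (number of nonzero flags) doubles.
 * Returns the number of columns kept. */
size_t select_columns(const double *x, size_t n, size_t m,
                      const int *isx, double *xsub)
{
    size_t kept = 0;

    assert(x != NULL && isx != NULL && xsub != NULL);

    /* Count the columns flagged for inclusion. */
    for (size_t j = 0; j < m; j++)
        if (isx[j] != 0)
            kept++;

    /* Copy the flagged columns row by row, preserving their order. */
    for (size_t i = 0; i < n; i++) {
        size_t k = 0;
        for (size_t j = 0; j < m; j++)
            if (isx[j] != 0)
                xsub[i * kept + k++] = x[i * m + j];
    }
    return kept;
}
```

For a 2 by 3 matrix with flags {1, 0, 1}, the helper keeps the first and third columns and returns 2.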
Let $M$ denote a model constructed by g22yac, $D$ a data matrix as described by g22ybc and $X$ the corresponding design matrix constructed by g22ycc from $M$ and $D$. A different model, $M_s$, is a submodel of $M$ if each term in $M_s$, including the mean effect (intercept term), is also present in $M$.
If $M_s$ is a submodel of $M$, you can fit $M_s$ to $D$ using a design matrix whose columns are a subset of the columns of $X$.
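The submodel condition described above (every term of the smaller model, including the mean effect, must also appear in the larger model) can be sketched as a simple containment test. This is an illustration only: the function `is_submodel`, the representation of terms as strings, and the use of "1" for the mean effect and "A.B" for an interaction are all assumptions, and the sketch compares terms as exact strings rather than parsing a model formula.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a NAG function): return 1 if every term in
 * sub[0..nsub-1] also appears in full[0..nfull-1], otherwise 0.
 * Terms are compared as exact strings, so "A.B" and "B.A" are treated
 * as different terms in this sketch. */
int is_submodel(const char *const *sub, size_t nsub,
                const char *const *full, size_t nfull)
{
    assert(sub != NULL && full != NULL);

    for (size_t i = 0; i < nsub; i++) {
        int found = 0;
        /* Search the full model for the i-th submodel term. */
        for (size_t j = 0; j < nfull && !found; j++)
            found = (strcmp(sub[i], full[j]) == 0);
        if (!found)
            return 0; /* a term is missing from the full model */
    }
    return 1;
}
```

With the full model terms {"1", "A", "B", "A.B"}, the terms {"1", "A"} form a submodel, whereas {"1", "C"} do not.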
1: what – Nag_DesignMatrixReturn Input
On entry: controls what labels are to be produced:
Labels for a submodel are required. The submodel must be supplied in hform.
Labels for the design matrix $X$.
If hxdesc was returned by g02jfc in hlmm then $X$ is the design matrix associated with the fixed parameters.
Labels for the design matrix $Z$.
If hxdesc was returned by g02jfc in hlmm then $Z$ is the part of the design matrix associated with the random parameters.
On exit: indicates whether, in order to fit the model using the design matrix, an analysis function should include an implicit mean effect (intercept term).
No implicit mean effect should be included if the model does not include a mean effect or the mean effect has been explicitly included in the design matrix.
5: ip – Integer * Output
On exit: the number of parameters in the (sub)model, including the intercept if one is present. If labels for a submodel were requested, the submodel is the one specified in hform; otherwise the model is the one used when defining the design matrix described in hxdesc.
Let denote the number of terms in , denote the number of variables in the th term and denote the number of columns of corresponding to the th term. The required size of vinfo, denoted is given by:
If the model includes a mean effect, should be incremented by one.
These quantities are not trivial to calculate, as they require the formula describing the model to be fully expanded and the contrast / dummy variable encoding to be known. Therefore, if lisx, lplab or lvinfo are too small and vinfo has room for at least three elements, NW_ARRAY_SIZE is returned and the required sizes for these arrays are returned in the first three elements of vinfo respectively.
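The size query described above suggests a two-call pattern: call once with a three-element work array to obtain the required sizes, then allocate and call again. The sketch below shows only that pattern. `mock_labels` is a hypothetical stand-in, not the g22ydc interface, and the sizes 4, 4 and 17 are made-up values; the rule that a zero length is never "too small" follows the error description for nonzero but too small lengths.

```c
#include <assert.h>
#include <stdlib.h>

enum { SIZES_OK = 0, SIZES_TOO_SMALL = 1 };

/* A length is too small only if it is nonzero and below the requirement. */
static int too_small(int have, int need)
{
    return have != 0 && have < need;
}

/* Hypothetical stand-in for the library call; NOT the g22ydc interface.
 * Pretends the model needs 4 flags, 4 labels and 17 info elements. */
static int mock_labels(int lisx, int lplab, int lvinfo, int *vinfo)
{
    const int need[3] = {4, 4, 17};

    if (too_small(lisx, need[0]) || too_small(lplab, need[1]) ||
        too_small(lvinfo, need[2])) {
        if (lvinfo >= 3) { /* room to report the required sizes */
            vinfo[0] = need[0];
            vinfo[1] = need[1];
            vinfo[2] = need[2];
        }
        return SIZES_TOO_SMALL;
    }
    return SIZES_OK;
}

/* Query the sizes with a 3-element work array, then allocate and call
 * again. Returns the allocated info array (caller frees) or NULL. */
int *query_then_allocate(int *lisx, int *lplab, int *lvinfo)
{
    int sizes[3] = {0, 0, 0};
    int *vinfo;

    assert(lisx != NULL && lplab != NULL && lvinfo != NULL);

    /* First call: only the work array is supplied. */
    if (mock_labels(0, 0, 3, sizes) != SIZES_TOO_SMALL)
        return NULL;
    *lisx = sizes[0];
    *lplab = sizes[1];
    *lvinfo = sizes[2];

    /* Second call: arrays of the reported sizes. */
    vinfo = malloc((size_t)*lvinfo * sizeof *vinfo);
    if (vinfo == NULL)
        return NULL;
    if (mock_labels(*lisx, *lplab, *lvinfo, vinfo) != SIZES_OK) {
        free(vinfo);
        return NULL;
    }
    return vinfo;
}
```

In real code the first call would be to g22ydc itself, with the warning exit checked through its error argument before the arrays are allocated.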
12: vinfo – Integer Output
On exit: if , information encoding a description of the parameters in the model.
The encoding information can be extracted as follows:
(ii)Iterate from to .
3.Iterate from to .
(d)Increment by .
4.The th model parameter corresponds to the interaction between the variables held in columns of . Therefore, indicates a main effect, a two-way interaction, etc.
If , the th model parameter corresponds to the mean effect.
If , the corresponding variable is binary, ordinal or continuous. Otherwise, is the level for the corresponding variable for model parameter .
A numeric flag indicates the contrast used in the case of a categorical variable, with one value indicating that dummy variables were used for the variable in this term. The remaining six types of contrast, namely treatment contrasts (with respect to the first and last levels), sum contrasts (with respect to the first and last levels), Helmert contrasts and polynomial contrasts, as described in g22ycc, are identified by the integers one to six respectively.
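When building more verbose parameter labels, the contrast flag can be turned into a readable name with a small lookup table. The sketch below covers only the integers one to six, in the order listed above; the function `contrast_name` and the wording of the names are assumptions for illustration, and the value used for dummy variables is deliberately not handled because it is not stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a NAG function): map a contrast flag in the
 * range 1..6 to a descriptive name. Returns NULL for any other value,
 * including whichever value denotes dummy variables. */
const char *contrast_name(int flag)
{
    static const char *const names[] = {
        "treatment contrasts (first level)",
        "treatment contrasts (last level)",
        "sum contrasts (first level)",
        "sum contrasts (last level)",
        "Helmert contrasts",
        "polynomial contrasts"
    };

    /* The table must hold exactly the six contrast types. */
    assert(sizeof names / sizeof names[0] == 6);

    if (flag < 1 || flag > 6)
        return NULL;
    return names[flag - 1];
}
```

For example, `contrast_name(5)` yields the Helmert entry, and any flag outside one to six yields NULL.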
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
Supplied value of what is not valid for the G22 handle supplied in hxdesc.
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library CL Interface for further information.
The model and the design matrix are not consistent. Term: . This is likely due to the design matrix being constructed in the presence of either a mean effect or main effect that is not present in the model.
The model and the design matrix are not consistent. The design matrix was constructed in the presence of a mean effect and the model does not include a mean effect.
The model and the design matrix are not consistent. The model includes a term not present in the design matrix. Term: .
On entry, one or more of lisx, lplab or lvinfo are nonzero, but too small. Minimum values are zero, or , and respectively. The minimum values are returned in the first three elements of vinfo.
The model and the design matrix are not consistent. The model specifies different contrasts to those used when the design matrix was constructed. The contrasts specified in hform will be ignored.
hxdesc has not passed through the model fitting function. The information returned by this function may not be consistent with results returned from the model fitting function if the data has been updated after the creation of hxdesc.
The model may not be as expected.
This is due to the model not containing the categorical variable adjusted to account for no mean effect when the design matrix was constructed.
Check the value of ip is as expected. If it is not then you will need to call g22ycc to reconstruct the design matrix for the model of interest.
On entry, plab is too short to hold the parameter labels. Long labels will be truncated. The longest parameter label is .
8 Parallelism and Performance
g22ydc is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this function. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
This example performs a linear regression using g02dac. The linear regression model is defined via a text string which is parsed using g22yac and the design matrix associated with the model is generated using g22ycc. A submodel is then fit using the same design matrix.
Default parameter labels, as returned in plab, are used for both models. An example of using the information returned in vinfo to construct more verbose parameter labels is given in g22ybc.