# Source code for naginterfaces.library.blgm

# -*- coding: utf-8 -*-
r"""
Module Summary
--------------
Interfaces for the NAG Mark 29.0 blgm Chapter.

blgm - Linear Model Specification

The functions in this module provide a mechanism for specifying a linear model using a text-based modelling language and are intended to be used in conjunction with the model fitting functions from other modules, for example submodule :mod:~naginterfaces.library.correg.

See Also
--------
naginterfaces.library.examples.blgm :
This subpackage contains examples for the blgm module.
See also the :ref:library_blgm_ex subsection.

Functionality Index
-------------------

**Linear model**

construct design matrix: :meth:lm_design_matrix

data description: :meth:lm_describe_data

nested model: :meth:lm_submodel

specification from formula string: :meth:lm_formula

**Service functions**

destroy a G22 handle: :meth:handle_free

general option getting function: :meth:optget

general option setting function: :meth:optset

For full information please refer to the NAG Library document

https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22intro.html
"""

def lm_formula(hform, formula):
r"""
lm_formula parses a text string containing a formula specifying a linear model and outputs a G22 handle to an internal data structure.
This G22 handle can then be passed to various functions in submodule blgm.
In particular, the G22 handle can be passed to :meth:lm_design_matrix to produce a design matrix or :meth:lm_submodel to produce a vector of column inclusion flags suitable for use with functions in submodule :mod:~naginterfaces.library.correg.

Note: this function uses optional algorithmic parameters, see also: :meth:optset, :meth:optget.

.. _g22ya-py2-py-doc:

For full information please refer to the NAG Library document for g22ya

https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22yaf.html

.. _g22ya-py2-py-parameters:

**Parameters**
**hform** : Handle, modified in place
On entry: must be set to a null Handle; alternatively, an existing G22 handle may be supplied, in which case this function will destroy the supplied G22 handle as if :meth:handle_free had been called.

On exit: holds a G22 handle to the internal data structure containing a description of the model :math:\mathcal{M} as specified in :math:\mathrm{formula}. You **must not** change the G22 handle other than through functions in submodule blgm.

**formula** : str
A string containing the formula specifying :math:\mathcal{M}. See :ref:Notes <g22ya-py2-py-notes> for details on the allowed model syntax.

.. _g22ya-py2-py-other_params:

**Other Parameters**
**'Contrast'** : str
Default :math:\text{} = \texttt{'FIRST'}

This argument controls the default contrasts used for the categorical independent variables appearing in the model.
Six types of contrast, plus dummy variables, are available:

'FIRST'
Treatment contrasts relative to the first level of the variable will be used.

'LAST'
Treatment contrasts relative to the last level of the variable will be used.

'SUM FIRST'
Sum contrasts relative to the first level of the variable will be used.

'SUM LAST'
Sum contrasts relative to the last level of the variable will be used.

'HELMERT'
Helmert contrasts will be used.

'POLYNOMIAL'
Polynomial contrasts will be used.

'DUMMY'
Dummy variables will be used rather than a contrast.

See :meth:lm_design_matrix for more information on contrasts, their effect on the design matrix and how they are constructed.

This argument may have an instance identifier associated with it (see :meth:optset and :meth:optget).
The instance identifier must be the name of one of the variables appearing in the model supplied in :math:\mathrm{formula} when the G22 handle was created.
For example, CONTRAST : VAR1 = HELMERT would set Helmert contrasts for the variable named VAR1.

If no instance identifier is specified, the default contrast for all categorical variables in the model is changed, otherwise only the default contrast for the named variable is changed.

In some situations it might be necessary for a variable to use a different contrast, depending on where it appears in the model formula.
In order to allow contrasts to be specified on a term by term basis the :math:@ operator can be used in the model formula.
The syntax for this operator is :math:V_j@c, where :math:c is one of: F, L, SF, SL, H, P or D, corresponding to treatment contrasts relative to the first and last levels, sum contrasts relative to the first and last levels, Helmert contrasts, polynomial contrasts or dummy variables respectively.

If the contrast has not been explicitly specified via the :math:@ operator, the value obtained from the option 'Contrast' is used.

For example, setting :math:\mathrm{formula} to VAR1 + VAR1@H.VAR2@P + VAR2@H.VAR3 specifies that the variable named VAR1 should use the default contrasts in the first term and Helmert contrasts in the second term.
The variable named VAR2 should use polynomial contrasts in the second term and Helmert contrasts in the third term.
The variable named VAR3 should use the default contrasts in the third term.
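
As a rough illustration of how the @ specifiers attach to individual variables within each term, the following sketch splits a flat formula (one using only the + and . operators) into terms and records the contrast code for each variable. It is plain Python and does not call the NAG Library; the helper name split_contrasts is hypothetical, and a missing specifier is reported as None, meaning that the value of the option 'Contrast' would apply.

```python
# Illustrative only: a hypothetical helper, not part of naginterfaces.
# Handles flat formulas built from '+' and '.' with optional '@' codes.
VALID_CODES = {'F', 'L', 'SF', 'SL', 'H', 'P', 'D'}

def split_contrasts(formula):
    terms = []
    for raw_term in formula.replace(' ', '').split('+'):
        term = []
        for factor in raw_term.split('.'):
            name, _, code = factor.partition('@')
            if code and code not in VALID_CODES:
                raise ValueError('invalid contrast specifier: ' + code)
            # None means: fall back to the 'Contrast' option.
            term.append((name, code or None))
        terms.append(term)
    return terms

terms = split_contrasts('VAR1 + VAR1@H.VAR2@P + VAR2@H.VAR3')
for term in terms:
    print(term)
```

Each printed list corresponds to one term of the formula, in the order written.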

**'Explicit Mean'** : str
Default :math:\text{} = \texttt{'NO'}

If :math:\text{'Explicit Mean'} = \texttt{'YES'}, any mean effect included in the model will be explicitly added to the design matrix, :math:X, as a column of :math:1\ s.

If :math:\text{'Explicit Mean'} = \texttt{'NO'}, it is assumed that the function to which :math:X will be passed treats the mean effect as a special case, see :math:\textit{mean} in :meth:correg.linregm_fit <naginterfaces.library.correg.linregm_fit> for example.

**'Formula'** : str
This argument returns a verbose version of the model formula specified in :math:\mathrm{formula}, expanded and simplified to only contain variable names, the operators :math:+ and :math:. and any contrast identifiers present.

**'Storage Order'** : str
Default :math:\text{} = \texttt{'OBSVAR'}

This option controls how the design matrix, :math:X, should be stored in its output array and only has an effect if the design matrix is being constructed using :meth:lm_design_matrix.

If :math:\text{'Storage Order'} = \texttt{'OBSVAR'}, :math:X_{{ij}}, the value for the :math:j\ th variable of the :math:i\ th observation of the design matrix is stored in :math:{\textit{x}}[i-1,j-1].

If :math:\text{'Storage Order'} = \texttt{'VAROBS'}, :math:X_{{ij}}, the value for the :math:j\ th variable of the :math:i\ th observation of the design matrix is stored in :math:{\textit{x}}[j-1,i-1].

Where :math:\textit{x} is the output argument of the same name in :meth:lm_design_matrix.
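
The two storage orders differ only in which index comes first. The sketch below shows this with nested Python lists standing in for the two-dimensional array x; it does not call the NAG Library, and the helper name store is hypothetical.

```python
# Illustrative only: plain Python lists stand in for the array x.
# X has n = 3 observations (rows) and 2 variables (columns).
X = [[1.0, 10.0],
     [2.0, 20.0],
     [3.0, 30.0]]

def store(X, order):
    # Lay out X as it would be indexed under each storage order.
    if order == 'OBSVAR':
        return [row[:] for row in X]            # x[i-1][j-1] holds X_ij
    if order == 'VAROBS':
        return [list(col) for col in zip(*X)]   # x[j-1][i-1] holds X_ij
    raise ValueError('unknown storage order: ' + order)

x_obsvar = store(X, 'OBSVAR')
x_varobs = store(X, 'VAROBS')
i, j = 3, 2  # third observation, second variable (1-based)
print(x_obsvar[i - 1][j - 1], x_varobs[j - 1][i - 1])
```

Both lookups address the same element of the design matrix.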

**'Subject'** : str
This argument gives the subject terms associated with the :math:\mathrm{formula} in a linear mixed effects model.

The supplied value must consist of a single term, representing either a single independent variable, or a single interaction term between two or more independent variables.
No variable in the subject term may also appear in the model formula.

.. _g22ya-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:11)
On entry, :math:\mathrm{hform} is not a null Handle or a recognised G22 handle.

(errno :math:21)
The formula contained a mismatched parenthesis.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:22)
An operator was missing.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:23)
Invalid use of an operator.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:24)
Invalid specification for the power operator.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:25)
Invalid specification for the colon operator.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:26)
Invalid specification for the mean.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:27)
Invalid variable name.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:28)
Missing variable name.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:29)
After processing, the model contains no terms.

(errno :math:30)
An invalid contrast specifier has been supplied.

The position in the formula string of the error is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:41)
On entry, an invalid :math:\textit{option} was supplied in :math:\mathrm{formula}.

(errno :math:42)
On entry, an :math:\textit{option} was supplied in :math:\mathrm{formula}, but the expected delimiter ':math:=' was not found.

(errno :math:43)
On entry, an :math:\textit{option} was supplied in :math:\mathrm{formula}, but the supplied :math:\textit{optval} was invalid.

**Warns**
**NagAlgorithmicWarning**
(errno :math:31)
A term contained a repeated variable with a different contrast specifier.

.. _g22ya-py2-py-notes:

**Notes**

**Background**

Let :math:D denote a data matrix with :math:n observations on :math:m_d independent variables, denoted :math:V_1,V_2,\ldots,V_{m_d}.
Let :math:y denote a vector of :math:n observations on a dependent variable.

A linear model, :math:\mathcal{M}, as the term is used in this function, expresses a relationship between the independent variables, :math:V_j, and the dependent variable.
This relationship can be expressed as a series of additive terms :math:T_1+T_2 + \cdots, with each term, :math:T_t, representing either a single independent variable :math:V_j, called the main effect of :math:V_j, or the interaction between two or more independent variables.
An interaction term, denoted here using the :math:. operator, allows the effect of an independent variable on the dependent variable to depend on the value of one or more other independent variables.
As an example, the three-way interaction between :math:V_1,V_2 and :math:V_3 is denoted :math:V_1.V_2.V_3 and describes a situation where the effect of one of these three variables is influenced by the value of the other two.

This function takes a description of :math:\mathcal{M}, supplied as a text string containing a formula, and outputs a G22 handle to an internal data structure.
This G22 handle can then be passed to :meth:lm_design_matrix to produce a design matrix for use in analysis functions from other modules, for example the regression functions of submodule :mod:~naginterfaces.library.correg.

A more detailed description of what is meant by a G22 handle can be found in the G22 Introduction <https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22intro.html#backsec_handles>__.

**Syntax**

In its most verbose form :math:\mathcal{M} can be described by one or more variable names, :math:V_j, and the two operators, :math:+ and :math:..
In order to allow a wide variety of models to be specified compactly this syntax is extended to six operators (:math:+, :math:., :math:*, :math:-, :math::, :math:\hat{}) and parentheses.

A formula describing the model is supplied to lm_formula via a character string which must obey the following rules:

(1) Variables can be denoted by arbitrary names, as long as

    (i) The names used are a subset of those supplied to :meth:lm_describe_data when describing :math:D.

    (#) The names do not contain any of the characters in :math:+.*-:\hat{} ()@.

(#) The :math:. operator denotes an interaction between two or more variables or terms, with :math:V_1.V_2.V_3 denoting the three-way interaction between :math:V_1, :math:V_2 and :math:V_3.

(#) A term in :math:\mathcal{M} can contain one or more variable names, separated using the :math:. operator, i.e., a term can be either a main effect or an interaction term between two or more variables.

    (i) If a variable appears in an interaction term more than once, all subsequent appearances, after the first, are ignored, therefore, :math:V_1.V_2.V_1 is the same as :math:V_1.V_2.

    (#) The ordering of the variables in an interaction term is ignored when comparing terms, therefore, :math:V_1.V_2 is the same as :math:V_2.V_1. This ordering may have an effect when the resulting G22 handle is passed to another function, for example :meth:lm_design_matrix.

    (#) Applying the :math:. operator to two terms appends one to the other, for example, if :math:T_1 = V_1.V_2 and :math:T_2 = V_3.V_4, :math:T_1.T_2 = V_1.V_2.V_3.V_4.

(#) The :math:+ operator allows additional terms to be included in :math:\mathcal{M}, therefore, :math:T_1+T_2 is a model that includes terms :math:T_1 and :math:T_2.

    (i) If a term is added to :math:\mathcal{M} more than once, all subsequent appearances, after the first, are ignored, therefore, :math:T_1+T_2+T_1 is the same as :math:T_1+T_2.

    (#) The ordering of the terms is ignored whilst parsing the formula, therefore, :math:T_1+T_2 is the same as :math:T_2+T_1. This ordering may have an effect when the resulting G22 handle is passed to another function, for example :meth:lm_design_matrix.

    (#) Internally, the terms are reordered so that all main effects come first, followed by two-way interactions, then three-way interactions, etc. The ordering within each of these categories is preserved.

(#) The :math:* operator can be used as a shorthand notation denoting the main effects and all interactions between the variables involved. Therefore, :math:T_1*T_2 is equivalent to :math:T_1+T_2+T_1.T_2 and :math:T_1*T_2*T_3 is equivalent to :math:T_1+T_2+T_3+T_1.T_2+T_1.T_3+T_2.T_3+T_1.T_2.T_3.

(#) The :math:- operator removes a term from :math:\mathcal{M}, therefore, :math:T_1*T_2*T_3-T_1.T_2.T_3 is equivalent to :math:T_1+T_2+T_3+T_1.T_2+T_1.T_3+T_2.T_3 as the three-way interaction, :math:T_1.T_2.T_3, usually present due to :math:T_1*T_2*T_3, has been removed.

(#) The :math:: operator is a shorthand way of specifying a series of variables, with :math:V_1:V_j being equivalent to :math:V_1+V_2 + \cdots +V_j.

    (i) This operator can only be used if the variable names end in a numeric, therefore, :math:\text{VAR2}:\text{VAR4} would be valid, but :math:\text{FVAR}:\text{LVAR} would not.

    (#) The root part of both variable names (i.e., the part before the trailing numeric, so :math:\text{VAR} in the valid example above) must be the same.

    (#) The trailing numeric parts of the two variable names must be in ascending order.

(#) The :math:\hat{} operator is a shorthand notation for a series of :math:* operators. :math:\left(T_1+T_2+T_3\right)\hat{} 2 is equivalent to :math:\left(T_1+T_2+T_3\right)*\left(T_1+T_2+T_3\right) which in turn is equivalent to :math:T_1+T_2+T_3+T_1.T_2+T_1.T_3+T_2.T_3.

    (i) This notation is present primarily for use with the :math:: operator in examples of the form, :math:\left(V_1:V_5\right)\hat{} 3 which specifies a model containing the main effects for variables :math:V_1 to :math:V_5 as well as all two- and three-way interactions.

    (#) Using the :math:\hat{} operator on a single term has no effect, therefore, :math:T_2\hat{} 2 is the same as :math:T_2.
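
The operator rules above can be mimicked with a few list operations. The following is a minimal sketch in plain Python, not the parser used by the NAG Library: all helper names (make_term, plus, dot, star, power, colon, show) are hypothetical, a term is represented as a tuple of variable names, and a model as a list of terms.

```python
# Illustrative only: hypothetical helpers, not the NAG formula parser.
# A term is a tuple of variable names; a model is a list of terms.
import re

def make_term(names):
    # Repeated variables after the first appearance are dropped.
    seen = []
    for name in names:
        if name not in seen:
            seen.append(name)
    return tuple(seen)

def plus(a, b):
    # '+': add terms not already present; variable order is ignored.
    out = list(a)
    for term in b:
        if all(set(term) != set(t) for t in out):
            out.append(term)
    return out

def dot(a, b):
    # '.': append every term of b to every term of a.
    out = []
    for ta in a:
        for tb in b:
            out = plus(out, [make_term(ta + tb)])
    return out

def star(a, b):
    # '*': a + b + a.b
    return plus(plus(a, b), dot(a, b))

def power(a, k):
    # '^': a * a * ... * a (k factors).
    out = a
    for _ in range(k - 1):
        out = star(out, a)
    return out

def colon(first, last):
    # ':': VAR2:VAR4 -> VAR2 + VAR3 + VAR4.
    m1 = re.fullmatch(r'(.*?)(\d+)', first)
    m2 = re.fullmatch(r'(.*?)(\d+)', last)
    if not m1 or not m2 or m1.group(1) != m2.group(1):
        raise ValueError('names must share a root and end in a number')
    lo, hi = int(m1.group(2)), int(m2.group(2))
    if lo > hi:
        raise ValueError('trailing numbers must be in ascending order')
    return [(m1.group(1) + str(k),) for k in range(lo, hi + 1)]

def show(model):
    # Main effects first, then two-way interactions, and so on (stable).
    return '+'.join('.'.join(t) for t in sorted(model, key=len))

t1, t2, t3 = [('T1',)], [('T2',)], [('T3',)]
print(show(star(star(t1, t2), t3)))
print(show(power(plus(plus(t1, t2), t3), 2)))
print(show(colon('VAR2', 'VAR4')))
```

The first two printed strings reproduce the expansions quoted above for :math:T_1*T_2*T_3 and :math:\left(T_1+T_2+T_3\right)\hat{} 2.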

Precedence

Each operator has an associated default precedence, but this can be overridden through the use of parentheses.
The default precedence is:

(1) The :math:: operator, with the resulting expression treated as if it were surrounded by parentheses. Therefore, :math:V_1+V_3:V_6*V_7 is equivalent to :math:V_1+\left(V_3+V_4+V_5+V_6\right)*V_7.

(#) The :math:\hat{} operator, with the resulting expression treated as if it were surrounded by parentheses. Therefore, :math:\left(T_1+T_2+T_3\right)\hat{} 2.T_4 is equivalent to :math:\left(\left(T_1+T_2+T_3\right)\hat{} 2\right).T_4, which is equivalent to :math:T_1.T_4+T_2.T_4+T_3.T_4+T_1.T_2.T_4+T_1.T_3.T_4+T_2.T_3.T_4.

(#) The :math:. operator, so :math:T_1*T_2.T_3 is equivalent to :math:T_1*\left(T_2.T_3\right).

(#) The :math:* operator.

    (i) When using parentheses with the :math:* or :math:. operators the usual rules of multiplication apply, therefore, :math:\left(T_1+T_3.T_4\right).\left(T_5+T_7\right) is equivalent to :math:T_1.T_5+T_1.T_7+T_3.T_4.T_5+T_3.T_4.T_7 and :math:\left(T_1+T_3.T_4\right)*\left(T_5+T_7\right) is equivalent to :math:T_1+T_5+T_7+T_3.T_4+T_1.T_5+T_1.T_7+T_3.T_4.T_5+T_3.T_4.T_7.

    (#) Syntax of the following form is invalid: :math:T_1o\left(T_2\right)oT_3, where :math:o indicates an operator, unless one or more of those operators are :math:+ and/or :math:-. Therefore, :math:T_1.\left(T_2+T_3\right)*T_4 is invalid, whilst :math:T_1.\left(T_2+T_3\right)+T_4 is valid.

(#) The :math:+ and :math:- operators have equal precedence.

    (i) If the terms associated with a :math:- operator do not occur in the current expression they are ignored, therefore, :math:T_1+\left(T_2-T_1\right) is equivalent to :math:T_1+T_2; the :math:\left(T_2-T_1\right) part of the expression is calculated first and results in :math:T_2 as the :math:T_1 term does not exist in this particular sub-expression so cannot be removed.
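
The effect of parentheses on the :math:- and :math:. operators can be checked with a small model of the rules. This is only a sketch under simplifying assumptions: terms are frozensets of names, models are lists of distinct terms, and the helper names (term, plus, minus, dot, show) are hypothetical rather than anything provided by the NAG Library.

```python
# Illustrative only: hypothetical helpers, not the NAG formula parser.
# A term is a frozenset of names; a model is a list of distinct terms.

def term(*names):
    return frozenset(names)

def plus(a, b):
    return a + [t for t in b if t not in a]

def minus(a, b):
    # Terms of b that are absent from a are simply ignored.
    return [t for t in a if t not in b]

def dot(a, b):
    # Distribute '.' over the terms of both operands.
    out = []
    for ta in a:
        for tb in b:
            out = plus(out, [ta | tb])
    return out

def show(model):
    return '+'.join('.'.join(sorted(t)) for t in model)

T1, T2 = [term('T1')], [term('T2')]
# T1 + (T2 - T1): the parenthesised part is evaluated first and is just T2.
nested = plus(T1, minus(T2, T1))
# (T1 + T2) - T1: here T1 is present when '-' is applied, so it is removed.
flat = minus(plus(T1, T2), T1)
# (T1 + T3.T4).(T5 + T7)
product = dot([term('T1'), term('T3', 'T4')], [term('T5'), term('T7')])
print(show(nested), show(flat), show(product))
```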

Mean Effect / Intercept Term

A mean effect (or intercept term) can be explicitly added to a formula by specifying :math:1 and can be explicitly excluded from the formula by specifying :math:-1.
For example, :math:1+V_1+V_2 indicates a model with the main effects of two variables and a mean effect, whereas :math:V_1+V_2-1 denotes the same model, but without the mean effect.
The mean indicator can appear anywhere in the formula string as long as it is not contained within parentheses.

If the mean effect is not explicitly mentioned in the model formula, the model is assumed to include a mean effect.
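
The default and the two explicit forms can be summarised in a few lines. The sketch below handles only flat formulas without parentheses, and the helper name split_mean is hypothetical; it does not call the NAG Library.

```python
# Illustrative only: a hypothetical helper for flat formulas without
# parentheses; it does not call the NAG Library.
import re

def split_mean(formula):
    # Tokenise into signed pieces, e.g. 'V1+V2-1' -> V1, +V2, -1.
    pieces = re.findall(r'([+-]?)([^+-]+)', formula.replace(' ', ''))
    include_mean = True  # default when the mean is not mentioned
    terms = []
    for sign, name in pieces:
        if name == '1':
            include_mean = (sign != '-')
        elif sign == '-':
            terms = [t for t in terms if t != name]
        else:
            terms.append(name)
    return include_mean, terms

for formula in ('1+V1+V2', 'V1+V2-1', 'V1+V2'):
    print(formula, split_mean(formula))
```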

**Optional Parameters**

lm_formula accepts a number of optional parameters described in :ref:Other Parameters <g22ya-py2-py-other_params>.
Usually these parameters are set via a call to :meth:optset; however, when specifying a subject term in a mixed effects linear regression model it is often more convenient to supply the information along with the rest of the formula.
Therefore, writeable optional parameters can be set via the :math:\mathrm{formula} argument.
The delimiter :math:/ must be used between the main formula and the optional parameter.
For example, supplying a formula of the form :math:V_1+V_2/\text{SUBJECT} = V_3.V_4, would specify a model formula of :math:V_1+V_2 and set the optional parameter 'Subject' to :math:V_3.V_4.
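
To make the role of the two delimiters concrete, the following sketch separates a formula string into its model part and a single trailing option. It is plain Python, the helper name split_option is hypothetical, and it assumes at most one optional parameter follows the :math:/ delimiter.

```python
# Illustrative only: a hypothetical helper, not part of naginterfaces.
# Assumes at most one 'option = value' pair follows the '/' delimiter.

def split_option(formula):
    model, slash, option = formula.partition('/')
    if not slash:
        return model.strip(), None
    name, equals, value = option.partition('=')
    if not equals:
        raise ValueError("expected delimiter '=' was not found")
    return model.strip(), (name.strip(), value.strip())

model, option = split_option('V1+V2/SUBJECT = V3.V4')
print(model, option)
```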

See Also
--------
:meth:naginterfaces.library.examples.blgm.lm_formula_ex.main
:meth:naginterfaces.library.examples.correg.glm_binomial_ex.main
:meth:naginterfaces.library.examples.correg.lmm_init_combine_ex.main
"""
raise NotImplementedError

def lm_describe_data(hddesc, nobs, levels, vnames=None):
r"""
lm_describe_data describes a data matrix.

Note: this function uses optional algorithmic parameters, see also: :meth:optset, :meth:optget.

.. _g22yb-py2-py-doc:

For full information please refer to the NAG Library document for g22yb

https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22ybf.html

.. _g22yb-py2-py-parameters:

**Parameters**
**hddesc** : Handle, modified in place
On entry: must be set to a null Handle; alternatively, an existing G22 handle may be supplied, in which case this function will destroy the supplied G22 handle as if :meth:handle_free had been called.

On exit: holds a G22 handle to the internal data structure containing a description of the data matrix, :math:D. You **must not** change the G22 handle other than through the functions in submodule blgm.

**nobs** : int
:math:n, the number of observations in the data matrix, :math:D.

**levels** : int, array-like, shape :math:\left(\textit{nvar}\right)
:math:\mathrm{levels}[\textit{j}-1] contains the number of levels associated with the :math:\textit{j}\ th variable of the data matrix, for :math:\textit{j} = 1,2,\ldots,\textit{nvar}.

If the :math:j\ th variable is binary, ordinal or continuous, :math:\mathrm{levels}[j-1] should be set to :math:1; otherwise :math:\mathrm{levels}[j-1] should be set to the number of levels associated with the :math:j\ th variable and the corresponding column of the data matrix is assumed to take the values :math:1 to :math:\mathrm{levels}[j-1].

**vnames** : None or str, array-like, shape :math:\left(\textit{lvnames}\right), optional
If :math:\mathrm{vnames} is not **None**, :math:\mathrm{vnames}[\textit{j}-1] must contain the name of the :math:\textit{j}\ th variable, for :math:\textit{j} = 1,2,\ldots,\textit{nvar}.

The names supplied in :math:\mathrm{vnames} should be at most :math:50 characters long and be unique.

If a name longer than :math:50 characters is supplied it will be truncated.

Variable names must not contain any of the characters +.*-:^()@.
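
The naming rules can be checked before the names are passed on. The sketch below is a plain Python pre-check under stated assumptions: the helper name check_vnames is hypothetical, names are compared exactly as written, and truncation is modelled by simple slicing to 50 characters.

```python
# Illustrative only: a hypothetical pre-check, not part of naginterfaces.
FORBIDDEN = set('+.*-:^()@')
MAX_LEN = 50

def check_vnames(vnames):
    cleaned = []
    for name in vnames:
        bad = FORBIDDEN.intersection(name)
        if bad:
            raise ValueError('invalid characters in ' + repr(name))
        cleaned.append(name[:MAX_LEN])  # longer names are truncated
    if len(set(cleaned)) != len(cleaned):
        raise ValueError('variable names are not unique')
    return cleaned

print(check_vnames(['AGE', 'SEX', 'DOSE']))
```

Checking uniqueness after truncation mirrors the distinction drawn between the two "not unique" error conditions listed under Raises.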

.. _g22yb-py2-py-other_params:

**Other Parameters**
**'Number of Observations'** : int
:math:n, the number of observations in the data matrix.

**'Number of Variables'** : int
If queried, this option will return :math:m_d, the number of variables in the data matrix.

**'Storage Order'** : str
Default :math:\text{} = \texttt{'OBSVAR'}

This option states how the data matrix, :math:D, will be stored in its input array.

If :math:\text{'Storage Order'} = \texttt{'OBSVAR'}, :math:D_{{ij}}, the value for the :math:j\ th variable of the :math:i\ th observation of the data matrix is stored in :math:{\textit{dat}}[i-1,j-1].

If :math:\text{'Storage Order'} = \texttt{'VAROBS'}, :math:D_{{ij}}, the value for the :math:j\ th variable of the :math:i\ th observation of the data matrix is stored in :math:{\textit{dat}}[j-1,i-1].

Where :math:\textit{dat} is the input argument of the same name in :meth:lm_design_matrix.

.. _g22yb-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:11)
On entry, :math:\mathrm{hddesc} is not a null Handle or a recognised G22 handle.

(errno :math:21)
On entry, :math:\mathrm{nobs} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nobs}\geq 0.

(errno :math:31)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nvar}\geq 0.

(errno :math:41)
On entry, :math:j = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{levels}[j-1] = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{levels}[j-1]\geq 1.

(errno :math:51)
On entry, :math:\textit{lvnames} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{lvnames} = 0 or :math:\textit{nvar}.

(errno :math:61)
On entry, variable name :math:i contains one or more invalid characters, :math:i = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:62)
On entry, variable names :math:i and :math:j are not unique, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:j = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:63)
On entry, variable names :math:i and :math:j are not unique (possibly due to truncation), :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:j = \langle\mathit{\boldsymbol{value}}\rangle.

Maximum variable name length is :math:50.

**Warns**
**NagAlgorithmicWarning**
(errno :math:64)
At least one variable name was truncated to :math:50 characters. Each truncated name is unique and will be used in all output.

.. _g22yb-py2-py-notes:

**Notes**
Let :math:D denote a data matrix with :math:n observations on :math:m_d independent variables, denoted :math:V_1,V_2,\ldots,V_{m_d}.
The :math:j\ th independent variable, :math:V_j, can be classified as binary, categorical, ordinal or continuous, where:

Binary
:math:V_j can take the value :math:1 or :math:0.

Categorical
:math:V_j can take one of :math:L_j distinct values or levels. Each level represents a discrete category but does not necessarily imply an ordering. The value used to represent each level is, therefore, arbitrary and, by convention and for convenience, is taken to be the integers from :math:1 to :math:L_j.

Ordinal
As with a categorical variable :math:V_j can take one of :math:L_j distinct values or levels. However, unlike a categorical variable, the levels of an ordinal variable imply an ordering and hence the value used to represent each level is not arbitrary. For example, :math:V_j = 4 implies a value that is twice as large as :math:V_j = 2.

Continuous
:math:V_j can take any real value.

lm_describe_data returns a G22 handle containing a description of a data matrix, :math:D.
The data matrix makes no distinction between binary, ordinal or continuous variables.

A name can also be assigned to each variable.
If names are not supplied then the default vector of names, :math:\left\{\text{'V1'},\text{'V2'},\ldots \right\} is used.
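
As a small worked example of the classification above, the sketch below builds a levels vector and the default names for a data matrix with four columns. The column descriptions are made-up illustration data, and nothing here calls the NAG Library.

```python
# Illustrative only: made-up column descriptions for a four-column D.
# Each entry is (kind, number of levels if categorical else None).
columns = [('continuous', None),
           ('categorical', 3),
           ('binary', None),
           ('ordinal', None)]

# Only categorical variables carry their level count; all others use 1.
levels = [n if kind == 'categorical' else 1 for kind, n in columns]

# Default names used when vnames is not supplied: V1, V2, ...
default_names = ['V' + str(j) for j in range(1, len(levels) + 1)]

print(levels, default_names)
```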

See Also
--------
:meth:naginterfaces.library.examples.blgm.lm_formula_ex.main
:meth:naginterfaces.library.examples.correg.glm_binomial_ex.main
:meth:naginterfaces.library.examples.correg.lmm_init_combine_ex.main
"""
raise NotImplementedError

def lm_design_matrix(hform, hddesc, dat, hxdesc):
r"""
lm_design_matrix generates a design matrix from a data matrix and model description.

Note: this function uses optional algorithmic parameters, see also: :meth:optset, :meth:optget.

.. _g22yc-py2-py-doc:

For full information please refer to the NAG Library document for g22yc

https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22ycf.html

.. _g22yc-py2-py-parameters:

**Parameters**
**hform** : Handle
A G22 handle to the internal data structure containing a description of the model :math:\mathcal{M} as returned in :math:\textit{hform} by :meth:lm_formula.

**hddesc** : Handle
A G22 handle to the internal data structure containing a description of the data matrix, :math:D as returned in :math:\textit{hddesc} by :meth:lm_describe_data.

**dat** : float, array-like, shape :math:\left(:, :\right)
The data matrix, :math:D. By default :math:D_{{ij}}, the :math:\textit{i}\ th value for the :math:\textit{j}\ th variable, for :math:\textit{j} = 1,2,\ldots,m_d, for :math:\textit{i} = 1,2,\ldots,n, should be supplied in :math:\mathrm{dat}[i-1,j-1].

If the option 'Storage Order', described in :meth:lm_describe_data, is set to 'VAROBS', :math:D_{{ij}} should be supplied in :math:\mathrm{dat}[j-1,i-1].

**hxdesc** : Handle, modified in place
On entry: must be set to a null Handle; alternatively, an existing G22 handle may be supplied, in which case this function will destroy the supplied G22 handle as if :meth:handle_free had been called.

On exit: holds a G22 handle to the internal data structure containing a description of the design matrix, :math:X. You **must not** change the G22 handle other than through the functions in submodule blgm.

**Returns**
**x** : float, ndarray, shape :math:\left(:, :\right)
The design matrix, :math:X. By default :math:X_{{ij}}, the :math:\textit{i}\ th value for the :math:\textit{j}\ th column, for :math:\textit{j} = 1,2,\ldots,m_x, for :math:\textit{i} = 1,2,\ldots,n, is returned in :math:\mathrm{x}[i-1,j-1].

If the option 'Storage Order', described in :meth:lm_formula, is set to 'VAROBS', :math:X_{{ij}} is returned in :math:\mathrm{x}[j-1,i-1].

.. _g22yc-py2-py-other_params:

**Other Parameters**
**'Formula'** : str
This option returns a verbose formula string describing the model, :math:\mathcal{M}, used to create the design matrix.
This formula will only contain variable names, the operators ':math:+' and ':math:.' and any contrast identifiers present.

**'Min Number of Columns'** : int
This option returns the minimum number of columns required to hold the design matrix, :math:X.
In most cases :math:\text{'Min Number of Columns'} = \text{'Number of Columns'}.
The one exception is when :math:\mathrm{errno} = 71, that is, the size of :math:\mathrm{x} was too small but the data matrix given in :math:\mathrm{dat} can be used as the design matrix.
In this case, :math:\text{'Number of Columns'} = m_x = m_d and :math:\text{'Min Number of Columns'} holds the number of columns that would be required if only the relevant parts of :math:\mathrm{dat} were copied into a new array.

**'Number of Columns'** : int
This option returns :math:m_x, the number of columns in the design matrix.

**'Number of Observations'** : int
This option returns :math:n, the number of observations in the design matrix.

**'Storage Order'** : str
This option returns how the design matrix, :math:X, is stored in :math:\mathrm{x}.

If :math:\text{'Storage Order'} = \texttt{'OBSVAR'}, :math:X_{{ij}}, the value for the :math:j\ th variable of the :math:i\ th observation of the design matrix is stored in :math:\mathrm{x}[i-1,j-1].

If :math:\text{'Storage Order'} = \texttt{'VAROBS'}, :math:X_{{ij}}, the value for the :math:j\ th variable of the :math:i\ th observation of the design matrix is stored in :math:\mathrm{x}[j-1,i-1].

It should be noted that 'Storage Order' is not writeable.
If you wish to change the storage order of the design matrix you need to change 'Storage Order' in :math:\mathrm{hform} as described in :ref:Other Parameters for lm_formula <g22ya-py2-py-other_params> prior to calling lm_design_matrix.

.. _g22yc-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:11)
:math:\mathrm{hform} has not been initialized or is corrupt.

(errno :math:12)
:math:\mathrm{hform} is not a G22 handle as generated by :meth:lm_formula.

(errno :math:13)
A variable name used when creating :math:\mathrm{hform} is not present in :math:\mathrm{hddesc}.

Variable name: :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:21)
:math:\mathrm{hddesc} has not been initialized or is corrupt.

(errno :math:22)
:math:\mathrm{hddesc} is not a G22 handle as generated by :meth:lm_describe_data.

(errno :math:31)
On entry, column :math:j of the data matrix, :math:D, is not consistent with information supplied in :math:\mathrm{hddesc}, :math:j = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:41)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{lddat} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{lddat}\geq n.

(errno :math:42)
On entry, :math:m_d = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{lddat} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{lddat}\geq m_d.

(errno :math:51)
On entry, :math:m_d = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{sddat} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{sddat}\geq m_d.

(errno :math:52)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{sddat} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{sddat}\geq n.

(errno :math:61)
On entry, :math:\mathrm{hxdesc} is not a null Handle or a recognised G22 handle.

**Warns**
**NagAlgorithmicWarning**
(errno :math:14)
The model contains categorical variables, but no intercept or main effects terms have been requested.

Please check the design matrix returned matches the model you require.

(errno :math:32)
Column :math:j of the data matrix, :math:D, required rounding more than expected when being treated as a categorical variable, :math:j = \langle\mathit{\boldsymbol{value}}\rangle.

.. _g22yc-py2-py-notes:

**Notes**
lm_design_matrix generates a design matrix from a data matrix and a model description.
Design matrices encapsulate the observed values of the independent variables and the required model in a form that can be used by many of the model fitting functions available in the NAG Library, for example those in submodule :mod:~naginterfaces.library.correg.

**Notation**

Let :math:D denote a data matrix with :math:n observations on :math:m_d independent variables, denoted by :math:V_j, for :math:\textit{j} = 1,2,\ldots,m_d.
If :math:V_j is a categorical variable, let :math:L_j denote the number of levels associated with it.
If :math:V_j is a binary, ordinal or continuous variable, let :math:L_j = 1.

Let :math:V_{{ji}} denote the :math:i\ th value of :math:V_j.

Let :math:\mathcal{M} denote a model made up of one or more terms, denoted by :math:T_i.
Each term consists of either a main effect or an interaction and hence can be described using one or more variable names :math:V_j and the interaction operator ':math:.'.
The operator ':math:+' is used to denote the addition of a term to the model.
Therefore, :math:\mathcal{M} = T_1+T_2+T_3 = V_1+V_2+{V_1.V_2} denotes a model with three terms, the first two terms being the main effects for variables :math:V_1 and :math:V_2 and the last term the interaction between them.
For simplicity we reorder the terms of the model by the number of variables in them, so main effects come first, then two-way interactions, then three-way interactions etc.
By default it is assumed that the model :math:\mathcal{M} contains a mean effect (or intercept term). If the mean effect is excluded, this is denoted by ':math:-1', so :math:\mathcal{M} = T_1 is a model with one term and a mean effect and :math:\mathcal{M} = T_1-1 is the same model with the mean effect dropped.

lm_design_matrix generates an :math:n\times m_x design matrix, :math:X, from :math:D and :math:\mathcal{M}.
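
The notation of this section can be made concrete with a small sketch. The helper below is purely illustrative: parse_model is a hypothetical name, it understands only the '+', '.' and trailing '-1' notation used in these Notes, and it is not the formula language accepted by :meth:lm_formula.

```python
def parse_model(model):
    # Remove white space, then detect a trailing '-1' (mean effect dropped).
    compact = model.replace(" ", "")
    mean_effect = not compact.endswith("-1")
    if not mean_effect:
        compact = compact[:-2]
    # Each term is a tuple of variable names joined by the '.' operator.
    terms = [tuple(term.split(".")) for term in compact.split("+") if term]
    # Reorder by the number of variables; sorted() is stable, so terms of
    # equal size keep their original relative order.
    return sorted(terms, key=len), mean_effect


terms, mean_effect = parse_model("V1 + V1.V2 + V2 - 1")
print(terms)        # [('V1',), ('V2',), ('V1', 'V2')]
print(mean_effect)  # False
```

Representing each term as a tuple of variable names keeps the reordering by term size a one-line stable sort.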

**Dummy Variables**

When constructing a design matrix, we cannot work directly with categorical variables.
Categorical variables must first be recoded into dummy variables.
A categorical variable :math:V_j requires :math:L_j dummy variables.
Let :math:\mathcal{D}^{{j}} denote an :math:n\times L_j matrix of dummy variables for :math:V_j defined as

.. math::
\mathcal{D}_{{li}}^j = \left\{\begin{array}{l} 1 \text{; if }V_{{ji}} = l, \\ 0 \text{; otherwise} \end{array}\right.

where :math:\mathcal{D}_l^j is the :math:l\ th column of :math:\mathcal{D}^j and :math:\mathcal{D}_{{li}}^j is the :math:i\ th element of :math:\mathcal{D}_l^j.

For a binary, ordinal or continuous variable, :math:\mathcal{D}_{{1i}}^j = V_{{ji}}.
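
As a minimal illustration of the definition above (plain Python, not the NAG implementation; the helper name dummy_columns is hypothetical), the following builds the :math:n\times L_j matrix of dummy variables for one categorical variable whose levels are assumed to be coded :math:1,2,\ldots,L_j.

```python
def dummy_columns(values, n_levels):
    # Row i, column l-1 is 1 when observation i takes level l, else 0.
    return [[1 if v == level else 0 for level in range(1, n_levels + 1)]
            for v in values]


# A three-level variable observed four times.
dummies = dummy_columns([1, 3, 2, 2], 3)
for row in dummies:
    print(row)
```

Each row contains exactly one 1, which is why the dummy columns of a variable always sum to a vector of ones.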

**Full Design Matrix**

Given a model, :math:\mathcal{M}, and the matrices of dummy variables, constructing the full design matrix :math:X_F is trivial.
Each term is processed in order and

(1) If term :math:i is a main effect, that is :math:T_i = V_j for some :math:j, :math:\mathcal{D}^j is copied into :math:X_F.

(#) If term :math:i is a two-way interaction, that is :math:T_i = {V_j.V_k}, for some :math:j\neq k, then

(i) Loop over :math:l_j = 1,2,\ldots,L_j.

(#) Loop over :math:l_k = 1,2,\ldots,L_k.

(#) Add a column to :math:X_F corresponding to the element-wise product of :math:\mathcal{D}_{l_j}^j and :math:\mathcal{D}_{{l_k}}^k.

(#) Higher interaction terms are handled in a similar manner to the two-way interactions, by adding columns constructed by multiplying all combinations of the columns of the :math:\mathcal{D}\ s that correspond to the variables involved. In all cases, the variables towards the right hand side of a term are iterated over the quickest.
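
The steps above can be sketched in plain Python. This is an illustration under stated assumptions rather than the NAG implementation: all variables are taken to be categorical with levels coded from 1, terms are given as tuples of 0-based column indices, and the names dummy_columns and full_design_matrix are hypothetical.

```python
from itertools import product


def dummy_columns(values, n_levels):
    # One list per level, i.e. one column of the dummy variable matrix.
    return [[1 if v == level else 0 for v in values]
            for level in range(1, n_levels + 1)]


def full_design_matrix(data, levels, terms):
    n = len(data)
    per_variable = [dummy_columns([row[j] for row in data], levels[j])
                    for j in range(len(levels))]
    columns = []
    # Main effects first, then two-way interactions, and so on.
    for term in sorted(terms, key=len):
        # product() varies the rightmost variable fastest.
        for combo in product(*(per_variable[j] for j in term)):
            column = [1] * n
            for dummy in combo:
                column = [a * b for a, b in zip(column, dummy)]
            columns.append(column)
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


data = [[1, 1], [2, 3], [1, 2], [2, 2]]
x_main = full_design_matrix(data, [2, 3], [(0,), (1,)])
x_inter = full_design_matrix(data, [2, 3], [(0,), (1,), (0, 1)])
for row in x_main:
    print(row)
```

A main effect is simply the one-variable case of the same loop, so a single code path covers main effects and interactions of any order.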

**Contrasts**

Using the full design matrix :math:X_F in an analysis can result in an overparameterized model.
This is due to :math:X_F often not being of full rank as the sum of all the dummy variables for a particular variable is a vector of ones.
This source of overparameterization can be alleviated by using a design matrix :math:X where (some) dummy variables are replaced by contrasts.
For a categorical variable :math:V_j the contrasts are a set of :math:L_j-1 functionally independent linear combinations of the dummy variables.

Whilst the choice of contrasts used in term :math:T_i will affect the individual model coefficients (parameters), it has no effect on the overall contribution of :math:T_i.

For a given variable :math:V_j, the contrasts can be represented by an :math:L_j\times \left(L_j-1\right) matrix, :math:C_j.
Each row of :math:C_j corresponds to a particular value of :math:V_j and the columns hold the values to use in the design matrix.

Six types of contrast are available in lm_design_matrix; two types of treatment contrasts, two types of sum contrasts, Helmert contrasts and polynomial contrasts.
Unless specified otherwise, the contrasts used by lm_design_matrix are treatment contrasts relative to the first level.
See the description of the option 'Contrast' in :meth:lm_formula for ways of changing the contrasts used.

Treatment Contrasts

Treatment contrasts are taken relative to either the first or last level of the variable.
For example, if :math:L_j = 4,

.. math::
C_j = \begin{pmatrix}0&0&0\\1&0&0\\0&1&0\\0&0&1\end{pmatrix}

would be the contrast matrix for :math:V_j using treatment contrasts relative to the first level.
The contrast matrix obtained when using treatment contrasts relative to the last level is similar, but the row of zeros appears at the bottom and all other rows are shifted up one.

Strictly speaking, the term contrast implies that each column in the contrast matrix sums to zero.
That is not the case for treatment contrasts; however, they are included as this coding is commonly used in practice.
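
A minimal sketch of how the two treatment contrast matrices could be generated for any number of levels (the function name and the base argument are hypothetical, introduced only for this illustration):

```python
def treatment_contrasts(n_levels, base="first"):
    # Rows index the levels, columns index the n_levels - 1 contrasts.
    identity = [[1 if r == c else 0 for c in range(n_levels - 1)]
                for r in range(n_levels - 1)]
    zeros = [0] * (n_levels - 1)
    # The row of zeros marks the reference level.
    return [zeros] + identity if base == "first" else identity + [zeros]


first = treatment_contrasts(4)
last = treatment_contrasts(4, base="last")
for row in first:
    print(row)
```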

Sum Contrasts

Sum contrasts are similar to treatment contrasts and again can be taken relative to the first or last level of the variable.
Unlike treatment contrasts, sum contrasts effectively constrain the coefficients related to the variable to sum to zero.
For example, if :math:L_j = 4,

.. math::
C_j = \begin{pmatrix}1&0&0\\0&1&0\\0&0&1\\-1&-1&-1\end{pmatrix}

would be the contrast matrix for :math:V_j using sum contrasts relative to the last level.
The contrast matrix obtained when using sum contrasts relative to the first level is similar, but the row of :math:-1\ s appears at the top and all other rows are shifted down one.
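
The sum contrast matrices can be generated in the same spirit. The sketch below uses hypothetical names and also confirms that every column sums to zero, which is the constraint referred to above.

```python
def sum_contrasts(n_levels, base="last"):
    # Rows index the levels, columns index the n_levels - 1 contrasts.
    identity = [[1 if r == c else 0 for c in range(n_levels - 1)]
                for r in range(n_levels - 1)]
    minus_ones = [-1] * (n_levels - 1)
    # The row of -1s marks the reference level.
    return identity + [minus_ones] if base == "last" else [minus_ones] + identity


relative_to_last = sum_contrasts(4)
relative_to_first = sum_contrasts(4, base="first")
column_sums = [sum(column) for column in zip(*relative_to_last)]
print(relative_to_last)
print(column_sums)
```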

Helmert Contrasts

With Helmert contrasts, level :math:l of the variable is compared with the average effect of all previous levels.
For example, if :math:L_j = 4,

.. math::
C_j = \begin{pmatrix}-1&-1&-1\\1&-1&-1\\0&2&-1\\0&0&3\end{pmatrix}

would be the contrast matrix for :math:V_j using Helmert contrasts.
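
The pattern in the matrix above generalizes to any number of levels. A short sketch, with a hypothetical function name, that reproduces it:

```python
def helmert_contrasts(n_levels):
    # Column c (0-based) compares level c + 2 with the levels before it:
    # -1 in rows 0..c, c + 1 in row c + 1 and 0 below.
    return [[-1 if r <= c else (c + 1 if r == c + 1 else 0)
             for c in range(n_levels - 1)]
            for r in range(n_levels)]


helmert = helmert_contrasts(4)
for row in helmert:
    print(row)
```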

Polynomial Contrasts

With polynomial contrasts, the entries in the columns of :math:C_j correspond to the linear, quadratic, cubic, quartic, etc. terms of a hypothetical underlying numeric variable that takes equally spaced values at each level.
For example, if :math:L_j = 4,

.. math::
C_j = \begin{pmatrix}-0.67&0.50&-0.22\\-0.22&-0.50&0.67\\0.22&-0.50&-0.67\\0.67&0.50&0.22\end{pmatrix}

would be the contrast matrix for :math:V_j using polynomial contrasts.
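
One common way to obtain such a matrix is to orthonormalize the powers of equally spaced values and discard the constant column. The sketch below takes that approach with a Gram-Schmidt pass. It is an assumption that this matches the construction used internally by lm_design_matrix, but it does reproduce the matrix above to two decimal places. The function name is hypothetical.

```python
import math


def polynomial_contrasts(n_levels):
    x = [float(v) for v in range(1, n_levels + 1)]
    basis = []  # orthonormal columns, starting with the constant one
    for degree in range(n_levels):
        column = [v ** degree for v in x]
        # Remove the components along the lower-degree columns.
        for q in basis:
            projection = sum(a * b for a, b in zip(column, q))
            column = [a - projection * b for a, b in zip(column, q)]
        norm = math.sqrt(sum(a * a for a in column))
        basis.append([a / norm for a in column])
    # Drop the constant column; return rows indexed by level.
    return [list(row) for row in zip(*basis[1:])]


poly = polynomial_contrasts(4)
for row in poly:
    print([round(v, 2) for v in row])
```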

When Contrasts Can Be Used

Depending on the specifics of the model, :math:\mathcal{M}, it may not always be possible to replace the :math:L_j dummy variables with :math:L_j-1 contrasts for all variables in all terms and retain the same model.
A simple example of this is a data matrix, :math:D, with four observations and two variables which have two and three levels respectively.
This data matrix might look something like:

.. math::
D = \begin{pmatrix}1&1\\2&3\\1&2\\2&2\end{pmatrix}

For the sake of argument, assume that our model contains the main effect for each variable, but does not contain a mean effect (or intercept term).
So using the notation established earlier, :math:\mathcal{M} = V_1+V_2-1.
The full design matrix, :math:X_F, for this data matrix and model would be

.. math::
X_F = \begin{pmatrix}1&0&&1&0&0\\0&1&&0&0&1\\1&0&&0&1&0\\0&1&&0&1&0\end{pmatrix}

However, :math:X_F is not of full rank (and hence :math:\mathcal{M} is overparameterized) because the sum of the first two columns is a vector of ones as is the sum of the last three columns.

In order to alleviate this we might try constructing :math:X_C where the dummy variables have been replaced by contrasts.
Assuming treatment contrasts, relative to the first level, we would have

.. math::
X_C = \begin{pmatrix}0&&0&0\\1&&0&1\\0&&1&0\\1&&1&0\end{pmatrix}

However, using :math:X_C makes an implicit assumption that the expected value of the dependent variable (the quantity being modelled) is zero when :math:V_1 = 1 and :math:V_2 = 1.
This assumption was not made when we used :math:X_F and hence the two design matrices are not equivalent.
One solution would be to use dummy variables for :math:V_1 and contrasts for :math:V_2, which would result in a design matrix, :math:X, of

.. math::
X = \begin{pmatrix}1&0&&0&0\\0&1&&0&1\\1&0&&1&0\\0&1&&1&0\end{pmatrix}

Using :math:X would give an equivalent model to using :math:X_F.
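
The rank statements in this example can be checked numerically. The sketch below is plain Python with exact rational arithmetic; the helper rank is written for this illustration and is not a NAG function. It computes the ranks of :math:X_F, :math:X_C and :math:X, and of :math:X_F and :math:X placed side by side.

```python
from fractions import Fraction


def rank(matrix):
    # Exact Gauss-Jordan elimination over the rationals.
    rows = [[Fraction(v) for v in row] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b
                           for a, b in zip(rows[r], rows[found])]
        found += 1
        if found == len(rows):
            break
    return found


x_f = [[1, 0, 1, 0, 0], [0, 1, 0, 0, 1], [1, 0, 0, 1, 0], [0, 1, 0, 1, 0]]
x_c = [[0, 0, 0], [1, 0, 1], [0, 1, 0], [1, 1, 0]]
x = [[1, 0, 0, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 1, 0]]
side_by_side = [a + b for a, b in zip(x_f, x)]
print(rank(x_f), rank(x_c), rank(x), rank(side_by_side))
```

:math:X_F has five columns but rank four, so it is rank deficient. :math:X has four columns and rank four, and appending its columns to :math:X_F does not raise the rank, so both span the same space. :math:X_C has rank three and therefore spans a strictly smaller space.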

The algorithm used by lm_design_matrix to decide which variables, in which terms, can be coded as contrasts and which need to be coded as dummy variables is described below.

Suppose :math:V_j is any variable that appears in term :math:T_i, let :math:T_{{i{}\left(j\right)}} denote the term obtained by dropping :math:V_j from :math:T_i.
For example, if :math:T_3 = {V_1.V_2.V_3}, :math:T_{{3{}\left(2\right)}} = {V_1.V_3}.
In this context, the empty term is taken to be the mean effect (or intercept term).
We say that :math:T_{{i\left(j\right)}} appears in :math:\mathcal{M} if there exists a term :math:T_k, :math:k < i, that contains all of the variables appearing in :math:T_{{i\left(j\right)}}.
In most cases :math:T_k = T_{{i\left(j\right)}}, but this is not required.
Note, as stated earlier, the terms in :math:\mathcal{M} are ordered by the number of variables in them.

A variable :math:V_j in term :math:T_i is coded by contrasts if :math:T_{{i\left(j\right)}} appears in :math:\mathcal{M} and by dummy variables otherwise.
It is, therefore, possible for variable :math:V_j to be coded by contrasts in some terms and dummy variables in others within the same :math:X.

The above rule assumes the presence of a mean effect.
If no such effect is present in the model, the main effect of the first categorical variable is coded by dummy variables to compensate.
If no main effects appear in the model, the warning :math:\mathrm{errno} = 14 is returned.
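
The rule can be sketched as follows. This is one reading of the description above, not the library's code: every variable is assumed to be categorical, terms are tuples of variable names, and the name coding_plan and its mean_spanned flag are hypothetical.

```python
def coding_plan(terms, mean_effect=True):
    # Terms are processed in order of the number of variables they contain.
    ordered = sorted(terms, key=len)
    # Without a mean effect, the first main effect is coded by dummy
    # variables, after which the mean is treated as accounted for.
    mean_spanned = mean_effect
    plan = []
    for i, term in enumerate(ordered):
        coding = {}
        for var in term:
            reduced = set(term) - {var}
            if reduced:
                # T_i(j) appears if an earlier term contains all its variables.
                appears = any(reduced <= set(earlier)
                              for earlier in ordered[:i])
            else:
                # The empty term is the mean effect.
                appears = mean_spanned
                mean_spanned = True
            coding[var] = "contrasts" if appears else "dummies"
        plan.append((term, coding))
    return plan


with_mean = coding_plan([("V1",), ("V2",), ("V1", "V2")])
without_mean = coding_plan([("V1",), ("V2",)], mean_effect=False)
print(without_mean)
```

For the model :math:V_1+V_2-1 the sketch codes :math:V_1 by dummy variables and :math:V_2 by contrasts, which agrees with the design matrix :math:X constructed in the example above.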

A longer description and informal proof that the resulting :math:X is a suitable design matrix for the model of interest can be found in chapter two of Chambers and Hastie (1992).

**Mean Effect**

The mean effect (or intercept term) is included in a design matrix by adding a column of ones as the first column of :math:X.
However, many model fitting functions in the NAG Library handle the mean effect as a special case and do not require it to be explicitly added to the design matrix.
Therefore, by default, lm_design_matrix does not explicitly add the mean effect to the design matrix.
This behaviour can be changed via the option 'Explicit Mean' in :meth:lm_formula.

.. _g22yc-py2-py-references:

**References**
Chambers, J M and Hastie, T J, 1992, Statistical Models in S, Wadsworth and Brooks/Cole Computer Science Series

--------
:meth:naginterfaces.library.examples.correg.glm_binomial_ex.main
"""
    raise NotImplementedError

def lm_submodel(what, hform, hxdesc, lisx, lplab, lvinfo, lenlab=210):
    r"""
lm_submodel produces labels for the columns of a design matrix, model parameters and a vector of column inclusion flags suitable for use with functions in submodule :mod:~naginterfaces.library.correg, thus allowing submodels to be fitted using the same design matrix.

.. _g22yd-py2-py-doc:

For full information please refer to the NAG Library document for g22yd

https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22ydf.html

.. _g22yd-py2-py-parameters:

**Parameters**
**what** : str
Controls what labels are to be produced:

:math:\mathrm{what} = \text{‘S'}

Labels for a submodel are required. The submodel must be supplied in :math:\mathrm{hform}.

:math:\mathrm{what} = \text{‘X'}

Labels for the design matrix :math:X.

If :math:\mathrm{hxdesc} was returned by :meth:correg.lmm_init <naginterfaces.library.correg.lmm_init> in :math:\textit{hlmm} then :math:X is the design matrix associated with the fixed parameters.

:math:\mathrm{what} = \text{‘Z'}

Labels for the design matrix :math:Z.

If :math:\mathrm{hxdesc} was returned by :meth:correg.lmm_init <naginterfaces.library.correg.lmm_init> in :math:\textit{hlmm} then :math:Z is the part of the design matrix associated with the random parameters.

:math:\mathrm{what} = \text{‘V'}

Labels for the variance components.

**hform** : Handle
A G22 handle to the internal data structure containing a description of the required submodel :math:\mathcal{M}_S, as returned in :math:\textit{hform} by :meth:lm_formula. If :math:\mathrm{what} \neq \text{‘S'}, :math:\mathrm{hform} is not referenced and need not be set.

**hxdesc** : Handle
A G22 handle to the internal data structure containing a description of the design matrix, :math:X.

**lisx** : int
Length of :math:\mathrm{isx}.

**lplab** : int
The length of :math:\mathrm{plab}.

As :math:p\leq m_x+1, if labels are required, using :math:\mathrm{lplab} = m_x+1 will always be sufficient.

**lvinfo** : int
The length of :math:\mathrm{vinfo}.

Let :math:n_T denote the number of terms in :math:\mathcal{M}_S, :math:n_{{Tt}} denote the number of variables in the :math:t\ th term and :math:m_{{xt}} denote the number of columns of :math:X corresponding to the :math:t\ th term.

The required size of :math:\mathrm{vinfo}, denoted :math:a, is given by:

.. math::
a = \sum_{t = 1}^{n_T}{m_{{xt}}\left(1+3n_{{Tt}}\right)}\text{.}

If the model includes a mean effect, :math:a should be incremented by one.

The values :math:n_T, :math:n_{{Tt}} and :math:m_{{xt}} are not trivial to calculate as they require the formula describing the model to be fully expanded and the contrast / dummy variable encoding to be known.

Therefore, if :math:\mathrm{lisx}, :math:\mathrm{lplab} or :math:\mathrm{lvinfo} are too small and :math:\mathrm{lvinfo}\geq 3, :math:\mathrm{errno} = 102 is returned and the required sizes for these arrays are returned in :math:\mathrm{vinfo}[0], :math:\mathrm{vinfo}[1] and :math:\mathrm{vinfo}[2] respectively.
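
When the per-term quantities are known, the size formula is simple arithmetic. The sketch below uses a hypothetical helper, and the term sizes in the example are assumed values for a model with two main effects and their interaction. It evaluates :math:a.

```python
def required_lvinfo(terms, mean_effect):
    # terms: (number of variables in the term, number of columns of X
    # that correspond to the term) for each term of the model.
    a = sum(m_xt * (1 + 3 * n_tt) for n_tt, m_xt in terms)
    return a + 1 if mean_effect else a


# Assumed example: V1 (1 column), V2 (2 columns), V1.V2 (2 columns).
size = required_lvinfo([(1, 1), (1, 2), (2, 2)], mean_effect=True)
print(size)  # 27
```

In practice the sizes reported alongside :math:\mathrm{errno} = 102 avoid the need to work these quantities out by hand.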

**lenlab** : int, optional
Length of the strings allocated in :math:\mathrm{plab}. At most :math:\mathrm{lenlab} characters will be written into each element of :math:\mathrm{plab}.

**Returns**
**intcpt** : str
If :math:\mathrm{intcpt} = \text{‘M'}, in order to fit the model :math:\mathcal{M}_S to :math:D using :math:X, any analysis function should include an implicit mean effect (intercept term).

:math:\mathrm{intcpt} = \text{‘Z'}, if :math:\mathcal{M}_S does not include a mean effect or the mean effect has been explicitly included in the design matrix.

**ip** : int
:math:p, the number of parameters in the (sub)model, including the intercept if one is present. If :math:\mathrm{what} = \text{‘S'}, then the submodel is the one specified in :math:\mathrm{hform}; otherwise the model is the one used when defining the design matrix described in :math:\mathrm{hxdesc}.

If :math:\mathrm{lisx} \neq 0, then :math:p = \sum_{{i = 1}}^{m_x}\mathrm{isx}[i-1] if :math:\mathrm{intcpt} = \text{‘Z'}, and :math:p = \sum_{{i = 1}}^{m_x}\mathrm{isx}[i-1]+1 otherwise.

**isx** : None or int, ndarray, shape :math:\left(\mathrm{lisx}\right)
If :math:\mathrm{lisx} \neq 0, an array indicating which columns of the design matrix from the model specified in :math:\mathrm{hform} are to be used.

:math:\mathrm{isx}[j-1] = 0

The :math:j\ th column of the design matrix, :math:X, should not be included in the analysis.

:math:\mathrm{isx}[j-1] = 1

The :math:j\ th column of the design matrix, :math:X, should be included in the analysis.

If :math:\mathrm{lisx} = 0, :math:\mathrm{isx} is not referenced.
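
How the flags relate to the design matrix and to :math:p can be sketched in plain Python. The helper names are hypothetical, and the fitting functions mentioned in the Notes accept the flags directly, so the explicit column selection here is only for illustration.

```python
def select_columns(x, isx):
    # Keep column j of every row when isx[j] == 1.
    return [[value for value, keep in zip(row, isx) if keep == 1]
            for row in x]


def parameter_count(isx, intcpt):
    # An implicit mean effect ('M') adds one parameter.
    p = sum(isx)
    return p if intcpt == "Z" else p + 1


x = [[1, 0, 5], [0, 1, 7], [1, 1, 9]]
isx = [1, 0, 1]
x_sub = select_columns(x, isx)
p_implicit_mean = parameter_count(isx, "M")
p_no_mean = parameter_count(isx, "Z")
print(x_sub, p_implicit_mean, p_no_mean)
```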

**plab** : None or str, ndarray, shape :math:\left(\min\left(\mathrm{ip},\mathrm{lplab}\right)\right)
If :math:\mathrm{lplab} \neq 0, the names associated with the :math:p parameters in the model.

If :math:\mathrm{intcpt} = \text{‘Z'}, the labels in :math:\mathrm{plab} are also the labels for the columns of design matrix used in the analysis.

If :math:\mathrm{intcpt} = \text{‘M'}, elements :math:\mathrm{plab}[1] to :math:\mathrm{plab}[p-1] are the corresponding column labels.

If a mean effect is present in :math:\mathcal{M}_S, the corresponding label is always in :math:\mathrm{plab}[0].

If :math:\mathrm{lplab} = 0, :math:\mathrm{plab} is not referenced.

**vinfo** : None or int, ndarray, shape :math:\left(\mathrm{lvinfo}\right)
If :math:\mathrm{lvinfo} \neq 0, information encoding a description of the parameters in the model.

The encoding information can be extracted as follows:

(i) Set :math:k = 1.

(#) Iterate :math:j from :math:1 to :math:p.

(1) Set :math:b = \mathrm{vinfo}[k-1].

(#) Increment :math:k.

(#) Iterate :math:i from :math:1 to :math:b.

(a) Set :math:v_i = \mathrm{vinfo}[k-1].

(#) Set :math:l_i = \mathrm{vinfo}[k].

(#) Set :math:c_i = \mathrm{vinfo}[k+1].

(#) Increment :math:k by :math:3.

(#) The :math:j\ th model parameter corresponds to the interaction between the :math:b variables held in columns :math:v_1,v_2,\ldots,v_b of :math:D. Therefore, :math:b = 1 indicates a main effect, :math:b = 2 a two-way interaction, etc.

If :math:b = 0, the :math:j\ th model parameter corresponds to the mean effect.

If :math:l_i = 0, the corresponding variable :math:v_i is binary, ordinal or continuous.

Otherwise, :math:l_i is the level for the corresponding variable for model parameter :math:j.

:math:c_i is a numeric flag indicating the contrast used in the case of a categorical variable, with :math:c_i = 0 indicating that dummy variables were used for variable :math:v_i in this term.

The six types of contrast (treatment contrasts with respect to the first and last levels, sum contrasts with respect to the first and last levels, Helmert contrasts and polynomial contrasts), as described in :meth:lm_design_matrix, are identified by the integers one to six respectively.

If :math:\mathrm{lvinfo} = 0, :math:\mathrm{vinfo} is not referenced.
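
The extraction steps above translate directly into a short loop. The sketch below uses 0-based indexing and a hypothetical function name, and the example array is made up for illustration rather than taken from a real call.

```python
def decode_vinfo(vinfo, p):
    params = []
    k = 0  # 0-based position; corresponds to k - 1 in the steps above
    for _ in range(p):
        b = vinfo[k]
        k += 1
        parts = []
        for _ in range(b):
            variable, level, contrast = vinfo[k:k + 3]
            parts.append({"variable": variable,
                          "level": level,
                          "contrast": contrast})
            k += 3
        # An empty list (b = 0) denotes the mean effect.
        params.append(parts)
    return params


# Made-up encoding: a mean effect, a main effect of variable 1 (level 2,
# contrast flag 1) and its interaction with a non-categorical variable 3.
example = [0, 1, 1, 2, 1, 2, 1, 2, 1, 3, 0, 0]
decoded = decode_vinfo(example, 3)
for entry in decoded:
    print(entry)
```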

.. _g22yd-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:11)
On entry, :math:\mathrm{what} = \langle\mathit{\boldsymbol{value}}\rangle was an illegal value.

(errno :math:12)
Supplied value of :math:\mathrm{what} is not valid for the G22 handle supplied in :math:\mathrm{hxdesc}.

(errno :math:21)
:math:\mathrm{hform} has not been initialized or is corrupt.

(errno :math:22)
:math:\mathrm{hform} is not a G22 handle as generated by :meth:lm_formula.

(errno :math:23)
A variable name used when creating :math:\mathrm{hform} is not present in :math:\mathrm{hxdesc}.

Variable name: :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:24)
The model and the design matrix are not consistent. The design matrix was constructed in the presence of a mean effect and the model does not include a mean effect.

(errno :math:25)
The model and the design matrix are not consistent. The model includes a term not present in the design matrix.

Term: :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:26)
The model and the design matrix are not consistent.

Term: :math:\langle\mathit{\boldsymbol{value}}\rangle.

This is likely due to the design matrix being constructed in the presence of either a mean effect or main effect that is not present in the model.

(errno :math:31)
:math:\mathrm{hxdesc} has not been initialized or is corrupt.

(errno :math:32)
:math:\mathrm{hxdesc} is not a G22 handle as generated by :meth:lm_design_matrix.

(errno :math:61)
On entry, :math:\mathrm{lisx} = \langle\mathit{\boldsymbol{value}}\rangle and :math:m_x = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{lisx} = 0 or :math:\mathrm{lisx}\geq m_x.

(errno :math:81)
On entry, :math:\mathrm{lplab} = \langle\mathit{\boldsymbol{value}}\rangle and :math:p = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{lplab} = 0 or :math:\mathrm{lplab}\geq p.

(errno :math:91)
On entry, :math:\mathrm{plab} is too short to hold the parameter labels. Long labels will be truncated.

The longest parameter label is :math:\langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:101)
On entry, :math:\mathrm{lvinfo} is too small.

:math:\mathrm{lvinfo} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{lvinfo} = 0 or :math:\mathrm{lvinfo}\geq \langle\mathit{\boldsymbol{value}}\rangle.

**Warns**
**NagAlgorithmicWarning**
(errno :math:27)
The model and the design matrix are not consistent. The model specifies different contrasts to those used when the design matrix was constructed. The contrasts specified in :math:\mathrm{hform} will be ignored.

(errno :math:28)
The model may not be as expected.

This is due to the model not containing the categorical variable adjusted to account for no mean effect when the design matrix was constructed.

(errno :math:33)
:math:\mathrm{hxdesc} has not passed through the model fitting function.

(errno :math:102)
On entry, one or more of :math:\mathrm{lisx}, :math:\mathrm{lplab} or :math:\mathrm{lvinfo} are nonzero, but too small.

Minimum values are zero, or :math:\langle\mathit{\boldsymbol{value}}\rangle, :math:\langle\mathit{\boldsymbol{value}}\rangle and :math:\langle\mathit{\boldsymbol{value}}\rangle respectively.

The minimum values are returned in the first three elements of :math:\mathrm{vinfo}.

.. _g22yd-py2-py-notes:

**Notes**
lm_submodel is a utility function for use with :meth:lm_formula, :meth:lm_describe_data and :meth:lm_design_matrix.
It can be used to construct labels for the columns of an :math:n\times m_x design matrix, :math:X, created by :meth:lm_design_matrix and to return additional input vectors and flags required by a number of NAG Library model fitting functions.

Many of the analysis functions that require a design matrix to be supplied allow submodels to be defined through the use of a vector of ones or zeros indicating whether a column of :math:X should be included or excluded from the analyses (see for example :math:\textit{isx} in :meth:correg.linregm_fit <naginterfaces.library.correg.linregm_fit> or :meth:correg.glm_normal <naginterfaces.library.correg.glm_normal>).
This allows nested models to be fit without having to reconstruct the design matrix for each analysis.

Let :math:\mathcal{M} denote a model constructed by :meth:lm_formula, :math:D a data matrix as described by :meth:lm_describe_data and :math:X be the corresponding design matrix constructed by :meth:lm_design_matrix from :math:\mathcal{M} and :math:D.
A different model, :math:\mathcal{M}_S, is a submodel of :math:\mathcal{M} if each term in :math:\mathcal{M}_S, including the mean effect (intercept term), is also present in :math:\mathcal{M}.

If :math:\mathcal{M}_S is a submodel of :math:\mathcal{M}, you can fit :math:\mathcal{M}_S to :math:D using a design matrix whose columns are a subset of the columns of :math:X.
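
The submodel definition can be written as a small predicate. The sketch below assumes each model is given as a list of terms (tuples of variable names) plus a flag for the mean effect. The function name is hypothetical, and the consistency checks that lm_submodel performs against the design matrix are not reproduced.

```python
def is_submodel(sub_terms, sub_mean, terms, mean):
    # The order of variables within a term is irrelevant, so compare sets.
    available = {frozenset(term) for term in terms}
    if sub_mean and not mean:
        return False
    return all(frozenset(term) in available for term in sub_terms)


full_terms = [("V1",), ("V2",), ("V1", "V2")]
nested = is_submodel([("V1",), ("V2",)], True, full_terms, True)
not_nested = is_submodel([("V1",), ("V3",)], True, full_terms, True)
print(nested, not_nested)
```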

--------
:meth:naginterfaces.library.examples.correg.glm_binomial_ex.main
:meth:naginterfaces.library.examples.correg.lmm_init_combine_ex.main
"""
    raise NotImplementedError

def handle_free(handle):
    r"""
handle_free destroys a G22 handle and deallocates all the memory used.

.. _g22za-py2-py-doc:

For full information please refer to the NAG Library document for g22za

https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22zaf.html

.. _g22za-py2-py-parameters:

**Parameters**
**handle** : Handle, modified in place
On entry: the G22 handle to be destroyed.

On exit: the handle is destroyed and set to a null Handle.

.. _g22za-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:12)
:math:\mathrm{handle} has been corrupted.

(errno :math:13)
:math:\mathrm{handle} is a handle to an unknown data structure.

**Warns**
**NagAlgorithmicWarning**
(errno :math:11)
:math:\mathrm{handle} has not been initialized.

.. _g22za-py2-py-notes:

**Notes**
Each G22 handle should be deallocated to avoid memory leaks.
Therefore, handle_free should be called on all such handles which are no longer needed.
Please note that passing an uninitialized handle might cause unpredictable behaviour, including a crash of your program.
"""
    raise NotImplementedError

def optset(handle, optstr):
    r"""
optset is a general option setting function for functions in submodule blgm.
It can set a single option or reset all of them to their default.

.. _g22zm-py2-py-doc:

For full information please refer to the NAG Library document for g22zm

https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22zmf.html

.. _g22zm-py2-py-parameters:

**Parameters**
**handle** : Handle
The G22 handle which **must** have been initialized by one of submodule blgm's initialization functions.

**optstr** : str
A string identifying the option, its value and, where required, the instance identifier.

Defaults

Resets all options to their default values.

:math:\textit{option} = \textit{optval}

Sets (all instances of) :math:\textit{option} to :math:\textit{optval}.

:math:\textit{option}:\textit{instance identifier} = \textit{optval}

Sets a single instance of :math:\textit{option} to :math:\textit{optval}.

:math:\textit{option} = \mathbf{default}

Resets (all instances of) :math:\textit{option} to their default value.

:math:\textit{option}:\textit{instance identifier} = \mathbf{default}

Resets a single instance of :math:\textit{option} to its default value.

:math:\mathrm{optstr} is case insensitive and :math:\textit{option}, instance identifier and :math:\textit{optval} may consist of one or more tokens separated by white space.
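
The forms listed above are plain strings, so they can be assembled with a small helper. The sketch below is hypothetical and not part of the library. The option names 'Explicit Mean' and 'Contrast' are those mentioned in the :meth:lm_design_matrix notes, whereas the value 'Yes' and the instance identifier 'V1' are placeholders. Valid values should be taken from the NAG documentation linked below.

```python
def make_optstr(option, optval, instance=None):
    # 'option = optval' or 'option:instance identifier = optval'.
    name = option if instance is None else "{}:{}".format(option, instance)
    return "{} = {}".format(name, optval)


all_instances = make_optstr("Explicit Mean", "Yes")
one_instance = make_optstr("Contrast", "default", instance="V1")
print(all_instances)
print(one_instance)
```

The resulting string would then be passed as the second argument of optset, together with an initialized G22 handle.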

See the documentation of the individual functions in the G22 Introduction <https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22intro.html>__ for details of valid values for :math:\textit{option}, instance identifier and :math:\textit{optval}.

.. _g22zm-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:11)
:math:\mathrm{handle} has not been initialized or is corrupt.

(errno :math:12)
:math:\mathrm{handle} is not a G22 handle.

(errno :math:21)
On entry, :math:\textit{option} was not recognized.

:math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:22)
On entry, the expected delimiter ':math:=' was not found.

:math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:23)
On entry, :math:\textit{option} is read only.

:math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:24)
On entry, could not convert :math:\textit{optval} to an integer.

:math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:25)
On entry, could not convert :math:\textit{optval} to a real.

:math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:26)
On entry, :math:\textit{optval} is not a valid value for :math:\textit{option}.

:math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:121)
Invalid instance identifier for :math:\textit{option}.

On entry, :math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:122)
Numeric instance identifier is out of range.

On entry, :math:\textit{instance identifier} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\langle\mathit{\boldsymbol{value}}\rangle\leq \textit{instance identifier} and :math:\textit{instance identifier}\leq \langle\mathit{\boldsymbol{value}}\rangle.

**Warns**
**NagAlgorithmicWarning**
(errno :math:123)
On entry, :math:\textit{option} cannot have an associated instance identifier. The supplied instance identifier was ignored.

:math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

.. _g22zm-py2-py-notes:

**Notes**
optset can only be called on G22 handles.
Its purpose is to reset all options to their default values or set a single option to a user-supplied value.

Options and their values are, in general, presented as a character string of the form ':math:\textit{option} = \textit{optval}'; alphabetic characters can be supplied in either upper or lower case. :math:\textit{optval} will normally be either an integer, real or character value as defined in the description of the specific option.
In addition, all options can take an :math:\textit{optval} DEFAULT which resets the option to its default value.

In cases where an option may have multiple instances an instance identifier can be specified.
This is presented using the form ':math:\textit{option}:\textit{instance identifier} = \textit{optval}'.
In such cases, if the instance identifier is omitted, the values of all instances are changed.

Information relating to available option names, their corresponding valid values, whether the use of an instance identifier may be appropriate and what form it can take is given in the individual function documents.

--------
:meth:naginterfaces.library.examples.blgm.lm_formula_ex.main
"""
    raise NotImplementedError

def optget(handle, optstr):
    r"""
optget is a general option getting function for submodule blgm.
It is used to query the value of options.

.. _g22zn-py2-py-doc:

For full information please refer to the NAG Library document for g22zn

https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22znf.html

.. _g22zn-py2-py-parameters:

**Parameters**
**handle** : Handle
The G22 handle which **must** have been initialized by one of submodule blgm's initialization functions.

**optstr** : str
A string identifying the option and, where required, the instance identifier.

**identify**

Returns a string description of the G22 handle supplied in :math:\mathrm{handle}. See Further Comments <https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g22/g22znf.html#fcomments>__ for more details.

:math:\textit{option}

Returns the value of :math:\textit{option}. If there are multiple instances of :math:\textit{option}, the value of the first is returned. If not all instances of :math:\textit{option} have the same value, :math:\mathrm{errno} = 124 is returned.

:math:\textit{option}:\textit{instance identifier}

Returns the value of a single instance of :math:\textit{option}.

:math:\mathrm{optstr} is case insensitive and :math:\textit{option} and instance identifier may consist of one or more tokens separated by white space.

See the documentation of the individual submodule blgm functions for details of valid values for :math:\textit{option} and instance identifier.

**Returns**
**optvalue** : dict
The option-value dict, with the following keys:

'value' : float, int or str
The value of the requested option.

'annotation' : None or str

.. _g22zn-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:11)
:math:\mathrm{handle} has not been initialized or is corrupt.

(errno :math:12)
:math:\mathrm{handle} is not a G22 handle.

(errno :math:21)
On entry, :math:\textit{option} was not recognized: :math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:22)
On entry, :math:\textit{option} is not readable: :math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:121)
Invalid instance identifier for :math:\textit{option}.

On entry, :math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:122)
Numeric instance identifier is out of range.

On entry, :math:\textit{instance identifier} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\langle\mathit{\boldsymbol{value}}\rangle\leq \textit{instance identifier} and :math:\textit{instance identifier}\leq \langle\mathit{\boldsymbol{value}}\rangle.

**Warns**
**NagAlgorithmicWarning**
(errno :math:123)
On entry, :math:\textit{option} cannot have an associated instance identifier. The supplied instance identifier was ignored.

:math:\mathrm{optstr} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:124)
:math:\textit{option} has multiple instances. Information from the first instance has been returned.

.. _g22zn-py2-py-notes:

**Notes**
optget can only be called on G22 handles.
It can be used to query the current values of options.

The option of interest is presented as a character string of the form ':math:\textit{option}'.

In cases where an option may have multiple instances in a particular G22 handle an instance identifier can be specified.
This is presented using the form ':math:\textit{option}:\textit{instance identifier}'.
In such cases, if the instance identifier is omitted, the value of the first instance is returned.
If the value of :math:\textit{option} is not the same for all instances and the instance identifier is omitted, a warning is raised.

Information relating to available option names, their corresponding valid values, whether the use of an instance identifier may be appropriate and what form it can take is given in the individual function documents.
"""
    raise NotImplementedError