naginterfaces.library.correg.linregm_fit_onestep
- naginterfaces.library.correg.linregm_fit_onestep(istep, x, vname, isx, y, model, nterm, rss, idf, ifr, free, q, p, mean='M', wt=None, fin=2.0)
linregm_fit_onestep carries out one step of a forward selection procedure in order to enable the ‘best’ linear regression model to be found.
For full information please refer to the NAG Library document for g02ee
https://www.nag.com/numeric/nl/nagdoc_29/flhtml/g02/g02eef.html
- Parameters
- istep : int
Indicates which step in the forward selection process is to be carried out.
istep = 0: the process is initialized.
- x : float, array-like, shape (n, m)
x[i-1, j-1] must contain the $i$th observation for the $j$th independent variable, for $i = 1, \ldots, n$, for $j = 1, \ldots, m$.
- vname : str, array-like, shape (m)
vname[j-1] must contain the name of the independent variable in column $j$ of x, for $j = 1, \ldots, m$.
- isx : int, array-like, shape (m)
Indicates which independent variables could be considered for inclusion in the regression.
isx[j-1] ≥ 2: the variable contained in the $j$th column of x is automatically included in the regression model, for $j = 1, \ldots, m$.
isx[j-1] = 1: the variable contained in the $j$th column of x is considered for inclusion in the regression model, for $j = 1, \ldots, m$.
isx[j-1] = 0: the variable in the $j$th column is not considered for inclusion in the model, for $j = 1, \ldots, m$.
- y : float, array-like, shape (n)
The dependent variable.
- model : str, array-like, shape (maxip)
If istep = 0, model need not be set.
If istep ≠ 0, model must contain the values returned by the previous call to linregm_fit_onestep.
- nterm : int
If istep = 0, nterm need not be set.
If istep ≠ 0, nterm must contain the value returned by the previous call to linregm_fit_onestep.
- rss : float
If istep = 0, rss need not be set.
If istep ≠ 0, rss must contain the value returned by the previous call to linregm_fit_onestep.
- idf : int
If istep = 0, idf need not be set.
If istep ≠ 0, idf must contain the value returned by the previous call to linregm_fit_onestep.
- ifr : int
If istep = 0, ifr need not be set.
If istep ≠ 0, ifr must contain the value returned by the previous call to linregm_fit_onestep.
- free : str, array-like, shape (maxip)
If istep = 0, free need not be set.
If istep ≠ 0, free must contain the values returned by the previous call to linregm_fit_onestep.
- q : float, array-like, shape (n, maxip+2)
If istep = 0, q need not be set.
If istep ≠ 0, q must contain the values returned by the previous call to linregm_fit_onestep.
- p : float, array-like, shape (maxip+1)
If istep = 0, p need not be set.
If istep ≠ 0, p must contain the values returned by the previous call to linregm_fit_onestep.
- mean : str, length 1, optional
Indicates if a mean term is to be included.
mean = 'M': a mean term (intercept) will be included in the model.
mean = 'Z': the model will pass through the origin (zero-point).
- wt : None or float, array-like, shape (n), optional
If provided, wt must contain the weights to be used with the model.
If wt[i-1] = 0.0, the $i$th observation is not included in the model, in which case the effective number of observations is the number of observations with nonzero weights.
If wt is not provided, the effective number of observations is $n$.
- fin : float, optional
The critical value of the $F$ statistic for the term to be included in the model, $F_c$.
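The isx coding and the wt rule described in the parameter list can be checked before the first call. The following is a minimal sketch, assuming the g02ee convention that isx values of 2 or more force a variable in, 1 marks a candidate, and 0 excludes it. The helper names `classify_columns` and `effective_observations` are hypothetical and are not part of naginterfaces.

```python
def classify_columns(isx, vname):
    """Group variable names by their isx flag (assumed coding: >=2, 1, 0)."""
    if len(isx) != len(vname):
        raise ValueError("isx and vname must have the same length")
    groups = {"forced": [], "candidate": [], "excluded": []}
    for flag, name in zip(isx, vname):
        if flag < 0:
            raise ValueError("isx values must be non-negative")
        if flag >= 2:
            groups["forced"].append(name)      # always in the model
        elif flag == 1:
            groups["candidate"].append(name)   # free variable
        else:
            groups["excluded"].append(name)    # never considered
    return groups


def effective_observations(n, wt=None):
    """Count observations that contribute to the fit."""
    if wt is None:
        return n                               # unweighted: all n count
    if len(wt) != n:
        raise ValueError("wt must have length n")
    if any(w < 0.0 for w in wt):
        raise ValueError("weights must be non-negative")
    return sum(1 for w in wt if w != 0.0)      # zero weight drops the row


groups = classify_columns([2, 1, 0, 1], ["AGE", "WGT", "HGT", "BMI"])
n_eff = effective_observations(5, wt=[1.0, 0.0, 2.0, 0.5, 0.0])
print(groups)
print(n_eff)
```

A check of this kind can catch an input with no candidate variables, or with too few effective observations, before the library reports it.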
- Returns
- istep : int
Is incremented by 1.
- addvar : bool
Indicates if a variable has been added to the model.
addvar = True: a variable has been added to the model.
addvar = False: no variable had an $F$ value greater than $F_c$ and none were added to the model.
- newvar : str
If addvar = True, newvar contains the name of the variable added to the model.
- chrss : float
If addvar = True, chrss contains the change in the residual sum of squares due to adding variable newvar.
- f : float
If addvar = True, f contains the $F$ statistic for the inclusion of the variable in newvar.
- model : str, ndarray, shape (maxip)
The names of the variables in the current model.
- nterm : int
The number of independent variables in the current model, not including the mean, if any.
- rss : float
The residual sum of squares for the current model.
- idf : int
The degrees of freedom for the residual sum of squares for the current model.
- ifr : int
The number of free independent variables, i.e., the number of variables not in the model that are still being considered for selection.
- free : str, ndarray, shape (maxip)
The first ifr values of free contain the names of the free variables.
- exss : float, ndarray, shape (maxip)
The first ifr values of exss contain what would be the change in regression sum of squares if the free variables had been added to the model, i.e., the extra sum of squares for the free variables. exss[i-1] contains what would be the change in regression sum of squares if the variable free[i-1] had been added to the model.
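- q : float, ndarray, shape (n, maxip+2)
The results of the $QR$ decomposition for the current model:
the first column of q contains $c = Q^{\mathrm{T}} y$ (or $Q^{\mathrm{T}} W^{1/2} y$, where $W$ is the vector of weights if used);
the upper triangular part of columns $2$ to $p+1$ contains the $R$ matrix;
the strictly lower triangular part of columns $2$ to $p+1$ contains details of the $Q$ matrix;
the remaining ifr columns of q contain $Q^{\mathrm{T}} X_{\mathrm{free}}$ (or $Q^{\mathrm{T}} W^{1/2} X_{\mathrm{free}}$),
where $p$ = nterm, or $p$ = nterm + 1 if mean = 'M'.
- p : float, ndarray, shape (maxip+1)
The first $p$ elements of the array p contain details of the $QR$ decomposition, where $p$ = nterm, or $p$ = nterm + 1 if mean = 'M'.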
- Raises
- NagValueError
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, and .
Constraint: if , .
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, .
Constraint: or .
- (errno )
On entry, .
Constraint: or .
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, .
Constraint: , for .
- (errno )
On entry, number of forced variables $\geq n$.
- (errno )
Degrees of freedom for error will equal 0 if a new variable is added, i.e., the number of variables in the model plus 1 is equal to the effective number of observations.
- (errno )
On entry, .
Constraint: must be large enough to accommodate the number of terms given by .
- (errno )
On entry, .
Constraint: , for .
- (errno )
On entry, isx[i] = 0, for all $i$.
Constraint: at least one value of isx must be nonzero.
- Warns
- NagAlgorithmicWarning
- (errno )
On entry, the variables forced into the model are not of full rank, i.e., some of these variables are linear combinations of others.
- (errno )
There are no free variables, i.e., no element of isx is equal to 1.
- (errno )
The value of the change in the sum of squares is greater than the input value of rss. This may occur due to rounding errors if the true residual sum of squares for the new model is small relative to the residual sum of squares for the previous model.
- Notes
One method of selecting a linear regression model from a given set of independent variables is by forward selection. The following procedure is used:
(1) Select the best fitting independent variable, i.e., the independent variable which gives the smallest residual sum of squares. If the $F$-test for this variable is greater than a chosen critical value, $F_c$, then include the variable in the model, else stop.
(2) Find the independent variable that leads to the greatest reduction in the residual sum of squares when added to the current model.
(3) If the $F$-test for this variable is greater than a chosen critical value, $F_c$, then include the variable in the model and go to (2), otherwise stop.
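The procedure above can be sketched in pure Python. This is an illustration under stated assumptions, not the library's implementation: it always includes a mean term, has no forced variables or weights, orthogonalizes columns by Gram-Schmidt sweeps rather than Householder $QR$ updates, and takes the $F$ statistic to be the standard partial $F$ (extra sum of squares divided by the new residual mean square). The name `forward_select` is hypothetical.

```python
def forward_select(columns, y, fin=2.0):
    """Forward selection with a mean term; columns maps name -> values."""
    if fin < 0.0:
        raise ValueError("fin must be non-negative")
    n = len(y)

    def centre(v):
        mean = sum(v) / n
        return [a - mean for a in v]

    def dot(a, b):
        return sum(s * t for s, t in zip(a, b))

    resid = centre(y)                      # residuals after fitting the mean
    free = {name: centre(col) for name, col in columns.items()}
    rss, idf = dot(resid, resid), n - 1
    model = []
    while free and idf > 1:
        # Extra sum of squares each free variable would contribute.
        exss = {}
        for name, col in free.items():
            norm2 = dot(col, col)
            exss[name] = dot(col, resid) ** 2 / norm2 if norm2 > 1e-12 else 0.0
        best = max(exss, key=exss.get)
        new_rss, new_idf = rss - exss[best], idf - 1
        if new_rss <= 0.0:
            f_stat = float("inf")          # perfect fit
        else:
            f_stat = exss[best] / (new_rss / new_idf)
        if f_stat <= fin:
            break                          # no variable passes the F-test
        pivot = free.pop(best)
        norm2 = dot(pivot, pivot)
        coef = dot(pivot, resid) / norm2
        resid = [r - coef * v for r, v in zip(resid, pivot)]
        # Sweep the new variable out of the remaining free columns.
        for name, col in free.items():
            c = dot(pivot, col) / norm2
            free[name] = [a - c * v for a, v in zip(col, pivot)]
        model.append(best)
        rss, idf = new_rss, new_idf
    return model, rss, idf


columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0],
}
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8]
model, rss, idf = forward_select(columns, y, fin=2.0)
print(model, rss, idf)
```

In this made-up data set y is close to twice x1, while x2 is orthogonal to both, so only x1 passes the test at the default critical value.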
At any step the variables not in the model are known as the free terms.
linregm_fit_onestep allows you to specify some independent variables that must be in the model; these are known as forced variables.
The computational procedure involves the use of $QR$ decompositions, the $Q$ and the $R$ matrices being updated as each new variable is added to the model. In addition the matrix $Q^{\mathrm{T}} X_{\mathrm{free}}$, where $X_{\mathrm{free}}$ is the matrix of variables not included in the model, is updated.
linregm_fit_onestep computes one step of the forward selection procedure at a call. The results produced at each step may be printed or used as inputs to linregm_update(), in order to compute the regression coefficients for the model fitted at that step. Repeated calls to linregm_fit_onestep should be made until addvar = False is indicated.
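The repeated-call pattern can be sketched as a small driver that feeds each call's returned state into the next. This sketch makes several assumptions: `step` is a hypothetical callable that wraps linregm_fit_onestep with the fixed inputs (x, vname, isx, y and the optional arguments) already bound, and its result exposes the returned values as attributes named as in the Returns list. A stub stands in for the library call so the example runs without NAG installed.

```python
from types import SimpleNamespace

# Values that must be carried from one call to the next.
STATE_KEYS = ("istep", "model", "nterm", "rss", "idf", "ifr", "free", "q", "p")


def run_forward_selection(step, state, max_steps=100):
    """Call step(**state) until it reports that no variable was added."""
    state = dict(state)                    # do not mutate the caller's dict
    added = []
    for _ in range(max_steps):
        result = step(**state)
        for key in STATE_KEYS:             # thread outputs back in as inputs
            state[key] = getattr(result, key)
        if not result.addvar:
            break
        added.append(result.newvar)
    return added, state


def fake_step(**s):
    """Stub standing in for the wrapped library call (illustration only)."""
    pending = ["A", "B"]                   # variables the stub will 'add'
    add = s["istep"] < len(pending)
    out = dict(s)
    out.update(
        istep=s["istep"] + 1,
        addvar=add,
        newvar=pending[s["istep"]] if add else "",
        nterm=s["nterm"] + (1 if add else 0),
    )
    return SimpleNamespace(**out)


start = dict(istep=0, model=[], nterm=0, rss=0.0, idf=0, ifr=0,
             free=[], q=None, p=None)
added, final = run_forward_selection(fake_step, start)
print(added, final["istep"], final["nterm"])
```

Keeping the carried values in one dictionary makes it harder to pass a stale rss or q into the next step by accident.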
- References
Draper, N R and Smith, H, 1985, Applied Regression Analysis, (2nd Edition), Wiley
Weisberg, S, 1985, Applied Linear Regression, Wiley