naginterfaces.library.smooth.fit_spline_parest

naginterfaces.library.smooth.fit_spline_parest(method, x, y, crit, wt=None, u=0.0, tol=0.0, maxcal=0)

fit_spline_parest estimates the values of the smoothing parameter and fits a cubic smoothing spline to a set of data.

For full information please refer to the NAG Library document for g10ac

https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g10/g10acf.html

Parameters
method : str, length 1

Indicates whether the smoothing parameter is to be found by minimization of the CV or GCV functions, or by finding the smoothing parameter corresponding to a specified degrees of freedom value.

$\mathrm{method} = \texttt{'C'}$
Cross-validation is used.

$\mathrm{method} = \texttt{'D'}$
The degrees of freedom are specified.

$\mathrm{method} = \texttt{'G'}$
Generalized cross-validation is used.

x : float, array-like, shape $(n)$

The distinct and ordered values $x_i$, for $i = 1, 2, \ldots, n$.

y : float, array-like, shape $(n)$

The values $y_i$, for $i = 1, 2, \ldots, n$.

crit : float

If $\mathrm{method} = \texttt{'D'}$, the required degrees of freedom for the spline.

If $\mathrm{method} = \texttt{'C'}$ or $\texttt{'G'}$, $\mathrm{crit}$ need not be set.

wt : None or float, array-like, shape $(n)$, optional

If $\mathrm{wt}$ is not None, $\mathrm{wt}$ must contain the $n$ weights. Otherwise $\mathrm{wt}$ is not referenced and unit weights are assumed.

u : float, optional

The upper bound on the smoothing parameter. If $\mathrm{u} \le \mathrm{tol}$, $\mathrm{u} = 1000.0$ will be used instead. See Further Comments in the NAG Library document for g10ac for details on how this argument is used.

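tol : float, optional

The accuracy to which the smoothing parameter $\mathrm{rho}$ is required. $\mathrm{tol}$ should preferably be not much less than $\sqrt{\epsilon}$, where $\epsilon$ is the machine precision. If $\mathrm{tol} < \epsilon$, $\mathrm{tol} = \sqrt{\epsilon}$ will be used instead.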

maxcal : int, optional

The maximum number of spline evaluations to be used in finding the value of $\rho$. If $\mathrm{maxcal} < 3$, $\mathrm{maxcal} = 100$ will be used instead.

Returns
yhat : float, ndarray, shape $(n)$

The fitted values, $\hat{y}_i$, for $i = 1, 2, \ldots, n$.

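c : float, ndarray, shape $(n-1, 3)$

The spline coefficients. More precisely, the value of the spline approximation at $t$ is given by $((\mathrm{c}[i-1,2] \times d + \mathrm{c}[i-1,1]) \times d + \mathrm{c}[i-1,0]) \times d + \hat{y}_i$, where $x_i \le t < x_{i+1}$ and $d = t - x_i$.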
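The evaluation rule for c can be sketched in plain Python: on the interval starting at knot $x_i$, the spline is $\hat{y}_i$ plus a cubic in $d = t - x_i$ whose linear, quadratic and cubic coefficients are read from one row of c. The helper name eval_spline is hypothetical and not part of naginterfaces; the sketch assumes x, yhat and c are laid out as described on this page, with 0-based rows.

```python
from bisect import bisect_right


def eval_spline(t, x, yhat, c):
    """Evaluate the fitted cubic spline at t, for x[0] <= t <= x[-1].

    x    : strictly increasing knots (length n)
    yhat : fitted values at the knots (length n)
    c    : n-1 rows of three coefficients, one row per interval
    """
    if not x[0] <= t <= x[-1]:
        raise ValueError("t lies outside the range of the data")
    # Find i with x[i] <= t < x[i+1]; clamp so t == x[-1] uses the last interval.
    i = min(bisect_right(x, t) - 1, len(x) - 2)
    d = t - x[i]
    # Horner form of yhat[i] + c[i][0]*d + c[i][1]*d**2 + c[i][2]*d**3.
    return ((c[i][2] * d + c[i][1]) * d + c[i][0]) * d + yhat[i]
```

The helper refuses to extrapolate, because the rule is only stated for $t$ inside an interval between two knots.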

rss : float

The (weighted) residual sum of squares.

df : float

The residual degrees of freedom. If $\mathrm{method} = \texttt{'D'}$ this will be $n - \mathrm{crit}$ to the required accuracy.

res : float, ndarray, shape $(n)$

The (weighted) residuals, $r_i$, for $i = 1, 2, \ldots, n$.

h : float, ndarray, shape $(n)$

The leverages, $h_{ii}$, for $i = 1, 2, \ldots, n$.

crit : float

If $\mathrm{method} = \texttt{'C'}$, the value of the cross-validation, or if $\mathrm{method} = \texttt{'G'}$, the value of the generalized cross-validation function, evaluated at the value of $\rho$ returned in $\mathrm{rho}$.

rho : float

The smoothing parameter, $\rho$.

Raises
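NagValueError
(errno 1)

On entry, $\mathrm{crit} = \langle\mathit{value}\rangle$.

Constraint: if $\mathrm{method} = \texttt{'D'}$, $\mathrm{crit} > 2.0$.

(errno 1)

On entry, $\mathrm{crit} = \langle\mathit{value}\rangle$ and $n = \langle\mathit{value}\rangle$.

Constraint: if $\mathrm{method} = \texttt{'D'}$, $\mathrm{crit} \le n$.

(errno 1)

On entry, $\mathrm{method}$ is not valid: $\mathrm{method} = \langle\mathit{value}\rangle$.

(errno 1)

On entry, $n = \langle\mathit{value}\rangle$.

Constraint: $n \ge 3$.

(errno 2)

On entry, at least one element of $\mathrm{wt} \le 0.0$.

(errno 3)

On entry, $\mathrm{x}$ is not a strictly ordered array.

(errno 4)

For the specified degrees of freedom, $\mathrm{rho} > \mathrm{u}$: $\mathrm{u} = \langle\mathit{value}\rangle$.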

Warns
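NagAlgorithmicWarning
(errno 5)

Accuracy of $\mathrm{tol}$ cannot be achieved: $\mathrm{tol} = \langle\mathit{value}\rangle$.

(errno 6)

$\mathrm{maxcal}$ iterations have been performed.

(errno 7)

Optimum value of $\mathrm{rho}$ lies above $\mathrm{u}$: $\mathrm{u} = \langle\mathit{value}\rangle$.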

Notes

In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

For a set of $n$ observations $(x_i, y_i)$, for $i = 1, 2, \ldots, n$, the spline provides a flexible smooth function for situations in which a simple polynomial or nonlinear regression model is not suitable.

Cubic smoothing splines arise as the unique real-valued solution function $f$, with absolutely continuous first derivative and squared-integrable second derivative, which minimizes

$$\sum_{i=1}^{n} w_i \left(y_i - f(x_i)\right)^2 + \rho \int_{-\infty}^{\infty} \left(f''(x)\right)^2 \, dx,$$

where $w_i$ is the (optional) weight for the $i$th observation and $\rho$ is the smoothing parameter. This criterion consists of two parts: the first measures the fit of the curve and the second the smoothness of the curve. The value of the smoothing parameter $\rho$ weights these two aspects; larger values of $\rho$ give a smoother fitted curve but, in general, a poorer fit. For details of how the cubic spline can be fitted see Hutchinson and de Hoog (1985) and Reinsch (1967).

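The fitted values, $\hat{y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n)^{\mathrm{T}}$, and weighted residuals, $r_i$, can be written as:

$$\hat{y} = H y \quad \text{and} \quad r_i = \sqrt{w_i}\,(y_i - \hat{y}_i)$$

for a matrix $H$. The residual degrees of freedom for the spline is $\mathrm{trace}(I - H)$ and the diagonal elements of $H$ are the leverages $h_{ii}$.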

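The parameter $\rho$ can be estimated in a number of ways.

  1. The degrees of freedom for the spline can be specified, i.e., find $\rho$ such that $\mathrm{trace}(H) = \nu_0$ for given $\nu_0$.

  2. Minimize the cross-validation (CV), i.e., find $\rho$ such that the CV is minimized, where

     $$\mathrm{CV} = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} \left[\frac{r_i}{1 - h_{ii}}\right]^2.$$

  3. Minimize the generalized cross-validation (GCV), i.e., find $\rho$ such that the GCV is minimized, where

     $$\mathrm{GCV} = n^2 \left[\frac{\sum_{i=1}^{n} r_i^2}{\left(\sum_{i=1}^{n} (1 - h_{ii})\right)^2}\right].$$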
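The two criteria can be recomputed from the returned residuals and leverages, which is a convenient sanity check on a fit. The sketch below assumes the definitions CV $= \frac{1}{\sum_i w_i} \sum_i [r_i / (1 - h_{ii})]^2$ and GCV $= n^2 \sum_i r_i^2 / (\sum_i (1 - h_{ii}))^2$, with $r_i$ the weighted residuals and $h_{ii}$ the leverages. The function names cv_score and gcv_score are hypothetical and not part of naginterfaces.

```python
def cv_score(res, h, wt=None):
    """Cross-validation score from weighted residuals and leverages.

    With wt=None, unit weights are assumed, so the weights sum to n.
    """
    total_weight = float(len(res)) if wt is None else float(sum(wt))
    return sum((r / (1.0 - hii)) ** 2 for r, hii in zip(res, h)) / total_weight


def gcv_score(res, h):
    """Generalized cross-validation score from the same two arrays."""
    n = len(res)
    rss = sum(r * r for r in res)            # (weighted) residual sum of squares
    resid_df = sum(1.0 - hii for hii in h)   # equals trace(I - H)
    return n * n * rss / resid_df ** 2
```

Under these assumptions, passing the res and h arrays from a fit made with method 'C' or 'G' should reproduce the returned crit value up to rounding.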

fit_spline_parest requires the $x_i$ to be strictly increasing. If two or more observations have the same $x_i$ value then they should be replaced by a single observation with $y_i$ equal to the (weighted) mean of the $y$ values and weight, $w_i$, equal to the sum of the weights. This operation can be performed by data_order().
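The arithmetic of that replacement can be sketched in plain Python: observations sharing an x value collapse to one point whose y is the weighted mean and whose weight is the sum of the weights. The helper merge_ties is a hypothetical stand-in used only to show the calculation; it is not the library's data_order() and makes no claim about that routine's interface.

```python
def merge_ties(x, y, wt=None):
    """Collapse repeated x values into single weighted observations.

    Returns (xs, ys, ws) with xs strictly increasing. Ties are detected by
    exact floating-point equality, so nearly equal x values are kept apart.
    """
    if wt is None:
        wt = [1.0] * len(x)  # unit weights
    sum_w = {}   # total weight per distinct x
    sum_wy = {}  # weighted sum of y per distinct x
    for xi, yi, wi in zip(x, y, wt):
        sum_w[xi] = sum_w.get(xi, 0.0) + wi
        sum_wy[xi] = sum_wy.get(xi, 0.0) + wi * yi
    xs = sorted(sum_w)
    ys = [sum_wy[xi] / sum_w[xi] for xi in xs]
    ws = [sum_w[xi] for xi in xs]
    return xs, ys, ws
```

Because merged points carry weights other than one, the resulting ws would then be passed as wt even when the original data were unweighted.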

The algorithm is based on Hutchinson (1986). roots.contfn_brent_rcomm is used to solve for $\rho$ given $\nu_0$ and the method of opt.one_var_func is used to minimize the GCV or CV.
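A minimal calling sketch, using the signature shown at the top of this page, may help show how the three method codes differ in practice. The sample data and the wrapper name fit_three_ways are hypothetical. The sketch assumes a licensed naginterfaces installation and assumes the return value exposes the documented names (rho, df, rss and so on) as attributes; the import is deferred into the function so that the data preparation can be run without the library.

```python
import math

# Hypothetical sample data on a strictly increasing grid.
x = [0.1 * i for i in range(30)]
y = [math.sin(xi) + 0.05 * (-1) ** i for i, xi in enumerate(x)]


def fit_three_ways(x, y, target_df=5.0):
    """Fit the same data with each method code.

    Requires naginterfaces and a valid licence (assumption: results are
    accessible by the attribute names listed under Returns).
    """
    from naginterfaces.library import smooth  # deferred import

    results = {}
    for method in ("C", "G", "D"):
        # crit is a required argument, but it is only read when method is 'D'.
        crit = target_df if method == "D" else 0.0
        fit = smooth.fit_spline_parest(method, x, y, crit)
        results[method] = (fit.rho, fit.df, fit.rss)
    return results
```

Leaving u, tol and maxcal at their defaults of zero asks the routine to substitute its own values, as described under Parameters.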

References

Hastie, T J and Tibshirani, R J, 1990, Generalized Additive Models, Chapman and Hall

Hutchinson, M F, 1986, Algorithm 642: A fast procedure for calculating minimum cross-validation cubic smoothing splines, ACM Trans. Math. Software (12), 150–153

Hutchinson, M F and de Hoog, F R, 1985, Smoothing noisy data with spline functions, Numer. Math. (47), 99–106

Reinsch, C H, 1967, Smoothing by spline functions, Numer. Math. (10), 177–183