naginterfaces.library.rand.subsamp_xyw

naginterfaces.library.rand.subsamp_xyw(nt, x, statecomm, sordx=1, y=None, w=None)

subsamp_xyw generates a dataset suitable for use with repeated random sub-sampling validation.

For full information please refer to the NAG Library document for g05pw

https://www.nag.com/numeric/nl/nagdoc_27.1/flhtml/g05/g05pwf.html

Parameters
nt : int

$n_t$, the number of observations in the training dataset.

x : float, ndarray, shape (:, :), modified in place

Note: the required extent for this argument in dimension 1 is determined as follows: if $\mathrm{sordx} = 2$: $m$; otherwise: $n$.

Note: the required extent for this argument in dimension 2 is determined as follows: if $\mathrm{sordx} = 1$: $m$; if $\mathrm{sordx} = 2$: $n$; otherwise: $0$.

The way the data is stored in x is defined by sordx.

If $\mathrm{sordx} = 1$, x[i-1, j-1] contains the $i$th observation for the $j$th variable, for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$.

If $\mathrm{sordx} = 2$, x[j-1, i-1] contains the $i$th observation for the $j$th variable, for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$.

On entry: x must hold $X_o$, the values of $X$ for the original dataset. This may be the same x as updated by a previous call to subsamp_xyw.

On exit: values of $X$ for the training and validation datasets, with $X_t$ held in observations $1$ to $\mathrm{nt}$ and $X_v$ in observations $\mathrm{nt} + 1$ to $n$.

statecomm : dict, RNG communication object, modified in place

RNG communication structure.

This argument must have been initialized by a prior call to init_repeat() or init_nonrepeat().

sordx : int, optional

Determines how variables are stored in x.

y : None or float, ndarray, shape (:), optional, modified in place

Note: the required length for this argument is determined as follows: if y is not None: $n$; otherwise: $0$.

Optionally, on entry: y must hold $y_o$, the values of $y$ for the original dataset. This may be the same y as updated by a previous call to subsamp_xyw.

On exit, if not None on entry: values of $y$ for the training and validation datasets, with $y_t$ held in elements $1$ to $\mathrm{nt}$ and $y_v$ in elements $\mathrm{nt} + 1$ to $n$.

w : None or float, ndarray, shape (:), optional, modified in place

Note: the required length for this argument is determined as follows: if w is not None: $n$; otherwise: $0$.

Optionally, on entry: w must hold $w_o$, the values of $w$ for the original dataset. This may be the same w as updated by a previous call to subsamp_xyw.

On exit, if not None on entry: values of $w$ for the training and validation datasets, with $w_t$ held in elements $1$ to $\mathrm{nt}$ and $w_v$ in elements $\mathrm{nt} + 1$ to $n$.
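The two storage layouts selected by sordx can be illustrated with plain nested lists. This is only a sketch: it assumes that sordx = 1 stores one observation per row and sordx = 2 stores one observation per column, and the helper name `observation` is hypothetical, not part of naginterfaces.

```python
def observation(x, i, sordx=1):
    """Return the i-th observation (1-based) as a list of its m variable values.

    Hypothetical helper for illustration only; not a naginterfaces function.
    """
    if sordx == 1:
        # Assumed layout: x[i-1][j-1], one observation per row.
        return list(x[i - 1])
    if sordx == 2:
        # Assumed layout: x[j-1][i-1], one observation per column.
        return [row[i - 1] for row in x]
    raise ValueError("sordx must be 1 or 2")


# The same n = 3 observations on m = 2 variables in both layouts.
by_rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # sordx = 1, extents (n, m)
by_cols = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]    # sordx = 2, extents (m, n)

second_from_rows = observation(by_rows, 2, sordx=1)
second_from_cols = observation(by_cols, 2, sordx=2)
```

Both calls pick out the same observation, which is the property the sub-sampling relies on when it moves whole observations between the training and validation parts.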

Raises
NagValueError
(errno )

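On entry, $\mathrm{nt} = \langle\mathit{value}\rangle$ and $n = \langle\mathit{value}\rangle$.

Constraint: $1 \leq \mathrm{nt} \leq n$.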

(errno )

On entry, $n = \langle\mathit{value}\rangle$.

Constraint: $n \geq 1$.

(errno )

On entry, $m = \langle\mathit{value}\rangle$.

Constraint: $m \geq 1$.

(errno )

On entry, $\mathrm{sordx} = \langle\mathit{value}\rangle$.

Constraint: $\mathrm{sordx} = 1$ or $2$.

(errno )

On entry, and .

Constraint: if , .

(errno )

On entry, and .

Constraint: if , .

(errno )

On entry, statecomm['state'] vector has been corrupted or not initialized.

Notes

Let $X_o$ denote a matrix of $n$ observations on $m$ variables and $y_o$ and $w_o$ each denote a vector of length $n$. For example, $X_o$ might represent a matrix of independent variables, $y_o$ the dependent variable and $w_o$ the associated weights in a weighted regression.

subsamp_xyw generates a series of training datasets, denoted by the matrix, vector, vector triplet $(X_t, y_t, w_t)$ of $n_t$ observations, and validation datasets, denoted $(X_v, y_v, w_v)$ with $n_v$ observations. These training and validation datasets are generated by randomly assigning each observation to either the training dataset or the validation dataset.
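The random assignment described above can be sketched in pure Python. This is not the NAG algorithm and will not reproduce its random stream; the function name `random_subsample`, the use of the standard `random` module and the range check on nt are illustrative assumptions. It mimics the documented behaviour of reordering the data in place, with the training observations first and the validation observations after them.

```python
import random


def random_subsample(x, nt, y=None, w=None, rng=None):
    """Reorder x (and y, w if given) in place: nt training rows, then validation.

    Hypothetical stand-in for illustration; not the NAG implementation.
    """
    if rng is None:
        rng = random.Random()
    n = len(x)
    if not 1 <= nt <= n:
        raise ValueError("nt must satisfy 1 <= nt <= n")
    # A single random permutation is applied to x, y and w alike, so every
    # observation keeps its own response and weight.
    order = list(range(n))
    rng.shuffle(order)
    for data in (x, y, w):
        if data is not None:
            data[:] = [data[i] for i in order]


# Five observations on two variables, one observation per row.
x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
nt = 3
random_subsample(x, nt, y=y, rng=random.Random(0))

x_train, x_valid = x[:nt], x[nt:]
y_train, y_valid = y[:nt], y[nt:]
```

After the call, the first nt observations form the training set and the remaining n - nt form the validation set, so the two parts can be obtained by slicing.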

The resulting datasets are suitable for use with repeated random sub-sampling validation.
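As a rough illustration of repeated random sub-sampling validation, the sketch below repeatedly splits a response vector, fits a trivial model (the training mean) and averages the validation error over the repeats. The function name, the mean-only model and the squared-error score are assumptions chosen for brevity, and the standard `random` module stands in for the NAG generator.

```python
import random
import statistics


def repeated_subsampling_mse(y, nt, n_repeats, seed=0):
    """Average validation MSE of a mean-only model over random splits.

    Hypothetical helper for illustration; not part of naginterfaces.
    """
    if not 1 <= nt < len(y):
        raise ValueError("need 1 <= nt < n so the validation set is non-empty")
    rng = random.Random(seed)
    data = list(y)  # work on a copy; it is reshuffled on every repeat
    scores = []
    for _ in range(n_repeats):
        # Re-splitting the output of the previous split is fine: each repeat
        # only needs a fresh random assignment of the observations.
        rng.shuffle(data)
        train, valid = data[:nt], data[nt:]
        prediction = statistics.fmean(train)
        scores.append(statistics.fmean((v - prediction) ** 2 for v in valid))
    return statistics.fmean(scores)


score = repeated_subsampling_mse([2.0, 4.0, 6.0, 8.0, 10.0, 12.0], nt=4, n_repeats=20)
constant_score = repeated_subsampling_mse([3.0] * 5, nt=3, n_repeats=4)
```

Unlike k-fold cross-validation, the validation sets of different repeats may overlap, and the number of repeats is chosen independently of the split proportion.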

One of the initialization functions init_repeat() (for a repeatable sequence if computed sequentially) or init_nonrepeat() (for a non-repeatable sequence) must be called prior to the first call to subsamp_xyw.
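A minimal calling sequence might look like the sketch below. It assumes that naginterfaces, NumPy and a valid NAG licence are available, that genid=1 with a single-element seed is an acceptable generator set-up, and that sordx=1 stores one observation per row; the wrapper name `nag_train_validation_split` is hypothetical. The imports sit inside the function so that merely defining it does not require the library.

```python
def nag_train_validation_split(x, y, nt, seed=42):
    """Split (x, y) into training and validation parts via subsamp_xyw.

    Hypothetical wrapper for illustration; requires naginterfaces and NumPy.
    """
    import numpy as np
    from naginterfaces.library import rand

    # Initialize a repeatable generator state (genid=1 is an assumed choice).
    statecomm = rand.init_repeat(genid=1, seed=[seed])

    # Copy to float arrays, since subsamp_xyw modifies x and y in place.
    x = np.array(x, dtype=float)
    y = np.array(y, dtype=float)

    # With sordx=1 each row of x is assumed to hold one observation.
    rand.subsamp_xyw(nt, x, statecomm, sordx=1, y=y)

    # Training data occupy the first nt observations, validation the rest.
    return (x[:nt], y[:nt]), (x[nt:], y[nt:])
```

Passing the same statecomm to further calls of subsamp_xyw, rather than reinitializing it, would continue the random sequence and yield a new split on each call.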