naginterfaces.library.correg.ssqmat

naginterfaces.library.correg.ssqmat(x, mean='M', wt=None)

ssqmat calculates the sample means and sums of squares and cross-products, or sums of squares and cross-products of deviations from the mean, in a single pass for a set of data. The data may be weighted.

For full information please refer to the NAG Library document for g02bu:

https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g02/g02buf.html

Parameters
x : float, array-like, shape (n, m)

x[i-1, j-1] must contain the $i$th observation on the $j$th variable, for $i = 1, 2, \ldots, n$, for $j = 1, 2, \ldots, m$.

mean : str, length 1, optional

Indicates whether ssqmat is to calculate sums of squares and cross-products, or sums of squares and cross-products of deviations about the mean.

mean = 'M': The sums of squares and cross-products of deviations about the mean are calculated.

mean = 'Z': The sums of squares and cross-products are calculated.

wt : None or float, array-like, shape (n), optional

The optional weights of each observation. If weights are not provided then wt must be set to None, otherwise wt[i-1] must contain the weight for the $i$th observation.

Returns
sw : float

The sum of weights.

If wt is None, sw contains the number of observations, $n$.

wmean : float, ndarray, shape (m)

The sample means. wmean[j-1] contains the mean for the $j$th variable.

c : float, ndarray, shape ((m*m+m)/2)

The cross-products.

If mean = 'M', c contains the upper triangular part of the matrix of (weighted) sums of squares and cross-products of deviations about the mean.

If mean = 'Z', c contains the upper triangular part of the matrix of (weighted) sums of squares and cross-products.

These are stored packed by columns, i.e., the cross-product between the $j$th and $k$th variable, $k \ge j$, is stored in c[k*(k-1)/2 + j - 1].
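The returned quantities can be illustrated with a small NumPy reference computation. This is only a sketch of the definitions, not the NAG implementation: it makes two passes over the data, and the names reference_ssqmat and packed_index are hypothetical helpers introduced here. It assumes the usual column-packed layout, in which the element for variables $j \le k$ (1-based) sits at 0-based position $k(k-1)/2 + j - 1$.

```python
import numpy as np


def reference_ssqmat(x, mean="M", wt=None):
    """Two-pass reference for sw, wmean and the packed cross-products (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n, m = x.shape
    # Unweighted data are treated as unit weights, so sw equals n.
    w = np.ones(n) if wt is None else np.asarray(wt, dtype=float)
    sw = w.sum()
    wmean = w @ x / sw
    # 'M': deviations about the (weighted) mean; anything else: raw values.
    centred = x - wmean if mean == "M" else x
    full = (centred * w[:, None]).T @ centred
    # The matrix is symmetric, so reading its lower triangle row by row
    # yields the upper triangle packed by columns.
    c = full[np.tril_indices(m)]
    return sw, wmean, c


def packed_index(j, k):
    """0-based position in c of the cross-product of variables j and k (1-based)."""
    if j > k:
        j, k = k, j
    return k * (k - 1) // 2 + j - 1


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
sw, wmean, c = reference_ssqmat(x)
# Cross-product of deviations between variables 1 and 2.
c12 = c[packed_index(1, 2)]
```

For this three-observation example the means are 3 and 5, and c12 is the sum of products of deviations, (-2)(-3) + (0)(-1) + (2)(4) = 14.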

Raises
NagValueError
(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: or .

(errno )

On entry, .

Constraint: or .

(errno )

On entry, .

Constraint: , for .

Notes

ssqmat is an adaptation of West’s WV2 algorithm; see West (1979). This function calculates the (optionally weighted) sample means and (optionally weighted) sums of squares and cross-products or sums of squares and cross-products of deviations from the (weighted) mean for a sample of $n$ observations on $m$ variables $X_j$, for $j = 1, 2, \ldots, m$. The algorithm makes a single pass through the data.

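For the first $i-1$ observations let the mean of the $j$th variable be $\bar{x}_j(i-1)$, the cross-product about the mean for the $j$th and $k$th variables be $c_{jk}(i-1)$ and the sum of weights be $W_{i-1}$. These are updated by the $i$th observation, $x_{ij}$, for $j = 1, 2, \ldots, m$, with weight $w_i$ as follows:

$$W_i = W_{i-1} + w_i$$

$$\bar{x}_j(i) = \bar{x}_j(i-1) + \frac{w_i}{W_i}\left(x_{ij} - \bar{x}_j(i-1)\right), \quad j = 1, 2, \ldots, m$$

and

$$c_{jk}(i) = c_{jk}(i-1) + \frac{w_i}{W_i}\left(x_{ij} - \bar{x}_j(i-1)\right)\left(x_{ik} - \bar{x}_k(i-1)\right) W_{i-1}, \quad j = 1, 2, \ldots, m; \; k = j, j+1, \ldots, m.$$

The algorithm is initialized by taking $\bar{x}_j(1) = x_{1j}$, the first observation, and $c_{jk}(1) = 0.0$.

For the unweighted case $w_i = 1$ and $W_i = i$ for all $i$.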
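As a worked special case (a substitution written out here, not quoted from the NAG document), setting every weight to one, so that the running sum of weights after $i$ observations is $i$ and before it is $i-1$, reduces West's updates to the familiar unweighted recurrences:

```latex
\[
\bar{x}_j(i) = \bar{x}_j(i-1) + \frac{1}{i}\left(x_{ij} - \bar{x}_j(i-1)\right),
\]
\[
c_{jk}(i) = c_{jk}(i-1) + \frac{i-1}{i}\left(x_{ij} - \bar{x}_j(i-1)\right)\left(x_{ik} - \bar{x}_k(i-1)\right).
\]
```

With $i = 1$ the factor $(i-1)/i$ vanishes, which agrees with starting from the first observation as the mean and zero cross-products.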

Note that only the upper triangle of the matrix is calculated and returned packed by column.
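The single-pass update can be sketched in NumPy as follows. This is a minimal illustration of a West-style weighted update for the deviations-about-the-mean case only, not the NAG routine itself. The function name west_single_pass is hypothetical, and skipping zero-weight observations is an assumption made here to avoid dividing by a zero running weight.

```python
import numpy as np


def west_single_pass(x, wt=None):
    """One-pass weighted means and cross-products of deviations (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n, m = x.shape
    w = np.ones(n) if wt is None else np.asarray(wt, dtype=float)

    sw = 0.0                # running sum of weights
    mean = np.zeros(m)      # running weighted means
    c = np.zeros((m, m))    # running cross-products about the mean

    for xi, wi in zip(x, w):
        if wi == 0.0:
            continue        # assumption: zero-weight rows contribute nothing
        sw_prev = sw
        sw += wi
        d = xi - mean                       # deviation from the previous mean
        mean += (wi / sw) * d               # mean update
        c += (wi * sw_prev / sw) * np.outer(d, d)  # cross-product update

    # c is symmetric, so its row-ordered lower triangle is the upper
    # triangle packed by columns.
    return sw, mean, c[np.tril_indices(m)]


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
sw, mean, c = west_single_pass(x)
```

Because sw_prev is zero for the first observation with non-zero weight, that observation sets the mean and leaves the cross-products at zero, which matches the initialization described above. The sketch keeps the full symmetric matrix for clarity, whereas a production routine would update only the upper triangle.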

References

Chan, T F, Golub, G H and Leveque, R J, 1982, Updating Formulae and a Pairwise Algorithm for Computing Sample Variances, Compstat, Physica-Verlag

West, D H D, 1979, Updating mean and variance estimates: An improved method, Comm. ACM (22), 532–535