hide long namesshow long names
hide short namesshow short names
Integer type:  int32  int64  nag_int  show int32  show int32  show int64  show int64  show nag_int  show nag_int

PDF version (NAG web site, 64-bit version, 64-bit version)
Chapter Contents
Chapter Introduction
NAG Toolbox

NAG Toolbox: nag_correg_robustm_corr_user (g02hm)

Purpose

nag_correg_robustm_corr_user (g02hm) computes a robust estimate of the covariance matrix for user-supplied weight functions. The derivatives of the weight functions are not required.

Syntax

[user, cov, a, wt, theta, nit, ifail] = g02hm(ucv, indm, x, a, theta, 'user', user, 'n', n, 'm', m, 'bl', bl, 'bd', bd, 'maxit', maxit, 'nitmon', nitmon, 'tol', tol)
[user, cov, a, wt, theta, nit, ifail] = nag_correg_robustm_corr_user(ucv, indm, x, a, theta, 'user', user, 'n', n, 'm', m, 'bl', bl, 'bd', bd, 'maxit', maxit, 'nitmon', nitmon, 'tol', tol)
Note: the interface to this routine has changed since earlier releases of the toolbox:
Mark 22: n has been made optional
Mark 23: nitmon, tol now optional
.

Description

For a set of n n  observations on m m  variables in a matrix X X , a robust estimate of the covariance matrix, C C , and a robust estimate of location, θ θ , are given by
C = τ2(ATA)1,
C=τ2 (ATA) -1,
where τ2 τ2  is a correction factor and A A  is a lower triangular matrix found as the solution to the following equations.
zi = A(xiθ)
zi=A (xi-θ)
n
1/nw(zi2)zi = 0
i = 1
1n i= 1nw ( zi 2) zi=0
and
n
1/nu(zi2)ziziTv(zi2)I = 0,
i = 1
1ni=1nu ( zi 2) zi ziT -v ( zi 2) I=0,
where xi xi  is a vector of length m m  containing the elements of the i i th row of X X ,
zi zi  is a vector of length m m ,
I I  is the identity matrix and 0 0  is the zero matrix.
and w w  and u u  are suitable functions.
nag_correg_robustm_corr_user (g02hm) covers two situations:
(i) v(t) = 1 v (t) =1  for all t t ,
(ii) v(t) = u(t) v (t) =u (t) .
The robust covariance matrix may be calculated from a weighted sum of squares and cross-products matrix about θ θ  using weights wti = u(zi) wti=u ( zi ) . In case (i) a divisor of n n  is used and in case (ii) a divisor of i = 1nwti i=1nwti  is used. If w( . ) = sqrt(u( . )) w (.) =u (.) , then the robust covariance matrix can be calculated by scaling each row of X X  by sqrt(wti) wti  and calculating an unweighted covariance matrix about θ θ .
In order to make the estimate asymptotically unbiased under a Normal model a correction factor, τ2 τ2 , is needed. The value of the correction factor will depend on the functions employed (see Huber (1981) and Marazzi (1987)).
nag_correg_robustm_corr_user (g02hm) finds A A  using the iterative procedure as given by Huber; see Huber (1981).
Ak = (Sk + I)Ak1
Ak= (Sk+I) Ak-1
and
θjk = (bj)/(D1) + θjk 1,
θjk=bjD1+θjk- 1,
where Sk = (sjl) Sk= (sjl) , for j = 1,2,,m j=1,2,,m  and l = 1,2,,m l=1,2,,m  is a lower triangular matrix such that
sjl =
{ − min [max (hjl / D2, − BL),BL], j > l − min [max ((1/2)(hjj / D2 − 1), − BD),BD], j = l
,
sjl= { -min [max (hjl/D2,-BL) ,BL] , j>l -min [max (12 (hjj/D2-1) ,-BD) ,BD] , j=l ,
where and BD BD  and BL BL  are suitable bounds.
The value of τ τ  may be chosen so that C C  is unbiased if the observations are from a given distribution.
nag_correg_robustm_corr_user (g02hm) is based on routines in ROBETH; see Marazzi (1987).

References

Huber P J (1981) Robust Statistics Wiley
Marazzi A (1987) Weights for bounded influence regression in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 3 Institut Universitaire de Médecine Sociale et Préventive, Lausanne

Parameters

Compulsory Input Parameters

1:     ucv – function handle or string containing name of m-file
ucv must return the values of the functions u u  and w w  for a given value of its argument.
[user, u, w] = ucv(t, user)

Input Parameters

1:     t – double scalar
The argument for which the functions u u  and w w  must be evaluated.
2:     user – Any MATLAB object
ucv is called from nag_correg_robustm_corr_user (g02hm) with the object supplied to nag_correg_robustm_corr_user (g02hm).

Output Parameters

1:     user – Any MATLAB object
2:     u – double scalar
The value of the u u function at the point t.
3:     w – double scalar
The value of the w w function at the point t.
2:     indm – int64int32nag_int scalar
Indicates which form of the function v v  will be used.
indm = 1 indm=1
v = 1 v=1 .
indm1 indm1
v = u v=u .
3:     x(ldx,m) – double array
ldx, the first dimension of the array, must satisfy the constraint ldxn ldxn .
x(i,j) x (i,j)  must contain the i i th observation on the j j th variable, for i = 1,2,,n i=1,2,,n  and j = 1,2,,m j=1,2,,m .
4:     a(m × (m + 1) / 2 m× (m+1) /2 ) – double array
An initial estimate of the lower triangular real matrix A A . Only the lower triangular elements must be given and these should be stored row-wise in the array.
The diagonal elements must be 0 0 , and in practice will usually be > 0 >0 . If the magnitudes of the columns of X X  are of the same order, the identity matrix will often provide a suitable initial value for A A . If the columns of X X  are of different magnitudes, the diagonal elements of the initial value of A A  should be approximately inversely proportional to the magnitude of the columns of X X .
Constraint: a(j × (j1) / 2 + j)0.0 a (j× (j-1) /2+j) 0.0 , for j = 1,2,,m j=1,2,,m .
5:     theta(m) – double array
m, the dimension of the array, must satisfy the constraint 1mn 1mn .
An initial estimate of the location parameter, θj θj , for j = 1,2,,m j=1,2,,m .
In many cases an initial estimate of θj = 0 θj=0 , for j = 1,2,,m j=1,2,,m , will be adequate. Alternatively medians may be used as given by nag_univar_robust_1var_median (g07da).

Optional Input Parameters

1:     user – Any MATLAB object
user is not used by nag_correg_robustm_corr_user (g02hm), but is passed to ucv. Note that for large objects it may be more efficient to use a global variable which is accessible from the m-files than to use user.
2:     n – int64int32nag_int scalar
Default: The first dimension of the array x.
n n , the number of observations.
Constraint: n > 1 n>1 .
3:     m – int64int32nag_int scalar
Default: The dimension of the array theta and the second dimension of the array x. (An error is raised if these dimensions are not equal.)
m m , the number of columns of the matrix X X , i.e., number of independent variables.
Constraint: 1mn 1mn .
4:     bl – double scalar
The magnitude of the bound for the off-diagonal elements of Sk Sk , BL BL .
Default: 0.9 0.9
Constraint: bl > 0.0 bl>0.0 .
5:     bd – double scalar
The magnitude of the bound for the diagonal elements of Sk Sk , BD BD .
Default: 0.9 0.9
Constraint: bd > 0.0 bd>0.0 .
6:     maxit – int64int32nag_int scalar
The maximum number of iterations that will be used during the calculation of A A .
Default: 150 150
Constraint: maxit > 0 maxit>0 .
7:     nitmon – int64int32nag_int scalar
Indicates the amount of information on the iteration that is printed.
nitmon > 0 nitmon>0
The value of A A , θ θ  and δ δ  (see Section [Accuracy]) will be printed at the first and every nitmon iterations.
nitmon0 nitmon0
No iteration monitoring is printed.
When printing occurs the output is directed to the current advisory message channel (See nag_file_set_unit_advisory (x04ab).)
Default: 0 0
8:     tol – double scalar
The relative precision for the final estimate of the covariance matrix. Iteration will stop when maximum δ δ  (see Section [Accuracy]) is less than tol.
Default: 5e-5 5e-5
Constraint: tol > 0.0 tol>0.0 .

Input Parameters Omitted from the MATLAB Interface

ruser ldx wk

Output Parameters

1:     user – Any MATLAB object
2:     cov(m × (m + 1) / 2 m× (m+1) /2 ) – double array
A robust estimate of the covariance matrix, C C . The upper triangular part of the matrix C C is stored packed by columns (lower triangular stored by rows), that is Cij Cij is returned in cov(j × (j1) / 2 + i) cov (j× (j-1) /2+i) , ij ij .
3:     a(m × (m + 1) / 2 m× (m+1) /2 ) – double array
The lower triangular elements of the inverse of the matrix A A , stored row-wise.
4:     wt(n) – double array
wt(i) wt (i) contains the weights, wti = u(zi2) wti=u ( zi 2) , for i = 1,2,,n i=1,2,,n .
5:     theta(m) – double array
Contains the robust estimate of the location parameter, θj θj , for j = 1,2,,m j=1,2,,m .
6:     nit – int64int32nag_int scalar
The number of iterations performed.
7:     ifail – int64int32nag_int scalar
ifail = 0 ifail=0 unless the function detects an error (see [Error Indicators and Warnings]).

Error Indicators and Warnings

Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

  ifail = 1 ifail=1
On entry,n1 n1 ,
orm < 1 m<1 ,
orn < m n<m ,
orldx < n ldx<n .
  ifail = 2 ifail=2
On entry,tol0.0 tol0.0 ,
ormaxit0 maxit0 ,
ordiagonal element of a = 0.0 a=0.0 ,
orbl0.0 bl0.0 ,
orbd0.0 bd0.0 .
  ifail = 3 ifail=3
A column of x has a constant value.
  ifail = 4 ifail=4
Value of u or w returned by ucv < 0 ucv<0 .
  ifail = 5 ifail=5
The function has failed to converge in maxit iterations.
W ifail = 6 ifail=6
Either the sum D1 D1  or the sum D2 D2  is zero. This may be caused by the functions u u  or w w  being too strict for the current estimate of A A  (or C C ). You should either try a larger initial estimate of A A  or make the u u  and w w  functions less strict.

Accuracy

On successful exit the accuracy of the results is related to the value of tol; see Section [Parameters]. At an iteration let
(i) d1 = d1=  the maximum value of |sjl| |sjl|
(ii) d2 = d2=  the maximum absolute change in wt(i) wt (i)
(iii) d3 = d3=  the maximum absolute relative change in θj θj
and let δ = max (d1,d2,d3) δ=max (d1,d2,d3) . Then the iterative procedure is assumed to have converged when δ < tol δ<tol

Further Comments

The existence of A A  will depend upon the function u u  (see Marazzi (1987)); also if X X  is not of full rank a value of A A  will not be found. If the columns of X X  are almost linearly related, then convergence will be slow.
If derivatives of the u u  and w w  functions are available then the method used in nag_correg_robustm_corr_user_deriv (g02hl) will usually give much faster convergence.

Example

function nag_correg_robustm_corr_user_example
indm = int64(1);
x = [3.4, 6.9, 12.2;
     6.4, 2.5, 15.1;
     4.9, 5.5, 14.2;
     7.3, 1.9, 18.2;
     8.8, 3.6, 11.7;
     8.4, 1.3, 17.9;
     5.3, 3.1, 15;
     2.7, 8.1, 7.7;
     6.1, 3, 21.9;
     5.3, 2.2, 13.9];
a = [1;
     0;
     1;
     0;
     0;
     1];
theta = [0;
     0;
     0];
user = [4, 2];
[user, covar, aOut, wt, thetaOut, nit, ifail] = ...
    nag_correg_robustm_corr_user(@ucv, indm, x, a, theta, 'user', user)

function [userp, u, w] = ucv(t, userp)
  cu = userp(1);
  u = 1.0;
  if (t ~= 0)
    t2 = t*t;
    if (t2 > cu)
      u = cu/t2;
    end
  end
  % w function and derivative
  cw = userp(2);
  if (t > cw)
    w = cw/t;
  else
    w = 1.0;
  end
 

user =

     4     2


covar =

    3.2779
   -3.6918
    5.2841
    4.7391
   -6.4087
   11.8373


aOut =

    0.5523
    1.0614
    0.9424
   -0.1880
    0.4776
    0.5021


wt =

    1.0000
    1.0000
    1.0000
    1.0000
    0.2339
    1.0000
    1.0000
    0.9385
    0.4012
    0.7579


thetaOut =

    5.6998
    3.8636
   14.7036


nit =

                   34


ifail =

                    0


function g02hm_example
indm = int64(1);
x = [3.4, 6.9, 12.2;
     6.4, 2.5, 15.1;
     4.9, 5.5, 14.2;
     7.3, 1.9, 18.2;
     8.8, 3.6, 11.7;
     8.4, 1.3, 17.9;
     5.3, 3.1, 15;
     2.7, 8.1, 7.7;
     6.1, 3, 21.9;
     5.3, 2.2, 13.9];
a = [1;
     0;
     1;
     0;
     0;
     1];
theta = [0;
     0;
     0];
user = [4, 2];
[user, covar, aOut, wt, thetaOut, nit, ifail] = ...
    g02hm(@ucv, indm, x, a, theta, 'user', user)

function [userp, u, w] = ucv(t, userp)
  cu = userp(1);
  u = 1.0;
  if (t ~= 0)
    t2 = t*t;
    if (t2 > cu)
      u = cu/t2;
    end
  end
  % w function and derivative
  cw = userp(2);
  if (t > cw)
    w = cw/t;
  else
    w = 1.0;
  end
 

user =

     4     2


covar =

    3.2779
   -3.6918
    5.2841
    4.7391
   -6.4087
   11.8373


aOut =

    0.5523
    1.0614
    0.9424
   -0.1880
    0.4776
    0.5021


wt =

    1.0000
    1.0000
    1.0000
    1.0000
    0.2339
    1.0000
    1.0000
    0.9385
    0.4012
    0.7579


thetaOut =

    5.6998
    3.8636
   14.7036


nit =

                   34


ifail =

                    0



PDF version (NAG web site, 64-bit version, 64-bit version)
Chapter Contents
Chapter Introduction
NAG Toolbox

© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013