nag_robust_m_estim_1var_usr (g07dcc) (PDF version)
g07 Chapter Contents
g07 Chapter Introduction
NAG C Library Manual

NAG Library Function Document

nag_robust_m_estim_1var_usr (g07dcc)

+ Contents

    1  Purpose
    7  Accuracy

1  Purpose

nag_robust_m_estim_1var_usr (g07dcc) computes an M-estimate of location with (optional) simultaneous estimation of scale, where you provide the weight functions.

2  Specification

#include <nag.h>
#include <nagg07.h>
void  nag_robust_m_estim_1var_usr (
double (*chi)(double t, Nag_Comm *comm),
double (*psi)(double t, Nag_Comm *comm),
Integer isigma, Integer n, const double x[], double beta, double *theta, double *sigma, Integer maxit, double tol, double rs[], Integer *nit, Nag_Comm *comm, NagError *fail)

3  Description

The data consists of a sample of size n, denoted by x1,x2,,xn, drawn from a random variable X.
The xi are assumed to be independent with an unknown distribution function of the form,
Fxi-θ/σ
where θ is a location argument, and σ is a scale argument. M-estimators of θ and σ are given by the solution to the following system of equations;
i=1nψxi-θ^/σ^ = 0 i=1nχxi-θ^/σ^ = n-1β
where ψ and χ are user-supplied weight functions, and β is a constant. Optionally the second equation can be omitted and the first equation is solved for θ^ using an assigned value of σ=σc.
The constant β should be chosen so that σ^ is an unbiased estimator when xi, for i=1,2,,n has a Normal distribution. To achieve this the value of β is calculated as:
β=Eχ=-χz12πexp-z22dz
The values of ψ xi-θ^σ^ σ^ are known as the Winsorized residuals.
The equations are solved by a simple iterative procedure, suggested by Huber:
σ^k=1βn-1 i=1nχ xi-θ^k-1σ^k-1 σ^k-12
and
θ^k=θ^k- 1+1n i= 1nψ xi-θ^k- 1σ^k σ^k
or
σ^k=σc
if σ is fixed.
The initial values for θ^ and σ^ may be user-supplied or calculated within nag_robust_m_estim_1var (g07dbc) as the sample median and an estimate of σ based on the median absolute deviation respectively.
nag_robust_m_estim_1var_usr (g07dcc) is based upon function LYHALG within the ROBETH library, see Marazzi (1987).

4  References

Hampel F R, Ronchetti E M, Rousseeuw P J and Stahel W A (1986) Robust Statistics. The Approach Based on Influence Functions Wiley
Huber P J (1981) Robust Statistics Wiley
Marazzi A (1987) Subroutines for robust estimation of location and scale in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 1 Institut Universitaire de Médecine Sociale et Préventive, Lausanne

5  Arguments

1:     chifunction, supplied by the userExternal Function
chi must return the value of the weight function χ for a given value of its argument. The value of χ must be non-negative.
The specification of chi is:
double  chi (double t, Nag_Comm *comm)
1:     tdoubleInput
On entry: the argument for which chi must be evaluated.
2:     commNag_Comm *
Pointer to structure of type Nag_Comm; the following members are relevant to chi.
userdouble *
iuserInteger *
pPointer 
The type Pointer will be void *. Before calling nag_robust_m_estim_1var_usr (g07dcc) you may allocate memory and initialize these pointers with various quantities for use by chi when called from nag_robust_m_estim_1var_usr (g07dcc) (see Section 3.2.1 in the Essential Introduction).
2:     psifunction, supplied by the userExternal Function
psi must return the value of the weight function ψ for a given value of its argument.
The specification of psi is:
double  psi (double t, Nag_Comm *comm)
1:     tdoubleInput
On entry: the argument for which psi must be evaluated.
2:     commNag_Comm *
Pointer to structure of type Nag_Comm; the following members are relevant to psi.
userdouble *
iuserInteger *
pPointer 
The type Pointer will be void *. Before calling nag_robust_m_estim_1var_usr (g07dcc) you may allocate memory and initialize these pointers with various quantities for use by psi when called from nag_robust_m_estim_1var_usr (g07dcc) (see Section 3.2.1 in the Essential Introduction).
3:     isigmaIntegerInput
On entry: the value assigned to isigma determines whether σ^ is to be simultaneously estimated.
isigma=0
The estimation of σ^ is bypassed and sigma is set equal to σc.
isigma=1
σ^ is estimated simultaneously.
4:     nIntegerInput
On entry: n, the number of observations.
Constraint: n>1.
5:     x[n]const doubleInput
On entry: the vector of observations, x1,x2,,xn.
6:     betadoubleInput
On entry: the value of the constant β of the chosen chi function.
Constraint: beta>0.0.
7:     thetadouble *Input/Output
On entry: if sigma>0, then theta must be set to the required starting value of the estimate of the location argument θ^. A reasonable initial value for θ^ will often be the sample mean or median.
On exit: the M-estimate of the location argument θ^.
8:     sigmadouble *Input/Output
On entry: the role of sigma depends on the value assigned to isigma as follows.
If isigma=1, sigma must be assigned a value which determines the values of the starting points for the calculation of θ^ and σ^. If sigma0.0, then nag_robust_m_estim_1var_usr (g07dcc) will determine the starting points of θ^ and σ^. Otherwise, the value assigned to sigma will be taken as the starting point for σ^, and theta must be assigned a relevant value before entry, see above.
If isigma=0, sigma must be assigned a value which determines the values of σc, which is held fixed during the iterations, and the starting value for the calculation of θ^. If sigma0, then nag_robust_m_estim_1var_usr (g07dcc) will determine the value of σc as the median absolute deviation adjusted to reduce bias (see nag_median_1var (g07dac)) and the starting point for θ. Otherwise, the value assigned to sigma will be taken as the value of σc and theta must be assigned a relevant value before entry, see above.
On exit: the M-estimate of the scale argument σ^, if isigma was assigned the value 1 on entry, otherwise sigma will contain the initial fixed value σc.
9:     maxitIntegerInput
On entry: the maximum number of iterations that should be used during the estimation.
Suggested value: maxit=50.
Constraint: maxit>0.
10:   toldoubleInput
On entry: the relative precision for the final estimates. Convergence is assumed when the increments for theta, and sigma are less than tol×max1.0,σk-1.
Constraint: tol>0.0.
11:   rs[n]doubleOutput
On exit: the Winsorized residuals.
12:   nitInteger *Output
On exit: the number of iterations that were used during the estimation.
13:   commNag_Comm *Communication Structure
The NAG communication argument (see Section 3.2.1.1 in the Essential Introduction).
14:   failNagError *Input/Output
The NAG error argument (see Section 3.6 in the Essential Introduction).

6  Error Indicators and Warnings

NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, argument value had an illegal value.
NE_FUN_RET_VAL
The chi function returned a negative value: chi=value.
NE_INT
On entry, isigma=value.
Constraint: isigma=0 or 1.
On entry, maxit=value.
Constraint: maxit>0.
On entry, n=value.
Constraint: n>1.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_REAL
On entry, beta=value.
Constraint: beta>0.0.
On entry, tol=value.
Constraint: tol>0.0.
NE_REAL_ARRAY_ELEM_CONS
All elements of x are equal.
NE_SIGMA_NEGATIVE
Current estimate of sigma is zero or negative: sigma=value.
NE_TOO_MANY_ITER
Number of iterations required exceeds maxit: maxit=value.
NE_ZERO_RESID
All winsorized residuals are zero.

7  Accuracy

On successful exit the accuracy of the results is related to the value of tol, see Section 5.

8  Further Comments

Standard forms of the functions ψ and χ are given in Hampel et al. (1986), Huber (1981) and Marazzi (1987). nag_robust_m_estim_1var (g07dbc) calculates M-estimates using some standard forms for ψ and χ.
When you supply the initial values, care has to be taken over the choice of the initial value of σ. If too small a value is chosen then initial values of the standardized residuals xi-θ^kσ  will be large. If the redescending ψ functions are used, i.e., ψ=0 if t>τ, for some positive constant τ, then these large values are Winsorized as zero. If a sufficient number of the residuals fall into this category then a false solution may be returned, see page 152 of Hampel et al. (1986).

9  Example

The following program reads in a set of data consisting of eleven observations of a variable X.
The psi and chi functions used are Hampel's Piecewise Linear Function and Hubers chi function respectively.
Using the following starting values various estimates of θ and σ are calculated and printed along with the number of iterations used:
(a) nag_robust_m_estim_1var_usr (g07dcc) determined the starting values, σ is estimated simultaneously.
(b) You must supply the starting values, σ is estimated simultaneously.
(c) nag_robust_m_estim_1var_usr (g07dcc) determined the starting values, σ is fixed.
(d) You must supply the starting values, σ is fixed.

9.1  Program Text

Program Text (g07dcce.c)

9.2  Program Data

Program Data (g07dcce.d)

9.3  Program Results

Program Results (g07dcce.r)


nag_robust_m_estim_1var_usr (g07dcc) (PDF version)
g07 Chapter Contents
g07 Chapter Introduction
NAG C Library Manual

© The Numerical Algorithms Group Ltd, Oxford, UK. 2012