# NAG Toolbox: nag_correg_coeffs_kspearman (g02bq)

## Purpose

nag_correg_coeffs_kspearman (g02bq) computes Kendall and/or Spearman nonparametric rank correlation coefficients for a set of data; the data array is preserved, and the ranks of the observations are not available on exit from the function.

## Syntax

[rr, ifail] = g02bq(x, itype, 'n', n, 'm', m)
[rr, ifail] = nag_correg_coeffs_kspearman(x, itype, 'n', n, 'm', m)
Note: the interface to this routine has changed since earlier releases of the toolbox:
 At Mark 22: n was made optional

## Description

The input data consists of $n$ observations for each of $m$ variables, given as an array
 $x_{ij}, \quad i=1,2,\dots,n \; (n\ge 2), \quad j=1,2,\dots,m \; (m\ge 2),$
where ${x}_{ij}$ is the $i$th observation on the $j$th variable.
The observations are first ranked, as follows.
For a given variable, $j$ say, each of the $n$ observations, ${x}_{1j},{x}_{2j},\dots ,{x}_{nj}$, has associated with it an additional number, the ‘rank’ of the observation, which indicates the magnitude of that observation relative to the magnitude of the other $n-1$ observations on that same variable.
The smallest observation for variable $j$ is assigned the rank $1$, the second smallest observation for variable $j$ the rank $2$, the third smallest the rank $3$, and so on until the largest observation for variable $j$ is given the rank $n$.
If a number of cases all have the same value for the given variable, $j$, then they are each given an ‘average’ rank – e.g., if in attempting to assign the rank $h+1$, $k$ observations were found to have the same value, then instead of giving them the ranks
 $h+1,h+2,\dots,h+k,$
all $k$ observations would be assigned the rank
 $\frac{2h+k+1}{2}$
and the next value in ascending order would be assigned the rank
 $h+k+1.$
The process is repeated for each of the $m$ variables.
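The tie-handling rule above can be sketched in plain MATLAB. This is an illustration only, not the toolbox's internal method; the names `xj`, `yj`, `h` and `k` are chosen here to mirror the notation of this section, and the data are the first column of the matrix used in the Example section.

```matlab
% Average ranks for one variable: tied observations share the mean of the
% ranks they would otherwise occupy, i.e. (2h + k + 1)/2.
xj = [1.7; 2.8; 0.6; 1.8; 0.99; 1.4; 1.8; 2.5; 0.99];
n  = numel(xj);
yj = zeros(n, 1);
for i = 1:n
    h = sum(xj <  xj(i));     % observations strictly smaller than xj(i)
    k = sum(xj == xj(i));     % observations tied with xj(i), itself included
    yj(i) = (2*h + k + 1) / 2;
end
disp(yj.');
```

For this column the two observations equal to 0.99 share rank 2.5 and the two equal to 1.8 share rank 6.5; an untied observation has $k=1$ and simply receives rank $h+1$.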
Let ${y}_{ij}$ be the rank assigned to the observation ${x}_{ij}$ when the $j$th variable is being ranked.
The quantities calculated are:
(a) Kendall's tau rank correlation coefficients:
 $R_{jk}=\frac{\sum_{h=1}^{n}\sum_{i=1}^{n}\mathrm{sign}\left(y_{hj}-y_{ij}\right)\,\mathrm{sign}\left(y_{hk}-y_{ik}\right)}{\sqrt{\left[n(n-1)-T_j\right]\left[n(n-1)-T_k\right]}}, \quad j,k=1,2,\dots,m,$
 where $\mathrm{sign}\,u=1$ if $u>0$, $\mathrm{sign}\,u=0$ if $u=0$, and $\mathrm{sign}\,u=-1$ if $u<0$;
and ${T}_{j}=\sum {t}_{j}\left({t}_{j}-1\right)$, ${t}_{j}$ being the number of ties of a particular value of variable $j$, and the summation being over all tied values of variable $j$.
(b) Spearman's rank correlation coefficients:
 $R_{jk}^{*}=\frac{n(n^2-1)-6\sum_{i=1}^{n}\left(y_{ij}-y_{ik}\right)^2-\frac{1}{2}\left(T_j^{*}+T_k^{*}\right)}{\sqrt{\left[n(n^2-1)-T_j^{*}\right]\left[n(n^2-1)-T_k^{*}\right]}}, \quad j,k=1,2,\dots,m,$
where ${T}_{j}^{*}=\sum {t}_{j}\left({t}_{j}^{2}-1\right)$, ${t}_{j}$ being the number of ties of a particular value of variable $j$, and the summation being over all tied values of variable $j$.
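As a cross-check of definitions (a) and (b), the sketch below evaluates both coefficients for a single pair of variables in plain MATLAB. It is not the toolbox's implementation; all variable names are illustrative, and the two vectors are columns 1 and 2 of the data in the Example section. The tie corrections use the observation that, if `k(i)` is the size of the tie group containing observation `i`, then summing `k(i)-1` over all observations gives $\sum t_j(t_j-1)$ and summing `k(i)^2-1` gives $\sum t_j(t_j^2-1)$.

```matlab
a = [1.7; 2.8; 0.6; 1.8; 0.99; 1.4; 1.8; 2.5; 0.99];
b = [1;   4;   6;   9;   4;    2;   9;   7;   5];
n = numel(a);

% Pairwise differences: Da(i,h) = a(i) - a(h), and likewise for b.
Da = bsxfun(@minus, a, a.');
Db = bsxfun(@minus, b, b.');

% Tie-group size of each observation (itself included) and average ranks.
ka = sum(Da == 0, 2);
kb = sum(Db == 0, 2);
ya = sum(Da > 0, 2) + (ka + 1)/2;
yb = sum(Db > 0, 2) + (kb + 1)/2;

% (a) Kendall's tau. Ranking preserves order, so the sign of a rank
% difference equals the sign of the corresponding data difference.
S   = sum(sum(sign(Da) .* sign(Db)));
Ta  = sum(ka - 1);
Tb  = sum(kb - 1);
tau = S / sqrt((n*(n-1) - Ta) * (n*(n-1) - Tb));

% (b) Spearman's rank correlation coefficient.
Tsa = sum(ka.^2 - 1);
Tsb = sum(kb.^2 - 1);
N   = n*(n^2 - 1);
rho = (N - 6*sum((ya - yb).^2) - (Tsa + Tsb)/2) / sqrt((N - Tsa) * (N - Tsb));

fprintf('Kendall tau = %.4f, Spearman = %.4f\n', tau, rho);
```

For these two columns the sketch gives 2/68 and 159/708, which round to the values 0.0294 and 0.2246 printed for variables 1 and 2 in the Example section.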

## References

Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill

## Parameters

### Compulsory Input Parameters

1:     $\mathrm{x}\left(\mathit{ldx},{\mathbf{m}}\right)$ – double array
ldx, the first dimension of the array, must satisfy the constraint $\mathit{ldx}\ge {\mathbf{n}}$.
${\mathbf{x}}\left(\mathit{i},\mathit{j}\right)$ must be set to data value ${x}_{\mathit{i}\mathit{j}}$, the value of the $\mathit{i}$th observation on the $\mathit{j}$th variable, for $\mathit{i}=1,2,\dots ,n$ and $\mathit{j}=1,2,\dots ,m$.
2:     $\mathrm{itype}$ – int64, int32 or nag_int scalar
The type of correlation coefficients which are to be calculated.
${\mathbf{itype}}=-1$
Only Kendall's tau coefficients are calculated.
${\mathbf{itype}}=0$
Both Kendall's tau and Spearman's coefficients are calculated.
${\mathbf{itype}}=1$
Only Spearman's coefficients are calculated.
Constraint: ${\mathbf{itype}}=-1$, $0$ or $1$.

### Optional Input Parameters

1:     $\mathrm{n}$ – int64, int32 or nag_int scalar
Default: the first dimension of the array x.
$n$, the number of observations or cases.
Constraint: ${\mathbf{n}}\ge 2$.
2:     $\mathrm{m}$ – int64, int32 or nag_int scalar
Default: the second dimension of the array x.
$m$, the number of variables.
Constraint: ${\mathbf{m}}\ge 2$.

### Output Parameters

1:     $\mathrm{rr}\left(\mathit{ldrr},{\mathbf{m}}\right)$ – double array
The requested correlation coefficients.
If only Kendall's tau coefficients are requested (${\mathbf{itype}}=-1$), ${\mathbf{rr}}\left(j,k\right)$ contains Kendall's tau for the $j$th and $k$th variables.
If only Spearman's coefficients are requested (${\mathbf{itype}}=1$), ${\mathbf{rr}}\left(j,k\right)$ contains Spearman's rank correlation coefficient for the $j$th and $k$th variables.
If both Kendall's tau and Spearman's coefficients are requested (${\mathbf{itype}}=0$), the upper triangle of rr contains the Spearman coefficients and the lower triangle the Kendall coefficients. That is, for the $\mathit{j}$th and $\mathit{k}$th variables, where $\mathit{j}$ is less than $\mathit{k}$, ${\mathbf{rr}}\left(\mathit{j},\mathit{k}\right)$ contains the Spearman rank correlation coefficient, and ${\mathbf{rr}}\left(\mathit{k},\mathit{j}\right)$ contains Kendall's tau, for $\mathit{j}=1,2,\dots ,m$ and $\mathit{k}=1,2,\dots ,m$.
(Diagonal terms, ${\mathbf{rr}}\left(j,j\right)$, are unity for all three values of itype.)
2:     $\mathrm{ifail}$ – int64, int32 or nag_int scalar
${\mathbf{ifail}}={\mathbf{0}}$ unless the function detects an error (see Error Indicators and Warnings).
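When both sets of coefficients are requested (${\mathbf{itype}}=0$), it may be convenient to unpack rr into two full symmetric matrices. The following is a minimal sketch using standard MATLAB functions; the names `spearman` and `kendall` are illustrative, and the matrix is the rounded output printed in the Example section rather than the result of a live call.

```matlab
% rr as printed in the Example section for itype = 0 (rounded to 4 decimals).
rr = [1.0000 0.2246 0.1186;
      0.0294 1.0000 0.3814;
      0.1176 0.2353 1.0000];
m = size(rr, 1);

U = triu(rr,  1);                % strictly upper triangle: Spearman
L = tril(rr, -1);                % strictly lower triangle: Kendall's tau
spearman = U + U.' + eye(m);     % symmetric, unit diagonal
kendall  = L + L.' + eye(m);

disp(spearman);
disp(kendall);
```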

## Error Indicators and Warnings

Errors or warnings detected by the function:
${\mathbf{ifail}}=1$
 On entry, ${\mathbf{n}}<2$.
${\mathbf{ifail}}=2$
 On entry, ${\mathbf{m}}<2$.
${\mathbf{ifail}}=3$
 On entry, $\mathit{ldx}<{\mathbf{n}}$, or $\mathit{ldrr}<{\mathbf{m}}$.
${\mathbf{ifail}}=4$
 On entry, ${\mathbf{itype}}<-1$, or ${\mathbf{itype}}>1$.
${\mathbf{ifail}}=-99$
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
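The input constraints behind ${\mathbf{ifail}}=1$, $2$ and $4$ can be checked before the function is called. The sketch below uses a hypothetical helper, `expected_ifail`, which is not part of the NAG Toolbox; the order in which the checks are applied is an assumption, and ${\mathbf{ifail}}=3$ and the negative values are not modelled.

```matlab
% Hypothetical helper: returns the ifail value suggested by the documented
% constraints on n, m and itype, or 0 if all of them hold.
expected_ifail = @(n, m, itype) ...
    1 * (n < 2) + ...
    2 * (n >= 2 && m < 2) + ...
    4 * (n >= 2 && m >= 2 && ~any(double(itype) == [-1 0 1]));

x = zeros(9, 3);                 % placeholder data: 9 cases, 3 variables
[n, m] = size(x);
code = expected_ifail(n, m, int64(0));
if code ~= 0
    error('precheck:badInput', 'Input would be rejected with ifail = %d.', code);
end
```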

## Accuracy

The method used is believed to be stable.

The time taken by nag_correg_coeffs_kspearman (g02bq) depends on $n$ and $m$.

## Example

This example reads in a set of data consisting of nine observations on each of three variables. The program then calculates and prints both Kendall's tau and Spearman's rank correlation coefficients for all three variables.
```
function g02bq_example

fprintf('g02bq example results\n\n');

x = [1.7,  1, 0.5;
2.8,  4, 3.0;
0.6,  6, 2.5;
1.8,  9, 6.0;
0.99, 4, 2.5;
1.4,  2, 5.5;
1.8,  9, 7.5;
2.5,  7, 0.0;
0.99, 5, 3.0];
[n,m] = size(x);
fprintf('Number of variables (columns) = %d\n', m);
fprintf('Number of cases     (rows)    = %d\n\n', n);
disp('Data matrix is:-');
disp(x);
itype = int64(0);

[rr, ifail] = g02bq(x, itype);

fprintf('Matrix of rank correlation coefficients:\n');
fprintf('Upper triangle -- Spearman''s\n');
fprintf('Lower triangle -- Kendall''s tau\n\n');
disp(rr);

```
```
g02bq example results

Number of variables (columns) = 3
Number of cases     (rows)    = 9

Data matrix is:-
1.7000    1.0000    0.5000
2.8000    4.0000    3.0000
0.6000    6.0000    2.5000
1.8000    9.0000    6.0000
0.9900    4.0000    2.5000
1.4000    2.0000    5.5000
1.8000    9.0000    7.5000
2.5000    7.0000         0
0.9900    5.0000    3.0000

Matrix of rank correlation coefficients:
Upper triangle -- Spearman's
Lower triangle -- Kendall's tau

1.0000    0.2246    0.1186
0.0294    1.0000    0.3814
0.1176    0.2353    1.0000

```