
# NAG Library Function Document: nag_ken_spe_corr_coeff (g02brc)

## 1  Purpose

nag_ken_spe_corr_coeff (g02brc) calculates Kendall and Spearman rank correlation coefficients.

## 2  Specification

 #include <nag.h>
 #include <nagg02.h>
 void nag_ken_spe_corr_coeff (Integer n, Integer m, const double x[], Integer tdx, const Integer svar[], const Integer sobs[], double corr[], Integer tdc, NagError *fail)

## 3  Description

nag_ken_spe_corr_coeff (g02brc) calculates both the Kendall rank correlation coefficients and the Spearman rank correlation coefficients.
The data consists of $n$ observations for each of $m$ variables, given as an array
$$\left[{x}_{ij}\right], \quad i=1,2,\dots ,n \ \left(n\ge 2\right), \quad j=1,2,\dots ,m \ \left(m\ge 2\right),$$
where ${x}_{\mathit{i}j}$ is the $\mathit{i}$th observation on the $j$th variable. The function eliminates any variable ${x}_{\mathit{i}j}$, for $\mathit{i}=1,2,\dots ,n$, where the argument ${\mathbf{svar}}\left[\mathit{j}-1\right]=0$, and any observation ${x}_{i\mathit{j}}$, for $\mathit{j}=1,2,\dots ,m$, where the argument ${\mathbf{sobs}}\left[i-1\right]=0$.
The observations are first ranked as follows:
For a given variable, $j$ say, each of the observations ${x}_{\mathit{i}j}$ for which ${\mathbf{sobs}}\left[\mathit{i}-1\right]>0$, for $\mathit{i}=1,2,\dots ,n$, has associated with it an additional number, the rank of the observation, which indicates the magnitude of that observation relative to the magnitudes of the other observations on that same variable for which ${\mathbf{sobs}}\left[i-1\right]>0$.
The smallest of these valid observations for variable $j$ is assigned the rank 1, the second smallest observation for variable $j$ the rank 2, and so on until the largest such observation is given the rank ${n}_{s}$, where ${n}_{s}$ is the number of observations for which ${\mathbf{sobs}}\left[i-1\right]>0$.
If several cases all have the same value for a given variable, $j$, they are each given an ‘average’ rank. For example, if in attempting to assign the rank $h+1$, $k$ observations for which ${\mathbf{sobs}}\left[i-1\right]>0$ are found to have the same value, then instead of receiving the ranks $h+1,h+2,\dots ,h+k$, all $k$ observations are assigned the rank $\frac{2h+k+1}{2}$ and the next value in ascending order is assigned the rank $h+k+1$. The process is repeated for each of the $m$ variables for which ${\mathbf{svar}}\left[j-1\right]>0$.
Let ${y}_{i\mathit{j}}$ be the rank assigned to the observation ${x}_{i\mathit{j}}$ when the $\mathit{j}$th variable is being ranked. For those observations, $i$, for which ${\mathbf{sobs}}\left[i-1\right]=0$, ${y}_{i\mathit{j}}=0$ , for $\mathit{j}=1,2,\dots ,m$.
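The ranking rule above can be sketched in plain C as follows. This is an illustrative helper only, not the library's implementation: the name `rank_with_ties` is hypothetical, `int` stands in for the NAG `Integer` type, and a simple quadratic-time count is used instead of a sort. For each selected observation it counts $h$ (selected values strictly smaller) and $k$ (selected values equal, including itself) and assigns the rank $\frac{2h+k+1}{2}$; unselected observations receive rank 0.

```c
#include <stddef.h>

/* Hypothetical helper (not a NAG routine): assigns average ranks to the n
   values of one variable. Observations with sobs[i] == 0 get rank 0; a NULL
   sobs selects every observation. Returns n_s, the number selected. */
size_t rank_with_ties(size_t n, const double x[], const int sobs[],
                      double rank[])
{
    size_t ns = 0;

    for (size_t i = 0; i < n; i++) {
        rank[i] = 0.0;
        if (sobs == NULL || sobs[i] > 0)
            ns++;
    }

    for (size_t i = 0; i < n; i++) {
        if (sobs != NULL && sobs[i] == 0)
            continue;                      /* excluded: rank stays 0 */

        size_t below = 0;                  /* h in the text */
        size_t equal = 0;                  /* k in the text (includes i) */
        for (size_t l = 0; l < n; l++) {
            if (sobs != NULL && sobs[l] == 0)
                continue;
            if (x[l] < x[i])
                below++;
            else if (x[l] == x[i])
                equal++;
        }
        /* Average of the ranks h+1, ..., h+k. */
        rank[i] = (2.0 * (double)below + (double)equal + 1.0) / 2.0;
    }
    return ns;
}
```

With the values 10, 20, 20, 5, 30 and the fourth observation excluded, the two tied values share the rank 2.5 and the ranks are 1, 2.5, 2.5, 0, 4.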
For variables $j,k$ the following are computed:
(a) Kendall's tau correlation coefficients:
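$${R}_{jk}=\frac{\sum _{h=1}^{n}\sum _{i=1}^{n}\mathrm{sign}\left({y}_{hj}-{y}_{ij}\right)\mathrm{sign}\left({y}_{hk}-{y}_{ik}\right)}{\sqrt{\left[{n}_{s}\left({n}_{s}-1\right)-{T}_{j}\right]\left[{n}_{s}\left({n}_{s}-1\right)-{T}_{k}\right]}}, \quad j,k=1,2,\dots ,m;$$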
 where ${n}_{s}$ is the number of observations for which ${\mathbf{sobs}}\left[i-1\right]>0$, and sign $u=1$ if $u>0$, sign $u=0$ if $u=0$, sign $u=-1$ if $u<0$,
and ${T}_{j}=\sum {t}_{j}\left({t}_{j}-1\right)$ where ${t}_{j}$ is the number of ties of a particular value of variable $j$, and the summation is over all tied values of variable $j$.
(b) Spearman's rank correlation coefficients:
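$${R}_{jk}=\frac{{n}_{s}\left({n}_{s}^{2}-1\right)-6\sum _{i=1}^{n}{\left({y}_{ij}-{y}_{ik}\right)}^{2}-\frac{1}{2}\left({T}_{j}+{T}_{k}\right)}{\sqrt{\left[{n}_{s}\left({n}_{s}^{2}-1\right)-{T}_{j}\right]\left[{n}_{s}\left({n}_{s}^{2}-1\right)-{T}_{k}\right]}}, \quad j,k=1,2,\dots ,m;$$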
where ${n}_{s}$ is the number of observations for which ${\mathbf{sobs}}\left[i-1\right]>0$, and ${T}_{j}=\sum {t}_{j}\left({t}_{j}^{2}-1\right)$ where ${t}_{j}$ is the number of ties of a particular value of variable $j$, and the summation is over all tied values of variable $j$.

## 4  References

Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill

## 5  Arguments

1:    $\mathbf{n}$ – Integer    Input
On entry: the number of observations in the dataset.
Constraint: ${\mathbf{n}}\ge 2$.
2:    $\mathbf{m}$ – Integer    Input
On entry: the number of variables.
Constraint: ${\mathbf{m}}\ge 2$.
3:    $\mathbf{x}\left[{\mathbf{n}}×{\mathbf{tdx}}\right]$ – const double    Input
On entry: ${\mathbf{x}}\left[\left(\mathit{i}-1\right)×{\mathbf{tdx}}+\mathit{j}-1\right]$ must contain the $\mathit{i}$th observation on the $\mathit{j}$th variable, for $\mathit{i}=1,2,\dots ,n$ and $\mathit{j}=1,2,\dots ,m$.
4:    $\mathbf{tdx}$ – Integer    Input
On entry: the stride separating matrix column elements in the array x.
Constraint: ${\mathbf{tdx}}\ge {\mathbf{m}}$.
5:    $\mathbf{svar}\left[{\mathbf{m}}\right]$ – const Integer    Input
On entry: ${\mathbf{svar}}\left[j-1\right]$ indicates which variables are to be included; for the $j$th variable to be included, ${\mathbf{svar}}\left[j-1\right]>0$ is required. If all variables are to be included then a NULL pointer (Integer *)0 may be supplied.
Constraint: ${\mathbf{svar}}\left[\mathit{j}-1\right]\ge 0$, and there is at least one positive element, for $\mathit{j}=1,2,\dots ,m$.
6:    $\mathbf{sobs}\left[{\mathbf{n}}\right]$ – const Integer    Input
On entry: ${\mathbf{sobs}}\left[i-1\right]$ indicates which observations are to be included; for the $i$th observation to be included, ${\mathbf{sobs}}\left[i-1\right]>0$ is required. If all observations are to be included then a NULL pointer (Integer *)0 may be supplied.
Constraint: ${\mathbf{sobs}}\left[\mathit{i}-1\right]\ge 0$, and there are at least two positive elements, for $\mathit{i}=1,2,\dots ,n$.
7:    $\mathbf{corr}\left[{\mathbf{m}}×{\mathbf{tdc}}\right]$ – double    Output
On exit: the upper ${n}_{s}$ by ${n}_{s}$ part of corr contains the correlation coefficients; the upper triangle contains the Spearman coefficients and the lower triangle contains the Kendall coefficients. That is, for the $j$th and $k$th variables, where $j$ is less than $k$, ${\mathbf{corr}}\left[\left(j-1\right)×{\mathbf{tdc}}+k-1\right]$ contains the Spearman rank correlation coefficient, and ${\mathbf{corr}}\left[\left(k-1\right)×{\mathbf{tdc}}+j-1\right]$ contains Kendall's tau, for $j,k=1,2,\dots ,{n}_{s}$. The diagonal will be set to 1.
8:    $\mathbf{tdc}$ – Integer    Input
On entry: the stride separating matrix column elements in the array corr.
Constraint: ${\mathbf{tdc}}\ge {\mathbf{m}}$.
9:    $\mathbf{fail}$ – NagError *    Input/Output
The NAG error argument (see Section 2.7 in How to Use the NAG Library and its Documentation).
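The storage conventions for x and corr described above can be illustrated with a few small accessors. These are hypothetical helpers written for this note, not part of the NAG Library; they take 1-based indices as in the text and assume row-major storage with strides tdx and tdc.

```c
#include <stddef.h>

/* Hypothetical accessor: the i-th observation on the j-th variable
   (both 1-based) in a row-major array x with stride tdx. */
double x_at(const double x[], size_t tdx, size_t i, size_t j)
{
    return x[(i - 1) * tdx + (j - 1)];
}

/* Hypothetical accessor: Spearman coefficient for variables j and k,
   read from the upper triangle of corr (row index < column index). */
double spearman_at(const double corr[], size_t tdc, size_t j, size_t k)
{
    size_t lo = j < k ? j : k;
    size_t hi = j < k ? k : j;
    return corr[(lo - 1) * tdc + (hi - 1)];
}

/* Hypothetical accessor: Kendall's tau for variables j and k,
   read from the lower triangle of corr (row index > column index). */
double kendall_at(const double corr[], size_t tdc, size_t j, size_t k)
{
    size_t lo = j < k ? j : k;
    size_t hi = j < k ? k : j;
    return corr[(hi - 1) * tdc + (lo - 1)];
}
```

Both coefficient accessors accept the two variable indices in either order, and for $j=k$ both read the diagonal element.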

## 6  Error Indicators and Warnings

NE_2_INT_ARG_LT
On entry, ${\mathbf{tdc}}=〈\mathit{\text{value}}〉$ while ${\mathbf{m}}=〈\mathit{\text{value}}〉$. These arguments must satisfy ${\mathbf{tdc}}\ge {\mathbf{m}}$.
On entry, ${\mathbf{tdx}}=〈\mathit{\text{value}}〉$ while ${\mathbf{m}}=〈\mathit{\text{value}}〉$. These arguments must satisfy ${\mathbf{tdx}}\ge {\mathbf{m}}$.
NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_INT_ARG_LT
On entry, ${\mathbf{m}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{m}}\ge 2$.
On entry, ${\mathbf{n}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{n}}\ge 2$.
NE_INT_ARRAY_1
Value $〈\mathit{\text{value}}〉$ given to ${\mathbf{sobs}}\left[〈\mathit{\text{value}}〉\right]$ not valid. Correct range for elements of sobs is ${\mathbf{sobs}}\left[i\right]\ge 0$.
Value $〈\mathit{\text{value}}〉$ given to ${\mathbf{svar}}\left[〈\mathit{\text{value}}〉\right]$ not valid. Correct range for elements of svar is ${\mathbf{svar}}\left[i\right]\ge 0$.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes.
NE_SOBS_LOW
On entry, sobs must contain at least 2 positive elements.
Too few observations have been selected.
NE_SVAR_LOW
No variables have been selected.
On entry, svar must contain at least 1 positive element.
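A caller may wish to check the documented constraints before calling the function. The sketch below mirrors the conditions listed above in plain C. It is an assumption-laden illustration: the enum, its member names, and the order of the checks are hypothetical and do not reproduce the library's own error codes or checking order; `long` and `int` stand in for the NAG `Integer` type.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch (not NAG error codes). */
typedef enum {
    ARGS_OK,
    ARGS_N_OR_M_TOO_SMALL,     /* n < 2 or m < 2 */
    ARGS_STRIDE_TOO_SMALL,     /* tdx < m or tdc < m */
    ARGS_NEGATIVE_SELECTOR,    /* negative element in svar or sobs */
    ARGS_TOO_FEW_OBS,          /* fewer than 2 positive elements in sobs */
    ARGS_NO_VARS               /* no positive element in svar */
} ArgCheck;

/* Counts positive entries; returns -1 if any entry is negative.
   A NULL selector means "include everything". */
static long count_selected(long len, const int sel[])
{
    if (sel == NULL)
        return len;
    long count = 0;
    for (long i = 0; i < len; i++) {
        if (sel[i] < 0)
            return -1;
        if (sel[i] > 0)
            count++;
    }
    return count;
}

/* Checks the documented constraints on n, m, tdx, tdc, svar and sobs. */
ArgCheck check_args(long n, long m, long tdx, long tdc,
                    const int svar[], const int sobs[])
{
    if (n < 2 || m < 2)
        return ARGS_N_OR_M_TOO_SMALL;
    if (tdx < m || tdc < m)
        return ARGS_STRIDE_TOO_SMALL;

    long nvar = count_selected(m, svar);
    long nobs = count_selected(n, sobs);
    if (nvar < 0 || nobs < 0)
        return ARGS_NEGATIVE_SELECTOR;
    if (nobs < 2)
        return ARGS_TOO_FEW_OBS;
    if (nvar < 1)
        return ARGS_NO_VARS;
    return ARGS_OK;
}
```

For example, passing NULL for both selectors with $n=5$, $m=3$ and strides of 3 passes every check, while a sobs array with a single positive element is reported as having too few observations.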

## 7  Accuracy

The computations are believed to be stable.

## 8  Parallelism and Performance

nag_ken_spe_corr_coeff (g02brc) is not threaded in any implementation.

## 9  Further Comments

None.

## 10  Example

A program to calculate the Kendall and Spearman rank correlation coefficients from a set of data.

### 10.1  Program Text

Program Text (g02brce.c)

### 10.2  Program Data

Program Data (g02brce.d)

### 10.3  Program Results

Program Results (g02brce.r)