NAG C Library Function Document

nag_mv_dendrogram (g03ehc)

1
Purpose

nag_mv_dendrogram (g03ehc) produces a dendrogram from the results of nag_mv_hierar_cluster_analysis (g03ecc).

2
Specification

#include <nag.h>
#include <nagg03.h>
void nag_mv_dendrogram (Nag_DendOrient orient, Integer n, const double dord[], double dmin, double dstep, Integer nsym, char ***c, NagError *fail)

3
Description

Hierarchical cluster analysis, as performed by nag_mv_hierar_cluster_analysis (g03ecc), can be represented by a tree that shows the distance at which the clusters merge. Such a tree is known as a dendrogram. See Everitt (1974) and Krzanowski (1990) for examples of dendrograms. A simple example is shown in Figure 1.
Figure 1
The end-points of the dendrogram represent the objects that have been clustered. They should be in a suitable order as given by nag_mv_hierar_cluster_analysis (g03ecc). Object 1 is always the first object. In the example above the height represents the distance at which the clusters merge.
The dendrogram is produced in an array of character pointers using the ordering and distances provided by nag_mv_hierar_cluster_analysis (g03ecc). Suitable characters are used to represent parts of the tree.
There are four possible orientations for the dendrogram. The example above has the end-points at the bottom of the diagram, which will be referred to as south. If the dendrogram were the other way around, with the end-points at the top of the diagram, the orientation would be north. If the end-points are at the left-hand or right-hand side of the diagram, the orientation is west or east respectively. Different symbols are used for east/west and north/south orientations.

4
References

Everitt B S (1974) Cluster Analysis Heinemann
Krzanowski W J (1990) Principles of Multivariate Analysis Oxford University Press

5
Arguments

1:     orient Nag_DendOrient Input
On entry: indicates which orientation the dendrogram is to take.
orient=Nag_DendNorth
The end-points of the dendrogram are to the north.
orient=Nag_DendSouth
The end-points of the dendrogram are to the south.
orient=Nag_DendEast
The end-points of the dendrogram are to the east.
orient=Nag_DendWest
The end-points of the dendrogram are to the west.
Constraint: orient=Nag_DendNorth, Nag_DendSouth, Nag_DendEast or Nag_DendWest.
2:     n Integer Input
On entry: the number of objects in the cluster analysis.
Constraint: n ≥ 2.
3:     dord[n] const double Input
On entry: the array dord as output by nag_mv_hierar_cluster_analysis (g03ecc). dord contains the distances, in dendrogram order, at which clustering takes place.
Constraint: dord[n-1] ≥ dord[i-1], for i = 1,2,…,n-1.
4:     dmin double Input
On entry: the clustering distance at which the dendrogram begins.
Constraint: dmin ≥ 0.0.
5:     dstep double Input
On entry: the distance represented by one symbol of the dendrogram.
Constraint: dstep > 0.0.
6:     nsym Integer Input
On entry: the number of character positions used in the dendrogram. Hence the clustering distance at which the dendrogram terminates is given by dmin + nsym × dstep.
Constraint: nsym ≥ 1.
7:     c char *** Input/Output
On entry/exit: a pointer to an array of character pointers, containing consecutive lines of the dendrogram. The memory to which c points is allocated internally.
orient=Nag_DendNorth or Nag_DendSouth
The number of lines in the dendrogram is nsym.
orient=Nag_DendEast or Nag_DendWest
The number of lines in the dendrogram is n.
The storage pointed to by this pointer must be freed using nag_mv_dend_free (g03xzc).
8:     fail NagError * Input/Output
The NAG error argument (see Section 3.7 in How to Use the NAG Library and its Documentation).

6
Error Indicators and Warnings

NE_BAD_PARAM
On entry, argument orient had an illegal value.
NE_DENDROGRAM_ARRAY
On entry, n=value, dord[value]=value.
Constraint: dord[n-1] ≥ dord[i-1], for i = 1,2,…,n-1.
NE_INT_ARG_LT
On entry, n=value.
Constraint: n ≥ 2.
On entry, nsym=value.
Constraint: nsym ≥ 1.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_REAL_ARG_LE
On entry, dstep must not be less than or equal to 0.0: dstep=value.
NE_REAL_ARG_LT
On entry, dmin must not be less than 0.0: dmin=value.

7
Accuracy

Not applicable.

8
Parallelism and Performance

nag_mv_dendrogram (g03ehc) is not threaded in any implementation.

9
Further Comments

The scale of the dendrogram is controlled by dstep. The smaller the value of dstep, the greater the amount of detail that will be given. However, nsym will then have to be larger to give the full dendrogram. The range of distances represented by the dendrogram is dmin to dmin + nsym × dstep. The values of dmin, dstep and nsym can thus be set so that only part of the dendrogram is produced.
The dendrogram does not include any labelling of the objects. You can print suitable labels using the ordering given by the array iord returned by nag_mv_hierar_cluster_analysis (g03ecc).

10
Example

Data consisting of three variables on five objects are read in. Euclidean squared distances are computed using nag_mv_distance_mat (g03eac) and median clustering performed by nag_mv_hierar_cluster_analysis (g03ecc). nag_mv_dendrogram (g03ehc) is used to produce a dendrogram with orientation east and a dendrogram with orientation south. The two dendrograms are printed.
Note the use of nag_mv_dend_free (g03xzc) to free the memory allocated internally to the character array pointed to by c.

10.1
Program Text

Program Text (g03ehce.c)

10.2
Program Data

Program Data (g03ehce.d)

10.3
Program Results

Program Results (g03ehce.r)