NAG CL Interface
g03ehc produces a dendrogram from the results of g03ecc.
The function may be called by the names: g03ehc, nag_mv_cluster_hier_dendrogram or nag_mv_dendrogram.
Hierarchical cluster analysis, as performed by g03ecc, can be represented by a tree that shows at which distance the clusters merge. Such a tree is known as a dendrogram. See Everitt (1974) and Krzanowski (1990) for examples of dendrograms. A simple example is,
The end-points of the dendrogram represent the objects that have been clustered. They should be in a suitable order as given by g03ecc. Object 1 is always the first object. In the example above the height represents the distance at which the clusters merge.
The dendrogram is produced in an array of character pointers using the ordering and distances provided by g03ecc. Suitable characters are used to represent parts of the tree.
There are four possible orientations for the dendrogram. The example above has the end-points at the bottom of the diagram, which will be referred to as south. If the dendrogram were the other way around, with the end-points at the top of the diagram, then the orientation would be north. If the end-points are at the left-hand or right-hand side of the diagram the orientation is west or east. Different symbols are used for east/west and north/south orientations.
Everitt B S (1974) Cluster Analysis Heinemann
Krzanowski W J (1990) Principles of Multivariate Analysis Oxford University Press
On entry: orient indicates which orientation the dendrogram is to take.
- The end-points of the dendrogram are to the north.
- The end-points of the dendrogram are to the south.
- The end-points of the dendrogram are to the east.
- The end-points of the dendrogram are to the west.
Constraint: orient must select one of the four orientations listed above.
On entry: n, the number of objects in the cluster analysis.
dord – const double
On entry: the array dord as output by g03ecc contains the distances, in dendrogram order, at which clustering takes place.
, for .
On entry: dmin, the clustering distance at which the dendrogram begins.
On entry: dstep, the distance represented by one symbol of the dendrogram.
On entry: nsym, the number of character positions used in the dendrogram. Hence the clustering distance at which the dendrogram terminates is given by dmin + nsym × dstep.
c – char ***
On exit: a pointer to an array of character pointers, containing consecutive lines of the dendrogram. The memory to which c points is allocated internally.
- If the end-points are to the north or south, the number of lines in the dendrogram is nsym.
- If the end-points are to the east or west, the number of lines in the dendrogram is n.
The storage pointed to by this pointer must be freed using g03xzc.
– NagError *
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
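As an illustration of how the lines returned through c might be consumed, the following is a minimal sketch in plain C. It does not call the NAG Library. The `Orientation` enum and both helper functions are hypothetical names introduced here, and the sketch assumes that a north or south dendrogram has nsym lines, that an east or west dendrogram has n lines, and that each line is a null-terminated string.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the orientation argument; not the NAG type. */
typedef enum { ORIENT_NORTH, ORIENT_SOUTH, ORIENT_EAST, ORIENT_WEST } Orientation;

/* Number of lines to expect: assumed to be nsym for north/south
   orientations and n for east/west orientations. */
int dendrogram_line_count(Orientation orient, int n, int nsym)
{
    assert(n > 0 && nsym > 0);
    return (orient == ORIENT_NORTH || orient == ORIENT_SOUTH) ? nsym : n;
}

/* Print each line of the dendrogram in order, assuming every entry of
   lines is a null-terminated string. */
void print_dendrogram(FILE *out, char **lines, int line_count)
{
    assert(out != NULL && lines != NULL);
    for (int i = 0; i < line_count; i++)
        fprintf(out, "%s\n", lines[i]);
}
```

With five objects and nsym set to 20, for instance, this sketch would expect 20 lines for a south dendrogram and 5 lines for an east dendrogram.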
Error Indicators and Warnings
On entry, argument orient had an illegal value.
On entry, , .
Constraint: , .
On entry, .
On entry, .
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG.
On entry, dstep must not be less than or equal to 0.0.
On entry, dmin must not be less than 0.0.
Parallelism and Performance
Background information to multithreading can be found in the Multithreading documentation.
g03ehc is not threaded in any implementation.
The scale of the dendrogram is controlled by dstep. The smaller the value of dstep, the greater the amount of detail that will be given. However, nsym will have to be larger to give the full dendrogram. The range of distances represented by the dendrogram is from dmin to dmin + nsym × dstep. The values of dmin, dstep and nsym can thus be set so that only part of the dendrogram is produced.
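The relationship between these scale arguments can be sketched in plain C as follows. This is an illustration only and does not call the NAG Library. It assumes the dendrogram spans the distances from dmin to dmin + nsym × dstep, which follows from nsym positions each representing a distance dstep, starting at dmin. The function names are hypothetical, as is the requirement that at least one position is used.

```c
#include <assert.h>

/* Distance at which the dendrogram terminates: dmin + nsym * dstep. */
double dendrogram_end(double dmin, double dstep, int nsym)
{
    assert(dmin >= 0.0 && dstep > 0.0 && nsym > 0);
    return dmin + nsym * dstep;
}

/* Smallest number of positions whose range reaches dmax, the largest
   distance to be shown. At least one position is always returned
   (an assumption of this sketch). */
int symbols_needed(double dmin, double dstep, double dmax)
{
    assert(dmin >= 0.0 && dstep > 0.0);
    if (dmax <= dmin)
        return 1;
    int nsym = (int)((dmax - dmin) / dstep); /* round down first */
    if (dmin + nsym * dstep < dmax)          /* then round up if short */
        nsym++;
    return nsym;
}
```

Halving dstep doubles the value returned by `symbols_needed` for the same range, which is the trade-off between detail and size described above. Passing a dmax below the largest clustering distance gives a value of nsym that shows only part of the dendrogram.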
The dendrogram does not include any labelling of the objects. You can print suitable labels using the ordering given by the array iord returned by g03ecc.
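One way such labelling might be done for an east or west dendrogram is sketched below in plain C, without calling the NAG Library. The sketch assumes that line i of the dendrogram corresponds to the i-th object in dendrogram order, so that the entry at position i of the ordering array labels it; this correspondence is an assumption, not something stated above. The function names, the use of `int` for the ordering array and the label format are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Write "<object number>  <line>" into buf.
   Returns 0 on success, -1 if the result did not fit. */
int label_line(char *buf, size_t size, int object, const char *line)
{
    assert(buf != NULL && line != NULL);
    int written = snprintf(buf, size, "%3d  %s", object, line);
    return (written >= 0 && (size_t)written < size) ? 0 : -1;
}

/* Print n dendrogram lines, each prefixed by the object number taken
   from the same position of the ordering array (assumed pairing). */
void print_labelled(FILE *out, char **lines, const int *iord, int n)
{
    assert(out != NULL && lines != NULL && iord != NULL);
    for (int i = 0; i < n; i++)
        fprintf(out, "%3d  %s\n", iord[i], lines[i]);
}
```

For a north or south dendrogram the labels would instead need to be laid out along a single line above or below the end-points, which this sketch does not attempt.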
Data consisting of three variables on five objects are read in. Euclidean squared distances are computed using g03eac and median clustering is performed by g03ecc. g03ehc is used to produce a dendrogram with orientation east and a dendrogram with orientation south. The two dendrograms are printed.
Note the use of g03xzc to free the memory allocated internally to the character array pointed to by c.