naginterfaces.library.mv.cluster_hier_indicator

naginterfaces.library.mv.cluster_hier_indicator(cd, iord, dord, k, dlevel)

cluster_hier_indicator computes a cluster indicator variable from the results of cluster_hier().

For full information please refer to the NAG Library document for g03ej

https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g03/g03ejf.html

Parameters
cd : float, array-like, shape (n-1)

The clustering distances in increasing order as returned by cluster_hier().

iord : int, array-like, shape (n)

The objects in dendrogram order as returned by cluster_hier().

dord : float, array-like, shape (n)

The clustering distances corresponding to the order in iord.

k : int

Indicates if a specified number of clusters is required.

If k > 0 then cluster_hier_indicator will attempt to find k clusters.

If k ≤ 0 then cluster_hier_indicator will find the clusters based on the distance given in dlevel.

dlevel : float

If k ≤ 0, dlevel must contain the distance at which clusters are produced. Otherwise dlevel need not be set.

Returns
k : int

The number of clusters produced, k.

dlevel : float

If k > 0 on entry, dlevel contains the distance at which the required number of clusters is found. Otherwise dlevel remains unchanged.

ic : int, ndarray, shape (n)

ic[i-1] indicates to which of the k clusters the i-th object belongs, for i = 1, 2, …, n.

Raises
NagValueError
(errno 1)

On entry, n = ⟨value⟩.

Constraint: n ≥ 2.

(errno 1)

On entry, dlevel = ⟨value⟩ and k = ⟨value⟩.

(errno 1)

On entry, k = ⟨value⟩ and n = ⟨value⟩.

Constraint: k ≤ n.

(errno 2)

On entry the values of dord and cd are not compatible.

(errno 2)

On entry the values of cd are not in increasing order.

Warns
NagAlgorithmicWarning
(errno 3)

No clustering is performed when k = n.

(errno 3)

All data is merged when k = 1.

(errno 3)

All data merged into one cluster at cd[n-2] = ⟨value⟩, dlevel = ⟨value⟩.

(errno 3)

No clustering takes place below cd[0] = ⟨value⟩, dlevel = ⟨value⟩.

(errno 4)

The precise number of clusters requested is not possible because of tied clustering distances.

Notes

In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Given a distance or dissimilarity matrix for n objects, cluster analysis aims to group the objects into a number of more or less homogeneous groups or clusters. With agglomerative clustering methods (see cluster_hier()), a hierarchical tree is produced by starting with n clusters, each with a single object, and then at each of n-1 stages merging two clusters to form a larger cluster until all objects are in a single cluster. cluster_hier_indicator takes the information from the tree and produces the clusters that exist at a given distance. This is equivalent to taking the dendrogram (see cluster_hier_dendrogram()) and drawing a line across at a given distance to produce clusters.
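The idea of drawing a line across the dendrogram can be illustrated with a small pure-Python sketch. It is not the NAG implementation and does not use the cd, iord and dord inputs: it assumes single-link clustering on a full symmetric distance matrix, and the names clusters_at_distance, dist and level are hypothetical. For single linkage, the clusters at a given height are the connected groups of objects linked by distances no greater than that height.

```python
def clusters_at_distance(dist, level):
    """Return 1-based cluster labels for a single-link cut at `level`.

    dist  : symmetric n x n list of lists of dissimilarities (hypothetical input)
    level : cut height, playing the role of dlevel
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join every pair of objects that is no further apart than the cut level.
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= level:
                parent[find(j)] = find(i)

    # Number clusters by first appearance, so the first object gets label 1.
    labels, seen = [], {}
    for i in range(n):
        root = find(i)
        if root not in seen:
            seen[root] = len(seen) + 1
        labels.append(seen[root])
    return labels


dist = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
print(clusters_at_distance(dist, 2.5))  # [1, 1, 2, 2]
print(clusters_at_distance(dist, 0.5))  # [1, 2, 3, 4]
```

Raising the level to 3 joins the two pairs through the distance between the second and third objects, giving a single cluster.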

As an alternative to giving the distance at which clusters are required, you can specify the number of clusters required and cluster_hier_indicator will compute the corresponding distance. However, it may not be possible to compute the number of clusters required due to ties in the distance matrix.
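The effect of ties can be seen from the sorted merge distances alone. The following sketch is an illustration under stated assumptions, not the routine's actual logic: it assumes that a cut at a given level applies every merge whose distance is less than or equal to that level, and the helper name clusters_for_request is hypothetical. With n objects, k clusters remain after the first n - k merges, so the candidate level is the (n - k)-th smallest distance; if later merges share that distance, the cut cannot separate them.

```python
def clusters_for_request(cd, k):
    """Return (level, achieved_k) for a request of k clusters.

    cd : the n-1 merge distances in increasing order
    k  : requested number of clusters, 1 <= k <= n
    """
    n = len(cd) + 1
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if k == n:
        # No merges at all: every object is its own cluster.
        return None, n
    # k clusters remain after the first n - k merges.
    level = cd[n - k - 1]
    # Merges tied at this distance cannot be separated by a cut (assumption:
    # every merge with distance <= level is applied).
    merges = sum(1 for d in cd if d <= level)
    return level, n - merges


cd = [1.0, 2.0, 2.0, 5.0]            # five objects, two merges tied at 2.0
print(clusters_for_request(cd, 4))   # (1.0, 4)
print(clusters_for_request(cd, 3))   # (2.0, 2)
```

In this example a request for three clusters yields only two, because both merges at distance 2.0 happen at the same cut level.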

If there are k clusters then the indicator variable will assign a value between 1 and k to each object to indicate to which cluster it belongs. Object 1 always belongs to cluster 1.
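An indicator variable of this form is often easier to work with once it is turned into explicit membership lists. The helper below is a minimal sketch; members_by_cluster is a hypothetical name, and the sample indicator is made up for illustration rather than produced by the routine.

```python
def members_by_cluster(ic):
    """Map each cluster label to the 1-based object numbers it contains."""
    members = {}
    for obj, label in enumerate(ic, start=1):
        # int() also accepts NumPy integer scalars from an ndarray.
        members.setdefault(int(label), []).append(obj)
    return members


ic = [1, 1, 2, 3, 2]  # made-up indicator for five objects in three clusters
print(members_by_cluster(ic))  # {1: [1, 2], 2: [3, 5], 3: [4]}
```

Objects are numbered from 1 here to match the description of the indicator, even though Python indexes the array from 0.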

References

Everitt, B S, 1974, Cluster Analysis, Heinemann

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press