NAG CL Interface
g03fcc performs non-metric (ordinal) multidimensional scaling.
The function may be called by the names: g03fcc, nag_mv_multidimscal_ordinal or nag_mv_ordinal_multidimscale.
For a set of $n$ objects, a distance or dissimilarity matrix $D$ can be calculated such that $d_{ij}$ is a measure of how ‘far apart’ objects $i$ and $j$ are. If $p$ variables $x_k$ have been recorded for each observation, this measure may be based on Euclidean distance, $d_{ij} = \sqrt{\sum_{k=1}^{p} (x_{ki} - x_{kj})^2}$, or some other calculation such as the number of variables for which $x_{kj} \neq x_{ki}$. Alternatively, the distances may be the result of a subjective assessment. For a given distance matrix, multidimensional scaling produces a configuration of $n$ points in a chosen number of dimensions, $m$, such that the distance between the points in some way best matches the distance matrix. For some distance measures, such as Euclidean distance, the size of the distance is meaningful; for other measures of distance, all that can be said is that one distance is greater or smaller than another. For the former, metric scaling can be used (see g03fac); for the latter, a non-metric scaling is more appropriate.
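For non-metric multidimensional scaling, the criterion used to measure the closeness of the fitted distance matrix to the observed distance matrix is known as STRESS. STRESS is given by

$$\sqrt{\frac{\sum_{i=1}^{n}\sum_{j=1}^{i-1}\left(\hat{d}_{ij}-\tilde{d}_{ij}\right)^{2}}{\sum_{i=1}^{n}\sum_{j=1}^{i-1}\hat{d}_{ij}^{2}}},$$

where $\hat{d}_{ij}^{2}$ is the Euclidean squared distance between points $i$ and $j$, and $\tilde{d}_{ij}$ is the fitted distance obtained when $\hat{d}_{ij}$ is monotonically regressed on $d_{ij}$; that is, $\tilde{d}_{ij}$ is monotonic relative to $d_{ij}$ and is obtained from $\hat{d}_{ij}$ with the smallest number of changes. So STRESS is a measure of by how much the set of points preserves the order of the distances in the original distance matrix. Non-metric multidimensional scaling seeks to find the set of points that minimizes the STRESS.
An alternative measure is squared STRESS, SSTRESS,

$$\sqrt{\frac{\sum_{i=1}^{n}\sum_{j=1}^{i-1}\left(\hat{d}_{ij}^{2}-\tilde{d}_{ij}^{2}\right)^{2}}{\sum_{i=1}^{n}\sum_{j=1}^{i-1}\hat{d}_{ij}^{4}}},$$

in which the distances in STRESS are replaced by squared distances.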
In order to perform a non-metric scaling, an initial configuration of points is required. This can be obtained from principal coordinate analysis (see g03fac). Given an initial configuration, g03fcc uses the optimization function e04dgc to find the configuration of points that minimizes STRESS or SSTRESS. The function e04dgc uses a conjugate gradient algorithm. g03fcc will find an optimum that may only be a local optimum. To be more sure of finding a global optimum, several different initial configurations should be used; these can be obtained by randomly perturbing the original initial configuration using functions from Chapter G05.
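The perturbation step can be sketched as follows. This example uses the C standard library `rand()` purely as a stand-in for a G05 generator, and the function name, the uniform perturbation and the `scale` parameter are assumptions for illustration, not a recommendation from the source.

```c
#include <assert.h>
#include <stdlib.h>

/* Copy the n-by-ndim configuration x0 (row stride tdx) into x, adding to
   each coordinate a uniform random perturbation in [-scale, scale]. */
void perturb_configuration(const double *x0, double *x, size_t n,
                           size_t ndim, size_t tdx, double scale)
{
    assert(tdx >= ndim && scale >= 0.0);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < ndim; ++j) {
            double u = (double)rand() / (double)RAND_MAX; /* in [0, 1] */
            x[i * tdx + j] = x0[i * tdx + j] + scale * (2.0 * u - 1.0);
        }
    }
}
```

Each perturbed configuration would then be used as a separate starting point, and the solution with the smallest criterion value retained.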
Chatfield C and Collins A J (1980) Introduction to Multivariate Analysis Chapman and Hall
Krzanowski W J (1990) Principles of Multivariate Analysis Oxford University Press
On entry: indicates whether STRESS or SSTRESS is to be used as the criterion.
- type = Nag_Stress: STRESS is used.
- type = Nag_SStress: SSTRESS is used.
On entry: the number of objects in the distance matrix, $n$.
On entry: the number of dimensions used to represent the data, $m$.
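d – const double
On entry: the lower triangle of the distance matrix $D$ stored packed by rows. That is, d[(i-1)×(i-2)/2 + j - 1] must contain $d_{ij}$ for $i = 2, 3, \ldots, n$; $j = 1, 2, \ldots, i-1$. If $d_{ij}$ is missing then set $d_{ij} < 0$. For further comments on missing values see Section 9.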
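The packed-by-rows layout can be illustrated with a short sketch. Row $i$ of the strict lower triangle holds $i-1$ entries, so rows 2 to $i-1$ together hold $(i-1)(i-2)/2$ entries, which gives the zero-based position $(i-1)(i-2)/2 + j - 1$ for $d_{ij}$. The helper names below are hypothetical and are not NAG functions.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based position of d_ij in the packed array, for 1-based i > j >= 1. */
size_t packed_index(size_t i, size_t j)
{
    assert(j >= 1 && i > j);
    return (i - 1) * (i - 2) / 2 + j - 1;
}

/* Copy the strict lower triangle of a full n-by-n matrix (stored row-major)
   into the packed array d, which must hold n*(n-1)/2 elements. */
void pack_lower(const double *full, size_t n, double *d)
{
    for (size_t i = 2; i <= n; ++i) {
        for (size_t j = 1; j < i; ++j) {
            d[packed_index(i, j)] = full[(i - 1) * n + (j - 1)];
        }
    }
}
```

For $n = 4$ the packed order is $d_{21}, d_{31}, d_{32}, d_{41}, d_{42}, d_{43}$.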
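Note: the $(i, j)$th element of the matrix $X$ is stored in x[(i-1)×tdx + j - 1].
On entry: the $i$th row must contain an initial estimate of the coordinates for the $i$th point, for $i = 1, 2, \ldots, n$. One method of computing these is to use g03fac.
On exit: the $i$th row contains $m$ coordinates for the $i$th point, for $i = 1, 2, \ldots, n$.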
On entry: the stride separating matrix column elements in the array x.
stress – double *
On exit: the value of STRESS or SSTRESS at the final iteration.
On exit: auxiliary outputs. If type = Nag_Stress, the first $n(n-1)/2$ elements contain the distances, $\hat{d}_{ij}$, for the points returned in x; the second set of $n(n-1)/2$ elements contains the distances $\hat{d}_{ij}$ ordered by the input distances, $d_{ij}$; the third set of $n(n-1)/2$ elements contains the monotonic distances, $\tilde{d}_{ij}$, ordered by the input distances, $d_{ij}$; and the final set of $n(n-1)/2$ elements contains the fitted monotonic distances, $\tilde{d}_{ij}$, for the points in x. The $\tilde{d}_{ij}$ corresponding to distances which are input as missing are set to zero. If type = Nag_SStress, the results are as above except that the squared distances are returned.
Each distance matrix is stored in lower triangular packed form in the same way as the input matrix $D$.
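To make the layout of the auxiliary output array easier to work with, the four sets can be exposed as separate pointers. This sketch assumes that each set holds $n(n-1)/2$ elements, as implied by the packed storage, and that the array is passed in as `dfit`; the `DfitView` type and its member names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const double *dist;        /* distances for the points in x, packed order */
    const double *dist_sorted; /* those distances, ordered by input distances */
    const double *mono_sorted; /* monotonic distances, ordered by input distances */
    const double *mono;        /* fitted monotonic distances, packed order */
    size_t len;                /* number of elements in each set */
} DfitView;

/* Split the auxiliary output array into its four consecutive sets. */
DfitView dfit_view(const double *dfit, size_t n)
{
    DfitView v;
    assert(dfit != NULL && n >= 2);
    v.len = n * (n - 1) / 2;
    v.dist = dfit;
    v.dist_sorted = dfit + v.len;
    v.mono_sorted = dfit + 2 * v.len;
    v.mono = dfit + 3 * v.len;
    return v;
}
```

Plotting `mono_sorted` against `dist_sorted` gives the usual diagnostic of how well the configuration preserves the ordering of the input distances.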
options – Nag_E04_Opt *
A pointer to a structure of type Nag_E04_Opt whose members are optional parameters for e04dgc. These structure members offer the means of adjusting some of the argument values of the algorithm and on output will supply further details of the results. You are referred to the e04dgc document for further details.
The default values used by g03fcc when the options argument is set to the NAG defined null pointer, E04_DEFAULT, are as follows:
If a different value is required for any of these four structure members or if other options available in e04dgc are to be used, then the structure options should be declared and initialized by a call to e04xxc and supplied as an argument to g03fcc
. In this case, the structure members listed above except for
will have the default values as specified above;
in this case.
fail – NagError *
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
Error Indicators and Warnings
On entry, while . These arguments must satisfy .
On entry, while . These arguments must satisfy .
Dynamic memory allocation failed.
On entry, argument type had an illegal value.
On entry, .
Additional error messages are output if the optimization fails to converge or if the options are set incorrectly. Details of these can be found in the e04dgc document.
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG.
All elements of array d ≤ 0.0.
Constraint: at least one element of d must be positive.
After a successful optimization, the relative accuracy of should be approximately , as specified by .
Parallelism and Performance
Background information to multithreading can be found in the Multithreading documentation.
g03fcc is not threaded in any implementation.
Missing values in the input distance matrix can be specified by a negative value and, provided that no more than about two thirds of the values are missing, the algorithm may still work. However, the function g03fac does not allow for missing values, so an alternative method of obtaining an initial set of coordinates is required. It may be possible to estimate the missing values with some form of average and then use g03fac to give an initial set of coordinates.
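One possible form of averaging is sketched below: every negative (missing) entry of the packed distance array is replaced by the mean of the entries that are present. The overall mean is only one choice among many, and the function name is hypothetical. The filled copy would be used only to obtain starting coordinates, while the original array, with its negative markers, would still be the one passed to g03fcc.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the packed distance array d (len elements) to filled, replacing each
   negative (missing) entry by the mean of the non-missing entries.
   Returns the number of entries that were replaced. */
size_t fill_missing_with_mean(const double *d, size_t len, double *filled)
{
    double sum = 0.0;
    size_t present = 0, replaced = 0;

    for (size_t k = 0; k < len; ++k) {
        if (d[k] >= 0.0) {
            sum += d[k];
            ++present;
        }
    }
    assert(present > 0); /* at least one distance must be available */

    double mean = sum / (double)present;
    for (size_t k = 0; k < len; ++k) {
        if (d[k] >= 0.0) {
            filled[k] = d[k];
        } else {
            filled[k] = mean;
            ++replaced;
        }
    }
    return replaced;
}
```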
The data, given by Krzanowski (1990), are dissimilarities between water vole populations in Europe. Initial estimates are provided by the first two principal coordinates computed.