Integer type:  int32  int64  nag_int  show int32  show int32  show int64  show int64  show nag_int  show nag_int

Chapter Contents
Chapter Introduction
NAG Toolbox

# NAG Toolbox: nag_contab_binary (g11sa)

## Purpose

nag_contab_binary (g11sa) fits a latent variable model (with a single factor) to data consisting of a set of measurements on individuals in the form of binary-valued sequences (generally referred to as score patterns). Various measures of goodness-of-fit are calculated along with the factor (theta) scores.

## Syntax

[x, irl, a, c, niter, alpha, pigam, cm, g, expp, obs, exf, y, iob, rlogl, chi, idf, siglev, ifail] = g11sa(n, gprob, x, irl, a, c, cgetol, chisqr, 'ip', ip, 'ns', ns, 'iprint', iprint, 'maxit', maxit)
[x, irl, a, c, niter, alpha, pigam, cm, g, expp, obs, exf, y, iob, rlogl, chi, idf, siglev, ifail] = nag_contab_binary(n, gprob, x, irl, a, c, cgetol, chisqr, 'ip', ip, 'ns', ns, 'iprint', iprint, 'maxit', maxit)

## Description

Given a set of p$p$ dichotomous variables = (x1,x2,,xp)$\stackrel{~}{x}={\left({x}_{1},{x}_{2},\dots ,{x}_{p}\right)}^{\prime }$, where ${}^{\prime }$ denotes vector or matrix transpose, the objective is to investigate whether the association between them can be adequately explained by a latent variable model of the form (see Bartholomew (1980) and Bartholomew (1987))
 G{πi(θ)} = αi0 + αi1θ. $G{πi(θ)}=αi0+αi1θ.$ (1)
The xi${x}_{i}$ are called item responses and take the value 0$0$ or 1$1$. θ$\theta$ denotes the latent variable assumed to have a standard Normal distribution over a population of individuals to be tested on p$p$ items. Call πi(θ) = P(xi = 1θ)${\pi }_{i}\left(\theta \right)=P\left({x}_{i}=1\mid \theta \right)$ the item response function: it represents the probability that an individual with latent ability θ$\theta$ will produce a positive response (1) to item i$i$. αi0${\alpha }_{i0}$ and αi1${\alpha }_{i1}$ are item parameters which can assume any real values. The set of parameters, αi1${\alpha }_{\mathit{i}1}$, for i = 1,2,,p$\mathit{i}=1,2,\dots ,p$, being coefficients of the unobserved variable θ$\theta$, can be interpreted as ‘factor loadings’.
G$G$ is a function selected by you as either Φ1${\Phi }^{-1}$ or logit, mapping the interval (0,1)$\left(0,1\right)$ onto the whole real line. Data from a random sample of n$n$ individuals takes the form of the matrices X$X$ and R$R$ defined below:
Xs × p =
 [ x11 x12 … x1p x21 x22 … x2p ⋮ ⋮ ⋮ xs1 xs2 … xsp ]
=
 [ x̃1 x̃2 ⋮ x̃s ]
,  Rs × 1 =
 [ r1 r2 ⋮ rs ]
$Xs×p=[ x11 x12 … x1p x21 x22 … x2p ⋮ ⋮ ⋮ xs1 xs2 … xsp ]=[ x~1 x~2 ⋮ x~s ] , Rs×1=[ r1 r2 ⋮ rs ]$
where l = (xl1,xl2,,xlp) ${\stackrel{~}{x}}_{l}=\left({x}_{l1},{x}_{l2},\dots ,{x}_{lp}\right)$ denotes the l$l$th score pattern in the sample, rl${r}_{l}$ the frequency with which l${\stackrel{~}{x}}_{l}$ occurs and s$s$ the number of different score patterns observed. (Thus l = 1srl = n$\sum _{l=1}^{s}{r}_{l}=n$). It can be shown that the log-likelihood function is proportional to
 s ∑ rllogPl, l = 1
$∑ l=1 s rl log⁡Pl ,$
where
 ∞ Pl = P(x̃ = x̃l) = ∫ P(x̃ = x̃l ∣ θ)φ(θ)dθ − ∞
$Pl = P ( x~ = x~l ) = ∫ -∞ ∞ P ( x~ = x~l ∣ θ ) ϕ(θ) dθ$
(2)
(φ(θ)$\varphi \left(\theta \right)$ being the probability density function of a standard Normal random variable).
Pl${P}_{l}$ denotes the unconditional probability of observing score pattern l${\stackrel{~}{x}}_{l}$. The integral in (2) is approximated using Gauss–Hermite quadrature. If we take G(z) = logitz = log(z/(1z))$G\left(z\right)=\mathrm{logit}z=\mathrm{log}\left(\frac{z}{1-z}\right)$ in (1) and reparameterise as follows,
 αi = αi1, πi = logit − 1αi0,
$αi = αi1, πi = logit-1⁡αi0,$
then (1) reduces to the logit model (see Bartholomew (1980))
 πi(θ) = (πi)/( πi + (1 − πi) exp( − αiθ) ) . $πi(θ) = πi πi + (1-πi) exp( - αi θ ) .$
If we take G(z) = Φ1(z)$G\left(z\right)={\Phi }^{-1}\left(z\right)$ (where Φ$\Phi$ is the cumulative distribution function of a standard Normal random variable) and reparameterise as follows,
 αi = (αi1)/(sqrt((1 + αi12))) γi = ( − αi0)/(sqrt((1 + αi12)))
,
$αi = αi1(1+αi12) γi = -αi0(1+αi12) ,$
then (1) reduces to the probit model (see Bock and Aitkin (1981))
 πi(θ) = φ ((αiθ − γi)/(sqrt((1 − αi2)))) . $πi(θ)=ϕ (αiθ-γi (1-αi2) ) .$
An E-M algorithm (see Bock and Aitkin (1981)) is used to maximize the log-likelihood function. The number of quadrature points used is set initially to 10$10$ and once convergence is attained increased to 20$20$.
The theta score of an individual responding in score pattern l${\stackrel{~}{x}}_{l}$ is computed as the posterior mean, i.e., E(θl)$E\left(\theta \mid {\stackrel{~}{x}}_{l}\right)$. For the logit model the component score Xl = j = 1pαjxlj${X}_{l}=\sum _{j=1}^{p}{\alpha }_{j}{x}_{lj}$ is also calculated. (Note that in calculating the theta scores and measures of goodness-of-fit nag_contab_binary (g11sa) automatically reverses the coding on item j$j$ if αj < 0${\alpha }_{j}<0$; it is assumed in the model that a response at the one level is showing a higher measure of latent ability than a response at the zero level.)
The frequency distribution of score patterns is required as input data. If your data is in the form of individual score patterns (uncounted), then nag_contab_binary_service (g11sb) may be used to calculate the frequency distribution.

## References

Bartholomew D J (1980) Factor analysis for categorical data (with Discussion) J. Roy. Statist. Soc. Ser. B 42 293–321
Bartholomew D J (1987) Latent Variable Models and Factor Analysis Griffin
Bock R D and Aitkin M (1981) Marginal maximum likelihood estimation of item parameters: Application of an E-M algorithm Psychometrika 46 443–459

## Parameters

### Compulsory Input Parameters

1:     n – int64int32nag_int scalar
n$n$, the number of individuals in the sample.
Constraint: n7${\mathbf{n}}\ge 7$.
2:     gprob – logical scalar
Must be set equal to true if G(z) = Φ1(z)$G\left(z\right)={\Phi }^{-1}\left(z\right)$ and false if G(z) = logitz$G\left(z\right)=\mathrm{logit}z$.
3:     x(ldx,ip) – logical array
ldx, the first dimension of the array, must satisfy the constraint ldxns$\mathit{ldx}\ge {\mathbf{ns}}$.
The first s$s$ rows of x must contain the s$s$ different score patterns. The l$l$th row of x must contain the l$l$th score pattern with x(l,j)${\mathbf{x}}\left(l,j\right)$ set equal to true if xlj = 1${x}_{lj}=1$ and false if xlj = 0${x}_{lj}=0$. All rows of x must be distinct.
4:     irl(ns) – int64int32nag_int array
ns, the dimension of the array, must satisfy the constraint 2 × ip < nsmin (2ip,n)$2×{\mathbf{ip}}<{\mathbf{ns}}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({2}^{{\mathbf{ip}}},{\mathbf{n}}\right)$.
The i$i$th component of irl must be set equal to the frequency with which the i$i$th row of x occurs.
Constraints:
• irl(i)0${\mathbf{irl}}\left(\mathit{i}\right)\ge 0$, for i = 1,2,,s$\mathit{i}=1,2,\dots ,s$;
• i = 1sirl(i) = n$\sum _{i=1}^{s}{\mathbf{irl}}\left(i\right)=n$.
5:     a(ip) – double array
ip, the dimension of the array, must satisfy the constraint ip3${\mathbf{ip}}\ge 3$.
a(j)${\mathbf{a}}\left(j\right)$ must be set equal to an initial estimate of αj1${\alpha }_{j1}$. In order to avoid divergence problems with the E-M algorithm you are strongly advised to set all the a(j)${\mathbf{a}}\left(j\right)$ to 0.5$0.5$.
6:     c(ip) – double array
ip, the dimension of the array, must satisfy the constraint ip3${\mathbf{ip}}\ge 3$.
c(j)${\mathbf{c}}\left(j\right)$ must be set equal to an initial estimate of αj0${\alpha }_{j0}$. In order to avoid divergence problems with the E-M algorithm you are strongly advised to set all the c(j)${\mathbf{c}}\left(j\right)$ to 0.0$0.0$.
7:     cgetol – double scalar
The accuracy to which the solution is required.
If cgetol is set to 10l${10}^{-l}$ and on exit ${\mathbf{ifail}}={\mathbf{0}}$ or 7${\mathbf{7}}$, then all elements of the gradient vector will be smaller than 10l${10}^{-l}$ in absolute value. For most practical purposes the value 104${10}^{-4}$ should suffice. You should be wary of setting cgetol too small since the convergence criterion may then have become too strict for the machine to handle.
If cgetol has been set to a value which is less than the square root of the machine precision, ε$\epsilon$, then nag_contab_binary (g11sa) will use the value sqrt(ε)$\sqrt{\epsilon }$ instead.
8:     chisqr – logical scalar
If chisqr is set equal to true, then a likelihood ratio statistic will be calculated (see chi).
If chisqr is set equal to false, no such statistic will be calculated.

### Optional Input Parameters

1:     ip – int64int32nag_int scalar
Default: The dimension of the arrays a, c and the second dimension of the array x. (An error is raised if these dimensions are not equal.)
p$p$, the number of dichotomous variables.
Constraint: ip3${\mathbf{ip}}\ge 3$.
2:     ns – int64int32nag_int scalar
Default: The dimension of the array irl and the first dimension of the array x. (An error is raised if these dimensions are not equal.)
ns must be set equal to the number of different score patterns in the sample, s$s$.
Constraint: 2 × ip < nsmin (2ip,n)$2×{\mathbf{ip}}<{\mathbf{ns}}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({2}^{{\mathbf{ip}}},{\mathbf{n}}\right)$.
3:     iprint – int64int32nag_int scalar
The frequency with which the maximum likelihood search function is to be monitored.
iprint > 0${\mathbf{iprint}}>0$
The search is monitored once every iprint iterations, and when the number of quadrature points is increased, and again at the final solution point.
iprint = 0${\mathbf{iprint}}=0$
The search is monitored once at the final point.
iprint < 0${\mathbf{iprint}}<0$
The search is not monitored at all.
iprint should normally be set to a small positive number.
Default: 1$1$
4:     maxit – int64int32nag_int scalar
The maximum number of iterations to be made in the maximum likelihood search. There will be an error exit (see Section [Error Indicators and Warnings]) if the search function has not converged in maxit iterations.
Default: 1000$1000$
Constraint: maxit1${\mathbf{maxit}}\ge 1$.

### Input Parameters Omitted from the MATLAB Interface

ldx ishow ldcm ldexpp xl w lw

### Output Parameters

1:     x(ldx,ip) – logical array
ldxns$\mathit{ldx}\ge {\mathbf{ns}}$.
Given a valid parameter set then the first s$s$ rows of x still contain the s$s$ different score patterns. However, the following points should be noted:
 (i) If the estimated factor loading for the j$j$th item is negative then that item is re-coded, i.e., 0$0$s and 1$1$s (or true and false) in the j$j$th column of x are interchanged. (ii) The rows of x will be reordered so that the theta scores corresponding to rows of x are in increasing order of magnitude.
2:     irl(ns) – int64int32nag_int array
Given a valid parameter set then the first s$s$ components of irl are reordered as are the rows of x.
3:     a(ip) – double array
a(j)${\mathbf{a}}\left(\mathit{j}\right)$ contains the latest estimate of αj1${\alpha }_{\mathit{j}1}$, for j = 1,2,,p$\mathit{j}=1,2,\dots ,p$. (Because of possible recoding all elements of a will be positive.)
4:     c(ip) – double array
c(j)${\mathbf{c}}\left(\mathit{j}\right)$ contains the latest estimate of αj0${\alpha }_{\mathit{j}0}$, for j = 1,2,,p$\mathit{j}=1,2,\dots ,p$.
5:     niter – int64int32nag_int scalar
Given a valid parameter set then niter contains the number of iterations performed by the maximum likelihood search function.
6:     alpha(ip) – double array
Given a valid parameter set then alpha(j)${\mathbf{alpha}}\left(j\right)$ contains the latest estimate of αj${\alpha }_{j}$. (Because of possible recoding all elements of alpha will be positive.)
7:     pigam(ip) – double array
Given a valid parameter set then pigam(j)${\mathbf{pigam}}\left(j\right)$ contains the latest estimate of either πj${\pi }_{j}$ if gprob = false${\mathbf{gprob}}=\mathbf{false}$ (logit model) or γj${\gamma }_{j}$ if gprob = true${\mathbf{gprob}}=\mathbf{true}$ (probit model).
8:     cm(ldcm,2 × ip$2×{\mathbf{ip}}$) – double array
ldcm2 × ip$\mathit{ldcm}\ge 2×{\mathbf{ip}}$.
Given a valid parameter set then the strict lower triangle of cm contains the correlation matrix of the parameter estimates held in alpha and pigam on exit. The diagonal elements of cm contain the standard errors. Thus:
 cm(2 × i − 1,2 × i − 1)${\mathbf{cm}}\left(2×i-1,2×i-1\right)$ = standard error (alpha(i))$\left({\mathbf{alpha}}\left(i\right)\right)$ cm(2 × i,2 × i)${\mathbf{cm}}\left(2×i,2×i\right)$ = standard error (pigam(i))$\left({\mathbf{pigam}}\left(i\right)\right)$ cm(2 × i,2 × i − 1)${\mathbf{cm}}\left(2×i,2×i-1\right)$ = correlation (pigam(i),alpha(i))$\left({\mathbf{pigam}}\left(i\right),{\mathbf{alpha}}\left(i\right)\right)$,
for i = 1,2,,p$i=1,2,\dots ,p$;
 cm(2 × i − 1,2 × j − 1)${\mathbf{cm}}\left(2×i-1,2×j-1\right)$ = correlation (alpha(i),alpha(j))$\left({\mathbf{alpha}}\left(i\right),{\mathbf{alpha}}\left(j\right)\right)$ cm(2 × i,2 × j)${\mathbf{cm}}\left(2×i,2×j\right)$ = correlation (pigam(i),pigam(j))$\left({\mathbf{pigam}}\left(i\right),{\mathbf{pigam}}\left(j\right)\right)$ cm(2 × i − 1,2 × j)${\mathbf{cm}}\left(2×i-1,2×j\right)$ = correlation (alpha(i),pigam(j))$\left({\mathbf{alpha}}\left(i\right),{\mathbf{pigam}}\left(j\right)\right)$ cm(2 × i,2 × j − 1)${\mathbf{cm}}\left(2×i,2×j-1\right)$ = correlation (alpha(j),pigam(i))$\left({\mathbf{alpha}}\left(j\right),{\mathbf{pigam}}\left(i\right)\right)$,
for j = 1,2,,i1$j=1,2,\dots ,i-1$.
If the second derivative matrix cannot be computed then all the elements of cm are returned as zero.
9:     g(2 × ip$2×{\mathbf{ip}}$) – double array
Given a valid parameter set then g contains the estimated gradient vector corresponding to the final point held in the arrays alpha and pigam. g(2 × j1)${\mathbf{g}}\left(2×\mathit{j}-1\right)$ contains the derivative of the log-likelihood with respect to alpha(j)${\mathbf{alpha}}\left(\mathit{j}\right)$, for j = 1,2,,p$\mathit{j}=1,2,\dots ,p$. g(2 × j)${\mathbf{g}}\left(2×\mathit{j}\right)$ contains the derivative of the log-likelihood with respect to pigam(j)${\mathbf{pigam}}\left(\mathit{j}\right)$, for j = 1,2,,p$\mathit{j}=1,2,\dots ,p$.
10:   expp(ldexpp,ip) – double array
ldexppip$\mathit{ldexpp}\ge {\mathbf{ip}}$.
Given a valid parameter set then expp(i,j)${\mathbf{expp}}\left(i,j\right)$ contains the expected percentage of individuals in the sample who respond positively to items i$i$ and j$j$ (ji$j\le i$), corresponding to the final point held in the arrays alpha and pigam.
11:   obs(ldexpp,ip) – double array
ldexppip$\mathit{ldexpp}\ge {\mathbf{ip}}$.
Given a valid parameter set then obs(i,j)${\mathbf{obs}}\left(i,j\right)$ contains the observed percentage of individuals in the sample who responded positively to items i$i$ and j$j$ (ji$j\le i$).
12:   exf(ns) – double array
Given a valid parameter set then exf(l)${\mathbf{exf}}\left(l\right)$ contains the expected frequency of the l$l$th score pattern (l$l$th row of x), corresponding to the final point held in the arrays alpha and pigam.
13:   y(ns) – double array
Given a valid parameter set then y(l)${\mathbf{y}}\left(l\right)$ contains the estimated theta score corresponding to the l$l$th row of x, for the final point held in the arrays alpha and pigam.
14:   iob(ns) – int64int32nag_int array
Given a valid parameter set then iob(l)${\mathbf{iob}}\left(l\right)$ contains the number of items in the l$l$th row of x for which the response was positive (true).
15:   rlogl – double scalar
Given a valid parameter set then rlogl contains the value of the log-likelihood kernel corresponding to the final point held in the arrays alpha and pigam, namely
 s ∑ irl(l) × log(exf(l) / n). l = 1
$∑l=1 sirll×log(exfl/n).$
16:   chi – double scalar
If chisqr was set equal to true on entry, then given a valid parameter set, chi will contain the value of the likelihood ratio statistic corresponding to the final parameter estimates held in the arrays alpha and pigam, namely
 s 2 × ∑ irl(l) × log(exf(l) / irl(l)). l = 1
$2×∑l=1 sirll×log(exfl/irll).$
The summation is over those elements of irl which are positive. If exf(l)${\mathbf{exf}}\left(l\right)$ is less than 5.0$5.0$, then adjacent score patterns are pooled (the score patterns in x being first put in order of increasing theta score).
If chisqr has been set equal to false, then chi is not used.
17:   idf – int64int32nag_int scalar
If chisqr was set equal to true on entry, then given a valid parameter set, idf will contain the degrees of freedom associated with the likelihood ratio statistic, chi.
 idf = s0 − 2 × p${\mathbf{idf}}={s}_{0}-2×p$ if s0 < 2p${s}_{0}<{2}^{p}$; idf = s0 − 2 × p − 1${\mathbf{idf}}={s}_{0}-2×p-1$ if s0 = 2p${s}_{0}={2}^{p}$,
where s0${s}_{0}$ denotes the number of terms summed to calculate chi (s0 = s${s}_{0}=s$ only if there is no pooling).
If chisqr has been set equal to false, then idf is not used.
18:   siglev – double scalar
If chisqr was set equal to true on entry, then given a valid parameter set, siglev will contain the significance level of chi based on idf degrees of freedom. If idf is zero or negative then siglev is set to zero.
If chisqr was set equal to false, then siglev is not used.
19:   ifail – int64int32nag_int scalar
${\mathrm{ifail}}={\mathbf{0}}$ unless the function detects an error (see [Error Indicators and Warnings]).

## Error Indicators and Warnings

Note: nag_contab_binary (g11sa) may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

ifail = 1${\mathbf{ifail}}=1$
 On entry, ip < 3${\mathbf{ip}}<3$, or n < 7${\mathbf{n}}<7$, or ns ≤ 2 × ip${\mathbf{ns}}\le 2×{\mathbf{ip}}$, or ns > n${\mathbf{ns}}>{\mathbf{n}}$, or ns > 2ip${\mathbf{ns}}>{2}^{{\mathbf{ip}}}$, or two or more rows of x are identical, or ldx < ns$\mathit{ldx}<{\mathbf{ns}}$, or ∑ l = 1nsirl(l) ≠ n$\sum _{l=1}^{{\mathbf{ns}}}{\mathbf{irl}}\left(l\right)\ne {\mathbf{n}}$, or at least one of irl(l) < 0${\mathbf{irl}}\left(\mathit{l}\right)<0$, for l = 1,2, … ,ns$\mathit{l}=1,2,\dots ,{\mathbf{ns}}$, or maxit < 1${\mathbf{maxit}}<1$, or ishow < 0$\mathit{ishow}<0$, or ishow > 7$\mathit{ishow}>7$, or ldcm < ip + ip$\mathit{ldcm}<{\mathbf{ip}}+{\mathbf{ip}}$, or ldexpp < ip$\mathit{ldexpp}<{\mathbf{ip}}$, or lw < 4 × ip × (ip + 16)$\mathit{lw}<4×{\mathbf{ip}}×\left({\mathbf{ip}}+16\right)$.
ifail = 2${\mathbf{ifail}}=2$
For at least one of the ip items the responses are all at the same level. To proceed, you must delete this item from the dataset.
ifail = 3${\mathbf{ifail}}=3$
There have been maxit iterations of the maximum likelihood search function. If steady increases in the log-likelihood kernel were monitored up to the point where this exit occurred, then the exit probably occurred simply because maxit was set too small, so the calculations should be restarted from the final point held in a and c. This type of exit may also indicate that there is no maximum to the likelihood surface.
ifail = 4${\mathbf{ifail}}=4$
One of the elements of a has exceeded 10.0$10.0$ in absolute value (see Section [Heywood Cases]). If steady increases in the log-likelihood kernel were monitored up to the point where this exit occurred then this exit may indicate that there is no maximum to the likelihood surface. You are advised to restart the calculations from a different point to see whether the E-M algorithm moves off in the same direction.
ifail = 5${\mathbf{ifail}}=5$
This indicates a failure in nag_matop_real_symm_posdef_inv (f01ab) which is used to invert the second derivative matrix for calculating the variance-covariance matrix of parameter estimates. It was also found that maxit iterations had been performed (see ${\mathbf{ifail}}={\mathbf{3}}$). The elements of cm will then have been set to zero on exit. You are advised to restart the calculations with a larger value for maxit.
ifail = 6${\mathbf{ifail}}=6$
This indicates a failure in nag_matop_real_symm_posdef_inv (f01ab) which is used to invert the second derivative matrix for calculating the variance-covariance matrix of parameter estimates. It was also found that one of the elements of a had exceeded 10.0$10.0$ in absolute value (see ${\mathbf{ifail}}={\mathbf{4}}$). The elements of cm will have then been set to zero on exit. You are advised to restart the calculations from a different point to see whether the E-M algorithm moves off in the same direction.
W ifail = 7${\mathbf{ifail}}=7$
If chisqr was set equal to true on entry, so that a likelihood ratio statistic was calculated, then ${\mathbf{ifail}}={\mathbf{7}}$ merely indicates that the value of idf on exit is 0$\text{}\le 0$, i.e., the χ2${\chi }^{2}$ statistic is meaningless. In this case siglev is returned as zero. All other output information should be correct, i.e., can be treated as if ${\mathbf{ifail}}={\mathbf{0}}$ on exit.

## Accuracy

On exit from nag_contab_binary (g11sa) if ${\mathbf{ifail}}={\mathbf{0}}$ or 7${\mathbf{7}}$ then the following condition will be satisfied:
 max {|g(i)|} < cgetol. 1 ≤ i ≤ 2 × p
$max 1≤i≤2×p { |gi| } < cgetol .$
If ${\mathbf{ifail}}={\mathbf{3}}$ or 5${\mathbf{5}}$ on exit (i.e., maxit iterations have been performed but the above condition does not hold), then the elements in a, c, alpha and pigam may still be good approximations to the maximum likelihood estimates. You are advised to inspect the elements of g to see whether this is confirmed.

### Timing

The number of iterations required in the maximum likelihood search depends upon the number of observed variables, p$p$, and the distance of the starting point you supplied from the solution. The number of multiplications and divisions performed in an iteration is proportional to p$p$.

### Initial Estimates

You are strongly advised to use the recommended starting values for the elements of a and c. Divergence may result from values you supplied even if they are very close to the solution. Divergence may also occur when an item has nearly all its responses at one level.

### Heywood Cases

As in normal factor analysis, Heywood cases can often occur, particularly when p$p$ is small and n$n$ not very big. To overcome this difficulty the maximum likelihood search function is terminated when the absolute value of one of the αj1${\alpha }_{j1}$ exceeds 10.0$10.0$. You have the option of deciding whether to exit from nag_contab_binary (g11sa) (by setting ${\mathbf{ifail}}={\mathbf{0}}$ on entry) or to permit nag_contab_binary (g11sa) to proceed onwards as if it had exited normally from the maximum likelihood search function (setting ifail = 1${\mathbf{ifail}}=-1$ on entry). The elements in a, c, alpha and pigam may still be good approximations to the maximum likelihood estimates. You are advised to inspect the elements g to see whether this is confirmed.

### Goodness of Fit Statistic

When n$n$ is not very large compared to s$s$ a goodness-of-fit statistic should not be calculated as many of the expected frequencies will then be less than 5$5$.

### First and Second Order Margins

The observed and expected percentages of sample members responding to individual and pairs of items held in the arrays obs and expp on exit can be converted to observed and expected numbers by multiplying all elements of these two arrays by n / 100.0$n/100.0$.

## Example

```function nag_contab_binary_example
n = int64(1000);
gprob = false;
x = [false, false, false, false;
true, false, false, false;
false, false, false, true;
false, true, false, false;
true, false, false, true;
true, true, false, false;
false, true, false, true;
false, false, true, false;
true, true, false, true;
true, false, true, false;
false, false, true, true;
false, true, true, false;
true, false, true, true;
true, true, true, false;
false, true, true, true;
true, true, true, true];
irl = [int64(154);11;42;49;2;10;27;84;10;25;75;129;30;50;181;121];
a = [0.5;
0.5;
0.5;
0.5];
c = [0;
0;
0;
0];
cgetol = 0.0001;
chisqr = true;
[xOut, irlOut, aOut, cOut, niter, alpha, pigam, cm, g, expp, obs, exf, y, ...
iob, rlogl, chi, idf, siglev, ifail] = ...
nag_contab_binary(n, gprob, x, irl, a, c, cgetol, chisqr, 'iprint', int64(0))
```
```
ITERATION NUMBER =   148

VALUE OF LOG LIKELIHOOD KERNEL =   -0.24039E+04

MAGNITUDE OF LARGEST COMPONENT OF GRADIENT VECTOR =      0.953E-04

CURRENT ESTIMATES OF ALPHA(J,1)'S
---------------------------------

1.045         1.409         2.659         1.122

-----------------------------

-0.147E-04    -0.501E-04     0.953E-04    -0.269E-04

CURRENT ESTIMATES OF ALPHA(J,0)'S
---------------------------------

-1.276         0.424         1.615        -0.062

-----------------------------

0.900E-06     0.737E-06    -0.315E-06     0.863E-06

************************************************************************

xOut =

0     0     0     0
1     0     0     0
0     0     0     1
0     1     0     0
1     0     0     1
1     1     0     0
0     1     0     1
0     0     1     0
1     1     0     1
1     0     1     0
0     0     1     1
0     1     1     0
1     0     1     1
1     1     1     0
0     1     1     1
1     1     1     1

irlOut =

154
11
42
49
2
10
27
84
10
25
75
129
30
50
181
121

aOut =

1.0455
1.4094
2.6592
1.1217

cOut =

-1.2764
0.4237
1.6151
-0.0617

niter =

148

alpha =

1.0455
1.4094
2.6592
1.1217

pigam =

0.2182
0.6044
0.8341
0.4846

cm =

0.1481   -0.5371    0.0076   -0.0133   -0.1266   -0.1339   -0.0109   -0.0035
-0.5371    0.0174    0.0126    0.1687    0.0835    0.1825    0.0228    0.1480
0.0076    0.0126    0.1789    0.2485   -0.3595   -0.3341    0.0414    0.0001
-0.0133    0.1687    0.2485    0.0216   -0.0884    0.0900    0.0006    0.2255
-0.1266    0.0835   -0.3595   -0.0884    0.5248    0.8588   -0.2045    0.0169
-0.1339    0.1825   -0.3341    0.0900    0.8588    0.0358   -0.2001    0.1643
-0.0109    0.0228    0.0414    0.0006   -0.2045   -0.2001    0.1396   -0.0375
-0.0035    0.1480    0.0001    0.2255    0.0169    0.1643   -0.0375    0.0199

g =

1.0e-04 *

-0.1314
0.0473
-0.4490
0.0276
0.8535
-0.0204
-0.2411
0.0309

expp =

25.8963         0         0         0
19.0888   57.6547         0         0
22.4987   47.9571   69.4214         0
16.4381   33.8672   40.5658   48.7712

obs =

25.9000         0         0         0
19.1000   57.7000         0         0
22.6000   48.1000   69.5000         0
16.3000   33.9000   40.7000   48.8000

exf =

147.0609
13.4437
42.4201
54.8180
5.8856
8.4102
27.5115
92.0619
6.2365
21.8468
73.8352
123.7656
26.8989
50.8813
179.5643
125.3596

y =

-1.2735
-0.8731
-0.8462
-0.7469
-0.4941
-0.3995
-0.3743
-0.3320
-0.0187
0.0272
0.0549
0.1618
0.4659
0.5913
0.6256
1.1444

iob =

0
1
1
1
2
2
2
1
3
2
2
2
3
3
3
4

rlogl =

-2.4039e+03

chi =

9.0273

idf =

7

siglev =

0.2507

ifail =

0

```
```function g11sa_example
n = int64(1000);
gprob = false;
x = [false, false, false, false;
true, false, false, false;
false, false, false, true;
false, true, false, false;
true, false, false, true;
true, true, false, false;
false, true, false, true;
false, false, true, false;
true, true, false, true;
true, false, true, false;
false, false, true, true;
false, true, true, false;
true, false, true, true;
true, true, true, false;
false, true, true, true;
true, true, true, true];
irl = [int64(154);11;42;49;2;10;27;84;10;25;75;129;30;50;181;121];
a = [0.5;
0.5;
0.5;
0.5];
c = [0;
0;
0;
0];
cgetol = 0.0001;
chisqr = true;
[xOut, irlOut, aOut, cOut, niter, alpha, pigam, cm, g, expp, obs, exf, y, ...
iob, rlogl, chi, idf, siglev, ifail] = ...
g11sa(n, gprob, x, irl, a, c, cgetol, chisqr, 'iprint', int64(0))
```
```
ITERATION NUMBER =   148

VALUE OF LOG LIKELIHOOD KERNEL =   -0.24039E+04

MAGNITUDE OF LARGEST COMPONENT OF GRADIENT VECTOR =      0.953E-04

CURRENT ESTIMATES OF ALPHA(J,1)'S
---------------------------------

1.045         1.409         2.659         1.122

-----------------------------

-0.147E-04    -0.501E-04     0.953E-04    -0.269E-04

CURRENT ESTIMATES OF ALPHA(J,0)'S
---------------------------------

-1.276         0.424         1.615        -0.062

-----------------------------

0.900E-06     0.737E-06    -0.315E-06     0.863E-06

************************************************************************

xOut =

0     0     0     0
1     0     0     0
0     0     0     1
0     1     0     0
1     0     0     1
1     1     0     0
0     1     0     1
0     0     1     0
1     1     0     1
1     0     1     0
0     0     1     1
0     1     1     0
1     0     1     1
1     1     1     0
0     1     1     1
1     1     1     1

irlOut =

154
11
42
49
2
10
27
84
10
25
75
129
30
50
181
121

aOut =

1.0455
1.4094
2.6592
1.1217

cOut =

-1.2764
0.4237
1.6151
-0.0617

niter =

148

alpha =

1.0455
1.4094
2.6592
1.1217

pigam =

0.2182
0.6044
0.8341
0.4846

cm =

0.1481   -0.5371    0.0076   -0.0133   -0.1266   -0.1339   -0.0109   -0.0035
-0.5371    0.0174    0.0126    0.1687    0.0835    0.1825    0.0228    0.1480
0.0076    0.0126    0.1789    0.2485   -0.3595   -0.3341    0.0414    0.0001
-0.0133    0.1687    0.2485    0.0216   -0.0884    0.0900    0.0006    0.2255
-0.1266    0.0835   -0.3595   -0.0884    0.5248    0.8588   -0.2045    0.0169
-0.1339    0.1825   -0.3341    0.0900    0.8588    0.0358   -0.2001    0.1643
-0.0109    0.0228    0.0414    0.0006   -0.2045   -0.2001    0.1396   -0.0375
-0.0035    0.1480    0.0001    0.2255    0.0169    0.1643   -0.0375    0.0199

g =

1.0e-04 *

-0.1314
0.0473
-0.4490
0.0276
0.8535
-0.0204
-0.2411
0.0309

expp =

25.8963         0         0         0
19.0888   57.6547         0         0
22.4987   47.9571   69.4214         0
16.4381   33.8672   40.5658   48.7712

obs =

25.9000         0         0         0
19.1000   57.7000         0         0
22.6000   48.1000   69.5000         0
16.3000   33.9000   40.7000   48.8000

exf =

147.0609
13.4437
42.4201
54.8180
5.8856
8.4102
27.5115
92.0619
6.2365
21.8468
73.8352
123.7656
26.8989
50.8813
179.5643
125.3596

y =

-1.2735
-0.8731
-0.8462
-0.7469
-0.4941
-0.3995
-0.3743
-0.3320
-0.0187
0.0272
0.0549
0.1618
0.4659
0.5913
0.6256
1.1444

iob =

0
1
1
1
2
2
2
1
3
2
2
2
3
3
3
4

rlogl =

-2.4039e+03

chi =

9.0273

idf =

7

siglev =

0.2507

ifail =

0

```

Chapter Contents
Chapter Introduction
NAG Toolbox

© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013