
# NAG Library Function Document: nag_binary_factor (g11sac)

## 1  Purpose

nag_binary_factor (g11sac) fits a latent variable model (with a single factor) to data consisting of a set of measurements on individuals in the form of binary-valued sequences (generally referred to as score patterns). Various measures of goodness-of-fit are calculated along with the factor (theta) scores.

## 2  Specification

 #include <nag.h>
 #include <nagg11.h>
 void nag_binary_factor (Nag_OrderType order, Integer p, Integer n, Nag_Boolean gprob, Integer ns, Nag_Boolean x[], Integer pdx, Integer irl[], double a[], double c[], Integer iprint, const char *outfile, double cgetol, Integer maxit, Nag_Boolean chisqr, Integer *niter, double alpha[], double pigam[], double cm[], Integer pdcm, double g[], double expp[], Integer pde, double obs[], double exf[], double y[], Integer iob[], double *rlogl, double *chi, Integer *idf, double *siglev, NagError *fail)

## 3  Description

Given a set of $p$ dichotomous variables $\stackrel{~}{x}={\left({x}_{1},{x}_{2},\dots ,{x}_{p}\right)}^{\prime }$, where ${}^{\prime }$ denotes vector or matrix transpose, the objective is to investigate whether the association between them can be adequately explained by a latent variable model of the form (see Bartholomew (1980) and Bartholomew (1987))
 $G\left\{{\pi }_{i}\left(\theta \right)\right\}={\alpha }_{i0}+{\alpha }_{i1}\theta .$ (1)
The ${x}_{i}$ are called item responses and take the value $0$ or $1$. $\theta$ denotes the latent variable assumed to have a standard Normal distribution over a population of individuals to be tested on $p$ items. Call ${\pi }_{i}\left(\theta \right)=P\left({x}_{i}=1\mid \theta \right)$ the item response function: it represents the probability that an individual with latent ability $\theta$ will produce a positive response (1) to item $i$. ${\alpha }_{i0}$ and ${\alpha }_{i1}$ are item parameters which can assume any real values. The set of parameters, ${\alpha }_{\mathit{i}1}$, for $\mathit{i}=1,2,\dots ,p$, being coefficients of the unobserved variable $\theta$, can be interpreted as ‘factor loadings’.
$G$ is a function selected by you as either ${\Phi }^{-1}$ or logit, mapping the interval $\left(0,1\right)$ onto the whole real line. Data from a random sample of $n$ individuals takes the form of the matrices $X$ and $R$ defined below:
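 ${X}_{s×p}=\begin{pmatrix}{x}_{11}& {x}_{12}& \dots & {x}_{1p}\\ {x}_{21}& {x}_{22}& \dots & {x}_{2p}\\ \vdots & \vdots & & \vdots \\ {x}_{s1}& {x}_{s2}& \dots & {x}_{sp}\end{pmatrix}=\begin{pmatrix}{\stackrel{~}{x}}_{1}\\ {\stackrel{~}{x}}_{2}\\ \vdots \\ {\stackrel{~}{x}}_{s}\end{pmatrix},\quad {R}_{s×1}=\begin{pmatrix}{r}_{1}\\ {r}_{2}\\ \vdots \\ {r}_{s}\end{pmatrix}$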
where ${\stackrel{~}{x}}_{l}=\left({x}_{l1},{x}_{l2},\dots ,{x}_{lp}\right)$ denotes the $l$th score pattern in the sample, ${r}_{l}$ the frequency with which ${\stackrel{~}{x}}_{l}$ occurs and $s$ the number of different score patterns observed. (Thus $\sum _{l=1}^{s}{r}_{l}=n$). It can be shown that the log-likelihood function is proportional to
 $\sum _{l=1}^{s}{r}_{l}\mathrm{log}{P}_{l},$
where
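 ${P}_{l}=P\left(\stackrel{~}{x}={\stackrel{~}{x}}_{l}\right)={\int }_{-\infty }^{\infty }P\left(\stackrel{~}{x}={\stackrel{~}{x}}_{l}\mid \theta \right)\varphi \left(\theta \right)\,d\theta$ (2)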
($\varphi \left(\theta \right)$ being the probability density function of a standard Normal random variable).
${P}_{l}$ denotes the unconditional probability of observing score pattern ${\stackrel{~}{x}}_{l}$. The integral in (2) is approximated using Gauss–Hermite quadrature. If we take $G\left(z\right)=\mathrm{logit}z=\mathrm{log}\left(\frac{z}{1-z}\right)$ in (1) and reparameterise as follows,
 ${\alpha }_{i}={\alpha }_{i1},\quad {\pi }_{i}={\mathrm{logit}}^{-1}{\alpha }_{i0},$
then (1) reduces to the logit model (see Bartholomew (1980))
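 ${\pi }_{i}\left(\theta \right)=\frac{{\pi }_{i}}{{\pi }_{i}+\left(1-{\pi }_{i}\right)\mathrm{exp}\left(-{\alpha }_{i}\theta \right)}.$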
If we take $G\left(z\right)={\Phi }^{-1}\left(z\right)$ (where $\Phi$ is the cumulative distribution function of a standard Normal random variable) and reparameterise as follows,
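 ${\alpha }_{i}=\frac{{\alpha }_{i1}}{\sqrt{1+{\alpha }_{i1}^{2}}},\quad {\gamma }_{i}=\frac{-{\alpha }_{i0}}{\sqrt{1+{\alpha }_{i1}^{2}}},$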
then (1) reduces to the probit model (see Bock and Aitkin (1981))
 ${\pi }_{i}\left(\theta \right)=\Phi \left(\frac{{\alpha }_{i}\theta -{\gamma }_{i}}{\sqrt{1-{\alpha }_{i}^{2}}}\right).$
An E-M algorithm (see Bock and Aitkin (1981)) is used to maximize the log-likelihood function. The number of quadrature points is set initially to $10$ and, once convergence is attained, is increased to $20$.
The theta score of an individual responding in score pattern ${\stackrel{~}{x}}_{l}$ is computed as the posterior mean, i.e., $E\left(\theta \mid {\stackrel{~}{x}}_{l}\right)$. For the logit model the component score ${X}_{l}=\sum _{j=1}^{p}{\alpha }_{j}{x}_{lj}$ is also calculated. (Note that in calculating the theta scores and measures of goodness-of-fit nag_binary_factor (g11sac) automatically reverses the coding on item $j$ if ${\alpha }_{j}<0$; it is assumed in the model that a response at the one level is showing a higher measure of latent ability than a response at the zero level.)
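The two scores described above can be sketched as follows. The code is illustrative and makes no claim about how the routine computes them: `component_score` and `posterior_mean` are hypothetical names, and the posterior mean is formed on a caller-supplied quadrature grid with nodes `t`, weights `w` and conditional probabilities `cond[k]` $=P\left({\stackrel{~}{x}}_{l}\mid \theta ={t}_{k}\right)$.

```c
/* Component score X_l = sum_j alpha_j * x_lj for one score pattern
   x[0..p-1] with entries 0 or 1. */
double component_score(int p, const int x[], const double alpha[])
{
    double score = 0.0;
    for (int j = 0; j < p; ++j)
        if (x[j])
            score += alpha[j];
    return score;
}

/* Posterior mean E(theta | x_l) on a quadrature grid of nq points:
   ratio of sum_k w[k]*t[k]*cond[k] to sum_k w[k]*cond[k]. */
double posterior_mean(int nq, const double t[], const double w[],
                      const double cond[])
{
    double num = 0.0, den = 0.0;
    for (int k = 0; k < nq; ++k) {
        num += w[k] * t[k] * cond[k];
        den += w[k] * cond[k];
    }
    return num / den;
}
```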
The frequency distribution of score patterns is required as input data. If your data is in the form of individual score patterns (uncounted), then nag_binary_factor_service (g11sbc) may be used to calculate the frequency distribution.
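The tallying step itself is simple. The sketch below shows one way to count distinct score patterns in plain C; it is not the interface of nag_binary_factor_service (g11sbc), and the name `tally_patterns` and its argument layout are assumptions made for illustration.

```c
#include <stddef.h>

/* Tally n individual score patterns, stored row by row in data (p entries
   of 0 or 1 per individual), into distinct patterns and frequencies.
   pat needs room for n*p entries and freq for n entries.
   Returns s, the number of distinct patterns found. */
int tally_patterns(int n, int p, const int data[], int pat[], int freq[])
{
    int s = 0;
    for (int i = 0; i < n; ++i) {
        const int *row = data + (size_t)i * p;
        int l;
        /* Linear search for an identical pattern already recorded. */
        for (l = 0; l < s; ++l) {
            int same = 1;
            for (int j = 0; j < p && same; ++j)
                same = (pat[l * p + j] == row[j]);
            if (same)
                break;
        }
        if (l == s) {                 /* new pattern: append it */
            for (int j = 0; j < p; ++j)
                pat[s * p + j] = row[j];
            freq[s++] = 0;
        }
        ++freq[l];
    }
    return s;
}
```

The linear search keeps the sketch short; for large samples a sort or hash on the pattern would be more efficient.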

## 4  References

Bartholomew D J (1980) Factor analysis for categorical data (with Discussion) J. Roy. Statist. Soc. Ser. B 42 293–321
Bartholomew D J (1987) Latent Variable Models and Factor Analysis Griffin
Bock R D and Aitkin M (1981) Marginal maximum likelihood estimation of item parameters: Application of an E-M algorithm Psychometrika 46 443–459

## 5  Arguments

1:     order (Nag_OrderType, Input)
On entry: the order argument specifies the two-dimensional storage scheme being used, i.e., row-major ordering or column-major ordering. C language defined storage is specified by ${\mathbf{order}}=\mathrm{Nag_RowMajor}$. See Section 3.2.1.3 in the Essential Introduction for a more detailed explanation of the use of this argument.
Constraint: ${\mathbf{order}}=\mathrm{Nag_RowMajor}$ or $\mathrm{Nag_ColMajor}$.
2:     p (Integer, Input)
On entry: $p$, the number of dichotomous variables.
Constraint: ${\mathbf{p}}\ge 3$.
3:     n (Integer, Input)
On entry: $n$, the number of individuals in the sample.
Constraint: ${\mathbf{n}}\ge 7$.
4:     gprob (Nag_Boolean, Input)
On entry: must be set equal to Nag_TRUE if $G\left(z\right)={\Phi }^{-1}\left(z\right)$ and Nag_FALSE if $G\left(z\right)=\mathrm{logit}z$.
5:     ns (Integer, Input)
On entry: ns must be set equal to the number of different score patterns in the sample, $s$.
Constraint: $2×{\mathbf{p}}<{\mathbf{ns}}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({2}^{{\mathbf{p}}},{\mathbf{n}}\right)$.
6:     x[$\mathit{dim}$] (Nag_Boolean, Input/Output)
Note: the dimension, dim, of the array x must be at least
• $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(1,{\mathbf{pdx}}×{\mathbf{p}}\right)$ when ${\mathbf{order}}=\mathrm{Nag_ColMajor}$;
• $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(1,{\mathbf{ns}}×{\mathbf{pdx}}\right)$ when ${\mathbf{order}}=\mathrm{Nag_RowMajor}$.
Where ${\mathbf{X}}\left(l,j\right)$ appears in this document, it refers to the array element
• ${\mathbf{x}}\left[\left(j-1\right)×{\mathbf{pdx}}+l-1\right]$ when ${\mathbf{order}}=\mathrm{Nag_ColMajor}$;
• ${\mathbf{x}}\left[\left(l-1\right)×{\mathbf{pdx}}+j-1\right]$ when ${\mathbf{order}}=\mathrm{Nag_RowMajor}$.
On entry: the first $s$ rows of x must contain the $s$ different score patterns. The $l$th row of x must contain the $l$th score pattern with ${\mathbf{X}}\left(l,j\right)$ set equal to Nag_TRUE if ${x}_{lj}=1$ and Nag_FALSE if ${x}_{lj}=0$. All rows of x must be distinct.
On exit: given a valid parameter set then the first $s$ rows of x still contain the $s$ different score patterns. However, the following points should be noted:
 (i) If the estimated factor loading for the $j$th item is negative then that item is re-coded, i.e., $0$s and $1$s (or Nag_TRUE and Nag_FALSE) in the $j$th column of x are interchanged.
 (ii) The rows of x will be reordered so that the theta scores corresponding to rows of x are in increasing order of magnitude.
7:     pdx (Integer, Input)
On entry: the stride separating row or column elements (depending on the value of order) in the array x.
Constraints:
• if ${\mathbf{order}}=\mathrm{Nag_ColMajor}$, ${\mathbf{pdx}}\ge {\mathbf{ns}}$;
• if ${\mathbf{order}}=\mathrm{Nag_RowMajor}$, ${\mathbf{pdx}}\ge {\mathbf{p}}$.
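The mapping from ${\mathbf{X}}\left(l,j\right)$ to a position in the flat array x can be written as a small helper. This is a sketch only: `x_index` is a hypothetical name, and the flag `row_major` stands in for the test ${\mathbf{order}}=\mathrm{Nag_RowMajor}$.

```c
/* Position of X(l, j), with 1-based l and j, in the flat array x.
   row_major is nonzero for row-major storage and zero for column-major;
   pdx is the stride described above. */
int x_index(int row_major, int pdx, int l, int j)
{
    return row_major ? (l - 1) * pdx + (j - 1)
                     : (j - 1) * pdx + (l - 1);
}
```

For example, with row-major storage and `pdx = 4`, ${\mathbf{X}}\left(2,3\right)$ is element `x[6]`; with column-major storage and `pdx = 16` it is element `x[33]`.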
8:     irl[ns] (Integer, Input/Output)
On entry: the $i$th component of irl must be set equal to the frequency with which the $i$th row of x occurs.
Constraints:
• ${\mathbf{irl}}\left[\mathit{i}-1\right]\ge 0$, for $\mathit{i}=1,2,\dots ,s$;
• $\sum _{i=1}^{s}{\mathbf{irl}}\left[i-1\right]=n$.
On exit: given a valid parameter set then the first $s$ components of irl are reordered as are the rows of x.
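It can be convenient to check the documented constraints on p, n, ns and irl before calling the routine. The function below is a hedged sketch in plain C (the name `inputs_valid` is hypothetical, and `long long` stands in for the NAG Integer type); it does not check that the rows of x are distinct.

```c
#include <limits.h>

/* Returns 1 if p >= 3, n >= 7, 2*p < ns <= min(2^p, n), every irl[i] >= 0
   and the irl[i] sum to n; returns 0 otherwise. */
int inputs_valid(int p, long long n, long long ns, const long long irl[])
{
    if (p < 3 || n < 7)
        return 0;
    /* 2^p, saturated so that large p cannot overflow the shift */
    long long two_p = (p < 62) ? (1LL << p) : LLONG_MAX;
    long long upper = (two_p < n) ? two_p : n;
    if (ns <= 2LL * p || ns > upper)
        return 0;
    long long total = 0;
    for (long long i = 0; i < ns; ++i) {
        /* Reject negative counts and any count that would push the sum past n. */
        if (irl[i] < 0 || irl[i] > n - total)
            return 0;
        total += irl[i];
    }
    return total == n;
}
```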
9:     a[p] (double, Input/Output)
On entry: ${\mathbf{a}}\left[j-1\right]$ must be set equal to an initial estimate of ${\alpha }_{j1}$. In order to avoid divergence problems with the E-M algorithm you are strongly advised to set all the ${\mathbf{a}}\left[j-1\right]$ to $0.5$.
On exit: ${\mathbf{a}}\left[\mathit{j}-1\right]$ contains the latest estimate of ${\alpha }_{\mathit{j}1}$, for $\mathit{j}=1,2,\dots ,p$. (Because of possible recoding all elements of a will be positive.)
10:   c[p] (double, Input/Output)
On entry: ${\mathbf{c}}\left[j-1\right]$ must be set equal to an initial estimate of ${\alpha }_{j0}$. In order to avoid divergence problems with the E-M algorithm you are strongly advised to set all the ${\mathbf{c}}\left[j-1\right]$ to $0.0$.
On exit: ${\mathbf{c}}\left[\mathit{j}-1\right]$ contains the latest estimate of ${\alpha }_{\mathit{j}0}$, for $\mathit{j}=1,2,\dots ,p$.
11:   iprint (Integer, Input)
On entry: the frequency with which the maximum likelihood search function is to be monitored.
${\mathbf{iprint}}>0$
The search is monitored once every iprint iterations, and when the number of quadrature points is increased, and again at the final solution point.
${\mathbf{iprint}}=0$
The search is monitored once at the final point.
${\mathbf{iprint}}<0$
The search is not monitored at all.
iprint should normally be set to a small positive number.
Suggested value: ${\mathbf{iprint}}=1$.
12:   outfile (const char *, Input)
On entry: the name of a file to which diagnostic output will be directed. If outfile is NULL the diagnostic output will be directed to standard output.
13:   cgetol (double, Input)
On entry: the accuracy to which the solution is required.
If cgetol is set to ${10}^{-l}$ and on exit ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or NE_ZERO_DF, then all elements of the gradient vector will be smaller than ${10}^{-l}$ in absolute value. For most practical purposes the value ${10}^{-4}$ should suffice. You should be wary of setting cgetol too small since the convergence criterion may then have become too strict for the machine to handle.
If cgetol has been set to a value which is less than the square root of the machine precision, $\epsilon$, then nag_binary_factor (g11sac) will use the value $\sqrt{\epsilon }$ instead.
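The effect of this lower bound can be sketched as below. The code is illustrative only: `effective_cgetol` is a hypothetical name, and `DBL_EPSILON` from `<float.h>` is used as a stand-in for the library's machine precision $\epsilon$, which may differ from it by a small constant factor.

```c
#include <float.h>
#include <math.h>

/* Tolerance actually applied: the requested value, but never less than
   sqrt(eps). DBL_EPSILON is an assumed stand-in for the library's eps. */
double effective_cgetol(double cgetol)
{
    double floor_tol = sqrt(DBL_EPSILON);
    return (cgetol < floor_tol) ? floor_tol : cgetol;
}
```

Under this assumption a request such as `1e-12` would be replaced by roughly `1.5e-8`, while the suggested `1e-4` would be used unchanged.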
14:   maxit (Integer, Input)
On entry: the maximum number of iterations to be made in the maximum likelihood search. There will be an error exit (see Section 6) if the search function has not converged in maxit iterations.
Suggested value: ${\mathbf{maxit}}=1000$.
Constraint: ${\mathbf{maxit}}\ge 1$.
15:   chisqr (Nag_Boolean, Input)
On entry: if chisqr is set equal to Nag_TRUE, then a likelihood ratio statistic will be calculated (see chi).
If chisqr is set equal to Nag_FALSE, no such statistic will be calculated.
16:   niter (Integer *, Output)
On exit: given a valid parameter set then niter contains the number of iterations performed by the maximum likelihood search function.
17:   alpha[p] (double, Output)
On exit: given a valid parameter set then ${\mathbf{alpha}}\left[j-1\right]$ contains the latest estimate of ${\alpha }_{j}$. (Because of possible recoding all elements of alpha will be positive.)
18:   pigam[p] (double, Output)
On exit: given a valid parameter set then ${\mathbf{pigam}}\left[j-1\right]$ contains the latest estimate of either ${\pi }_{j}$ if ${\mathbf{gprob}}=\mathrm{Nag_FALSE}$ (logit model) or ${\gamma }_{j}$ if ${\mathbf{gprob}}=\mathrm{Nag_TRUE}$ (probit model).
19:   cm[$\mathit{dim}$] (double, Output)
Note: the dimension, dim, of the array cm must be at least ${\mathbf{pdcm}}×2×{\mathbf{p}}$.
Where ${\mathbf{CM}}\left(i,j\right)$ appears in this document, it refers to the array element
• if ${\mathbf{order}}=\mathrm{Nag_ColMajor}$, ${\mathbf{cm}}\left[\left(j-1\right)×{\mathbf{pdcm}}+i-1\right]$;
• if ${\mathbf{order}}=\mathrm{Nag_RowMajor}$, ${\mathbf{cm}}\left[\left(i-1\right)×{\mathbf{pdcm}}+j-1\right]$.
On exit: given a valid parameter set then the strict lower triangle of cm contains the correlation matrix of the parameter estimates held in alpha and pigam on exit. The diagonal elements of cm contain the standard errors. Thus:
• ${\mathbf{CM}}\left(2×i-1,2×i-1\right)$ = standard error $\left({\mathbf{alpha}}\left[i-1\right]\right)$;
• ${\mathbf{CM}}\left(2×i,2×i\right)$ = standard error $\left({\mathbf{pigam}}\left[i-1\right]\right)$;
• ${\mathbf{CM}}\left(2×i,2×i-1\right)$ = correlation $\left({\mathbf{pigam}}\left[i-1\right],{\mathbf{alpha}}\left[i-1\right]\right)$,
for $i=1,2,\dots ,p$;
• ${\mathbf{CM}}\left(2×i-1,2×j-1\right)$ = correlation $\left({\mathbf{alpha}}\left[i-1\right],{\mathbf{alpha}}\left[j-1\right]\right)$;
• ${\mathbf{CM}}\left(2×i,2×j\right)$ = correlation $\left({\mathbf{pigam}}\left[i-1\right],{\mathbf{pigam}}\left[j-1\right]\right)$;
• ${\mathbf{CM}}\left(2×i-1,2×j\right)$ = correlation $\left({\mathbf{alpha}}\left[i-1\right],{\mathbf{pigam}}\left[j-1\right]\right)$;
• ${\mathbf{CM}}\left(2×i,2×j-1\right)$ = correlation $\left({\mathbf{alpha}}\left[j-1\right],{\mathbf{pigam}}\left[i-1\right]\right)$,
for $j=1,2,\dots ,i-1$.
If the second derivative matrix cannot be computed then all the elements of cm are returned as zero.
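Reading individual entries out of cm follows directly from the layout above. The helpers below are a sketch for row-major storage only; the names `cm_index`, `se_alpha`, `se_pigam` and `corr_pigam_alpha` are hypothetical, and `i` is the 1-based item number.

```c
/* Position of CM(i, j), 1-based, in cm for row-major storage. */
static int cm_index(int pdcm, int i, int j)
{
    return (i - 1) * pdcm + (j - 1);
}

/* Standard error of alpha[i-1], held at CM(2i-1, 2i-1). */
double se_alpha(const double cm[], int pdcm, int i)
{
    return cm[cm_index(pdcm, 2 * i - 1, 2 * i - 1)];
}

/* Standard error of pigam[i-1], held at CM(2i, 2i). */
double se_pigam(const double cm[], int pdcm, int i)
{
    return cm[cm_index(pdcm, 2 * i, 2 * i)];
}

/* Correlation between pigam[i-1] and alpha[i-1], held at CM(2i, 2i-1). */
double corr_pigam_alpha(const double cm[], int pdcm, int i)
{
    return cm[cm_index(pdcm, 2 * i, 2 * i - 1)];
}
```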
20:   pdcm (Integer, Input)
On entry: the stride separating row or column elements (depending on the value of order) of the matrix $C$ in the array cm.
Constraint: ${\mathbf{pdcm}}\ge 2×{\mathbf{p}}$.
21:   g[$2×{\mathbf{p}}$] (double, Output)
On exit: given a valid parameter set then g contains the estimated gradient vector corresponding to the final point held in the arrays alpha and pigam. ${\mathbf{g}}\left[2×\mathit{j}-2\right]$ contains the derivative of the log-likelihood with respect to ${\mathbf{alpha}}\left[\mathit{j}-1\right]$, for $\mathit{j}=1,2,\dots ,p$. ${\mathbf{g}}\left[2×\mathit{j}-1\right]$ contains the derivative of the log-likelihood with respect to ${\mathbf{pigam}}\left[\mathit{j}-1\right]$, for $\mathit{j}=1,2,\dots ,p$.
22:   expp[$\mathit{dim}$] (double, Output)
Note: the dimension, dim, of the array expp must be at least ${\mathbf{pde}}×{\mathbf{p}}$.
Where ${\mathbf{EXPP}}\left(i,j\right)$ appears in this document, it refers to the array element
• if ${\mathbf{order}}=\mathrm{Nag_ColMajor}$, ${\mathbf{expp}}\left[\left(j-1\right)×{\mathbf{pde}}+i-1\right]$;
• if ${\mathbf{order}}=\mathrm{Nag_RowMajor}$, ${\mathbf{expp}}\left[\left(i-1\right)×{\mathbf{pde}}+j-1\right]$.
On exit: given a valid parameter set then ${\mathbf{EXPP}}\left(i,j\right)$ contains the expected percentage of individuals in the sample who respond positively to items $i$ and $j$ ($j\le i$), corresponding to the final point held in the arrays alpha and pigam.
23:   pde (Integer, Input)
On entry: the stride separating row or column elements (depending on the value of order) of the matrix $E$ in the array expp.
Constraint: ${\mathbf{pde}}\ge {\mathbf{p}}$.
24:   obs[$\mathit{dim}$] (double, Output)
Note: the dimension, dim, of the array obs must be at least ${\mathbf{pde}}×{\mathbf{p}}$.
Where ${\mathbf{OBS}}\left(i,j\right)$ appears in this document, it refers to the array element
• if ${\mathbf{order}}=\mathrm{Nag_ColMajor}$, ${\mathbf{obs}}\left[\left(j-1\right)×{\mathbf{pde}}+i-1\right]$;
• if ${\mathbf{order}}=\mathrm{Nag_RowMajor}$, ${\mathbf{obs}}\left[\left(i-1\right)×{\mathbf{pde}}+j-1\right]$.
On exit: given a valid parameter set then ${\mathbf{OBS}}\left(i,j\right)$ contains the observed percentage of individuals in the sample who responded positively to items $i$ and $j$ ($j\le i$).
25:   exf[ns] (double, Output)
On exit: given a valid parameter set then ${\mathbf{exf}}\left[l-1\right]$ contains the expected frequency of the $l$th score pattern ($l$th row of x), corresponding to the final point held in the arrays alpha and pigam.
26:   y[ns] (double, Output)
On exit: given a valid parameter set then ${\mathbf{y}}\left[l-1\right]$ contains the estimated theta score corresponding to the $l$th row of x, for the final point held in the arrays alpha and pigam.
27:   iob[ns] (Integer, Output)
On exit: given a valid parameter set then ${\mathbf{iob}}\left[l-1\right]$ contains the number of items in the $l$th row of x for which the response was positive (Nag_TRUE).
28:   rlogl (double *, Output)
On exit: given a valid parameter set then rlogl contains the value of the log-likelihood kernel corresponding to the final point held in the arrays alpha and pigam, namely
 $\sum _{l=0}^{s-1}{\mathbf{irl}}\left[l\right]×\mathrm{log}\left({\mathbf{exf}}\left[l\right]/n\right).$
29:   chi (double *, Output)
On exit: if chisqr was set equal to Nag_TRUE on entry, then given a valid parameter set, chi will contain the value of the likelihood ratio statistic corresponding to the final parameter estimates held in the arrays alpha and pigam, namely
 $2×\sum _{l=0}^{s-1}{\mathbf{irl}}\left[l\right]×\mathrm{log}\left({\mathbf{exf}}\left[l\right]/{\mathbf{irl}}\left[l\right]\right).$
The summation is over those elements of irl which are positive. If ${\mathbf{exf}}\left[l-1\right]$ is less than $5.0$, then adjacent score patterns are pooled (the score patterns in x being first put in order of increasing theta score).
If chisqr has been set equal to Nag_FALSE, then chi is not used.
30:   idf (Integer *, Output)
On exit: if chisqr was set equal to Nag_TRUE on entry, then given a valid parameter set, idf will contain the degrees of freedom associated with the likelihood ratio statistic, chi.
 ${\mathbf{idf}}={s}_{0}-2×p$ if ${s}_{0}<{2}^{p}$; ${\mathbf{idf}}={s}_{0}-2×p-1$ if ${s}_{0}={2}^{p}$,
where ${s}_{0}$ denotes the number of terms summed to calculate chi (${s}_{0}=s$ only if there is no pooling).
If chisqr has been set equal to Nag_FALSE, then idf is not used.
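The rule for idf can be written as a one-line function. This is an illustrative sketch with a hypothetical name; it assumes $p<62$ so that ${2}^{p}$ fits in a `long long`.

```c
/* Degrees of freedom for chi, given s0 (the number of terms summed)
   and p (the number of items). Assumes p < 62. */
long long idf_from_terms(long long s0, int p)
{
    long long two_p = 1LL << p;
    return (s0 == two_p) ? s0 - 2LL * p - 1 : s0 - 2LL * p;
}
```

For instance, with $p=4$ and no pooling of all ${2}^{4}=16$ patterns this gives $16-8-1=7$, whereas $12$ summed terms give $12-8=4$.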
31:   siglev (double *, Output)
On exit: if chisqr was set equal to Nag_TRUE on entry, then given a valid parameter set, siglev will contain the significance level of chi based on idf degrees of freedom. If idf is zero or negative then siglev is set to zero.
If chisqr was set equal to Nag_FALSE, then siglev is not used.
32:   fail (NagError *, Input/Output)
The NAG error argument (see Section 3.6 in the Essential Introduction).

## 6  Error Indicators and Warnings

NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, argument $⟨\mathit{\text{value}}⟩$ had an illegal value.
NE_INT
On entry, ${\mathbf{maxit}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{maxit}}\ge 1$.
On entry, ${\mathbf{n}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{n}}\ge 7$.
On entry, ${\mathbf{p}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{p}}\ge 3$.
On entry, ${\mathbf{pdcm}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{pdcm}}>0$.
On entry, ${\mathbf{pde}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{pde}}>0$.
On entry, ${\mathbf{pdx}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{pdx}}>0$.
NE_INT_2
On entry, $\mathit{I}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{irl}}\left[\mathit{I}-1\right]=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{irl}}\left[\mathit{I}-1\right]\ge 0$.
On entry, ${\mathbf{irl}}\left[0\right]+\cdots +{\mathbf{irl}}\left[{\mathbf{ns}}-1\right]=⟨\mathit{\text{value}}⟩$ and ${\mathbf{n}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{irl}}\left[0\right]+\cdots +{\mathbf{irl}}\left[{\mathbf{ns}}-1\right]={\mathbf{n}}$.
On entry, ${\mathbf{ns}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{n}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{ns}}\le {\mathbf{n}}$.
On entry, ${\mathbf{ns}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{p}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{ns}}>2×{\mathbf{p}}$.
On entry, ${\mathbf{ns}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{p}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{ns}}\le {2}^{{\mathbf{p}}}$.
On entry, ${\mathbf{pdcm}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{p}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{pdcm}}\ge 2×{\mathbf{p}}$.
On entry, ${\mathbf{pde}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{p}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{pde}}\ge {\mathbf{p}}$.
On entry, ${\mathbf{pdx}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{ns}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{pdx}}\ge {\mathbf{ns}}$.
On entry, ${\mathbf{pdx}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{p}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{pdx}}\ge {\mathbf{p}}$.
On entry, rows $\mathit{I}$ and $\mathit{J}$ of x are identical: $\mathit{I}=⟨\mathit{\text{value}}⟩$ and $\mathit{J}=⟨\mathit{\text{value}}⟩$.
NE_INT_3
On entry, ${\mathbf{p}}=⟨\mathit{\text{value}}⟩$, ${\mathbf{n}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{ns}}=⟨\mathit{\text{value}}⟩$.
Constraint: $2×{\mathbf{p}}<{\mathbf{ns}}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({2}^{{\mathbf{p}}},{\mathbf{n}}\right)$.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_MAT_INV
Failure to invert Hessian matrix and maxit iterations made: ${\mathbf{maxit}}=⟨\mathit{\text{value}}⟩$.
Failure to invert Hessian matrix plus Heywood case encountered.
NE_NOT_CLOSE_FILE
Cannot close file $⟨\mathit{\text{value}}⟩$.
NE_NOT_WRITE_FILE
Cannot open file $⟨\mathit{\text{value}}⟩$ for writing.
NE_REAL_ARRAY_ELEM_CONS
One of the elements of a has exceeded $10$ in absolute value (Heywood case).
NE_RESPONSE_LEVEL
For at least one of the p items the responses are all at the same level.
NE_TOO_MANY_ITER
maxit iterations have been performed: ${\mathbf{maxit}}=⟨\mathit{\text{value}}⟩$.
NE_ZERO_DF
Chi-squared statistic has idf degrees of freedom: ${\mathbf{idf}}=⟨\mathit{\text{value}}⟩$.

## 7  Accuracy

On exit from nag_binary_factor (g11sac) if ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_NOERROR or NE_ZERO_DF then the following condition will be satisfied:
 $\underset{0\le i\le 2×p-1}{\mathrm{max}}\left|{\mathbf{g}}\left[i\right]\right|<{\mathbf{cgetol}}.$
If ${\mathbf{fail}}\mathbf{.}\mathbf{code}=$ NE_MAT_INV or NE_TOO_MANY_ITER on exit (i.e., maxit iterations have been performed but the above condition does not hold), then the elements in a, c, alpha and pigam may still be good approximations to the maximum likelihood estimates. You are advised to inspect the elements of g to see whether this is confirmed.
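Inspecting g amounts to finding its largest component in absolute value and comparing it with the tolerance. A minimal sketch follows; the name `max_abs_gradient` is hypothetical.

```c
#include <math.h>

/* Largest absolute value among the 2*p gradient components g[0..2p-1]. */
double max_abs_gradient(int p, const double g[])
{
    double m = 0.0;
    for (int i = 0; i < 2 * p; ++i)
        if (fabs(g[i]) > m)
            m = fabs(g[i]);
    return m;
}
```

If the returned value is only slightly above cgetol after an NE_TOO_MANY_ITER or NE_MAT_INV exit, the estimates may still be usable in practice, as noted above.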

## 8  Parallelism and Performance

nag_binary_factor (g11sac) is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
nag_binary_factor (g11sac) makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.

## 9  Further Comments

### 9.1  Timing

The number of iterations required in the maximum likelihood search depends upon the number of observed variables, $p$, and the distance of the starting point you supplied from the solution. The number of multiplications and divisions performed in an iteration is proportional to $p$.

### 9.2  Initial Estimates

You are strongly advised to use the recommended starting values for the elements of a and c. Divergence may result from values you supplied even if they are very close to the solution. Divergence may also occur when an item has nearly all its responses at one level.

### 9.3  Heywood Cases

As in normal factor analysis, Heywood cases can often occur, particularly when $p$ is small and $n$ is not very large. To overcome this difficulty the maximum likelihood search function is terminated when the absolute value of one of the ${\alpha }_{j1}$ exceeds $10.0$. You can choose either to exit from nag_binary_factor (g11sac) (by setting ${\mathbf{fail}}\mathbf{.}\mathbf{print}=\mathrm{NAGERR_DEFAULT}$ on entry) or to permit nag_binary_factor (g11sac) to proceed as if it had exited normally from the maximum likelihood search function (see ${\mathbf{fail}}\mathbf{.}\mathbf{print}=\mathrm{Nag_TRUE}$ or Nag_FALSE on entry). The elements in a, c, alpha and pigam may still be good approximations to the maximum likelihood estimates. You are advised to inspect the elements of g to see whether this is confirmed.

### 9.4  Goodness of Fit Statistic

When $n$ is not very large compared to $s$ a goodness-of-fit statistic should not be calculated as many of the expected frequencies will then be less than $5$.

### 9.5  First and Second Order Margins

On exit, the arrays obs and expp hold the observed and expected percentages of sample members responding positively to individual items and to pairs of items. These can be converted to observed and expected numbers by multiplying all elements of the two arrays by $n/100.0$.
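The conversion is a single scaling pass over each array. The sketch below uses a hypothetical function name and operates in place on the first `count` elements.

```c
/* Convert percentages (as held in obs or expp) to numbers of individuals
   for a sample of size n, in place, over vals[0..count-1]. */
void percentages_to_counts(double vals[], int count, double n)
{
    for (int k = 0; k < count; ++k)
        vals[k] *= n / 100.0;
}
```

With $n=1000$, a stored value of $12.5$ becomes $125$ individuals.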

## 10  Example

A program to fit the logit latent variable model to the following data:

| Index | Score Pattern | Observed Frequency |
|---|---|---|
| 1 | 0000 | 154 |
| 2 | 1000 | 11 |
| 3 | 0001 | 42 |
| 4 | 0100 | 49 |
| 5 | 1001 | 2 |
| 6 | 1100 | 10 |
| 7 | 0101 | 27 |
| 8 | 0010 | 84 |
| 9 | 1101 | 10 |
| 10 | 1010 | 25 |
| 11 | 0011 | 75 |
| 12 | 0110 | 129 |
| 13 | 1011 | 30 |
| 14 | 1110 | 50 |
| 15 | 0111 | 181 |
| 16 | 1111 | 121 |
| Total | | 1000 |

### 10.1  Program Text

Program Text (g11sace.c)

### 10.2  Program Data

Program Data (g11sace.d)

### 10.3  Program Results

Program Results (g11sace.r)