PDF version (NAG web site
, 64bit version, 64bit version)
NAG Toolbox: nag_matop_real_gen_sparse_lu (f01br)
Purpose
nag_matop_real_gen_sparse_lu (f01br) factorizes a real sparse matrix. The function either forms the $LU$ factorization of a permutation of the entire matrix, or, optionally, first permutes the matrix to block lower triangular form and then only factorizes the diagonal blocks.
Syntax
[
a,
irn,
icn,
ikeep,
w,
idisp,
ifail] = f01br(
n,
nz,
a,
irn,
icn,
abort, 'licn',
licn, 'lirn',
lirn, 'pivot',
pivot, 'lblock',
lblock, 'grow',
grow)
[
a,
irn,
icn,
ikeep,
w,
idisp,
ifail] = nag_matop_real_gen_sparse_lu(
n,
nz,
a,
irn,
icn,
abort, 'licn',
licn, 'lirn',
lirn, 'pivot',
pivot, 'lblock',
lblock, 'grow',
grow)
Description
Given a real sparse matrix
$A$,
nag_matop_real_gen_sparse_lu (f01br) may be used to obtain the
$LU$ factorization of a permutation of
$A$,
where
$P$ and
$Q$ are permutation matrices,
$L$ is unit lower triangular and
$U$ is upper triangular. The function uses a sparse variant of Gaussian elimination, and the pivotal strategy is designed to compromise between maintaining sparsity and controlling loss of accuracy through roundoff.
Optionally the function first permutes the matrix into block lower triangular form and then only factorizes the diagonal blocks. For some matrices this gives a considerable saving in storage and execution time.
Extensive data checks are made; duplicated nonzeros can be accumulated.
The factorization is intended to be used by
nag_linsys_real_sparse_fac_solve (f04ax) to solve sparse systems of linear equations
$Ax=b$ or
${A}^{\mathrm{T}}x=b$. If several matrices of the same sparsity pattern are to be factorized,
nag_matop_real_gen_sparse_lu_reuse (f01bs) should be used for the second and subsequent matrices.
The method is fully described in
Duff (1977).
A more recent algorithm for the same calculation is provided by
nag_sparse_direct_real_gen_lu (f11me).
References
Duff I S (1977) MA28 – a set of Fortran subroutines for sparse unsymmetric linear equations AERE Report R8730 HMSO
Parameters
Compulsory Input Parameters
 1:
$\mathrm{n}$ – int64int32nag_int scalar

$n$, the order of the matrix $A$.
Constraint:
${\mathbf{n}}>0$.
 2:
$\mathrm{nz}$ – int64int32nag_int scalar

The number of nonzero elements in the matrix $A$.
Constraint:
${\mathbf{nz}}>0$.
 3:
$\mathrm{a}\left({\mathbf{licn}}\right)$ – double array

${\mathbf{a}}\left(\mathit{i}\right)$, for $\mathit{i}=1,2,\dots ,{\mathbf{nz}}$, must contain the nonzero elements of the sparse matrix $A$. They can be in any order since nag_matop_real_gen_sparse_lu (f01br) will reorder them.
 4:
$\mathrm{irn}\left({\mathbf{lirn}}\right)$ – int64int32nag_int array

${\mathbf{irn}}\left(\mathit{i}\right)$, for $\mathit{i}=1,2,\dots ,{\mathbf{nz}}$, must contain the row index of the nonzero element stored in ${\mathbf{a}}\left(i\right)$.
 5:
$\mathrm{icn}\left({\mathbf{licn}}\right)$ – int64int32nag_int array

${\mathbf{icn}}\left(\mathit{i}\right)$, for
$\mathit{i}=1,2,\dots ,{\mathbf{nz}}$, must contain, on entry, the column index of the nonzero element stored in
${\mathbf{a}}\left(i\right)$.
icn contains, on exit, the column indices of the nonzero elements in the factorization. The array must
not be changed by you between a call of
nag_matop_real_gen_sparse_lu (f01br) and subsequent calls of
nag_matop_real_gen_sparse_lu_reuse (f01bs) or
nag_linsys_real_sparse_fac_solve (f04ax).
 6:
$\mathrm{abort}\left(4\right)$ – logical array
Suggested value:
 ${\mathbf{abort}}\left(1\right)=\mathit{true}$;
 ${\mathbf{abort}}\left(2\right)=\mathit{true}$;
 ${\mathbf{abort}}\left(3\right)=\mathit{false}$;
 ${\mathbf{abort}}\left(4\right)=\mathit{true}$.
If
${\mathbf{abort}}\left(1\right)=\mathit{true}$,
nag_matop_real_gen_sparse_lu (f01br) will exit immediately on detecting a structural singularity (one that depends on the pattern of nonzeros) and return
${\mathbf{ifail}}={\mathbf{1}}$; otherwise it will complete the factorization (see
Singular and Rectangular Systems).
If
${\mathbf{abort}}\left(2\right)=\mathit{true}$,
nag_matop_real_gen_sparse_lu (f01br) will exit immediately on detecting a numerical singularity (one that depends on the numerical values) and return
${\mathbf{ifail}}={\mathbf{2}}$; otherwise it will complete the factorization (see
Singular and Rectangular Systems).
If
${\mathbf{abort}}\left(3\right)=\mathit{true}$,
nag_matop_real_gen_sparse_lu (f01br) will exit immediately (with
${\mathbf{ifail}}={\mathbf{5}}$) when the arrays
a and
icn are filled up by the previously factorized, active and unfactorized parts of the matrix; otherwise it continues so that better guidance on necessary array sizes can be given in
${\mathbf{idisp}}\left(6\right)$ and
${\mathbf{idisp}}\left(7\right)$, and will exit with
ifail in the range
$4$ to
$6$. Note that there is always an immediate error exit if the array
irn is too small.
If ${\mathbf{abort}}\left(4\right)=\mathit{true}$, nag_matop_real_gen_sparse_lu (f01br) exits immediately (with ${\mathbf{ifail}}={\mathbf{13}}$) if it finds duplicate elements in the input matrix.
If
${\mathbf{abort}}\left(4\right)=\mathit{false}$,
nag_matop_real_gen_sparse_lu (f01br) proceeds using a value equal to the sum of the duplicate elements. In either case details of each duplicate element are output on the current advisory message unit (see
nag_file_set_unit_advisory (x04ab)), unless suppressed by the value of
ifail on entry.
Optional Input Parameters
 1:
$\mathrm{licn}$ – int64int32nag_int scalar

Default:
the dimension of the arrays
a,
icn. (An error is raised if these dimensions are not equal.)
The dimension of the arrays
a and
icn. since the factorization is returned in
a and
icn,
licn should be large enough to accommodate this and should ordinarily be
$2$ to
$4$ times as large as
nz.
Constraint:
${\mathbf{licn}}\ge {\mathbf{nz}}$.
 2:
$\mathrm{lirn}$ – int64int32nag_int scalar

Default:
the dimension of the array
irn.
The dimension of the array
irn. it need not be as large as
licn; normally it will not need to be very much greater than
nz.
Constraint:
${\mathbf{lirn}}\ge {\mathbf{nz}}$.
 3:
$\mathrm{pivot}$ – double scalar
Suggested value:
${\mathbf{pivot}}=0.1$ has been found to work well on test examples.
Default:
$0.1$
Should have a value in the range
$0.0\le {\mathbf{pivot}}\le 0.9999$ and is used to control the choice of pivots. If
${\mathbf{pivot}}<0.0$, the value
$0.0$ is assumed, and if
${\mathbf{pivot}}>0.9999$, the value
$0.9999$ is assumed. When searching a row for a pivot, any element is excluded which is less than
pivot times the largest of those elements in the row available as pivots. Thus decreasing
pivot biases the algorithm to maintaining sparsity at the expense of stability.
 4:
$\mathrm{lblock}$ – logical scalar
Suggested value:
${\mathbf{lblock}}=\mathit{true}$ unless the matrix is known to be irreducible, or is singular and an upper bound on the rank is required.
Default:
$\mathit{true}$
If ${\mathbf{lblock}}=\mathit{true}$, the matrix is preordered into block lower triangular form before the $LU$ factorization is performed; otherwise the entire matrix is factorized.
 5:
$\mathrm{grow}$ – logical scalar
Default:
$\mathit{true}$
If
${\mathbf{grow}}=\mathit{true}$, then on exit
${\mathbf{w}}\left(1\right)$ contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization. If the matrix is wellscaled (see
Scaling), then a high value for
${\mathbf{w}}\left(1\right)$ indicates that the
$LU$ factorization may be inaccurate and you should be wary of the results and perhaps increase the argument
pivot for subsequent runs (see
Accuracy).
Output Parameters
 1:
$\mathrm{a}\left({\mathbf{licn}}\right)$ – double array

The nonzero elements in the
$LU$ factorization. The array must
not be changed by you between a call of
nag_matop_real_gen_sparse_lu (f01br) and a call of
nag_linsys_real_sparse_fac_solve (f04ax).
 2:
$\mathrm{irn}\left({\mathbf{lirn}}\right)$ – int64int32nag_int array

 3:
$\mathrm{icn}\left({\mathbf{licn}}\right)$ – int64int32nag_int array

 4:
$\mathrm{ikeep}\left(5\times {\mathbf{n}}\right)$ – int64int32nag_int array

Indexing information about the factorization.
 5:
$\mathrm{w}\left({\mathbf{n}}\right)$ – double array

If
${\mathbf{grow}}=\mathit{true}$,
${\mathbf{w}}\left(1\right)$ contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization (see
grow); the rest of the array is used as workspace.
If ${\mathbf{grow}}=\mathit{false}$, the array is not used.
 6:
$\mathrm{idisp}\left(10\right)$ – int64int32nag_int array

Contains information about the factorization.
${\mathbf{idisp}}\left(1\right)$ and
${\mathbf{idisp}}\left(2\right)$ indicate the position in arrays
a and
icn of the first and last elements in the
$LU$ factorization of the diagonal blocks. (
${\mathbf{idisp}}\left(2\right)$ gives the number of nonzeros in the factorization.)
${\mathbf{idisp}}\left(1\right)$ and
${\mathbf{idisp}}\left(2\right)$ must not be changed by you between a call of
nag_matop_real_gen_sparse_lu (f01br) and subsequent calls to
nag_matop_real_gen_sparse_lu_reuse (f01bs) or
nag_linsys_real_sparse_fac_solve (f04ax).
${\mathbf{idisp}}\left(3\right)$ and
${\mathbf{idisp}}\left(4\right)$ monitor the adequacy of ‘elbow room’ in the arrays
irn and
a (and
icn) respectively, by giving the number of times that the data in these arrays has been compressed during the factorization to release more storage. If either
${\mathbf{idisp}}\left(3\right)$ or
${\mathbf{idisp}}\left(4\right)$ is quite large (say greater than
$10$), it will probably pay you to increase the size of the corresponding array(s) for subsequent runs. If either is very low or zero, then you can perhaps save storage by reducing the size of the corresponding array(s).
${\mathbf{idisp}}\left(5\right)$, when ${\mathbf{lblock}}=\mathit{false}$, gives an upper bound on the rank of the matrix; when ${\mathbf{lblock}}=\mathit{true}$, gives an upper bound on the sum of the ranks of the lower triangular blocks.
${\mathbf{idisp}}\left(6\right)$ and
${\mathbf{idisp}}\left(7\right)$ give the minimum size of arrays
irn and
a (and
icn) respectively which would enable a successful run on an identical matrix (but some ‘elbowroom’ should be allowed – see
Further Comments).
${\mathbf{idisp}}\left(8\right)$ to
$\left(10\right)$ are only used if
${\mathbf{lblock}}=\mathit{true}$.
 ${\mathbf{idisp}}\left(8\right)$ gives the structural rank of the matrix.
 ${\mathbf{idisp}}\left(9\right)$ gives the number of diagonal blocks.
 ${\mathbf{idisp}}\left(10\right)$ gives the size of the largest diagonal block.
 7:
$\mathrm{ifail}$ – int64int32nag_int scalar
${\mathbf{ifail}}={\mathbf{0}}$ unless the function detects an error (see
Error Indicators and Warnings).
Error Indicators and Warnings
Errors or warnings detected by the function:
Cases prefixed with W are classified as warnings and
do not generate an error of type NAG:error_n. See nag_issue_warnings.
 W ${\mathbf{ifail}}=2$

Successful factorization of a numerically singular matrix (which may also be structurally singular) (see
Singular and Rectangular Systems).
 W ${\mathbf{ifail}}=1$

Successful factorization of a structurally singular matrix (see
Singular and Rectangular Systems).
 ${\mathbf{ifail}}=1$

The matrix is structurally singular and the factorization has been abandoned (${\mathbf{abort}}\left(1\right)$ was true on entry).
 ${\mathbf{ifail}}=2$

The matrix is numerically singular and the factorization has been abandoned (${\mathbf{abort}}\left(2\right)$ was true on entry).
 ${\mathbf{ifail}}=3$

lirn is too small: there is not enough space in the array
irn to continue the factorization. You are recommended to try again with
lirn (and the length of
irn) equal to at least
${\mathbf{idisp}}\left(6\right)+{\mathbf{n}}/2$.
 ${\mathbf{ifail}}=4$

licn is much too small: there is much too little space in the arrays
a and
icn to continue the factorization.
 ${\mathbf{ifail}}=5$

licn is too small: there is not enough space in the arrays
a and
icn to store the factorization. If
${\mathbf{abort}}\left(3\right)$ was
false on entry, the factorization has been completed but some of the
$LU$ factors have been discarded to create space;
${\mathbf{idisp}}\left(7\right)$ then gives the minimum value of
licn (i.e., the minimum length of
a and
icn) required for a successful factorization of the same matrix.
 ${\mathbf{ifail}}=6$

licn and
lirn are both too small: effectively this is a combination of
${\mathbf{ifail}}={\mathbf{3}}$ and
${\mathbf{5}}$ (with
${\mathbf{abort}}\left(3\right)=\mathit{false}$).
 ${\mathbf{ifail}}=7$

licn is too small: there is not enough space in the arrays
a and
icn for the permutation to block triangular form.
 ${\mathbf{ifail}}=8$

On entry,  ${\mathbf{n}}\le 0$. 
 ${\mathbf{ifail}}=9$

On entry,  ${\mathbf{nz}}\le 0$. 
 ${\mathbf{ifail}}=10$

On entry,  ${\mathbf{licn}}<{\mathbf{nz}}$. 
 ${\mathbf{ifail}}=11$

On entry,  ${\mathbf{lirn}}<{\mathbf{nz}}$. 
 ${\mathbf{ifail}}=12$

On entry, an element of the input matrix has a row or column index (i.e., an element of
irn or
icn) outside the range
$1$ to
n.
 ${\mathbf{ifail}}=13$

Duplicate elements have been found in the input matrix and the factorization has been abandoned (${\mathbf{abort}}\left(4\right)=\mathit{true}$ on entry).
 ${\mathbf{ifail}}=99$
An unexpected error has been triggered by this routine. Please
contact
NAG.
 ${\mathbf{ifail}}=399$
Your licence key may have expired or may not have been installed correctly.
 ${\mathbf{ifail}}=999$
Dynamic memory allocation failed.
Accuracy
The factorization obtained is exact for a perturbed matrix whose
$\left(i,j\right)$th element differs from
${a}_{ij}$ by less than
$3\epsilon \rho {m}_{ij}$ where
$\epsilon $ is the
machine precision,
$\rho $ is the growth value returned in
${\mathbf{w}}\left(1\right)$ if
${\mathbf{grow}}=\mathit{true}$, and
${m}_{ij}$ the number of Gaussian elimination operations applied to element
$\left(i,j\right)$. The value of
${m}_{ij}$ is not greater than
$n$ and is usually much less. Small
$\rho $ values therefore guarantee accurate results, but unfortunately large
$\rho $ values may give a very pessimistic indication of accuracy.
Further Comments
Timing
The time required may be estimated very roughly from the number
$\tau $ of nonzeros in the factorized form (output as
${\mathbf{idisp}}\left(2\right)$) and for
nag_matop_real_gen_sparse_lu (f01br) and its associates is
where our unit is the time for the inner loop of a full matrix code (e.g., solving a full set of equations takes about
$\frac{1}{3}{n}^{3}$ units). Note that the faster
nag_matop_real_gen_sparse_lu_reuse (f01bs) time makes it well worthwhile to use this for a sequence of problems with the same pattern.
It should be appreciated that
$\tau $ varies widely from problem to problem. For network problems it may be little greater than
nz, the number of nonzeros in
$A$; for discretization of twodimensional and threedimensional partial differential equations it may be about
$3n{\mathrm{log}}_{2}n$ and
$\frac{1}{2}{n}^{5/3}$, respectively.
The time taken by nag_matop_real_gen_sparse_lu (f01br) to find the block lower triangular form (${\mathbf{lblock}}=\mathit{true}$) is typically $5\u201315\%$ of the time taken by the function when it is not found (${\mathbf{lblock}}=\mathit{false}$). If the matrix is irreducible (${\mathbf{idisp}}\left(9\right)=1$ after a call with ${\mathbf{lblock}}=\mathit{true}$) then this time is wasted. Otherwise, particularly if the largest block is small (${\mathbf{idisp}}\left(10\right)\ll n$), the consequent savings are likely to be greater.
The time taken to estimate growth (${\mathbf{grow}}=\mathit{true}$) is typically under $20\%$ of the overall time.
The overall time may be substantially increased if there is inadequate ‘elbowroom’ in the arrays
a,
irn and
icn. When the sizes of the arrays are minimal (
${\mathbf{idisp}}\left(6\right)$ and
${\mathbf{idisp}}\left(7\right)$) it can execute as much as three times slower. Values of
${\mathbf{idisp}}\left(3\right)$ and
${\mathbf{idisp}}\left(4\right)$ greater than about
$10$ indicate that it may be worthwhile to increase array sizes.
Scaling
The use of a relative pivot tolerance
pivot essentially presupposes that the matrix is wellscaled, i.e., that the matrix elements are broadly comparable in size. Practical problems are often naturally wellscaled but particular care is needed for problems containing mixed types of variables (for example millimetres and neutron fluxes).
Singular and Rectangular Systems
It is envisaged that
nag_matop_real_gen_sparse_lu (f01br) will almost always be called for square nonsingular matrices and that singularity indicates an error condition. However, even if the matrix is singular it is possible to complete the factorization. It is even possible for
nag_linsys_real_sparse_fac_solve (f04ax) to solve a set of equations whose matrix is singular provided the set is consistent.
Two forms of singularity are possible. If the matrix would be singular for any values of the nonzeros (e.g., if it has a whole row of zeros), then we say it is structurally singular, and continue only if ${\mathbf{abort}}\left(1\right)=\mathit{false}$. If the matrix is nonsingular by virtue of the particular values of the nonzeros, then we say that it is numerically singular and continue only if ${\mathbf{abort}}\left(2\right)=\mathit{false}$, in which case an upper bound on the rank of the matrix is returned in ${\mathbf{idisp}}\left(5\right)$ when ${\mathbf{lblock}}=\mathit{false}$.
Rectangular matrices may be treated by setting
n to the larger of the number of rows and numbers of columns and setting
${\mathbf{abort}}\left(1\right)=\mathit{false}$.
Note: the soft
failure option should be used (last digit of ${\mathbf{ifail}}={\mathbf{1}}$) if you wish to factorize singular matrices with ${\mathbf{abort}}\left(1\right)$ or ${\mathbf{abort}}\left(2\right)$ set to false.
Duplicated Nonzeros
The matrix $A$ may consist of a sum of contributions from different subsystems (for example finite elements). In such cases you may rely on nag_matop_real_gen_sparse_lu (f01br) to perform assembly, since duplicated elements are summed.
Determinant
The following code may be used to compute the determinant of
$A$ (as the double variable
deta) after a call of
nag_matop_real_gen_sparse_lu (f01br):
deta = 1;
id = idisp(1);
for i = 1:n
idg = id + ikeep(3*n+i);
deta = deta*a(idg);
if (ikeep(n+i) ~= i) deta = deta;
end
if (ikeep(2*n+i) ~= i) deta = deta;
end
id = id + ikeep(i);
end
Example
This example factorizes the real sparse matrix:
This example program simply prints out some information about the factorization as returned by
nag_matop_real_gen_sparse_lu (f01br) in
${\mathbf{w}}\left(1\right)$ and
idisp. Normally the call of
nag_matop_real_gen_sparse_lu (f01br) would be followed by a call of
nag_linsys_real_sparse_fac_solve (f04ax) (see
Example in
nag_linsys_real_sparse_fac_solve (f04ax)).
Open in the MATLAB editor:
f01br_example
function f01br_example
fprintf('f01br example results\n\n');
n = int64(6);
nz = int64(15);
nzmax = int64(20);
irn = zeros(nzmax,1,'int64');
icn = zeros(nzmax,1,'int64');
a = zeros(nzmax,1);
a(1:nz) = [5; 2; 1; 2; 3; 2; 1; 1; 1; 1; 2; 3; 1; 1; 6];
irn(1:nz) = int64([1;2;2;2;3; 4;4;4;5;5; 5;5;6;6;6]);
icn(1:nz) = int64([1;2;3;4;3; 1;4;5;1;4; 5;6;1;2;6]);
abort = [true; true; false; true];
[a, irn, icn, ikeep, w, idisp, ifail] = ...
f01br(...
n, nz, a, irn, icn, abort);
fprintf('Number of nonzeros in factorization = %4d\n',idisp(2));
fprintf('Upper bound on rank of A = %4d\n',idisp(5));
f01br example results
Number of nonzeros in factorization = 16
Upper bound on rank of A = 6
PDF version (NAG web site
, 64bit version, 64bit version)
© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2015