hide long namesshow long names
hide short namesshow short names
Integer type:  int32  int64  nag_int  show int32  show int32  show int64  show int64  show nag_int  show nag_int

PDF version (NAG web site, 64-bit version, 64-bit version)
Chapter Contents
Chapter Introduction
NAG Toolbox

NAG Toolbox: nag_matop_real_gen_sparse_lu (f01br)

Purpose

nag_matop_real_gen_sparse_lu (f01br) factorizes a real sparse matrix. The function either forms the LULU factorization of a permutation of the entire matrix, or, optionally, first permutes the matrix to block lower triangular form and then only factorizes the diagonal blocks.

Syntax

[a, irn, icn, ikeep, w, idisp, ifail] = f01br(n, nz, a, irn, icn, abort, 'licn', licn, 'lirn', lirn, 'pivot', pivot, 'lblock', lblock, 'grow', grow)
[a, irn, icn, ikeep, w, idisp, ifail] = nag_matop_real_gen_sparse_lu(n, nz, a, irn, icn, abort, 'licn', licn, 'lirn', lirn, 'pivot', pivot, 'lblock', lblock, 'grow', grow)

Description

Given a real sparse matrix AA, nag_matop_real_gen_sparse_lu (f01br) may be used to obtain the LULU factorization of a permutation of AA,
PAQ = LU
PAQ=LU
where PP and QQ are permutation matrices, LL is unit lower triangular and UU is upper triangular. The function uses a sparse variant of Gaussian elimination, and the pivotal strategy is designed to compromise between maintaining sparsity and controlling loss of accuracy through round-off.
Optionally the function first permutes the matrix into block lower triangular form and then only factorizes the diagonal blocks. For some matrices this gives a considerable saving in storage and execution time.
Extensive data checks are made; duplicated nonzeros can be accumulated.
The factorization is intended to be used by nag_linsys_real_sparse_fac_solve (f04ax) to solve sparse systems of linear equations Ax = bAx=b or ATx = bATx=b. If several matrices of the same sparsity pattern are to be factorized, nag_matop_real_gen_sparse_lu_reuse (f01bs) should be used for the second and subsequent matrices.
The method is fully described in Duff (1977).
A more recent algorithm for the same calculation is provided by nag_sparse_direct_real_gen_lu (f11me).

References

Duff I S (1977) MA28 – a set of Fortran subroutines for sparse unsymmetric linear equations AERE Report R8730 HMSO

Parameters

Compulsory Input Parameters

1:     n – int64int32nag_int scalar
nn, the order of the matrix AA.
Constraint: n > 0n>0.
2:     nz – int64int32nag_int scalar
The number of nonzero elements in the matrix AA.
Constraint: nz > 0nz>0.
3:     a(licn) – double array
licn, the dimension of the array, must satisfy the constraint licnnzlicnnz.
a(i)ai, for i = 1,2,,nzi=1,2,,nz, must contain the nonzero elements of the sparse matrix AA. They can be in any order since nag_matop_real_gen_sparse_lu (f01br) will reorder them.
4:     irn(lirn) – int64int32nag_int array
lirn, the dimension of the array, must satisfy the constraint lirnnzlirnnz.
irn(i)irni, for i = 1,2,,nzi=1,2,,nz, must contain the row index of the nonzero element stored in a(i)ai.
5:     icn(licn) – int64int32nag_int array
licn, the dimension of the array, must satisfy the constraint licnnzlicnnz.
icn(i)icni, for i = 1,2,,nzi=1,2,,nz, must contain, on entry, the column index of the nonzero element stored in a(i)ai. icn contains, on exit, the column indices of the nonzero elements in the factorization. The array must not be changed by you between a call of nag_matop_real_gen_sparse_lu (f01br) and subsequent calls of nag_matop_real_gen_sparse_lu_reuse (f01bs) or nag_linsys_real_sparse_fac_solve (f04ax).
6:     abort(44) – logical array
If abort(1) = trueabort1=true, nag_matop_real_gen_sparse_lu (f01br) will exit immediately on detecting a structural singularity (one that depends on the pattern of nonzeros) and return ifail = 1ifail=1; otherwise it will complete the factorization (see Section [Singular and Rectangular Systems]).
If abort(2) = trueabort2=true, nag_matop_real_gen_sparse_lu (f01br) will exit immediately on detecting a numerical singularity (one that depends on the numerical values) and return ifail = 2ifail=2; otherwise it will complete the factorization (see Section [Singular and Rectangular Systems]).
If abort(3) = trueabort3=true, nag_matop_real_gen_sparse_lu (f01br) will exit immediately (with ifail = 5ifail=5) when the arrays a and icn are filled up by the previously factorized, active and unfactorized parts of the matrix; otherwise it continues so that better guidance on necessary array sizes can be given in idisp(6)idisp6 and idisp(7)idisp7, and will exit with ifail in the range 44 to 66. Note that there is always an immediate error exit if the array irn is too small.
If abort(4) = trueabort4=true, nag_matop_real_gen_sparse_lu (f01br) exits immediately (with ifail = 13ifail=13) if it finds duplicate elements in the input matrix.
If abort(4) = falseabort4=false, nag_matop_real_gen_sparse_lu (f01br) proceeds using a value equal to the sum of the duplicate elements. In either case details of each duplicate element are output on the current advisory message unit (see nag_file_set_unit_advisory (x04ab)), unless suppressed by the value of ifail on entry.

Optional Input Parameters

1:     licn – int64int32nag_int scalar
Default: The dimension of the arrays a, icn. (An error is raised if these dimensions are not equal.)
The dimension of the arrays a and icn as declared in the (sub)program from which nag_matop_real_gen_sparse_lu (f01br) is called. Since the factorization is returned in a and icn, licn should be large enough to accommodate this and should ordinarily be 22 to 44 times as large as nz.
Constraint: licnnzlicnnz.
2:     lirn – int64int32nag_int scalar
Default: The dimension of the array irn.
The dimension of the array irn as declared in the (sub)program from which nag_matop_real_gen_sparse_lu (f01br) is called. It need not be as large as licn; normally it will not need to be very much greater than nz.
Constraint: lirnnzlirnnz.
3:     pivot – double scalar
Should have a value in the range 0.0pivot0.99990.0pivot0.9999 and is used to control the choice of pivots. If pivot < 0.0pivot<0.0, the value 0.00.0 is assumed, and if pivot > 0.9999pivot>0.9999, the value 0.99990.9999 is assumed. When searching a row for a pivot, any element is excluded which is less than pivot times the largest of those elements in the row available as pivots. Thus decreasing pivot biases the algorithm to maintaining sparsity at the expense of stability.
Default: 0.10.1
4:     lblock – logical scalar
If lblock = truelblock=true, the matrix is preordered into block lower triangular form before the LULU factorization is performed; otherwise the entire matrix is factorized.
Default: truetrue
5:     grow – logical scalar
If grow = truegrow=true, then on exit w(1)w1 contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization. If the matrix is well-scaled (see Section [Scaling]), then a high value for w(1)w1 indicates that the LULU factorization may be inaccurate and you should be wary of the results and perhaps increase the parameter pivot for subsequent runs (see Section [Accuracy]).
Default: truetrue

Input Parameters Omitted from the MATLAB Interface

iw

Output Parameters

1:     a(licn) – double array
The nonzero elements in the LULU factorization. The array must not be changed by you between a call of nag_matop_real_gen_sparse_lu (f01br) and a call of nag_linsys_real_sparse_fac_solve (f04ax).
2:     irn(lirn) – int64int32nag_int array
irn is overwritten and is not needed for subsequent calls of nag_matop_real_gen_sparse_lu_reuse (f01bs) or nag_linsys_real_sparse_fac_solve (f04ax).
3:     icn(licn) – int64int32nag_int array
4:     ikeep(5 × n5×n) – int64int32nag_int array
Indexing information about the factorization.
5:     w(n) – double array
If grow = truegrow=true, w(1)w1 contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization (see grow); the rest of the array is used as workspace.
If grow = falsegrow=false, the array is not used.
6:     idisp(1010) – int64int32nag_int array
Contains information about the factorization.
idisp(1)idisp1 and idisp(2)idisp2 indicate the position in arrays a and icn of the first and last elements in the LULU factorization of the diagonal blocks. (idisp(2)idisp2 gives the number of nonzeros in the factorization.) idisp(1)idisp1 and idisp(2)idisp2 must not be changed by you between a call of nag_matop_real_gen_sparse_lu (f01br) and subsequent calls to nag_matop_real_gen_sparse_lu_reuse (f01bs) or nag_linsys_real_sparse_fac_solve (f04ax).
idisp(3)idisp3 and idisp(4)idisp4 monitor the adequacy of ‘elbow room’ in the arrays irn and a (and icn) respectively, by giving the number of times that the data in these arrays has been compressed during the factorization to release more storage. If either idisp(3)idisp3 or idisp(4)idisp4 is quite large (say greater than 1010), it will probably pay you to increase the size of the corresponding array(s) for subsequent runs. If either is very low or zero, then you can perhaps save storage by reducing the size of the corresponding array(s).
idisp(5)idisp5, when lblock = falselblock=false, gives an upper bound on the rank of the matrix; when lblock = truelblock=true, gives an upper bound on the sum of the ranks of the lower triangular blocks.
idisp(6)idisp6 and idisp(7)idisp7 give the minimum size of arrays irn and a (and icn) respectively which would enable a successful run on an identical matrix (but some ‘elbow-room’ should be allowed – see Section [Further Comments]).
idisp(8)idisp8 to (10)(10) are only used if lblock = truelblock=true.
  • idisp(8)idisp8 gives the structural rank of the matrix.
  • idisp(9)idisp9 gives the number of diagonal blocks.
  • idisp(10)idisp10 gives the size of the largest diagonal block.
7:     ifail – int64int32nag_int scalar
ifail = 0ifail=0 unless the function detects an error (see [Error Indicators and Warnings]).

Error Indicators and Warnings

Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

W ifail = 2ifail=-2
Successful factorization of a numerically singular matrix (which may also be structurally singular) (see Section [Singular and Rectangular Systems]).
W ifail = 1ifail=-1
Successful factorization of a structurally singular matrix (see Section [Singular and Rectangular Systems]).
  ifail = 1ifail=1
The matrix is structurally singular and the factorization has been abandoned (abort(1)abort1 was true on entry).
  ifail = 2ifail=2
The matrix is numerically singular and the factorization has been abandoned (abort(2)abort2 was true on entry).
  ifail = 3ifail=3
lirn is too small: there is not enough space in the array irn to continue the factorization. You are recommended to try again with lirn (and the length of irn) equal to at least idisp(6) + n / 2idisp6+n/2.
  ifail = 4ifail=4
licn is much too small: there is much too little space in the arrays a and icn to continue the factorization.
  ifail = 5ifail=5
licn is too small: there is not enough space in the arrays a and icn to store the factorization. If abort(3)abort3 was false on entry, the factorization has been completed but some of the LULU factors have been discarded to create space; idisp(7)idisp7 then gives the minimum value of licn (i.e., the minimum length of a and icn) required for a successful factorization of the same matrix.
  ifail = 6ifail=6
licn and lirn are both too small: effectively this is a combination of ifail = 3ifail=3 and 55 (with abort(3) = falseabort3=false).
  ifail = 7ifail=7
licn is too small: there is not enough space in the arrays a and icn for the permutation to block triangular form.
  ifail = 8ifail=8
On entry,n0n0.
  ifail = 9ifail=9
On entry,nz0nz0.
  ifail = 10ifail=10
On entry,licn < nzlicn<nz.
  ifail = 11ifail=11
On entry,lirn < nzlirn<nz.
  ifail = 12ifail=12
On entry, an element of the input matrix has a row or column index (i.e., an element of irn or icn) outside the range 11 to n.
  ifail = 13ifail=13
Duplicate elements have been found in the input matrix and the factorization has been abandoned (abort(4) = trueabort4=true on entry).

Accuracy

The factorization obtained is exact for a perturbed matrix whose (i,j)(i,j)th element differs from aijaij by less than 3ερmij3ερmij where εε is the machine precision, ρρ is the growth value returned in w(1)w1 if grow = truegrow=true, and mijmij the number of Gaussian elimination operations applied to element (i,j)(i,j). The value of mijmij is not greater than nn and is usually much less. Small ρρ values therefore guarantee accurate results, but unfortunately large ρρ values may give a very pessimistic indication of accuracy.

Further Comments

Timing

The time required may be estimated very roughly from the number ττ of nonzeros in the factorized form (output as idisp(2)idisp2) and for nag_matop_real_gen_sparse_lu (f01br) and its associates is
nag_matop_real_gen_sparse_lu (f01br): 5τ2 / n5τ2/n units
nag_matop_real_gen_sparse_lu_reuse (f01bs): τ2 / nτ2/n units
nag_linsys_real_sparse_fac_solve (f04ax): 2τ2τ units
where our unit is the time for the inner loop of a full matrix code (e.g., solving a full set of equations takes about (1/3)n313n3 units). Note that the faster nag_matop_real_gen_sparse_lu_reuse (f01bs) time makes it well worthwhile to use this for a sequence of problems with the same pattern.
It should be appreciated that ττ varies widely from problem to problem. For network problems it may be little greater than nz, the number of nonzeros in AA; for discretization of two-dimensional and three-dimensional partial differential equations it may be about 3n log2n 3n log2n  and (1/2)n5 / 312n5/3, respectively.
The time taken by nag_matop_real_gen_sparse_lu (f01br) to find the block lower triangular form (lblock = truelblock=true) is typically 515%515% of the time taken by the function when it is not found (lblock = falselblock=false). If the matrix is irreducible (idisp(9) = 1idisp9=1 after a call with lblock = truelblock=true) then this time is wasted. Otherwise, particularly if the largest block is small (idisp(10)nidisp10n), the consequent savings are likely to be greater.
The time taken to estimate growth (grow = truegrow=true) is typically under 20%20% of the overall time.
The overall time may be substantially increased if there is inadequate ‘elbow-room’ in the arrays a, irn and icn. When the sizes of the arrays are minimal (idisp(6)idisp6 and idisp(7)idisp7) it can execute as much as three times slower. Values of idisp(3)idisp3 and idisp(4)idisp4 greater than about 1010 indicate that it may be worthwhile to increase array sizes.

Scaling

The use of a relative pivot tolerance pivot essentially presupposes that the matrix is well-scaled, i.e., that the matrix elements are broadly comparable in size. Practical problems are often naturally well-scaled but particular care is needed for problems containing mixed types of variables (for example millimetres and neutron fluxes).

Singular and Rectangular Systems

It is envisaged that nag_matop_real_gen_sparse_lu (f01br) will almost always be called for square nonsingular matrices and that singularity indicates an error condition. However, even if the matrix is singular it is possible to complete the factorization. It is even possible for nag_linsys_real_sparse_fac_solve (f04ax) to solve a set of equations whose matrix is singular provided the set is consistent.
Two forms of singularity are possible. If the matrix would be singular for any values of the nonzeros (e.g., if it has a whole row of zeros), then we say it is structurally singular, and continue only if abort(1) = falseabort1=false. If the matrix is nonsingular by virtue of the particular values of the nonzeros, then we say that it is numerically singular and continue only if abort(2) = falseabort2=false, in which case an upper bound on the rank of the matrix is returned in idisp(5)idisp5 when lblock = falselblock=false.
Rectangular matrices may be treated by setting n to the larger of the number of rows and numbers of columns and setting abort(1) = falseabort1=false.
Note:  the soft failure option should be used (last digit of ifail = 1ifail=1) if you wish to factorize singular matrices with abort(1)abort1 or abort(2)abort2 set to false.

Duplicated Nonzeros

The matrix AA may consist of a sum of contributions from different sub-systems (for example finite elements). In such cases you may rely on nag_matop_real_gen_sparse_lu (f01br) to perform assembly, since duplicated elements are summed.

Determinant

The following code may be used to compute the determinant of AA (as the double variable deta) after a call of nag_matop_real_gen_sparse_lu (f01br):
deta = 1;
id = idisp(1);
for i = 1:n
  idg = id + ikeep(3*n+i);
  deta = deta*a(idg);
  if (ikeep(n+i) ~= i)
    deta = -deta;
  end
  if (ikeep(2*n+i) ~= i)
    deta = -deta;
  end
  id = id + ikeep(i);
end

Example

function nag_matop_real_gen_sparse_lu_example
n = int64(6);
nz = int64(15);
a = zeros(150,1);
a(1:15) = [5; 2; -1; 2; 3; -2; 1; 1; -1; -1; 2; -3; -1; -1; 6];
irn = zeros(75,1,'int64');
irn(1:15) = [int64(1);2;2;2;3;4; ...
                    4;4;5;5;5;5; ...
                    6;6;6];
icn = zeros(150,1,'int64');
icn(1:15) = [int64(1);2;3;4;3;1; ...
                    4;5;1;4;5;6; ...
                    1;2;6];
abort = [true;
     true;
     false;
     true];
[aOut, irnOut, icnOut, ikeep, w, idisp, ifail] = ...
    nag_matop_real_gen_sparse_lu(n, nz, a, irn, icn, abort);
 w, idisp, ifail
 

w =

   18.0000
         0
    2.0000
    0.5000
         0
         0


idisp =

                    5
                   16
                    0
                    0
                    6
                   15
                   19
                    6
                    3
                    4


ifail =

                    0


function f01br_example
n = int64(6);
nz = int64(15);
a = zeros(150,1);
a(1:15) = [5; 2; -1; 2; 3; -2; 1; 1; -1; -1; 2; -3; -1; -1; 6];
irn = zeros(75,1,'int64');
irn(1:15) = [int64(1);2;2;2;3;4; ...
                    4;4;5;5;5;5; ...
                    6;6;6];
icn = zeros(150,1,'int64');
icn(1:15) = [int64(1);2;3;4;3;1; ...
                    4;5;1;4;5;6; ...
                    1;2;6];
abort = [true;
     true;
     false;
     true];
[aOut, irnOut, icnOut, ikeep, w, idisp, ifail] = f01br(n, nz, a, irn, icn, abort);
 w, idisp, ifail
 

w =

   18.0000
         0
    2.0000
    0.5000
         0
         0


idisp =

                    5
                   16
                    0
                    0
                    6
                   15
                   19
                    6
                    3
                    4


ifail =

                    0



PDF version (NAG web site, 64-bit version, 64-bit version)
Chapter Contents
Chapter Introduction
NAG Toolbox

© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013