NAG Library Routine Document
F11DFF
1 Purpose
F11DFF computes a block diagonal incomplete LU factorization of a real sparse nonsymmetric matrix, represented in coordinate storage format. The diagonal blocks may be composed of arbitrary rows and the corresponding columns, and may overlap. This factorization can be used to provide a block Jacobi or additive Schwarz preconditioner, for use in combination with F11BEF or F11DGF.
2 Specification
SUBROUTINE F11DFF (N, NNZ, A, LA, IROW, ICOL, NB, ISTB, INDB, LINDB, LFILL, DTOL, PSTRAT, MILU, IPIVP, IPIVQ, ISTR, IDIAG, NNZC, NPIVM, IWORK, LIWORK, IFAIL)
INTEGER             N, NNZ, LA, IROW(LA), ICOL(LA), NB, ISTB(NB+1), INDB(LINDB), LINDB, LFILL(NB), IPIVP(LINDB), IPIVQ(LINDB), ISTR(LINDB+1), IDIAG(LINDB), NNZC, NPIVM(NB), IWORK(LIWORK), LIWORK, IFAIL
REAL (KIND=nag_wp)  A(LA), DTOL(NB)
CHARACTER(1)        PSTRAT(NB), MILU(NB)
3 Description
F11DFF computes an incomplete LU factorization (see Meijerink and Van der Vorst (1977) and Meijerink and Van der Vorst (1981)) of the (possibly overlapping) diagonal blocks $A_b$, b=1,2,…,NB, of a real sparse nonsymmetric n by n matrix $A$. The factorization is intended primarily for use as a block Jacobi or additive Schwarz preconditioner (see Saad (1996)), with one of the iterative solvers F11BEF and F11DGF.
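For orientation, the textbook additive Schwarz preconditioner (see Saad (1996)) built from such blocks can be written as below. This is a sketch of the general idea only: $E_b$ is an assumed notation, not used elsewhere in this document, for the restriction matrix that selects the rows of block b, and $M_b$ stands for the incomplete LU approximation to $A_b$. The exact variant applied by F11DGF is not stated here.

$$A_b = E_b A E_b^T, \qquad M^{-1} = \sum_{b=1}^{NB} E_b^T M_b^{-1} E_b.$$

When the blocks do not overlap, this expression reduces to a block Jacobi preconditioner.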
The NB diagonal blocks need not consist of consecutive rows and columns of $A$, but may be composed of arbitrarily indexed rows, and the corresponding columns, as defined in the arguments INDB and ISTB. Any given row or column index may appear in more than one diagonal block, resulting in overlap. Each diagonal block $A_b$, b=1,2,…,NB, is factorized as:
$$A_b = M_b + R_b$$
where
$$M_b = P_b L_b D_b U_b Q_b$$
and $L_b$ is lower triangular with unit diagonal elements, $D_b$ is diagonal, $U_b$ is upper triangular with unit diagonals, $P_b$ and $Q_b$ are permutation matrices, and $R_b$ is a remainder matrix.
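To make "arbitrarily indexed rows, and the corresponding columns" concrete, the following Python sketch picks a diagonal block out of a matrix held as coordinate-format triples. It is an illustration only and not part of the NAG interface; the function name `extract_block` and the example data are hypothetical, and indices are kept 1-based to match the Fortran arrays.

```python
def extract_block(irow, icol, a, rows):
    """Return the diagonal block defined by `rows` as a dense list of lists.

    irow, icol, a : coordinate-format entries of the full matrix (1-based).
    rows          : 1-based row indices of the block; the same indices
                    select the columns.
    """
    # Map each global index to its local (0-based) position in the block.
    local = {g: p for p, g in enumerate(rows)}
    m = len(rows)
    block = [[0.0] * m for _ in range(m)]
    for i, j, v in zip(irow, icol, a):
        # Keep an entry only if both its row and its column are in the block.
        if i in local and j in local:
            block[local[i]][local[j]] = v
    return block


# Hypothetical 4 by 4 matrix in coordinate form (1-based indices).
irow = [1, 1, 2, 2, 3, 3, 4, 4]
icol = [1, 2, 1, 2, 3, 4, 1, 4]
a = [4.0, 1.0, 2.0, 5.0, 6.0, 3.0, 7.0, 8.0]

# Two overlapping blocks: rows {1, 2, 4} and rows {3, 4} share row 4.
block_1 = extract_block(irow, icol, a, [1, 2, 4])
block_2 = extract_block(irow, icol, a, [3, 4])
```

Row 4 belongs to both blocks, which is the kind of overlap the routine permits.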
The amount of fill-in occurring in the factorization of block b can vary from zero to complete fill, and can be controlled by specifying either the maximum level of fill LFILL(b), or the drop tolerance DTOL(b).
The parameter PSTRAT(b) defines the pivoting strategy to be used in block b. The options currently available are no pivoting, user-defined pivoting, partial pivoting by columns for stability, and complete pivoting by rows for sparsity and by columns for stability. The factorization may optionally be modified to preserve the row-sums of the original block matrix.
The sparse matrix $A$ is represented in coordinate storage (CS) format (see Section 2.1.1 in the F11 Chapter Introduction). The array A stores all the nonzero elements of the matrix $A$, while arrays IROW and ICOL store the corresponding row and column indices respectively. Multiple nonzero elements may not be specified for the same row and column index.
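As a small illustration of coordinate storage, the Python sketch below converts a dense matrix into three parallel lists that mirror A, IROW and ICOL. The function name `dense_to_cs` and the sample matrix are hypothetical, and indices are written 1-based to match the Fortran convention. Scanning the matrix row by row also produces the row-then-column ordering that this routine requires on entry.

```python
def dense_to_cs(dense):
    """Convert a dense square matrix to coordinate storage.

    Returns (a, irow, icol) with 1-based indices, ordered by increasing
    row index and by increasing column index within each row.
    """
    a, irow, icol = [], [], []
    for i, row in enumerate(dense, start=1):
        for j, value in enumerate(row, start=1):
            if value != 0.0:  # store nonzero entries only
                a.append(value)
                irow.append(i)
                icol.append(j)
    return a, irow, icol


dense = [
    [2.0, 0.0, 1.0],
    [0.0, 3.0, 0.0],
    [4.0, 0.0, 5.0],
]
a, irow, icol = dense_to_cs(dense)
nnz = len(a)  # plays the role of NNZ
```

Each (row, column) pair appears at most once, as the format requires.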
The preconditioning matrices $M_b$, b=1,2,…,NB, are returned in terms of the CS representations of the matrices
$$C_b = L_b + D_b^{-1} + U_b - 2I.$$
4 References
Meijerink J and Van der Vorst H (1977) An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix Math. Comput. 31 148–162
Meijerink J and Van der Vorst H (1981) Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems J. Comput. Phys. 44 134–155
Saad Y (1996) Iterative Methods for Sparse Linear Systems PWS Publishing Company, Boston, MA
5 Parameters
- 1: N – INTEGER Input
On entry: n, the order of the matrix $A$.
Constraint: N≥1.
- 2: NNZ – INTEGER Input
On entry: the number of nonzero elements in the matrix $A$.
Constraint: 1≤NNZ≤N².
- 3: A(LA) – REAL (KIND=nag_wp) array Input/Output
On entry: the nonzero elements in the matrix $A$, ordered by increasing row index, and by increasing column index within each row. Multiple entries for the same row and column indices are not permitted. The routine F11ZAF may be used to order the elements in this way.
On exit: the first NNZ entries of A contain the nonzero elements of $A$ and the next NNZC entries contain the elements of the matrices $C_b$, for b=1,2,…,NB, stored consecutively. Within each block the matrix elements are ordered by increasing row index, and by increasing column index within each row.
- 4: LA – INTEGER Input
On entry: the dimension of the arrays A, IROW and ICOL as declared in the (sub)program from which F11DFF is called. These arrays must be of sufficient size to store both $A$ (NNZ elements) and $C$ (NNZC elements).
Constraint: LA≥2×NNZ.
- 5: IROW(LA) – INTEGER array Input/Output
- 6: ICOL(LA) – INTEGER array Input/Output
On entry: the row and column indices of the nonzero elements supplied in A.
Constraints: IROW and ICOL must satisfy these constraints (which may be imposed by a call to F11ZAF):
- 1≤IROW(i)≤N and 1≤ICOL(i)≤N, for i=1,2,…,NNZ;
- either IROW(i-1)<IROW(i) or both IROW(i-1)=IROW(i) and ICOL(i-1)<ICOL(i), for i=2,3,…,NNZ.
On exit: the row and column indices of the nonzero elements returned in A.
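The index constraints above can be checked before calling the routine. The Python sketch below is one way to do so; it is an illustration, not the validation performed by F11DFF itself, and the name `check_cs_indices` and the message strings are hypothetical. Index values are 1-based, and entry positions in the report are 1-based as well.

```python
def check_cs_indices(n, irow, icol):
    """Return (position, problem) pairs for the supplied index pairs.

    Checks the range constraint and the strict row-then-column ordering,
    which also rules out duplicate (row, column) pairs.
    """
    problems = []
    for k, (i, j) in enumerate(zip(irow, icol), start=1):
        if not (1 <= i <= n and 1 <= j <= n):
            problems.append((k, "index out of range"))
        if k > 1:
            prev = (irow[k - 2], icol[k - 2])
            # Tuple comparison is lexicographic: row first, then column.
            if (i, j) == prev:
                problems.append((k, "duplicate"))
            elif (i, j) < prev:
                problems.append((k, "out of order"))
    return problems


ok = check_cs_indices(3, [1, 1, 2, 3, 3], [1, 3, 2, 1, 3])
bad = check_cs_indices(3, [1, 1, 2, 2, 4], [3, 1, 2, 2, 1])
```

An empty list means the indices satisfy both constraints.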
- 7: NB – INTEGER Input
On entry: the number of diagonal blocks to factorize.
Constraint: 1≤NB≤N.
- 8: ISTB(NB+1) – INTEGER array Input
On entry: ISTB(b), for b=1,2,…,NB, holds the index in arrays INDB, IPIVP, IPIVQ and IDIAG at which the entries for block b begin. ISTB(NB+1) holds the sum of the number of rows in all blocks plus ISTB(1).
Constraint: ISTB(1)≥1, ISTB(b)<ISTB(b+1), for b=1,2,…,NB.
- 9: INDB(LINDB) – INTEGER array Input
On entry: INDB must hold the row indices appearing in each diagonal block, stored consecutively. Thus the elements INDB(ISTB(b)) to INDB(ISTB(b+1)-1) are the row indices in the bth block.
Constraint: 1≤INDB(m)≤N, for m=1,2,…,ISTB(NB+1)-1.
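The relationship between ISTB and INDB may be easier to see in a worked construction. The Python sketch below splits rows 1 to n into contiguous blocks and widens each by a fixed overlap; this is just one possible partitioning, chosen for illustration, and the function `build_blocks` is hypothetical. The lists hold the 1-based values that would be passed in ISTB(1..NB+1) and INDB.

```python
def build_blocks(n, nb, overlap):
    """Build ISTB and INDB values for nb contiguous blocks of rows 1..n,
    each extended by `overlap` rows on either side (clipped to 1..n).

    Assumes 1 <= nb <= n so that every block is non-empty.
    """
    istb, indb = [1], []
    for b in range(nb):
        # Non-overlapping range of this block (1-based, inclusive).
        lo = b * n // nb + 1
        hi = (b + 1) * n // nb
        # Extend by the overlap, staying inside the matrix.
        lo = max(1, lo - overlap)
        hi = min(n, hi + overlap)
        indb.extend(range(lo, hi + 1))
        # The next block starts one past the last entry written so far.
        istb.append(len(indb) + 1)
    return istb, indb


istb, indb = build_blocks(n=9, nb=3, overlap=1)
lindb_min = istb[-1] - 1  # smallest LINDB satisfying LINDB >= ISTB(NB+1)-1
```

Here rows 3, 4, 6 and 7 each appear in two blocks, so INDB has 13 entries for a matrix of order 9.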
- 10: LINDB – INTEGER Input
On entry: the dimension of the arrays INDB, IPIVP, IPIVQ and IDIAG as declared in the (sub)program from which F11DFF is called.
Constraint: LINDB≥ISTB(NB+1)-1.
- 11: LFILL(NB) – INTEGER array Input
On entry: if LFILL(b)≥0 its value is the maximum level of fill allowed in the decomposition of the block b (see Section 8.2 in F11DAF). A negative value of LFILL(b) indicates that DTOL(b) will be used to control the fill in block b instead.
- 12: DTOL(NB) – REAL (KIND=nag_wp) array Input
On entry: if LFILL(b)<0 then DTOL(b) is used as a drop tolerance in block b to control the fill-in (see Section 8.2 in F11DAF); otherwise DTOL(b) is not referenced.
Constraint: if LFILL(b)<0, DTOL(b)≥0.0, for b=1,2,…,NB.
- 13: PSTRAT(NB) – CHARACTER(1) array Input
On entry: PSTRAT(b), for b=1,2,…,NB, specifies the pivoting strategy to be adopted in block b as follows:
- PSTRAT(b)='N': No pivoting is carried out.
- PSTRAT(b)='U': Pivoting is carried out according to the user-defined input values of IPIVP and IPIVQ.
- PSTRAT(b)='P': Partial pivoting by columns for stability is carried out.
- PSTRAT(b)='C': Complete pivoting by rows for sparsity, and by columns for stability, is carried out.
Suggested value: PSTRAT(b)='C', for b=1,2,…,NB.
Constraint: PSTRAT(b)='N', 'U', 'P' or 'C', for b=1,2,…,NB.
- 14: MILU(NB) – CHARACTER(1) array Input
On entry: MILU(b), for b=1,2,…,NB, indicates whether or not the factorization in block b should be modified to preserve row-sums (see Section 8.4 in F11DAF).
- MILU(b)='M': The factorization is modified.
- MILU(b)='N': The factorization is not modified.
Constraint: MILU(b)='M' or 'N', for b=1,2,…,NB.
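Because LFILL, DTOL, PSTRAT and MILU are all set per block, it can help to check them together before the call. The Python sketch below applies the constraints stated for these four arguments; it is an illustration with a hypothetical function name and message strings, not the routine's own validation. It accepts upper-case option letters only, since this document does not say whether lower case is accepted.

```python
def check_block_options(lfill, dtol, pstrat, milu):
    """Return (b, message) pairs for blocks whose options break a
    documented constraint; b is the 1-based block number."""
    errors = []
    options = zip(lfill, dtol, pstrat, milu)
    for b, (lf, dt, ps, mi) in enumerate(options, start=1):
        # DTOL(b) is only referenced when LFILL(b) is negative.
        if lf < 0 and dt < 0.0:
            errors.append((b, "DTOL must be >= 0.0 when LFILL < 0"))
        if ps not in ("N", "U", "P", "C"):
            errors.append((b, "PSTRAT must be 'N', 'U', 'P' or 'C'"))
        if mi not in ("M", "N"):
            errors.append((b, "MILU must be 'M' or 'N'"))
    return errors


nb = 3
lfill = [0, -1, 2]          # block 2 uses the drop tolerance instead
dtol = [0.0, 1.0e-3, 0.0]   # only dtol[1] is referenced
pstrat = ["C"] * nb         # the suggested value for every block
milu = ["N", "N", "M"]
errors = check_block_options(lfill, dtol, pstrat, milu)
bad = check_block_options([-1], [-0.5], ["X"], ["N"])
```

The first call reports no problems; the second flags both the negative tolerance and the unknown pivoting option for block 1.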
- 15: IPIVP(LINDB) – INTEGER array Input/Output
- 16: IPIVQ(LINDB) – INTEGER array Input/Output
On entry: if PSTRAT(b)='U', then IPIVP(ISTB(b)+k-1) and IPIVQ(ISTB(b)+k-1) must specify the row and column indices of the element used as a pivot at elimination stage k of the factorization of block b. Otherwise IPIVP and IPIVQ need not be initialized.
Constraint: if PSTRAT(b)='U', the elements ISTB(b) to ISTB(b+1)-1 of IPIVP and IPIVQ must both hold valid permutations of the integers on [1,ISTB(b+1)-ISTB(b)].
On exit: the row and column indices of the pivot elements, arranged consecutively for each block, as for INDB. If IPIVP(ISTB(b)+k-1)=i and IPIVQ(ISTB(b)+k-1)=j, then the element in row i and column j of $A_b$ was used as the pivot at elimination stage k.
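For user-defined pivoting, the permutation constraint applies block by block, using indices local to the block. The Python sketch below checks this; the function `check_user_pivots` and the sample data are hypothetical, and all stored values are 1-based as in the Fortran arrays.

```python
def check_user_pivots(istb, pstrat, ipivp, ipivq):
    """Return the 1-based numbers of blocks with PSTRAT = 'U' whose
    pivot segments are not permutations of 1..(block size)."""
    bad_blocks = []
    for b in range(1, len(istb)):
        if pstrat[b - 1] != "U":
            continue  # pivot entries need not be initialized
        start, stop = istb[b - 1], istb[b]  # 1-based, stop exclusive
        expected = list(range(1, stop - start + 1))
        seg_p = ipivp[start - 1:stop - 1]   # shift to 0-based slices
        seg_q = ipivq[start - 1:stop - 1]
        if sorted(seg_p) != expected or sorted(seg_q) != expected:
            bad_blocks.append(b)
    return bad_blocks


istb = [1, 4, 6]         # block 1 has 3 rows, block 2 has 2 rows
pstrat = ["U", "U"]
ipivp = [2, 1, 3, 1, 2]
ipivq = [1, 2, 3, 2, 2]  # block 2: the value 2 is repeated
bad_blocks = check_user_pivots(istb, pstrat, ipivp, ipivq)
```

Only block 2 is reported, because its column pivots repeat an index.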
- 17: ISTR(LINDB+1) – INTEGER array Output
On exit: ISTR(ISTB(b)+k-1) gives the starting address in the arrays A, IROW and ICOL of row k of the matrix $C_b$, for b=1,2,…,NB and k=1,2,…,ISTB(b+1)-ISTB(b). ISTR(ISTB(NB+1)) contains NNZ+NNZC+1.
- 18: IDIAG(LINDB) – INTEGER array Output
On exit: IDIAG(ISTB(b)+k-1) gives the address in the arrays A, IROW and ICOL of the diagonal element in row k of the matrix $C_b$, for b=1,2,…,NB and k=1,2,…,ISTB(b+1)-ISTB(b).
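The addressing scheme of ISTR and IDIAG can be sketched in Python as follows. This rests on an assumption inferred from the description rather than stated in it: because the rows are stored consecutively and the final ISTR entry is NNZ+NNZC+1, row k is taken to end just before the next starting address. The function names and all array values are hypothetical placeholders chosen only to show the indexing; they are not the output of a real call.

```python
def row_of_c(b, k, istb, istr, a, icol):
    """Return [(stored column index, value), ...] for row k of block b.

    b and k are 1-based. Assumes the row ends one entry before the next
    starting address held in ISTR.
    """
    p = istb[b - 1] + k - 1   # 1-based position in ISTR
    first = istr[p - 1]       # 1-based address of the first entry
    last = istr[p] - 1        # 1-based address of the last entry
    return [(icol[q - 1], a[q - 1]) for q in range(first, last + 1)]


def diagonal_of_c(b, k, istb, idiag, a):
    """Return the stored diagonal element of row k of block b."""
    p = istb[b - 1] + k - 1
    return a[idiag[p - 1] - 1]


# Placeholder data: NNZ = 3 entries of the matrix, then NNZC = 3 entries
# belonging to a single block with two rows.
istb = [1, 3]
istr = [4, 6, 7]          # last entry is NNZ + NNZC + 1 = 7
idiag = [4, 6]
a = [1.0, 2.0, 3.0, 0.5, -0.25, 0.2]
icol = [1, 2, 2, 1, 2, 2]

row_1 = row_of_c(1, 1, istb, istr, a, icol)
row_2 = row_of_c(1, 2, istb, istr, a, icol)
diagonals = [diagonal_of_c(1, k, istb, idiag, a) for k in (1, 2)]
```

The sketch returns the column indices exactly as stored and makes no claim about whether they are local to the block or global.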
- 19: NNZC – INTEGER Output
On exit: the total number of nonzero elements in the matrices $C_b$, for b=1,2,…,NB.
- 20: NPIVM(NB) – INTEGER array Output
On exit: if NPIVM(b)>0 it gives the number of pivots which were modified during the factorization to ensure that $M_b$ exists.
If NPIVM(b)=-1 no pivot modifications were required, but a local restart occurred (see Section 8.3 in F11DAF). The quality of the preconditioner will generally depend on the returned values of NPIVM(b), for b=1,2,…,NB.
If NPIVM(b) is large, for some b, the preconditioner may not be satisfactory. In this case it may be advantageous to call F11DFF again with an increased value of LFILL(b), a reduced value of DTOL(b), or PSTRAT(b)='C'.
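A caller might summarize the returned NPIVM values per block along the following lines. This Python sketch is illustrative only: the function name and status strings are hypothetical, the `threshold` for "large" is a caller-chosen value because this document gives no number, and a value of 0 is assumed here to mean that no pivots were modified.

```python
def summarize_npivm(npivm, threshold):
    """Classify each block from its NPIVM value.

    Returns (b, status) pairs with 1-based block numbers.
    """
    report = []
    for b, count in enumerate(npivm, start=1):
        if count == -1:
            status = "local restart"
        elif count > threshold:
            status = "many pivots modified"
        elif count > 0:
            status = "some pivots modified"
        else:
            # Assumption: any remaining value means no modification.
            status = "no pivots modified"
        report.append((b, status))
    return report


report = summarize_npivm([0, -1, 2, 40], threshold=10)
```

Blocks reported as having many modified pivots are the candidates for a repeat call with more fill, a smaller drop tolerance, or complete pivoting.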
- 21: IWORK(LIWORK) – INTEGER array Workspace
- 22: LIWORK – INTEGER Input
On entry: the dimension of the array IWORK as declared in the (sub)program from which F11DFF is called.
Constraint: LIWORK≥9×N+3.
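The three array-size constraints in this section (on LA, LINDB and LIWORK) can be gathered into one helper. The Python sketch below simply evaluates the documented lower bounds; the function name is hypothetical and `istb` is a list holding the 1-based ISTB values.

```python
def minimum_sizes(n, nnz, istb):
    """Smallest dimensions allowed by the documented constraints."""
    return {
        "LA": 2 * nnz,          # LA >= 2*NNZ
        "LINDB": istb[-1] - 1,  # LINDB >= ISTB(NB+1) - 1
        "LIWORK": 9 * n + 3,    # LIWORK >= 9*N + 3
    }


sizes = minimum_sizes(n=9, nnz=25, istb=[1, 5, 10, 14])
```

These are the minimum values accepted on entry. A larger LA may still be needed in practice, since IFAIL=4 is returned when the decomposition does not fit.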
- 23: IFAIL – INTEGER Input/Output
On entry: IFAIL must be set to 0, -1 or 1. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value -1 or 1 is recommended. If the output of error messages is undesirable, then the value 1 is recommended. Otherwise, if you are not familiar with this parameter, the recommended value is 0. When the value -1 or 1 is used it is essential to test the value of IFAIL on exit.
On exit: IFAIL=0 unless the routine detects an error or a warning has been flagged (see Section 6).
6 Error Indicators and Warnings
If on entry IFAIL=0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF).
Errors or warnings detected by the routine:
- IFAIL=1
On entry, DTOL(⟨value⟩)=⟨value⟩.
Constraint: DTOL(b)≥0.0 for all b.
On entry, for b=⟨value⟩, ISTB(b+1)=⟨value⟩ and ISTB(b)=⟨value⟩.
Constraint: ISTB(b+1)>ISTB(b) for all b.
On entry, INDB(⟨value⟩)=⟨value⟩ and N=⟨value⟩.
Constraint: 1≤INDB(b)≤N for all b.
On entry, ISTB(1)=⟨value⟩.
Constraint: ISTB(1)≥1.
On entry, LA=⟨value⟩ and NNZ=⟨value⟩.
Constraint: LA≥2×NNZ.
On entry, LINDB=⟨value⟩, ISTB(NB+1)-1=⟨value⟩ and NB=⟨value⟩.
Constraint: LINDB≥ISTB(NB+1)-1.
On entry, LIWORK=⟨value⟩.
Constraint: LIWORK≥⟨value⟩.
On entry, MILU(⟨value⟩)=⟨value⟩.
Constraint: MILU(b)='M' or 'N' for all b.
On entry, N=⟨value⟩.
Constraint: N≥1.
On entry, NB=⟨value⟩ and N=⟨value⟩.
Constraint: 1≤NB≤N.
On entry, NNZ=⟨value⟩.
Constraint: NNZ≥1.
On entry, NNZ=⟨value⟩ and N=⟨value⟩.
Constraint: NNZ≤N².
On entry, PSTRAT(⟨value⟩)=⟨value⟩.
Constraint: PSTRAT(b)='N', 'U', 'P' or 'C' for all b.
- IFAIL=2
On entry, element ⟨value⟩ of A was out of order.
On entry, ICOL(⟨value⟩)=⟨value⟩ and N=⟨value⟩.
Constraint: 1≤ICOL(j)≤N for all j.
On entry, IROW(⟨value⟩)=⟨value⟩ and N=⟨value⟩.
Constraint: 1≤IROW(i)≤N for all i.
On entry, location ⟨value⟩ of (IROW,ICOL) was a duplicate.
- IFAIL=3
On entry, the user-supplied value of IPIVP for block ⟨value⟩ lies outside the range [1,N].
On entry, the user-supplied value of IPIVP for block ⟨value⟩ was repeated.
On entry, the user-supplied value of IPIVQ for block ⟨value⟩ lies outside the range [1,N].
On entry, the user-supplied value of IPIVQ for block ⟨value⟩ was repeated.
- IFAIL=4
The number of nonzero entries in the decomposition is too large. The decomposition has been terminated before completion. Either increase LA, or reduce the fill by reducing LFILL, or increasing DTOL.
7 Accuracy
The accuracy of the factorization of each block $A_b$ will be determined by the size of the elements that are dropped and the size of any modifications made to the pivot elements. If these sizes are small then the computed factors will correspond to a matrix close to $A_b$. The factorization can generally be made more accurate by increasing the level of fill LFILL(b), or by reducing the drop tolerance DTOL(b) with LFILL(b)<0.
If F11DFF is used in combination with F11BEF or F11DGF, the more accurate the factorization the fewer iterations will be required. However, the cost of the decomposition will also generally increase.
8 Further Comments
F11DFF calls F11DAF internally for each block $A_b$. The comments and advice provided in Section 8 in F11DAF on timing, control of fill, algorithmic details, and choice of parameters, are all therefore relevant to F11DFF, if interpreted blockwise.
9 Example
This example program reads in a sparse matrix $A$, defines a block partitioning of the row indices with a user-supplied overlap, and computes an overlapping incomplete LU factorization suitable for use as an additive Schwarz preconditioner. Such a factorization is used for this purpose in the example program of F11DGF.
9.1 Program Text
Program Text (f11dffe.f90)
9.2 Program Data
Program Data (f11dffe.d)
9.3 Program Results
Program Results (f11dffe.r)