NAG FL Interface
f01bsf (real_gen_sparse_lu_reuse)
1
Purpose
f01bsf factorizes a real sparse matrix using the pivotal sequence previously obtained by
f01brf when a matrix of the same sparsity pattern was factorized.
2
Specification
Fortran Interface
Subroutine f01bsf ( 
n, nz, a, licn, ivect, jvect, icn, ikeep, iw, w, grow, eta, rpmin, abort, idisp, ifail) 
Integer, Intent (In) 
:: 
n, nz, licn, ivect(nz), jvect(nz), ikeep(5*n), idisp(2) 
Integer, Intent (Inout) 
:: 
icn(licn), ifail 
Integer, Intent (Out) 
:: 
iw(5*n) 
Real (Kind=nag_wp), Intent (In) 
:: 
eta 
Real (Kind=nag_wp), Intent (Inout) 
:: 
a(licn) 
Real (Kind=nag_wp), Intent (Out) 
:: 
w(n), rpmin 
Logical, Intent (In) 
:: 
grow, abort 

C Header Interface
#include <nag.h>
void 
f01bsf_ (const Integer *n, const Integer *nz, double a[], const Integer *licn, const Integer ivect[], const Integer jvect[], Integer icn[], const Integer ikeep[], Integer iw[], double w[], const logical *grow, const double *eta, double *rpmin, const logical *abort, const Integer idisp[], Integer *ifail) 

C++ Header Interface
#include <nag.h> extern "C" {
void 
f01bsf_ (const Integer &n, const Integer &nz, double a[], const Integer &licn, const Integer ivect[], const Integer jvect[], Integer icn[], const Integer ikeep[], Integer iw[], double w[], const logical &grow, const double &eta, double &rpmin, const logical &abort, const Integer idisp[], Integer &ifail) 
}

The routine may be called by the names f01bsf or nagf_matop_real_gen_sparse_lu_reuse.
3
Description
f01bsf accepts as input a real sparse matrix of the same sparsity pattern as a matrix previously factorized by a call of
f01brf. It first applies to the matrix the same permutations as were used by
f01brf, both for permutation to block triangular form and for pivoting, and then performs Gaussian elimination to obtain the
$LU$ factorization of the diagonal blocks.
Extensive data checks are made; duplicated nonzeros can be accumulated.
The factorization is intended to be used by
f04axf to solve sparse systems of linear equations
$Ax=b$ or
${A}^{\mathrm{T}}x=b$.
f01bsf is much faster than
f01brf and in some applications it is expected that there will be many calls of
f01bsf for each call of
f01brf.
The method is fully described in
Duff (1977).
A more recent algorithm for the same calculation is provided by
f11mef.
4
References
Duff I S (1977) MA28 – a set of Fortran subroutines for sparse unsymmetric linear equations AERE Report R8730 HMSO
5
Arguments

1:
$\mathbf{n}$ – Integer
Input

On entry: $n$, the order of the matrix $A$.
Constraint:
${\mathbf{n}}>0$.

2:
$\mathbf{nz}$ – Integer
Input

On entry: the number of nonzero elements in the matrix $A$.
Constraint:
${\mathbf{nz}}>0$.

3:
$\mathbf{a}\left({\mathbf{licn}}\right)$ – Real (Kind=nag_wp) array
Input/Output

On entry: ${\mathbf{a}}\left(\mathit{i}\right)$, for $\mathit{i}=1,2,\dots ,{\mathbf{nz}}$, must contain the nonzero elements of the sparse matrix $A$. They can be in any order since f01bsf will reorder them.
On exit: the nonzero elements in the
$LU$ factorization. The array must
not be changed by you between a call of
f01bsf and a call of
f04axf.

4:
$\mathbf{licn}$ – Integer
Input

On entry: the dimension of the arrays
a and
icn as declared in the (sub)program from which
f01bsf is called. It should have the same value as it had for
f01brf.
Constraint:
${\mathbf{licn}}\ge {\mathbf{nz}}$.

5:
$\mathbf{ivect}\left({\mathbf{nz}}\right)$ – Integer array
Input

6:
$\mathbf{jvect}\left({\mathbf{nz}}\right)$ – Integer array
Input

On entry: ${\mathbf{ivect}}\left(\mathit{i}\right)$ and ${\mathbf{jvect}}\left(\mathit{i}\right)$, for $\mathit{i}=1,2,\dots ,{\mathbf{nz}}$, must contain the row index and the column index respectively of the nonzero element stored in ${\mathbf{a}}\left(i\right)$.

7:
$\mathbf{icn}\left({\mathbf{licn}}\right)$ – Integer array
Input

icn contains, on entry, the same information as output by
f01brf. It must not be changed by you between a call of
f01bsf and a call of
f04axf.
icn is used as internal workspace prior to being restored on exit and hence is unchanged.

8:
$\mathbf{ikeep}\left(5\times {\mathbf{n}}\right)$ – Integer array
Communication Array

On entry: the same indexing information about the factorization as output in
ikeep by
f01brf.
You must
not change
ikeep between a call of
f01bsf and subsequent calls to
f04axf.

9:
$\mathbf{iw}\left(5\times {\mathbf{n}}\right)$ – Integer array
Workspace


10:
$\mathbf{w}\left({\mathbf{n}}\right)$ – Real (Kind=nag_wp) array
Output

On exit: if
${\mathbf{grow}}=\mathrm{.TRUE.}$,
${\mathbf{w}}\left(1\right)$ contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization (see
grow); the rest of the array is used as workspace.
If ${\mathbf{grow}}=\mathrm{.FALSE.}$, the array is not used.

11:
$\mathbf{grow}$ – Logical
Input

On entry: if
${\mathbf{grow}}=\mathrm{.TRUE.}$, then on exit
${\mathbf{w}}\left(1\right)$ contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization. If the matrix is wellscaled (see
Section 9), then a high value for
${\mathbf{w}}\left(1\right)$ indicates that the
$LU$ factorization may be inaccurate and you should be wary of the results and perhaps increase the argument
pivot for subsequent runs (see
Section 7).

12:
$\mathbf{eta}$ – Real (Kind=nag_wp)
Input

On entry: the relative pivot threshold below which an error diagnostic is provoked and
ifail is set to
${\mathbf{ifail}}={\mathbf{7}}$. If
eta is greater than
$1.0$, then no check on pivot size is made.
Suggested value:
${\mathbf{eta}}={10}^{4}$.

13:
$\mathbf{rpmin}$ – Real (Kind=nag_wp)
Output

On exit: if
eta is less than
$1.0$, then
rpmin gives the smallest ratio of the pivot to the largest element in the row of the corresponding upper triangular factor thus monitoring the stability of the factorization. If
rpmin is very small it may be advisable to perform a new factorization using
f01brf.

14:
$\mathbf{abort}$ – Logical
Input

On entry: if
${\mathbf{abort}}=\mathrm{.TRUE.}$,
f01bsf exits immediately (with
${\mathbf{ifail}}={\mathbf{8}}$) if it finds duplicate elements in the input matrix.
If ${\mathbf{abort}}=\mathrm{.FALSE.}$, f01bsf proceeds using a value equal to the sum of the duplicate elements.
In either case details of each duplicate element are output on the current advisory message unit (see
x04abf), unless suppressed by the value of
ifail on entry.
Suggested value:
${\mathbf{abort}}=\mathrm{.TRUE.}$.

15:
$\mathbf{idisp}\left(2\right)$ – Integer array
Communication Array

On entry:
${\mathbf{idisp}}\left(1\right)$ and
${\mathbf{idisp}}\left(2\right)$ must be as output in
idisp by the previous call of
f01brf.

16:
$\mathbf{ifail}$ – Integer
Input/Output

This routine uses an
ifail input value codification that differs from the normal case to distinguish between errors and warnings (see
Section 4 in the Introduction to the NAG Library FL Interface).
On entry:
ifail must be set to one of the values below to set behaviour on detection of an error; these values have no effect when no error is detected. The behaviour relate to whether or not program execution is halted and whether or not messages are printed when an error or warning is detected.
ifail 
Execution 
Error Printing 
Warning Printed 
$\phantom{00}0$ 
halted 
No 
No 
$\phantom{00}1$ 
continue 
No 
No 
$\phantom{0}10$ 
halted 
Yes 
No 
$\phantom{0}11$ 
continue 
Yes 
No 
$100$ 
halted 
No 
Yes 
$101$ 
continue 
No 
Yes 
$110$ 
halted 
Yes 
Yes 
$111$ 
continue 
Yes 
Yes 
For environments where it might be inappropriate to halt program execution when an error is detected, the value
$1$,
$11$,
$101$ or
$111$ is recommended. If the printing of messages is undesirable, then the value
$1$ is recommended. Otherwise, the recommended value is
$110$.
When the value $\mathbf{1}$, $\mathbf{11}$, $\mathbf{101}$ or $\mathbf{111}$ is used it is essential to test the value of ifail on exit.
On exit:
${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see
Section 6).
6
Error Indicators and Warnings
If on entry
${\mathbf{ifail}}=0$ or
$1$, explanatory error messages are output on the current error message unit (as defined by
x04aaf).
Errors or warnings detected by the routine:
 ${\mathbf{ifail}}=1$

On entry, ${\mathbf{n}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{n}}>0$.
 ${\mathbf{ifail}}=2$

On entry, ${\mathbf{nz}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{nz}}>0$.
 ${\mathbf{ifail}}=3$

On entry, ${\mathbf{licn}}=\u2329\mathit{\text{value}}\u232a$ and ${\mathbf{nz}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{licn}}\ge {\mathbf{nz}}$.
 ${\mathbf{ifail}}=4$

On entry,
${\mathbf{irn}}\left(I\right)$ in
f01brf or
${\mathbf{icn}}\left(I\right)$ is out of range:
$I=\u2329\mathit{\text{value}}\u232a$,
${\mathbf{a}}\left(I\right)=\u2329\mathit{\text{value}}\u232a$ ${\mathbf{irn}}\left(I\right)=\u2329\mathit{\text{value}}\u232a$ in
f01brf,
${\mathbf{icn}}\left(I\right)=\u2329\mathit{\text{value}}\u232a$.
 ${\mathbf{ifail}}=5$

Nonzero element ($\u2329\mathit{\text{value}}\u232a$, $\u2329\mathit{\text{value}}\u232a$) in zero offdiagonal block.
Nonzero element ($\u2329\mathit{\text{value}}\u232a$, $\u2329\mathit{\text{value}}\u232a$) was not in L/U pattern.
 ${\mathbf{ifail}}=6$

Numerical singularity in row $\u2329\mathit{\text{value}}\u232a$  decomposition aborted.
 ${\mathbf{ifail}}=7$

Subthreshold pivot in row $\u2329\mathit{\text{value}}\u232a$  decomposition completed.
 ${\mathbf{ifail}}=8$

On entry, duplicate elements found – see advisory messages.
 ${\mathbf{ifail}}=99$
An unexpected error has been triggered by this routine. Please
contact
NAG.
See
Section 7 in the Introduction to the NAG Library FL Interface for further information.
 ${\mathbf{ifail}}=399$
Your licence key may have expired or may not have been installed correctly.
See
Section 8 in the Introduction to the NAG Library FL Interface for further information.
 ${\mathbf{ifail}}=999$
Dynamic memory allocation failed.
See
Section 9 in the Introduction to the NAG Library FL Interface for further information.
7
Accuracy
The factorization obtained is exact for a perturbed matrix whose $\left(i,j\right)$th element differs from ${a}_{ij}$ by less than $3\epsilon \rho {m}_{ij}$ where $\epsilon $ is the machine precision, $\rho $ is the growth value returned in ${\mathbf{w}}\left(1\right)$ if ${\mathbf{grow}}=\mathrm{.TRUE.}$, and ${m}_{ij}$ the number of Gaussian elimination operations applied to element $\left(i,j\right)$.
If
$\rho ={\mathbf{w}}\left(1\right)$ is very large or
rpmin is very small, then a fresh call of
f01brf is recommended.
8
Parallelism and Performance
f01bsf is not threaded in any implementation.
If you have a sequence of problems with the same sparsity pattern then
f01bsf is recommended after
f01brf has been called for one such problem. It is typically
$4$ to
$7$ times faster but is potentially unstable since the previous pivotal sequence is used. Further details on timing are given in the document for
f01brf.
If growth estimation is performed (${\mathbf{grow}}=\mathrm{.TRUE.}$), then the time increases by between $5\%$ and $10\%$. Pivot size monitoring (${\mathbf{eta}}\le 1.0$) involves a similar overhead.
We normally expect this routine to be entered with a matrix having the same pattern of nonzeros as was earlier presented to
f01brf. However there is no record of this pattern, but rather a record of the pattern including all fillins. Therefore we permit additional nonzeros in positions corresponding to fillins.
If singular matrices are being treated then it is also required that the present matrix be sufficiently like the previous one for the same permutations to be suitable for factorization with the same set of zero pivots.
10
Example
This example factorizes the real sparse matrices
and
This example program simply prints the values of
${\mathbf{w}}\left(1\right)$ and
rpmin returned by
f01bsf. Normally the calls of
f01brf and
f01bsf would be followed by calls of
f04axf.
10.1
Program Text
10.2
Program Data
10.3
Program Results