
# NAG Toolbox: nag_sparse_real_gen_precon_bdilu (f11df)

## Purpose

nag_sparse_real_gen_precon_bdilu (f11df) computes a block diagonal incomplete $LU$ factorization of a real sparse nonsymmetric matrix, represented in coordinate storage format. The diagonal blocks may be composed of arbitrary rows and the corresponding columns, and may overlap. This factorization can be used to provide a block Jacobi or additive Schwarz preconditioner, for use in combination with nag_sparse_real_gen_basic_solver (f11be) or nag_sparse_real_gen_solve_bdilu (f11dg).

## Syntax

[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = f11df(n, nnz, a, irow, icol, istb, indb, lfill, dtol, milu, ipivp, ipivq, 'la', la, 'nb', nb, 'lindb', lindb, 'pstrat', pstrat)
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = nag_sparse_real_gen_precon_bdilu(n, nnz, a, irow, icol, istb, indb, lfill, dtol, milu, ipivp, ipivq, 'la', la, 'nb', nb, 'lindb', lindb, 'pstrat', pstrat)

## Description

nag_sparse_real_gen_precon_bdilu (f11df) computes an incomplete $LU$ factorization (see Meijerink and Van der Vorst (1977) and Meijerink and Van der Vorst (1981)) of the (possibly overlapping) diagonal blocks $A_b$, $b = 1,2,\dots,\mathbf{nb}$, of a real sparse nonsymmetric $n$ by $n$ matrix $A$. The factorization is intended primarily for use as a block Jacobi or additive Schwarz preconditioner (see Saad (1996)), with one of the iterative solvers nag_sparse_real_gen_basic_solver (f11be) and nag_sparse_real_gen_solve_bdilu (f11dg).
The nb diagonal blocks need not consist of consecutive rows and columns of $A$, but may be composed of arbitrarily indexed rows, and the corresponding columns, as defined in the arguments indb and istb. Any given row or column index may appear in more than one diagonal block, resulting in overlap. Each diagonal block $A_b$, $b = 1,2,\dots,\mathbf{nb}$, is factorized as:
 $A_b = M_b + R_b$
where
 $M_b = P_b L_b D_b U_b Q_b$
and $L_b$ is lower triangular with unit diagonal elements, $D_b$ is diagonal, $U_b$ is upper triangular with unit diagonals, $P_b$ and $Q_b$ are permutation matrices, and $R_b$ is a remainder matrix.
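To see how overlapping diagonal blocks combine into a preconditioner, the following minimal sketch applies one additive Schwarz step, $z = \sum_b R_b^T A_b^{-1} R_b r$, where $R_b$ selects the rows listed for block $b$. It is an illustration only: it does not call the NAG Toolbox, it uses dense exact solves in place of the incomplete factors $M_b$, and the test matrix and the names `blocks`, `r`, `z` and `zfull` are assumptions made for the example.

```matlab
% Small tridiagonal test matrix (an assumption for illustration).
n = 6;
A = 4*eye(n) - diag(ones(n-1,1), 1) - diag(ones(n-1,1), -1);

% Two overlapping blocks: rows 3 and 4 appear in both.
blocks = {[1 2 3 4], [3 4 5 6]};

r = ones(n, 1);          % residual vector to be preconditioned
z = zeros(n, 1);         % preconditioned residual
for b = 1:numel(blocks)
    idx = blocks{b};
    Ab  = A(idx, idx);   % diagonal block: chosen rows and matching columns
    % An exact solve stands in for the incomplete factorization M_b.
    z(idx) = z(idx) + Ab \ r(idx);
end

% With a single block covering every row, the step reduces to A \ r.
zfull = A(1:n, 1:n) \ r;
```

Contributions from overlapping rows are simply summed here; other weightings of the overlap are possible and are not covered by this sketch.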
The amount of fill-in occurring in the factorization of block $b$ can vary from zero to complete fill, and can be controlled by specifying either the maximum level of fill $\mathbf{lfill}(b)$, or the drop tolerance $\mathbf{dtol}(b)$.
The parameter $\mathbf{pstrat}(b)$ defines the pivoting strategy to be used in block $b$. The options currently available are no pivoting, user-defined pivoting, partial pivoting by columns for stability, and complete pivoting by rows for sparsity and by columns for stability. The factorization may optionally be modified to preserve the row-sums of the original block matrix.
The sparse matrix $A$ is represented in coordinate storage (CS) format (see Section [Coordinate storage (CS) format] in the F11 Chapter Introduction). The array a stores all the nonzero elements of the matrix $A$, while arrays irow and icol store the corresponding row and column indices respectively. Multiple nonzero elements may not be specified for the same row and column index.
The preconditioning matrices $M_b$, $b = 1,2,\dots,\mathbf{nb}$, are returned in terms of the CS representations of the matrices
 $C_b = L_b + D_b^{-1} + U_b - 2I.$
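The packed form $C_b$ works because the unit diagonals of $L_b$ and $U_b$ cancel against $-2I$, leaving the reciprocal pivots on the diagonal. The sketch below checks this on a small dense example in base MATLAB; the factor values are arbitrary assumptions and no toolbox function is called.

```matlab
% Unit lower triangular, diagonal, and unit upper triangular factors
% (values chosen for illustration only).
L = [1 0 0; 0.5 1 0; -0.25 0.2 1];
D = diag([4 2 5]);
U = [1 0.3 -0.1; 0 1 0.6; 0 0 1];
I = eye(3);

% Packed form: the unit diagonals cancel against -2I, leaving 1./diag(D).
C = L + inv(D) + U - 2*I;

% Unpack the three factors again.
L2 = tril(C, -1) + I;          % strictly lower part of C plus unit diagonal
D2 = diag(1 ./ diag(C));       % diagonal of C holds the reciprocal pivots
U2 = triu(C, 1) + I;           % strictly upper part of C plus unit diagonal

M = L2 * D2 * U2;              % equals L*D*U when P_b and Q_b are identities
```

This is consistent with the example output in this document, where the first diagonal entry of $C_1$ is 1.5625e-02, the reciprocal of the matrix entry 64.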

## References

Meijerink J and Van der Vorst H (1977) An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix. Math. Comput. 31, 148–162
Meijerink J and Van der Vorst H (1981) Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems. J. Comput. Phys. 44, 134–155
Saad Y (1996) Iterative Methods for Sparse Linear Systems. PWS Publishing Company, Boston, MA

## Parameters

### Compulsory Input Parameters

1:     n – int64/int32/nag_int scalar
$n$, the order of the matrix $A$.
Constraint: $\mathbf{n} \ge 1$.
2:     nnz – int64/int32/nag_int scalar
The number of nonzero elements in the matrix $A$.
Constraint: $1 \le \mathbf{nnz} \le \mathbf{n}^2$.
3:     a(la) – double array
la, the dimension of the array, must satisfy the constraint $\mathbf{la} \ge 2 \times \mathbf{nnz}$.
The nonzero elements in the matrix $A$, ordered by increasing row index, and by increasing column index within each row. Multiple entries for the same row and column indices are not permitted. The function nag_sparse_real_gen_sort (f11za) may be used to order the elements in this way.
4:     irow(la) – int64/int32/nag_int array
5:     icol(la) – int64/int32/nag_int array
la, the dimension of the array, must satisfy the constraint $\mathbf{la} \ge 2 \times \mathbf{nnz}$.
The row and column indices of the nonzero elements supplied in a.
Constraints:
irow and icol must satisfy these constraints (which may be imposed by a call to nag_sparse_real_gen_sort (f11za)):
• $1 \le \mathbf{irow}(i) \le \mathbf{n}$ and $1 \le \mathbf{icol}(i) \le \mathbf{n}$, for $i = 1,2,\dots,\mathbf{nnz}$;
• either $\mathbf{irow}(i-1) < \mathbf{irow}(i)$ or both $\mathbf{irow}(i-1) = \mathbf{irow}(i)$ and $\mathbf{icol}(i-1) < \mathbf{icol}(i)$, for $i = 2,3,\dots,\mathbf{nnz}$.
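As a rough illustration of the storage and ordering requirements above, the following sketch builds a, irow and icol from a MATLAB sparse matrix using base MATLAB only. The matrix, the choice `la = 3*nnz_A`, the use of int64 indices, and the names `T`, `nnz_A` and `ordered` are assumptions for the example; `sortrows` stands in here for the ordering that nag_sparse_real_gen_sort (f11za) can impose.

```matlab
% Example matrix (an assumption for illustration).
A = sparse([4 0 -1; -2 5 0; 0 -3 6]);

[r, c, v] = find(A);              % find returns entries in column-major order
T = sortrows([r, c, v], [1 2]);   % sort by row index, then by column index

nnz_A = size(T, 1);
la = 3 * nnz_A;                   % any la >= 2*nnz satisfies the constraint

a    = zeros(la, 1);
irow = zeros(la, 1, 'int64');
icol = zeros(la, 1, 'int64');
a(1:nnz_A)    = T(:, 3);
irow(1:nnz_A) = T(:, 1);
icol(1:nnz_A) = T(:, 2);

% Check the ordering constraint: strictly increasing (row, column) pairs,
% which also rules out duplicate entries.
dr = diff(double(irow(1:nnz_A)));
dc = diff(double(icol(1:nnz_A)));
ordered = all(dr > 0 | (dr == 0 & dc > 0));
```

The unused tail of the three arrays is workspace for the factors, which is why they are allocated longer than the number of nonzeros.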
6:     istb($\mathbf{nb}+1$) – int64/int32/nag_int array
$\mathbf{istb}(b)$, for $b = 1,2,\dots,\mathbf{nb}$, holds the index in arrays indb, ipivp, ipivq and idiag defining block $b$. $\mathbf{istb}(\mathbf{nb}+1)$ holds the sum of the number of rows in all blocks plus $\mathbf{istb}(1)$.
Constraint: $\mathbf{istb}(1) \ge 1$, $\mathbf{istb}(b) < \mathbf{istb}(b+1)$, for $b = 1,2,\dots,\mathbf{nb}$.
7:     indb(lindb) – int64/int32/nag_int array
lindb, the dimension of the array, must satisfy the constraint $\mathbf{lindb} \ge \mathbf{istb}(\mathbf{nb}+1) - 1$.
indb must hold the row indices appearing in each diagonal block, stored consecutively. Thus the elements $\mathbf{indb}(\mathbf{istb}(b))$ to $\mathbf{indb}(\mathbf{istb}(b+1)-1)$ are the row indices in the $b$th block.
Constraint: $1 \le \mathbf{indb}(m) \le \mathbf{n}$, for $m = 1,2,\dots,\mathbf{istb}(\mathbf{nb}+1)-1$.
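The relationship between istb and indb can be sketched as follows, starting from a list of row indices per block. This is a minimal base-MATLAB illustration; the block contents and the names `blocks`, `sizes` and `rows_b` are assumptions for the example, and int64 indices are assumed as in the example program in this document.

```matlab
% Row indices of each diagonal block (an assumption for illustration);
% rows 3, 4 and 6 each appear in two blocks, giving overlap.
blocks = {[1 2 3 4], [3 4 5 6], [6 7 8 9]};
nb = numel(blocks);

sizes = cellfun(@numel, blocks);
istb  = int64(cumsum([1, sizes]))';   % istb(b) = start of block b in indb
indb  = int64([blocks{:}])';          % row indices stored consecutively

% Rows of block 2, recovered through the pointer array.
b = 2;
rows_b = indb(istb(b):istb(b+1)-1);
```

With `istb(1) = 1`, the last pointer `istb(nb+1)` is one more than the total number of stored row indices, so `numel(indb)` meets the lower bound on lindb exactly.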
8:     lfill(nb) – int64/int32/nag_int array
nb, the dimension of the array, must satisfy the constraint $1 \le \mathbf{nb} \le \mathbf{n}$.
If $\mathbf{lfill}(b) \ge 0$ its value is the maximum level of fill allowed in the decomposition of the block $b$ (see Section [Control of Fill-in] in nag_sparse_real_gen_precon_ilu (f11da)). A negative value of $\mathbf{lfill}(b)$ indicates that $\mathbf{dtol}(b)$ will be used to control the fill in block $b$ instead.
9:     dtol(nb) – double array
nb, the dimension of the array, must satisfy the constraint $1 \le \mathbf{nb} \le \mathbf{n}$.
If $\mathbf{lfill}(b) < 0$ then $\mathbf{dtol}(b)$ is used as a drop tolerance in block $b$ to control the fill-in (see Section [Control of Fill-in] in nag_sparse_real_gen_precon_ilu (f11da)); otherwise $\mathbf{dtol}(b)$ is not referenced.
Constraint: if $\mathbf{lfill}(b) < 0$, $\mathbf{dtol}(b) \ge 0.0$, for $b = 1,2,\dots,\mathbf{nb}$.
10:   milu(nb) – cell array of strings
nb, the dimension of the array, must satisfy the constraint $1 \le \mathbf{nb} \le \mathbf{n}$.
$\mathbf{milu}(b)$, for $b = 1,2,\dots,\mathbf{nb}$, indicates whether or not the factorization in block $b$ should be modified to preserve row-sums (see Section [Choice of Parameters] in nag_sparse_real_gen_precon_ilu (f11da)).
$\mathbf{milu}(b) = \text{'M'}$
The factorization is modified.
$\mathbf{milu}(b) = \text{'N'}$
The factorization is not modified.
Constraint: $\mathbf{milu}(b) = \text{'M'}$ or $\text{'N'}$, for $b = 1,2,\dots,\mathbf{nb}$.
11:   ipivp(lindb) – int64/int32/nag_int array
12:   ipivq(lindb) – int64/int32/nag_int array
lindb, the dimension of the array, must satisfy the constraint $\mathbf{lindb} \ge \mathbf{istb}(\mathbf{nb}+1) - 1$.
If $\mathbf{pstrat}(b) = \text{'U'}$, then $\mathbf{ipivp}(\mathbf{istb}(b)+k-1)$ and $\mathbf{ipivq}(\mathbf{istb}(b)+k-1)$ must specify the row and column indices of the element used as a pivot at elimination stage $k$ of the factorization of block $b$. Otherwise ipivp and ipivq need not be initialized.
Constraint: if $\mathbf{pstrat}(b) = \text{'U'}$, the elements $\mathbf{istb}(b)$ to $\mathbf{istb}(b+1)-1$ of ipivp and ipivq must both hold valid permutations of the integers on $[1, \mathbf{istb}(b+1) - \mathbf{istb}(b)]$.
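For blocks using user-defined pivoting, the permutation constraint above can be checked before the call. The following base-MATLAB sketch fills ipivp and ipivq blockwise and tests that each segment is a permutation of $1,\dots,m$, where $m$ is the block order. The block sizes, the chosen pivot orders, and the names `perms` and `valid` are assumptions for illustration.

```matlab
istb  = int64([1; 4; 6]);          % two blocks, of orders 3 and 2 (assumed)
nb    = numel(istb) - 1;
lindb = double(istb(nb+1)) - 1;

ipivp = zeros(lindb, 1, 'int64');
ipivq = zeros(lindb, 1, 'int64');

% Block 1: pivots in natural order; block 2: reverse order (both arbitrary).
perms = {[1 2 3], [2 1]};
valid = true;
for b = 1:nb
    s = istb(b);
    e = istb(b+1) - 1;
    ipivp(s:e) = perms{b};
    ipivq(s:e) = perms{b};
    m = double(e - s + 1);
    % A valid permutation of 1..m sorts to exactly 1..m.
    valid = valid && isequal(sort(double(ipivp(s:e)))', 1:m) ...
                  && isequal(sort(double(ipivq(s:e)))', 1:m);
end
```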

### Optional Input Parameters

1:     la – int64/int32/nag_int scalar
Default: The dimension of the arrays a, irow, icol. (An error is raised if these dimensions are not equal.)
The dimension of the arrays a, irow and icol as declared in the (sub)program from which nag_sparse_real_gen_precon_bdilu (f11df) is called. These arrays must be of sufficient size to store both $A$ (nnz elements) and $C$ (nnzc elements).
Constraint: $\mathbf{la} \ge 2 \times \mathbf{nnz}$.
2:     nb – int64/int32/nag_int scalar
Default: The dimension of the arrays lfill, dtol, pstrat, milu. (An error is raised if these dimensions are not equal.)
The number of diagonal blocks to factorize.
Constraint: $1 \le \mathbf{nb} \le \mathbf{n}$.
3:     lindb – int64/int32/nag_int scalar
Default: The dimension of the arrays indb, ipivp, ipivq. (An error is raised if these dimensions are not equal.)
The dimension of the arrays indb, ipivp, ipivq and idiag as declared in the (sub)program from which nag_sparse_real_gen_precon_bdilu (f11df) is called.
Constraint: $\mathbf{lindb} \ge \mathbf{istb}(\mathbf{nb}+1) - 1$.
4:     pstrat(nb) – cell array of strings
$\mathbf{pstrat}(b)$, for $b = 1,2,\dots,\mathbf{nb}$, specifies the pivoting strategy to be adopted in block $b$ as follows:
$\mathbf{pstrat}(b) = \text{'N'}$
No pivoting is carried out.
$\mathbf{pstrat}(b) = \text{'U'}$
Pivoting is carried out according to the user-defined input values of ipivp and ipivq.
$\mathbf{pstrat}(b) = \text{'P'}$
Partial pivoting by columns for stability is carried out.
$\mathbf{pstrat}(b) = \text{'C'}$
Complete pivoting by rows for sparsity, and by columns for stability, is carried out.
Default: $\text{'C'}$
Constraint: $\mathbf{pstrat}(b) = \text{'N'}$, $\text{'U'}$, $\text{'P'}$ or $\text{'C'}$, for $b = 1,2,\dots,\mathbf{nb}$.
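The per-block option arrays can be assembled and sanity-checked before the call, as in the following base-MATLAB sketch. All values and the names `ok_len`, `ok_milu`, `ok_pstrat`, `ok_dtol` and `uses_dtol` are assumptions for illustration. The comparison is made case-insensitive because the example program in this document passes lowercase letters; whether every release accepts lowercase is not stated here.

```matlab
nb = 3;                                   % number of blocks (assumed)

lfill  = int64([0; -1; 2]);               % block 2 uses dtol instead
dtol   = [0; 1e-3; 0];                    % only dtol(2) is referenced
milu   = {'N'; 'N'; 'M'};                 % preserve row sums in block 3
pstrat = {'C'; 'P'; 'N'};                 % pivoting strategy per block

% Basic validation mirroring the documented constraints.
ok_len    = all([numel(lfill), numel(dtol), numel(milu), numel(pstrat)] == nb);
ok_milu   = all(ismember(upper(milu),   {'M', 'N'}));
ok_pstrat = all(ismember(upper(pstrat), {'N', 'U', 'P', 'C'}));
ok_dtol   = all(dtol(lfill < 0) >= 0);

uses_dtol = find(lfill < 0)';             % blocks controlled by drop tolerance
```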


### Output Parameters

1:     a(la) – double array
The first nnz entries of a contain the nonzero elements of $A$ and the next nnzc entries contain the elements of the matrices $C_b$, for $b = 1,2,\dots,\mathbf{nb}$, stored consecutively. Within each block the matrix elements are ordered by increasing row index, and by increasing column index within each row.
2:     irow(la) – int64/int32/nag_int array
3:     icol(la) – int64/int32/nag_int array
The row and column indices of the nonzero elements returned in a.
4:     ipivp(lindb) – int64/int32/nag_int array
5:     ipivq(lindb) – int64/int32/nag_int array
The row and column indices of the pivot elements, arranged consecutively for each block, as for indb. If $\mathbf{ipivp}(\mathbf{istb}(b)+k-1) = i$ and $\mathbf{ipivq}(\mathbf{istb}(b)+k-1) = j$, then the element in row $i$ and column $j$ of $A_b$ was used as the pivot at elimination stage $k$.
6:     istr($\mathbf{lindb}+1$) – int64/int32/nag_int array
$\mathbf{istr}(\mathbf{istb}(b)+k-1)$ gives the starting address in the arrays a, irow and icol of row $k$ of the matrix $C_b$, for $b = 1,2,\dots,\mathbf{nb}$ and $k = 1,2,\dots,\mathbf{istb}(b+1)-\mathbf{istb}(b)$.
$\mathbf{istr}(\mathbf{istb}(\mathbf{nb}+1))$ contains $\mathbf{nnz}+\mathbf{nnzc}+1$.
7:     idiag(lindb) – int64/int32/nag_int array
$\mathbf{idiag}(\mathbf{istb}(b)+k-1)$ gives the address in the arrays a, irow and icol of the diagonal element in row $k$ of the matrix $C_b$, for $b = 1,2,\dots,\mathbf{nb}$ and $k = 1,2,\dots,\mathbf{istb}(b+1)-\mathbf{istb}(b)$.
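The address arithmetic for istr and idiag can be sketched as follows. The arrays below are hand-made to mimic the documented output layout for a single 2-by-2 block $A_b = [4\ 2;\ 1\ 3]$ factorized without pivoting; they are assumptions for illustration and were not produced by the toolbox. Row and column indices of $C_b$ are taken to be local to the block, as the example output in this document suggests.

```matlab
% Hand-made data in the documented layout (not toolbox output).
nnz_A = 4;
a     = [4; 2; 1; 3;   0.25; 0.5; 0.25; 0.4];   % entries of A, then of C_1
irow  = int64([1; 1; 2; 2;   1; 1; 2; 2]);
icol  = int64([1; 2; 1; 2;   1; 2; 1; 2]);
istb  = int64([1; 3]);
istr  = int64([5; 7; 9]);       % last entry is nnz + nnzc + 1
idiag = int64([5; 8]);

b = 1;
m = double(istb(b+1) - istb(b));         % order of block b
C = zeros(m);
for k = 1:m
    p = istb(b) + k - 1;
    for q = istr(p):istr(p+1)-1          % entries of row k of C_b
        C(irow(q), icol(q)) = a(q);
    end
end

d = 1 ./ a(idiag(istb(b):istb(b+1)-1));  % diagonal of D_b (C_b stores 1/d)
L = tril(C, -1) + eye(m);                % unit lower triangular factor
U = triu(C, 1) + eye(m);                 % unit upper triangular factor
M = L * diag(d) * U;                     % reproduces A_b for this data
```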
8:     nnzc – int64/int32/nag_int scalar
The sum total number of nonzero elements in the matrices $C_b$, for $b = 1,2,\dots,\mathbf{nb}$.
9:     npivm(nb) – int64/int32/nag_int array
If $\mathbf{npivm}(b) > 0$ it gives the number of pivots which were modified during the factorization to ensure that $M_b$ exists.
If $\mathbf{npivm}(b) = -1$ no pivot modifications were required, but a local restart occurred (see Section [Algorithmic Details] in nag_sparse_real_gen_precon_ilu (f11da)). The quality of the preconditioner will generally depend on the returned values of $\mathbf{npivm}(b)$, for $b = 1,2,\dots,\mathbf{nb}$.
If $\mathbf{npivm}(b)$ is large, for some $b$, the preconditioner may not be satisfactory. In this case it may be advantageous to call nag_sparse_real_gen_precon_bdilu (f11df) again with an increased value of $\mathbf{lfill}(b)$, a reduced value of $\mathbf{dtol}(b)$, or $\mathbf{pstrat}(b) = \text{'C'}$.
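One possible way to act on npivm is sketched below: blocks with many modified pivots get a higher fill level before a second call. The returned values, the cut-off `threshold`, and the names `restarted`, `suspect` and `lfill_new` are hypothetical; what counts as "large" is problem dependent and is not defined in this document.

```matlab
npivm = int64([0; 7; -1]);       % hypothetical return values
lfill = int64([0; 0; 1]);        % fill levels used in the first call
threshold = 3;                   % arbitrary cut-off for "large"

restarted = find(npivm == -1)';  % local restart, no pivots modified
suspect   = find(npivm > threshold)';

% Raise the fill level only for the suspect blocks before retrying.
lfill_new = lfill;
lfill_new(suspect) = lfill(suspect) + 1;
```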
10:   ifail – int64/int32/nag_int scalar
$\mathbf{ifail} = 0$ unless the function detects an error (see [Error Indicators and Warnings]).

## Error Indicators and Warnings

Errors or warnings detected by the function:
$\mathbf{ifail} = 1$
Constraint: $1 \le \mathbf{indb}(b) \le \mathbf{n}$ for all $b$.
Constraint: $1 \le \mathbf{nb} \le \mathbf{n}$.
Constraint: $\mathbf{dtol}(b) \ge 0.0$ for all $b$.
Constraint: $\mathbf{istb}(1) \ge 1$.
Constraint: $\mathbf{istb}(b+1) > \mathbf{istb}(b)$ for all $b$.
Constraint: $\mathbf{la} \ge 2 \times \mathbf{nnz}$.
Constraint: $\mathbf{lindb} \ge \mathbf{istb}(\mathbf{nb}+1) - 1$.
Constraint: $\mathbf{milu}(b) = \text{'M'}$ or $\text{'N'}$ for all $b$.
Constraint: $\mathbf{nnz} \le \mathbf{n}^2$.
Constraint: $\mathbf{nnz} \ge 1$.
Constraint: $\mathbf{n} \ge 1$.
Constraint: $\mathbf{pstrat}(b) = \text{'N'}$, $\text{'U'}$, $\text{'P'}$ or $\text{'C'}$ for all $b$.
liwork is too small.
$\mathbf{ifail} = 2$
Constraint: $1 \le \mathbf{icol}(j) \le \mathbf{n}$ for all $j$.
Constraint: $1 \le \mathbf{irow}(i) \le \mathbf{n}$ for all $i$.
On entry, element $\langle\mathit{value}\rangle$ of a was out of order.
On entry, location $\langle\mathit{value}\rangle$ of $(\mathbf{irow},\mathbf{icol})$ was a duplicate.
$\mathbf{ifail} = 3$
On entry, the user-supplied value of ipivp for block $\langle\mathit{value}\rangle$ lies outside the range $[1,\mathbf{n}]$.
On entry, the user-supplied value of ipivp for block $\langle\mathit{value}\rangle$ was repeated.
On entry, the user-supplied value of ipivq for block $\langle\mathit{value}\rangle$ lies outside the range $[1,\mathbf{n}]$.
On entry, the user-supplied value of ipivq for block $\langle\mathit{value}\rangle$ was repeated.
$\mathbf{ifail} = 4$
The number of nonzero entries in the decomposition is too large.
The decomposition has been terminated before completion.
Either increase la, or reduce the fill by reducing lfill, or increasing dtol.

## Accuracy

The accuracy of the factorization of each block $A_b$ will be determined by the size of the elements that are dropped and the size of any modifications made to the pivot elements. If these sizes are small then the computed factors will correspond to a matrix close to $A_b$. The factorization can generally be made more accurate by increasing the level of fill $\mathbf{lfill}(b)$, or by reducing the drop tolerance $\mathbf{dtol}(b)$ with $\mathbf{lfill}(b) < 0$.
If nag_sparse_real_gen_precon_bdilu (f11df) is used in combination with nag_sparse_real_gen_basic_solver (f11be) or nag_sparse_real_gen_solve_bdilu (f11dg), the more accurate the factorization the fewer iterations will be required. However, the cost of the decomposition will also generally increase.

## Further Comments

nag_sparse_real_gen_precon_bdilu (f11df) calls nag_sparse_real_gen_precon_ilu (f11da) internally for each block $A_b$. The comments and advice provided in Section [Further Comments] in nag_sparse_real_gen_precon_ilu (f11da) on timing, control of fill, algorithmic details, and choice of parameters, are all therefore relevant to nag_sparse_real_gen_precon_bdilu (f11df), if interpreted blockwise.

## Example

function nag_sparse_real_gen_precon_bdilu_example
n  = int64(9);
nz = int64(33);
a    = zeros(20*nz, 1);
irow = zeros(20*nz, 1, 'int64');
icol = zeros(20*nz, 1, 'int64');
a(1:nz) = [ 64; -20; -20; -12; 64; -20; -20; -12; 64; -20; -12; 64; -20; ...
           -20; -12; -12; 64; -20; -20; -12; -12; 64; -20; -12; 64; -20; ...
           -12; -12; 64; -20; -12; -12; 64];
irow(1:nz) = [int64(1); 1; 1; 2; 2; 2; 2; 3; 3; 3; 4; 4; 4; 4; 5; 5; 5; ...
              5; 5; 6; 6; 6; 6; 7; 7; 7; 8; 8; 8; 8; 9; 9; 9];
icol(1:nz) = [int64(1); 2; 4; 1; 2; 3; 5; 2; 3; 6; 1; 4; 5; 7; 2; 4; 5; ...
              6; 8; 3; 5; 6; 9; 4; 7; 8; 5; 7; 8; 9; 6; 8; 9];
nb = 3;
nover = 1;
lfill = [int64(0); 0; 0];
dtol = [0; 0; 0];
pstrat = {'n'; 'n'; 'n'};
milu   = {'n'; 'n'; 'n'};

% Define diagonal block indices.
% In this example use blocks of MB consecutive rows and initialise
% assuming no overlap.
mb = idivide(n+nb-1, int64(nb));
istb = zeros(nb+1, 1, 'int64');
indb = zeros(3*n, 1, 'int64');
ipivp = zeros(3*n, 1, 'int64');
ipivq = zeros(3*n, 1, 'int64');
for k=1:nb
  istb(k) = (k-1)*mb+1;
end
istb(nb+1) = n+1;
for i=1:n
  indb(i) = i;
end

% Modify indb and istb to account for overlap.
[istb, indb, ifail] = nag_sparse_real_gen_precon_bdilu_overlap(n, nz, ...
                        irow, icol, nb, istb, indb, 3*n, nover);
if (ifail == -999)
  error('indb is too small, size of indb = %d', numel(indb));
end

% Output matrix and blocking details
fprintf('\nOriginal matrix\n');
fprintf(' n   = %d\n', n);
fprintf(' nz  = %d\n', nz);
fprintf(' nb  = %d\n', nb);
for k=1:nb
  fprintf(' Block %d: order = %d, start row = %d\n', k, istb(k+1)-istb(k), ...
          min(indb(istb(k):istb(k+1)-1)));
end

% Calculate Factorisation
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = ...
  nag_sparse_real_gen_precon_bdilu(n, nz, a, irow, icol, istb, indb, ...
                                   lfill, dtol, milu, ipivp, ipivq, 'pstrat', pstrat);

% Output details of the factorization
fprintf('\nFactorization\n');
fprintf(' nnzc = %d\n\n', nnzc);
fprintf(' Elements of factorization\n\n');
fprintf('        I   J        C(I,J)     Index\n');
for k=1:nb
  fprintf(' C_%d   --------------------------------\n', k);
  % Elements of the k-th block
  for i = istr(istb(k)):istr(istb(k+1))-1
    fprintf('     %4d%4d%16e%8d\n', irow(i), icol(i), a(i), i);
  end
end

fprintf('\n Details of factorized blocks\n\n');
if max(npivm) > 0
  % Including pivoting details.
  fprintf('  K   I      ISTR(I)  IDIAG(I)   INDB(I)  IPIVP(I)  IPIVQ(I)\n');
  for k=1:nb
    i = istb(k);
    fprintf(' %4d%4d%10d%10d%10d%10d%10d\n', k, i, istr(i), idiag(i), ...
            indb(i), ipivp(i), ipivq(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf(' %7d%10d%10d%10d%10d%10d\n', i, istr(i), idiag(i), ...
              indb(i), ipivp(i), ipivq(i));
    end
    fprintf(' ------------------------------------\n');
  end
else
  % No pivoting on any block.
  fprintf('  K   I      ISTR(I)  IDIAG(I)   INDB(I)\n');
  for k=1:nb
    i = istb(k);
    fprintf('%3d%4d%10d%10d%10d\n', k, i, istr(i), idiag(i), indb(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf('%7d%10d%10d%10d\n', i, istr(i), idiag(i), indb(i));
    end
    fprintf(' ------------------------------------\n');
  end
end

function [istb, indb, ifail] = nag_sparse_real_gen_precon_bdilu_overlap(n, ...
                                 nz, irow, icol, nb, istb, indb, lindb, nover)

ifail = 0;

% This function takes a set of row indices indb defining the diagonal blocks
% to be used in nag_sparse_real_gen_precon_bdilu to define a block Jacobi or
% additive Schwarz preconditioner, and expands them to allow for nover levels
% of overlap.  The pointer array istb is also updated accordingly, so that
% the returned values of istb and indb can be passed to
% nag_sparse_real_gen_precon_bdilu to define overlapping diagonal blocks.
iwork = zeros(3*n+1, 1, 'int64');

% Find the number of non-zero elements in each row of the matrix A, and
% the start address of each row. Store the start addresses in
% iwork(n+1,...,2*n+1).
for k=1:nz
  iwork(irow(k)) = iwork(irow(k)) + 1;
end
iwork(n+1) = 1;
for i = 1:n
  iwork(n+i+1) = iwork(n+i) + iwork(i);
end

% Loop over blocks
for k=1:nb
  % Initialize marker array.
  iwork(1:n) = 0;

  % Mark the rows already in block k in the workspace array.
  for l = istb(k):istb(k+1)-1
    iwork(indb(l)) = 1;
  end

  % Loop over levels of overlap.
  for iover=1:nover
    % Initialize counter of new row indices to be added.
    ind = 0;

    % Loop over the rows currently in the diagonal block.
    for l = istb(k):istb(k+1)-1
      row = indb(l);

      % Loop over non-zero elements in row
      for i = iwork(n+row):iwork(n+row+1)-1

        % If the column index of the non-zero element is not in the
        % existing set for this block, store it to be added later, and
        % mark it in the marker array.
        if (iwork(icol(i))==0)
          iwork(icol(i)) = 1;
          ind = ind + 1;
          iwork(2*n+1+ind) = icol(i);
        end
      end
    end

    % Shift the indices in indb and add the new entries for block k.
    % Change istb accordingly.
    if (istb(nb+1) + ind - 1 > lindb)
      % Not enough room in indb for the ind new row indices.
      ifail = -999;
      return;
    end

    for i = istb(nb+1) - 1:-1:istb(k+1)
      indb(i+ind) = indb(i);
    end
    n21 = 2*n + 1;
    ik = istb(k+1) - 1;
    indb(ik+1:ik+ind) = iwork(n21+1:n21+ind);
    istb(k+1:nb+1) = istb(k+1:nb+1) + ind;
  end
end

Original matrix
n   = 9
nz  = 33
nb  = 3
Block 1: order = 6, start row = 1
Block 2: order = 9, start row = 1
Block 3: order = 6, start row = 4

Factorization
nnzc = 73

Elements of factorization

I   J        C(I,J)     Index
C_1   --------------------------------
1   1    1.562500e-02      34
1   2   -3.125000e-01      35
1   4   -3.125000e-01      36
2   1   -1.875000e-01      37
2   2    1.659751e-02      38
2   3   -3.319502e-01      39
2   5   -3.319502e-01      40
3   2   -1.991701e-01      41
3   3    1.666206e-02      42
3   6   -3.332412e-01      43
4   1   -1.875000e-01      44
4   4    1.659751e-02      45
4   5   -3.319502e-01      46
5   2   -1.991701e-01      47
5   4   -1.991701e-01      48
5   5    1.784656e-02      49
5   6   -3.569313e-01      50
6   3   -1.999447e-01      51
6   5   -2.141588e-01      52
6   6    1.794754e-02      53
C_2   --------------------------------
1   1    1.562500e-02      54
1   2   -3.125000e-01      55
1   4   -1.875000e-01      56
1   5   -3.125000e-01      57
2   1   -1.875000e-01      58
2   2    1.659751e-02      59
2   3   -3.319502e-01      60
2   6   -1.991701e-01      61
2   7   -3.319502e-01      62
3   2   -1.991701e-01      63
3   3    1.666206e-02      64
3   8   -1.999447e-01      65
3   9   -3.332412e-01      66
4   1   -3.125000e-01      67
4   4    1.659751e-02      68
4   6   -3.319502e-01      69
5   1   -1.875000e-01      70
5   5    1.659751e-02      71
5   7   -3.319502e-01      72
6   2   -3.319502e-01      73
6   4   -1.991701e-01      74
6   6    1.784656e-02      75
6   8   -3.569313e-01      76
7   2   -1.991701e-01      77
7   5   -1.991701e-01      78
7   7    1.784656e-02      79
7   9   -3.569313e-01      80
8   3   -3.332412e-01      81
8   6   -2.141588e-01      82
8   8    1.794754e-02      83
9   3   -1.999447e-01      84
9   7   -2.141588e-01      85
9   9    1.794754e-02      86
C_3   --------------------------------
1   1    1.562500e-02      87
1   2   -3.125000e-01      88
1   4   -1.875000e-01      89
2   1   -1.875000e-01      90
2   2    1.659751e-02      91
2   3   -3.319502e-01      92
2   5   -1.991701e-01      93
3   2   -1.991701e-01      94
3   3    1.666206e-02      95
3   6   -1.999447e-01      96
4   1   -3.125000e-01      97
4   4    1.659751e-02      98
4   5   -3.319502e-01      99
5   2   -3.319502e-01     100
5   4   -1.991701e-01     101
5   5    1.784656e-02     102
5   6   -3.569313e-01     103
6   3   -3.332412e-01     104
6   5   -2.141588e-01     105
6   6    1.794754e-02     106

Details of factorized blocks

K   I      ISTR(I)  IDIAG(I)   INDB(I)
1   1        34        34         1
2        37        38         2
3        41        42         3
4        44        45         4
5        47        49         5
6        51        53         6
------------------------------------
2   7        54        54         4
8        58        59         5
9        63        64         6
10        67        68         1
11        70        71         7
12        73        75         2
13        77        79         8
14        81        83         3
15        84        86         9
------------------------------------
3  16        87        87         7
17        90        91         8
18        94        95         9
19        97        98         4
20       100       102         5
21       104       106         6
------------------------------------

function f11df_example
n  = int64(9);
nz = int64(33);
a    = zeros(20*nz, 1);
irow = zeros(20*nz, 1, 'int64');
icol = zeros(20*nz, 1, 'int64');
a(1:nz) = [64; -20; -20; -12; 64; -20; -20; -12; 64; -20; -12; 64; -20; ...
           -20; -12; -12; 64; -20; -20; -12; -12; 64; -20; -12; 64; -20; ...
           -12; -12; 64; -20; -12; -12; 64];
irow(1:nz) = [int64(1); 1; 1; 2; 2; 2; 2; 3; 3; 3; 4; 4; 4; 4; 5; 5; 5; ...
              5; 5; 6; 6; 6; 6; 7; 7; 7; 8; 8; 8; 8; 9; 9; 9];
icol(1:nz) = [int64(1); 2; 4; 1; 2; 3; 5; 2; 3; 6; 1; 4; 5; 7; 2; 4; 5; ...
              6; 8; 3; 5; 6; 9; 4; 7; 8; 5; 7; 8; 9; 6; 8; 9];
nb = 3;
nover = 1;
lfill = [int64(0); 0; 0];
dtol = [0; 0; 0];
pstrat = {'n'; 'n'; 'n'};
milu   = {'n'; 'n'; 'n'};

% Define diagonal block indices.
% In this example use blocks of MB consecutive rows and initialise
% assuming no overlap.
mb = idivide(n+nb-1, int64(nb));
istb = zeros(nb+1, 1, 'int64');
indb = zeros(3*n, 1, 'int64');
ipivp = zeros(3*n, 1, 'int64');
ipivq = zeros(3*n, 1, 'int64');
for k=1:nb
  istb(k) = (k-1)*mb+1;
end
istb(nb+1) = n+1;
for i=1:n
  indb(i) = i;
end

% Modify indb and istb to account for overlap.
[istb, indb, ifail] = f11df_overlap(n, nz, irow, icol, nb, ...
                                    istb, indb, 3*n, nover);
if (ifail == -999)
  error('indb is too small, size of indb = %d', numel(indb));
end

% Output matrix and blocking details
fprintf('\nOriginal matrix\n');
fprintf(' n  = %d\n', n);
fprintf(' nz = %d\n', nz);
fprintf(' nb = %d\n', nb);
for k=1:nb
  fprintf(' Block %d: order = %d, start row = %d\n', k, istb(k+1)-istb(k), ...
          min(indb(istb(k):istb(k+1)-1)));
end

% Calculate Factorisation
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = ...
  f11df(n, nz, a, irow, icol, istb, indb, ...
        lfill, dtol, milu, ipivp, ipivq, 'pstrat', pstrat);

% Output details of the factorization
fprintf('\nFactorization\n');
fprintf(' nnzc = %d\n\n', nnzc);
fprintf(' Elements of factorization\n\n');
fprintf('        I   J        C(I,J)     Index\n');
for k=1:nb
  fprintf(' C_%d   --------------------------------\n', k);
  % Elements of the k-th block
  for i = istr(istb(k)):istr(istb(k+1))-1
    fprintf('     %4d%4d%16e%8d\n', irow(i), icol(i), a(i), i);
  end
end

fprintf('\n Details of factorized blocks\n\n');
if max(npivm) > 0
  % Including pivoting details.
  fprintf('  K   I      ISTR(I)  IDIAG(I)   INDB(I)  IPIVP(I)  IPIVQ(I)\n');
  for k=1:nb
    i = istb(k);
    fprintf(' %4d%4d%10d%10d%10d%10d%10d\n', k, i, istr(i), idiag(i), ...
            indb(i), ipivp(i), ipivq(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf(' %7d%10d%10d%10d%10d%10d\n', i, istr(i), idiag(i), ...
              indb(i), ipivp(i), ipivq(i));
    end
    fprintf(' ------------------------------------\n');
  end
else
  % No pivoting on any block.
  fprintf('  K   I      ISTR(I)  IDIAG(I)   INDB(I)\n');
  for k=1:nb
    i = istb(k);
    fprintf('%3d%4d%10d%10d%10d\n', k, i, istr(i), idiag(i), indb(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf('%7d%10d%10d%10d\n', i, istr(i), idiag(i), indb(i));
    end
    fprintf(' ------------------------------------\n');
  end
end

function [istb, indb, ifail] = f11df_overlap(n, nz, irow, icol, nb, ...
                                             istb, indb, lindb, nover)

ifail = 0;

% This function takes a set of row indices indb defining the diagonal
% blocks to be used in f11df to define a block Jacobi or additive Schwarz
% preconditioner, and expands them to allow for nover levels of overlap.
% The pointer array istb is also updated accordingly, so that the returned
% values of istb and indb can be passed to f11df to define overlapping
% diagonal blocks.
iwork = zeros(3*n+1, 1, 'int64');

% Find the number of non-zero elements in each row of the matrix A, and
% the start address of each row. Store the start addresses in
% iwork(n+1,...,2*n+1).
for k=1:nz
  iwork(irow(k)) = iwork(irow(k)) + 1;
end
iwork(n+1) = 1;
for i = 1:n
  iwork(n+i+1) = iwork(n+i) + iwork(i);
end

% Loop over blocks
for k=1:nb
  % Initialize marker array.
  iwork(1:n) = 0;

  % Mark the rows already in block k in the workspace array.
  for l = istb(k):istb(k+1)-1
    iwork(indb(l)) = 1;
  end

  % Loop over levels of overlap.
  for iover=1:nover
    % Initialize counter of new row indices to be added.
    ind = 0;

    % Loop over the rows currently in the diagonal block.
    for l = istb(k):istb(k+1)-1
      row = indb(l);

      % Loop over non-zero elements in row
      for i = iwork(n+row):iwork(n+row+1)-1

        % If the column index of the non-zero element is not in the
        % existing set for this block, store it to be added later, and
        % mark it in the marker array.
        if (iwork(icol(i))==0)
          iwork(icol(i)) = 1;
          ind = ind + 1;
          iwork(2*n+1+ind) = icol(i);
        end
      end
    end

    % Shift the indices in indb and add the new entries for block k.
    % Change istb accordingly.
    if (istb(nb+1) + ind - 1 > lindb)
      % Not enough room in indb for the ind new row indices.
      ifail = -999;
      return;
    end

    for i = istb(nb+1) - 1:-1:istb(k+1)
      indb(i+ind) = indb(i);
    end
    n21 = 2*n + 1;
    ik = istb(k+1) - 1;
    indb(ik+1:ik+ind) = iwork(n21+1:n21+ind);
    istb(k+1:nb+1) = istb(k+1:nb+1) + ind;
  end
end

Original matrix
n  = 9
nz = 33
nb = 3
Block 1: order = 6, start row = 1
Block 2: order = 9, start row = 1
Block 3: order = 6, start row = 4

Factorization
nnzc = 73

Elements of factorization

I   J        C(I,J)     Index
C_1   --------------------------------
1   1    1.562500e-02      34
1   2   -3.125000e-01      35
1   4   -3.125000e-01      36
2   1   -1.875000e-01      37
2   2    1.659751e-02      38
2   3   -3.319502e-01      39
2   5   -3.319502e-01      40
3   2   -1.991701e-01      41
3   3    1.666206e-02      42
3   6   -3.332412e-01      43
4   1   -1.875000e-01      44
4   4    1.659751e-02      45
4   5   -3.319502e-01      46
5   2   -1.991701e-01      47
5   4   -1.991701e-01      48
5   5    1.784656e-02      49
5   6   -3.569313e-01      50
6   3   -1.999447e-01      51
6   5   -2.141588e-01      52
6   6    1.794754e-02      53
C_2   --------------------------------
1   1    1.562500e-02      54
1   2   -3.125000e-01      55
1   4   -1.875000e-01      56
1   5   -3.125000e-01      57
2   1   -1.875000e-01      58
2   2    1.659751e-02      59
2   3   -3.319502e-01      60
2   6   -1.991701e-01      61
2   7   -3.319502e-01      62
3   2   -1.991701e-01      63
3   3    1.666206e-02      64
3   8   -1.999447e-01      65
3   9   -3.332412e-01      66
4   1   -3.125000e-01      67
4   4    1.659751e-02      68
4   6   -3.319502e-01      69
5   1   -1.875000e-01      70
5   5    1.659751e-02      71
5   7   -3.319502e-01      72
6   2   -3.319502e-01      73
6   4   -1.991701e-01      74
6   6    1.784656e-02      75
6   8   -3.569313e-01      76
7   2   -1.991701e-01      77
7   5   -1.991701e-01      78
7   7    1.784656e-02      79
7   9   -3.569313e-01      80
8   3   -3.332412e-01      81
8   6   -2.141588e-01      82
8   8    1.794754e-02      83
9   3   -1.999447e-01      84
9   7   -2.141588e-01      85
9   9    1.794754e-02      86
C_3   --------------------------------
1   1    1.562500e-02      87
1   2   -3.125000e-01      88
1   4   -1.875000e-01      89
2   1   -1.875000e-01      90
2   2    1.659751e-02      91
2   3   -3.319502e-01      92
2   5   -1.991701e-01      93
3   2   -1.991701e-01      94
3   3    1.666206e-02      95
3   6   -1.999447e-01      96
4   1   -3.125000e-01      97
4   4    1.659751e-02      98
4   5   -3.319502e-01      99
5   2   -3.319502e-01     100
5   4   -1.991701e-01     101
5   5    1.784656e-02     102
5   6   -3.569313e-01     103
6   3   -3.332412e-01     104
6   5   -2.141588e-01     105
6   6    1.794754e-02     106

Details of factorized blocks

K   I      ISTR(I)  IDIAG(I)   INDB(I)
1   1        34        34         1
2        37        38         2
3        41        42         3
4        44        45         4
5        47        49         5
6        51        53         6
------------------------------------
2   7        54        54         4
8        58        59         5
9        63        64         6
10        67        68         1
11        70        71         7
12        73        75         2
13        77        79         8
14        81        83         3
15        84        86         9
------------------------------------
3  16        87        87         7
17        90        91         8
18        94        95         9
19        97        98         4
20       100       102         5
21       104       106         6
------------------------------------