hide long namesshow long names
hide short namesshow short names
Integer type:  int32  int64  nag_int  show int32  show int32  show int64  show int64  show nag_int  show nag_int

PDF version (NAG web site, 64-bit version, 64-bit version)
Chapter Contents
Chapter Introduction
NAG Toolbox

NAG Toolbox: nag_sparse_complex_gen_precon_bdilu (f11dt)

Purpose

nag_sparse_complex_gen_precon_bdilu (f11dt) computes a block diagonal incomplete LULU factorization of a complex sparse non-Hermitian matrix, represented in coordinate storage format. The diagonal blocks may be composed of arbitrary rows and the corresponding columns, and may overlap. This factorization can be used to provide a block Jacobi or additive Schwarz preconditioner, for use in combination with nag_sparse_complex_gen_basic_solver (f11bs) or nag_sparse_complex_gen_solve_bdilu (f11du).

Syntax

[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = f11dt(n, nnz, a, irow, icol, istb, indb, lfill, dtol, milu, ipivp, ipivq, 'la', la, 'nb', nb, 'lindb', lindb, 'pstrat', pstrat)
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = nag_sparse_complex_gen_precon_bdilu(n, nnz, a, irow, icol, istb, indb, lfill, dtol, milu, ipivp, ipivq, 'la', la, 'nb', nb, 'lindb', lindb, 'pstrat', pstrat)

Description

nag_sparse_complex_gen_precon_bdilu (f11dt) computes an incomplete LULU factorization (see Meijerink and Van der Vorst (1977) and Meijerink and Van der Vorst (1981)) of the (possibly overlapping) diagonal blocks AbAb, b = 1,2,,nbb=1,2,,nb, of a complex sparse non-Hermitian nn by nn matrix AA. The factorization is intended primarily for use as a block Jacobi or additive Schwarz preconditioner (see Saad (1996)), with one of the iterative solvers nag_sparse_complex_gen_basic_solver (f11bs) and nag_sparse_complex_gen_solve_bdilu (f11du).
The nb diagonal blocks need not consist of consecutive rows and columns of AA, but may be composed of arbitrarily indexed rows, and the corresponding columns, as defined in the arguments indb and istb. Any given row or column index may appear in more than one diagonal block, resulting in overlap. Each diagonal block AbAb, b = 1,2,,nbb=1,2,,nb, is factorized as:
Ab = Mb + Rb
Ab = Mb+Rb
where
Mb = Pb Lb Db Ub Qb
Mb = Pb Lb Db Ub Qb
and LbLb is lower triangular with unit diagonal elements, DbDb is diagonal, UbUb is upper triangular with unit diagonals, PbPb and QbQb are permutation matrices, and RbRb is a remainder matrix.
The amount of fill-in occurring in the factorization of block bb can vary from zero to complete fill, and can be controlled by specifying either the maximum level of fill lfill(b)lfillb, or the drop tolerance dtol(b)dtolb.
The parameter pstrat(b)pstratb defines the pivoting strategy to be used in block bb. The options currently available are no pivoting, user-defined pivoting, partial pivoting by columns for stability, and complete pivoting by rows for sparsity and by columns for stability. The factorization may optionally be modified to preserve the row-sums of the original block matrix.
The sparse matrix AA is represented in coordinate storage (CS) format (see Section [Coordinate storage (CS) format] in the F11 Chapter Introduction). The array a stores all the nonzero elements of the matrix AA, while arrays irow and icol store the corresponding row and column indices respectively. Multiple nonzero elements may not be specified for the same row and column index.
The preconditioning matrices MbMb, b = 1,2,,nbb=1,2,,nb, are returned in terms of the CS representations of the matrices
Cb = Lb + D1b + Ub 2I .
Cb = Lb + D-1b + Ub -2I .

References

Meijerink J and Van der Vorst H (1977) An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix Math. Comput. 31 148–162
Meijerink J and Van der Vorst H (1981) Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems J. Comput. Phys. 44 134–155
Saad Y (1996) Iterative Methods for Sparse Linear Systems PWS Publishing Company, Boston, MA

Parameters

Compulsory Input Parameters

1:     n – int64int32nag_int scalar
nn, the order of the matrix AA.
Constraint: n1n1.
2:     nnz – int64int32nag_int scalar
The number of nonzero elements in the matrix AA.
Constraint: 1nnzn21nnzn2.
3:     a(la) – complex array
la, the dimension of the array, must satisfy the constraint la2 × nnzla2×nnz.
The nonzero elements in the matrix AA, ordered by increasing row index, and by increasing column index within each row. Multiple entries for the same row and column indices are not permitted. The function nag_sparse_complex_gen_sort (f11zn) may be used to order the elements in this way.
4:     irow(la) – int64int32nag_int array
5:     icol(la) – int64int32nag_int array
la, the dimension of the array, must satisfy the constraint la2 × nnzla2×nnz.
The row and column indices of the nonzero elements supplied in a.
Constraints:
irow and icol must satisfy these constraints (which may be imposed by a call to nag_sparse_real_gen_sort (f11za)):
  • 1irow(i)n1irowin and 1icol(i)n1icolin, for i = 1,2,,nnzi=1,2,,nnz;
  • either irow(i1) < irow(i)irowi-1<irowi or both irow(i1) = irow(i)irowi-1=irowi and icol(i1) < icol(i)icoli-1<icoli, for i = 2,3,,nnzi=2,3,,nnz.
6:     istb(nb + 1nb+1) – int64int32nag_int array
istb(b)istbb, for b = 1,2,,nbb=1,2,,nb, holds the index in arrays indb, ipivp, ipivq and idiag defining block bb. istb(nb + 1)istbnb+1 holds the sum of the number of rows in all blocks plus istb(1)istb1.
Constraint: istb(1)1, istb(b) < istb(b + 1) istb11, istbb< istbb+1 , for b = 1,2,,nbb=1,2,,nb.
7:     indb(lindb) – int64int32nag_int array
lindb, the dimension of the array, must satisfy the constraint lindbistb(nb + 1)1lindbistbnb+1-1.
indb must hold the row indices appearing in each diagonal block, stored consecutively. Thus the elements indb(istb(b))indbistbb to indb(istb(b + 1)1)indbistbb+1-1 are the row indices in the bbth block.
Constraint: 1indb(m)n1indbmn, for m = 1,2,,istb(nb + 1)1m=1,2,,istbnb+1-1.
8:     lfill(nb) – int64int32nag_int array
nb, the dimension of the array, must satisfy the constraint 1nbn1nbn.
If lfill(b)0lfillb0 its value is the maximum level of fill allowed in the decomposition of the block bb (see Section [Control of Fill-in] in (f11dn)). A negative value of lfill(b)lfillb indicates that dtol(b)dtolb will be used to control the fill in block bb instead.
9:     dtol(nb) – double array
nb, the dimension of the array, must satisfy the constraint 1nbn1nbn.
If lfill(b) < 0lfillb<0 then dtol(b)dtolb is used as a drop tolerance in block bb to control the fill-in (see Section [Control of Fill-in] in (f11dn)); otherwise dtol(b)dtolb is not referenced.
Constraint: if lfill(b) < 0lfillb<0, dtol(b)0.0dtolb0.0, for b = 1,2,,nbb=1,2,,nb.
10:   milu(nb) – cell array of strings
nb, the dimension of the array, must satisfy the constraint 1nbn1nbn.
milu(b)milub, for b = 1,2,,nbb=1,2,,nb, indicates whether or not the factorization in block bb should be modified to preserve row-sums (see Section [Choice of s] in (f11dn)).
milu(b) = 'M'milub='M'
The factorization is modified.
milu(b) = 'N'milub='N'
The factorization is not modified.
Constraint: milu(b) = 'M'milub='M' or 'N''N', for b = 1,2,,nbb=1,2,,nb.
11:   ipivp(lindb) – int64int32nag_int array
12:   ipivq(lindb) – int64int32nag_int array
lindb, the dimension of the array, must satisfy the constraint lindbistb(nb + 1)1lindbistbnb+1-1.
If pstrat(b) = 'U'pstratb='U', then ipivp(istb(b) + k1)ipivpistbb+k-1 and ipivq(istb(b) + k1)ipivqistbb+k-1 must specify the row and column indices of the element used as a pivot at elimination stage kk of the factorization of block bb. Otherwise ipivp and ipivq need not be initialized.
Constraint: if pstrat(b) = 'U'pstratb='U', the elements istb(b)istbb to istb(b + 1)1istbb+1-1 of ipivp and ipivq must both hold valid permutations of the integers on [1,istb(b + 1)istb(b)][1,istbb+1-istbb].

Optional Input Parameters

1:     la – int64int32nag_int scalar
Default: The dimension of the arrays a, irow, icol. (An error is raised if these dimensions are not equal.)
The dimension of the arrays a, irow and icol as declared in the (sub)program from which nag_sparse_complex_gen_precon_bdilu (f11dt) is called. These arrays must be of sufficient size to store both AA (nnz elements) and CC (nnzc elements).
Constraint: la2 × nnzla2×nnz.
2:     nb – int64int32nag_int scalar
Default: The dimension of the arrays lfill, dtol, pstrat, milu. (An error is raised if these dimensions are not equal.)
The number of diagonal blocks to factorize.
Constraint: 1nbn1nbn.
3:     lindb – int64int32nag_int scalar
Default: The dimension of the arrays indb, ipivp, ipivq. (An error is raised if these dimensions are not equal.)
The dimension of the arrays indb, ipivp, ipivq and idiag as declared in the (sub)program from which nag_sparse_complex_gen_precon_bdilu (f11dt) is called.
Constraint: lindbistb(nb + 1)1lindbistbnb+1-1.
4:     pstrat(nb) – cell array of strings
pstrat(b)pstratb, for b = 1,2,,nbb=1,2,,nb, specifies the pivoting strategy to be adopted in block bb as follows:
pstrat(b) = 'N'pstratb='N'
No pivoting is carried out.
pstrat(b) = 'U'pstratb='U'
Pivoting is carried out according to the user-defined input values of ipivp and ipivq.
pstrat(b) = 'P'pstratb='P'
Partial pivoting by columns for stability is carried out.
pstrat(b) = 'C'pstratb='C'
Complete pivoting by rows for sparsity, and by columns for stability, is carried out.
Default: 'C''C'
Constraint: pstrat(b) = 'N'pstratb='N', 'U''U', 'P''P' or 'C''C', for b = 1,2,,nbb=1,2,,nb.

Input Parameters Omitted from the MATLAB Interface

iwork liwork

Output Parameters

1:     a(la) – complex array
The first nnz entries of a contain the nonzero elements of AA and the next nnzc entries contain the elements of the matrices CbCb, for b = 1,2,,nbb=1,2,,nb stored consecutively. Within each block the matrix elements are ordered by increasing row index, and by increasing column index within each row.
2:     irow(la) – int64int32nag_int array
3:     icol(la) – int64int32nag_int array
The row and column indices of the nonzero elements returned in a.
4:     ipivp(lindb) – int64int32nag_int array
5:     ipivq(lindb) – int64int32nag_int array
The row and column indices of the pivot elements, arranged consecutively for each block, as for indb. If ipivp(istb(b) + k1) = iipivpistbb+k-1=i and ipivq(istb(b) + k1) = jipivqistbb+k-1=j, then the element in row ii and column jj of AbAb was used as the pivot at elimination stage kk.
6:     istr(lindb + 1lindb+1) – int64int32nag_int array
istr(istb(b) + k1)istristbb+k-1, gives the starting address in the arrays a, irow and icol of row kk of the matrix CbCb, for b = 1,2,,nbb=1,2,,nb and k = 1,2,,istb(b + 1)istb(b)k=1,2,,istbb+1-istbb.
istr(istb(nb + 1))istristbnb+1 contains nnz + nnzc + 1nnz+nnzc+1.
7:     idiag(lindb) – int64int32nag_int array
idiag(istb(b) + k1)idiagistbb+k-1, gives the address in the arrays a, irow and icol of the diagonal element in row kk of the matrix CbCb, for b = 1,2,,nbb=1,2,,nb and k = 1,2,,istb(b + 1)istb(b)k=1,2,,istbb+1-istbb.
8:     nnzc – int64int32nag_int scalar
The sum total number of nonzero elements in the matrices CbCb, for b = 1,2,,nbb=1,2,,nb.
9:     npivm(nb) – int64int32nag_int array
If npivm(b) > 0npivmb>0 it gives the number of pivots which were modified during the factorization to ensure that MbMb exists.
If npivm(b) = 1npivmb=-1 no pivot modifications were required, but a local restart occurred (see Section [Algorithmic Details] in (f11dn)). The quality of the preconditioner will generally depend on the returned values of npivm(b)npivmb, for b = 1,2,,nbb=1,2,,nb.
If npivm(b)npivmb is large, for some bb, the preconditioner may not be satisfactory. In this case it may be advantageous to call nag_sparse_complex_gen_precon_bdilu (f11dt) again with an increased value of lfill(b)lfillb, a reduced value of dtol(b)dtolb, or pstrat(b) = 'C'pstratb='C'.
10:   ifail – int64int32nag_int scalar
ifail = 0ifail=0 unless the function detects an error (see [Error Indicators and Warnings]).

Error Indicators and Warnings

Errors or warnings detected by the function:
  ifail = 1ifail=1
Constraint: 1indb(b)n1indbbn for all bb.
Constraint: 1nbn1nbn.
Constraint: dtol(b)0.0dtolb0.0 for all bb.
Constraint: istb(1)1istb11.
Constraint: istb(b + 1) > istb(b)istbb+1>istbb for all bb.
Constraint: la2 × nnzla2×nnz.
Constraint: lindbistb(nb + 1)1lindbistbnb+1-1.
Constraint: milu(b) = 'M'milub='M' or 'N''N' for all bb.
Constraint: nnzn2nnzn2.
Constraint: nnz1nnz1.
Constraint: n1n1.
Constraint: pstrat(b) = 'N'pstratb='N', 'U''U', 'P''P' or 'C''C' for all bb.
liwork is too small.
  ifail = 2ifail=2
Constraint: 1icol(j)n1icoljn for all jj.
Constraint: 1irow(i)n1irowin for all ii.
On entry, element __ of a was out of order.
On entry, location __ of (irow,icol)(irow,icol) was a duplicate.
  ifail = 3ifail=3
On entry, the user-supplied value of ipivp for block __ lies outside the range [1,n][1,n].
On entry, the user-supplied value of ipivp for block __ was repeated.
On entry, the user-supplied value of ipivq for block __ lies outside the range [1,n][1,n].
On entry, the user-supplied value of ipivq for block __ was repeated.
  ifail = 4ifail=4
The number of nonzero entries in the decomposition is too large.
The decomposition has been terminated before completion.
Either increase la, or reduce the fill by reducing lfill, or increasing dtol.

Accuracy

The accuracy of the factorization of each block AbAb will be determined by the size of the elements that are dropped and the size of any modifications made to the pivot elements. If these sizes are small then the computed factors will correspond to a matrix close to AbAb. The factorization can generally be made more accurate by increasing the level of fill lfill(b)lfillb, or by reducing the drop tolerance dtol(b)dtolb with lfill(b) < 0lfillb<0.
If nag_sparse_complex_gen_precon_bdilu (f11dt) is used in combination with nag_sparse_complex_gen_basic_solver (f11bs) or nag_sparse_complex_gen_solve_bdilu (f11du), the more accurate the factorization the fewer iterations will be required. However, the cost of the decomposition will also generally increase.

Further Comments

nag_sparse_complex_gen_precon_bdilu (f11dt) calls nag_sparse_complex_gen_precon_ilu (f11dn) internally for each block AbAb. The comments and advice provided in Section [Further Comments] in (f11dn) on timing, control of fill, algorithmic details, and choice of parameters, are all therefore relevant to nag_sparse_complex_gen_precon_bdilu (f11dt), if interpreted blockwise.

Example

function nag_sparse_complex_gen_precon_bdilu_example
n   = int64(9);
nz  = int64(33);
a    = zeros(20*nz, 1);
irow = zeros(20*nz, 1, 'int64');
icol = zeros(20*nz, 1, 'int64');
a(1:nz) = [ 96.0 + -64.0i; -20.0 +  22.0i; -36.0 +  14.0i; ...
           -12.0 +  10.0i;  96.0 + -64.0i; -20.0 +  22.0i; ...
           -36.0 +  14.0i; -12.0 +  10.0i;  96.0 + -64.0i; ...
           -36.0 +  14.0i; -28.0 +  18.0i;  96.0 + -64.0i; ...
           -20.0 +  22.0i; -36.0 +  14.0i; -28.0 +  18.0i; ...
           -12.0 +  10.0i;  96.0 + -64.0i; -20.0 +  22.0i; ...
           -36.0 +  14.0i; -28.0 +  18.0i; -12.0 +  10.0i; ...
            96.0 + -64.0i; -36.0 +  14.0i; -28.0 +  18.0i; ...
            96.0 + -64.0i; -20.0 +  22.0i; -28.0 +  18.0i; ...
           -12.0 +  10.0i;  96.0 + -64.0i; -20.0 +  22.0i; ...
           -28.0 +  18.0i; -12.0 +  10.0i;  96.0 + -64.0i];
irow(1:nz) = [int64(1); 1; 1; 2; 2; 2; 2; 3; 3; 3; 4; 4; 4; 4; 5; 5; 5; ...
              5; 5; 6; 6; 6; 6; 7; 7; 7; 8; 8; 8; 8; 9; 9; 9];
icol(1:nz) = [int64(1); 2; 4; 1; 2; 3; 5; 2; 3; 6; 1; 4; 5; 7; 2; 4; 5; ...
              6; 8; 3; 5; 6; 9; 4; 7; 8; 5; 7; 8; 9; 6; 8; 9];
nb = 3;
nover = 1;
lfill = [int64(0); 0; 0];
dtol = [0; 0; 0];
pstrat = {'n'; 'n'; 'n'};
milu   = {'n'; 'n'; 'n'};

% Define diagonal block indices.
% In this example use blocks of MB consecutive rows and initialise
% assuming no overlap.
mb = idivide(n+nb-1, int64(nb));
istb = zeros(nb+1, 1, 'int64');
indb = zeros(3*n, 1, 'int64');
ipivp = zeros(3*n, 1, 'int64');
ipivq = zeros(3*n, 1, 'int64');
for k=1:nb
  istb(k) = (k-1)*mb+1;
end
istb(nb+1) = n+1;
for i=1:n
  indb(i) = i;
end

% Modify indb and istb to account for overlap.
[istb, indb, ifail] = nag_sparse_complex_gen_precon_bdilu_overlap(n, nz, ...
                                 irow, icol, nb, istb, indb, 3*n, nover);
if (ifail == -999)
  error('indb is too small, size of indb = %d', numel(indb));
end

% Output matrix and blocking details
fprintf('\nOriginal matrix\n');
fprintf(' n   = %d\n', n);
fprintf(' nz  = %d\n', nz);
fprintf(' nb  = %d\n', nb);
for k=1:nb
  fprintf(' Block %d: order = %d, start row = %d\n', k, istb(k+1)-istb(k), ...
              min(indb(istb(k):istb(k+1)-1)));
end

% Calculate Factorisation
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = ...
     nag_sparse_complex_gen_precon_bdilu(n, nz, a, irow, icol, istb, indb, ...
           lfill, dtol, milu, ipivp, ipivq, 'pstrat', pstrat);
% Output details of the factorization
fprintf('\nFactorization\n');
fprintf(' nnzc = %d\n\n', nnzc);
fprintf(' Elements of factorization\n\n');
fprintf('        I   J               C(I,J)              Index\n');
for k=1:nb
  fprintf(' C_%d   ----------------------------------------------\n', k);
  % Elements of the k-th block
  for i = istr(istb(k)):istr(istb(k+1))-1
    fprintf('     %4d%4d   (%13.5e,%13.5e)%8d\n', irow(i), icol(i), ...
                                               real(a(i)), imag(a(i)), i);
  end
end

fprintf('\n Details of factorized blocks\n\n');
if max(npivm) > 0
  % Including pivoting details.
  fprintf('  K   I      ISTR(I)  IDIAG(I)   INDB(I)  IPIVP(I)  IPIVQ(I)\n');
  for k=1:nb
    i = istb(k);
    fprintf(' %4d%4d%10d%10d%10d%10d%10d\n', k, i, istr(i), idiag(i), ...
            indb(i), ipivp(i), ipivq(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf(' %7d%10d%10d%10d%10d%10d\n', i, istr(i), idiag(i), ...
              indb(i), ipivp(i), ipivq(i));
    end
    fprintf(' ------------------------------------\n');
  end
else
  % No pivoting on any block.
  fprintf('  K   I      ISTR(I)  IDIAG(I)   INDB(I)\n');
  for k=1:nb
    i = istb(k);
    fprintf('%3d%4d%10d%10d%10d\n', k, i, istr(i), idiag(i), indb(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf('%7d%10d%10d%10d\n', i, istr(i), idiag(i), indb(i));
    end
    fprintf(' ------------------------------------\n');
  end
end



function [istb, indb, ifail] = nag_sparse_complex_gen_precon_bdilu_overlap(n, ...
                                  nz, irow, icol, nb, istb, indb, lindb, nover)

  ifail = 0;

  % This function takes a set of row indices indb defining the diagonal blocks
  % to be used in nag_sparse_complex_gen_precon_bdilu to define a block Jacobi
  % or additive Schwarz preconditioner, and expands them to allow for nover
  % levels of overlap.  The pointer array istb is also updated accordingly, so
  % that the returned values of istb and indb can be passed to
  % nag_sparse_complex_gen_precon_bdilu to define overlapping diagonal blocks.
  iwork = zeros(3*n+1, 1, 'int64');

  % Find the number of non-zero elements in each row of the matrix A, and
  % the start address of each row. Store the start addresses in
  % iwork(n+1,...,2*n+1).
  for k=1:nz
    iwork(irow(k)) = iwork(irow(k)) + 1;
  end
  iwork(n+1) = 1;
  for i = 1:n
    iwork(n+i+1) = iwork(n+i) + iwork(i);
  end

  % Loop over blocks
  for k=1:nb
    % Initialize marker array.
    iwork(1:n) = 0;

    % Mark the rows already in block k in the workspace array.
    for l = istb(k):istb(k+1)-1
      iwork(indb(l)) = 1;
    end

    % Loop over levels of overlap.
    for iover=1:nover
      % Initialize counter of new row indices to be added.
      ind = 0;

      % Loop over the rows currently in the diagonal block.
      for l = istb(k):istb(k+1)-1
        row = indb(l);

        % Loop over non-zero elements in row
        for i = iwork(n+row):iwork(n+row+1)-1

          % If the column index of the non-zero element is not in the
          % existing set for this block, store it to be added later, and
          % mark it in the marker array.
          if (iwork(icol(i))==0)
            iwork(icol(i)) = 1;
            ind = ind + 1;
            iwork(2*n+1+ind) = icol(i);
          end
        end
      end

      % Shift the indices in indb and add the new entries for block k.
      % Change istb accordingly.
      nadd = ind;
      if (istb(nb+1)+nadd-1>lindb) Then
        ifail = -999;
        return;
      end

      for i = istb(nb+1) - 1:-1:istb(k+1)
        indb(i+nadd) = indb(i);
      end
      n21 = 2*n + 1;
      ik = istb(k+1) - 1;
      indb(ik+1:ik+nadd) = iwork(n21+1:n21+nadd);
      istb(k+1:nb+1) = istb(k+1:nb+1) + nadd;
    end
  end
 

Original matrix
 n   = 9
 nz  = 33
 nb  = 3
 Block 1: order = 6, start row = 1
 Block 2: order = 9, start row = 1
 Block 3: order = 6, start row = 4

Factorization
 nnzc = 73

 Elements of factorization

        I   J               C(I,J)              Index
 C_1   ----------------------------------------------
        1   1   (  7.21154e-03,  4.80769e-03)      34
        1   2   ( -2.50000e-01,  6.25000e-02)      35
        1   4   ( -3.26923e-01, -7.21154e-02)      36
        2   1   ( -1.34615e-01,  1.44231e-02)      37
        2   2   (  7.51634e-03,  4.87709e-03)      38
        2   3   ( -2.57623e-01,  6.78176e-02)      39
        2   5   ( -3.38867e-01, -7.03465e-02)      40
        3   2   ( -1.38967e-01,  1.66383e-02)      41
        3   3   (  7.52786e-03,  4.87530e-03)      42
        3   6   ( -3.39257e-01, -7.01208e-02)      43
        4   1   ( -2.88462e-01, -4.80769e-03)      44
        4   4   (  7.82358e-03,  5.49946e-03)      45
        4   5   ( -2.77460e-01,  6.21296e-02)      46
        5   2   ( -2.98245e-01, -1.26443e-03)      47
        5   4   ( -1.48878e-01,  1.22423e-02)      48
        5   5   (  8.26388e-03,  5.64193e-03)      49
        5   6   ( -2.89400e-01,  6.89668e-02)      50
        6   3   ( -2.98536e-01, -1.00694e-03)      51
        6   5   ( -1.55586e-01,  1.49357e-02)      52
        6   6   (  8.28693e-03,  5.64169e-03)      53
 C_2   ----------------------------------------------
        1   1   (  7.21154e-03,  4.80769e-03)      54
        1   2   ( -2.50000e-01,  6.25000e-02)      55
        1   4   ( -2.88462e-01, -4.80769e-03)      56
        1   5   ( -3.26923e-01, -7.21154e-02)      57
        2   1   ( -1.34615e-01,  1.44231e-02)      58
        2   2   (  7.51634e-03,  4.87709e-03)      59
        2   3   ( -2.57623e-01,  6.78176e-02)      60
        2   6   ( -2.98245e-01, -1.26443e-03)      61
        2   7   ( -3.38867e-01, -7.03465e-02)      62
        3   2   ( -1.38967e-01,  1.66383e-02)      63
        3   3   (  7.52786e-03,  4.87530e-03)      64
        3   8   ( -2.98536e-01, -1.00694e-03)      65
        3   9   ( -3.39257e-01, -7.01208e-02)      66
        4   1   ( -3.26923e-01, -7.21154e-02)      67
        4   4   (  7.82358e-03,  5.49946e-03)      68
        4   6   ( -2.77460e-01,  6.21296e-02)      69
        5   1   ( -2.88462e-01, -4.80769e-03)      70
        5   5   (  7.82358e-03,  5.49946e-03)      71
        5   7   ( -2.77460e-01,  6.21296e-02)      72
        6   2   ( -3.38867e-01, -7.03465e-02)      73
        6   4   ( -1.48878e-01,  1.22423e-02)      74
        6   6   (  8.26388e-03,  5.64193e-03)      75
        6   8   ( -2.89400e-01,  6.89668e-02)      76
        7   2   ( -2.98245e-01, -1.26443e-03)      77
        7   5   ( -1.48878e-01,  1.22423e-02)      78
        7   7   (  8.26388e-03,  5.64193e-03)      79
        7   9   ( -2.89400e-01,  6.89668e-02)      80
        8   3   ( -3.39257e-01, -7.01208e-02)      81
        8   6   ( -1.55586e-01,  1.49357e-02)      82
        8   8   (  8.28693e-03,  5.64169e-03)      83
        9   3   ( -2.98536e-01, -1.00694e-03)      84
        9   7   ( -1.55586e-01,  1.49357e-02)      85
        9   9   (  8.28693e-03,  5.64169e-03)      86
 C_3   ----------------------------------------------
        1   1   (  7.21154e-03,  4.80769e-03)      87
        1   2   ( -2.50000e-01,  6.25000e-02)      88
        1   4   ( -2.88462e-01, -4.80769e-03)      89
        2   1   ( -1.34615e-01,  1.44231e-02)      90
        2   2   (  7.51634e-03,  4.87709e-03)      91
        2   3   ( -2.57623e-01,  6.78176e-02)      92
        2   5   ( -2.98245e-01, -1.26443e-03)      93
        3   2   ( -1.38967e-01,  1.66383e-02)      94
        3   3   (  7.52786e-03,  4.87530e-03)      95
        3   6   ( -2.98536e-01, -1.00694e-03)      96
        4   1   ( -3.26923e-01, -7.21154e-02)      97
        4   4   (  7.82358e-03,  5.49946e-03)      98
        4   5   ( -2.77460e-01,  6.21296e-02)      99
        5   2   ( -3.38867e-01, -7.03465e-02)     100
        5   4   ( -1.48878e-01,  1.22423e-02)     101
        5   5   (  8.26388e-03,  5.64193e-03)     102
        5   6   ( -2.89400e-01,  6.89668e-02)     103
        6   3   ( -3.39257e-01, -7.01208e-02)     104
        6   5   ( -1.55586e-01,  1.49357e-02)     105
        6   6   (  8.28693e-03,  5.64169e-03)     106

 Details of factorized blocks

  K   I      ISTR(I)  IDIAG(I)   INDB(I)
  1   1        34        34         1
      2        37        38         2
      3        41        42         3
      4        44        45         4
      5        47        49         5
      6        51        53         6
 ------------------------------------
  2   7        54        54         4
      8        58        59         5
      9        63        64         6
     10        67        68         1
     11        70        71         7
     12        73        75         2
     13        77        79         8
     14        81        83         3
     15        84        86         9
 ------------------------------------
  3  16        87        87         7
     17        90        91         8
     18        94        95         9
     19        97        98         4
     20       100       102         5
     21       104       106         6
 ------------------------------------

function f11dt_example
n   = int64(9);
nz  = int64(33);
a    = zeros(20*nz, 1);
irow = zeros(20*nz, 1, 'int64');
icol = zeros(20*nz, 1, 'int64');
a(1:nz) = [ 96.0 + -64.0i; -20.0 +  22.0i; -36.0 +  14.0i; ...
           -12.0 +  10.0i;  96.0 + -64.0i; -20.0 +  22.0i; ...
           -36.0 +  14.0i; -12.0 +  10.0i;  96.0 + -64.0i; ...
           -36.0 +  14.0i; -28.0 +  18.0i;  96.0 + -64.0i; ...
           -20.0 +  22.0i; -36.0 +  14.0i; -28.0 +  18.0i; ...
           -12.0 +  10.0i;  96.0 + -64.0i; -20.0 +  22.0i; ...
           -36.0 +  14.0i; -28.0 +  18.0i; -12.0 +  10.0i; ...
            96.0 + -64.0i; -36.0 +  14.0i; -28.0 +  18.0i; ...
            96.0 + -64.0i; -20.0 +  22.0i; -28.0 +  18.0i; ...
           -12.0 +  10.0i;  96.0 + -64.0i; -20.0 +  22.0i; ...
           -28.0 +  18.0i; -12.0 +  10.0i;  96.0 + -64.0i];
irow(1:nz) = [int64(1); 1; 1; 2; 2; 2; 2; 3; 3; 3; 4; 4; 4; 4; 5; 5; 5; ...
               5; 5; 6; 6; 6; 6; 7; 7; 7; 8; 8; 8; 8; 9; 9; 9];
icol(1:nz) = [int64(1); 2; 4; 1; 2; 3; 5; 2; 3; 6; 1; 4; 5; 7; 2; 4; 5; ...
               6; 8; 3; 5; 6; 9; 4; 7; 8; 5; 7; 8; 9; 6; 8; 9];
nb = 3;
nover = 1;
lfill = [int64(0); 0; 0];
dtol = [0; 0; 0];
pstrat = {'n'; 'n'; 'n'};
milu   = {'n'; 'n'; 'n'};

% Define diagonal block indices.
% In this example use blocks of MB consecutive rows and initialise
% assuming no overlap.
mb = idivide(n+nb-1, int64(nb));
istb = zeros(nb+1, 1, 'int64');
indb = zeros(3*n, 1, 'int64');
ipivp = zeros(3*n, 1, 'int64');
ipivq = zeros(3*n, 1, 'int64');
for k=1:nb
  istb(k) = (k-1)*mb+1;
end
istb(nb+1) = n+1;
for i=1:n
  indb(i) = i;
end

% Modify indb and istb to account for overlap.
[istb, indb, ifail] = f11dt_overlap(n, nz, irow, icol, nb, ...
                                    istb, indb, 3*n, nover);
if (ifail == -999)
  error('indb is too small, size of indb = %d', numel(indb));
end

% Output matrix and blocking details
fprintf('\nOriginal matrix\n');
fprintf(' n   = %d\n', n);
fprintf(' nz  = %d\n', nz);
fprintf(' nb  = %d\n', nb);
for k=1:nb
  fprintf(' Block %d: order = %d, start row = %d\n', k, istb(k+1)-istb(k), ...
              min(indb(istb(k):istb(k+1)-1)));
end

% Calculate Factorisation
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = ...
     f11dt(n, nz, a, irow, icol, istb, indb, ...
           lfill, dtol, milu, ipivp, ipivq, 'pstrat', pstrat);
% Output details of the factorization
fprintf('\nFactorization\n');
fprintf(' nnzc = %d\n\n', nnzc);
fprintf(' Elements of factorization\n\n');
fprintf('        I   J               C(I,J)              Index\n');
for k=1:nb
  fprintf(' C_%d   ----------------------------------------------\n', k);
  % Elements of the k-th block
  for i = istr(istb(k)):istr(istb(k+1))-1
    fprintf('     %4d%4d   (%13.5e,%13.5e)%8d\n', irow(i), icol(i), ...
                                               real(a(i)), imag(a(i)), i);
  end
end

fprintf('\n Details of factorized blocks\n\n');
if max(npivm) > 0
  % Including pivoting details.
  fprintf('  K   I      ISTR(I)  IDIAG(I)   INDB(I)  IPIVP(I)  IPIVQ(I)\n');
  for k=1:nb
    i = istb(k);
    fprintf(' %4d%4d%10d%10d%10d%10d%10d\n', k, i, istr(i), idiag(i), ...
            indb(i), ipivp(i), ipivq(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf(' %7d%10d%10d%10d%10d%10d\n', i, istr(i), idiag(i), ...
              indb(i), ipivp(i), ipivq(i));
    end
    fprintf(' ------------------------------------\n');
  end
else
  % No pivoting on any block.
  fprintf('  K   I      ISTR(I)  IDIAG(I)   INDB(I)\n');
  for k=1:nb
    i = istb(k);
    fprintf('%3d%4d%10d%10d%10d\n', k, i, istr(i), idiag(i), indb(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf('%7d%10d%10d%10d\n', i, istr(i), idiag(i), indb(i));
    end
    fprintf(' ------------------------------------\n');
  end
end



function [istb, indb, ifail] = f11dt_overlap(n, nz, irow, icol, nb, ...
                                             istb, indb, lindb, nover)

  ifail = 0;

  % This function takes a set of row indices indb defining the diagonal
  % blocks to be used in f11dt to define a block Jacobi or additive Schwarz
  % preconditioner, and expands them to allow for nover levels of overlap.
  % The pointer array istb is also updated accordingly, so that the returned
  % values of istb and indb can be passed to f11dt to define overlapping
  % diagonal blocks.
  iwork = zeros(3*n+1, 1, 'int64');

  % Find the number of non-zero elements in each row of the matrix A, and
  % the start address of each row. Store the start addresses in
  % iwork(n+1,...,2*n+1).
  for k=1:nz
    iwork(irow(k)) = iwork(irow(k)) + 1;
  end
  iwork(n+1) = 1;
  for i = 1:n
    iwork(n+i+1) = iwork(n+i) + iwork(i);
  end

  % Loop over blocks
  for k=1:nb
    % Initialize marker array.
    iwork(1:n) = 0;

    % Mark the rows already in block k in the workspace array.
    for l = istb(k):istb(k+1)-1
      iwork(indb(l)) = 1;
    end

    % Loop over levels of overlap.
    for iover=1:nover
      % Initialize counter of new row indices to be added.
      ind = 0;

      % Loop over the rows currently in the diagonal block.
      for l = istb(k):istb(k+1)-1
        row = indb(l);

        % Loop over non-zero elements in row
        for i = iwork(n+row):iwork(n+row+1)-1

          % If the column index of the non-zero element is not in the
          % existing set for this block, store it to be added later, and
          % mark it in the marker array.
          if (iwork(icol(i))==0)
            iwork(icol(i)) = 1;
            ind = ind + 1;
            iwork(2*n+1+ind) = icol(i);
          end
        end
      end

      % Shift the indices in indb and add the new entries for block k.
      % Change istb accordingly.
      nadd = ind;
      if (istb(nb+1)+nadd-1>lindb) Then
        ifail = -999;
        return;
      end

      for i = istb(nb+1) - 1:-1:istb(k+1)
        indb(i+nadd) = indb(i);
      end
      n21 = 2*n + 1;
      ik = istb(k+1) - 1;
      indb(ik+1:ik+nadd) = iwork(n21+1:n21+nadd);
      istb(k+1:nb+1) = istb(k+1:nb+1) + nadd;
    end
  end
 

Original matrix
 n   = 9
 nz  = 33
 nb  = 3
 Block 1: order = 6, start row = 1
 Block 2: order = 9, start row = 1
 Block 3: order = 6, start row = 4

Factorization
 nnzc = 73

 Elements of factorization

        I   J               C(I,J)              Index
 C_1   ----------------------------------------------
        1   1   (  7.21154e-03,  4.80769e-03)      34
        1   2   ( -2.50000e-01,  6.25000e-02)      35
        1   4   ( -3.26923e-01, -7.21154e-02)      36
        2   1   ( -1.34615e-01,  1.44231e-02)      37
        2   2   (  7.51634e-03,  4.87709e-03)      38
        2   3   ( -2.57623e-01,  6.78176e-02)      39
        2   5   ( -3.38867e-01, -7.03465e-02)      40
        3   2   ( -1.38967e-01,  1.66383e-02)      41
        3   3   (  7.52786e-03,  4.87530e-03)      42
        3   6   ( -3.39257e-01, -7.01208e-02)      43
        4   1   ( -2.88462e-01, -4.80769e-03)      44
        4   4   (  7.82358e-03,  5.49946e-03)      45
        4   5   ( -2.77460e-01,  6.21296e-02)      46
        5   2   ( -2.98245e-01, -1.26443e-03)      47
        5   4   ( -1.48878e-01,  1.22423e-02)      48
        5   5   (  8.26388e-03,  5.64193e-03)      49
        5   6   ( -2.89400e-01,  6.89668e-02)      50
        6   3   ( -2.98536e-01, -1.00694e-03)      51
        6   5   ( -1.55586e-01,  1.49357e-02)      52
        6   6   (  8.28693e-03,  5.64169e-03)      53
 C_2   ----------------------------------------------
        1   1   (  7.21154e-03,  4.80769e-03)      54
        1   2   ( -2.50000e-01,  6.25000e-02)      55
        1   4   ( -2.88462e-01, -4.80769e-03)      56
        1   5   ( -3.26923e-01, -7.21154e-02)      57
        2   1   ( -1.34615e-01,  1.44231e-02)      58
        2   2   (  7.51634e-03,  4.87709e-03)      59
        2   3   ( -2.57623e-01,  6.78176e-02)      60
        2   6   ( -2.98245e-01, -1.26443e-03)      61
        2   7   ( -3.38867e-01, -7.03465e-02)      62
        3   2   ( -1.38967e-01,  1.66383e-02)      63
        3   3   (  7.52786e-03,  4.87530e-03)      64
        3   8   ( -2.98536e-01, -1.00694e-03)      65
        3   9   ( -3.39257e-01, -7.01208e-02)      66
        4   1   ( -3.26923e-01, -7.21154e-02)      67
        4   4   (  7.82358e-03,  5.49946e-03)      68
        4   6   ( -2.77460e-01,  6.21296e-02)      69
        5   1   ( -2.88462e-01, -4.80769e-03)      70
        5   5   (  7.82358e-03,  5.49946e-03)      71
        5   7   ( -2.77460e-01,  6.21296e-02)      72
        6   2   ( -3.38867e-01, -7.03465e-02)      73
        6   4   ( -1.48878e-01,  1.22423e-02)      74
        6   6   (  8.26388e-03,  5.64193e-03)      75
        6   8   ( -2.89400e-01,  6.89668e-02)      76
        7   2   ( -2.98245e-01, -1.26443e-03)      77
        7   5   ( -1.48878e-01,  1.22423e-02)      78
        7   7   (  8.26388e-03,  5.64193e-03)      79
        7   9   ( -2.89400e-01,  6.89668e-02)      80
        8   3   ( -3.39257e-01, -7.01208e-02)      81
        8   6   ( -1.55586e-01,  1.49357e-02)      82
        8   8   (  8.28693e-03,  5.64169e-03)      83
        9   3   ( -2.98536e-01, -1.00694e-03)      84
        9   7   ( -1.55586e-01,  1.49357e-02)      85
        9   9   (  8.28693e-03,  5.64169e-03)      86
 C_3   ----------------------------------------------
        1   1   (  7.21154e-03,  4.80769e-03)      87
        1   2   ( -2.50000e-01,  6.25000e-02)      88
        1   4   ( -2.88462e-01, -4.80769e-03)      89
        2   1   ( -1.34615e-01,  1.44231e-02)      90
        2   2   (  7.51634e-03,  4.87709e-03)      91
        2   3   ( -2.57623e-01,  6.78176e-02)      92
        2   5   ( -2.98245e-01, -1.26443e-03)      93
        3   2   ( -1.38967e-01,  1.66383e-02)      94
        3   3   (  7.52786e-03,  4.87530e-03)      95
        3   6   ( -2.98536e-01, -1.00694e-03)      96
        4   1   ( -3.26923e-01, -7.21154e-02)      97
        4   4   (  7.82358e-03,  5.49946e-03)      98
        4   5   ( -2.77460e-01,  6.21296e-02)      99
        5   2   ( -3.38867e-01, -7.03465e-02)     100
        5   4   ( -1.48878e-01,  1.22423e-02)     101
        5   5   (  8.26388e-03,  5.64193e-03)     102
        5   6   ( -2.89400e-01,  6.89668e-02)     103
        6   3   ( -3.39257e-01, -7.01208e-02)     104
        6   5   ( -1.55586e-01,  1.49357e-02)     105
        6   6   (  8.28693e-03,  5.64169e-03)     106

 Details of factorized blocks

  K   I      ISTR(I)  IDIAG(I)   INDB(I)
  1   1        34        34         1
      2        37        38         2
      3        41        42         3
      4        44        45         4
      5        47        49         5
      6        51        53         6
 ------------------------------------
  2   7        54        54         4
      8        58        59         5
      9        63        64         6
     10        67        68         1
     11        70        71         7
     12        73        75         2
     13        77        79         8
     14        81        83         3
     15        84        86         9
 ------------------------------------
  3  16        87        87         7
     17        90        91         8
     18        94        95         9
     19        97        98         4
     20       100       102         5
     21       104       106         6
 ------------------------------------


PDF version (NAG web site, 64-bit version, 64-bit version)
Chapter Contents
Chapter Introduction
NAG Toolbox

© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013