NAG Library Routine Document
d01esf (md_sgq_multi_vec)
1
Purpose
d01esf approximates a vector of definite integrals
$\mathbf{F}$ over the unit hypercube
$\Omega ={\left[0,1\right]}^{d}$, given the vector of integrands
$\mathbf{f}\left(\mathbf{x}\right)$.
The routine uses a sparse grid discretisation, allowing for computationally feasible estimations of integrals of high dimension ($d\sim O\left(100\right)$).
2
Specification
Fortran Interface
Subroutine d01esf ( 
ni, ndim, f, maxdlv, dinest, errest, ivalid, iopts, opts, iuser, ruser, ifail) 
Integer, Intent (In)  ::  ni, ndim, maxdlv(ndim), iopts(100)  Integer, Intent (Inout)  ::  iuser(*), ifail  Integer, Intent (Out)  ::  ivalid(ni)  Real (Kind=nag_wp), Intent (In)  ::  opts(100)  Real (Kind=nag_wp), Intent (Inout)  ::  ruser(*)  Real (Kind=nag_wp), Intent (Out)  ::  dinest(ni), errest(ni)  External  ::  f 

C Header Interface
#include nagmk26.h
void 
d01esf_ (const Integer *ni, const Integer *ndim, void (NAG_CALL *f)(const Integer *ni, const Integer *ndim, const Integer *nx, const double *xtr, const Integer *nntr, const Integer icolzp[], const Integer irowix[], const double xs[], const Integer qs[], double fm[], Integer *iflag, Integer iuser[], double ruser[]), const Integer maxdlv[], double dinest[], double errest[], Integer ivalid[], const Integer iopts[], const double opts[], Integer iuser[], double ruser[], Integer *ifail) 

3
Description
d01esf uses a sparse grid to generate a vector of approximations
$\hat{\mathbf{F}}$ to a vector of integrals
$\mathbf{F}$ over the unit hypercube
$\Omega ={\left[0,1\right]}^{d}$, that is,
3.1
Comparing Quadrature Over Full and Sparse Grids
Before illustrating the sparse grid construction, it is worth comparing integration over a sparse grid to integration over a full grid.
Given a onedimensional quadrature rule with $N$ abscissae, which accurately evaluates a polynomial of order ${P}_{N}$, a full tensor product over $d$ dimensions, a full grid, may be constructed with ${N}^{d}$ multidimensional abscissae. Such a product will accurately integrate a polynomial where the maximum power of any dimension is ${P}_{N}$. For example if $d=2$ and ${P}_{N}=3$, such a rule will accurately integrate any polynomial whose highest order term is ${x}_{1}^{3}{x}_{2}^{3}$. Such a polynomial may be said to have a maximum combined order of ${P}_{N}^{d}$, provided no individual dimension contributes a power greater than ${P}_{N}$. However, the number of multidimensional abscissae, or points, required increases exponentially with the dimension, rapidly making such a construction unusable.
The sparse grid technique was developed by Smolyak (
Smolyak (1963)). In this, multiple onedimensional quadrature rules of increasing accuracy are combined in such a way as to provide a multidimensional quadrature rule which will accurately evaluate the integral of a polynomial whose maximum order appears as a monomial. Hence a sparse grid construction whose highest level quadrature rule has polynomial order
${P}_{N}$ will accurately integrate a polynomial whose maximum combined order is also
${P}_{N}$. Again taking
${P}_{N}=3$, one may, theoretically, accurately integrate a polynomial such as
${x}^{3}+{x}^{2}y+{y}^{3}$, but not a polynomial such as
${x}^{3}{y}^{3}+xy$. Whilst this has a lower maximum combined order than the full tensor product, the number of abscissae required increases significantly slower than the equivalent full grid, making some classes of integrals of dimension
$d\sim \mathit{O}\left(100\right)$ tractable. Specifically, if a onedimensional quadrature rule of level
$\ell $ has
$N\sim \mathit{O}\left({2}^{\ell}\right)$ abscissae, the corresponding full grid will have
$\mathit{O}\left({\left({2}^{\ell}\right)}^{d}\right)$ multidimensional abscissae, whereas the sparse grid will have
$\mathit{O}\left({2}^{\ell}{d}^{\ell 1}\right)$.
Figure 1 demonstrates this using a Gauss–Patterson rule with
$15$ points in
$3$ dimensions. The full grid requires
$3375$ points, whereas the sparse grid only requires
$111$.
Figure 1: Threedimensional full (left) and sparse (right) grids, constructed from the $15$ point Gauss–Patterson rule
3.2
Sparse Grid Quadrature
We now include a brief description of the sparse grid construction, sufficient for the understanding of the use of this routine. For a more detailed analysis, see
Gerstner and Griebel (1998).
Consider a onedimensional
${n}_{\ell}$point quadrature rule of level
$\ell $,
${Q}_{\ell}$. The action of this rule on a integrand
$f$ is to approximate its definite onedimensional integral
${}_{1}F$ as,
using weights
${w}_{\ell ,i}$ and abscissae
${x}_{\ell ,i}$, for
$\mathit{i}=1,2,\dots ,{n}_{\ell}$.
Now construct a set of onedimensional quadrature rules, $\left\{{Q}_{\ell}\mid \ell =1,\dots ,L\right\}$, such that the accuracy of the quadrature rule increases with the level number. In this routine we exclusively use quadrature rules which are completely nested, implying that if an abscissae ${x}_{\ell ,k}$ is in level $\ell $, it is also in level $\ell +1$. The quantity $L$ denotes some maximum level appropriate to the rules that have been selected.
Now define the action of the tensor product of
$d$ rules as,
where the individual level indices
${\ell}_{j}$ are not necessarily ordered or unique. Each tensor product of
$d$ rules defines an action of the quadrature rules
${Q}_{\mathbf{\ell}}$,
$\mathbf{\ell}=\left({\ell}_{1},{\ell}_{2},\dots ,{\ell}_{d}\right)$ over a subspace, which is given a level
$\left\mathbf{\ell}\right={\displaystyle \sum _{\mathit{j}=1}^{d}}{\ell}_{j}$. If all rule levels are equal, this is the full tensor product of that level.
The sparse grid construction of level
$\ell $ can then be declared as the sum of all actions of the quadrature differences
${\Delta}_{k}=\left({Q}_{k}{Q}_{k1}\right)$, over all subspaces having a level at most
$\ell d+1$,
By definition, all subspaces used for level $\ell 1$ must also be used for level $\ell $, and as such the difference between the result of all actions over subsequent sparse grid constructions may be used as an error estimate.
Let $L$ be the maximum level allowable in a sparse grid construction. The classical sparse grid construction of $\ell =L$ allows each dimension to support a onedimensional quadrature rule of level at most $L$, with such a quadrature rule being used in every dimension at least once. Such a construction lends equal weight to each dimension of the integration, and is termed here ‘isotropic’.
Define the set $\mathbf{m}=\left({m}_{j},j=1,2,\dots ,d\right)$, where ${m}_{j}$ is the maximum quadrature rule allowed in the $j$th dimension, and ${m}_{q}$ to be the maximum quadrature rule used by any dimension. Let a subspace be identified by its quadrature difference levels, $\mathbf{k}=\left({k}_{1},{k}_{2},\dots ,{k}_{d}\right)$.
The classical construction may be extended by allowing different dimensions to have different values ${m}_{j}$, and by allowing ${m}_{q}\le L$. This creates nonisotropic constructions. These are especially useful in higher dimensions, where some dimensions contribute more than others to the result, as they can drastically reduce the number of function evaluations required.
For example, consider the twodimensional construction with $L=4$. The classical isotropic construction would have the following subspaces.
Subspaces generated by a classical sparse grid with
$L=4$.
Level 
Subspaces 
$1$ 
$\left(1,1\right)$ 
$2$ 
$\left(2,1\right)$,
$\left(1,2\right)$ 
$3$ 
$\left(3,1\right)$,
$\left(2,2\right)$,
$\left(1,3\right)$ 
$4$ 
$\left(4,1\right)$,
$\left(3,2\right)$,
$\left(2,3\right)$,
$\left(1,4\right)$ 
If the variation in the second dimension is sufficiently accurately described by a quadrature rule of level $2$, the contributions of the subspaces $\left(1,3\right)$ and $\left(1,4\right)$ are probably negligible. Similarly, if the variation in the first dimension is sufficiently accurately described by a quadrature rule of level $3$, the subspace $\left(4,1\right)$ is probably negligible. Furthermore the subspace $\left(2,3\right)$ would also probably have negligible impact, whereas the subspaces $\left(2,2\right)$ and $\left(3,2\right)$ would not. Hence restricting the first dimension to a maximum level of $3$, and the second dimension to a maximum level of $2$ would probably give a sufficiently acceptable estimate, and would generate the following subspaces.
Subspaces generated by a nonisotropic sparse grid with
$L=4$,
${m}_{q}=3$ and
$\mathbf{m}=\left(3,2\right)$.
Level 
Subspaces 
$1$ 
$\left(1,1\right)$ 
$2$ 
$\left(2,1\right)$,
$\left(1,2\right)$ 
$3$ 
$\left(3,1\right)$,
$\left(2,2\right)$ 
$4$ 
$\left(4,1\right)$,
$\left(3,2\right)$ 
Taking this to the extreme, if the variation in the first and second dimensions are sufficiently accurately described by a level $2$ quadrature rule, restricting the maximum level of both dimensions to $2$ would generate the following subspaces.
Subspaces generated by a sparse grid construction with
$L=4$,
${m}_{q}=2$ and
$\mathbf{m}=\left(2,2\right)$.
Level 
Subspaces 
$1$ 
$\left(1,1\right)$ 
$2$ 
$\left(2,1\right)$,
$\left(1,2\right)$ 
$3$ 
$\left(2,2\right)$ 
$4$ 
None 
Hence one subspace is generated at level $3$, and no subspaces are generated at level $4$. The level $3$ subspace $\left(2,2\right)$ actually indicates that this is the full grid of level $2$.
3.3
Using d01esf
d01esf uses optional parameters, supplied in the option arrays
iopts and
opts. Before calling
d01esf, these option arrays must be initialized using
d01zkf. Once initialized, the required options may be set and queried using
d01zkf and
d01zlf respectively. A complete list of the options available may be found in
Section 11.
You may control the maximum level required,
$L$, using the optional parameter
Maximum Level. Furthermore, you may control the first level at which the error comparison will take place using the optional parameter
Minimum Level, allowing for the forced evaluation of a predetermined number of levels before the routine attempts to complete. Completion is flagged when the error estimate is sufficiently small:
where
${\epsilon}_{a}$ and
${\epsilon}_{r}$ are the absolute and relative error tolerances, respectively, and
$k\le L$ is the highest level at which computation was performed. The tolerances
${\epsilon}_{a}$ and
${\epsilon}_{r}$ can be controlled by setting the optional parameters
Absolute Tolerance and
Relative Tolerance.
Owing to the interlacing nature of the quadrature rules used herein, abscissae
$\mathbf{x}$ required in lower level subspaces will also appear in higherlevel subspaces. This allows for calculations which will be repeated later to be stored and reused. However, this is naturally at the expense of memory. It may also be at the expense of computational time, depending on the complexity of the integrands, as the lookup time for a given value is (at worst)
$\mathit{O}\left(d\right)$. Furthermore, as the sparse grid level increases, fewer subsequent levels will require values from the current level. You may control the number of levels for which values are stored by setting the optional parameter
Index Level.
Two different sets of interlacing quadrature rules are selectable using the optional parameter
Quadrature Rule: Gauss–Patterson and Clenshaw–Curtis. Gauss–Patterson rules offer greater polynomial accuracy, whereas Clenshaw–Curtis rules are often effective for oscillatory integrands. Clenshaw–Curtis rules require function values to be evaluated on the boundary of the hypercube, whereas Gauss–Patterson rules do not. Both of these rules use precomputed weights, and as such there is an effective limit on
${m}_{q}$; see the description of the optional parameter
Quadrature Rule. The value of
${m}_{q}$ is returned by the queriable optional parameter
Maximum Quadrature Level.
d01esf also allows for nonisotropic sparse grids to be constructed. This is done by appropriately setting the array
maxdlv. It should be emphasised that a nonisometric construction should only be used if the integrands behave in a suitable way. For example, they may decay toward zero as the lesser dimensions approach their bounds of
$\Omega $. It should also be noted that setting
${\mathbf{maxdlv}}\left(k\right)=1$ will not reduce the dimension of the integrals, it will simply indicate that only one point in dimension
$k$ should be used. It is also advisable to approximate the integrals several times, once with an isometric construction of some level, and then with a nonisometric construction with higher levels in various dimensions. If the difference between the solutions is significantly more than the returned error estimates, the assumptions of dimensional importance are probably incorrect.
The abscissae in each subspace are generally expressible in a sparse manner, because many of the elements of each abscissa will in fact be the centre point of the dimension, which is termed here the ‘trivial’ element. In this routine the trivial element is always returned as
$0.5$ owing to the restriction to the
$\left[0,1\right]$ hypercube. As such, the subroutine
f returns the abscissae in Compressed Column Storage (CCS) format (see the
F11 Chapter Introduction). This has particular advantages when using accelerator hardware to evaluate the required functions, as much less data must be forwarded. It also, potentially, allows for calculations to be computed faster, as any subcalculations dependent upon the trivial value may be potentially reused. See the example in
Section 10.
4
References
Caflisch R E, Morokoff W and Owen A B (1997) Valuation of mortgage backed securities using Brownian bridges to reduce effective dimension Journal of Computational Finance 1 27–46
Gerstner T and Griebel M (1998) Numerical integration using sparse grids Numerical Algorithms 18 209–232
Smolyak S A (1963) Quadrature and interpolation formulas for tensor products of certain classes of functions Dokl. Akad. Nauk SSSR 4 240–243
5
Arguments
 1: $\mathbf{ni}$ – IntegerInput

On entry: ${n}_{i}$, the number of integrands.
Constraint:
${\mathbf{ni}}\ge 1$.
 2: $\mathbf{ndim}$ – IntegerInput

On entry: $d$, the number of dimensions.
Constraint:
${\mathbf{ndim}}\ge 1$.
 3: $\mathbf{f}$ – Subroutine, supplied by the user.External Procedure

f must return the value of the integrands
${f}_{j}$ at a set of
${n}_{x}$ $d$dimensional points
${\mathbf{x}}_{i}$, implicitly supplied as columns of a matrix
$X\left(d,{n}_{x}\right)$. If
$X$ was supplied explicitly you would find that most of the elements attain the same value,
${x}_{\mathrm{tr}}$; the larger the number of dimensions, the greater the proportion of elements of
$X$ would be equal to
${x}_{\mathrm{tr}}$. So,
$X$ is effectively a sparse matrix, except that the ‘zero’ elements are replaced by elements that are all equal to the value
${x}_{\mathrm{tr}}$. For this reason
$X$ is supplied, as though it were a sparse matrix, in compressed column storage (CCS) format (see the
F11 Chapter Introduction).
Individual entries
${x}_{\mathit{k},i}$ of
$X$, for
$\mathit{k}=1,2,\dots ,d$, are either trivially valued, in which case
${x}_{k,i}={x}_{\mathrm{tr}}$, or are nontrivially valued. For point
$i$, the nontrivial row indices and corresponding abscissae values are supplied in elements
$c\left(\mathit{i}\right)={\mathbf{icolzp}}\left(\mathit{i}\right),\dots ,{\mathbf{icolzp}}\left(\mathit{i}+1\right)1$, for
$\mathit{i}=1,2,\dots ,{n}_{x}$, of the arrays
irowix and
xs, respectively. Hence the
$i$th column of the matrix
$X$ is retrievable as
An equivalent integer valued matrix
$Q$ is also implicitly provided. This contains the unique indices
${q}_{k,i}$ of the underlying onedimensional quadrature rule corresponding to the individual abscissae
${x}_{k,i}$. For trivial abscissae, the implicit index
${q}_{k,i}=1$.
$Q$ is supplied in the same CCS format as
$X$, with the nontrivial values supplied in
qs.
The specification of
f is:
Fortran Interface
Subroutine f ( 
ni, ndim, nx, xtr, nntr, icolzp, irowix, xs, qs, fm, iflag, iuser, ruser) 
Integer, Intent (In)  ::  ni, ndim, nx, nntr, icolzp(nx+1), irowix(nntr), qs(nntr)  Integer, Intent (Inout)  ::  iflag, iuser(*)  Real (Kind=nag_wp), Intent (In)  ::  xtr, xs(nntr)  Real (Kind=nag_wp), Intent (Inout)  ::  ruser(*)  Real (Kind=nag_wp), Intent (Out)  ::  fm(ni,nx) 

C Header Interface
#include nagmk26.h
void 
f (const Integer *ni, const Integer *ndim, const Integer *nx, const double *xtr, const Integer *nntr, const Integer icolzp[], const Integer irowix[], const double xs[], const Integer qs[], double fm[], Integer *iflag, Integer iuser[], double ruser[]) 

 1: $\mathbf{ni}$ – IntegerInput

On entry: ${n}_{i}$, the number of integrands.
 2: $\mathbf{ndim}$ – IntegerInput

On entry: $d$, the number of dimensions.
 3: $\mathbf{nx}$ – IntegerInput

On entry: ${n}_{x}$, the number of points ${x}_{i}$, corresponding to the number of columns of $X$, at which the set of integrands must be evaluated.
 4: $\mathbf{xtr}$ – Real (Kind=nag_wp)Input

On entry: ${x}_{\mathrm{tr}}$, the value of the trivial elements of $X$.
 5: $\mathbf{nntr}$ – IntegerInput

On entry: if
${\mathbf{iflag}}>0$, the number of nontrivial elements of
$X$.
If ${\mathbf{iflag}}=0$, the total number of abscissae from the underlying onedimensional quadrature.
 6: $\mathbf{icolzp}\left({\mathbf{nx}}+1\right)$ – Integer arrayInput

On entry: the set
$\left\{{\mathbf{icolzp}}\left(\mathit{i}\right),\dots ,{\mathbf{icolzp}}\left(\mathit{i}+1\right)1\right\}$ contains the indices of
irowix and
xs corresponding to the nontrivial elements of column
$\mathit{i}$ of
$X$ and hence of the point
${\mathbf{x}}_{\mathit{i}}$, for
$\mathit{i}=1,2,\dots ,{n}_{x}$.
 7: $\mathbf{irowix}\left({\mathbf{nntr}}\right)$ – Integer arrayInput

On entry: the row indices corresponding to the nontrivial elements of $X$.
 8: $\mathbf{xs}\left({\mathbf{nntr}}\right)$ – Real (Kind=nag_wp) arrayInput

On entry: ${x}_{k,i}\ne {x}_{\mathrm{tr}}$, the nontrivial entries of $X$.
 9: $\mathbf{qs}\left({\mathbf{nntr}}\right)$ – Integer arrayInput

On entry: ${q}_{k,i}\ne 1$, the indices of the underlying quadrature rules corresponding to ${x}_{k,i}\ne {x}_{\mathrm{tr}}$.
 10: $\mathbf{fm}\left({\mathbf{ni}},{\mathbf{nx}}\right)$ – Real (Kind=nag_wp) arrayOutput

On exit: ${\mathbf{fm}}\left(\mathit{p},\mathit{i}\right)={f}_{\mathit{p}}\left({\mathbf{x}}_{\mathit{i}}\right)$, for $\mathit{i}=1,2,\dots ,{n}_{x}$ and $\mathit{p}=1,2,\dots ,{n}_{\mathit{i}}$.
 11: $\mathbf{iflag}$ – IntegerInput/Output

On entry: if
${\mathbf{iflag}}=0$, this is the first call to
f.
${n}_{x}=1$, and the entire point
${\mathbf{x}}_{1}$ will satisfy
${x}_{\mathit{k},1}={x}_{\mathrm{tr}}$, for
$\mathit{k}=1,2,\dots ,d$. In addition,
nntr contains the total number of abscissae from the underlying onedimensional quadrature;
xs contains the complete set of abscissae and
qs contains the corresponding quadrature indices, with
${\mathbf{xs}}\left(1\right)={x}_{\mathrm{tr}}$ and
${\mathbf{qs}}\left(1\right)=1$. This will always be called in serial.
In subsequent calls to
f,
${\mathbf{iflag}}=1$. Subsequent calls may be made from within an OpenMP parallel region. See
Section 8 for details.
On exit: set ${\mathbf{iflag}}<0$ if you wish to force an immediate exit from d01esf with ${\mathbf{ifail}}={{\mathbf{1}}}$.
 12: $\mathbf{iuser}\left(*\right)$ – Integer arrayUser Workspace
 13: $\mathbf{ruser}\left(*\right)$ – Real (Kind=nag_wp) arrayUser Workspace

f is called with the arguments
iuser and
ruser as supplied to
d01esf. You should use the arrays
iuser and
ruser to supply information to
f.
f must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which
d01esf is called. Arguments denoted as
Input must
not be changed by this procedure.
Note: f should not return floatingpoint NaN (Not a Number) or infinity values, since these are not handled by
d01esf. If your code inadvertently
does return any NaNs or infinities,
d01esf is likely to produce unexpected results.
 4: $\mathbf{maxdlv}\left({\mathbf{ndim}}\right)$ – Integer arrayInput

On entry:
$\mathbf{m}$, the array of maximum levels for each dimension.
${\mathbf{maxdlv}}\left(\mathit{j}\right)$, for
$\mathit{j}=1,2,\dots ,d$, contains
${m}_{\mathit{j}}$, the maximum level of quadrature rule dimension
j will support.
The default value,
$\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({m}_{q},L\right)$ will be used if either
${\mathbf{maxdlv}}\left(j\right)\le 0$ or
${\mathbf{maxdlv}}\left(j\right)\ge \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({m}_{q},L\right)$ (for details on the default values for
${m}_{q}$ and
$L$ and on how to change these values see the options
Maximum Level,
Maximum Quadrature Level and
Quadrature Rule).
If ${\mathbf{maxdlv}}\left(j\right)=1$ for all $j$, only one evaluation will be performed, and as such no error estimation will be possible.
Suggested value:
${\mathbf{maxdlv}}\left(j\right)=0$ for all $j=1,2,\dots ,d$.
Note: setting nondefault values for some dimensions makes the assumption that the contribution from the omitted subspaces is $0$. The integral and error estimates will only be based on included subspaces, which if the $0$ contribution assumption is not valid will be erroneous.
 5: $\mathbf{dinest}\left({\mathbf{ni}}\right)$ – Real (Kind=nag_wp) arrayOutput

On exit: ${\mathbf{dinest}}\left(\mathit{p}\right)$ contains the final estimate ${\hat{F}}_{\mathit{p}}$ of the definite integral ${F}_{p}$, for $\mathit{p}=1,2,\dots ,{n}_{i}$.
 6: $\mathbf{errest}\left({\mathbf{ni}}\right)$ – Real (Kind=nag_wp) arrayOutput

On exit: ${\mathbf{errest}}\left(\mathit{p}\right)$ contains the final error estimate ${E}_{p}$ of the definite integral ${F}_{p}$, for $\mathit{p}=1,2,\dots ,{n}_{i}$.
 7: $\mathbf{ivalid}\left({\mathbf{ni}}\right)$ – Integer arrayOutput

On exit:
${\mathbf{ivalid}}\left(\mathit{p}\right)$ indicates the final state of integral
$\mathit{p}$, for
$\mathit{p}=1,2,\dots ,{n}_{i}$.
 ${\mathbf{ivalid}}\left(p\right)=0$
 The error estimate for integral $p$ was below the requested tolerance.
 ${\mathbf{ivalid}}\left(p\right)=1$
 The error estimate for integral $p$ was below the requested tolerance. The final level used was nonisotropic.
 ${\mathbf{ivalid}}\left(p\right)=2$
 The error estimate for integral $p$ was above the requested tolerance.
 ${\mathbf{ivalid}}\left(p\right)=3$
 The error estimate for integral $p$ was above $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(0.1\left{\hat{F}}_{p}\right,0.01\right)$.
 ${\mathbf{ivalid}}\left(p\right)<0$
 You aborted the evaluation before an error estimate could be made.
 8: $\mathbf{iopts}\left(100\right)$ – Integer arrayCommunication Array
 9: $\mathbf{opts}\left(100\right)$ – Real (Kind=nag_wp) arrayCommunication Array

The arrays
iopts and
opts must not be altered between calls to any of the routines
d01esf,
d01zkf and
d01zlf.
 10: $\mathbf{iuser}\left(*\right)$ – Integer arrayUser Workspace
 11: $\mathbf{ruser}\left(*\right)$ – Real (Kind=nag_wp) arrayUser Workspace

iuser and
ruser are not used by
d01esf, but are passed directly to
f and may be used to pass information to this routine.
 12: $\mathbf{ifail}$ – IntegerInput/Output

On entry:
ifail must be set to
$0$,
$1\text{ or}1$. If you are unfamiliar with this argument you should refer to
Section 3.4 in How to Use the NAG Library and its Documentation for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value
$1\text{ or}1$ is recommended. If the output of error messages is undesirable, then the value
$1$ is recommended. Otherwise, because for this routine the values of the output arguments may be useful even if
${\mathbf{ifail}}\ne {\mathbf{0}}$ on exit, the recommended value is
$1$.
When the value $\mathbf{1}\text{ or}1$ is used it is essential to test the value of ifail on exit.
On exit:
${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see
Section 6).
6
Error Indicators and Warnings
If on entry
${\mathbf{ifail}}=0$ or
$1$, explanatory error messages are output on the current error message unit (as defined by
x04aaf).
Note: d01esf may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the routine:
 ${\mathbf{ifail}}=1$

The requested accuracy was not achieved for at least one integral.
 ${\mathbf{ifail}}=2$

No accuracy was achieved for at least one integral.
 ${\mathbf{ifail}}=11$

On entry, ${\mathbf{ni}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{ni}}\ge 1$.
 ${\mathbf{ifail}}=21$

On entry, ${\mathbf{ndim}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{ndim}}\ge 1$.
 ${\mathbf{ifail}}=1001$

Either the option arrays
iopts and
opts have not been initialized for
d01esf, or they have become corrupted.
 ${\mathbf{ifail}}=1$

Exit requested from
f with
${\mathbf{iflag}}=\u2329\mathit{\text{value}}\u232a$.
 ${\mathbf{ifail}}=99$
An unexpected error has been triggered by this routine. Please
contact
NAG.
See
Section 3.9 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=399$
Your licence key may have expired or may not have been installed correctly.
See
Section 3.8 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=999$
Dynamic memory allocation failed.
See
Section 3.7 in How to Use the NAG Library and its Documentation for further information.
7
Accuracy
For each integral
$p$, an error estimate
${E}_{p}$ is returned, where,
where
$k\le L$ is the highest level at which computation was performed.
8
Parallelism and Performance
8.1
Direct Threading
d01esf is directly threaded for parallel execution. For each level, at most ${n}_{t}$ threads will evaluate the integrands over independent subspaces of the construction, and will construct a partial sum of the level's contribution. Once all subspaces from a given level have been processed, the partial sums are combined to give the total contribution of the level, which is in turn added to the total solution. For a given number of threads, the division of subspaces between the threads, and the order in which a single thread operates over its assigned subspaces, is fixed. However, the order in which all subspaces are combined will necessarily be different to the single threaded case, which may result in some discrepency in the results between parallel and serial execution.
To mitigate this discrepency, it is recommended that d01esf be instructed to use higherthanworking precision to accumulate the actions over the subspaces. This is done by setting the option ${\mathbf{Summation\; Precision}}=\mathrm{HIGHER}$, which is the default behaviour. This has some computational cost, although this is typically negligible in comparison to the overall runtime, particularly for nontrivial integrands.
If ${\mathbf{Summation\; Precision}}=\mathrm{WORKING}$, then the accumulation will be performed using working precision, which may provide some increase in performance. Again, this is probably negligible in comparison to the overall runtime.
For some problems, typically of lower dimension, there may be insufficient work to warrant direct threading at lower levels. For example, a threedimensional problem will require at most
$3$ subspaces to be evaluated at level
$2$, and at most
$6$ subspaces at level
$3$. Furthermore, level
$2$ subspaces typically contain only
$2$ new multidimensional abscissae, while level
$3$ subspaces typically contain
$2$ or
$4$ new multidimensional abscissae depending on the
Quadrature Rule. If there are more threads than the number of available subspaces at a given level, or the amount of work in each subspace is outweighed by the amount of work required to generate the parallel region, parallel efficiency will be decreased. This may be mitigated to some extent by evaluating the first
${s}_{l}$ levels in serial. The value of
${s}_{l}$ may be altered using the optional parameter
Serial Levels. If
${s}_{l}\ge L$, then all levels will be evaluated in serial and no direct threading will be utilized.
If you use direct threading in the manner just described, you must ensure any access to the
user arrays
iuser and
ruser is done in a threadsafe manner.
These are
classed as OpenMP SHARED, and
are
passed directly to the subroutine
f for every thread.
8.2
The vectorized interface also allows for parallelization inside the subroutine
f by evaluating the required integrands in parallel. Provided the values returned by
f match those that would be returned without parallelizing
f, the final result should match the serial result, barring any discrepencies in accumulation. If you wish to parallelize
f, it is advisable to set a large value for
Maximum Nx, although be aware that increasing
Maximum Nx will increase the memory requirement of the routine. In general, parallelization of
f should not be necessary, as the higherlevel parallelism over different subspaces scales well for many problems.
Not applicable.
10
Example
The example program evaluates an estimate to the set of integrals
where
$\left\mathbf{x}\right={\displaystyle \sum _{\mathit{j}=1}^{d}}j{x}_{j}$.
It also demonstrates a simple method to safely use
iuser and
ruser as workspace for subcalculations when running in parallel.
10.1
Program Text
Program Text (d01esfe.f90)
10.2
Program Data
None.
10.3
Program Results
Program Results (d01esfe.r)
11
Optional Parameters
Several optional parameters in
d01esf control aspects of the algorithm, methodology used, logic or output. Their values are contained in the arrays
iopts and
opts; these must be initialized before calling
d01esf by first calling
d01zkf with
optstr set to
"Initialize = d01esf".
Each optional parameter has an associated default value; to set any of them to a nondefault value, or to reset any of them to the default value, use
d01zkf. The current value of an optional parameter can be queried using
d01zlf.
The remainder of this section can be skipped if you wish to use the default values for all optional parameters.
The following is a list of the optional parameters available. A full description of each optional parameter is provided in
Section 11.1.
11.1
Description of the Optional Parameters
For each option, we give a summary line, a description of the optional parameter and details of constraints.
The summary line contains:
 the keywords, where the minimum abbreviation of each keyword is underlined;
 a parameter value,
where the letters $a$, $i$ and $r$ denote options that take character, integer and real values respectively.
 the default value.
The following symbol represent various machine constants:
 $\epsilon $ represents the machine precision (see x02ajf).
All options accept the value ‘DEFAULT’ in order to return single options to their default states.
Keywords and character values are case insensitive, however they must be separated by at least one space.
Queriable options will return the appropriate value when queried by calling
d01zlf. They will have no effect if passed to
d01zkf.
For
d01esf the maximum length of the argument
cvalue used by
d01zlf is
$15$.
Absolute Tolerance  $r$  Default $\text{}=\sqrt{\epsilon}$ 
$r={\epsilon}_{a}$, the absolute tolerance required.
Index Level  $i$  Default $\text{}=4$ 
The maximum level at which function values are stored for later use. Larger values use increasingly more memory, and require more time to access specific elements. Lower levels result in more repeated computation. The
Maximum Quadrature Level,
${m}_{q}$ is the effective upper limit on
Index Level. If
$i\ge {m}_{q}$,
d01esf will use
${m}_{q}$ as the value of
Index Level.
Constraint: $i\ge 1$.
Maximum Level  $i$  Default $\text{}=5$ 
$i=L$, the maximum number of levels to evaluate.
Constraint: $1<i\le 20$.
Note: the maximum allowable level in any single dimension,
${m}_{q}$, is governed by the
Quadrature Rule selected. If a value greater than
${m}_{q}$ is set, only a subset of subspaces from higher levels will be used. Should this subset be empty for a given level, computation will consider the preceding level to be the maximum level and will terminate.
Maximum Nx  $i$  Default $\text{}=128$ 
$i=\mathrm{max}{n}_{x}$, the maximum number of points to evaluate in a single call to
f.
Constraint: $1\le i\le 16384$.
Maximum Quadrature Level  $i$  Queriable only 
$i={m}_{q}$, the maximum level of the underlying onedimensional quadrature rule (see
Quadrature Rule).
Minimum Level  $i$  Default $\text{}=2$ 
The minimum number of levels which must be evaluated before an error estimate is used to determine convergence.
Constraint: $i>1$.
Note: if the minimum level is greater than the maximum computable level, the maximum level will be used.
Quadrature Rule  $a$  Default $\text{}=\mathrm{Gauss\u2013Patterson}$ 
The underlying onedimensional quadrature rule to be used in the construction. Open rules do not require evaluations at boundaries.
 ${\mathbf{Quadrature\; Rule}}=\mathrm{Gauss\u2013Patterson}$ or $\mathrm{GP}$
 The interlacing Gauss–Patterson rules. Level $\ell $ uses ${2}^{\ell}1$ abscissae. All levels are open. These rules provide high order accuracy. ${m}_{q}=9$.
 ${\mathbf{Quadrature\; Rule}}=\mathrm{Clenshaw\u2013Curtis}$ or $\mathrm{CC}$
 The interlacing Clenshaw–Curtis rules. Level $\ell $ uses ${2}^{\ell 1}+1$ abscissae. All levels above level $1$ are closed. ${m}_{q}=12$.
Relative Tolerance  $r$  Default $\text{}=\sqrt{\epsilon}$ 
$r={\epsilon}_{a}$, the relative tolerance required.
Summation Precision  $a$  Default $\text{}=\mathrm{HIGHER}$ 
Determines whether
d01esf uses working precision or higherthanworking precision to accumulate the actions over subspaces.
 ${\mathbf{Summation\; Precision}}=\mathrm{HIGHER}$ or $\mathrm{H}$
 Higherthanworking precision is used to accumulate the action over a subspace, and for the accumulation of all such actions. This is more expensive computationally, although this is probably negligible in comparison to the cost of evaluating the integrands and the overall runtime. This significantly reduces variation in the result when changing the number of threads.
 ${\mathbf{Summation\; Precision}}=\mathrm{WORKING}$ or $\mathrm{W}$
 Working precision is used to accumulate the actions over subspaces. This may provide some speedup, particularly if ${n}_{i}$ or ${n}_{t}$ is large. The results of parallel simulations will however be more prone to variation.
Note: the following option is relevant only to multithreaded implementations of the NAG Library.
Serial Levels  $i$  Default $\text{}=1$ 
$i={s}_{l}$, the number of levels to be evaluated in serial before initializing parallelization. For relatively trivial integrands, this may need to be set greater than the default to reduce parallel overhead.