
D04 — Numerical Differentiation

This chapter is concerned with calculating approximations to derivatives of a function $f$.

The problem of numerical differentiation does not receive very much attention nowadays. Although the Taylor series plays a key role in much of classical analysis, the poor reputation enjoyed by numerical differentiation has led numerical analysts to construct techniques for most problems which avoid the explicit use of numerical differentiation.

One may briefly and roughly define the term numerical differentiation as any process in which numerical values of derivatives $f^{(s)}(x_0)$ are obtained from evaluations $f(x_i)$ of the function $f(x)$ at several abscissae $x_i$ near $x_0$. This problem can be stable, well-conditioned, and accurate when complex-valued abscissae are used. This was first pointed out by Lyness and Moler (1967). An item of numerical software for this appears in Lyness and Sande (1971). However, in many applications the use of complex-valued abscissae is either prohibitive or prohibited. The main difficulty in using real abscissae is that amplification of round-off error can completely obliterate meaningful results. In the days when one relied on hand calculating machines and tabular data, the frustration caused by this effect led to the abandonment of serious use of the process.

There are several reasons for believing that, in the present-day computing environment, numerical differentiation might find a useful role. The first is that, by present standards, it is rather a small-scale calculation, so its cost may well be negligible compared with any overall saving in cost that might result from its use. Secondly, the choice of a step length $h$ is now generally open; one does not have to rely on tabular data. Thirdly, although the amplification of round-off error is an integral part of the calculation, its effect can be measured reliably and automatically by the function at the time of the calculation.

Thus you do not have to gauge the round-off level (or noise level) of the function values or assess the effect of this on the accuracy of the results. A function does this automatically, returning with each result an error estimate which has already taken round-off error amplification into account.

We now illustrate, by means of a very simple example, the importance of the round-off error effect. A very simple approximation of $f'(0)$, based on the identity

$$f'(0) = (f(h) - f(-h))/2h + (h^2/3!)\,f'''(\xi)\text{,} \quad (1)$$

is

$$(f(h) - f(-h))/2h\text{.}$$

If there were no precision problem, this formula would be the only one needed in the theory of first-order numerical differentiation. We could simply take $h = 10^{-40}$ (or $h = 10^{-1000}$) to obtain an excellent approximation based on two function values. It is only when we consider in detail how a machine with finite precision comes to calculate $(f(h) - f(-h))/2h$ that the necessity for a sophisticated theory becomes apparent.
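The finite-precision difficulty is easy to reproduce. The following minimal Python sketch (illustrative only, not part of the Library) applies the central-difference approximation above: a moderate step works well, but with the $h = 10^{-40}$ of the thought experiment the two function values coincide in floating point and the result is useless.

```python
import math

def central_diff(f, x0, h):
    """Approximate f'(x0) by (f(x0 + h) - f(x0 - h)) / 2h."""
    return (f(x0 + h) - f(x0 - h)) / (2.0 * h)

# f = exp, so f'(0) = 1 exactly.
err_moderate = abs(central_diff(math.exp, 0.0, 1.0e-5) - 1.0)
err_tiny = abs(central_diff(math.exp, 0.0, 1.0e-40) - 1.0)

# err_moderate is around 1e-11.  With h = 1e-40 both function values
# round to exactly 1.0 in double precision, the difference is zero, and
# the computed "derivative" is 0.0 (error 1.0).
```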

To simplify the subsequent description we shall assume that the quantities involved are neither so close to zero that the machine underflow characteristics need be considered nor so large that machine overflow occurs. The approximation mentioned above involves the function values $f(h)$ and $f(-h)$. In general no computer has these numbers available exactly. Instead it uses approximations $\hat{f}(h)$ and $\hat{f}(-h)$ whose relative error is bounded by some tolerance $\epsilon_f$. If the function $f(x)$ is a library function, for example $\sin x$, $\epsilon_f$ may coincide with the machine accuracy parameter $\epsilon_m$. More generally, the function $f(x)$ is calculated in a user-supplied function, and $\epsilon_f$ is larger than $\epsilon_m$ by a small factor ($5$ or $6$) if the calculation is short, or by some larger factor in an extended calculation. This factor is not usually known by you.

$\hat{f}(h)$ and $\hat{f}(-h)$ are related to $f(h)$ and $f(-h)$ by

$$\hat{f}(h) = f(h)(1 + \theta_1\epsilon_f)\text{,} \quad |\theta_1| \le 1\text{,}$$
$$\hat{f}(-h) = f(-h)(1 + \theta_2\epsilon_f)\text{,} \quad |\theta_2| \le 1\text{.}$$

Thus even if the rest of the calculation were carried out exactly, it is trivial to show that

$$\frac{\hat{f}(h) - \hat{f}(-h)}{2h} - \frac{f(h) - f(-h)}{2h} \simeq 2\varphi\epsilon_f\frac{f(\xi)}{2h}\text{,} \quad |\varphi| \le 1\text{.}$$

The difference between the quantity actually calculated and the quantity which one attempts to calculate may be as large as $\epsilon_f f(\xi)/h$; for example, using $h = 10^{-40}$ and $\epsilon_m = 10^{-7}$ this gives a ‘conditioning error’ of $10^{33} f(\xi)$.

In practice much more sophisticated formulae than (1) above are used, and for these and for the corresponding higher-derivative formulae the error analysis is different and more complicated in detail. But invariably the theory contains the same overall feature. In a finite length calculation, the error is composed of two main parts: a discretization error which increases as $h$ becomes large and is zero for $h = 0$; and a ‘conditioning’ error due to amplification of round-off error in function evaluation, which increases as $h$ becomes small and is infinite for $h = 0$.
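This two-part error behaviour can be made visible by injecting a known relative perturbation into the function values. The Python sketch below is illustrative only; the noise level $\epsilon_f = 10^{-8}$ and the worst-case perturbation signs are assumptions made for the demonstration. The discretization error, of size about $h^2/6$ here, dominates for large $h$; the conditioning error, of size about $\epsilon_f |f(\xi)|/h$, dominates for small $h$.

```python
import math

EPS_F = 1.0e-8  # assumed relative noise level in the function values

def noisy_f(x, theta):
    """exp(x) carrying a deliberate relative error theta * EPS_F, |theta| <= 1."""
    return math.exp(x) * (1.0 + theta * EPS_F)

def central_diff_noisy(x0, h):
    # Worst-case signs: the two perturbations reinforce each other.
    return (noisy_f(x0 + h, 1.0) - noisy_f(x0 - h, -1.0)) / (2.0 * h)

exact = 1.0  # derivative of exp at 0

# Large h: discretization error ~ h**2 / 6 dominates (about 1.7e-3 here).
err_large_h = abs(central_diff_noisy(0.0, 0.1) - exact)

# Small h: conditioning error ~ EPS_F / h dominates (about 1e-2 here).
err_small_h = abs(central_diff_noisy(0.0, 1.0e-6) - exact)
```

The minimum total error lies at an intermediate $h$; locating it is precisely the task of an automatic step-length analysis.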

The functions in this chapter have to take into account internally both these sources of error in the results. Thus they return pairs of results, $\mathbf{der}(j)$ and $\mathbf{erest}(j)$, where $\mathbf{der}(j)$ is an approximation to $f^{(j)}(x_0)$ and $\mathbf{erest}(j)$ is an estimate of the error in the approximation $\mathbf{der}(j)$. If the function has not been misled, $\mathbf{der}(j)$ and $\mathbf{erest}(j)$ satisfy

$$|\mathbf{der}(j) - f^{(j)}(x_0)| \le \mathbf{erest}(j)\text{.}$$

In this respect, numerical differentiation functions are fundamentally different from other functions. You do not specify an error criterion. Instead the function provides the error estimate, and this may be very large.

We mention here a terminological distinction. A fully automatic function is one in which you do not need to provide any information other than that required to specify the problem. Such a function (at a cost in computing time) decides an appropriate internal parameter such as the step length $h$ by itself. On the other hand, a function which requires you to provide a step length $h$, but automatically chooses from several different formulae each based on the specified step length, is termed a semi-automatic function.

The situation described above must have seemed rather depressing when hand machines were in common use. For example, in the simple illustration one does not know the values of the quantities $f'''(\xi)$ or $\epsilon_f$ involved in the error estimates, and the idea of altering the value of $h$ and starting again must have seemed appalling. However, by present-day standards, it is a relatively simple matter to write a program which carries out all the time-consuming calculations that such an analysis requires. None of the functions in this chapter is particularly revolutionary in concept. They simply utilize the computer for the sort of task for which it was originally designed: carrying out the simple tedious calculations which are necessary to estimate the accuracy of the results of other even simpler tedious calculations.

(a) One immediate use to which a set of derivatives at a point is likely to be put is that of constructing a Taylor series representation:

$$f(x) = f(x_0) + \sum_{j=1}^{\infty} \frac{(x - x_0)^j}{j!}\,f^{(j)}(x_0)\text{.}$$

This series converges for $|x - x_0| < R_c$ (the radius of convergence), and it is only for these values of $x$ that such a series is likely to be used. In this case, in forming the sum, the required accuracy in $f^{(j)}(x_0)$ diminishes with increasing $j$.

The series formed from the Taylor series using elementary operations such as integration or differentiation has the same overall characteristic. A technique based on a Taylor series expansion may be quite accurate, even if the individual derivatives are not, so long as the less accurate derivatives are associated with known small coefficients.

The error introduced by using $n$ approximate derivatives $\mathbf{der}(j)$ is bounded by

$$\sum_{j=1}^{n} \frac{|x - x_0|^j}{j!}\,\mathbf{erest}(j)\text{.}$$

In this sort of application the accuracy of the result depends on the size of the errors in the numerical differentiation. There are other applications where the effect of large errors $\mathbf{erest}(j)$ is merely to prolong a calculation, but not to impair the final accuracy.
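As an illustration of (a), the following Python sketch (with made-up $\mathbf{der}$/$\mathbf{erest}$ values standing in for the output of a numerical differentiation function) forms the truncated Taylor sum and the corresponding error bound $\sum_{j=1}^{n} |x - x_0|^j\,\mathbf{erest}(j)/j!$.

```python
def taylor_sum(f0, der, x, x0):
    """Truncated Taylor series f(x0) + sum_j der(j) (x - x0)**j / j!."""
    s, term = f0, 1.0
    for j, d in enumerate(der, start=1):
        term *= (x - x0) / j        # running value of (x - x0)**j / j!
        s += d * term
    return s

def taylor_error_bound(erest, x, x0):
    """Bound on the error caused by using der(j) in place of f^(j)(x0)."""
    b, term = 0.0, 1.0
    for j, e in enumerate(erest, start=1):
        term *= abs(x - x0) / j
        b += e * term
    return b

# Hypothetical output for f = exp at x0 = 0, where f^(j)(0) = 1:
# each der(j) is off by 1e-6, and erest(j) = 2e-6 bounds that error.
n = 7
der = [1.0 + 1.0e-6 * (-1.0) ** j for j in range(1, n + 1)]
erest = [2.0e-6] * n

approx = taylor_sum(1.0, der, 0.5, 0.0)
ideal = taylor_sum(1.0, [1.0] * n, 0.5, 0.0)  # same sum, exact derivatives
bound = taylor_error_bound(erest, 0.5, 0.0)
```

The difference between the two sums is guaranteed to lie within the bound, however the individual errors happen to fall.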

(b) A simple Taylor series approach such as described in (a) is used to find a starting value for a rapidly converging but extremely local iterative process.

(c) The technique known as ‘subtracting out the singularity’ as a preliminary to numerical quadrature may be extended and may be carried out approximately. Thus suppose we are interested in evaluating

$$\int_0^1 x^{-1/2}\varphi(x)\,dx\text{.}$$

A direct approach using an integrand function of the form $x^{-1/2}\varphi(x)$ is generally accepted to be difficult for an automatic integrator because of the singularity. If $\varphi(x)$ and some of its derivatives at the singularity $x = 0$ are known, one may effectively ‘subtract out’ the singularity using the following identity:

$$\int_0^1 x^{-1/2}\varphi(x)\,dx = \int_0^1 x^{-1/2}\left(\varphi(x) - \varphi(0) - Ax - \tfrac{1}{2}Bx^2\right)dx + 2\varphi(0) + \tfrac{2}{3}A + \tfrac{1}{5}B \quad (2)$$

with $A = \varphi'(0)$ and $B = \varphi''(0)$.

The integrand function on the right of (2) has no singularity, but its third derivative does. Thus using numerical quadrature for this integral is much cheaper than using numerical quadrature for the original integral (on the left-hand side of (2)).

However, (2) is an identity whatever values of $A$ and $B$ are assigned. Thus the same procedure can be used with $A$ and $B$ being approximations to $\varphi'(0)$ and $\varphi''(0)$ provided by a numerical differentiation function. The integrand would now be more difficult to integrate than if $A$ and $B$ were correct, but still much less difficult than the original integrand (on the left-hand side of (2)). But, assuming that the automatic quadrature function is reliable, the overall result would still be correct. The effect of using approximate derivatives rather than exact derivatives does not impair the accuracy of the result. It simply makes the result more expensive to obtain, but not nearly as expensive as if no derivatives were used at all.
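The device in (c) can be checked numerically. In this illustrative Python sketch (not using the Library), $\varphi(x) = \cos x$, the values $A$ and $B$ come from crude finite differences (standing in for output from a numerical differentiation function), and the right-hand side of identity (2) is evaluated with a midpoint rule; a reference value is obtained from the substitution $x = t^2$, which removes the singularity exactly.

```python
import math

def midpoint(g, a, b, n):
    """Composite midpoint rule; never evaluates at the endpoints."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

phi = math.cos  # integrand is x**-0.5 * phi(x) on [0, 1]

# Approximate A ~ phi'(0) and B ~ phi''(0) by finite differences.
h = 1.0e-4
A = (phi(h) - phi(-h)) / (2.0 * h)
B = (phi(h) - 2.0 * phi(0.0) + phi(-h)) / h ** 2

def smoothed(x):
    """Integrand on the right of (2): the singularity has been removed."""
    return x ** -0.5 * (phi(x) - phi(0.0) - A * x - 0.5 * B * x * x)

rhs = (midpoint(smoothed, 0.0, 1.0, 4000)
       + 2.0 * phi(0.0) + 2.0 * A / 3.0 + B / 5.0)

# Reference: int_0^1 x**-0.5 cos(x) dx = 2 int_0^1 cos(t**2) dt.
ref = midpoint(lambda t: 2.0 * math.cos(t * t), 0.0, 1.0, 4000)
```

Because (2) is an identity for any $A$ and $B$, the approximate derivatives affect only how smooth the remaining integrand is, not the value the quadrature converges to.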

(d) The calculation of a definite integral may be based on the Euler–Maclaurin expansion

$$\int_0^1 f(x)\,dx = \frac{1}{m}\left(\tfrac{1}{2}f(0) + f\!\left(\tfrac{1}{m}\right) + \cdots + f\!\left(\tfrac{m-1}{m}\right) + \tfrac{1}{2}f(1)\right) - \sum_{s=1}^{l} \frac{B_{2s}}{(2s)!\,m^{2s}}\left(f^{(2s-1)}(1) - f^{(2s-1)}(0)\right) + \text{remainder,}$$

where $B_{2s}$ is a Bernoulli number. If one fixes a value of $l$, then as $m$ is increased the right-hand side (without the remainder term) approaches the true value of the integral. This statement remains true whatever values are used to replace $f^{(2s-1)}(1)$ and $f^{(2s-1)}(0)$. If no derivatives are available, and this formula is used (effectively with the derivatives replaced by zero), the rate of convergence is slow (like $m^{-2}$) and a large number of function evaluations may be used in calculating the trapezoidal rule sum for large $m$ before a sufficiently accurate result is attained. However, if approximate derivatives are used, the initial rate of convergence is enhanced. In fact, in this example any derivative approximation which is closer than the approximation zero is helpful. Thus the use of inaccurate derivatives may have the effect of shortening the overall calculation, since a sufficiently accurate result may be obtained using a smaller value of $m$, without impairing the accuracy of the result. (The resemblance to Gregory's formula is superficial. Here $l$ is kept fixed and $m$ is increased, ensuring a convergent process.)

The examples given above are only intended to illustrate the sort of use to which approximate derivatives may be put. Very simple illustrations have been used for clarity, and in such simple cases there are usually more efficient approaches to the problem. The same ideas applied in a more complicated or restrictive setting may provide an efficient approach to a problem for which no simple standard approach exists.
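The point of (d) can be demonstrated with a short Python sketch (illustrative only): for $f(x) = e^x$ on $[0,1]$, the plain trapezoidal rule with $m = 16$ is corrected by the $l = 1$ Euler–Maclaurin term using rough central-difference approximations to $f'(0)$ and $f'(1)$.

```python
import math

def trapezoid(f, m):
    """Trapezoidal rule approximation to int_0^1 f(x) dx with m panels."""
    s = 0.5 * (f(0.0) + f(1.0)) + sum(f(i / m) for i in range(1, m))
    return s / m

f, exact = math.exp, math.e - 1.0
m = 16

# Rough endpoint derivative approximations (stand-ins for der(1) values).
h = 1.0e-3
d0 = (f(0.0 + h) - f(0.0 - h)) / (2.0 * h)  # ~ f'(0) = 1
d1 = (f(1.0 + h) - f(1.0 - h)) / (2.0 * h)  # ~ f'(1) = e

plain = trapezoid(f, m)
# l = 1 correction: subtract B_2/(2! m^2) (f'(1) - f'(0)), with B_2 = 1/6.
corrected = plain - (d1 - d0) / (12.0 * m * m)

plain_err = abs(plain - exact)
corrected_err = abs(corrected - exact)
```

Even these crude derivative values push the error from the $m^{-2}$ term down to the $m^{-4}$ term; as the text notes, the approximation only has to beat the value zero.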

(a) At present, there is only one numerical differentiation algorithm available in this chapter, accessible using direct communication in nag_numdiff (d04aa) and using reverse communication in nag_numdiff_rcomm (d04ba). These are semi-automatic functions for obtaining approximations to the first fourteen derivatives of a real-valued function $f(x)$ at a specified point $x_0$.

For nag_numdiff (d04aa), you must provide a function representing $f(x)$, the value of $x_0$, an upper limit $n \le 14$ on the order of the derivatives required and a step length $h$. For nag_numdiff_rcomm (d04ba), you must supply the value of $x_0$, $20$ other abscissae and the function values at those abscissae. Both functions return a set of approximations $\mathbf{der}(j)$ and corresponding error estimates $\mathbf{erest}(j)$ which hopefully satisfy

$$|\mathbf{der}(j) - f^{(j)}(x_0)| \le \mathbf{erest}(j)\text{.}$$

It is important that the error estimate $\mathbf{erest}(j)$ be taken into consideration before any use of $\mathbf{der}(j)$ is made. The actual size of $\mathbf{erest}(j)$ depends on the analytic structure of the function, on the computational precision used and on the step size $h$, and is difficult to predict. Thus you have to run the function to find out how accurate the results are. Usually you will find that the higher-order derivatives are successively more inaccurate and that, past a certain order, say $10$ or $11$, the size of $\mathbf{erest}(j)$ actually exceeds $\mathbf{der}(j)$. Clearly when this happens the approximation $\mathbf{der}(j)$ is unusable.
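The der/erest output pattern can be imitated in a few lines of Python. This is a sketch only — d04aa's actual internal formulae are more sophisticated — but it shows the idea for the first derivative: central differences at $h, h/2, h/4, \dots$ are extrapolated, and the spread of the final extrapolates serves as a computable error estimate.

```python
import math

def first_derivative(f, x0, h, levels=6):
    """Return (der, erest) for f'(x0): Richardson extrapolation of
    central differences, with a crude built-in error estimate."""
    # t[i][0] = central difference with step h / 2**i.
    t = [[(f(x0 + h / 2 ** i) - f(x0 - h / 2 ** i)) / (2.0 * h / 2 ** i)]
         for i in range(levels)]
    # Neville-style elimination of the h**2, h**4, ... error terms.
    for k in range(1, levels):
        for i in range(levels - k):
            t[i].append((4 ** k * t[i + 1][k - 1] - t[i][k - 1])
                        / (4 ** k - 1))
    der = t[0][levels - 1]
    erest = abs(t[0][levels - 1] - t[0][levels - 2])  # spread of extrapolates
    return der, erest

der, erest = first_derivative(math.sin, 1.0, 0.1)
# der approximates cos(1); erest accompanies it as an error estimate.
```

Like the Library functions, this returns the estimate rather than enforcing a tolerance; the caller must inspect erest before trusting der.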

(b) There is available in the algorithm section of CACM (see Lyness and Sande (1971)) a semi-automatic Fortran function for numerical differentiation of an analytic function $f(z)$ at a possibly complex abscissa $z_0$. This is a stable problem. It can be used for any problem for which nag_numdiff (d04aa) might be used, and produces more accurate results and derivatives of arbitrarily high order. However, even if $z_0$ is real and $f(z)$ is real for real $z$, this function requires a user-supplied function which evaluates $f(z)$ for complex values of $z$, and it makes use of complex arithmetic.

(c) Functions are available in Chapter E02 to differentiate functions which are polynomials (in Chebyshev series representation) or cubic splines (in B-spline representation).

Lyness J N and Sande G (1971) Algorithm 413: ENTCAF and ENTCRE: evaluation of normalised Taylor coefficients of an analytic function *Comm. ACM* **14(10)** 669–675

Lyness J N and Moler C B (1967) Numerical differentiation of analytic functions *SIAM J. Numer. Anal.* **4(2)** 202–210

© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013