E02 Chapter Contents
E02 Chapter Introduction
NAG Library Manual

# NAG Library Routine DocumentE02JDF

Note:  before using this routine, please read the Users' Note for your implementation to check the interpretation of bold italicised terms and other implementation-dependent details.
Note: this routine uses optional parameters to define choices in the problem specification and in the details of the algorithm. If you wish to use default settings for all of the optional parameters, you need only read Sections 1 to 9 of this document. If, however, you wish to reset some or all of the settings please refer to Section 10 for a detailed description of the specification of the optional parameters produced by the routine.

## 1  Purpose

E02JDF computes a spline approximation to a set of scattered data using a two-stage approximation method.
The computational complexity of the method grows linearly with the number of data points; hence large datasets are easily accommodated.

## 2  Specification

 SUBROUTINE E02JDF ( N, X, Y, F, LSMINP, LSMAXP, NXCELS, NYCELS, LCOEFS, COEFS, IOPTS, OPTS, IFAIL)
 INTEGER N, LSMINP, LSMAXP, NXCELS, NYCELS, LCOEFS, IOPTS(*), IFAIL REAL (KIND=nag_wp) X(N), Y(N), F(N), COEFS(LCOEFS), OPTS(*)
Before calling E02JDF, E02ZKF must be called with OPTSTR set to ‘’. Settings for optional algorithmic parameters may be specified by calling E02ZKF before a call to E02JDF.

## 3  Description

E02JDF determines a smooth bivariate spline approximation to a set of data points $\left({x}_{\mathit{i}},{y}_{\mathit{i}},{f}_{\mathit{i}}\right)$, for $\mathit{i}=1,2,\dots ,n$. Here, ‘smooth’ means ${C}^{1}$.
The approximation domain is the bounding box $\left[{x}_{\mathrm{min}},{x}_{\mathrm{max}}\right]×\left[{y}_{\mathrm{min}},{y}_{\mathrm{max}}\right]$, where ${x}_{\mathrm{min}}$ (respectively ${y}_{\mathrm{min}}$) and ${x}_{\mathrm{max}}$ (respectively ${y}_{\mathrm{max}}$) denote the lowest and highest data values of the $\left({x}_{i}\right)$ (respectively $\left({y}_{i}\right)$).
The spline is computed by local approximations on a uniform triangulation of the bounding box. These approximations are extended to a smooth spline representation of the surface over the domain. The local approximation scheme is by least-squares polynomials (Davydov and Zeilfelder (2004)).
The two-stage approximation method employed by E02JDF is derived from the TSFIT package of O. Davydov and F. Zeilfelder.
Values of the computed spline can subsequently be computed by calling E02JEF or E02JFF.

## 4  References

Davydov O and Zeilfelder F (2004) Scattered data fitting by direct extension of local polynomials to bivariate splines Advances in Comp. Math. 21 223–271

## 5  Parameters

1:     N – INTEGERInput
On entry: $n$, the number of data values to be fitted.
Constraint: ${\mathbf{N}}>1$.
2:     X(N) – REAL (KIND=nag_wp) arrayInput
3:     Y(N) – REAL (KIND=nag_wp) arrayInput
4:     F(N) – REAL (KIND=nag_wp) arrayInput
On entry: the $\left({x}_{i},{y}_{i},{f}_{i}\right)$ data values to be fitted.
Constraint: ${\mathbf{X}}\left(j\right)\ne {\mathbf{X}}\left(1\right)$ for some $j=2,\dots ,n$ and ${\mathbf{Y}}\left(k\right)\ne {\mathbf{Y}}\left(1\right)$ for some $k=2,\dots ,n$; i.e., there are at least two distinct $x$ and $y$ values.
5:     LSMINP – INTEGERInput
6:     LSMAXP – INTEGERInput
On entry: are control parameters for the local approximations.
Each local approximation is computed on a local domain containing one of the triangles in the discretization of the bounding box. The size of each local domain will be adaptively chosen such that if it contains fewer than LSMINP sample points it is expanded, else if it contains greater than LSMAXP sample points a thinning method is applied. LSMAXP mainly controls computational cost (in that working with a thinned set of points is cheaper and may be appropriate if the input data is densely distributed), while LSMINP allows handling of different types of scattered data.
Setting ${\mathbf{LSMAXP}}<{\mathbf{LSMINP}}$, and therefore forcing either expansion or thinning, may be useful for computing initial coarse approximations. In general smaller values for these arguments reduces cost.
A calibration procedure (experimenting with a small subset of the data to be fitted and validating the results) may be needed to choose the most appropriate values for LSMINP and LSMAXP.
Constraints:
• $1\le {\mathbf{LSMINP}}\le {\mathbf{N}}$;
• ${\mathbf{LSMAXP}}\ge 1$.
7:     NXCELS – INTEGERInput
8:     NYCELS – INTEGERInput
On entry: NXCELS (respectively NYCELS) is the number of cells in the $x$ (respectively $y$) direction that will be used to create the triangulation of the bounding box of the domain of the function to be fitted.
Greater efficiency generally comes when NXCELS and NYCELS are chosen to be of the same order of magnitude and are such that N is $\mathit{O}\left({\mathbf{NXCELS}}×{\mathbf{NYCELS}}\right)$. Thus for a ‘square’ triangulation — when ${\mathbf{NXCELS}}={\mathbf{NYCELS}}$ — the quantities $\sqrt{{\mathbf{N}}}$ and NXCELS should be of the same order of magnitude. See also Section 8.
Constraints:
• ${\mathbf{NXCELS}}\ge 1$;
• ${\mathbf{NYCELS}}\ge 1$.
9:     LCOEFS – INTEGERInput
10:   COEFS(LCOEFS) – REAL (KIND=nag_wp) arrayOutput
On exit: if ${\mathbf{IFAIL}}={\mathbf{0}}$ on exit, COEFS contains the computed spline coefficients.
Constraint: ${\mathbf{LCOEFS}}\ge \left(\left(\left({\mathbf{NXCELS}}+2\right)×\left({\mathbf{NYCELS}}+2\right)+1\right)/2\right)×10+1$.
11:   IOPTS($*$) – INTEGER arrayCommunication Array
On entry: the contents of IOPTS must not be modified in any way either directly or indirectly, by further calls to E02ZKF, before calling either or both of the evaluation routines E02JEF and E02JFF.
12:   OPTS($*$) – REAL (KIND=nag_wp) arrayCommunication Array
On entry: the contents of OPTS must not be modified in any way either directly or indirectly, by further calls to E02ZKF, before calling either or both of the evaluation routines E02JEF and E02JFF.
13:   IFAIL – INTEGERInput/Output
On entry: IFAIL must be set to $0$, $-1\text{​ or ​}1$. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1\text{​ or ​}1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this parameter, the recommended value is $0$. When the value $-\mathbf{1}\text{​ or ​}\mathbf{1}$ is used it is essential to test the value of IFAIL on exit.
On exit: ${\mathbf{IFAIL}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).

## 6  Error Indicators and Warnings

If on entry ${\mathbf{IFAIL}}={\mathbf{0}}$ or $-{\mathbf{1}}$, explanatory error messages are output on the current error message unit (as defined by X04AAF).
Errors or warnings detected by the routine:
${\mathbf{IFAIL}}=2$
On entry, ${\mathbf{N}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{N}}>1$.
${\mathbf{IFAIL}}=4$
On entry, ${\mathbf{LSMINP}}=⟨\mathit{\text{value}}⟩$ and ${\mathbf{N}}=⟨\mathit{\text{value}}⟩$.
Constraint: $1\le {\mathbf{LSMINP}}\le {\mathbf{N}}$.
${\mathbf{IFAIL}}=5$
On entry, ${\mathbf{LSMAXP}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{LSMAXP}}\ge 1$.
${\mathbf{IFAIL}}=6$
On entry, ${\mathbf{NXCELS}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{NXCELS}}\ge 1$.
${\mathbf{IFAIL}}=7$
On entry, ${\mathbf{NYCELS}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{NYCELS}}\ge 1$.
${\mathbf{IFAIL}}=8$
On entry, ${\mathbf{LCOEFS}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{LCOEFS}}\ge \left(\left(\left({\mathbf{NXCELS}}+2\right)×\left({\mathbf{NYCELS}}+2\right)+1\right)/2\right)×10+1$.
${\mathbf{IFAIL}}=9$
Option arrays are not initialized or are corrupted.
${\mathbf{IFAIL}}=11$
${\mathbf{IFAIL}}=12$
On entry, all elements of X or of Y are equal.
${\mathbf{IFAIL}}=21$
The value of optional parameter Polynomial Starting Degree was invalid.
${\mathbf{IFAIL}}=-999$
Dynamic memory allocation failed.

## 7  Accuracy

Technical results on error bounds can be found in Davydov and Zeilfelder (2004).
Local approximation by polynomials of degree $d$ for $n$ data points has optimal approximation order ${n}^{-\left(d+1\right)/2}$.
The approximation error for ${C}^{1}$ global smoothing is $\mathit{O}\left({n}^{-2}\right)$.
Whether maximal accuracy is achieved depends on the distribution of the input data and the choices of the algorithmic parameters. The reference above contains extensive numerical tests and further technical discussions of how best to configure the method.

$n$-linear complexity and memory usage can be attained for sufficiently dense input data if the triangulation parameters NXCELS and NYCELS are chosen as recommended in their descriptions above. For sparse input data on such triangulations, if many expansion steps are required (see LSMINP) the complexity may rise to be loglinear.

## 9  Example

The Franke function
 $fx,y = 0.75 ⁢ exp - 9⁢x-2 2 + 9⁢y-2 2 / 4 + 0.75 ⁢ exp - 9⁢x+1 2 / 49 - 9⁢y+1 / 10 + 0.5 ⁢ exp - 9⁢x-7 2 + 9⁢y-3 2 / 4 - 0.2 ⁢ exp - 9⁢x-4 2 - 9⁢y-7 2$
is widely used for testing surface-fitting methods. The example program randomly generates a number of points on this surface. From these a spline is computed and then evaluated at a vector of points and on a mesh.

### 9.1  Program Text

Program Text (e02jdfe.f90)

### 9.2  Program Data

Program Data (e02jdfe.d)

### 9.3  Program Results

Program Results (e02jdfe.r)

## 10  Optional Parameters

Several optional parameters in E02JDF control aspects of the algorithm, methodology used, logic or output. Their values are contained in the arrays IOPTS and OPTS; these must be initialized before calling E02JDF by first calling E02ZKF with OPTSTR set to ‘’.
Each optional parameter has an associated default value; to set any of them to a nondefault value, or to reset any of them to the default value, use E02ZKF. The current value of an optional parameter can be queried using E02ZLF.
The remainder of this section can be skipped if you wish to use the default values for all optional parameters.
The following is a list of the optional parameters available. A full description of each optional parameter is provided in Section 10.1.

### 10.1  Description of the Optional Parameters

For each option, we give a summary line, a description of the optional parameter and details of constraints.
The summary line contains:
• the keywords;
• a parameter value, where the letters $a$, $i\text{​ and ​}r$ denote options that take character, integer and real values respectively;
• the default value.
Keywords and character values are case insensitive.
For E02JDF the maximum length of the parameter CVALUE used by E02ZLF is $6$.
 Averaged Spline $a$ Default $\text{}=\text{'NO'}$
When the bounding box is triangulated there are 8 equivalent configurations of the mesh. Setting ${\mathbf{Averaged Spline}}=\text{'YES'}$ will use the averaged value of the $8$ possible local polynomial approximations over each triangle in the mesh. This usually gives better results but at (about 8 times) higher computational cost.
Constraint: ${\mathbf{Averaged Spline}}=\text{'YES'}$ or $\text{'NO'}$.
 Minimum Singular Value LPA $r$ Default $\text{}=1.0$
A tolerance measure for accepting or rejecting a local polynomial approximation (LPA) as reliable.
The solution of a local least squares problem solved on each triangle subdomain is accepted as reliable if the minimum singular value $\sigma$ of the matrix (of Bernstein polynomial values) associated with the least squares problem satisfies ${\mathbf{Minimum Singular Value LPA}}\le \sigma$.
In general the approximation power will be reduced as Minimum Singular Value LPA is reduced. (A small $\sigma$ indicates that the local data has hidden redundancies which prevent it from carrying enough information for a good approximation to be made.) Setting Minimum Singular Value LPA very large may have the detrimental effect that only approximations of low degree are deemed reliable.
Minimum Singular Value LPA will have no effect if ${\mathbf{Polynomial Starting Degree}}=0$, and it will have little effect if the input data is ‘smooth’ (e.g., from a known function).
A calibration procedure (experimenting with a small subset of the data to be fitted and validating the results) may be needed to choose the most appropriate value for this parameter.
Constraint: ${\mathbf{Minimum Singular Value LPA}}\ge 0.0$.
 Polynomial Starting Degree $i$ Default $\text{}=1$
The degree to be used in the initial step of each local polynomial approximation.
At the initial step the method will attempt to fit with local polynomials of degree Polynomial Starting Degree. If the approximation is deemed unreliable (according to Minimum Singular Value LPA), the degree will be decremented by one and a new local approximation computed, ending with a constant approximation if no other is reliable.
Polynomial Starting Degree is bounded from above by the maximum possible spline degree, $3$.
The default value gives a good compromise between efficiency and accuracy. In general the best approximation can be obtained by setting ${\mathbf{Polynomial Starting Degree}}=3$.
Constraint: $0\le {\mathbf{Polynomial Starting Degree}}\le 3$.