NAG C Library Manual

NAG Library Function Document

nag_2d_cheb_fit_lines (e02cac)

1  Purpose

nag_2d_cheb_fit_lines (e02cac) forms an approximation to the weighted, least squares Chebyshev series surface fit to data arbitrarily distributed on lines parallel to one independent coordinate axis.

2  Specification

#include <nag.h>
#include <nage02.h>
void  nag_2d_cheb_fit_lines (const Integer m[], Integer n, Integer k, Integer l, const double x[], const double y[], const double f[], const double w[], double a[], const double xmin[], const double xmax[], const double nux[], Integer inuxp1, const double nuy[], Integer inuyp1, NagError *fail)

3  Description

nag_2d_cheb_fit_lines (e02cac) determines a bivariate polynomial approximation of degree $k$ in $x$ and $l$ in $y$ to the set of data points $(x_{r,s}, y_s, f_{r,s})$, with weights $w_{r,s}$, for $s = 1,2,\ldots,n$ and $r = 1,2,\ldots,m_s$. That is, the data points are on lines $y = y_s$, but the $x$ values may be different on each line. The values of $k$ and $l$ are prescribed by you (for guidance on their choice, see Section 8). The function is based on the method described in Sections 5 and 6 of Clenshaw and Hayes (1965).
The polynomial is represented in double Chebyshev series form with arguments $\bar{x}$ and $\bar{y}$. The arguments lie in the range $-1$ to $+1$ and are related to the original variables $x$ and $y$ by the transformations
$$\bar{x} = \frac{2x - (x_{\max} + x_{\min})}{x_{\max} - x_{\min}} \quad \text{and} \quad \bar{y} = \frac{2y - (y_{\max} + y_{\min})}{y_{\max} - y_{\min}}.$$
Here $y_{\max}$ and $y_{\min}$ are set by the function to, respectively, the largest and smallest value of $y_s$, but $x_{\max}$ and $x_{\min}$ are functions of $y$ prescribed by you (see Section 8). For this function, only their values $x_{\max}^{(s)}$ and $x_{\min}^{(s)}$ at each $y = y_s$ are required. For each $s = 1,2,\ldots,n$, $x_{\max}^{(s)}$ must not be less than the largest $x_{r,s}$ on the line $y = y_s$, and, similarly, $x_{\min}^{(s)}$ must not be greater than the smallest $x_{r,s}$.
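The transformation to the normalized arguments can be sketched in plain C as follows. This is an illustrative helper only, not part of the NAG Library: the names cheb_normalize and cheb_normalize_lines are hypothetical, and int is used in place of the library's Integer type.

```c
#include <assert.h>
#include <stddef.h>

/* Map a value in [lo, hi] to the normalized argument in [-1, +1]:
 *   xbar = (2x - (hi + lo)) / (hi - lo)
 * The same formula serves for y with lo = ymin and hi = ymax. */
double cheb_normalize(double x, double lo, double hi)
{
    assert(hi > lo);  /* mirrors the constraint xmax[s-1] > xmin[s-1] */
    return (2.0 * x - (hi + lo)) / (hi - lo);
}

/* Normalize every x value on every line using the per-line range
 * xmin[s], xmax[s]. The array x holds the lines back to back, with
 * m[s] points on line s (zero-based s here). */
void cheb_normalize_lines(const int m[], int n, const double x[],
                          const double xmin[], const double xmax[],
                          double xbar[])
{
    size_t pos = 0;
    for (int s = 0; s < n; ++s)
        for (int r = 0; r < m[s]; ++r, ++pos)
            xbar[pos] = cheb_normalize(x[pos], xmin[s], xmax[s]);
}
```

With a range of 0 to 4, for instance, x = 0 maps to -1, x = 2 maps to 0 and x = 4 maps to +1.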
The double Chebyshev series can be written as
$$\sum_{i=0}^{k} \sum_{j=0}^{l} a_{ij} T_i(\bar{x}) T_j(\bar{y})$$
where $T_i(\bar{x})$ is the Chebyshev polynomial of the first kind of degree $i$ with argument $\bar{x}$, and $T_j(\bar{y})$ is similarly defined. However, the standard convention, followed in this function, is that coefficients in the above expression which have either $i$ or $j$ zero are written as $\frac{1}{2} a_{ij}$, instead of simply $a_{ij}$, and the coefficient with both $i$ and $j$ equal to zero is written as $\frac{1}{4} a_{0,0}$. The series with coefficients output by the function should be summed using this convention. nag_2d_cheb_eval (e02cbc) is available to compute values of the fitted function from these coefficients.
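To make the halving convention concrete, the series can be summed directly with the three-term recurrence $T_{j+1}(t) = 2t\,T_j(t) - T_{j-1}(t)$. The sketch below is not nag_2d_cheb_eval (e02cbc) and makes no claim about how that function works internally. The name cheb2d_eval is hypothetical, the arguments are assumed to be already normalized to the range -1 to +1, and the coefficient $a_{ij}$ is assumed to be stored row by row at a[i*(l+1)+j].

```c
#include <assert.h>

/* Sum  a_ij * T_i(xbar) * T_j(ybar)  over i = 0..k and j = 0..l, where
 * terms with i == 0 or j == 0 each carry a factor 1/2 (so a_00 carries
 * 1/4). Assumed layout: a_ij is stored at a[i*(l+1)+j]. */
double cheb2d_eval(const double a[], int k, int l, double xbar, double ybar)
{
    assert(k >= 0 && l >= 0);
    double total = 0.0;
    double ti = 1.0, ti_next = xbar;      /* T_i and T_{i+1} in xbar */
    for (int i = 0; i <= k; ++i) {
        double row = 0.0;
        double tj = 1.0, tj_next = ybar;  /* T_j and T_{j+1} in ybar */
        for (int j = 0; j <= l; ++j) {
            double wj = (j == 0) ? 0.5 : 1.0;
            row += wj * a[i * (l + 1) + j] * tj;
            double t = 2.0 * ybar * tj_next - tj;  /* T_{j+2} */
            tj = tj_next;
            tj_next = t;
        }
        double wi = (i == 0) ? 0.5 : 1.0;
        total += wi * row * ti;
        double t = 2.0 * xbar * ti_next - ti;      /* T_{i+2} */
        ti = ti_next;
        ti_next = t;
    }
    return total;
}
```

For example, with k = l = 1 and coefficients a00 = 4, a01 = 2, a10 = 2, a11 = 1, the value at (0.5, 0.25) is 4/4 + (2/2)(0.25) + (2/2)(0.5) + (0.5)(0.25) = 1.875.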
The function first obtains Chebyshev series coefficients $c_{s,i}$, for $i = 0,1,\ldots,k$, of the weighted least squares polynomial curve fit of degree $k$ in $\bar{x}$ to the data on each line $y = y_s$, for $s = 1,2,\ldots,n$, in turn, using an auxiliary function. The same function is then called $k+1$ times to fit $c_{s,i}$, for $s = 1,2,\ldots,n$, by a polynomial of degree $l$ in $\bar{y}$, for each $i = 0,1,\ldots,k$. The resulting coefficients are the required $a_{ij}$.
You can force the fit to contain a given polynomial factor. This allows for the surface fit to be constrained to have specified values and derivatives along the boundaries $x = x_{\min}$, $x = x_{\max}$, $y = y_{\min}$ and $y = y_{\max}$ or indeed along any lines $\bar{x} =$ constant or $\bar{y} =$ constant (see Section 8 of Clenshaw and Hayes (1965)).

4  References

Clenshaw C W and Hayes J G (1965) Curve and surface fitting J. Inst. Math. Appl. 1 164–183
Hayes J G (ed.) (1970) Numerical Approximation to Functions and Data Athlone Press, London

5  Arguments

1:     m[n]    const Integer    Input
On entry: m[s-1] must be set to $m_s$, the number of data $x$ values on the line $y = y_s$, for $s = 1,2,\ldots,n$.
Constraint: $\mathrm{m}[s-1] > 0$, for $s = 1,2,\ldots,n$.
2:     n    Integer    Input
On entry: the number of lines $y =$ constant on which data points are given.
Constraint: $n > 0$.
3:     k    Integer    Input
On entry: $k$, the required degree of $x$ in the fit.
Constraint: for $s = 1,2,\ldots,n$, $\mathrm{inuxp1} - 1 \le k < \mathit{mdist}_s + \mathrm{inuxp1} - 1$, where $\mathit{mdist}_s$ is the number of distinct $x$ values with nonzero weight on the line $y = y_s$. See Section 8.
4:     l    Integer    Input
On entry: $l$, the required degree of $y$ in the fit.
Constraints:
  • $l \ge 0$;
  • $\mathrm{inuyp1} - 1 \le l < n + \mathrm{inuyp1} - 1$.
5:     x[dim]    const double    Input
Note: the dimension, dim, of the array x must be at least $\sum_{s=1}^{n} \mathrm{m}[s-1]$.
On entry: the $x$ values of the data points. The sequence must be
  • all points on $y = y_1$, followed by
  • all points on $y = y_2$, followed by
  • $\vdots$
  • all points on $y = y_n$.
Constraint: for each $y_s$, the $x$ values must be in nondecreasing order.
6:     y[n]    const double    Input
On entry: y[s-1] must contain the $y$ value of line $y = y_s$, for $s = 1,2,\ldots,n$, on which data is given.
Constraint: the $y_s$ values must be in strictly increasing order.
7:     f[dim]    const double    Input
Note: the dimension, dim, of the array f must be at least $\sum_{s=1}^{n} \mathrm{m}[s-1]$.
On entry: $f$, the data values of the dependent variable in the same sequence as the $x$ values.
8:     w[dim]    const double    Input
Note: the dimension, dim, of the array w must be at least $\sum_{s=1}^{n} \mathrm{m}[s-1]$.
On entry: the weights to be assigned to the data points, in the same sequence as the $x$ values. These weights should be calculated from estimates of the absolute accuracies of the $f_r$, expressed as standard deviations, probable errors or some other measure which is of the same dimensions as $f_r$. Specifically, each $w_r$ should be inversely proportional to the accuracy estimate of $f_r$. Often weights all equal to unity will be satisfactory. If a particular weight is zero, the corresponding data point is omitted from the fit.
9:     a[dim]    double    Output
Note: the dimension, dim, of the array a must be at least $(k+1) \times (l+1)$.
On exit: contains the Chebyshev coefficients of the fit. a[$i \times (l+1) + j$] is the coefficient $a_{ij}$ of Section 3, for $i = 0,1,\ldots,k$ and $j = 0,1,\ldots,l$, defined according to the standard convention. These coefficients are used by nag_2d_cheb_eval (e02cbc) to calculate values of the fitted function.
10:   xmin[n]    const double    Input
On entry: xmin[s-1] must contain $x_{\min}^{(s)}$, the lower end of the range of $x$ on the line $y = y_s$, for $s = 1,2,\ldots,n$. It must not be greater than the lowest data value of $x$ on the line. Each $x_{\min}^{(s)}$ is scaled to $-1.0$ in the fit. (See also Section 8.)
11:   xmax[n]    const double    Input
On entry: xmax[s-1] must contain $x_{\max}^{(s)}$, the upper end of the range of $x$ on the line $y = y_s$, for $s = 1,2,\ldots,n$. It must not be less than the highest data value of $x$ on the line. Each $x_{\max}^{(s)}$ is scaled to $+1.0$ in the fit. (See also Section 8.)
Constraint: $\mathrm{xmax}[s-1] > \mathrm{xmin}[s-1]$.
12:   nux[inuxp1]    const double    Input
On entry: nux[i-1] must contain the coefficient of the Chebyshev polynomial of degree $i-1$ in $\bar{x}$, in the Chebyshev series representation of the polynomial factor in $\bar{x}$ which you require the fit to contain, for $i = 1,2,\ldots,\mathrm{inuxp1}$. These coefficients are defined according to the standard convention of Section 3.
Constraint: nux[inuxp1-1] must be nonzero, unless inuxp1 = 1, in which case nux is ignored.
13:   inuxp1    Integer    Input
On entry: $\mathit{inux} + 1$, where $\mathit{inux}$ is the degree of a polynomial factor in $\bar{x}$ which you require the fit to contain. (See Section 3, last paragraph.)
If this option is not required, inuxp1 should be set equal to 1.
Constraint: $1 \le \mathrm{inuxp1} \le k + 1$.
14:   nuy[inuyp1]    const double    Input
On entry: nuy[i-1] must contain the coefficient of the Chebyshev polynomial of degree $i-1$ in $\bar{y}$, in the Chebyshev series representation of the polynomial factor which you require the fit to contain, for $i = 1,2,\ldots,\mathrm{inuyp1}$. These coefficients are defined according to the standard convention of Section 3.
Constraint: nuy[inuyp1-1] must be nonzero, unless inuyp1 = 1, in which case nuy is ignored.
15:   inuyp1    Integer    Input
On entry: $\mathit{inuy} + 1$, where $\mathit{inuy}$ is the degree of a polynomial factor in $\bar{y}$ which you require the fit to contain. (See Section 3, last paragraph.)
If this option is not required, inuyp1 should be set equal to 1.
16:   fail    NagError *    Input/Output
The NAG error argument (see Section 3.6 in the Essential Introduction).
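Several of the constraints on the data arrays can be checked before the call. The following is a minimal sketch of such a pre-check in plain C, assuming the packed line-by-line layout described for x. The function name check_lines_input and its return codes are hypothetical and are not the library's own error handling. It uses int in place of Integer and does not test the constraints that involve k, l, the weights or the polynomial factors.

```c
#include <assert.h>
#include <stddef.h>

/* Return 0 if the layout satisfies the constraints checked here,
 * otherwise a small positive code for the first violation found. */
int check_lines_input(const int m[], int n, const double x[],
                      const double y[], const double xmin[],
                      const double xmax[])
{
    assert(m && x && y && xmin && xmax);
    if (n <= 0) return 1;                             /* need n > 0 */
    size_t pos = 0;                                   /* index into packed x */
    for (int s = 0; s < n; ++s) {
        if (m[s] <= 0) return 2;                      /* need m[s] > 0 */
        if (s > 0 && !(y[s] > y[s - 1])) return 3;    /* y strictly increasing */
        if (!(xmax[s] > xmin[s])) return 4;           /* need xmax > xmin */
        for (int r = 0; r < m[s]; ++r, ++pos) {
            if (r > 0 && x[pos] < x[pos - 1]) return 5;  /* x nondecreasing on the line */
            if (x[pos] < xmin[s] || x[pos] > xmax[s])
                return 6;                             /* range must span the data */
        }
    }
    return 0;
}
```

On success, pos has advanced through exactly the sum of the m[s] entries, which is also the minimum length required for x, f and w.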

6  Error Indicators and Warnings

NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, argument value had an illegal value.
NE_INT
On entry, inuxp1=value.
Constraint: $\mathrm{inuxp1} \ge 1$.
On entry, inuyp1=value.
Constraint: $\mathrm{inuyp1} \ge 1$.
On entry, k=value.
Constraint: $k \ge 0$.
On entry, l=value.
Constraint: $l \ge 0$.
On entry, n=value.
Constraint: $n > 0$.
NE_INT_2
On entry, inuxp1=value and k=value.
Constraint: $\mathrm{inuxp1} \le k + 1$.
On entry, inuyp1=value and l=value.
Constraint: $\mathrm{inuyp1} \le l + 1$.
NE_INT_3
On entry, n=value, l=value and inuyp1=value.
Constraint: $\mathrm{inuyp1} - 1 \le l < n + \mathrm{inuyp1} - 1$.
On entry, n=value, l=value and inuyp1=value.
Constraint: $l \ge 0$ and
On entry, n=value, l=value and inuyp1=value.
Constraint: $n \ge l - \mathrm{inuyp1} + 2$.
NE_INT_ARRAY
On entry, i=value, m[i-1]=value, k=value and inuxp1=value.
Constraint: $\mathrm{m}[i-1] \ge k - \mathrm{inuxp1} + 2$.
On entry, inuxp1=value, nux[inuxp1-1]=value, inuyp1=value and nuy[inuyp1-1]=value.
Constraint: if nux[inuxp1-1]=0.0, inuxp1=1; if nuy[inuyp1-1]=0.0, inuyp1=1.
On entry, n=value and m[s-1]=value.
Constraint: $\mathrm{m}[s-1] > 0$, for $s = 1,2,\ldots,n$.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_NON_ZERO_WEIGHTS
On entry, the number of distinct x values with nonzero weight on y=y[i-1] is less than k-inuxp1+2: i=value, y[i-1]=value, k=value and inuxp1=value.
NE_NOT_NON_DECREASING
On entry, the data x values are not nondecreasing for y=y[i-1]: i=value and y[i-1]=value.
NE_NOT_STRICTLY_INCREASING
On entry, i=value, y[i-1]=value and y[i-2]=value.
Constraint: y[i-1]>y[i-2].
NE_REAL_ARRAY
On entry, xmin[i-1] and xmax[i-1] do not span the data x values on y=y[i-1]: i=value, xmin[i-1]=value, xmax[i-1]=value and y[i-1]=value.

7  Accuracy

No error analysis for this method has been published. Practical experience with the method, however, is generally extremely satisfactory.

8  Further Comments

The time taken is approximately proportional to $k \times \left( k \times \sum_{s=1}^{n} \mathrm{m}[s-1] + n \times l^2 \right)$.
The reason for allowing $x_{\max}$ and $x_{\min}$ (which are used to normalize the range of $x$) to vary with $y$ is that unsatisfactory fits can result if the highest (or lowest) data values of the normalized $x$ on each line $y = y_s$ are not approximately the same. (For an explanation of this phenomenon, see page 176 of Clenshaw and Hayes (1965).) Commonly in practice, the lowest (for example) data values $x_{1,s}$, while not being approximately constant, do lie close to some smooth curve in the $(x,y)$ plane. Using values from this curve as the values of $x_{\min}$, different in general on each line, causes the lowest transformed data values $\bar{x}_{1,s}$ to be approximately constant. Sometimes, appropriate curves for $x_{\max}$ and $x_{\min}$ will be clear from the context of the problem (they need not be polynomials). If this is not the case, suitable curves can often be obtained by fitting low degree polynomials in $y$ to the lowest data values $x_{1,s}$ and to the corresponding highest data values of $x$, using function nag_1d_cheb_fit (e02adc), and then shifting the two curves outwards by a small amount so that they just contain all the data between them. The complete curves are not in fact supplied to the present function, only their values at each $y_s$; and the values simply need to lie on smooth curves. More values on the complete curves will be required subsequently, when computing values of the fitted surface at arbitrary $y$ values.
Naturally, a satisfactory approximation to the surface underlying the data cannot be expected if the character of the surface is not adequately represented by the data. Also, as always with polynomials, the approximating function may exhibit unwanted oscillations (particularly near the ends of the ranges) if the degrees $k$ and $l$ are taken greater than certain values, which are generally unknown. The total number of coefficients, $(k+1) \times (l+1)$, should be significantly smaller than (say, not more than half) the total number of data points. Similarly, $k+1$ should be significantly smaller than most (preferably all) of the $m_s$, and $l+1$ significantly smaller than $n$. Closer spacing of the data near the ends of the $x$ and $y$ ranges is an advantage. In particular, if $\bar{y}_s = -\cos\left(\pi (s-1)/(n-1)\right)$, for $s = 1,2,\ldots,n$, and $\bar{x}_{r,s} = -\cos\left(\pi (r-1)/(m-1)\right)$, for $r = 1,2,\ldots,m$ (thus $m_s = m$ for all $s$), then the values $k = m-1$ and $l = n-1$ (so that the polynomial passes exactly through all the data points) should not give unwanted oscillations. Other datasets should be similarly satisfactory if they are everywhere at least as closely spaced as the above cosine values with $m$ replaced by $k+1$ and $n$ by $l+1$ (more precisely, if for every $s$ the largest interval between consecutive values of $\arccos \bar{x}_{r,s}$, for $r = 1,2,\ldots,m$, is not greater than $\pi/k$, and similarly for the $\bar{y}_s$). The polynomial obtained should always be examined graphically before acceptance. Note that for this purpose it is not sufficient to plot the polynomial only at the data values of $x$ and $y$: intermediate values should also be plotted, preferably via a graphics facility.
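The rough size guidance in the preceding paragraph can be turned into a quick screening test. The sketch below is only one possible reading of that guidance: the name degrees_look_reasonable is hypothetical, "not more than half" is applied literally to the total coefficient count, and "significantly smaller" is weakened to "strictly smaller", which is an assumption made here for simplicity. It deliberately rejects the exact-interpolation case on cosine-spaced data, which the text treats as acceptable.

```c
#include <assert.h>

/* Return 1 if the degrees k and l pass the rough size guidance for
 * data with m[s] points on each of n lines, and 0 otherwise. */
int degrees_look_reasonable(const int m[], int n, int k, int l)
{
    assert(n > 0 && k >= 0 && l >= 0);
    long total = 0;          /* total number of data points */
    int min_m = m[0];        /* fewest points on any line */
    for (int s = 0; s < n; ++s) {
        total += m[s];
        if (m[s] < min_m) min_m = m[s];
    }
    long ncoef = (long)(k + 1) * (l + 1);
    return 2 * ncoef <= total   /* coefficients at most half the data points */
        && k + 1 < min_m        /* fewer x coefficients than points on every line */
        && l + 1 < n;           /* fewer y coefficients than lines */
}
```

A fit that passes this screen should still be examined graphically, since the test says nothing about how the data are spaced.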
Provided the data are adequate, and the surface underlying the data is of a form that can be represented by a polynomial of the chosen degrees, the function should produce a good approximation to this surface. It is not, however, the true least squares surface fit nor even a polynomial in $x$ and $y$, the original variables (see Section 6 of Clenshaw and Hayes (1965)), except in certain special cases. The most important of these is where the data values of $x$ are the same on each line $y = y_s$ (i.e., the data points lie on a rectangular mesh in the $(x,y)$ plane), the weights of the data points are all equal, and $x_{\max}$ and $x_{\min}$ are both constants (in this case they should be set to the largest and smallest data values of $x$, respectively).
If the dataset is such that it can be satisfactorily approximated by a polynomial of degrees $k'$ and $l'$, say, then if higher values are used for $k$ and $l$ in the function, all the coefficients $a_{ij}$ for $i > k'$ or $j > l'$ will take apparently random values within a range bounded by the size of the data errors, or rather less. (This behaviour of the Chebyshev coefficients, most readily observed if they are set out in a rectangular array, closely parallels that in curve-fitting, examples of which are given in Section 8 of Hayes (1970).) In practice, therefore, to establish suitable values of $k'$ and $l'$, you should first seek (within the limitations discussed above) values for $k$ and $l$ which are large enough to exhibit the behaviour described. Values for $k'$ and $l'$ should then be chosen as the smallest which do not exclude any coefficients significantly larger than the random ones. A polynomial of degrees $k'$ and $l'$ should then be fitted to the data.
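The selection rule just described can be sketched as a scan over the coefficient array from a trial fit with generous degrees. This is an illustration under stated assumptions, not a library facility: the name suggest_degrees is hypothetical, the coefficient $a_{ij}$ is assumed to be stored at a[i*(l+1)+j], and the threshold noise (the size of the apparently random coefficients) has to be judged by you from the printed array.

```c
#include <assert.h>

/* Find the smallest degrees (*kp, *lp) such that every coefficient a_ij
 * with i > *kp or j > *lp has magnitude at most noise.
 * Assumed layout: a_ij is stored at a[i*(l+1)+j]. */
void suggest_degrees(const double a[], int k, int l, double noise,
                     int *kp, int *lp)
{
    assert(k >= 0 && l >= 0 && kp && lp);
    *kp = 0;
    *lp = 0;
    for (int i = 0; i <= k; ++i) {
        for (int j = 0; j <= l; ++j) {
            double v = a[i * (l + 1) + j];
            if (v < 0.0) v = -v;           /* magnitude of the coefficient */
            if (v > noise) {               /* significant: must be kept */
                if (i > *kp) *kp = i;
                if (j > *lp) *lp = j;
            }
        }
    }
}
```

The suggested degrees are a starting point for a refit; they do not account for any forced polynomial factor or for the constraints on k and l given in Section 5.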
If the option to force the fit to contain a given polynomial factor in x is used and if zeros of the chosen factor coincide with data x values on any line, then the effective number of data points on that line is reduced by the number of such coincidences. A similar consideration applies when forcing the y-direction. No account is taken of this by the function when testing that the degrees k and l have not been chosen too large.

9  Example

This example reads data in the following order, using the notation of the argument list for nag_2d_cheb_fit_lines (e02cac) above:
  • n, k, l
  • y[i-1], m[i-1], xmin[i-1], xmax[i-1], for $i = 1,2,\ldots,n$
  • x[i-1], f[i-1], w[i-1], for $i = 1,2,\ldots,\sum_{s=1}^{n} \mathrm{m}[s-1]$.
The data points are fitted using nag_2d_cheb_fit_lines (e02cac), and then the fitting polynomial is evaluated at the data points using nag_2d_cheb_eval (e02cbc).
The output is:

9.1  Program Text

Program Text (e02cace.c)

9.2  Program Data

Program Data (e02cace.d)

9.3  Program Results

Program Results (e02cace.r)

[Figure: Example Program, Calculation and Evaluation of Least-squares Bi-variate Polynomial Fit. Fitted polynomials P(x, y=Y) plotted against x for constant y = 0, 1, 2 and 4, together with the data points and 50*residual.]

© The Numerical Algorithms Group Ltd, Oxford, UK. 2012