4.4. Piecewise Polynomial Approximating Functions: Splines and Hermite Cubics#

Co-authored with Stephen Roberts of the Australian National University.

References:

The idea of approximating a function (or interpolating between a set of data points) with a function that is piecewise polynomial takes its simplest form using continuous piecewise linear functions. Indeed, this is the method most commonly used to produce a graph from a large set of data points: for example, the command plot from matplotlib.pyplot (for Python) or PyPlot (for Julia) does it.

The idea is simply to draw straight lines between each successive data point. It is worth analysing this simple method before considering more accurate approaches.

Consider a set of \(n+1\) points \((x_0,y_0),(x_1,y_1),\dots ,(x_n,y_n)\) again, this time requiring the \(x\) values to be in increasing order. Then define the linear functions

\[\begin{equation*} L_i(x)=y_i+(x-x_i)\frac{y_{i+1}-y_i}{x_{i+1}-x_i},\quad x_i\leq x\leq x_{i+1},\quad 0\leq i<n \end{equation*}\]

These can be joined together into a continuous function

\[\begin{equation*} L(x)=L_i(x)\text{ for }x_i\leq x\leq x_{i+1} \end{equation*}\]

with the values \(L(x_i)=y_i\) at all nodes, so that the definition is consistent at the points where the domains join, also guaranteeing continuity.

Spline Interpolation#

Reference: Section 6.4 of [Kincaid and Chenney, 1990].

If a piecewise linear approximation is approximated that passes through a given set of \(n+1\) points or knots

\[\begin{equation*} (t_0,y_0),\dots ,(t_n,y_n) \end{equation*}\]

and is linear in each of the \(n\) interval between them, the “smoothest” curve that one can get is the continuous one given by using linear interpolation between each consecutive pair of points. Less smooth functions are possible, for example the piecewise constant approximation where \(L(x)=y_i\) for \(x_i\leq x<x_{i+1}\).

The general strategy of spline interpolation is to approximate with a piecewise polynomial function, with some fixed degree \(k\) for the polynomials, and is as smooth as possible at the joins between different polynomials. Smoothness is measured by the number of continuous derivatives that the function has, which is only in question at the knots of course.

The traditional and most important case is that of cubic splines interpolants, which have the form

\[\begin{equation*} S(x)=S_i(x),\quad t_i\leq x\leq t_{i+1},\quad 0\leq i<n \end{equation*}\]

where each \(S_i(x)\) is a cubic and the interpolation conditions are

\[\begin{equation*} S_i(t_i)=y_i,\quad S_i(t_{i+1})=y_{i+1},\qquad 0\leq i<n \end{equation*}\]

These conditions automatically give continuity, but leave many degrees of freedom to impose more smoothness. Each cubic is described by four coefficients and so there are \(4n\) in all, and the interpolation conditions give only \(2n\) conditions. There are \(n-1\) knots where different cubics join, so requiring \(S\) to have continuous first and second derivatives imposes \(2(n-1)\) further conditions for a total of \(4n-2\). This is the best smoothness possible without \(S(x)\) becoming a single cubic, and leaves two degrees of freedom. These will be dealt with later, but one approach is imposing zero second derivatives at each end of the interval.

Thus we have the equations

\[S_{i-1}^{\prime }(t_i) = S^{\prime }(t_i)\]

and

\[S_{i-1}^{\prime \prime }(t_i) = S^{\prime \prime }(t_i),\]

\(1\leq i\leq n-1\).

The brute force method would be to write something like

\[S_i(x)=a_ix^3+b_ix^2+c_ix+d_i\]

which would leave to a set of \(4n\) simultaneous linear equations for these \(4n\) unknowns once the two missing conditions have been chosen.

This could then be solved numerically, but the size and cost of the problem can be considerably reduced, to a tridiagonal system of \(n-1\) equations.

Start by considering the second derivative of \(S(x)\), which must be continuous and piecewise linear. Its values at the knots can be called \(x_i=S_i^{\prime \prime }(t_i)\) and the lengths of the interval called \(h_i=x_{i+1}-x_i\) so that

\[\begin{equation*} S_i^{\prime \prime }(x)=\frac{z_i}{h_i}(t_{i+1}-x)+\frac{z_{i+1}}{h_i}(x-t_i) \end{equation*}\]

Integrating twice,

\[ S_i(x)=\frac{z_i}{6h_i}(t_{i+1}-x)^3+\frac{z_{i+1}}{6h_i} (x-t_i)^3+C_i(t_{i+1}-x)+D_i(x-t_i) \]

The interpolation conditions then determine \(C_i\) and \(D_i\):

(4.9)#\[S_i(x) = \frac{z_i}{6h_i}(t_{i+1}-x)^3+\frac{z_{i+1}}{6h_i}(x-t_i)^3 + \left( \frac{y_i}{h_i}-\frac{z_ih_i}6\right) (t_{i+1}-x)+\left( \frac{y_{i+1}}{h_i}-\frac{z_{i+1}h_i}6\right) (x-t_i) \]

In effect, three quarters of the equations have been solved explicitly, leaving only the \(z_i\) to be determined using the remaining condition of the continuity of \(S^{\prime }(x)\).

Differentiating the above expression and evaluating at the appropriate points gives the expressions

(4.10)#\[S_i^{\prime }(t_i) = -\frac{h_i}3z_i-\frac{h_i}6z_{i+1}-\frac{y_i}{h_i}+ \frac{y_{i+1}}{h_i} \]
(4.11)#\[S_{i-1}^{\prime }(t_i) = -\frac{h_{i-1}}6z_{i-1}+\frac{h_{i-1}}3z_i-\frac{y_{i-1}}{h_{i-1}}+\frac{y_i}{h_{i-1}} \]

Equating these at the internal knots (and simplifying a bit) gives

(4.12)#\[ h_{i-1}z_{i-1}+2(h_i+h_{i-1})z_i+h_iz_{i+1}=\frac 6{h_i}(y_{i+1}-y_i)-\frac 6{h_{i-1}}(y_i-y_{i-1}) \]

These are \(n-1\) linear equations in the \(n+1\) unknowns \(z_i\), so various different cubic spline interpolants can be constructed by adding two extra conditions in the form of two more linear equations. The traditional way is the one mentioned above: require the second derivative to vanish at the two endpoints. That is

\[\begin{equation*} S^{\prime\prime}(t_0) = S^{\prime\prime}(t_n) = 0 \end{equation*}\]

which gives a natural spline.

In terms of the \(z_i\) this gives the trivial equations \(z_0 = z_n = 0\). Thus these two unknowns can be eliminated from the equations in (4.12) giving the following tridiagonal system:

\[\begin{eqnarray*} &&\left[ \begin{array}{cccc} 2(h_0+h_1) & {h_1} & & \\ h_1 & 2(h_1+h_2) & \ddots & \\ & \ddots & \ddots & h_{n-2} \\ & & h_{n-2} & 2(h_{n-2}+h_{n-1}) \end{array} \right] \left[ \begin{array}{c} z_1 \\ z_2 \\ \vdots \\ z_{n-1} \end{array} \right] \\ &=&\left[ \begin{array}{c} 6((y_2-y_1)/h_1-(y_1-y_0)/{h_0}) \\ 6((y_3-y_2)/{h_2}-(y_2-y_1)/{h_1}) \\ \vdots \\ 6((y_n-y_{n-1})/{h_{n-1}}-(y_{n-1}-y_{n-2})/{h_{n-2}})) \end{array} \right] \end{eqnarray*}\]

Solving tridiagonal systems is far more efficient if it can be done without pivoting by the method seen earlier, and this is a good method if the matrix is diagonally dominant.

That is true here: recalling that the \(t_i\) are in increasing order, each \(h_i\) is positive, so each diagonal element is at least twice the sum of the absolute values of all other elements in the same row. This result incidentally also shows that the equations have a unique solution, which means that the natural cubic spline exists and is determined uniquely by the data, requiring about \(O(n)\) operations.

Evaluation of \(S(x)\) is then done by finding the \(i\) such that \(t_i \leq x < t_{i+1}\) and then evaluating the appropriate case in (4.9).

Clamped Splines and Error Bounds#

Reference: Section 3.6 of [Kincaid and Chenney, 1990].

Though the algorithm for natural cubic spline interpolation is widely available in software [TO DO: add Numpy/Julia references] it is worth knowing the details. In particular, it is then easy to consider minor changes, like different conditions at the end points.

Recall that the natural or free spline has the boundary conditions

(4.13)#\[ S^{\prime\prime}(t_0) = S^{\prime\prime}(t_n) = 0 \]

When the spline is to be used to approximate a function \(f(x)\) one useful alternative choice of boundary conditions is to specify the derivative of the spline function to match that of \(f\) at the endpoints:

(4.14)#\[S^{\prime}(t_0) = f^{\prime}(t_0), \qquad S^{\prime}(t_n) = f^{\prime}(t_n)\]

This is called a clamped spline.

When the function \(f\) or its derivatives are not known, they can be approximated from the data itself. Thus a generalisation of the last condition is

(4.15)#\[S^{\prime}(t_0) = d_0, \qquad S^{\prime}(t_n) = d_n\]

for some approximations of the derivatives.

The subject of approximating a function’s derivative using a finite collection of values of the function will be taken up soon in more detail, but the simplest approach is to use the difference quotient from the definition of the derivative. This gives

\[\begin{eqnarray*} d_0 &:=& \frac{y_1-y_0}{h_0} = \frac{f(t_1)-f(t_0)}{t_1-t_0} \\ d_n &:=& \frac{y_n-y_{n-1}}{h_{n-1}} = \frac{f(t_n)-f(t_{n-1})}{t_n-t_{n-1}} \end{eqnarray*}\]

as one choice for the approximate derivatives.

The cubic splines given by using some such approximate derivatives will be called modified clamped spline.

These new conditions require a revision of the previous algorithm, but one benefit is that there is a better result guaranteeing the accuracy of the approximation.

To derive the new equations and algorithm for [modified] clamped splines return to the equations (4.10) and (4.11) used to derive the equation (4.12) that defines the tridiagonal system of \(n-1\) equations for the second derivatives \(z_1,\dots ,z_{n-1}\).

Instead of eliminating the two unknowns \(z_0\) and \(z_n\), we can add two more linear equations by using those equations (4.10) and (4.11) respectively at \(t_0\) and \(t_n\) [i.e. for \(i=0\) and \(i=n\)] and equating to the values to whatever \(d_0\) and \(d_n\) we are using:

\[\begin{eqnarray*} S^{\prime }(t_0) &=&S_0^{\prime }(t_0) \\ &=&-\frac{h_0}3z_0-\frac{h_0}6z_1-\frac{y_0}{h_0}+\frac{y_1}{h_0} \\ &=&d_0 \\ S^{\prime }(t_n) &=&S_{n-1}^{\prime }(t_n) \\ &=&\frac{h_{n-1}}6z_{n-1}+\frac{h_{n-1}}3z_n-\frac{y_{n-1}}{h_{n-1}}+\frac{% y_n}{h_{n-1}} \\ &=&d_n \end{eqnarray*}\]

In conjunction with equation (4.12), this gives the new tridiagonal system

\[\begin{eqnarray*} &&\left[ \begin{array}{ccccc} 2{h_0} & {h_0} & & & \\ h_0 & 2(h_0+h_1) & h_1 & & \\ & \ddots & \ddots & \ddots & \\ & & h_{n-2} & 2(h_{n-2}+h_{n-1}) & h_{n-1} \\ & & & {h_{n-1}} & 2{h_{n-1}} \end{array} \right] \left[ \begin{array}{c} z_0 \\ z_1 \\ \vdots \\ z_{n-1} \\ z_n \end{array} \right] \\ &=&\left[ \begin{array}{c} 6\left( (y_1-y_0)/h_0-d_0\right) \\ 6((y_2-y_1)/h_1-(y_1-y_0)/h_1) \\ \vdots \\ 6((y_n-y_{n-1})/h_{n-1}-(y_{n-1}-y_{n-2})/h_{n-2}) \\ 6\left( d_n-(y_n-y_{n-1})/h_{n-1}\right) \end{array} \right] \end{eqnarray*}\]

As in the case of the tridiagonal system for natural splines, the rows of the matrix also satisfy the condition of diagonal dominance, so again this system has a unique solution that can be computed accurately with only \(O(n)\) operations and no pivoting.

Error Bounds for Approximation with Clamped Splines#

If the exact derivatives mentioned in (4.14) are available, the errors are bounded as follows

Theorem 4.5

Suppose that \(f(x)\) is four times continuously differentiable on the interval \([a,b]\), with \(\max_{a\leq x\leq b}|f^{(4)}(x)|\leq M\). Then the clamped cubic spline approximation \(S(x)\) using the points \(a=t_0<t_1<\dots <t_n=b\) and \(y_i=f(t_i)\) satisfies

\[|f(x)-S(x)|\leq M\frac 5{384}\left( \max_{0\leq i\leq n-1}h_i\right) ^4\]

for every point \(x\in [a,b]\).

There is also an error bound of the same “fourth order” form for the natural cubic spline- that is, one of the form of some constant depending on \(f\) times the fourth power of \(\max_{0\leq i\leq n-1}h_i\). However it is far more complicated to describe: see page 138 of Burden and Faires for more comments on this.

When we have studied methods for approximating derivatives, it will be possible to establish error bounds for modified clamped splines with various approximations for the derivatives at the endpoints, so that they depend only on the values of \(f\) at the knots. With care, these more practical approximations can also be made fourth order accurate.

Hermite Cubic Approximation#

Reference: Section 6.2 of [Kincaid and Chenney, 1990].

Hermite interpolation in general consists in finding a polynomial \(H(x)\) to approximate a function \(f(x)\) by giving a set of points \(t_0,\dots ,t_n\) and requiring that the value of the polynomial and its first few derivatives match that of the original function.

The simplest case that is not simply polynomial interpolation or Taylor polynomial approximation is where there are two points, and first derivatives are required to match. This gives four conditions

\[\begin{eqnarray*} H(t_0) &=&f(t_0)=y_0,H^{\prime }(t_0)=f^{\prime }(t_0)=y_0^{\prime } \\ H(t_1) &=&f(t_1)=y_1,H^{\prime }(t_1)=f^{\prime }(t_1)=y_0^{\prime } \end{eqnarray*}\]

and counting constants suggests that there should be a unique cubic \(h\) with these properties. From now on, I will use “cubic” to include the degenerate cases that are actually quadratics and so on.

To determine this cubic it is convenient to put it in the form

\[\begin{equation*} H(x)=a+b(x-t_0)+(x-t_0)^2[c+d(x-t_{i+1})] \end{equation*}\]

and let \(h=t_1-t_0\): then applying the four conditions in turn gives

\[\begin{eqnarray*} a &=&y_0,\qquad b=y_0^{\prime } \\ c &=&\frac{y_1-y_0}{h^2}-\frac{y_0^{\prime }}h,\qquad d=\frac{y_1^{\prime }-y_0^{\prime }}{3h^2}-\frac{2(y_1-y_0)}{3h^3} \end{eqnarray*}\]

With more points, one could look for higher order polynomials, but it is useful in some cases to construct a piecewise cubic approximation, with the cubic between each consecutive pair of nodes determined only by the value of the function and its derivative at those nodes. Thus the piecewise Hermite cubic approximation to \(f\) on the interval \([a,b]\) for the points \(a=t_0<t_1<\dots <t_n\) is given by a set of \(n\) cubics

\[\begin{equation*} H(x)=H_i(x)=a_i+b_i(x-t_i)+(x-t_i)^2[c_i+d_i(x-t_{i+1})],\quad t_i\leq x<t_{i+1} \end{equation*}\]

with

\[\begin{eqnarray*} a_i &=&y_i,\qquad b_i=y_i^{\prime } \\ c_i &=&\frac{y_{i+1}-y_i}{h_i^2}-\frac{y_i^{\prime }}{h_i} \\ d_i &=&\frac{y_{i+1}^{\prime }-y_i^{\prime }}{3h_i^2}-\frac{2(y_{i+1}-y_i)}{3h_i^3} \end{eqnarray*}\]

where \(y_i:=f(t_i)\), \(y_i^{\prime }:=f^{\prime }(t_i)\) and \(h_i:=t_{i+1}-t_i\). Most often, the points are equally spaced so that

\[\begin{equation*} h_i-h:=(b-a)/n. \end{equation*}\]

There is an error formula for this (which is also an error formula for a clamped spline in the case \(n=1\))

Theorem 4.6

For \(x\in [t_t,t_{i+1}]\)

\[f(x)-H(x)=\frac{f^{(4)}(\xi )}{4!}[(x-t_i)(x-t_{i+1})]^2\]

where \(\xi \in [t_t,t_{i+1}]\) Thus if \(|f^{(4)}(x)|\leq M_i\) for \(x \in [t_t,t_{i+1}]\),

\[\left| f(x)-H(x)\right| \leq \frac{M_i}{384}h_i^4\]

Proof. See page 311 of [Kincaid and Chenney, 1990].

Thus the accuracy is about as good as for clamped splines: the trade off is that the Hermite approximation is less smooth (only one continuous derivative at the nodes), but the error is “localised”. That is, if the fourth derivative of \(f\) is large or non-existent in one interval, the accuracy of the Hermite approximation only suffers in that interval, not over the whole domain.

However this comparison is a bit unfair, as the Hermite approximation uses the extra information about the derivatives of \(f\). This is also often impractical: either the derivatives are not known, or there is no known function \(f\) but only a collection of values \(y_i\).

To overcome this problem, the derivatives needed in the above formulas can be approximated from the \(y_i\) as was done for modified clamped splines. To do this properly, it is worth taking a thorough look at methods for approximating derivatives and bounding the accuracy of such approximations.