3.8. Faster Methods for Solving \(Ax = b\) for Tridiagonal and Banded matrices, and Strict Diagonal Dominance#

Reference:

Section 6.6 Special Types of Matrices in [Burden et al., 2016], the sub-sections on Band Matrices and Tridiagonal Matrices.

Tridiagonal systems#

Differential equations often lead to the need to solve systems of equations \(T x = b\) where the matrix \(T\) is tri-diagonal: the only non-zero elements are on the main diagonal and the diagonals adjacent to it on either side, so that \(T_{i,j} = 0\) if \(|i - j| > 1\). That is, the system looks like:

Definition 3.8 (Tridiagonal matrix)

A matrix \(T\) is tridiagonal if the only non-zero elements are on the main diagonal and the diagonals adjacent to it on either side, so that \(T_{i,j} = 0\) if \(|i - j| > 1\). That is, the system looks like:

\[\begin{split} T x = \left[\begin{array}{cccccc} d_1 & u_1 \\ l_1 & d_2 & u_2 & \\ & l_2 & d_3 & u_3\\&& \ddots & \ddots & \ddots\\&&& l_{n-2} & d_{n-1} & u_{n-1}\\&&&& l_{n-1} & d_{n}\end{array}\right] \left[\begin{array}{c}x_1\\x_2\\\vdots\\x_n \end{array}\right] = \left[\begin{array}{c}b_1\\b_2\\\vdots\\b_n\end{array}\right] \end{split}\]

with all “missing” entries being zeros. The notation used here suggests one efficient way to store such a matrix: as three 1D arrays \(d\), \(l\) and \(u\).

(Such equations also arise in other important situations, such as Spline Interpolation.)

It can be verified that LU factorization preserves all the non-zero values, so that the Doolittle algorithm — if it succeeds without any division by zero — gives \(T = L U\) with the form

\[\begin{split} L = \left[\begin{array}{cccccc} 1 \\ L_1 & 1 \\ & L_2 & 1 \\ && \ddots & \ddots \\ &&& L_{n-2} & 1 \\ &&&& L_{n-1} & 1 \end{array}\right], \; U = \left[\begin{array}{cccccc} D_1 & u_1 \\ & D_2 & u_2 \\ && D_3 & u_3 \\ &&& \ddots & \ddots \\ &&&& D_{n-1} & u_{n-1} \\ &&&&& D_n \end{array}\right] \end{split}\]

Note that the first non-zero element in each column is unchanged, as with a full matrix, but now it means that the upper diagonal elements \(u_i\) are unchanged.

Again, one way to describe and store this information is with just the two new 1D arrays \(L\) and \(D\), along with the unchanged array \(u\).

Algorithms#

Algorithm 3.9 (LU factorization)

\(D_1 = d_1\)
for i from 2 to n
\(\quad L_{i-1} = l_{i-1}/D_{i-1}\)
\(\quad D_{i} = d_{i} - L_{i-1} u_{i-1}\)
end

Algorithm 3.10 (Forward substitution)

\(c_1 = b_1\)
for i from 2 to n
\(\quad c_{i} = b_{i} - L_{i-1} c_{i-1}\)
end

Algorithm 3.11 (Backward substitution)

\(x_n = c_n/D_n\)
for i from n-1 down to 1
\(\quad x_{i} = (c_{i} - u_{i} x_{i+1})/D_i\)
end

Generalizing to banded matrices#

As we have seen, approximating derivatives to higher order of accuracy and approximating derivatives of order greater than two requires more than three nodes, but the locations needed are all close to the ones where the derivative is being approximated. For example, the simplest symmetric approximation of the fourth derivative \(D^4 f(x)\) used values from \(f(x-2h)\) to \(f(x+2h)\). Then row \(i\) of the corresponding matrix has all its non-zero elements at locations \((i,i-2)\) to \((i, i+2)\): the non-zero elements lie in the narrow “band” where \(|i - j| \leq 2\), and thus on five “central” diagonals.

This is a penta-digonal matrix, and an example of the larger class of banded matrices: ones in which all the non-zero elements have indices \( -p \leq j - i \leq q\) for \(p\) and \(q\) smaller than \(n\) — usually far smaller; \(p = q = 2\) for a penta-digonal matrix.

Let us recap the general Doolittle algorithm for computing an LU factorization:

Algorithm 3.12 (Doolittle algorithm for computing an LU factorization)

The top row is unchanged:
for j from 1 to n
\(\quad\) \(u_{1,j} = a_{1,j}\)
end
The left column requires no sums:
for i from 2 to n
\(\quad\) \(l_{i,1} = a_{i,1}/u_{1,1}\)
end

The main loop: for k from 2 to n
\(\quad\) for j from k to n
\(\quad\quad\) \(u_{k,j} = a_{k,j} - \sum_{s=1}^{k-1} l_{k,s}u_{s,j}\)
\(\quad\) end
\(\quad\) for i from k+1 to n
\(\quad\quad\) \(l_{i,k} = \left( a_{i,k} - \sum_{s=1}^{k-1} l_{i,s}u_{s,k} \right)/u_{k,k}\)
\(\quad\) end
end

Eliminating redundant calculation in the above#

With a banded matrix, many of the entries at right are zero, particularly in the two sums, which is where most of the operations are. Thus we can rewrite, exploiting the fact that all elements with indices \(j-i < -p\) or \(j-i > q\) are zero. To start with, the top diagonal is not modified, as already noted for the tridiagonal case: \(u_{k,k+q} = a_{k,k+q}\) for \(1 \leq k \leq n-q\).

Algorithm 3.13 (LU factorization of a banded matrix)

The top row is unchanged:
for j from 1 to 1+q
\(\quad\) \(u_{1,j} = a_{1,j}\)
end
The top non-zero diagonal is unchanged:
for k from 1 to n - q
\(\quad\) \(u_{k,k+q} = a_{k,k+q}\)
end
The left column requires no sums:
for i from 2 to 1+p
\(\quad\) \(l_{i,1} = a_{i,1}/u_{1,1}\)
end
The main loop:
for k from 2 to n
\(\quad\) for j from k to min(n, k+q-1)
\(\quad\quad\) \(u_{k,j} = a_{k,j} - \displaystyle\sum_{s=max(1, k-p, j-q)}^{k-1} l_{k,s}u_{s,j}\)
\(\quad\) end
\(\quad\) for i from k+1 to min(n,k+p-1)
\(\quad\quad\) \(l_{i,k} = \left( a_{i,k} - \displaystyle\sum_{s=max(1,i-p,k-q)}^{k-1} l_{i,s}u_{s,k} \right)/u_{k,k}\)
\(\quad\) end
end

It is common for a banded matrix to have equal band-width on either side, \(p=q\), as with tridiagonal and pentadiagonal matrices. Then the algorithm is somewhat simpler:

Algorithm 3.14 (LU factorization of a banded matrix, \(p=q\))

The top row is unchanged:
for j from 1 to 1+p
\(\quad\) \(u_{1,j} = a_{1,j}\)
end
The top non-zero diagonal is unchanged:
for k from 1 to n - p
\(\quad\) \(u_{k,k+p} = a_{k,k+p}\)
end
The left column requires no sums:
for i from 2 to 1+p
\(\quad\) \(l_{i,1} = a_{i,1}/u_{1,1}\)
end

The main loop:
for k from 2 to n
\(\quad\) for j from k to min(n, k+p-1)
\(\quad\quad\) \(u_{k,j} = a_{k,j} - \displaystyle\sum_{s=max(1, j-p)}^{k-1} l_{k,s}u_{s,j}\)
\(\quad\) end
\(\quad\) for i from k+1 to min(n,k+p)
\(\quad\quad\) \(l_{i,k} = \left( a_{i,k} - \displaystyle\sum_{s=max(1,i-p)}^{k-1} l_{i,s}u_{s,k} \right)/u_{k,k}\)
\(\quad\) end
end

Strict diagonal dominance helps again#

These algorithms for banded matrices do no pivoting, and that is highly desirable, because pivoting creates non-zero elements outside the “band” and so can force one back to the general algorithm. Fortunately, we have seen one case where this is fine: the matrix being either row-wise or column-wise strictly diagonally dominant.