3.6. Error bounds for linear algebra, condition numbers, matrix norms, etc.#

References:

  • Section 2.3.1 Error Magnification and Condition Number of [Sauer, 2019].

  • Section 7.5 Error Bounds and Iterative Refinement of [Burden et al., 2016] — but you may skip the last part, on Iterative Refinement; that is not relevant here.

  • Section 8.4 of [Chenney and Kincaid, 2012].

Residuals, backward errors, forward errors, and condition numbers#

For an approximation \(x_a\) of the solution \(x\) of \(A x = b\), the residual \(r = A x_a - b\) measures the error as a backward error, often summarized by a single number, the residual norm \(\| A x_a - b \|\). Any norm could be used, but the maximum norm is usually preferred, for reasons that we will see soon.

The corresponding (dimensionless) measure of relative error is defined as

\[\frac{\|r\|}{\|b\|}.\]

However, these can greatly underestimate the forward errors in the solution: the absolute error \(\|x - x_a\|\) and relative error

\[\text{Rel}(x_a) = \frac{\|x - x_a\|}{\| x \|}.\]
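For example, here is a small made-up system, just to illustrate these definitions (the matrix and the approximate solution are arbitrary choices for demonstration):

using LinearAlgebra: norm

A = [1.0 2.0; 3.0 4.0]
b = [5.0, 6.0]
x = [-4.0, 4.5]                           # the exact solution of Ax = b
x_a = [-3.9, 4.45]                        # an approximation of it
r = A * x_a - b                           # the residual
println(norm(r, Inf)/norm(b, Inf))        # relative backward error, about 0.017
println(norm(x - x_a, Inf)/norm(x, Inf))  # relative (forward) error, about 0.022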

To relate these to the residual, we need the concepts of a matrix norm and the condition number of a matrix.

Matrix norms induced by vector norms#

Given any vector norm \(\| \cdot \|\) — such as the maximum (“infinity”) norm \(\| \cdot \|_\infty\) or the Euclidean norm (length) \(\| \cdot \|_2\) — the corresponding induced matrix norm is

\[ \| A \| := \max_{x \neq 0} \frac{\| Ax \|}{\| x \|} = \max_{\|x\|=1} \| Ax \| \]

This maximum exists for either of these vector norms, and for the infinity norm there is an explicit formula for it: for any \(m\times n\) matrix,

\[ \|A\|_\infty = \max_{1 \leq i \leq m} \sum_{j=1}^n |a_{ij}| \]

(On the other hand, it is far harder to compute the Euclidean norm of a matrix: the formula requires computing eigenvalues.)
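As a quick check of this row-sum formula on a small made-up matrix (the function opnorm is described in the remark below):

using LinearAlgebra: opnorm

A = [1.0 -2.0 3.0; -4.0 5.0 -6.0]
row_sums = [sum(abs.(A[i,:])) for i in 1:size(A)[1]]
println(maximum(row_sums))  # the maximum absolute row sum, 15.0
println(opnorm(A, Inf))     # the induced infinity norm agrees: 15.0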

Note that when the matrix is a vector considered as a matrix with a single column — so \(n=1\) — the sum goes away, and this agrees with the infinity vector norm. This allows us to consider vectors as being just matrices with a single column, which we will often do from now on.

Properties of (induced) matrix norms#

These induced matrix norms have many properties in common with Euclidean length and other vector norms, but with matrices there are also products, and for those one has to be a bit careful.

  1. \(\|A\| \geq 0\) (positivity)

  2. \(\| A \| = 0\) if and only if \(A = 0\) (definiteness)

  3. \(\| c A \| = |c| \, \|A\|\) for any constant \(c\) (absolute homogeneity)

  4. \(\| A + B \| \leq \| A \| + \| B \|\) (sub-additivity or the triangle inequality),
    and when the product of two matrices makes sense (including matrix-vector products),

  5. \(\| A B \| \leq \| A \| \, \| B \|\) (sub-multiplicativity)

Note the failure to always have equality with products. Indeed one can have \(A B = 0\) with \(A\) and \(B\) both non-zero, such as when \(A\) is a singular matrix and \(B\) is a null-vector for it.
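For example (a small made-up illustration of this failure of equality):

using LinearAlgebra: norm, opnorm

A = [1.0 2.0; 2.0 4.0]   # singular: the second row is twice the first
B = [2.0, -1.0]          # a null vector of A, regarded as a one-column matrix
println(A * B)                          # the zero vector, so ||AB|| = 0
println(opnorm(A, Inf) * norm(B, Inf))  # 6.0 * 2.0 = 12.0, far from equality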

Remark 3.17 (Other matrix norms)

There are other matrix norms of use in some contexts, in particular the Frobenius norm. In that case, the above properties are often used to define what it is to be a matrix norm, much as the first four define what it is to be a vector norm.

Remark 3.18 (Julia functions norm and opnorm)

The Julia package LinearAlgebra provides the functions norm and opnorm for evaluating matrix norms, as seen in the examples in the previous section Solving Ax = b With Both Pivoting and LU Factorization, where norm computes the vector norms \(\|v\|_p\) and opnorm computes the corresponding matrix norms (“operator norms”) \(\|A\|_p\).

  • If \(p\) is omitted, it defaults to \(p=2\), so norm(v) is the familiar Euclidean vector norm.

  • To get the “maximum” or “\(\infty\)” norm, use the value Inf for p.

Warning. Even if the argument of norm is a matrix, it is treated as a vector: for example, norm(A, Inf) returns the maximum of all the absolute values of the elements of an array A.
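For example, a small illustration of the difference:

using LinearAlgebra: norm, opnorm

A = [1.0 -2.0; 3.0 4.0]
println(norm(A, Inf))    # 4.0: the largest |a_ij|, treating A as a list of numbers
println(opnorm(A, Inf))  # 7.0: the induced matrix norm, the maximum absolute row sum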

Relative error bound and condition number#

It can be proven that, for any choice of norm,

\[ \text{Rel}(x_a) = \frac{\|x - x_a\|}{\| x \|} \leq \|A\| \, \|A^{-1}\| \, \frac{\|r\|}{\|b\|}, \]

where the last factor is the relative backward error.

Since we can (though often with considerable effort, due to the inverse!) compute the right-hand side when the infinity norm is used, we can compute an upper bound on the relative error. From this, an upper bound on the absolute error can be computed if needed.

The maximum possible growth factor from the relative backward error (as measured by the residual) to the relative (forward) error is called the condition number, \(\kappa(A)\):

\[\kappa(A) := \|A\| \|A^{-1}\|\]

so that the above bound on the relative error can be restated as

\[\text{Rel}(x_a) = \frac{\|x - x_a\|}{\| x \|} \leq \kappa(A) \frac{\|r\|}{\|b\|}\]

Actually there is one condition number for each choice of norm, so we work with

\[\kappa_\infty(A) := \|A\|_\infty \|A^{-1}\|_\infty\]

Note that for a singular matrix, this is undefined: we can intuitively say that the condition number is then infinite.
At the other extreme, the identity matrix \(I\) has norm 1 and condition number 1 (using any induced norm), and this is the best possible, because in general \(\kappa(A) \geq 1\). This follows from sub-multiplicativity: \(1 = \|I\| = \|A A^{-1}\| \leq \|A\| \, \|A^{-1}\| = \kappa(A)\).

Aside: estimating \(\| A^{-1} \|_\infty\) and thence the condition number#

In Julia, good approximations of condition numbers are given by the function cond from package LinearAlgebra.
As with functions norm and opnorm, the simple form cond(A) defaults to \(\kappa_2(A)\) based on the Euclidean length \(\| \cdot \|_2\) for vectors; to get the infinity norm version \(\kappa_\infty(A)\) use cond(A, Inf).
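For example, with an arbitrary nearly singular matrix (so the values shown are only approximate):

using LinearAlgebra: cond

A = [1.0 1.0; 1.0 1.0001]  # nearly singular, so poorly conditioned
println(cond(A))           # kappa_2(A), roughly 4e4
println(cond(A, Inf))      # kappa_infinity(A), also roughly 4e4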

This is not done exactly, since computing the inverse is a lot of work for large matrices and good estimates can be obtained far more quickly. The basic idea is to start with the formula

\[\| A^{-1} \| = \max_{\|x\|=1} \| A ^{-1} x \|\]

and instead compute the maximum over some finite selection of values for \(x\): call them \(x^{(k)}\). Then to evaluate \(y^{(k)} = A ^{-1} x^{(k)}\), express this through the equation \(A y^{(k)} = x^{(k)}\). Once we have an LU factorization for \(A\) (which one probably would have when exploring errors in a numerical solution of \(Ax = b\)), each of these systems can be solved relatively fast. Then

\[\| A^{-1} \| \approx \max_k \| y^{(k)} \|.\]
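Here is a minimal sketch of that strategy, assuming the helper functions lu_factorize, forwardsubstitution and backwardsubstitution from NumericalMethods.jl seen in the previous section; the number of random samples, and their random choice, are arbitrary illustrative choices (library functions like cond use more refined estimators):

include("NumericalMethods.jl")
using .NumericalMethods: lu_factorize, forwardsubstitution, backwardsubstitution
using LinearAlgebra: norm

function estimate_inverse_norm(A, n_samples=20)
    n = size(A)[1]
    (L, U) = lu_factorize(A)            # factorize once; reuse for every sample
    estimate = 0.0
    for k in 1:n_samples
        x = rand(n) .- 0.5              # a random vector ...
        x /= norm(x, Inf)               # ... rescaled so that its infinity norm is 1
        c = forwardsubstitution(L, x)
        y = backwardsubstitution(U, c)  # y = A^{-1} x, without ever forming the inverse
        estimate = max(estimate, norm(y, Inf))
    end
    return estimate                     # a lower bound on, and estimate of, ||A^{-1}||
end;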

Well-conditioned and ill-conditioned problems and matrices#

Condition numbers, giving an upper limit on the ratio of forward error to backward error, measure the amplification of errors, and have counterparts in other contexts. For example, with an approximation \(r_a\) of a root \(r\) of the equation \(f(x) = 0\), the ratio of forward error to backward error is bounded by \(\displaystyle \max 1/| f'(x) | = \frac{1}{\min | f'(x) |}\), where the maximum only needs to be taken over an interval known to contain both the root and the approximation. This condition number becomes “infinite” for a multiple root, \(f'(r) = 0\), which is related to the problems we have seen in that case.

Careful calculation of an approximate solution \(x_a\) of \(Ax = b\) can often achieve a residual at the level of machine rounding error, so that the relative backward error is roughly of size comparable to the machine unit, \(u\). The condition number then guarantees that the (forward) relative error is no greater than about \(u \, \kappa(A)\).

In terms of significant bits, with \(p\)-bit machine arithmetic, one can hope to get \(p - \log_2(\kappa(A))\) significant bits in the result, but cannot rely on more; that is, one loses \(\log_2(\kappa(A))\) significant bits. Compare this to the observation that one can expect to lose at least \(p/2\) significant bits when using the approximation \(Df(x) \approx D_hf(x) = (f(x+h) - f(x))/h\).
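For example, in IEEE-64 arithmetic (\(p = 53\) significant bits), for the \(4 \times 4\) Hilbert matrix discussed in Example 3.6 below (a rough illustration):

using LinearAlgebra: cond

H_4 = [1/(i + j - 1) for i in 1:4, j in 1:4]  # the 4x4 Hilbert matrix, defined below
bits_lost = log2(cond(H_4, Inf))
println("About $(round(bits_lost, digits=1)) of the 53 significant bits may be lost,")
println("leaving roughly $(round(53 - bits_lost, digits=1)) significant bits in the solution.")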

A well-conditioned problem is one that is not too highly sensitive to errors in rounding or input data; for an equation \(Ax = b\), this corresponds to the condition number of \(A\) not being too large, and the matrix \(A\) is then sometimes also called well-conditioned. This is of course vague, but might typically mean that \(p - \log_2(\kappa(A))\) is a sufficient number of significant bits for a particular purpose.

A problem that is not deemed well-conditioned is called ill-conditioned, so that a matrix of uncomfortably large condition number is also sometimes called ill-conditioned. An ill-conditioned problem might still be well-posed, but just requiring careful and precise solution methods.

Example 3.6 (the Hilbert matrices)

The \(n \times n\) Hilbert matrix \(H_n\) has elements

\[H_{i, j} = \frac{1}{i+j-1}\]

For example

\[\begin{split}H_4 = \left[ \begin{array}{cccc} 1 & 1/2 & 1/3 & 1/4 \\ 1/2 & 1/3 & 1/4 & 1/5 \\1/3 & 1/4 & 1/5 & 1/6 \\1/4 & 1/5 & 1/6 & 1/7 \end{array} \right]\end{split}\]

and for larger or smaller \(n\), one simply adds or removes rows at the bottom and columns at the right.

These matrices arise in important situations like finding the polynomial of degree \(n-1\) that fits given data in the sense of minimizing the root-mean-square error — as we will discuss later in this course if there is time and interest.

Unfortunately, as \(n\) increases the condition number grows rapidly, causing severe rounding error problems. To illustrate this, I will do something that one should usually avoid: compute the inverses of these matrices. This is also a case that shows the advantage of the LU factorization, since one computes the inverse column by column, by solving \(n\) different systems of equations, each with the same matrix \(A\) on the left-hand side.

include("NumericalMethods.jl")
using .NumericalMethods: lu_factorize, forwardsubstitution, backwardsubstitution, solvelinearsystem, printmatrix
using LinearAlgebra: norm, opnorm, cond
using Random: rand
function inverse(A, demomode=false)
    # Use sparingly; there is usually a way to avoid computing inverses that is faster and with less rounding error!
    n = size(A)[1]  # First index of the size, which is (n, n)
    A_inverse = zeros(size(A))
    (L, U) = lu_factorize(A)
    for i in 1:n
        if demomode; println("i=$i"); end
        e_i = zeros(n)
        e_i[i] = 1.0
        if demomode; println("e_$i=$e_i"); end
        c = forwardsubstitution(L, e_i)
        A_inverse[:,i] = backwardsubstitution(U, c)
        #A_inverse[:,i] = solvelinearsystem(A, e_i)
    end
    return A_inverse
end;
function hilbert(n)
    H = zeros(n,n)
    for i in 1:n
        for j in 1:n
            H[i,j] = 1.0/(i + j - 1.0)
        end
    end
    return H
end;
for n in 2:5
    H_n = hilbert(n)
    println("H_$n is")
    printmatrix(round.(H_n, sigdigits=4))
    H_n_inverse = inverse(H_n)
    println("and its inverse is")
    printmatrix(round.(H_n_inverse, sigdigits=4))
    println("to verify, their product is")
    printmatrix(round.(H_n * H_n_inverse, sigdigits=2))
    println()
end
H_2 is
[ 1.0 0.5 
  0.5 0.3333 ]
and its inverse is
[ 4.0 -6.0 
  -6.0 12.0 ]
to verify, their product is
[ 1.0 0.0 
  0.0 1.0 ]

H_3 is
[ 1.0 0.5 0.3333 
  0.5 0.3333 0.25 
  0.3333 0.25 0.2 ]
and its inverse is
[ 9.0 -36.0 30.0 
  -36.0 192.0 -180.0 
  30.0 -180.0 180.0 ]
to verify, their product is
[ 1.0 0.0 0.0 
  0.0 1.0 0.0 
  0.0 0.0 1.0 ]

H_4 is
[ 1.0 0.5 0.3333 0.25 
  0.5 0.3333 0.25 0.2 
  0.3333 0.25 0.2 0.1667 
  0.25 0.2 0.1667 0.1429 ]
and its inverse is
[ 16.0 -120.0 240.0 -140.0 
  -120.0 1200.0 -2700.0 1680.0 
  240.0 -2700.0 6480.0 -4200.0 
  -140.0 1680.0 -4200.0 2800.0 ]
to verify, their product is
[ 1.0 0.0 2.3e-13 -1.1e-13 
  -1.3e-16 1.0 1.1e-13 -1.5e-13 
  -2.3e-15 2.2e-14 1.0 -4.5e-14 
  -4.0e-15 6.8e-14 -6.4e-14 1.0 ]

H_5 is
[ 1.0 0.5 0.3333 0.25 0.2 
  0.5 0.3333 0.25 0.2 0.1667 
  0.3333 0.25 0.2 0.1667 0.1429 
  0.25 0.2 0.1667 0.1429 0.125 
  0.2 0.1667 0.1429 0.125 0.1111 ]
and its inverse is
[ 25.0 -300.0 1050.0 -1400.0 630.0 
  -300.0 4800.0 -18900.0 26880.0 -12600.0 
  1050.0 -18900.0 79380.0 -117600.0 56700.0 
  -1400.0 26880.0 -117600.0 179200.0 -88200.0 
  630.0 -12600.0 56700.0 -88200.0 44100.0 ]
to verify, their product is
[ 1.0 5.9e-13 -8.3e-13 1.2e-12 8.5e-13 
  2.3e-14 1.0 8.2e-14 -1.0e-12 2.0e-13 
  -9.4e-16 3.6e-13 1.0 -6.0e-13 -8.7e-13 
  -1.4e-14 6.8e-13 -2.7e-12 1.0 -1.8e-12 
  -8.6e-15 2.5e-13 -1.8e-12 1.8e-12 1.0 ]

Note how the inverses have some surprisingly large elements; this is the matrix equivalent of a number being very close to zero and so having a very large reciprocal.

Since we have the inverses, we can compute the matrix norms of each \(H_n\) and its inverse, and thence their condition numbers.

for n in 2:5
    H_n = hilbert(n)
    println("H_$n is")
    printmatrix(round.(H_n, sigdigits=6))
    println("with infinity norm $(round(opnorm(H_n, Inf), sigdigits=4))")
    H_n_inverse = inverse(H_n)
    println("and its inverse is")
    printmatrix(round.(H_n_inverse, sigdigits=6))
    println("with infinity norm $(round(opnorm(H_n_inverse, Inf), sigdigits=4))") 
    println("Thus the condition number of H_$n is $(round(opnorm(H_n, Inf) * opnorm(H_n_inverse, Inf), sigdigits=4))")
    println()
end
H_2 is
[ 1.0 0.5 
  0.5 0.333333 ]
with infinity norm 1.5
and its inverse is
[ 4.0 -6.0 
  -6.0 12.0 ]
with infinity norm 18.0
Thus the condition number of H_2 is 27.0

H_3 is
[ 1.0 0.5 0.333333 
  0.5 0.333333 0.25 
  0.333333 0.25 0.2 ]
with infinity norm 1.833
and its inverse is
[ 9.0 -36.0 30.0 
  -36.0 192.0 -180.0 
  30.0 -180.0 180.0 ]
with infinity norm 408.0
Thus the condition number of H_3 is 748.0

H_4 is
[ 1.0 0.5 0.333333 0.25 
  0.5 0.333333 0.25 0.2 
  0.333333 0.25 0.2 0.166667 
  0.25 0.2 0.166667 0.142857 ]
with infinity norm 2.083
and its inverse is
[ 16.0 -120.0 240.0 -140.0 
  -120.0 1200.0 -2700.0 1680.0 
  240.0 -2700.0 6480.0 -4200.0 
  -140.0 1680.0 -4200.0 2800.0 ]
with infinity norm 13620.0
Thus the condition number of H_4 is 28370.0

H_5 is
[ 1.0 0.5 0.333333 0.25 0.2 
  0.5 0.333333 0.25 0.2 0.166667 
  0.333333 0.25 0.2 0.166667 0.142857 
  0.25 0.2 0.166667 0.142857 0.125 
  0.2 0.166667 0.142857 0.125 0.111111 ]
with infinity norm 2.283
and its inverse is
[ 25.0 -300.0 1050.0 -1400.0 630.0 
  -300.0 4800.0 -18900.0 26880.0 -12600.0 
  1050.0 -18900.0 79380.0 -117600.0 56700.0 
  -1400.0 26880.0 -117600.0 179200.0 -88200.0 
  630.0 -12600.0 56700.0 -88200.0 44100.0 ]
with infinity norm 413300.0
Thus the condition number of H_5 is 943700.0

Next, experiment with solving equations, to compare residuals with actual errors.

I will use the testing strategy of starting with a known solution \(x\), from which the right-hand side \(b\) is computed; then a slight simulated error is introduced into \(b\). Running this repeatedly with different random “errors” gives an idea of the actual error behavior.

Remark 3.19 (Julia function collect)

The function collect converts the abstract “range” object given by function range into an ordinary 1D array.
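For example, with the values used below for \(n = 5\):

x = collect(range(1.0, 5, 5))
println(x)  # [1.0, 2.0, 3.0, 4.0, 5.0]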

for n in 2:5
    println("n=$n")
    H_n = hilbert(n)
    x = collect(range(1.0, n, n))
    println("x is $x")
    b = H_n * x
    println("b is $b")
    error_scale = 1e-8
    b_imperfect = b + 2.0 * error_scale * (rand(Float64, (n)) .- 0.5) # add random "errors" between -error_scale and error_scale
    println("b has been slightly changed to $b_imperfect")
    x_computed = solvelinearsystem(H_n, b_imperfect)
    residual = b - H_n * x_computed
    relative_backward_error = norm(residual, Inf)/norm(b, Inf)
    println("The residual maximum norm is $(round(norm(residual, Inf), sigdigits=2))")
    println("and the relative backward error ||r||/||b|| is $(round(relative_backward_error, sigdigits=2))")
    absolute_error = norm(x - x_computed, Inf)
    relative_error = absolute_error/norm(x, Inf)
    println("The absolute error is $(round(absolute_error, sigdigits=2))")
    println("The relative error is $(round(relative_error, sigdigits=2))")
    error_bound = cond(H_n, Inf) * relative_backward_error
    println("For comparison, the relative error bound from the formula above is $(round(error_bound, sigdigits=2))")
    println("\nBeware: the relative error is larger than the relative backward error by a factor ",
        "$(round(relative_error/relative_backward_error, sigdigits=2))")
    println()
end
n=2
x is [1.0, 2.0]
b is [2.0, 1.1666666666666665]
b has been slightly changed to [2.0000000080803297, 1.1666666764748042]
The residual maximum norm is 9.8e-9
and the relative backward error ||r||/||b|| is 4.9e-9
The absolute error is 6.9e-8
The relative error is 3.5e-8
For comparison, the relative error bound from the formula above is 1.3e-7

Beware: the relative error is larger than the relative backward error by a factor 7.1

n=3
x is [1.0, 2.0, 3.0]
b is [3.0, 1.9166666666666665, 1.4333333333333333]
b has been slightly changed to [2.999999991652919, 1.9166666621675716, 1.433333325752684]
The residual maximum norm is 8.3e-9
and the relative backward error ||r||/||b|| is 2.8e-9
The absolute error is 8.1e-7
The relative error is 2.7e-7
For comparison, the relative error bound from the formula above is 2.1e-6

Beware: the relative error is larger than the relative backward error by a factor 96.0

n=4
x is [1.0, 2.0, 3.0, 4.0]
b is [4.0, 2.716666666666667, 2.1, 1.7214285714285713]
b has been slightly changed to [4.000000006782543, 2.7166666759068043, 2.099999995598059, 1.72142857008299]
The residual maximum norm is 9.2e-9
and the relative backward error ||r||/||b|| is 2.3e-9
The absolute error is 4.6e-5
The relative error is 1.2e-5
For comparison, the relative error bound from the formula above is 6.6e-5

Beware: the relative error is larger than the relative backward error by a factor 5000.0

n=5
x is [1.0, 2.0, 3.0, 4.0, 5.0]
b is [5.0, 3.5500000000000003, 2.8142857142857145, 2.3464285714285715, 2.0174603174603174]
b has been slightly changed to [5.000000006354475, 3.5500000045593687, 2.814285713289512, 2.346428570499546, 2.0174603141398526]
The residual maximum norm is 6.4e-9
and the relative backward error ||r||/||b|| is 1.3e-9
The absolute error is 0.00036
The relative error is 7.1e-5
For comparison, the relative error bound from the formula above is 0.0012

Beware: the relative error is larger than the relative backward error by a factor 56000.0

We see in these experiments that:

  • As the condition number increases, the relative error becomes increasingly larger than the backward error computed from the residual.

  • The relative error remains below the above bound \(\displaystyle \text{Rel}(x_a) = \frac{\|x - x_a\|}{\| x \|} \leq \kappa(A) \frac{\|r\|}{\|b\|}\), and is typically quite a bit less.