Numerical Analysis Notes

Math 342 - Spring 2026

Jump to: Math 342 homepage, Week 1, Week 2, Week 3, Week 4, Week 5, Week 6, Week 7, Week 8, Week 9, Week 10, Week 11, Week 12, Week 13, Week 14

Week 1 Notes

Day Topic
Mon, Jan 12 Floating point arithmetic
Wed, Jan 14 Relative error, significant digits
Fri, Jan 16 Taylor's theorem

Mon, Jan 12

We talked about how computers store floating point numbers. Most modern programming languages store floating point numbers using the IEEE 754 standard.

In the IEEE 754 standard, a 64-bit floating point number has the form

๐ฏ๐š๐ฅ๐ฎ๐ž=(โˆ’1)๐ฌ๐ข๐ ๐งร—2(๐ž๐ฑ๐ฉ๐จ๐ง๐ž๐ง๐ญโˆ’1023)ร—(1+๐Ÿ๐ซ๐š๐œ๐ญ๐ข๐จ๐ง)\mathbf{value} = (-1)^\mathbf{sign} \, \times \, 2^{(\mathbf{exponent} - 1023)} \, \times \, (1 + \mathbf{fraction})

where the sign is 1 bit, the exponent is 11 bits, and the fraction is 52 bits.

We talked about how to convert between binary numbers and decimal numbers. This system works very well, but it leads to weird outcomes like 0.1 + 0.1 + 0.1 = 0.30000000000000004.

When you convert \tfrac{1}{10} = 0.1 to binary, you get an infinitely repeating binary expansion: \tfrac{1}{10} = (0.000110011001100\ldots)_2. So any finite binary representation of \tfrac{1}{10} will have rounding errors.
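Here is a quick check in Python (a minimal sketch; the last line prints the stored double to 20 decimal places, exposing the rounding error):

print(0.1 + 0.1 + 0.1)         # 0.30000000000000004
print(0.1 + 0.1 + 0.1 == 0.3)  # False: the two sides round differently
print(f"{0.1:.20f}")           # 0.10000000000000000555...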

We did the following examples in class:

  1. Convert (10110)_2 to decimal. (https://youtu.be/a2FpnU9Mm3E)

  2. Convert 35 to binary. (https://youtu.be/gGiEu7QTi68)

  3. What is the 64-bit string that represents the number 35 in the IEEE standard?

  4. What are the largest and smallest 64-bit floating point numbers that can be stored?

  5. In Python, compute 2.0**1024 and 2**1024. Why do you get different results?

  6. In Python, compare 2.0**1024 with 2.0**(-1024) and 2.0**(-1070). What do you notice? (See the sketch after this list.)
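Here is a minimal Python sketch of examples 3, 5, and 6. The behavior noted in the comments is what IEEE 754 predicts; the extra value 2.0**(-1075) is an addition for illustration, since it lies below the smallest subnormal 2**(-1074):

import struct

# Example 3: the 64-bit pattern for 35.0, printed as sign | exponent | fraction
bits = struct.unpack('>Q', struct.pack('>d', 35.0))[0]
print(f'{bits:064b}')

# Example 5: float exponentiation overflows, but Python integers are exact
try:
    2.0 ** 1024
except OverflowError:
    print('2.0**1024 overflows')
print(2 ** 1024)  # a 309-digit integer, computed exactly

# Example 6: small powers underflow gradually through the subnormals, then to 0
print(2.0 ** -1024)  # nonzero, but subnormal
print(2.0 ** -1070)  # an even smaller subnormal
print(2.0 ** -1075)  # rounds all the way down to 0.0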

Wed, Jan 14

Today we talked about significant digits. Here is a quick video on how these work.

Rules for Significant Digits.

  1. Addition/Subtraction. The answer is significant only out to the last decimal place that is significant in both inputs.
  2. Multiplication/Division. The answer has as many significant digits as the input with fewer significant digits.
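For example, under the addition rule 13.6 + 0.428 = 14.0 (the tenths place is the last decimal place significant in both inputs), while under the multiplication rule 13.6 × 0.428 = 5.82 (each input has 3 significant digits).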

Next we defined absolute and relative error:

Definition. Let x^* be an approximation of a real number x. The absolute error is |x - x^*|, and the relative error is \dfrac{|x - x^*|}{|x|}.

The negative base-10 logarithm of the relative error is approximately the number of significant digits, so you can think of significant digits as a measure of relative error.

Intuitively, addition & subtraction "play nice" with absolute error while multiplication and division "play nice" with relative error. This can lead to problems:

  1. Catastrophic cancellation. When you subtract two numbers of roughly the same size, the relative error can get much worse. For example, both 53.76 and 53.74 have 4 significant digits, but 53.76 - 53.74 = 0.02 only has 1 significant digit.

  2. Useless precision. If you add two numbers with very different magnitudes, then having a very low relative error in the smaller one will not be useful.

We did these examples in class.

  1. \pi = 3.141592\ldots. What are the absolute and relative errors if you round \pi to 3.14?
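In Python, this computation looks like (a minimal sketch):

import math

approx = 3.14
abs_err = abs(math.pi - approx)   # absolute error, about 1.59e-3
rel_err = abs_err / abs(math.pi)  # relative error, about 5.07e-4
print(abs_err, rel_err)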

Rounding Error. The worst case relative error from rounding to k significant digits is \mathbf{relative~error} \le \begin{cases} 5 \times 10^{-k} & (\text{decimal}) \\ 2^{-k} & (\text{binary}). \end{cases} Since 64-bit floating point numbers have up to 53 significant binary digits, they typically have a relative error of up to 2^{-53} \approx 1.11 \times 10^{-16}. This quantity is known as machine epsilon.
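You can check this in Python (note that sys.float_info.epsilon reports the gap convention 2**(-52), which is twice the rounding bound above):

import sys

print(2.0 ** -53)               # ~1.11e-16, the worst case rounding error
print(1.0 + 2.0 ** -53 == 1.0)  # True: the perturbation is rounded away
print(1.0 + 2.0 ** -52 == 1.0)  # False: 2**-52 is the gap to the next double
print(sys.float_info.epsilon)   # 2.220446049250313e-16 = 2**-52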

You can sometimes re-write algorithms on a computer to avoid issues with floating point numbers such as overflow/underflow and catastrophic cancellation.

  1. Consider the function f(x) = \dfrac{1 - \cos x}{\sin x}. Use Python to compute f(10^{-7}).

  2. The exact answer to the previous question is 0.00000005 = 5 \times 10^{-8} (accurate to 22 decimal places). Use this to find the relative error in your previous calculation.

  3. A better way to compute f(x) is to use a trick to avoid the catastrophic cancellation: f(x) = \dfrac{1-\cos x}{\sin x} = \dfrac{1 - \cos x}{\sin x} \cdot \left( \dfrac{1+ \cos x}{1+\cos x} \right) = \dfrac{\sin x}{1 + \cos x}. Use this new formula to compute f(10^{-7}). What is the relative error now? (See the sketch after this list.)
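Here is a minimal sketch of both computations in Python:

import math

x = 1e-7
naive = (1 - math.cos(x)) / math.sin(x)   # catastrophic cancellation in 1 - cos x
stable = math.sin(x) / (1 + math.cos(x))  # algebraically equivalent, no cancellation
exact = 5e-8
print(abs(naive - exact) / exact)   # relative error around 1e-3
print(abs(stable - exact) / exact)  # relative error near machine epsilon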

Stirling's formula is a famous approximation for the factorial function: n! \approx \sqrt{2 \pi n} \frac{n^n}{e^n}. We compared it to the exact factorial using the following code.

import math

# Compare the exact factorial to Stirling's approximation for n = 100
n = 100
print(float(math.factorial(n)))  # exact value, converted to a float
f = lambda n: math.sqrt(2 * math.pi * n) * n ** n / math.exp(n)  # Stirling
print(f(n))

Our formula worked well until n = 143, at which point we got an overflow error. The problem was that n^n got too big to convert to a floating point number. But you can prevent the overflow error by rewriting the formula slightly:

n = 143
f = lambda n: math.sqrt(2 * math.pi * n) * (n / math.e) ** n  # (n/e)**n avoids the huge n**n
print(f(n))
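As a quick accuracy check (an addition, not something we did in class): the leading correction term in the Stirling series predicts a relative error of about 1/(12n), which matches what the rewritten formula produces.

exact = math.factorial(143)
print(f(143) / float(exact))  # ratio ~ 0.9994, consistent with 1 - 1/(12*143)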

Fri, Jan 16

Today we reviewed Taylor series. We recalled the following important Maclaurin series (which are Taylor series with center c = 0):

\sin x = \sum_{n=0}^\infty \frac{(-1)^n x^{2n+1}}{(2n+1)!}, \qquad \cos x = \sum_{n=0}^\infty \frac{(-1)^n x^{2n}}{(2n)!}, \qquad \frac{1}{1-x} = \sum_{n=0}^\infty x^n.

We graphed Maclaurin polynomials for \cos x and \dfrac{1}{1-x} on Desmos to see how they converge with different radii of convergence.

We also used the Maclaurin series for \dfrac{\sin x}{x} to approximate the integral

\int_{-\pi}^\pi \frac{\sin x}{x} \, dx.
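Here is a minimal sketch of that approximation in Python. Integrating the series for \sin x / x term by term over [-\pi, \pi] gives terms of the form (-1)^k \, 2\pi^{2k+1} / \big( (2k+1)(2k+1)! \big):

import math

total = 0.0
for k in range(10):  # the series converges quickly, so 10 terms is plenty
    total += (-1) ** k * 2 * math.pi ** (2 * k + 1) / ((2 * k + 1) * math.factorial(2 * k + 1))
print(total)  # about 3.7038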

Then we did the following workshop in class.


Week 2 Notes

Day Topic
Mon, Jan 19 Martin Luther King Day - no class
Wed, Jan 21 Taylor's theorem - con'd
Fri, Jan 23 Babylonian algorithm for square roots

Wed, Jan 21

Today we reviewed some theorems that we will need throughout the course. The first is probably the most important theorem in numerical analysis since it lets us estimate error when using Taylor series approximations.

Taylor's Theorem. Let f be a function that has (n+1) derivatives in the interval between x and c. Then there exists a z strictly inside the interval from x to c such that f(x) - P_n(x) = \frac{f^{(n+1)}(z)}{(n+1)!} (x-c)^{n+1} where P_n is the nth degree Taylor polynomial for f centered at c.

A special case of Taylor's theorem is when n = 0. Then you get the Mean Value Theorem (MVT):

Mean Value Theorem. Let f be a function that is differentiable in the interval between a and b. Then there exists a c strictly inside the interval from a to b such that f'(c) = \frac{f(b) - f(a)}{b-a}.

We did this example:

  1. Use Taylor's theorem to estimate the error in using the 0th and 2nd degree Maclaurin polynomials to estimate \cos(0.03) and \cos(0.6). (See the sketch below.)
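Here is a minimal sketch of the 2nd degree case at x = 0.6, taking M = 1 since every derivative of \cos x is bounded by 1:

import math

x = 0.6
p2 = 1 - x ** 2 / 2                     # 2nd degree Maclaurin polynomial of cos x
actual = abs(math.cos(x) - p2)          # true error, about 0.0053
bound = 1 * x ** 3 / math.factorial(3)  # Taylor bound with M = 1, about 0.036
print(actual, bound)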

Then we started this workshop.

Fri, Jan 23

Last time we started this workshop about using Taylor's remainder formula and the triangle inequality to find upper bounds for functions. Today we revisited that workshop, but first we talked about the following.

Taylor's Error Formula. Let f be a function that has (n+1) derivatives in the interval between x and c. Then |f(x) - P_n(x)| \le \frac{M \cdot |x-c|^{n+1}}{(n+1)!} where M is the maximum value of |f^{(n+1)}(z)| with z between x and c.

This error formula gives a way to estimate the worst case (absolute) error when you use a Taylor polynomial approximation.

  1. The 3rd degree Taylor polynomial for \cos x centered at c = \pi is P_3(x) = -1 + \tfrac{1}{2}(x-\pi)^2 (the coefficient on the 3rd degree term is zero). What is the worst case absolute error using this polynomial to estimate \cos 3? (A worked sketch follows.)
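A worked sketch: the 4th derivative of \cos x is \cos x itself, so we can take M = 1, and the error formula gives |\cos 3 - P_3(3)| \le \dfrac{1 \cdot |3 - \pi|^4}{4!} \approx 1.7 \times 10^{-5}. The same computation in Python:

import math

bound = 1 * abs(3 - math.pi) ** 4 / math.factorial(4)  # M = 1 since |cos z| <= 1
print(bound)  # about 1.7e-05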

After that we talked about the triangle inequality.

Triangle Inequality. For any numbers a and b (real or complex), |a+b| \le |a| + |b|.

We talked about how you can use the triangle inequality to find upper bounds for functions. We also talked about tight upper bounds versus upper bounds that are not tight. We did this example.

  1. Use the triangle inequality to find an upper bound for |x^2 + 3x \cos x|.
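One possible bound (a sketch, not necessarily the one from class): |x^2 + 3x \cos x| \le |x|^2 + 3|x| |\cos x| \le x^2 + 3|x|, using |\cos x| \le 1 in the last step.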

Week 3 Notes

Day Topic
Mon, Jan 26 Bisection method
Wed, Jan 28 Newton's method
Fri, Jan 30 Rates of convergence

Week 4 Notes

Day Topic
Mon, Feb 2 Secant method
Wed, Feb 4 Fixed point iteration
Fri, Feb 6 Newton's method with complex numbers

Week 5 Notes

Day Topic
Mon, Feb 9 Solving nonlinear systems
Wed, Feb 11 Systems of linear equations
Fri, Feb 13 LU decomposition

Week 6 Notes

Day Topic
Mon, Feb 16 LU decomposition with pivoting
Wed, Feb 18 Review
Fri, Feb 20 Midterm 1

Week 7 Notes

Day Topic
Mon, Feb 23 Matrix norms and conditioning
Wed, Feb 25 Inner-products and orthogonality
Fri, Feb 27 Unitary and Hermitian matrices

Week 8 Notes

Day Topic
Mon, Mar 2 Gram-Schmidt algorithm
Wed, Mar 4 QR decomposition
Fri, Mar 6 Orthogonal projections

Week 9 Notes

Day Topic
Mon, Mar 16 Least squares problems
Wed, Mar 18 Least squares problems - con'd
Fri, Mar 20 Orthogonal functions

Week 10 Notes

Day Topic
Mon, Mar 23 Continuous least squares
Wed, Mar 25 Fourier series
Fri, Mar 27 Fourier series - con'd

Week 11 Notes

Day Topic
Mon, Mar 30 Numerical integration
Wed, Apr 1 Newton-Cotes methods
Fri, Apr 3 Error in Newton-Cotes methods

Week 12 Notes

Day Topic
Mon, Apr 6 Numerical differentiation
Wed, Apr 8 Review
Fri, Apr 10 Midterm 2

Week 13 Notes

Day Topic
Mon, Apr 13 Eigenvectors and eigenvalues
Wed, Apr 15 Power iteration
Fri, Apr 17 Schur triangular decomposition

Week 14 Notes

Day Topic
Mon, Apr 20 QR algorithm
Wed, Apr 22 Singular value decomposition
Fri, Apr 24 Applications of the SVD
Mon, Apr 27 Last day, recap & review