10.1 Discretization: turning calculus into arithmetic

Every numerical algorithm in this chapter rests on the same trick: replace continuous calculus — limits, integrals, derivatives — with arithmetic on a discrete grid of values. The trick has consequences. Approximating f(x)f'(x) from a finite number of nearby samples introduces an error. The size of that error depends on the spacing hh and the order of the approximation; the choice of approximation pattern controls how rapidly the error shrinks as hh does. This first lesson sets up the vocabulary — truncation error, order of accuracy, roundoff — that recurs in every subsequent lesson.

Finite-difference approximations

The derivative of a smooth function ff at xx is the limit

f(x)  =  limh0f(x+h)f(x)h.f'(x) \;=\; \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.

To compute it numerically, you have to stop taking the limit at some finite hh. The expression

D+f(x;h)    f(x+h)f(x)hD_+ f(x; h) \;\equiv\; \frac{f(x + h) - f(x)}{h}

is the forward difference approximation. Three obvious questions: how good is it? How does the answer scale with hh? And what other choices are there?

A Taylor expansion gives the answer at once. Write

f(x+h)  =  f(x)+hf(x)+12h2f(x)+16h3f(x)+,f(x + h) \;=\; f(x) + h\, f'(x) + \tfrac12 h^2 f''(x) + \tfrac16 h^3 f'''(x) + \cdots,

so

f(x+h)f(x)h  =  f(x)+12hf(x)+16h2f(x)+.\frac{f(x + h) - f(x)}{h} \;=\; f'(x) + \tfrac12 h\, f''(x) + \tfrac16 h^2 f'''(x) + \cdots.

The forward difference equals the true derivative plus an error term of order hh. Doubling hh doubles the error. We call this a first-order approximation, and we write the error as O(h)\mathcal{O}(h).

A symmetric variant, the centred difference, halves the error in a striking way:

Dcf(x;h)    f(x+h)f(xh)2h.D_c f(x; h) \;\equiv\; \frac{f(x + h) - f(x - h)}{2 h}.

Taylor expansion of both terms and subtracting kills every even-degree derivative of ff. The leading error is now 16h2f(x)\tfrac{1}{6} h^2 f'''(x), i.e. O(h2)\mathcal{O}(h^2). Halving hh now divides the error by four. This is a second-order approximation, qualitatively much better than first order at any reasonable step size.

There are higher-order schemes (the 5-point stencil 112h[f(x+2h)+8f(x+h)8f(xh)+f(x2h)]\frac{1}{12 h}[-f(x + 2h) + 8 f(x + h) - 8 f(x - h) + f(x - 2h)] is O(h4)\mathcal{O}(h^4)), and one-sided variants for boundaries, and approximations to second derivatives like

f(x)    f(x+h)2f(x)+f(xh)h2,f''(x) \;\approx\; \frac{f(x + h) - 2 f(x) + f(x - h)}{h^2},

which is O(h2)\mathcal{O}(h^2). The wave-equation simulations in 10.3 use exactly this second-order centred difference in space, paired with a similar one in time.

Truncation error, in pictures

10^-810^-710^-610^-510^-410^-310^-210^-110^010^210^010^-210^-410^-610^-810^-1010^-1210^-1410^-16absolute error vs h (log–log)step size h →at h = 1.00e-1exact: 0.540302forward (O(h))0.497364err: 4.29e-2backward (O(h))0.581441err: 4.11e-2centred (O(h²))0.539402err: 9.00e-4
function:

The log–log plot reveals the asymptotic convergence rates. The forward and backward differences are O(h) — straight lines of slope 1. The centred difference is O(h²) — slope 2, much faster decay. Together they show why higher-order methods are nearly always worth the extra work. But notice what happens at very small h: the errors stop decreasing and start *rising* again. That's *floating-point cancellation*: numerator and denominator both shrink, and limited machine precision (about 16 digits in double precision) leaves little signal. The minimum error sits near h ≈ √ε ≈ 1.5 × 10⁻⁸ for the first-order methods, and near h ≈ ε^(1/3) ≈ 6 × 10⁻⁶ for the centred difference.

The interactive shows absolute error versus step size hh on a log–log plot for three schemes. The slopes encode the orders directly: forward and backward differences have slope 1 (every decade of hh halves… no, tenths the error), centred difference has slope 2 (every decade of hh shrinks error by 100). The behaviour at very small hh is the subject of the next section.

Roundoff: why small h is not always better

Smaller hh does not always mean smaller error. Look at the bottom of the interactive’s log–log plot: each curve eventually turns around and starts rising. This is roundoff error, and it is the part of numerical analysis you have to remember even when working at extremely high precision.

The issue: double-precision floats carry about 16 significant digits. If f(x+h)f(x + h) and f(x)f(x) agree to 14 digits because hh is small, the subtraction f(x+h)f(x)f(x + h) - f(x) keeps only 2 significant digits. Dividing by hh produces a catastrophic cancellation: the absolute error in the numerator is roughly machine epsilon ε1016\varepsilon \approx 10^{-16} times f|f|, and dividing by tiny hh amplifies that.

The total error is

error    Chtruncation  +  εfhroundoff.\text{error} \;\sim\; \underbrace{C\, h}_{\text{truncation}} \;+\; \underbrace{\frac{\varepsilon\, |f|}{h}}_{\text{roundoff}}.

Minimising over hh:

hopt    εf/C    108.h_{\text{opt}} \;\sim\; \sqrt{\varepsilon\, |f| / C} \;\sim\; 10^{-8}.

For the centred difference the truncation error is O(h2)\mathcal{O}(h^2), and the corresponding optimal step is hoptε1/3105h_{\text{opt}} \sim \varepsilon^{1/3} \sim 10^{-5} — larger but giving a smaller minimum error.

Pushing hh past these optima just hurts. The interactive’s “error stops decreasing” behaviour at small hh is exactly this.

Discretisation in space vs in time

Most simulations have both a spatial and a temporal step. The wave-equation finite-difference scheme uses hh for the grid spacing in xx and Δt\Delta t for the time step; the heat equation has the same. Choosing them independently is dangerous because the two affect stability (whether the scheme blows up) as well as accuracy. The von Neumann stability analysis of 10.3 connects the two.

What we use this for

Discretisation is the bottom layer of every numerical algorithm in the chapter:

The next lesson develops ODE solvers built from these primitives.