History
A chronological narrative.
The historical episodes from across this book, assembled in chronological order. Each entry links back to the lesson where it appears in full context.
13 history entries from this book, in chronological order.
Before 1700
1665 Newton, Leibniz, and why we have multiple notations 1 Derivatives
Differential calculus was developed independently by Isaac Newton in England (1665–1666, "fluxions") and Gottfried Wilhelm Leibniz in Germany (1675–1684). The two formulations are mathematically equivalent but use different notation: Newton's $\dot x$ for time derivatives, $\ddot x$ for second derivatives; Leibniz's $df/dx$, $d^2 f/dx^2$. Leibniz's notation generalises cleanly to multivariable calculus and made his approach dominant on the continent; Newton's notation survived in physics and mechanics, where time is a privileged variable.
The dispute over priority — fuelled by national rivalries and by Newton's accusations that Leibniz had plagiarised his work — soured Anglo-Continental mathematics for nearly a century. Britain stayed loyal to Newton's clunkier "fluxional" calculus; the Continent ran with Leibniz's notation and produced Euler, Lagrange, Laplace, and Fourier. The British eventually capitulated in the early 1800s. We use both notations today as a residue of the history: $\dot x$ for time, $\partial / \partial x$ for space, $f'$ when there is one variable and we don't want to be fussy about which.
1687 From Newton's spring to the bandpass filter 5 Second-order linear ODEs
The equation $m \ddot x + k x = 0$ for simple harmonic motion appears as Proposition 38, Book I, of Newton's 1687 *Principia*, in his analysis of a body oscillating on a "perfectly elastic" spring. Newton already knew that the solution was sinusoidal and that the period depended only on $\sqrt{m/k}$ — independent of amplitude. The same equation governs the small-angle pendulum (his Proposition 52), which is where the more famous SHM derivation lives.
Damping was added gradually through the 18th and 19th centuries; Lord Rayleigh's *Theory of Sound* (1877) gives the equation $m \ddot x + b \dot x + k x = 0$ in the modern form. The classification of regimes — *overdamped*, *underdamped*, *critically damped* — comes from late-19th-century galvanometer design, where engineers cared about getting the needle to settle as quickly as possible without ringing. The optimum is critical damping, and "critical damping" is a term of art that crossed over from galvanometers into acoustics, mechanical engineering, and circuit design wholesale.
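The three regimes correspond to the sign of the discriminant $b^2 - 4mk$ of the characteristic equation $m r^2 + b r + k = 0$. A minimal Python sketch of that classification, assuming positive $m$ and $k$; the function names `damping_regime` and `critical_damping` are illustrative and do not come from the lesson:

```python
import math


def critical_damping(m, k):
    """Damping coefficient b at which b^2 - 4 m k = 0 (assumes m, k > 0)."""
    return 2.0 * math.sqrt(m * k)


def damping_regime(m, b, k):
    """Classify m x'' + b x' + k x = 0 by comparing b^2 with 4 m k."""
    if math.isclose(b * b, 4.0 * m * k, rel_tol=1e-9):
        return "critically damped"
    if b * b > 4.0 * m * k:
        return "overdamped"      # two real decaying exponentials, no ringing
    return "underdamped"         # decaying oscillation (ringing)


if __name__ == "__main__":
    m, k = 1.0, 4.0
    for b in (1.0, critical_damping(m, k), 10.0):
        print(f"b = {b:5.2f}: {damping_regime(m, b, k)}")
```

The tolerance in `math.isclose` is a choice made for this sketch: exact critical damping is a single point, so any floating-point test for it needs some slack.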
The complex-impedance approach to forced oscillators ($\tilde X = F_0 / [(k - m\omega^2) + ib\omega]$ written as one line of algebra) was systematised by Charles Steinmetz for AC circuits in the 1890s — see also the [Complex Exponentials chapter](/foundations/complex-exponentials). The same algebra of impedances ties together acoustic, electrical, and mechanical filters; the **bandpass filter** of every audio EQ and every radio tuner is exactly this driven-oscillator equation, with different physical meanings for the symbols.
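The bandpass behaviour can be seen by evaluating the one-line formula for $\tilde X$ at a few drive frequencies. The sketch below is illustrative only: the helper name `complex_amplitude` and the parameter values are assumptions chosen for the demonstration.

```python
import math


def complex_amplitude(omega, m, b, k, f0=1.0):
    """X~ = F0 / [(k - m w^2) + i b w] for a driven damped oscillator."""
    return f0 / complex(k - m * omega**2, b * omega)


if __name__ == "__main__":
    m, b, k = 1.0, 0.1, 1.0          # hypothetical values, light damping
    omega0 = math.sqrt(k / m)        # undamped natural frequency
    for omega in (0.5 * omega0, omega0, 2.0 * omega0):
        x = complex_amplitude(omega, m, b, k)
        print(f"omega = {omega:4.2f}  |X| = {abs(x):7.3f}")
```

For light damping the response is large near $\omega_0 = \sqrt{k/m}$ and small well below or above it, which is the passband of the filter.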
18th century
1733 From de Moivre to Laplace to Gauss 10 The Gaussian and the central limit theorem
The bell curve's first appearance was in 1733, when Abraham de Moivre computed the limiting shape of the binomial distribution as $n \to \infty$. He showed that the binomial probabilities $\binom{n}{k} p^k (1-p)^{n-k}$ are approximated by a Gaussian curve for large $n$, a result we would now call a special case of the Central Limit Theorem. The result was buried in an obscure pamphlet; few people read it.
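De Moivre's approximation is easy to check numerically. The sketch below compares the exact binomial probability with a Gaussian density of mean $np$ and variance $np(1-p)$. The function names are illustrative, and the modern mean-and-variance form is an assumption of this sketch rather than de Moivre's own notation.

```python
import math


def binomial_pmf(n, k, p):
    """Exact binomial probability C(n, k) p^k (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def gaussian_approx(n, k, p):
    """Gaussian density with mean n p and variance n p (1 - p), evaluated at k."""
    mu = n * p
    var = n * p * (1 - p)
    return math.exp(-((k - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


if __name__ == "__main__":
    n, p = 1000, 0.5
    for k in (450, 480, 500, 520, 550):
        print(k, binomial_pmf(n, k, p), gaussian_approx(n, k, p))
```

The agreement improves as $n$ grows and is best near the centre of the distribution.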
The curve was rediscovered and popularised by Pierre-Simon Laplace, who derived a more general central-limit result in his 1812 *Théorie analytique des probabilités*. Laplace argued that sums of many independent measurement errors should be Gaussian-distributed, regardless of the individual error distributions — the modern CLT framing.
Carl Friedrich Gauss developed the distribution from a completely different angle in 1809: he asked, *what distribution makes the sample mean the maximum-likelihood estimator of the true value?* The unique answer is the Gaussian. This is why we call it Gaussian today, even though de Moivre had the curve some 75 years earlier and Laplace had the limit theorem.
The proof of the CLT in its modern form is due to Aleksandr Lyapunov in 1901 and Jarl Waldemar Lindeberg in 1922. The Lindeberg condition — a precise statement of "no individual $X_i$ should dominate the sum" — is what makes the theorem rigorous.
1747 d'Alembert, Euler, and the vibrating-string controversy 6 The 1-D wave equation: d’Alembert and characteristics
Jean le Rond d'Alembert derived the traveling-wave solution $u(x, t) = F(x - ct) + G(x + ct)$ in his 1747 *Recherches sur la courbe que forme une corde tendue mise en vibration*, the first solution of a partial differential equation in history. The setup was a vibrating string of length $L$ pinned at both ends; his solution combined right- and left-going waves to satisfy both the wave equation and the boundary conditions.
A controversy followed almost immediately. Leonhard Euler in 1748 pointed out that d'Alembert's $F$ and $G$, being functions of the single combined variable $x \pm ct$, could in principle be *arbitrary* curves, not just analytic formulae. D'Alembert insisted on smooth analytic functions only; Euler insisted on admitting "geometric" curves like piecewise-linear shapes. Daniel Bernoulli in 1753 proposed a third approach: the solution should be a superposition of sinusoidal modes, exactly the [Fourier-series picture](/foundations/fourier/fourier-series). That proposal led to a further dispute between Bernoulli, d'Alembert, and Euler over whether *any* function could be represented as such a sum.
The full reconciliation came only after Fourier's 1822 *Théorie analytique de la chaleur* (and a century of subsequent foundational work in analysis): yes, the two pictures are equivalent and both admit arbitrary reasonable functions, but doing so required a more careful understanding of what "function" and "convergence" meant. The 75-year vibrating-string controversy ended up being the seed dispute that motivated modern analysis. See also the [History block in 7.1](/foundations/fourier/fourier-series) — the two stories are continuous.
1748 Euler 1748, Steinmetz 1893 3 Euler's formula and the phasor
Leonhard Euler stated the identity $e^{i\theta} = \cos\theta + i\sin\theta$ in his 1748 *Introductio in analysin infinitorum*. He derived it from the Taylor series, much as in the lesson, treating the substitution $x \to i\theta$ in the exponential series as a formal manipulation. At the time the legitimacy of complex numbers was contested (some mathematicians regarded $\sqrt{-1}$ as a meaningless symbol), and Euler's identity was one of the strongest arguments for taking them seriously. The special case $\theta = \pi$ gives $e^{i\pi} + 1 = 0$, often cited as the most beautiful equation in mathematics for the way it ties together five fundamental constants.
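The formal substitution can be tried numerically: sum the exponential series at $z = i\theta$ and compare with $\cos\theta + i\sin\theta$. This is a small sketch, not Euler's derivation; the function name `exp_series` and the 40-term cutoff are assumptions made for illustration.

```python
import math


def exp_series(z, terms=40):
    """Partial sum of the exponential series sum_n z^n / n!."""
    total = 0j
    term = 1 + 0j                 # current term z^n / n!, starting at n = 0
    for n in range(terms):
        total += term
        term *= z / (n + 1)       # step from z^n / n! to z^(n+1) / (n+1)!
    return total


if __name__ == "__main__":
    for theta in (0.5, 1.0, math.pi):
        lhs = exp_series(1j * theta)
        rhs = complex(math.cos(theta), math.sin(theta))
        print(theta, lhs, rhs)
```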
The use of complex exponentials as *phasors* for engineering analysis came nearly 150 years later. Charles Proteus Steinmetz, a German-American engineer at General Electric, introduced the phasor method in an 1893 paper to handle AC-circuit analysis. Before Steinmetz, the equations of alternating-current networks were solved by trigonometric identities — slow, error-prone, and unscalable. Steinmetz's phasor representation collapsed the algebra into single-line formulas, and within a decade AC power systems were the standard. The same trick reaches into acoustics through the wave-equation phasor solutions you'll meet in [Sound Ch 5](/sound/energy-and-impedance/plane-harmonic-waves).
1763 Bayes 1763, Laplace 1774, and a 200-year argument 10 Bayesian inference and signal detection
Thomas Bayes was a Presbyterian minister and amateur mathematician in 18th-century England. He wrote *An Essay towards solving a Problem in the Doctrine of Chances* sometime before his death in 1761, but never published it. The manuscript was found among his papers by Richard Price, who edited and submitted it to the Royal Society; it appeared in the *Philosophical Transactions* in 1763, two years after Bayes had died.
The paper introduced what we now call Bayes' rule — initially as a special case for the binomial distribution — and applied it to the problem of estimating an unknown probability from observed successes and failures. The crucial conceptual move was to treat the unknown parameter (the probability of success) as itself having a distribution. This was philosophically radical: parameters were generally thought of as fixed unknowns, not as random variables.
Pierre-Simon Laplace independently rediscovered and generalised the rule in his 1774 *Mémoire sur la probabilité des causes par les événements*. Laplace took it much further — using Bayesian arguments throughout his career to tackle problems from celestial mechanics (determining the orbits of comets) to demography (estimating population sizes from birth-rate data).
The Bayesian / frequentist split crystallised in the early 20th century, with Ronald Fisher, Jerzy Neyman, and Egon Pearson on the frequentist side arguing for objective statistics that need no prior distribution, and Harold Jeffreys, Bruno de Finetti, and L. J. Savage on the Bayesian side defending the interpretation of probability as degree of belief. The argument lasted decades; modern statistics largely shrugs and uses both. The rise of computational Bayesian methods (Markov-chain Monte Carlo, variational inference) in the 1990s tipped the practical balance toward Bayesian methods for complex models, and machine learning's adoption of probabilistic-programming languages (Stan, PyMC, Pyro) has made Bayesian inference a routine option for such models today.
19th century
1805 Gauss had the FFT in 1805 9 The FFT
The Cooley–Tukey algorithm was published in 1965, in James Cooley and John Tukey's five-page paper *An algorithm for the machine calculation of complex Fourier series*. The paper is credited as the foundation of modern digital signal processing: over the following decades, audio compression, radar pulse compression, and MRI reconstruction all came to depend on it. By the 1980s the FFT was running on dedicated DSP chips in millions of consumer devices.
The algorithm had been written down before. Carl Friedrich Gauss, in 1805, was fitting trigonometric series to astronomical observations of the orbits of the asteroids Pallas and Juno. He computed the Fourier coefficients of his data points via what we now recognise as a radix-2 decomposition, with the same butterfly structure as Cooley–Tukey and the same $\mathcal{O}(N \log N)$ scaling. He wrote the calculation in a Latin notebook entry that was never published in his lifetime; the relevant section appeared only in Volume 3 of his collected works in 1866, almost a century before Cooley and Tukey published theirs. The Gauss algorithm was found by historians of mathematics in the 1970s, *after* the FFT had already conquered signal processing under Cooley and Tukey's names.
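The butterfly structure is easiest to see in the textbook recursive radix-2 form: split the samples into even- and odd-indexed halves, transform each, and combine with twiddle factors. This sketch is the modern presentation, not a reconstruction of Gauss's notebook. The function names and the sign convention $e^{-2\pi i k/N}$ are assumptions, and the input length must be a power of two.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])    # transform of even-indexed samples
    odd = fft_radix2(x[1::2])     # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor
        out[k] = even[k] + tw                             # butterfly, upper output
        out[k + n // 2] = even[k] - tw                    # butterfly, lower output
    return out


def dft_naive(x):
    """Direct O(N^2) DFT, used here only as a reference for comparison."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
    fast, slow = fft_radix2(samples), dft_naive(samples)
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

Each level of recursion does $\mathcal{O}(N)$ work and there are $\log_2 N$ levels, which is where the $\mathcal{O}(N \log N)$ cost comes from.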
The lesson, as far as there is one: an algorithm that no one knows about benefits no one. The FFT's 160-year hibernation between Gauss and Cooley–Tukey is one of the clearer cases of "the right idea, in the wrong notebook, at the wrong time." Modern numerical computing's debt is to the *rediscovery* and its consequences, not to the original.
1807 Fourier's heat equation and a rejected memoir 6 The heat equation and Laplace’s equation
Joseph Fourier wrote the heat equation $\partial_t u = D \nabla^2 u$ in his 1807 memoir to the French Academy of Sciences, *Sur la propagation de la chaleur dans les corps solides*. To solve it on a bounded interval, he proposed expanding the initial temperature as a sum of sinusoidal modes — what we now call a [Fourier series](/foundations/fourier/fourier-series) — and showing that each mode decayed independently with rate $D k^2$.
The memoir was rejected. Lagrange, on the review panel, objected that "arbitrary functions" could not in general be expressed as such a sum, and the mathematics of convergence wasn't rigorous enough to settle the question. Fourier rewrote, expanded, and resubmitted; the work was published as *Théorie analytique de la chaleur* in 1822. Over the following century its influence spread across mathematics: the analytical machinery built to make Fourier's claims rigorous (Cauchy's theory of convergence, Riemann's theory of integration, Cantor's set theory, Lebesgue's measure theory) became the foundation of modern analysis. The same machinery underwrites every PDE technique in this chapter and the [Fourier methods of Foundations 7](/foundations/fourier).
The irony is that the heat equation, derived by Fourier as the physical motivation for the series, ended up far less famous in physics than the *Fourier transform* that came out of the analytic theory built to validate his solution. Generations of physics students meet Fourier methods without ever learning that he was trying to solve the heat-diffusion problem.
1822 Fourier, Bernoulli, and the function controversy 7 Fourier series
Joseph Fourier introduced the trigonometric-series decomposition in his 1822 *Théorie analytique de la chaleur* ([Fourier 1822](/foundations/bibliography#fourier-1822)), motivated by the heat equation. His claim — that *any* function on a bounded interval could be expanded as such a series — was sharply contested by Lagrange and others, because it required admitting functions with corners, jumps, and other "pathological" features that the 18th-century theory of analysis could not handle.
The same dispute, in different form, had played out 75 years earlier between d'Alembert, Euler, and Daniel Bernoulli over the vibrating-string solution (see [Sound 3.3](/sound/oscillator-to-wave/dalembert)). Fourier's work forced the resolution: a "function" is anything that takes input to output, not just an analytic formula. Modern analysis — Cauchy's theory of convergence, Riemann's theory of integration, Cantor's set theory, Lebesgue's measure theory — was built to make Fourier's claim rigorous. Acoustics ended up getting its frequency-domain methods as a byproduct.
The Gibbs phenomenon is a footnote in the same story. Wilbraham noticed the overshoot in 1848, but his paper was forgotten. In 1898 the physicist Albert Michelson — of Michelson-Morley fame — built a mechanical harmonic analyser and observed the overshoot. When he wrote a letter to *Nature* asking whether this was an artefact of his apparatus, Gibbs replied in 1899 with the mathematical explanation. The phenomenon was named for Gibbs even though Wilbraham had it first.
1850 From Cayley to Hilbert: a century building the spectral theorem 4 Eigenvalues and eigenvectors
Matrix algebra as we know it was assembled by Arthur Cayley and James Joseph Sylvester in the 1850s in England. Cayley's 1858 *Memoir on the Theory of Matrices* defined matrix addition, multiplication, and the characteristic polynomial — the equation $\det(A - \lambda I) = 0$ from this lesson. Sylvester coined the word "matrix" in 1850 and introduced "discriminant" and "minor" along with much of the modern vocabulary. The two were friends and lifelong correspondents; the era is sometimes called the *Cayley–Sylvester period* of algebra.
The eigenvalue–eigenvector machinery was fully understood for finite matrices by the 1880s. The leap to infinite dimensions (operators on function spaces, the natural home of PDEs and quantum mechanics) was made by David Hilbert in the early 1900s, in his work on integral equations. Hilbert's six papers from 1904–1910 established what we now call **Hilbert space**, and the statement that compact self-adjoint operators on a Hilbert space have a complete orthonormal eigenbasis is the **spectral theorem** in its simplest form, the deepest result in the chain. The full machinery was reformulated and extended to unbounded operators by John von Neumann, once Hilbert's assistant in Göttingen, between the late 1920s and the early 1930s, providing the mathematical foundation that Werner Heisenberg's matrix mechanics and Erwin Schrödinger's wave mechanics needed to be the same theory. Eigenvalues, in other words, ran the central arc of mathematical physics from 1850 to 1930.
1877 Rayleigh, Buckingham, and the dimensionless number 8 Dimensional analysis
Lord Rayleigh's 1877 *Theory of Sound* used dimensional reasoning throughout — to guess scaling laws, to check derivations, to argue that certain phenomena could only depend on dimensionless combinations of parameters. He didn't formalise the technique; he just used it everywhere. By the early 1900s "Rayleigh's method" was an informal craft.
In 1883 Osborne Reynolds, studying flow through pipes, identified what we now call the **Reynolds number** $Re = \rho U L / \mu$ — a dimensionless group whose value distinguished laminar from turbulent flow regardless of the absolute scale of the pipe. This was the first time a *named* dimensionless number was understood as the physically-meaningful parameter of a problem.
In 1914 Edgar Buckingham (US Bureau of Standards) formalised what Rayleigh had been doing: if a physical relationship involves $n$ variables with $k$ independent dimensions, it can be rewritten as a relation among $n - k$ dimensionless groups. The **Buckingham π theorem** turned dimensional analysis from craft into a recipe. Almost every dimensionless number in physics (Mach, Reynolds, Prandtl, Strouhal, Helmholtz) fits this framework, including those identified before the theorem was stated.
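The counting rule can be made concrete with the pipe-flow variables behind the Reynolds number. In the sketch below, each column of the dimension matrix holds the exponents of mass, length, and time for one of $\rho$, $U$, $L$, $\mu$, and $k$ is taken to be the rank of that matrix. The helper names are hypothetical, and the choice of these four variables is an assumption made for illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue                                  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


# Columns: rho, U, L, mu.  Rows: exponents of mass, length, time.
DIMENSION_MATRIX = [
    [1, 0, 0, 1],      # mass:   rho ~ M L^-3, mu ~ M L^-1 T^-1
    [-3, 1, 1, -1],    # length
    [0, -1, 0, -1],    # time
]


def count_pi_groups(dim_matrix):
    """n - k: number of variables minus the rank of the dimension matrix."""
    return len(dim_matrix[0]) - matrix_rank(dim_matrix)


def is_dimensionless(dim_matrix, exponents):
    """True if the product of variables raised to `exponents` has no dimensions."""
    return all(sum(a * e for a, e in zip(row, exponents)) == 0 for row in dim_matrix)


if __name__ == "__main__":
    print(count_pi_groups(DIMENSION_MATRIX))                  # number of groups
    print(is_dimensionless(DIMENSION_MATRIX, [1, 1, 1, -1]))  # rho U L / mu
```

With four variables and three independent dimensions there is a single dimensionless group, and the exponents $(1, 1, 1, -1)$ give $\rho U L / \mu$.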
1880 The vectors that fought a war 2 Divergence and curl
Vector calculus as we use it — gradient, divergence, curl, the $\nabla$ operator — was assembled between 1853 and the 1890s out of two competing formalisms.
William Rowan Hamilton invented **quaternions** in 1843 (allegedly carving the formula $i^2 = j^2 = k^2 = ijk = -1$ into the stone of Brougham Bridge in Dublin). He intended them as the natural algebra for three-dimensional rotations and physical quantities, and spent the rest of his life evangelising for them. James Clerk Maxwell's *Treatise on Electricity and Magnetism* (1873) was written in a hybrid quaternion notation: the operator we now call $\nabla$ was Hamilton's symbol, which picked up the nickname "nabla" (after a Hebrew harp shaped like the symbol) in the circle around Tait and Maxwell.
In the 1880s, J. Willard Gibbs (at Yale) and Oliver Heaviside (in England, working independently) extracted a stripped-down "vector algebra" from Hamilton's quaternions, keeping the dot and cross products and abandoning the quaternion arithmetic, and used it to reformulate Maxwell's equations into the form we now see. A pitched war broke out in the late-19th-century mathematical journals between the quaternion adherents (Peter Guthrie Tait was the loudest) and the new vector-calculus camp (Gibbs, Heaviside). The vector-calculus side won decisively. By 1900, physics and engineering had largely abandoned quaternions; today they survive mainly where rotations must be represented and interpolated (computer graphics, robotics, spacecraft attitude control) and in pure mathematics. The notation $\nabla$ and the calculus you use here are the residue of Gibbs and Heaviside's victory.
Early 20th century
1926 Schrödinger 1926, and the two quantum mechanicses 6 The Schrödinger equation
Quantum mechanics was discovered twice within a year. Werner Heisenberg's 1925 paper introduced **matrix mechanics**: physical observables were represented by infinite matrices and the dynamics by matrix multiplication. The mathematics was unfamiliar to physicists (Born and Jordan had to teach Heisenberg what a matrix was), but it correctly predicted the spectral lines of the hydrogen atom and the spectra of more complicated atoms.
Erwin Schrödinger, working independently in early 1926, was guided by de Broglie's 1924 hypothesis that matter has wave-like character. He wrote down the wave equation $i\hbar\, \partial_t \Psi = \hat H \Psi$ and showed that the eigenvalues of $\hat H$ for the hydrogen-atom potential gave the Bohr energy levels exactly. The mathematics was the *separation-of-variables* technique already familiar from acoustics, which is precisely the parallel this lesson develops.
The two formulations looked utterly different. Heisenberg's was algebraic and discrete; Schrödinger's was differential and continuous. Within months of publication (1926), Schrödinger himself proved that the two were mathematically equivalent: different representations of the same theory. Paul Dirac's 1930 textbook *The Principles of Quantum Mechanics* and John von Neumann's 1932 *Mathematische Grundlagen der Quantenmechanik* gave the unified abstract formulation in terms of operators on Hilbert space, which is the formulation modern physics uses. It is the same Hilbert space, complete with self-adjoint operators and the [spectral theorem](/foundations/linear-algebra/spectral-theorem), that runs through the rest of [Foundations 6](/foundations/pdes/what-is-a-pde).