The derivative is the central object of calculus: a way to talk about rates of change with the same precision arithmetic uses to talk about quantities. Almost every physical formula on this bookshelf is a relation among derivatives, or an equation derived by setting a derivative equal to something. Newton’s laws, Maxwell’s equations, the wave equation, the heat equation — all are differential equations, which is to say all of them are statements about derivatives.
This first lesson is the working refresher. It introduces the derivative geometrically and algebraically, recounts the priority dispute that shaped how we write derivatives today, and lists the four manipulation rules we lean on constantly.
The derivative as a limit
The derivative of f(t) at t is defined as the limit
$$f'(t)=\frac{df}{dt}=\lim_{h\to 0}\frac{f(t+h)-f(t)}{h}.$$
Geometrically it is the slope of the tangent line to the curve y=f(t) at the point t. Physically it is the instantaneous rate of change of f: how fast f would be changing if its current behaviour continued for an instant longer. When the independent variable is time we also write ḟ; when it is space, f′. Both conventions appear across the books.
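[Interactive: function selector with a draggable point on the curve; the readout shows f(x) and the slope f′(x), for example f(x) = 1.00, slope f′(x) = 2.00.]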
The interactive makes the central definition of the derivative visible: f′(x) at a point is the slope of the tangent line. Pick a function, drag the point along the curve, and watch the slope readout follow. If you take nothing else from single-variable calculus, take this picture: every other operation we do with derivatives (chain rule, optimisation, Taylor expansion, linearisation) is built on it.
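The limit in the definition can also be watched converging numerically. The following is a minimal sketch, not part of the lesson itself: the helper name `difference_quotient` and the test function `square` are illustrative choices, and the step sizes are arbitrary.

```python
def difference_quotient(f, t, h):
    """Slope of the secant line through (t, f(t)) and (t + h, f(t + h))."""
    return (f(t + h) - f(t)) / h


def square(t):
    # Illustrative test function (an assumption); its exact derivative is 2t.
    return t ** 2


t0 = 1.0
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    slope = difference_quotient(square, t0, h)
    print(f"h = {h:g}: secant slope = {slope:.6f}")
```

The printed secant slopes should approach 2, the value of 2t at t = 1, as h shrinks. Making h much smaller eventually makes the estimate worse rather than better, because floating-point rounding in f(t+h) − f(t) starts to dominate.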
The history: Newton, Leibniz, and why we have multiple notations
Differential calculus was developed independently by Isaac Newton in England (1665–1666, “fluxions”) and Gottfried Wilhelm Leibniz in Germany (1675–1684). The two formulations are mathematically equivalent but use different notation: Newton’s ẋ for time derivatives, ẍ for second derivatives; Leibniz’s df/dx, d²f/dx². Leibniz’s notation generalises cleanly to multivariable calculus and made his approach dominant on the continent; Newton’s notation survived in physics and mechanics, where time is a privileged variable.
The dispute over priority, fuelled by national rivalries and by Newton’s accusations that Leibniz had plagiarised his work, soured Anglo-Continental mathematics for nearly a century. Britain stayed loyal to Newton’s clunkier “fluxional” calculus; the Continent ran with Leibniz’s notation and produced Euler, Lagrange, Laplace, and Fourier. The British eventually capitulated in the early 1800s. We use both notations today as a residue of the history: ẋ for time, ∂/∂x for space, f′ when there is one variable and we don’t want to be fussy about which.
The four manipulation rules
Most derivatives on this bookshelf are computed by combining four rules. Memorise them; everything else follows.
Linearity
(af+bg)′=af′+bg′
for constants a,b. The derivative of a sum is the sum of the derivatives, scaled by the constants. This is what allows Fourier methods: differentiating a sum of sinusoids gives a sum of (differentiated) sinusoids, with frequencies preserved.
Example: $\frac{d}{dt}\left[3\sin(t)+2t^{2}\right]=3\cos(t)+4t.$
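The linearity example can be sanity-checked numerically. This is a rough sketch under stated assumptions: `numerical_derivative` is a hypothetical helper using a central difference (an approximation, not the exact limit), and the evaluation point t0 = 0.7 is arbitrary.

```python
import math


def numerical_derivative(func, t, step=1e-5):
    """Central-difference estimate of func'(t); approximate by construction."""
    return (func(t + step) - func(t - step)) / (2 * step)


def combined(t):
    # The function from the example: 3 sin(t) + 2 t^2.
    return 3 * math.sin(t) + 2 * t ** 2


def from_linearity(t):
    # Term-by-term derivative: 3 cos(t) + 4 t.
    return 3 * math.cos(t) + 4 * t


t0 = 0.7  # arbitrary sample point
estimate = numerical_derivative(combined, t0)
expected = from_linearity(t0)
print(f"numerical: {estimate:.8f}   from linearity: {expected:.8f}")
```

The two printed numbers should agree to several decimal places; any remaining gap comes from the finite step and floating-point rounding, not from the rule.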
Product rule
(fg)′=f′g+fg′.
The derivative of a product is not the product of the derivatives. Each factor in turn gets differentiated while the other holds still; sum the two contributions.
Example: $\frac{d}{dt}\left[t^{2}\sin t\right]=2t\sin t+t^{2}\cos t.$
The reason: to first order in h, f(t+h)≈f+f′h and g(t+h)≈g+g′h, so the product at t+h is (f+f′h)(g+g′h)=fg+(f′g+fg′)h+f′g′h². Subtracting fg and dividing by h leaves f′g+fg′+f′g′h, and the last term vanishes as h→0. Both factors contribute, but only their first-order pieces.
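The vanishing leftover term can be seen numerically. The sketch below uses the product from the example, t² sin t, and compares its difference quotient with f′g+fg′ for shrinking h. The function names and the sample point t0 = 1.2 are illustrative assumptions.

```python
import math


def f(t):
    return t ** 2


def f_prime(t):
    return 2 * t


def g(t):
    return math.sin(t)


def g_prime(t):
    return math.cos(t)


def product_quotient(t, h):
    """Difference quotient of the product f*g at t with step h."""
    return (f(t + h) * g(t + h) - f(t) * g(t)) / h


t0 = 1.2  # arbitrary sample point
product_rule = f_prime(t0) * g(t0) + f(t0) * g_prime(t0)
naive_product = f_prime(t0) * g_prime(t0)  # the tempting but wrong answer

for h in (1e-1, 1e-2, 1e-3):
    gap = product_quotient(t0, h) - product_rule
    print(f"h = {h:g}: quotient minus product rule = {gap:.6f}")

print(f"product rule: {product_rule:.6f}   naive f'g': {naive_product:.6f}")
```

The gap should shrink roughly in proportion to h, which is what a leftover term of order h predicts, while the naive product of derivatives stays far from the true slope.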
Chain rule
(f(g(t)))′=f′(g(t))g′(t).
The derivative of a composed function is the outer derivative, evaluated at g(t), multiplied by the inner derivative. This is the rule most often misapplied in practice, usually by dropping the inner factor g′(t).
Example: $\frac{d}{dt}\sin(t^{2})=\cos(t^{2})\cdot 2t.$
[Interactive: composition selector with three panels showing g(x), f(u), and h(x)=f(g(x)).]
Drag x. The blue probe sits on g(x); its height u = g(x) sets the horizontal position of the red probe on f(u); the height of the red probe is h = f(u) = f(g(x)), which is the height of the black probe on the composed curve at the same x. Each panel's tangent line is the local rate of its own function. The chain rule below multiplies the two component rates to recover the slope of the composition.
The interactive shows the three pieces — inner g(x), outer f(u), and composition h(x)=f(g(x)) — for several common compositions. As you slide x, watch how a change in x first changes g, which then changes f. The product of those two rates is h′(x). The dashed gold guide lines trace the variable flow: x→u=g(x)→h=f(u).
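The same flow x→u=g(x)→h=f(u) can be traced in a few lines of code. This is a minimal sketch assuming the composition sin(x²) from the example above; `numerical_derivative` is a hypothetical central-difference helper, and the names `inner`, `outer`, `composed` and the point x0 = 0.8 are illustrative choices.

```python
import math


def numerical_derivative(func, x, step=1e-5):
    """Central-difference estimate of func'(x); approximate by construction."""
    return (func(x + step) - func(x - step)) / (2 * step)


def inner(x):
    return x ** 2  # g(x)


def outer(u):
    return math.sin(u)  # f(u)


def composed(x):
    return outer(inner(x))  # h(x) = f(g(x))


x0 = 0.8  # arbitrary sample point
u0 = inner(x0)

inner_rate = numerical_derivative(inner, x0)        # g'(x0)
outer_rate = numerical_derivative(outer, u0)        # f'(u0), evaluated at u0 = g(x0)
composed_rate = numerical_derivative(composed, x0)  # h'(x0)
exact = math.cos(x0 ** 2) * 2 * x0                  # cos(x^2) * 2x from the example

print(f"g'(x0) * f'(u0) = {inner_rate * outer_rate:.8f}")
print(f"h'(x0)          = {composed_rate:.8f}")
print(f"closed form     = {exact:.8f}")
print(f"outer rate only = {outer_rate:.8f}  (inner factor dropped)")
```

The first three numbers should agree closely. The last line shows the typical mistake: the outer derivative alone, with the inner factor dropped, gives a visibly different slope.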
Inverse rule
If y=f(x) and x=f⁻¹(y), then
$$\frac{dx}{dy}=\frac{1}{dy/dx}.$$
The slope of an inverse function at a point is the reciprocal of the slope of the original function at the corresponding point, provided that slope is nonzero.
Example: if $y=e^{x}$, then $x=\ln y$ and $dy/dx=e^{x}$, giving $\frac{d\ln y}{dy}=\frac{1}{e^{x}}=\frac{1}{y}$. Hence $(\ln y)'=1/y$: the standard logarithmic derivative falls out of the inverse rule applied to the exponential.
Geometrically: the graph of f⁻¹ is the reflection of the graph of f across the line y=x. Reflection swaps rise and run, so slopes are reciprocals.
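The reciprocal relationship can be checked for the exponential and logarithm pair from the example. This is a rough sketch: `numerical_derivative` is a hypothetical central-difference helper, and the point x0 = 0.5 is an arbitrary choice.

```python
import math


def numerical_derivative(func, x, step=1e-5):
    """Central-difference estimate of func'(x); approximate by construction."""
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 0.5            # arbitrary point on the x-axis
y0 = math.exp(x0)   # corresponding point y = e^x

slope_f = numerical_derivative(math.exp, x0)        # dy/dx at x0
slope_inverse = numerical_derivative(math.log, y0)  # dx/dy at y0

print(f"dy/dx at x0       = {slope_f:.8f}")
print(f"dx/dy at y0       = {slope_inverse:.8f}")
print(f"product of slopes = {slope_f * slope_inverse:.8f}")
print(f"1 / y0            = {1 / y0:.8f}")
```

The product of the two slopes should come out very close to 1, and the slope of the logarithm at y0 should match 1/y0, in line with (ln y)′ = 1/y.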
What’s next
The next lesson, 1.2 — Integrals, develops integration: the inverse operation that recovers a function from its derivative. The rules for integrals mirror the rules for derivatives — substitution is the chain rule run backwards, integration by parts is the product rule run backwards — and together they form the closed loop that the fundamental theorem of calculus makes precise.