1.7 The Shepard tone

The Shepard tone — named after Roger Shepard, who described it in 1964 — is a sound that seems to climb endlessly in pitch, like an Escher staircase in audio. It also has a descending sibling that seems to fall forever. Try the interactive below first; the explanation follows.

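[Interactive demo: Shepard tone player. The plot shows tone amplitude against frequency (Hz, log scale, 50 Hz to 10 kHz) with the amplitude envelope overlaid. Glide rate: 0.50 oct/s. Use headphones; close your eyes and try to find where the rising ends.]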

Then a more produced version from Vsauce, for comparison:

Vsauce's classic exploration of the Shepard tone, the auditory analogue of a Penrose staircase.

What is happening

The Shepard tone is a constructed sound, not a physical impossibility. It is a sum of pure sine tones at frequencies spaced an octave apart:

$$
s(t) = \sum_{i=0}^{N-1} w_i(t) \cos\!\bigl(2\pi f_i(t)\, t\bigr),
$$

where each $f_i(t)$ rises in lockstep with the others (e.g. all doubling over the same time interval), and the amplitude weights $w_i(t)$ come from a fixed envelope in log-frequency, typically a Gaussian centred around the middle of the audible range. Tones whose current frequency is near the centre of the envelope are loud; tones whose frequency has drifted to the very low or very high end are nearly silent.
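The construction can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the code behind the interactive: the function name `shepard_tone` and the values of `n_tones`, `f_min`, `sigma` and the 44.1 kHz sample rate are assumptions chosen for the example. One detail differs from the compact formula: the sketch accumulates phase as the running integral of each tone's instantaneous frequency, which is the usual way to synthesise a glide, rather than multiplying the frequency by $t$ directly.

```python
import numpy as np

def shepard_tone(duration=4.0, rate=0.5, n_tones=10, f_min=20.0,
                 sigma=1.5, sample_rate=44100):
    """Return (samples, sample_rate) for a Shepard glide.

    rate is in octaves per second; a negative rate gives the falling version.
    All default values are illustrative assumptions.
    """
    t = np.arange(int(duration * sample_rate)) / sample_rate
    span = float(n_tones)        # the stack covers n_tones octaves above f_min
    centre = span / 2.0          # peak of the Gaussian envelope, in octaves
    signal = np.zeros_like(t)
    for i in range(n_tones):
        # Position of tone i in octaves above f_min, wrapped into [0, span).
        pos = (i + rate * t) % span
        freq = f_min * 2.0 ** pos                               # f_i(t)
        weight = np.exp(-0.5 * ((pos - centre) / sigma) ** 2)   # w_i(t)
        # Phase is the running integral of the instantaneous frequency.
        phase = 2.0 * np.pi * np.cumsum(freq) / sample_rate
        signal += weight * np.cos(phase)
    # Normalise the peak to 1 so the result can be scaled for playback.
    return signal / np.max(np.abs(signal)), sample_rate

samples, sr = shepard_tone(duration=1.0)
```

With these defaults the highest tone stays below 20.5 kHz, under the Nyquist limit, and a tone at either end of the stack has a weight of roughly 0.004, so the wrap from top to bottom happens where the tone is nearly silent. To listen to the result, the samples can be scaled to 16-bit integers and written out with the standard-library `wave` module.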

As all $f_i$ rise by an octave over some interval $T$, the set of frequencies present is unchanged: every tone has become the next-higher tone in the stack, the top tone has faded out at the high end, and a new one has faded in at the bottom. The envelope has not moved. The signal at time $t + T$ is essentially the same signal as at time $t$, yet during the interval the listener heard each individual tone rising.
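The periodicity claim is easy to check numerically. The sketch below lists the frequency and weight of every component at a given instant and compares two instants one octave-time apart. The helper name `components` and all parameter values are assumptions made for illustration; the 0.5 oct/s rate mirrors the setting shown in the interactive.

```python
import numpy as np

def components(t, rate=0.5, n_tones=10, f_min=20.0, sigma=1.5):
    """Frequencies and weights present at time t (seconds), sorted by frequency."""
    span = float(n_tones)
    # Position of each tone in octaves above f_min, wrapped into [0, span).
    pos = (np.arange(n_tones) + rate * t) % span
    freq = f_min * 2.0 ** pos
    weight = np.exp(-0.5 * ((pos - span / 2.0) / sigma) ** 2)
    order = np.argsort(freq)
    return freq[order], weight[order]

T = 1.0 / 0.5                        # seconds for every tone to rise one octave
f0, w0 = components(0.3)
f1, w1 = components(0.3 + T)         # one full octave later
f_half, _ = components(0.3 + T / 2)  # half an octave later

same_after_T = np.allclose(f0, f1) and np.allclose(w0, w1)
same_after_half = np.allclose(f0, f_half)
print(same_after_T, same_after_half)
```

The component set repeats after $T$ but not after $T/2$, so the stimulus has period $T$ even though every component was rising throughout.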

The brain tracks the individual tones. Each tone really did rise. So the listener hears an unambiguous rise, even though the global stimulus is periodic and goes nowhere. The illusion exposes the brain’s commitment to interpreting the auditory scene as a small number of streams, and its preference for continuous, slowly-varying streams over abrupt jumps. Frequency proximity wins: the brain stitches each rising tone to itself across cycles, and the listener experiences a perpetual ascent.
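The proximity argument can be made concrete with a toy rule. This is an assumption-laden illustration, not a model of auditory processing. Because the tones sit exactly one octave apart, a shift of the whole stack by some number of octaves produces the same set of frequencies as that shift plus or minus any whole number of octaves. A listener who links each tone to its nearest successor in log-frequency would therefore hear the smallest signed move that explains the change. The function name `perceived_step` is hypothetical.

```python
def perceived_step(step_octaves):
    """Smallest signed move (in octaves) consistent with shifting the stack.

    The result lies in [-0.5, 0.5): positive means the nearest match is
    upward, negative means it is downward.
    """
    return (step_octaves + 0.5) % 1.0 - 0.5

for step in (0.01, 0.25, 0.75, 1.0):
    print(f"shift {step:+.2f} oct -> nearest match {perceived_step(step):+.2f} oct")
```

In a continuous glide the shift between any two nearby moments is a tiny fraction of an octave, so under this rule the nearest match is always the same tone slightly higher, which is consistent with the perpetual ascent described above. A shift of three quarters of an octave, by contrast, is closer to a quarter-octave fall, and a half-octave shift is equidistant in both directions, so the toy rule gives it no preferred direction.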

We come back to streaming in movement 8, where the rules the brain uses for grouping spectral components into streams are the central topic. The Shepard tone is one of streaming’s purest demonstrations.