7.1 The hearing-aid DSP pipeline

Physically, a 2026 hearing aid is a plastic shell of about 2 cubic centimetres containing two microphones, an analog front end, a digital signal processor, a Bluetooth radio, a battery, and a small loudspeaker (the “receiver” in hearing-aid jargon, distinct from any radio receiver). Algorithmically, it is a 16-stage real-time DSP pipeline that takes microphone samples in at one end and delivers output samples to the receiver at the other, with about 5–8 ms of total latency.

The block diagram of that pipeline, schematically:

  Microphone (×2)
        ↓
  Analog amp + AGC
        ↓
  ADC (16–24 bit, 24 kHz)
        ↓
  Wind-noise reduction
        ↓
  Adaptive feedback canceller
        ↓
  Analysis filterbank (12–24 channels)
        ↓
  Directional beamformer (mic 1, mic 2)
        ↓
  Single-channel noise reduction
        ↓
  Multichannel WDRC gain / compression
        ↓
  Frequency lowering
        ↓
  MPO limiter
        ↓
  Synthesis filterbank
        ↓
  DAC
        ↓
  Class-D output amp
        ↓
  Receiver (loudspeaker)
        ↓
  Tympanic membrane

Every block is real (well, modulo vendor-specific reordering and merging), every block has a clinical purpose, and every block has computational and acoustic tradeoffs the audiologist needs to understand to fit the device intelligently.

Why a filterbank

The cochlea performs a continuous frequency analysis: a single broadband acoustic signal at the ear is decomposed into roughly 30 overlapping cochlear filter bands by the basilar membrane (see Hearing 4.4: The place map refresher). The hearing-aid filterbank is the digital analogue: it splits the broadband input into 12 to 24 frequency bands so that gain, compression, and noise-reduction parameters can be applied independently in each band. This matters because:

Hearing loss is frequency-dependent. A typical audiogram has a steep high-frequency drop and near-normal low frequencies. Applying flat gain would amplify the low-frequency speech components (already audible) just as much as the high-frequency ones (the inaudible ones we need to recover), making the result loud and muddy. Per-band gain shaping prescribes the right gain at each band.
Compression parameters need to differ across bands. Speech energy is not evenly distributed across frequency: low-frequency vowels are 20–30 dB more intense than high-frequency consonants, and the patient’s residual dynamic range also differs from band to band. Multichannel WDRC (Lesson 7.2) applies a tailored input/output curve in each band.
Noise reduction works channel-by-channel. Statistical noise reduction looks for steady-state vs modulated signal patterns in each band independently. A band dominated by steady noise gets attenuated; a band rich in speech modulation is left alone.
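The first point, flat gain versus per-band gain shaping, can be illustrated with a small sketch. Everything specific here is an assumption for illustration: the audiogram values are invented, and the half-gain rule (gain equal to half the threshold in each band) is a classic rule of thumb standing in for a real prescription formula, which this lesson does not specify.

```python
# Illustrative audiogram: hearing thresholds (dB HL) at band centre frequencies.
# These numbers are invented for the example (a sloping high-frequency loss).
AUDIOGRAM_DB_HL = {250: 15, 500: 20, 1000: 30, 2000: 50, 4000: 65, 8000: 75}


def half_gain_rule(thresholds_db_hl):
    """Per-band gain in dB: half the hearing threshold in that band.

    A rule-of-thumb stand-in for a real prescription, used only to
    show the shape of a frequency-dependent gain curve.
    """
    return {freq: 0.5 * loss for freq, loss in thresholds_db_hl.items()}


def flat_gain(thresholds_db_hl):
    """One gain in dB for every band: the mean of the per-band gains."""
    per_band = half_gain_rule(thresholds_db_hl)
    mean_gain = sum(per_band.values()) / len(per_band)
    return {freq: mean_gain for freq in thresholds_db_hl}


shaped = half_gain_rule(AUDIOGRAM_DB_HL)
flat = flat_gain(AUDIOGRAM_DB_HL)

for freq in sorted(AUDIOGRAM_DB_HL):
    diff = flat[freq] - shaped[freq]
    print(f"{freq:>5} Hz  shaped {shaped[freq]:5.1f} dB  "
          f"flat {flat[freq]:5.1f} dB  flat minus shaped {diff:+6.1f} dB")
```

With this example audiogram, the flat setting gives the low bands more gain than the shaped curve asks for and the high bands less, which is the "loud and muddy" outcome described above.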

The choice of 12, 16, 20, or 24 bands is a manufacturer-level decision; clinical evidence does not strongly favour any specific count above ~8 bands. Computational cost rises linearly with band count; latency rises with the analysis-window length needed to resolve narrow bands.

Latency budget

The total signal latency through the pipeline — from the moment a wavefront strikes the microphone to the moment the corresponding acoustic output emerges from the receiver — is the dominant time-domain concern. Three reasons:

Comb-filter distortion at low latencies. For an open-fit hearing aid, the user simultaneously hears the direct sound (through the open vent) and the processed sound (through the aid). If the processing delay is comparable to a wave period, the two paths interfere: their sum has spectral notches spaced at the reciprocal of the delay (every 200 Hz for a 5 ms delay), which gives a characteristic “hollow” or “tinny” timbre. Audiologists report that latencies above 10 ms are noticeably distracting for music; above 20 ms they degrade speech intelligibility; above 40 ms they cause complaints of an echo.
Lip-sync distortion at high latencies. Above about 40 ms the audio lags visible lip movement by enough to disrupt the McGurk-type integration of visual and auditory speech cues — patients with significant lip-reading dependence (most patients with severe loss) complain of “watching a badly-dubbed film.”
Cochlear-implant integration. Patients with one hearing aid and one cochlear implant (bimodal fitting) need the two devices’ latencies to match within a few milliseconds for interaural cues to fuse.
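The comb-filter effect in the first point follows directly from summing a signal with a delayed copy of itself. The sketch below is a simplified model, assuming a single direct path and a single processed path of the same polarity with a frequency-independent gain; the function names and the `aid_gain_db` parameter are hypothetical, chosen for this example.

```python
import cmath
import math


def comb_notch_frequencies(delay_ms, max_hz):
    """Notch frequencies (Hz) up to max_hz when a direct path is summed
    with an equally loud, same-polarity copy delayed by delay_ms.

    Cancellation occurs where the delay equals an odd number of half
    periods: f = (2k + 1) / (2 * delay).
    """
    delay_s = delay_ms / 1000.0
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2 * delay_s)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches


def combined_level_db(freq_hz, delay_ms, aid_gain_db=0.0):
    """Level of direct + delayed path, in dB relative to the direct path alone."""
    g = 10 ** (aid_gain_db / 20)
    phase = -2 * math.pi * freq_hz * delay_ms / 1000.0
    total = 1 + g * cmath.exp(1j * phase)
    # Clamp to avoid log10(0) at an exact notch.
    return 20 * math.log10(max(abs(total), 1e-12))


notches = comb_notch_frequencies(delay_ms=5.0, max_hz=1000.0)
print([round(f) for f in notches])

# Ripple between a notch (100 Hz) and a peak (200 Hz) for two aid gains.
for gain_db in (0.0, 20.0):
    dip = combined_level_db(100.0, 5.0, gain_db)
    peak = combined_level_db(200.0, 5.0, gain_db)
    print(f"aid gain {gain_db:4.1f} dB: peak-to-dip ripple {peak - dip:.1f} dB")
```

In this model the notches are deepest when the two paths are equally loud and become shallow once the processed path dominates, so the effect is most audible in the bands where the aid applies little gain and the vent passes sound freely.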

Modern hearing aids typically achieve 5–8 ms total latency. The filterbank is the dominant contributor (analysis-window + synthesis-window length); the remaining stages add about 1 ms total. Lowering the filterbank latency below ~3 ms requires going to a low-band-count or warped-frequency design that gives less frequency selectivity.
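A back-of-envelope version of this budget can be sketched as follows. It rests on two simplifying assumptions that are not from the lesson: that resolving a band of width B needs a window of roughly 1/B seconds (a time-bandwidth product of about 1), and that the filterbank contributes about one such window of delay. The 1 ms allowance for the other stages and the 24 kHz sample rate are taken from the text; the function names are hypothetical.

```python
def filterbank_window_ms(narrowest_band_hz):
    """Approximate window duration (ms) needed to resolve the narrowest band,
    assuming a time-bandwidth product of about 1."""
    return 1000.0 / narrowest_band_hz


def window_samples(narrowest_band_hz, sample_rate_hz=24_000):
    """The same window expressed in samples at the given sample rate."""
    return round(filterbank_window_ms(narrowest_band_hz) * sample_rate_hz / 1000.0)


def estimated_latency_ms(narrowest_band_hz, other_stages_ms=1.0):
    """Crude total latency: one filterbank window plus all remaining stages."""
    return filterbank_window_ms(narrowest_band_hz) + other_stages_ms


for band_hz in (125, 250, 500):
    print(f"narrowest band {band_hz:>3} Hz: "
          f"window {window_samples(band_hz):>3} samples, "
          f"estimated latency {estimated_latency_ms(band_hz):.1f} ms")
```

Under these assumptions a 250 Hz narrowest band gives 4 ms + 1 ms = 5 ms, and reaching 3 ms means widening the narrowest band to about 500 Hz, which is the loss of frequency selectivity the paragraph above describes. Real filterbank designs differ in detail, so treat the numbers as orders of magnitude only.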

Form factors

The DSP pipeline is largely independent of physical form factor, but the form factor determines microphone placement, receiver placement, battery capacity, and venting — all of which interact with the DSP choices:

| Form | Where it sits | Best for |
|---|---|---|
| BTE (behind-the-ear) | Body behind pinna; receiver in body; sound piped through tubing | Severe-profound loss; pediatric (no in-ear hardware) |
| RIC / RITE (receiver-in-canal) | Body behind pinna; receiver in canal connected by thin wire | Mild-moderate loss; the modern dominant form |
| ITE (in-the-ear) | Whole device in concha bowl | Cosmetic alternative to BTE; medium-severe loss |
| ITC / CIC / IIC (in-the-canal, completely-in-canal, invisible-in-canal) | Whole device down the canal | Cosmetic; mild-moderate loss; limited battery |
| CROS / BiCROS (contralateral routing) | Transmitter in non-hearing ear sends to receiver in hearing ear | Unilateral profound loss |

The RIC form has dominated the market since about 2015. The thin wire to the canal-placed receiver keeps the microphones, which sit in the body behind the pinna, well away from the receiver, which reduces feedback at high gain. The open-fit eartip preserves natural low-frequency hearing and avoids occlusion (the patient’s own voice booming in their head when the ear canal is sealed). RIC devices typically apply 30–45 dB of gain at high frequencies in this open-fit configuration, a feat made possible only by the feedback-cancellation algorithm of Lesson 7.4.

Connectivity

Bluetooth Low Energy Audio (LE Audio, announced in 2020 and appearing in mainstream hearing aids from about 2022) has turned the hearing aid into a wireless headset. Phone calls, audio streaming, TV connectivity, and (most recently) Auracast broadcast audio in public venues all flow into the aid directly, bypassing the microphone. The streaming-mode signal flow is:

Bluetooth packet → buffer → decoder → equaliser → multichannel WDRC (same as acoustic mode) → MPO limiter → DAC → receiver

The compression chain is shared with acoustic mode, so a streamed phone call gets the same prescription-conforming gain shaping the patient receives from acoustic input. The streaming-mode latency is higher (typically 80–120 ms one-way for LE Audio) because of buffer requirements; this is acceptable for media but borderline for live phone calls, where delay starts to disrupt conversational turn-taking.
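The relationship between the two modes can be written down as two ordered stage lists. This is only a bookkeeping sketch: the stage names are copied from the block diagram and the streaming signal flow in this lesson, and the helper names are hypothetical. It does no audio processing.

```python
# Stage names copied from the block diagram (acoustic mode) and the
# streaming signal flow in this lesson; nothing here processes audio.
ACOUSTIC_MODE = [
    "microphones", "analog amp + AGC", "ADC", "wind-noise reduction",
    "adaptive feedback canceller", "analysis filterbank",
    "directional beamformer", "single-channel noise reduction",
    "multichannel WDRC", "frequency lowering", "MPO limiter",
    "synthesis filterbank", "DAC", "class-D output amp", "receiver",
]

STREAMING_MODE = [
    "Bluetooth packet", "buffer", "decoder", "equaliser",
    "multichannel WDRC", "MPO limiter", "DAC", "receiver",
]


def shared_stages(mode_a, mode_b):
    """Stages of mode_a that also appear in mode_b, in mode_a order."""
    return [stage for stage in mode_a if stage in mode_b]


def stages_only_in(mode_a, mode_b):
    """Stages of mode_a that mode_b does not list."""
    return [stage for stage in mode_a if stage not in mode_b]


shared = shared_stages(STREAMING_MODE, ACOUSTIC_MODE)
streaming_only = stages_only_in(STREAMING_MODE, ACOUSTIC_MODE)

print("shared with acoustic mode:", shared)
print("streaming-only front end: ", streaming_only)
```

The shared stages start at the multichannel WDRC block, which is why a streamed signal receives the same prescription as an acoustic one, while the streaming-only front end (buffer and decoder in particular) is where the extra latency comes from.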

⏳ The history — From vacuum tube to RIC

The first electronic hearing aid was the carbon-microphone “Acousticon” of 1902, marketed by Miller Reese Hutchison. It was the size of a small radio and the user wore the microphone at chest level. The carbon microphone’s signal-to-noise ratio was poor and the device produced only mild amplification.

Vacuum tubes (1920s) allowed dramatically more gain but required large batteries; a “wearable” hearing aid of the 1940s was a 1-pound body-worn device with a wire to an earphone. The first transistor hearing aid (1953, Sonotone 1010) reduced the body to wristwatch size; the body-worn era persisted into the 1970s.

The first behind-the-ear hearing aid (Otarion Listener Model L8, 1956) put the entire device behind the pinna. The 1980s saw mass-market analog ITE devices. The first commercial digital hearing aid was the Nicolet “Phoenix” of 1987 — a single-channel device that arrived in offices on a wheeled trolley because the DSP chip wouldn’t fit in the device itself, but the principle (digitise the signal, compute, output) was settled.

The shift to fully on-device digital processing came with the Widex Senso (1995), the first commercial fully-DSP hearing aid that fit entirely behind the ear. Multichannel compression (Oticon DigiFocus, 1996), adaptive feedback cancellation (Phonak Claro, 1999), and adaptive directional microphones (Oticon Adapto, 2001) followed in rapid succession.

The RIC form arrived in 2005–2008 (Phonak Audéo, Resound Live) and quickly became the dominant form. Bluetooth audio streaming through MFi (Made for iPhone, 2014) and LE Audio (2022) closed the connectivity gap. By 2026, most “premium” hearing aids contain 12–24 compression channels, two omnidirectional microphones, adaptive feedback cancellation, statistical noise reduction, Bluetooth LE Audio with Auracast, rechargeable lithium batteries with a few days of life, and DNN-based environmental classification or DNN-based noise reduction (Widex Moment Sheer, Starkey Genesis AI, Phonak Sphere). The pace of change has not slowed.

Next lesson: the algorithm that fits 80 dB of acoustic dynamic range into a patient with perhaps 30 dB of residual dynamic range — wide-dynamic-range compression.