3.1 Speech reception threshold and word recognition

The two foundational measurements of speech audiometry are the speech reception threshold (SRT) and the word recognition score (WRS). Together they characterise the listener’s two complementary speech abilities: at what level speech becomes intelligible, and how clearly the listener can identify words once they are well above that level.

The SRT

The speech reception threshold is the lowest level (in dB HL) at which the listener correctly identifies 50% of presented two-syllable words (“spondees” — words with equal stress on each syllable like baseball, hotdog, cowboy). The CID W-1 word list of 36 spondees has been the US standard since the 1950s; equivalent lists exist in most clinical languages.

The procedure mirrors pure-tone audiometry:

  1. Present a spondee at a level well above expected threshold (typically pure-tone average + 30 dB).
  2. Lower the level in 10-dB steps until the patient fails to repeat.
  3. Increase in 5-dB steps until reliable repetition resumes.
  4. The SRT is the lowest level at which the patient correctly repeats at least 50% of presentations.
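The four steps above can be sketched as a small search routine. This is a minimal illustration, not a clinical protocol: the function name `estimate_srt`, the callback `repeats_correctly` (a stand-in for the patient's response to one spondee at a given level), the four trials per level, and the floor and ceiling limits are all assumptions introduced here.

```python
from typing import Callable


def estimate_srt(
    repeats_correctly: Callable[[float], bool],
    pure_tone_average: float,
    trials_per_level: int = 4,      # assumed number of spondees per ascending level
    floor_db: float = -10.0,        # assumed lower limit of the audiometer
    ceiling_db: float = 110.0,      # assumed upper limit of the audiometer
) -> float:
    """Return an SRT estimate in dB HL using a descend-10 / ascend-5 search."""
    # Step 1: start well above the expected threshold (PTA + 30 dB).
    level = min(pure_tone_average + 30.0, ceiling_db)

    # Step 2: descend in 10-dB steps until the patient fails to repeat a spondee.
    while level > floor_db and repeats_correctly(level):
        level -= 10.0

    # Steps 3 and 4: ascend in 5-dB steps and stop at the first level where
    # at least 50% of the presented spondees are repeated correctly.
    while level <= ceiling_db:
        correct = sum(repeats_correctly(level) for _ in range(trials_per_level))
        if 2 * correct >= trials_per_level:
            return level
        level += 5.0

    raise ValueError("no level up to the ceiling gave 50% correct repetition")
```

With an idealised listener who repeats every spondee at or above 37 dB HL and none below, `estimate_srt(lambda level: level >= 37, pure_tone_average=30)` settles on 40 dB HL, the first 5-dB step above the true threshold. A real patient responds probabilistically near threshold, which is why the ascending phase scores several words per level.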

The SRT should agree with the pure-tone average (PTA, the mean of the thresholds at 500, 1000, and 2000 Hz) within about ±6 dB. A discrepancy of more than 10 dB should be explained before either result is accepted.
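The SRT-to-PTA cross-check is simple arithmetic, sketched below under stated assumptions. The function names and the label "borderline" for the zone between 6 and 10 dB are hypothetical; the text gives only the ±6 dB and 10 dB figures.

```python
from typing import Mapping, Tuple


def pure_tone_average(thresholds_db_hl: Mapping[int, float]) -> float:
    """Mean of the pure-tone thresholds at 500, 1000, and 2000 Hz (dB HL)."""
    return sum(thresholds_db_hl[freq] for freq in (500, 1000, 2000)) / 3


def srt_pta_agreement(
    srt_db_hl: float, thresholds_db_hl: Mapping[int, float]
) -> Tuple[float, str]:
    """Return (SRT minus PTA, label) using the 6 dB and 10 dB figures."""
    difference = srt_db_hl - pure_tone_average(thresholds_db_hl)
    gap = abs(difference)
    if gap <= 6:
        label = "good agreement"
    elif gap <= 10:
        label = "borderline"   # assumed label for the zone the text leaves unnamed
    else:
        label = "discrepancy"
    return difference, label
```

For thresholds of 30, 35, and 40 dB HL at the three frequencies the PTA is 35 dB HL, so an SRT of 38 dB HL falls in good agreement while an SRT of 50 dB HL is flagged as a discrepancy.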

The WRS

The word recognition score measures, at a level well above threshold, what percentage of mono-syllabic words the patient correctly identifies. The standard test in the US is the CID W-22 list (50 phonetically-balanced mono-syllables like give, toe, bath), each presented in the carrier phrase “You will say…”. The patient repeats; the audiologist scores.

Mono-syllables (as opposed to the redundant spondees used for SRT) are chosen because they offer minimal contextual cues — the patient must rely on the acoustic signal alone, without the linguistic redundancy that would let context fill in missed phonemes. This makes the WRS sensitive to subtle phonemic confusions that the SRT would miss.

The WRS is reported as a percent-correct at a specified presentation level, conventionally 40 dB above the SRT (or at the patient’s “most comfortable level,” MCL). A normal-hearing listener should score 92–100% at 40 dB SL.
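Scoring the WRS reduces to a percentage plus a presentation level. The sketch below assumes one pass/fail entry per word; the function names are hypothetical and the normal range is the 92–100% figure quoted above.

```python
from typing import Sequence


def presentation_level(srt_db_hl: float, sensation_level_db: float = 40.0) -> float:
    """Presentation level in dB HL for a given sensation level above the SRT."""
    return srt_db_hl + sensation_level_db


def word_recognition_score(words_correct: Sequence[bool]) -> float:
    """Percent of presented words repeated correctly."""
    if not words_correct:
        raise ValueError("at least one word must be presented")
    return 100.0 * sum(words_correct) / len(words_correct)


def is_within_normal_range(score_percent: float) -> bool:
    """True if the score falls in the 92-100% range quoted for normal hearing."""
    return 92.0 <= score_percent <= 100.0
```

On a 50-word list each word is worth 2 percentage points, so a patient who misses four words scores 92% and sits at the lower edge of the normal range. With an SRT of 15 dB HL the conventional presentation level would be 55 dB HL.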

The psychometric function

If the WRS is measured at multiple presentation levels, the resulting performance-intensity function shows the standard psychometric shape: a sigmoidal climb from 0% to a maximum, with the 50% point at or near the SRT.
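One common way to model such a sigmoid is a logistic curve. The text says only that the function is sigmoidal, so the logistic form, the function name, and the parameter names below are assumptions chosen for illustration.

```python
import math


def performance_intensity(
    level_db_hl: float,
    midpoint_db_hl: float,
    slope_pct_per_db: float,
    max_score_pct: float = 100.0,
) -> float:
    """Predicted word recognition (%) at a presentation level, logistic model.

    midpoint_db_hl   -- level at which the score reaches half of max_score_pct
    slope_pct_per_db -- steepness of the curve at the midpoint
    max_score_pct    -- ceiling that the curve approaches at high levels
    """
    # For a logistic curve the slope at the midpoint is max * k / 4,
    # so k is chosen to reproduce the requested slope.
    k = 4.0 * slope_pct_per_db / max_score_pct
    return max_score_pct / (1.0 + math.exp(-k * (level_db_hl - midpoint_db_hl)))
```

With a ceiling of 100% the midpoint is the 50% point, so `performance_intensity(55, 55, 1.6)` returns 50.0. Lowering `max_score_pct` below 100 models a listener whose score plateaus short of perfect however high the level is raised.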

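[Figure: performance-intensity function for a selected listener. Horizontal axis: presentation level (dB HL), 0 to 100. Vertical axis: word recognition (%), 0 to 100, with the 50% point marking the SRT. Example readouts: SRT 55 dB HL, slope 8% per 5 dB, maximum performance 88%.]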

Moderate SNHL (sensorineural hearing loss) shifts the curve further toward higher levels and lowers the ceiling: the patient can never reach 100% no matter how loud the speech. The SRT is the level at which the listener correctly repeats 50% of presented words, and the maximum WRS is the asymptotic performance at high presentation levels. The slope of the psychometric function reflects the listener's ability to use additional acoustic information as level increases. Rollover, a decline in WRS at high levels, is the classical audiometric flag for retrocochlear pathology (e.g., vestibular schwannoma) and warrants MRI.
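Rollover can be quantified from a measured performance-intensity function. The sketch below uses the widely cited rollover index, (PBmax − PBmin) / PBmax, where PBmin is the lowest score at any level above the one that gave the maximum. The index is not named in this lesson and is brought in here as an illustration. No cut-off is hard-coded, because published criteria depend on the word list used.

```python
from typing import Mapping


def rollover_index(scores_by_level: Mapping[float, float]) -> float:
    """Rollover index from WRS (%) measured at several levels (dB HL).

    Returns (PBmax - PBmin) / PBmax, where PBmin is the lowest score found
    at levels above the level that produced PBmax. Returns 0.0 when the
    score never drops after reaching its maximum.
    """
    if not scores_by_level:
        raise ValueError("at least one measurement is required")
    pb_max = max(scores_by_level.values())
    if pb_max <= 0:
        raise ValueError("maximum score must be above 0%")

    levels = sorted(scores_by_level)
    # Lowest level at which the maximum score was reached.
    level_of_max = next(lv for lv in levels if scores_by_level[lv] == pb_max)
    scores_above = [scores_by_level[lv] for lv in levels if lv > level_of_max]
    pb_min = min(scores_above, default=pb_max)
    return (pb_max - pb_min) / pb_max
```

A function that climbs to 88% at 70 dB HL and falls to 60% at 90 dB HL gives an index of about 0.32, while one that plateaus at its maximum gives 0.0. Whether a given index is clinically significant is a judgement to make against the criterion published for the specific test materials.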

Three features of the psychometric function deserve attention: the SRT, the slope, and the maximum performance.

Most-comfortable and uncomfortable levels

Two additional speech-derived measurements are the most-comfortable level (MCL) and the uncomfortable level (UCL).

The history — Carhart and the CID lists

The Central Institute for the Deaf (CID) in St. Louis, founded in 1914 by the otologist Max Goldstein, became the primary American research centre for clinical hearing measurement, particularly after Hallowell Davis joined as director of research in 1946. During and after WWII, with thousands of veterans needing aural rehabilitation, CID developed the standardised word lists that became the US clinical baseline.

The CID W-1 spondee list (36 two-syllable words) was derived from the spondee tests that Hudgins, Hawkins, Karlin, and Stevens developed at the Harvard Psycho-Acoustic Laboratory in 1947. It remains in clinical use today, though digital recordings have replaced the original 78-rpm phonograph records. The CID W-22 mono-syllabic word list (200 phonetically balanced words arranged in four 50-word lists) was developed by Hirsh, Davis, Silverman, Reynolds, Eldert, and Benson in 1952 as the open-set word-recognition standard; the same paper introduced the W-1 recordings.

Raymond Carhart, at Northwestern, championed using both — SRT plus a separate WRS at a comfortable level — as the speech-audiometry “fingerprint” of a hearing loss. Carhart’s clinical protocols, codified in his 1971 Modern Developments in Audiology chapter on speech audiometry, are essentially the protocols US audiologists still follow.

The biggest modern evolution is the move from quiet to noise: speech-in-noise testing (HINT, QuickSIN, BKB-SIN, AzBio) developed from the 1990s onward to address the well-documented fact that quiet WRS poorly predicts real-world function. We cover those in 3.3.

What’s next

The next lesson, 3.2 — The speech banana and the audibility map, pivots from behavioural speech testing to a spectral picture of speech: the long-term average speech spectrum, the famous “speech banana” on the audiogram, and the question of which phonemes a given threshold curve renders inaudible.