8.4 Predictive coding as the implementation hypothesis

The framework above is mathematically clean. But how do real neurons implement it?

The dominant implementation hypothesis — sometimes called the canonical microcircuit for predictive coding — was developed by Bastos and colleagues (2012), drawing on prior work. The hypothesis identifies prediction units and prediction-error units with specific cortical layers and cell types: roughly, superficial pyramidal cells in layers 2/3 of cortex are proposed to carry prediction errors upward, while deep pyramidal cells in layers 5/6 carry predictions downward. The thalamocortical loop and the cortico-cortical projections are organized to support this segregation.

The hypothesis is plausible. It is consistent with a wide range of neurophysiological and anatomical data. It is also not yet fully proven. The strongest direct evidence — recordings of distinct prediction and error units in cortex — is suggestive but not unequivocal. The Keller & Mrsic-Flogel (2018) review summarizes the state of the evidence honestly.

This is where I want to be careful. Predictive coding is the best-developed framework we currently have for understanding how the cortex implements perceptual inference. It explains a wide range of phenomena — repetition suppression, mismatch negativity, attention effects, certain illusions — with mathematical elegance and broad scope. It is also a framework, not a settled implementation. The specifics — exactly which cells do what, exactly how predictions are computed and compared with input, exactly how precision is set — remain partially open.

A reader new to the field is likely to encounter several other formulations alongside predictive coding: the Bayesian brain (the broader umbrella claim that perception is approximately Bayesian), the free-energy principle (a more general claim, due to Friston, that the brain minimizes a quantity called variational free energy under all of perception and action), and active inference (an extension that treats action as inference over policies). Each is associated with a literature; each has critics; each makes claims of varying generality.

For the purposes of this essay, the modest claim is the right one: perceptual inference is approximately Bayesian; predictive coding is the most coherent implementational framework we have for how that inference is carried out in cortex; we will use it as the working framework while remaining open about its limits.