4.2 Vectors and matrices

This lesson is the working dictionary: what a vector is, what a matrix is, and the operations we’ll lean on for the rest of the chapter. It is short and almost entirely algebraic, but every concept here is used silently in Foundations 5 — ODEs, Foundations 6 — PDEs, and across Sound and Hearing.

Vectors

A vector in $\mathbb{R}^n$ is an ordered tuple of $n$ real numbers, conventionally written as a column:

\mathbf{v} \;=\; \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}.

In 2-D, $\mathbf{v} = (x, y)$ is the displacement from the origin to the point $(x, y)$. In 3-D, it’s $(x, y, z)$. In higher dimensions the geometric picture stops being literally drawable, but the algebra carries through unchanged.

The two operations that define a vector space:

Addition is componentwise:

\mathbf{v} + \mathbf{w} \;=\; \begin{pmatrix} v_1 + w_1 \\ v_2 + w_2 \\ \vdots \\ v_n + w_n \end{pmatrix}.

Geometrically: place $\mathbf{w}$’s tail at $\mathbf{v}$’s tip; the sum is the arrow from the origin to where $\mathbf{w}$ now ends.

Scalar multiplication scales every component by the same number:

c \, \mathbf{v} \;=\; \begin{pmatrix} c\, v_1 \\ c\, v_2 \\ \vdots \\ c\, v_n \end{pmatrix}.

Geometrically: stretch (or compress) the arrow without rotating it. Negative $c$ flips direction.
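As a minimal sketch, the two operations can be written in a few lines of Python, with vectors stored as plain lists. The function names `add` and `scale` are illustrative choices, not part of the lesson.

```python
def add(v, w):
    """Componentwise sum of two vectors of equal length."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same length")
    return [vi + wi for vi, wi in zip(v, w)]


def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return [c * vi for vi in v]


v = [1, 2, 3]
w = [4, 0, -1]

print(add(v, w))     # [5, 2, 2]
print(scale(-2, v))  # [-2, -4, -6]  (negative c flips the direction)
```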

A set of vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is linearly independent if no one of them can be written as a linear combination of the others. A maximal independent set is a basis for the vector space; the number of vectors in any basis is the dimension. For $\mathbb{R}^n$ the standard basis is $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$, where $\mathbf{e}_k$ has a 1 in slot $k$ and zeros elsewhere.
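One way to see the standard basis at work: every vector is the linear combination of $\mathbf{e}_1, \ldots, \mathbf{e}_n$ whose coefficients are its own components. The sketch below checks this for one vector in $\mathbb{R}^3$; the helper names `standard_basis` and `linear_combination` are assumptions made for illustration.

```python
def standard_basis(n):
    """Return [e_1, ..., e_n]; e_k has a 1 in slot k and zeros elsewhere."""
    return [[1 if i == k else 0 for i in range(n)] for k in range(n)]


def linear_combination(coeffs, vectors):
    """Return the sum over k of coeffs[k] * vectors[k], componentwise."""
    n = len(vectors[0])
    return [sum(c * vec[i] for c, vec in zip(coeffs, vectors)) for i in range(n)]


v = [7, -2, 5]
basis = standard_basis(3)
rebuilt = linear_combination(v, basis)

print(basis)    # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(rebuilt)  # [7, -2, 5]
```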

Matrices

A matrix is a rectangular array of numbers. An $m \times n$ matrix has $m$ rows and $n$ columns; for our purposes the most useful case is the square matrix where $m = n$:

A \;=\; \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}.

The entry $a_{ij}$ sits in row $i$, column $j$. A column can be read as a single vector $(a_{1j}, a_{2j}, \ldots, a_{nj})$, which is useful because, as we saw in 4.1, the columns of a matrix tell you where the basis vectors get sent.

Matrix–vector multiplication

The fundamental operation. Given an $n \times n$ matrix $A$ and a vector $\mathbf{v} \in \mathbb{R}^n$, the product $A \mathbf{v}$ is another vector in $\mathbb{R}^n$, defined component-by-component as

(A \mathbf{v})_i \;=\; \sum_{j=1}^n a_{ij}\, v_j.

In words: the $i$-th component of the output is the dot product of the $i$-th row of $A$ with $\mathbf{v}$.

For a $2 \times 2$ matrix this gives the formula we already saw:

\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \;=\; \begin{pmatrix} a_{11} x + a_{12} y \\ a_{21} x + a_{22} y \end{pmatrix}.

There is a second interpretation, which is sometimes more useful: $A\mathbf{v}$ is a linear combination of the columns of $A$ with coefficients given by the components of $\mathbf{v}$. In the $2 \times 2$ case:

A \mathbf{v} \;=\; v_1 \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} + v_2 \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix} \;=\; v_1 (A \mathbf{e}_1) + v_2 (A \mathbf{e}_2).

Both interpretations are correct; they’re the same arithmetic written differently. The “row-by-row” reading is the standard mechanical recipe; the “column-combination” reading is the conceptual one — it says exactly what we said geometrically in 4.1, that the matrix is determined by where it sends the basis.
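To make the "same arithmetic written differently" point concrete, here is a small sketch that computes $A\mathbf{v}$ both ways and gets the same answer. Matrices are stored as lists of rows; the names `matvec_rows` and `matvec_columns` are illustrative.

```python
def matvec_rows(A, v):
    """Row reading: component i is the dot product of row i of A with v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def matvec_columns(A, v):
    """Column reading: the sum over j of v_j times column j of A."""
    m = len(A)
    result = [0] * m
    for j, v_j in enumerate(v):
        column_j = [A[i][j] for i in range(m)]
        result = [r + v_j * c for r, c in zip(result, column_j)]
    return result


A = [[2, 1],
     [0, 3]]
v = [4, -1]

print(matvec_rows(A, v))     # [7, -3]
print(matvec_columns(A, v))  # [7, -3]
```

The column version never forms a dot product: it only scales and adds columns, which is exactly the "where the basis goes" picture from 4.1.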

Matrix–matrix multiplication

Given two square matrices $A$ and $B$ of the same size, their product $A B$ is the matrix that composes their transformations: applying $A B$ to $\mathbf{v}$ first applies $B$ to $\mathbf{v}$, then applies $A$ to the result. That is, $(AB) \mathbf{v} = A(B\mathbf{v})$.

In components:

(A B)_{ij} \;=\; \sum_{k=1}^n a_{ik}\, b_{kj}.

Mnemonic: “row $i$ of $A$ dot column $j$ of $B$.” This recipe is the most-memorised fact about matrix multiplication; the most-forgotten is what it means: composition of transformations. That is why $A B$ is generally not equal to $B A$: composing rotation-then-shear is different from shear-then-rotation.
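A short sketch of both claims, using a 90-degree rotation `R` and a horizontal shear `S` as example matrices (these particular matrices are chosen here for illustration). It checks that $(RS)\mathbf{v} = R(S\mathbf{v})$ and that $RS \neq SR$.

```python
def matmul(A, B):
    """(AB)_ij = sum over k of a_ik * b_kj, for square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(A, v):
    """Row-by-row matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


R = [[0, -1],
     [1, 0]]   # rotation by 90 degrees counterclockwise
S = [[1, 1],
     [0, 1]]   # horizontal shear

v = [2, 3]
RS = matmul(R, S)  # shear first, then rotate
SR = matmul(S, R)  # rotate first, then shear

print(RS)  # [[0, -1], [1, 1]]
print(SR)  # [[1, -1], [1, 0]]
print(matvec(RS, v) == matvec(R, matvec(S, v)))  # True
```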

Transpose, identity, inverse

Transpose. The transpose $A^T$ swaps rows and columns: $(A^T)_{ij} = a_{ji}$. Geometrically, transposing has no clean interpretation in general; algebraically, it interacts cleanly with the dot product: $(A \mathbf{v}) \cdot \mathbf{w} = \mathbf{v} \cdot (A^T \mathbf{w})$.
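The dot-product identity is easy to spot-check numerically. The sketch below does so for one hand-picked matrix and pair of vectors; it illustrates the identity but is not a proof, and the helper names are assumptions.

```python
def transpose(A):
    """Swap rows and columns: entry (i, j) of the result is entry (j, i) of A."""
    return [list(column) for column in zip(*A)]


def matvec(A, v):
    """Row-by-row matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(vi * wi for vi, wi in zip(v, w))


A = [[1, 2],
     [3, 4]]
v = [1, -1]
w = [2, 5]

lhs = dot(matvec(A, v), w)             # (A v) . w
rhs = dot(v, matvec(transpose(A), w))  # v . (A^T w)

print(transpose(A))  # [[1, 3], [2, 4]]
print(lhs, rhs)      # -7 -7
```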

Identity matrix. The matrix $I$ has $1$s on the diagonal and $0$s elsewhere: $I \mathbf{v} = \mathbf{v}$ for every $\mathbf{v}$. It is the “do nothing” transformation.

Inverse. A square matrix $A$ is invertible if there exists a matrix $A^{-1}$ such that $A A^{-1} = A^{-1} A = I$. Geometrically: $A^{-1}$ undoes the transformation $A$. Algebraically, the following statements are equivalent: $A$ is invertible; $\det A \neq 0$ (no information lost); the columns of $A$ are linearly independent; the equation $A \mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b}$. These equivalences are central; they will reappear in 4.3.

For a $2 \times 2$ matrix the inverse has an explicit formula:

\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} \;=\; \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix},

which fails (denominator zero) exactly when the determinant vanishes. For larger matrices the inverse is computed by row reduction (next lesson) or by other algorithms; explicit formulas exist but are rarely the most efficient route.
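The $2 \times 2$ formula translates directly into code. This sketch uses Python's `fractions.Fraction` for exact arithmetic, so that the check $A A^{-1} = I$ is not blurred by rounding; the function names are illustrative.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Invert [[a, b], [c, d]] using the explicit formula; fails if ad - bc = 0."""
    (a, b), (c, d) = [[Fraction(x) for x in row] for row in M]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible: ad - bc = 0")
    return [[d / det, -b / det],
            [-c / det, a / det]]


def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[4, 7],
     [2, 6]]          # ad - bc = 24 - 14 = 10
A_inv = inverse_2x2(A)
identity = [[1, 0], [0, 1]]

print(matmul(A, A_inv) == identity)  # True
print(matmul(A_inv, A) == identity)  # True
```

Calling `inverse_2x2([[1, 2], [2, 4]])` raises `ValueError`, because the second column is twice the first and so $ad - bc = 0$.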

The determinant

For a $2 \times 2$ matrix, $\det A = a_{11} a_{22} - a_{12} a_{21}$. This is the same number as the $ad - bc$ in the inverse formula, with $a = a_{11}$, $b = a_{12}$, $c = a_{21}$, $d = a_{22}$. Geometrically, $|\det A|$ is the area scaling factor of the transformation: the unit square goes to a parallelogram of area $|\det A|$.

In $n$ dimensions, $\det A$ is the signed volume scaling factor. The sign tells you whether the orientation is preserved (positive) or flipped (negative). When $\det A = 0$, the transformation collapses the $n$-dimensional space onto a lower-dimensional subspace.
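In 2-D the area and orientation claims can be checked directly. The sketch below maps the corners of the unit square through a matrix and measures the signed area of the image with the shoelace formula (a standard polygon-area formula, brought in here only as an independent check). The two example matrices are chosen for illustration.

```python
def det_2x2(A):
    """Determinant a11*a22 - a12*a21 of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def matvec(A, v):
    """Row-by-row matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def signed_area(polygon):
    """Shoelace formula: positive when the vertices run counterclockwise."""
    total = 0
    n = len(polygon)
    for k in range(n):
        x0, y0 = polygon[k]
        x1, y1 = polygon[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2


unit_square = [[0, 0], [1, 0], [1, 1], [0, 1]]  # counterclockwise, area 1

stretch = [[2, 1],
           [0, 3]]   # preserves orientation
flip = [[0, 1],
        [1, 0]]      # reflection across the line y = x

for M in (stretch, flip):
    image = [matvec(M, corner) for corner in unit_square]
    print(det_2x2(M), signed_area(image))
# prints:
# 6 6.0
# -1 -1.0
```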

For larger matrices the determinant is computed by cofactor expansion along any row or column, or, much more efficiently, by row-reducing to upper triangular form and multiplying the diagonal entries, flipping the sign once for each row swap made along the way (Gaussian elimination, again).
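Both routes can be sketched side by side. The code below is an illustrative implementation under simple assumptions (exact `Fraction` arithmetic, first nonzero entry used as the pivot, function names invented here), not a production routine. The elimination version uses only row swaps, each of which flips the sign, and additions of a multiple of one row to another, which leave the determinant unchanged.

```python
from fractions import Fraction


def det_cofactor(A):
    """Cofactor expansion along the first row; cost grows factorially with n."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_cofactor(minor)
    return total


def det_elimination(A):
    """Reduce to upper triangular form, then multiply the diagonal entries."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot: the matrix is singular
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign        # each row swap flips the sign
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]
    return result


A = [[0, 2, 1],
     [1, 1, 0],
     [3, 0, 2]]

print(det_cofactor(A))     # -7
print(det_elimination(A))  # -7
```

The zero in the top-left corner forces a row swap in this example, so dropping the sign bookkeeping would give $+7$ instead of $-7$.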

What we use this for

The operations above are the ground floor for everything else in the chapter:

4.3 uses matrix–vector multiplication to write linear systems and the inverse / determinant to characterise when they have solutions.
4.4 defines eigenvalues using the equation $A \mathbf{v} = \lambda \mathbf{v}$, a matrix–vector product set equal to a scalar multiple of the input.
4.5 uses transposes (and the dot product they relate to) to define angles, projections, and orthogonality.
4.6 puts the pieces together for self-adjoint matrices and operators (the matrices satisfying $A^T = A$), which is the algebraic structure underwriting separation of variables in PDEs.

The next lesson actually solves a linear system: given a matrix $A$ and a vector $\mathbf{b}$, find $\mathbf{x}$ such that $A \mathbf{x} = \mathbf{b}$. That is the everyday task of linear algebra, and the algorithm, Gaussian elimination, is one of the oldest and most useful in mathematics.