4.2 Vectors and matrices

This lesson is the working dictionary: what a vector is, what a matrix is, and the operations we’ll lean on for the rest of the chapter. It is short and almost-entirely algebraic, but every concept here is used silently in Foundations 5 — ODEs, Foundations 6 — PDEs, and across Sound and Hearing.

Vectors

A vector in $\mathbb{R}^n$ is an ordered tuple of $n$ real numbers, conventionally written as a column:

$$\mathbf{v} \;=\; \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}.$$

In 2-D, $\mathbf{v} = (x, y)$ is the displacement from the origin to the point $(x, y)$. In 3-D, it’s $(x, y, z)$. In higher dimensions the geometric picture stops being literally drawable, but the algebra carries through unchanged.

The two operations that define a vector space:

Addition is componentwise:

$$\mathbf{v} + \mathbf{w} \;=\; \begin{pmatrix} v_1 + w_1 \\ v_2 + w_2 \\ \vdots \\ v_n + w_n \end{pmatrix}.$$

Geometrically: place $\mathbf{w}$’s tail at $\mathbf{v}$’s tip; the sum is the arrow from the origin to where $\mathbf{w}$ now ends.

Scalar multiplication scales every component by the same number:

$$c \, \mathbf{v} \;=\; \begin{pmatrix} c\, v_1 \\ c\, v_2 \\ \vdots \\ c\, v_n \end{pmatrix}.$$

Geometrically: stretch (or compress) the arrow without rotating it. Negative $c$ flips direction.
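The two operations can be sketched in a few lines of Python. This is a minimal illustration, assuming vectors are stored as plain lists; the helper names `add` and `scale` are made up for this example and are not from any library.

```python
# Vectors as plain Python lists (an illustrative representation).

def add(v, w):
    """Componentwise sum of two vectors with the same number of components."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of components")
    return [vi + wi for vi, wi in zip(v, w)]

def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return [c * vi for vi in v]

v = [1.0, 2.0]
w = [3.0, -1.0]

print(add(v, w))       # [4.0, 1.0]
print(scale(2.0, v))   # [2.0, 4.0]
print(scale(-1.0, v))  # [-1.0, -2.0], same length as v but opposite direction
```

Nothing here depends on the number of components, which mirrors the remark that the algebra carries through unchanged in higher dimensions.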

A set of vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is linearly independent if no one of them can be written as a linear combination of the others. A maximal independent set is a basis for the vector space; the number of vectors in any basis is the dimension. For $\mathbb{R}^n$ the standard basis is $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$, where $\mathbf{e}_k$ has a 1 in slot $k$ and zeros elsewhere.
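A small worked example may help fix the definition. The specific vectors are chosen here for illustration only. In the plane, the pair

$$\mathbf{v}_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \qquad \mathbf{v}_2 = \begin{pmatrix} 2 \\ 4 \end{pmatrix} = 2\,\mathbf{v}_1$$

is not linearly independent, because the second vector is a multiple of the first. The standard basis vectors are independent: a combination

$$c_1 \mathbf{e}_1 + c_2 \mathbf{e}_2 = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$$

is the zero vector only when $c_1 = c_2 = 0$, so neither vector is a multiple of the other.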

Matrices

A matrix is a rectangular array of numbers. An $m \times n$ matrix has $m$ rows and $n$ columns; for our purposes the most useful case is the square matrix where $m = n$:

$$A \;=\; \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}.$$

The entry $a_{ij}$ sits in row $i$, column $j$. A column can be read as a single vector $(a_{1j}, a_{2j}, \ldots, a_{nj})$, which is useful because, as we saw in 4.1, the columns of a matrix tell you where the basis vectors get sent.

Matrix–vector multiplication

The fundamental operation. Given an $n \times n$ matrix $A$ and a vector $\mathbf{v} \in \mathbb{R}^n$, the product $A \mathbf{v}$ is another vector in $\mathbb{R}^n$, defined component-by-component as

$$(A \mathbf{v})_i \;=\; \sum_{j=1}^n a_{ij}\, v_j.$$

In words: the $i$-th component of the output is the dot product of the $i$-th row of $A$ with $\mathbf{v}$.

For a $2 \times 2$ matrix this gives the formula we already saw:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \;=\; \begin{pmatrix} a_{11} x + a_{12} y \\ a_{21} x + a_{22} y \end{pmatrix}.$$

There is a second interpretation, which is sometimes more useful: $A\mathbf{v}$ is a linear combination of the columns of $A$ with coefficients given by $\mathbf{v}$:

$$A \mathbf{v} \;=\; v_1 \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} + v_2 \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix} \;=\; v_1 (A \mathbf{e}_1) + v_2 (A \mathbf{e}_2).$$

Both interpretations are correct; they’re the same arithmetic written differently. The “row-by-row” reading is the standard mechanical recipe; the “column-combination” reading is the conceptual one — it says exactly what we said geometrically in 4.1, that the matrix is determined by where it sends the basis.
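To see that the two readings really are the same arithmetic, here is a minimal sketch in plain Python. It assumes a matrix is stored as a list of rows; the function names `matvec_rows` and `matvec_columns` are illustrative only.

```python
def matvec_rows(A, v):
    """Row reading: component i is the dot product of row i of A with v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

def matvec_columns(A, v):
    """Column reading: accumulate v_j times column j of A."""
    n_rows = len(A)
    result = [0] * n_rows
    for j, v_j in enumerate(v):
        for i in range(n_rows):
            result[i] += v_j * A[i][j]
    return result

A = [[2, 1],
     [0, 3]]
v = [4, 5]

print(matvec_rows(A, v))     # [13, 15]
print(matvec_columns(A, v))  # [13, 15]
```

The two functions visit the same products in a different order: the first finishes one output component at a time, the second adds one scaled column at a time.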

Matrix–matrix multiplication

Given two square matrices $A$ and $B$ of the same size, their product $AB$ is the matrix that composes their transformations: applying $AB$ to $\mathbf{v}$ first applies $B$ to $\mathbf{v}$, then applies $A$ to the result. That is, $(AB) \mathbf{v} = A(B\mathbf{v})$.

In components:

$$(A B)_{ij} \;=\; \sum_{k=1}^n a_{ik}\, b_{kj}.$$

Mnemonic: “row $i$ of $A$ dot column $j$ of $B$.” This recipe is the most-memorised fact about matrix multiplication; the most-forgotten one is what the recipe means, namely composition of transformations. That is why $AB$ is generally not equal to $BA$: composing rotation-then-shear is different from shear-then-rotation.
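The rotation-and-shear remark can be checked directly. The sketch below is plain Python with matrices stored as lists of rows; the particular matrices (a 90-degree counterclockwise rotation `R` and a horizontal shear `S`) and the helper names are choices made for this example.

```python
def matvec(A, v):
    """Matrix-vector product, row by row."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    """Entry (i, j) of AB is row i of A dotted with column j of B."""
    inner = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]

R = [[0, -1],
     [1,  0]]   # rotation by 90 degrees counterclockwise
S = [[1, 1],
     [0, 1]]    # horizontal shear

print(matmul(R, S))  # shear first, then rotate: [[0, -1], [1, 1]]
print(matmul(S, R))  # rotate first, then shear: [[1, -1], [1, 0]]

v = [2, 3]
print(matvec(matmul(R, S), v))   # [-3, 5]
print(matvec(R, matvec(S, v)))   # [-3, 5], the same vector
```

The last two lines illustrate the composition rule: multiplying the matrices first and then applying the product gives the same result as applying the right-hand matrix and then the left-hand one.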

Transpose, identity, inverse

Transpose. The transpose $A^T$ swaps rows and columns: $(A^T)_{ij} = A_{ji}$. Geometrically, transposing has no clean interpretation in general; algebraically, it interacts cleanly with the dot product: $(A \mathbf{v}) \cdot \mathbf{w} = \mathbf{v} \cdot (A^T \mathbf{w})$.

Identity matrix. The matrix $I$ has $1$s on the diagonal and $0$s elsewhere: $I \mathbf{v} = \mathbf{v}$ for every $\mathbf{v}$. It is the “do nothing” transformation.

Inverse. A square matrix $A$ is invertible if there exists a matrix $A^{-1}$ such that $A A^{-1} = A^{-1} A = I$. Geometrically: $A^{-1}$ undoes the transformation $A$. Algebraically: $A$ is invertible if and only if $\det A \neq 0$ (no information lost) if and only if the columns of $A$ are linearly independent if and only if the equation $A \mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b}$. These “if and only ifs” are central; they will reappear in 4.3.

For a $2 \times 2$ matrix the inverse has an explicit formula:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} \;=\; \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix},$$

which fails (denominator zero) exactly when the determinant vanishes. For larger matrices the inverse is computed by row reduction (next lesson) or by other algorithms; explicit formulas exist but are rarely the most efficient route.
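The definitions in this section can be spot-checked numerically. The following is a sketch in plain Python, using `fractions.Fraction` so the arithmetic stays exact; the helper names and the example matrix are assumptions made for illustration.

```python
from fractions import Fraction

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

def matvec(A, v):
    return [dot(row, v) for row in A]

def matmul(A, B):
    return [[dot(row, col) for col in zip(*B)] for row in A]

def inverse_2x2(A):
    """Explicit 2x2 inverse; raises if the determinant ad - bc is zero."""
    a, b = Fraction(A[0][0]), Fraction(A[0][1])
    c, d = Fraction(A[1][0]), Fraction(A[1][1])
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is zero)")
    return [[d / det, -b / det],
            [-c / det, a / det]]

A = [[4, 7],
     [2, 6]]
I = [[1, 0],
     [0, 1]]
v, w = [1, 2], [3, -1]

# Transpose and the dot product: both sides evaluate to 40.
print(dot(matvec(A, v), w), dot(v, matvec(transpose(A), w)))

# The inverse undoes A from either side.
A_inv = inverse_2x2(A)
print(matmul(A, A_inv) == I, matmul(A_inv, A) == I)  # True True
```

Calling `inverse_2x2([[1, 2], [2, 4]])` raises `ValueError`, since that matrix has determinant zero and its columns are multiples of each other.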

The determinant

For a $2 \times 2$ matrix, $\det A = a_{11} a_{22} - a_{12} a_{21}$, the same number that appears in the inverse formula. Geometrically, $|\det A|$ is the area scaling factor of the transformation: the unit square goes to a parallelogram of area $|\det A|$.

In $n$ dimensions, $\det A$ is the signed volume scaling factor. The sign tells you whether the orientation is preserved (positive) or flipped (negative). When $\det A = 0$, the transformation collapses the $n$-dimensional space onto a lower-dimensional subspace.

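For larger matrices the determinant is computed by cofactor expansion along any row or column, or, much more efficiently, by row-reducing to upper triangular form and multiplying the diagonal entries (Gaussian elimination, again). The shortcut holds as stated when the reduction only adds multiples of one row to another; each row swap along the way flips the sign of the result.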
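Both routes can be sketched and compared in plain Python. This is an illustrative implementation under stated assumptions, not a production routine: it uses exact `Fraction` arithmetic, the function names are made up for the example, and the elimination uses only row swaps (each of which flips the sign) and additions of a multiple of one row to another (which leave the determinant unchanged).

```python
from fractions import Fraction

def det_by_elimination(A):
    """Determinant via row reduction to upper triangular form."""
    M = [[Fraction(x) for x in row] for row in A]  # exact working copy
    n = len(M)
    sign = 1
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry here.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: determinant is zero
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign  # a row swap flips the sign of the determinant
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    product = Fraction(sign)
    for i in range(n):
        product *= M[i][i]
    return product

def det_by_cofactors(A):
    """Determinant via cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_by_cofactors(minor)
    return total

A = [[0, 2, 1],
     [1, 1, 0],
     [3, 0, 2]]

print(det_by_elimination(A))  # -7
print(det_by_cofactors(A))    # -7
```

The cofactor version calls itself on n smaller matrices at every level, so its cost grows factorially with n, while elimination needs on the order of n cubed arithmetic operations. That gap is the reason for the efficiency remark above.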

What we use this for

The operations above are the ground floor for everything else in the chapter:

The next lesson actually solves a linear system: given a matrix $A$ and a vector $\mathbf{b}$, find $\mathbf{x}$ such that $A \mathbf{x} = \mathbf{b}$. That is the everyday task of linear algebra, and the algorithm, Gaussian elimination, is one of the oldest and most useful in mathematics.