# Matrix representation of linear operators; matrix multiplication

References: edX online course MIT 8.05.1x Week 3.

Sheldon Axler (2015), Linear Algebra Done Right, 3rd edition, Springer. Chapter 3.

A linear operator ${T}$ can be represented as a matrix with elements ${T_{ij}}$, but in order to do this, we need to specify which basis we’re using for the vector space ${V}$. Suppose we have a set of basis vectors ${\left\{ v\right\} =\left(v_{1},v_{2},\ldots,v_{n}\right)}$ and we know the result of operating on each basis vector with ${T}$. We can express the result of ${Tv_{j}}$ as another vector ${v_{j}^{\prime}}$ which can be written in terms of the original basis vectors as

$\displaystyle v_{j}^{\prime}=\sum_{i=1}^{n}T_{ij}v_{i} \ \ \ \ \ (1)$

This defines the matrix elements ${T_{ij}}$ in the basis ${\left\{ v\right\} }$. [In Zwiebach’s notes, he usually uses ${v_{i}}$ to represent the basis vectors, while in his lectures he tends to use ${e_{i}}$. I’ll stick to ${v_{i}}$ to be consistent with the notes.]

Equation 1 may not look quite right, since we are summing over the rows of the matrix ${T_{ij}}$ multiplied by the vectors ${v_{i}}$. Usually in matrix multiplication, we sum over the columns of the matrix on the left and the rows of the matrix (or vector) on the right. However, 1 isn’t actually a matrix multiplication formula, since each ${v_{i}}$ is an entire basis vector, and not a component from one vector.

To see that this formula does make sense, and does coincide with the usual definition of matrix multiplication, suppose we have an orthonormal basis where each vector ${v_{i}}$ is represented as a column vector with all entries equal to zero except for the ${i}$th element, which is 1. In that case, the result of operating on one particular basis vector ${v_{k}}$ with ${T}$ is

$\displaystyle Tv_{k} = \left[\begin{array}{ccc} \ldots & T_{1k} & \ldots\\ \vdots & T_{2k} & \vdots\\ \vdots & \vdots & \vdots\\ \ldots & T_{nk} & \ldots \end{array}\right]\left[\begin{array}{c} 0\\ \vdots\\ 1\\ \vdots\\ 0 \end{array}\right] \ \ \ \ \ (2)$

$\displaystyle = \left[\begin{array}{c} T_{1k}\\ T_{2k}\\ \vdots\\ T_{nk} \end{array}\right] \ \ \ \ \ (3)$

$\displaystyle = T_{1k}\left[\begin{array}{c} 1\\ 0\\ \vdots\\ 0 \end{array}\right]+T_{2k}\left[\begin{array}{c} 0\\ 1\\ \vdots\\ 0 \end{array}\right]+\ldots+T_{nk}\left[\begin{array}{c} 0\\ \vdots\\ 0\\ 1 \end{array}\right] \ \ \ \ \ (4)$

$\displaystyle = \sum_{i=1}^{n}T_{ik}v_{i} \ \ \ \ \ (5)$

In the column vector in the first line, all entries are zero except for the ${k}$th entry which is 1. Multiplying a square ${n\times n}$ matrix ${T_{ij}}$ into this column vector using the normal rules for matrix multiplication simply copies the ${k}$th column of ${T_{ij}}$ into a column vector, as shown in the second line.
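This column-picking behaviour is easy to verify numerically. The following is a minimal NumPy sketch (the matrix values are arbitrary, chosen only for illustration): applying ${T}$ to the standard basis vector ${v_{k}}$ reproduces the ${k}$th column of ${T}$, which is the same as the expansion ${\sum_{i}T_{ik}v_{i}}$.

```python
import numpy as np

# A 4x4 matrix standing in for the operator T in the standard basis.
# (Illustrative values; any matrix would do.)
T = np.arange(16.0).reshape(4, 4)

# Standard basis vector v_k: all zeros except a 1 in slot k.
k = 2
v_k = np.zeros(4)
v_k[k] = 1.0

# T v_k copies out the k-th column of T, i.e. the entries T_{ik}.
assert np.array_equal(T @ v_k, T[:, k])

# Equivalently, T v_k = sum_i T_{ik} v_i with v_i the standard basis vectors.
basis = np.eye(4)
expansion = sum(T[i, k] * basis[i] for i in range(4))
assert np.array_equal(T @ v_k, expansion)
```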

Although the matrix entries ${T_{ij}}$ in general depend on the basis, the identity operator ${I}$ is the same in all bases. Since ${Iv_{j}=v_{j}}$ we must have

$\displaystyle Iv_{j}=\sum_{i=1}^{n}I_{ij}v_{i}=v_{j} \ \ \ \ \ (6)$

This can be true for all ${v_{j}}$ only if ${I_{ij}=\delta_{ij}}$.
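We can check this basis independence numerically. Under a change of basis given by an invertible matrix ${P}$ (whose columns are the new basis vectors), an operator's matrix transforms as ${P^{-1}TP}$; for ${T=I}$ the result is unchanged. A small sketch, with an arbitrary invertible ${P}$ chosen for illustration:

```python
import numpy as np

# Matrix of the identity operator in the standard basis.
I = np.eye(3)

# An arbitrary invertible change-of-basis matrix P (illustrative values).
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

# The identity's matrix in the new basis is P^{-1} I P = I.
I_new_basis = np.linalg.inv(P) @ I @ P
assert np.allclose(I_new_basis, np.eye(3))
```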

## Matrix multiplication

If we start with 1 as the definition of the matrix elements of a linear operator ${T}$, we can actually derive the traditional formula for matrix multiplication from it. If we didn’t know the matrix multiplication formula beforehand (that is, the formula where we multiply a row of the left matrix into a column of the right matrix), we might naively assume that in order to multiply two matrices, we just multiply together the corresponding entries in the two matrices. If that were true, then matrix multiplication could be defined only for two matrices that had the same dimensions, as in two ${n\times m}$ matrices, say.

As you probably know, the accepted formula for the product of two matrices is valid if the number of columns in the left matrix equals the number of rows in the right matrix. To see how this formula arises naturally out of the matrix representation of linear operators, we’ll consider two vectors ${a}$ and ${b}$ and look at their components along some basis ${\left\{ v\right\} }$ in a vector space ${V}$. That is, we can expand ${a}$ and ${b}$ as (to save writing, I’ll use the summation convention in which any pair of repeated indices is assumed to be summed):

$\displaystyle a = a_{i}v_{i} \ \ \ \ \ (7)$

$\displaystyle b = b_{i}v_{i} \ \ \ \ \ (8)$

Now suppose there is a linear operator ${T}$ that transforms ${a}$ into ${b}$, so that

$\displaystyle b=Ta \ \ \ \ \ (9)$

If we know the effect of operating on each basis vector ${v_{i}}$ with ${T}$, we can plug 1 into this equation to get

$\displaystyle b = Ta_{i}v_{i} \ \ \ \ \ (10)$

$\displaystyle = a_{i}Tv_{i} \ \ \ \ \ (11)$

$\displaystyle = a_{i}T_{ji}v_{j} \ \ \ \ \ (12)$

$\displaystyle = \left(T_{ji}a_{i}\right)v_{j} \ \ \ \ \ (13)$

In the second line, we used the fact that the ${a_{i}}$ are just numbers (not vectors), so they commute with ${T}$. In the last line, the quantity ${T_{ji}a_{i}}$ is the sum over the columns of ${T}$ and the rows of ${a}$ (we’re writing ${a}$ as a column vector), and so is a traditional product of an ${n\times n}$ matrix into an ${n}$-component column vector. Also, referring back to 8, we see that ${T_{ji}a_{i}}$ is ${b_{j}}$, the component of ${b}$ along the basis vector ${v_{j}}$.
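The component formula ${b_{j}=T_{ji}a_{i}}$ is just the ordinary matrix-vector product, which we can confirm with a small NumPy sketch (matrix and vector values are arbitrary illustrations):

```python
import numpy as np

# An operator T and a vector a, expressed in some basis (illustrative values).
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
a = np.array([1.0, 4.0])

# b_j = T_{ji} a_i: the usual matrix-vector product.
b = T @ a

# The same components written out with the explicit index sum.
b_explicit = np.array([sum(T[j, i] * a[i] for i in range(2))
                       for j in range(2)])
assert np.allclose(b, b_explicit)
```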

We can apply similar logic to the product of two operators, ${T}$ and ${S}$. Suppose the product ${TS}$ operates on a basis vector ${v_{j}}$.

$\displaystyle \left(TS\right)v_{j} = T\left(Sv_{j}\right) \ \ \ \ \ (14)$

$\displaystyle = T\left(S_{pj}v_{p}\right) \ \ \ \ \ (15)$

$\displaystyle = S_{pj}Tv_{p} \ \ \ \ \ (16)$

$\displaystyle = S_{pj}T_{ip}v_{i} \ \ \ \ \ (17)$

$\displaystyle = \left(T_{ip}S_{pj}\right)v_{i} \ \ \ \ \ (18)$

$\displaystyle = \left(TS\right)_{ij}v_{i} \ \ \ \ \ (19)$

In this derivation, we’ve used the fact that the matrix elements ${T_{ij}}$ and ${S_{ij}}$ are just numbers, so they commute with all operators. We also applied 1 in the second and fourth lines. By comparing the last two lines, we see that the matrix element ${\left(TS\right)_{ij}}$ of the product is formed by taking the usual matrix product of ${T}$ and ${S}$, that is, by multiplying rows of ${T}$ into columns of ${S}$:

$\displaystyle \left(TS\right)_{ij}=T_{ip}S_{pj} \ \ \ \ \ (20)$

Thus the traditional matrix product is actually a consequence of a consistent definition of a product of linear operators.
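As a numerical check (with arbitrary illustrative matrices), the index formula ${\left(TS\right)_{ij}=T_{ip}S_{pj}}$ agrees with composing the two operators as maps, i.e. applying ${S}$ and then ${T}$ to a vector:

```python
import numpy as np

# Two operators in the same basis (illustrative values).
T = np.array([[1.0, 2.0],
              [3.0, 4.0]])
S = np.array([[0.0, 1.0],
              [1.0, 1.0]])

# Matrix product built from the index formula (TS)_{ij} = sum_p T_{ip} S_{pj}.
TS_index = np.array([[sum(T[i, p] * S[p, j] for p in range(2))
                      for j in range(2)] for i in range(2)])

# Applying S then T to an arbitrary vector agrees with multiplying by TS.
v = np.array([5.0, -2.0])
assert np.allclose(T @ (S @ v), TS_index @ v)

# And the index formula reproduces NumPy's built-in matrix product.
assert np.allclose(TS_index, T @ S)
```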
