Matrix representation of linear operators: change of basis

References: edX online course MIT 8.05.1x Week 3.

Sheldon Axler (2015), Linear Algebra Done Right, 3rd edition, Springer. Chapter 3.

We’ve seen that the matrix representation of a linear operator depends on the basis we’ve chosen within a vector space {V}. We now look at how the matrix representation changes if we change the basis. In what follows, we’ll consider two sets of basis vectors {\left\{ v\right\} } and {\left\{ u\right\} } and two operators {A} and {B}. Operator {A} transforms the basis {\left\{ v\right\} } into the basis {\left\{ u\right\} }, while {B} does the reverse. That is

\displaystyle Av_{i} = u_{i} \ \ \ \ \ (1)
\displaystyle Bu_{i} = v_{i} \ \ \ \ \ (2)

for all {i=1,\ldots,n}. From this definition, we can see that {A=B^{-1}} and {B=A^{-1}}, since

\displaystyle u_{i} = Av_{i} = ABu_{i} \ \ \ \ \ (3)
\displaystyle v_{i} = Bu_{i} = BAv_{i} \ \ \ \ \ (4)
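As a concrete illustration, here is a minimal numerical sketch (using NumPy, taking {\left\{ v\right\} } to be the standard basis of {\mathbb{R}^{2}} and picking an arbitrary independent pair for {\left\{ u\right\} }; these choices and the variable names are assumptions made purely for this example) showing {A} mapping each {v_{i}} to {u_{i}} and {B=A^{-1}} mapping them back:

    import numpy as np

    # Old basis {v}: the standard basis of R^2 (an illustrative assumption).
    v1, v2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

    # New basis {u}: an arbitrary linearly independent pair, chosen for illustration.
    u1, u2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])

    # With {v} the standard basis, the matrix of A has the u_i as its columns,
    # so that A v_i = u_i.
    A = np.column_stack([u1, u2])
    B = np.linalg.inv(A)  # B = A^{-1}, so B u_i = v_i

    print(np.allclose(A @ v1, u1), np.allclose(A @ v2, u2))  # True True
    print(np.allclose(B @ u1, v1), np.allclose(B @ u2, v2))  # True True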

Theorem 1 An operator (like {A} or {B} above) that transforms one set of basis vectors into another has the same matrix representation in both bases.

Proof: In matrix form, we have (remember we’re using the summation convention on repeated indices):

\displaystyle Av_{i} = A_{ji}\left(\left\{ v\right\} \right)v_{j} \ \ \ \ \ (5)
\displaystyle Au_{i} = A_{ji}\left(\left\{ u\right\} \right)u_{j} \ \ \ \ \ (6)

Note that the matrix elements depend on different bases in the two equations.

We can now operate with {A} again, using equation 1, to get

\displaystyle Au_{i} = A\left(Av_{i}\right) \ \ \ \ \ (7)
\displaystyle = A\left(A_{ji}\left(\left\{ v\right\} \right)v_{j}\right) \ \ \ \ \ (8)
\displaystyle = A_{ji}\left(\left\{ v\right\} \right)Av_{j} \ \ \ \ \ (9)
\displaystyle = A_{ji}\left(\left\{ v\right\} \right)u_{j} \ \ \ \ \ (10)

Comparing the last line with equation 6, we see that

\displaystyle  A_{ji}\left(\left\{ v\right\} \right)=A_{ji}\left(\left\{ u\right\} \right)

Since the matrix elements are just numbers, this means that the elements in the two matrices {A_{ji}\left(\left\{ v\right\} \right)} and {A_{ji}\left(\left\{ u\right\} \right)} are the same.

We could do the same analysis using the {B} operator with the same result:

\displaystyle  B_{ji}\left(\left\{ v\right\} \right)=B_{ji}\left(\left\{ u\right\} \right) \ \ \ \ \ (11)

\Box
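To see the theorem in action, here is a hedged numerical sketch (NumPy assumed; the specific matrix and the choice of {\left\{ v\right\} } as the standard basis are illustrative assumptions). We compute the matrix of {A} directly in each basis, by expanding {Av_{i}} in {\left\{ v\right\} } and {Au_{i}} in {\left\{ u\right\} }, and find that the two matrices agree:

    import numpy as np

    # Matrix of the operator A in the {v} (standard) basis; any invertible
    # matrix would serve here.
    A = np.array([[1.0, 1.0],
                  [1.0, -1.0]])

    V = np.eye(2)   # columns are the basis vectors v_i
    U = A @ V       # columns are the basis vectors u_i = A v_i

    # Matrix of A in a basis {w}: the i-th column holds the coefficients of
    # A w_i expanded in {w}, i.e. the solution of W c = A w_i, which is W^{-1} A W.
    A_in_v = np.linalg.solve(V, A @ V)
    A_in_u = np.linalg.solve(U, A @ U)

    print(np.allclose(A_in_v, A_in_u))  # True: same matrix in both bases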

We can now turn to the matrix representations of a general operator {T} in two different bases. In this case, {T} can perform any linear transformation, so it doesn’t necessarily transform one set of basis vectors into another set of basis vectors. Consider first the case where {T} operates on each set of basis vectors given above:

\displaystyle Tv_{i} = T_{ji}\left(\left\{ v\right\} \right)v_{j} \ \ \ \ \ (12)
\displaystyle Tu_{i} = T_{ji}\left(\left\{ u\right\} \right)u_{j} \ \ \ \ \ (13)

Unless {T} is an operator like {A} or {B} above, in general {T_{ji}\left(\left\{ v\right\} \right)\ne T_{ji}\left(\left\{ u\right\} \right)}. We can see how these two matrices are related by using operators {A} and {B} above to write

\displaystyle Tu_{i} = T\left(A_{ji}v_{j}\right) \ \ \ \ \ (14)
\displaystyle = A_{ji}Tv_{j} \ \ \ \ \ (15)
\displaystyle = A_{ji}T_{kj}\left(\left\{ v\right\} \right)v_{k} \ \ \ \ \ (16)
\displaystyle = A_{ji}T_{kj}\left(\left\{ v\right\} \right)Bu_{k} \ \ \ \ \ (17)
\displaystyle = A_{ji}T_{kj}\left(\left\{ v\right\} \right)A^{-1}u_{k} \ \ \ \ \ (18)
\displaystyle = A_{ji}T_{kj}\left(\left\{ v\right\} \right)A_{pk}^{-1}u_{p} \ \ \ \ \ (19)
\displaystyle = \left[A_{pk}^{-1}T_{kj}\left(\left\{ v\right\} \right)A_{ji}\right]u_{p} \ \ \ \ \ (20)
\displaystyle = T_{pi}\left(\left\{ u\right\} \right)u_{p} \ \ \ \ \ (21)

We don’t need to specify the basis for the {A} or {B} matrix elements since, as the theorem above shows, these matrices are the same in both bases. The last line is just the expansion of {Tu_{i}} in terms of the {\left\{ u\right\} } basis. In the penultimate line, we see that the quantity in square brackets is the product of three matrices:

\displaystyle  A_{pk}^{-1}T_{kj}\left(\left\{ v\right\} \right)A_{ji}=\left[A^{-1}T\left(\left\{ v\right\} \right)A\right]_{pi} \ \ \ \ \ (22)

The required transformation is therefore

\displaystyle  T\left(\left\{ u\right\} \right)=A^{-1}T\left(\left\{ v\right\} \right)A \ \ \ \ \ (23)

where {u_{i}=Av_{i}}.

As a check, note that if {T=A} or {T=B=A^{-1}}, we reclaim the result in the theorem above, namely that {A\left(\left\{ u\right\} \right)=A\left(\left\{ v\right\} \right)} and {B\left(\left\{ u\right\} \right)=B\left(\left\{ v\right\} \right)}.
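Here is a short numerical check of equation 23 (again a sketch; NumPy and the randomly generated matrices are assumptions for illustration, with {\left\{ v\right\} } taken as the standard basis): the matrix of {T} computed directly in the {\left\{ u\right\} } basis agrees with {A^{-1}T\left(\left\{ v\right\} \right)A}:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 3
    T_v = rng.normal(size=(n, n))  # matrix of T in the {v} (standard) basis
    A = rng.normal(size=(n, n))    # change-of-basis operator, u_i = A v_i

    U = A @ np.eye(n)              # columns are the new basis vectors u_i

    # Direct route: expand each T u_i in the {u} basis, i.e. compute U^{-1} T U.
    T_u_direct = np.linalg.solve(U, T_v @ U)

    # Transformation rule of equation 23: T({u}) = A^{-1} T({v}) A.
    T_u_rule = np.linalg.inv(A) @ T_v @ A

    print(np.allclose(T_u_direct, T_u_rule))  # True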

Trace and determinant

The trace of a matrix is the sum of its diagonal elements, written as {\mbox{tr }T}. A useful property of the trace is that

\displaystyle  \mbox{tr }\left(AB\right)=\mbox{tr }\left(BA\right) \ \ \ \ \ (24)

We can prove this by looking at the components. If {C=AB} then

\displaystyle  C_{ij}=A_{ik}B_{kj} \ \ \ \ \ (25)

The trace of {C} is the sum of its diagonal elements, written as {C_{ii}}, so

\displaystyle \mbox{tr }C = \mbox{tr }\left(AB\right) \ \ \ \ \ (26)
\displaystyle = A_{ik}B_{ki} \ \ \ \ \ (27)
\displaystyle = B_{ki}A_{ik} \ \ \ \ \ (28)
\displaystyle = \left[BA\right]_{kk} \ \ \ \ \ (29)
\displaystyle = \mbox{tr }\left(BA\right) \ \ \ \ \ (30)

From this we can generalize to the trace of a product of any number of matrices and obtain the cyclic rule: the trace is unchanged by a cyclic permutation of the factors (though not, in general, by an arbitrary permutation):

\displaystyle  \mbox{tr}\left(A_{1}A_{2}\ldots A_{n}\right)=\mbox{tr}\left(A_{n}A_{1}A_{2}\ldots A_{n-1}\right) \ \ \ \ \ (31)
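A quick numerical sanity check of the cyclic rule for three matrices (a sketch; NumPy and the random matrices are assumptions for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    A1, A2, A3 = (rng.normal(size=(4, 4)) for _ in range(3))

    t1 = np.trace(A1 @ A2 @ A3)
    t2 = np.trace(A3 @ A1 @ A2)  # one cyclic permutation
    t3 = np.trace(A2 @ A3 @ A1)  # another cyclic permutation

    print(np.isclose(t1, t2) and np.isclose(t1, t3))  # True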

Going back to equation 23, we have

\displaystyle \mbox{tr }T\left(\left\{ u\right\} \right) = \mbox{tr}\left(A^{-1}T\left(\left\{ v\right\} \right)A\right) \ \ \ \ \ (32)
\displaystyle = \mbox{tr}\left(AA^{-1}T\left(\left\{ v\right\} \right)\right) \ \ \ \ \ (33)
\displaystyle = \mbox{tr }T\left(\left\{ v\right\} \right) \ \ \ \ \ (34)

Thus the trace of any linear operator is invariant under a change of basis.

For the determinant, we use two standard results: the determinant of a product of matrices equals the product of the determinants, and the determinant of a matrix inverse is the reciprocal of the determinant of the original matrix. Therefore

\displaystyle \mbox{det}\left(T\left(\left\{ u\right\} \right)\right) = \mbox{det}\left(A^{-1}T\left(\left\{ v\right\} \right)A\right) \ \ \ \ \ (35)
\displaystyle = \frac{\mbox{det}A}{\mbox{det}A}\mbox{det}T\left(\left\{ v\right\} \right) \ \ \ \ \ (36)
\displaystyle = \mbox{det}T\left(\left\{ v\right\} \right) \ \ \ \ \ (37)

Thus the determinant is also invariant under a change of basis.
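Both invariances are easy to confirm numerically. The following sketch (NumPy and random matrices assumed, as above) compares the trace and determinant of {T\left(\left\{ v\right\} \right)} and {T\left(\left\{ u\right\} \right)=A^{-1}T\left(\left\{ v\right\} \right)A}:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 4
    T_v = rng.normal(size=(n, n))  # T in the {v} basis
    A = rng.normal(size=(n, n))    # change-of-basis operator

    # Similarity transformation of equation 23.
    T_u = np.linalg.inv(A) @ T_v @ A

    print(np.isclose(np.trace(T_u), np.trace(T_v)))            # True
    print(np.isclose(np.linalg.det(T_u), np.linalg.det(T_v)))  # True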
