# Eigenvalues and eigenvectors

References: edX online course MIT 8.05.1x Week 3.

Sheldon Axler (2015), Linear Algebra Done Right, 3rd edition, Springer. Chapter 5.

While studying quantum mechanics, we have made extensive use of the eigenvalues and eigenvectors (the latter usually called eigenstates in quantum theory) of hermitian operators. An observable quantity in quantum mechanics is always represented by a hermitian operator, and the spectrum of possible values of that observable is the set of eigenvalues of the operator.

It’s useful to re-examine eigenvalues and eigenvectors from a strictly mathematical viewpoint, since this allows us to put precise definitions on many of the terms in common use. As usual, suppose we start with a vector space ${V}$ and an operator ${T}$. Suppose there is a one-dimensional subspace ${U}$ of ${V}$ with the property that ${Tu\in U}$ for every ${u\in U}$; that is, the operator ${T}$ maps any vector in ${U}$ back into another vector in the same subspace ${U}$. Since ${U}$ is one-dimensional, this is equivalent to saying that ${Tu=\lambda u}$ for some scalar ${\lambda}$. In that case, ${U}$ is said to be an invariant subspace under the operator ${T}$.

You can think of this in geometric terms if we have some ${n}$-dimensional vector space ${V}$, and a one-dimensional subspace ${U}$ consisting of all vectors parallel to some straight line within ${V}$. The operator ${T}$ acting on any vector ${u}$ parallel to that line produces another vector which is also parallel to the same line. Of course we can’t push the geometric illustration too far, since in general ${V}$ and ${U}$ can be complex vector spaces, so the result of acting on ${u}$ with ${T}$ might give you some complex number ${\lambda}$ multiplied by ${u}$.

The equation

$\displaystyle Tu=\lambda u \ \ \ \ \ (1)$

is called an eigenvalue equation, and the number ${\lambda\in\mathbb{F}}$ is called the eigenvalue. The vector ${u}$ itself is called the eigenvector corresponding to the eigenvalue ${\lambda}$. Since we can multiply both sides of this equation by any number ${c}$, any non-zero multiple of ${u}$ is also an eigenvector corresponding to ${\lambda}$, so any vector ‘parallel’ to ${u}$ is also an eigenvector. (I’ve put ‘parallel’ in quotes, since we’re allowing for multiplication of ${u}$ by complex as well as real numbers.)
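As a minimal numerical sketch of the eigenvalue equation, we can check ${Tu=\lambda u}$ directly for a small hermitian matrix. The particular matrix here is an assumed example, not one from the text:

```python
import numpy as np

# An assumed example of a hermitian operator T on a 2-dimensional
# complex vector space.
T = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])

# eigh is specialized to hermitian matrices; it returns real eigenvalues
# (in ascending order) and an orthonormal set of eigenvectors as the
# columns of U.
eigenvalues, U = np.linalg.eigh(T)

for lam, u in zip(eigenvalues, U.T):
    # Each eigenvector satisfies T u = lambda u up to floating-point error.
    assert np.allclose(T @ u, lam * u)

print(eigenvalues)
```

As expected for a hermitian operator, the eigenvalues come out real even though the matrix entries are complex.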

It can happen that, for a particular value of ${\lambda}$, there are two or more linearly independent eigenvectors. In that case, the subspace spanned by these eigenvectors (the eigenspace of ${\lambda}$) is two- or higher-dimensional.

Another way of writing 1 is by introducing the identity operator ${I}$:

$\displaystyle \left(T-\lambda I\right)u=0 \ \ \ \ \ (2)$

If this equation has a solution other than ${u=0}$, then the operator ${T-\lambda I}$ has a non-trivial null space, which in turn means that ${T-\lambda I}$ is not injective (not one-to-one) and therefore not invertible. Also, the eigenvectors of ${T}$ with eigenvalue ${\lambda}$ are the non-zero vectors ${u}$ in the null space of ${T-\lambda I}$.
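We can illustrate this numerically: at an eigenvalue ${\lambda}$, the matrix of ${T-\lambda I}$ is singular, and the eigenvectors lie in its null space. The diagonal matrix below is an assumed example chosen so the eigenvalues are obvious:

```python
import numpy as np

# Assumed example: a diagonal T, whose eigenvalues are its diagonal entries.
T = np.array([[2.0, 0.0],
              [0.0, 5.0]])
lam = 5.0  # one of the eigenvalues of T

# T - lambda*I is singular at an eigenvalue: its determinant vanishes,
# so it is not injective and hence not invertible.
M = T - lam * np.eye(2)
assert abs(np.linalg.det(M)) < 1e-12

# The eigenvector (0, 1) for lambda = 5 lies in the null space of M.
u = np.array([0.0, 1.0])
assert np.allclose(M @ u, 0.0)
```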

An important result is

Theorem 1 Suppose ${\lambda_{1},\ldots,\lambda_{m}}$ are distinct eigenvalues of ${T}$ and ${v_{1},\ldots,v_{m}}$ are the corresponding non-zero eigenvectors. Then the set ${v_{1},\ldots,v_{m}}$ is linearly independent.

Proof: Suppose to the contrary that ${v_{1},\ldots,v_{m}}$ is linearly dependent. Then we can let ${k}$ be the smallest positive integer such that ${v_{k}}$ can be written as a linear combination of ${v_{1},\ldots,v_{k-1}}$; by this choice of ${k}$, the set ${v_{1},\ldots,v_{k-1}}$ is linearly independent. That is, there are numbers ${a_{1},\ldots,a_{k-1}\in\mathbb{F}}$ such that

$\displaystyle v_{k}=\sum_{i=1}^{k-1}a_{i}v_{i} \ \ \ \ \ (3)$

If we apply the operator ${T}$ to both sides and use the eigenvalue equation, we have

$\displaystyle \begin{aligned}Tv_{k} & =\lambda_{k}v_{k}\ \ \ \ \ (4)\\ & =\sum_{i=1}^{k-1}a_{i}Tv_{i}\ \ \ \ \ (5)\\ & =\sum_{i=1}^{k-1}a_{i}\lambda_{i}v_{i}\ \ \ \ \ (6)\end{aligned}$

We can multiply both sides of 3 by ${\lambda_{k}}$ and subtract the result from 6 to get

$\displaystyle \begin{aligned}\left(\lambda_{k}-\lambda_{k}\right)v_{k} & =\sum_{i=1}^{k-1}a_{i}\left(\lambda_{i}-\lambda_{k}\right)v_{i}\ \ \ \ \ (7)\\ & =0\ \ \ \ \ (8)\end{aligned}$

Since the set of vectors ${v_{1},\ldots,v_{k-1}}$ is linearly independent, and ${\lambda_{k}\ne\lambda_{i}}$ for ${i=1,\ldots,k-1}$, the only solution of this equation is ${a_{i}=0}$ for ${i=1,\ldots,k-1}$. But this would make ${v_{k}=0}$, contrary to our assumption that ${v_{k}}$ is a non-zero eigenvector of ${T}$. Therefore the set ${v_{1},\ldots,v_{m}}$ is linearly independent. $\Box$
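Theorem 1 can be checked numerically: for a matrix with distinct eigenvalues, the eigenvectors returned by a solver must be linearly independent, i.e. form a full-rank matrix. The matrix below is an assumed example:

```python
import numpy as np

# Assumed example: an upper-triangular matrix with distinct eigenvalues
# 2 and 3 (the diagonal entries of a triangular matrix are its eigenvalues).
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
eigenvalues, V = np.linalg.eig(T)

# The eigenvalues are distinct...
assert len(set(np.round(eigenvalues, 8))) == len(eigenvalues)

# ...so by Theorem 1 the eigenvector matrix V (eigenvectors as columns)
# has linearly independent columns, i.e. full rank.
assert np.linalg.matrix_rank(V) == V.shape[1]
```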

It turns out that there are some operators on real vector spaces that don’t have any eigenvalues. A simple example is the two-dimensional vector space consisting of the ${xy}$ plane. The rotation operator which rotates any vector about the origin, by some angle that is not an integer multiple of ${\pi}$, doesn’t leave any vector parallel (or anti-parallel) to itself and thus has no eigenvalues or eigenvectors. (Rotation by ${\pi}$ sends every vector ${v}$ to ${-v}$, so it does have the eigenvalue ${-1}$.)
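The rotation example is easy to verify: over the complex numbers, a rotation by angle ${\theta}$ has eigenvalues ${e^{\pm i\theta}}$, which are non-real unless ${\theta}$ is a multiple of ${\pi}$. A sketch for ${\theta=\pi/2}$:

```python
import numpy as np

# Rotation of the plane by 90 degrees: no vector stays parallel to itself,
# so there are no real eigenvalues; over C the eigenvalues are e^{+-i theta}.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigenvalues = np.linalg.eigvals(R)

# No eigenvalue is real...
assert not np.any(np.isclose(eigenvalues.imag, 0.0))
# ...and they lie at e^{+-i pi/2} = +-i, on the unit circle.
assert np.allclose(np.abs(eigenvalues), 1.0)
assert np.allclose(sorted(eigenvalues.imag), [-1.0, 1.0])
```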

However, in a complex vector space, things are a bit neater. This leads to the following theorem:

Theorem 2 Every operator on a finite-dimensional, nonzero, complex vector space has at least one eigenvalue.

Proof: Suppose ${V}$ is a complex vector space with dimension ${n>0}$. For any nonzero vector ${v\in V}$ we can write the ${n+1}$ vectors

$\displaystyle v,Tv,T^{2}v,\ldots,T^{n}v \ \ \ \ \ (9)$

Because we have ${n+1}$ vectors in an ${n}$-dimensional vector space, these vectors must be linearly dependent, which means we can find complex numbers ${a_{0},\ldots,a_{n}\in\mathbb{C}}$, not all zero, such that

$\displaystyle 0=a_{0}v+a_{1}Tv+\ldots+a_{n}T^{n}v \ \ \ \ \ (10)$

We can consider a polynomial in ${z}$ with the ${a_{i}}$ as coefficients:

$\displaystyle p\left(z\right)=a_{0}+a_{1}z+\ldots+a_{n}z^{n} \ \ \ \ \ (11)$

The Fundamental Theorem of Algebra states that any polynomial of degree ${m\ge1}$ with complex coefficients can be factored into ${m}$ linear factors. In our case, the actual degree of ${p\left(z\right)}$ is ${m\le n}$, since ${a_{n}}$ (and other leading coefficients) could be zero. Note that ${m\ge1}$: if ${a_{1}=\ldots=a_{n}=0}$, then 10 would reduce to ${a_{0}v=0}$ with ${v\ne0}$, forcing ${a_{0}=0}$ as well, contrary to the requirement that not all the ${a_{i}}$ are zero. So we can factor ${p\left(z\right)}$ as follows:

$\displaystyle p\left(z\right)=c\left(z-\lambda_{1}\right)\ldots\left(z-\lambda_{m}\right) \ \ \ \ \ (12)$

where ${c\ne0}$.

Comparing this to 10, we can write that equation as

$\displaystyle \begin{aligned}0 & =a_{0}v+a_{1}Tv+\ldots+a_{n}T^{n}v\ \ \ \ \ (13)\\ & =\left(a_{0}I+a_{1}T+\ldots+a_{n}T^{n}\right)v\ \ \ \ \ (14)\\ & =c\left(T-\lambda_{1}I\right)\ldots\left(T-\lambda_{m}I\right)v\ \ \ \ \ (15)\end{aligned}$

All the ${T-\lambda_{i}I}$ operators in the last line commute with each other, since ${I}$ commutes with everything and ${T}$ commutes with itself. Since ${v\ne0}$ but the product of these operators sends ${v}$ to zero, at least one factor must fail to be injective: applying the factors one at a time from right to left, there is a first step at which some nonzero vector is mapped to zero by one of the ${T-\lambda_{i}I}$. That is, there is at least one ${\lambda_{i}}$ such that ${T-\lambda_{i}I}$ has a nonzero null space, which means ${\lambda_{i}}$ is an eigenvalue. $\Box$
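The proof of Theorem 2 can be run as a computation: form the dependent vectors ${v,Tv,\ldots,T^{n}v}$, extract the coefficients ${a_{i}}$, and check that a root of the resulting polynomial is an eigenvalue. The matrix and starting vector below are assumed examples:

```python
import numpy as np

# Assumed example with n = 2: T has eigenvalues 1 and 2
# (characteristic polynomial z^2 - 3z + 2).
T = np.array([[0.0, 1.0],
              [-2.0, 3.0]])
v = np.array([1.0, 0.0])

# The n+1 = 3 vectors v, Tv, T^2 v in a 2-dimensional space must be
# linearly dependent; a null vector of this 2x3 matrix gives the
# coefficients (a_0, a_1, a_2) with a_0 v + a_1 Tv + a_2 T^2 v = 0.
cols = np.column_stack([v, T @ v, T @ T @ v])
_, _, Vt = np.linalg.svd(cols)
a = Vt[-1]

# Roots of p(z) = a_0 + a_1 z + a_2 z^2; np.roots expects the highest
# degree coefficient first, hence the reversal.
roots = np.roots(a[::-1])

# At least one root is an eigenvalue: T - lambda*I is singular there.
assert any(abs(np.linalg.det(T - lam * np.eye(2))) < 1e-8 for lam in roots)
print(roots)
```

For this particular choice of ${v}$ the polynomial works out to a multiple of ${(z-1)(z-2)}$, so both roots happen to be eigenvalues; the theorem only guarantees that at least one is.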