# Time-dependent propagators

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 4.3.

The fourth postulate of non-relativistic quantum mechanics concerns how states evolve with time. The postulate simply states that in non-relativistic quantum mechanics, a state satisfies the Schrödinger equation:

$\displaystyle i\hbar\frac{\partial}{\partial t}\left|\psi\right\rangle =H\left|\psi\right\rangle \ \ \ \ \ (1)$

where ${H}$ is the Hamiltonian, which is obtained from the classical Hamiltonian by means of the other postulates of quantum mechanics, namely that we replace all references to the position ${x}$ by the quantum position operator ${X}$ with matrix elements (in the ${x}$ basis) of

$\displaystyle \left\langle x^{\prime}\left|X\right|x\right\rangle =x\delta\left(x-x^{\prime}\right) \ \ \ \ \ (2)$

and all references to classical momentum ${p}$ by the momentum operator ${P}$ with matrix elements

$\displaystyle \left\langle x^{\prime}\left|P\right|x\right\rangle =-i\hbar\delta^{\prime}\left(x-x^{\prime}\right) \ \ \ \ \ (3)$

In our earlier examination of the Schrödinger equation, we assumed that the Hamiltonian is independent of time, which allowed us to obtain an explicit expression for the propagator

$\displaystyle U\left(t\right)=e^{-iHt/\hbar} \ \ \ \ \ (4)$

The propagator is applied to the initial state ${\left|\psi\left(0\right)\right\rangle }$ to obtain the state at any future time ${t}$:

$\displaystyle \left|\psi\left(t\right)\right\rangle =U\left(t\right)\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (5)$
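As a quick numerical illustration (my own sketch, not from Shankar), we can build the propagator 4 for a small Hermitian matrix by exponentiating its eigenvalues and check that it evolves a state while preserving its norm. The particular Hamiltonian and the choice ${\hbar=1}$ are arbitrary assumptions for the sketch.

```python
import numpy as np

hbar = 1.0                                   # assumed units with hbar = 1

# An arbitrary 2x2 Hermitian Hamiltonian, chosen only for illustration
H = np.array([[1.0, 0.5],
              [0.5, -1.0]])

def propagator(H, t):
    """U(t) = exp(-iHt/hbar), built from the eigendecomposition H = V diag(E) V^dagger."""
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * t / hbar)) @ V.conj().T

psi0 = np.array([1.0, 0.0], dtype=complex)   # the initial state |psi(0)>
psi_t = propagator(H, 2.0) @ psi0            # |psi(t)> = U(t)|psi(0)>

# U(t) is unitary, so the norm of the state is preserved
print(np.linalg.norm(psi_t))                 # -> 1.0 up to rounding
```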

What happens if ${H=H\left(t\right)}$, that is, there is an explicit time dependence in the Hamiltonian? The approach taken by Shankar is a bit hand-wavy, but goes as follows. We divide the time interval ${\left[0,t\right]}$ into ${N}$ small increments of width ${\Delta=t/N}$. To first order in ${\Delta}$, we can solve 1 by Taylor expanding ${\left|\psi\left(t\right)\right\rangle }$ about ${t=0}$:

$\displaystyle \left|\psi\left(\Delta\right)\right\rangle =\left|\psi\left(0\right)\right\rangle +\Delta\left.\frac{d}{dt}\left|\psi\left(t\right)\right\rangle \right|_{t=0}+\mathcal{O}\left(\Delta^{2}\right)\ \ \ \ \ (6)$

$\displaystyle =\left|\psi\left(0\right)\right\rangle -\frac{i\Delta}{\hbar}H\left(0\right)\left|\psi\left(0\right)\right\rangle +\mathcal{O}\left(\Delta^{2}\right)\ \ \ \ \ (7)$

$\displaystyle =\left(1-\frac{i\Delta}{\hbar}H\left(0\right)\right)\left|\psi\left(0\right)\right\rangle +\mathcal{O}\left(\Delta^{2}\right) \ \ \ \ \ (8)$

So far, we’ve been fairly precise, but now the hand-waving starts. We note that the term multiplying ${\left|\psi\left(0\right)\right\rangle }$ consists of the first two terms in the expansion of ${e^{-i\Delta H\left(0\right)/\hbar}}$, so we state that to evolve from ${t=0}$ to ${t=\Delta}$, we multiply the initial state ${\left|\psi\left(0\right)\right\rangle }$ by ${e^{-i\Delta H\left(0\right)/\hbar}}$. That is, we propose that

$\displaystyle \left|\psi\left(\Delta\right)\right\rangle =e^{-i\Delta H\left(0\right)/\hbar}\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (9)$

[The reason this is hand-waving is that there are many functions whose first order Taylor expansion matches ${\left(1-\frac{i\Delta}{\hbar}H\left(0\right)\right)}$, so it seems arbitrary to choose the exponential. I imagine the motivation is that in the time-independent case, the result reduces to 4.]

In any case, if we accept this, then we can iterate the process to evolve to later times. To get to ${t=2\Delta}$, we have

$\displaystyle \left|\psi\left(2\Delta\right)\right\rangle =e^{-i\Delta H\left(\Delta\right)/\hbar}\left|\psi\left(\Delta\right)\right\rangle \ \ \ \ \ (10)$

$\displaystyle =e^{-i\Delta H\left(\Delta\right)/\hbar}e^{-i\Delta H\left(0\right)/\hbar}\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (11)$

The snag here is that we can’t, in general, combine the two exponentials into a single exponential by adding the exponents. This is because ${H\left(\Delta\right)}$ and ${H\left(0\right)}$ will not, in general, commute, and the Baker-Campbell-Hausdorff formula tells us that the exponents of non-commuting operators do not simply add. For example, the time dependence of ${H\left(t\right)}$ might be such that at ${t=0}$, ${H\left(0\right)}$ is a function of the position operator ${X}$ only, while at ${t=\Delta}$, ${H\left(\Delta\right)}$ becomes a function of the momentum operator ${P}$ only. Since ${X}$ and ${P}$ don’t commute, ${\left[H\left(0\right),H\left(\Delta\right)\right]\ne0}$, so ${e^{-i\Delta H\left(\Delta\right)/\hbar}e^{-i\Delta H\left(0\right)/\hbar}\ne e^{-i\Delta\left[H\left(0\right)+H\left(\Delta\right)\right]/\hbar}}$.
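We can verify this non-commutativity numerically. The sketch below (my own illustration, with ${\hbar=1}$) uses the Pauli matrices ${\sigma_{x}}$ and ${\sigma_{z}}$ as stand-ins for ${H\left(0\right)}$ and ${H\left(\Delta\right)}$:

```python
import numpy as np

def U_step(H, dt):
    """exp(-i H dt) (hbar = 1) for Hermitian H, via its eigendecomposition."""
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * dt)) @ V.conj().T

sx = np.array([[0, 1], [1, 0]], dtype=complex)   # plays the role of H(0)
sz = np.array([[1, 0], [0, -1]], dtype=complex)  # plays the role of H(Delta)
dt = 0.5

lhs = U_step(sz, dt) @ U_step(sx, dt)            # product of the two small steps
rhs = U_step(sx + sz, dt)                        # single exponential of the sum

print(np.allclose(lhs, rhs))                     # -> False, since [sx, sz] != 0
```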

This means that the best we can usually do is to write

$\displaystyle \left|\psi\left(t\right)\right\rangle =\left|\psi\left(N\Delta\right)\right\rangle \ \ \ \ \ (12)$

$\displaystyle =\prod_{n=0}^{N-1}e^{-i\Delta H\left(n\Delta\right)/\hbar}\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (13)$

The propagator is then obtained in the limit ${N\rightarrow\infty}$, with the factors in the product ordered so that later times stand to the left:

$\displaystyle U\left(t\right)=\lim_{N\rightarrow\infty}\prod_{n=0}^{N-1}e^{-i\Delta H\left(n\Delta\right)/\hbar} \ \ \ \ \ (14)$

This limit is known as a time-ordered integral and is written as

$\displaystyle T\left\{ \exp\left[-\frac{i}{\hbar}\int_{0}^{t}H\left(t^{\prime}\right)dt^{\prime}\right]\right\} \equiv\lim_{N\rightarrow\infty}\prod_{n=0}^{N-1}e^{-i\Delta H\left(n\Delta\right)/\hbar} \ \ \ \ \ (15)$
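As a sketch (not from Shankar), we can approximate the time-ordered product 14 numerically for a hypothetical Hamiltonian ${H\left(t\right)=\cos t\,\sigma_{x}+\sin t\,\sigma_{z}}$ (with ${\hbar=1}$), multiplying each later-time factor on the left, and watch the product settle down as ${N}$ grows:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def H(t):
    # a hypothetical time-dependent Hamiltonian; H(t) at different times
    # do not commute, so the ordering of the factors matters
    return np.cos(t) * sx + np.sin(t) * sz

def U_step(Ht, dt):
    """exp(-i Ht dt) for Hermitian Ht (hbar = 1)."""
    E, V = np.linalg.eigh(Ht)
    return V @ np.diag(np.exp(-1j * E * dt)) @ V.conj().T

def U_ordered(t, N):
    """The product of eq. (14): later times applied on the left."""
    dt = t / N
    U = np.eye(2, dtype=complex)
    for n in range(N):
        U = U_step(H(n * dt), dt) @ U    # each new factor multiplies on the left
    return U

coarse, fine = U_ordered(1.0, 100), U_ordered(1.0, 10000)
print(np.max(np.abs(coarse - fine)))     # small: the ordered product is converging
```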

One final note about the propagators. Since each term in the product is the exponential of ${i}$ times a Hermitian operator, each term is a unitary operator. Further, since the product of two unitary operators is still unitary, the propagator in the time-dependent case is a unitary operator.

We’ve defined a propagator as a unitary operator that carries a state from ${t=0}$ to some later time ${t}$, but we can generalize the notation so that ${U\left(t_{2},t_{1}\right)}$ is a propagator that carries a state from ${t=t_{1}}$ to ${t=t_{2}}$, that is

$\displaystyle \left|\psi\left(t_{2}\right)\right\rangle =U\left(t_{2},t_{1}\right)\left|\psi\left(t_{1}\right)\right\rangle \ \ \ \ \ (16)$

We can chain propagators together to get

$\displaystyle \left|\psi\left(t_{3}\right)\right\rangle =U\left(t_{3},t_{2}\right)\left|\psi\left(t_{2}\right)\right\rangle \ \ \ \ \ (17)$

$\displaystyle =U\left(t_{3},t_{2}\right)U\left(t_{2},t_{1}\right)\left|\psi\left(t_{1}\right)\right\rangle \ \ \ \ \ (18)$

$\displaystyle =U\left(t_{3},t_{1}\right)\left|\psi\left(t_{1}\right)\right\rangle \ \ \ \ \ (19)$

Therefore

$\displaystyle U\left(t_{3},t_{1}\right)=U\left(t_{3},t_{2}\right)U\left(t_{2},t_{1}\right) \ \ \ \ \ (20)$
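For a time-independent Hamiltonian, ${U\left(t_{2},t_{1}\right)=e^{-iH\left(t_{2}-t_{1}\right)/\hbar}}$, and we can check the composition rule 20 numerically. The matrix below and the units ${\hbar=1}$ are assumptions for the sketch:

```python
import numpy as np

# an arbitrary 2x2 Hermitian Hamiltonian, chosen for illustration
H = np.array([[2.0, 1.0], [1.0, 0.0]])

def U(t2, t1):
    """U(t2, t1) = exp(-iH(t2 - t1)) for time-independent Hermitian H (hbar = 1)."""
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * (t2 - t1))) @ V.conj().T

lhs = U(3.0, 1.0)
rhs = U(3.0, 2.0) @ U(2.0, 1.0)
print(np.allclose(lhs, rhs))        # -> True: U(t3,t1) = U(t3,t2) U(t2,t1)
```

The same function also shows that ${U^{\dagger}\left(t_{2},t_{1}\right)=U\left(t_{1},t_{2}\right)}$, since reversing the time arguments flips the sign of the exponent.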

Since the Hermitian conjugate of a unitary operator is its inverse, we have

$\displaystyle U^{\dagger}\left(t_{2},t_{1}\right)=U^{-1}\left(t_{2},t_{1}\right) \ \ \ \ \ (21)$

We can combine this with 20 to get

$\displaystyle \left|\psi\left(t_{1}\right)\right\rangle =I\left|\psi\left(t_{1}\right)\right\rangle \ \ \ \ \ (22)$

$\displaystyle =U^{-1}\left(t_{2},t_{1}\right)U\left(t_{2},t_{1}\right)\left|\psi\left(t_{1}\right)\right\rangle \ \ \ \ \ (23)$

$\displaystyle =U^{\dagger}\left(t_{2},t_{1}\right)U\left(t_{2},t_{1}\right)\left|\psi\left(t_{1}\right)\right\rangle \ \ \ \ \ (24)$

Therefore

$\displaystyle U^{\dagger}\left(t_{2},t_{1}\right)U\left(t_{2},t_{1}\right)=U\left(t_{1},t_{1}\right)=I\ \ \ \ \ (25)$

$\displaystyle U^{\dagger}\left(t_{2},t_{1}\right)=U\left(t_{1},t_{2}\right) \ \ \ \ \ (26)$

That is, the Hermitian conjugate (or inverse) of a propagator carries a state ‘backwards in time’ to its starting point.

# Postulates of quantum mechanics: Schrödinger equation and propagators

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 4.3.

The first three postulates of quantum mechanics concern the properties of a quantum state. The fourth postulate concerns how states evolve with time. The postulate simply states that in non-relativistic quantum mechanics, a state satisfies the Schrödinger equation:

$\displaystyle i\hbar\frac{\partial}{\partial t}\left|\psi\right\rangle =H\left|\psi\right\rangle \ \ \ \ \ (1)$

where ${H}$ is the Hamiltonian, which is obtained from the classical Hamiltonian by means of the other postulates of quantum mechanics, namely that we replace all references to the position ${x}$ by the quantum position operator ${X}$ with matrix elements (in the ${x}$ basis) of

$\displaystyle \left\langle x^{\prime}\left|X\right|x\right\rangle =x\delta\left(x-x^{\prime}\right) \ \ \ \ \ (2)$

and all references to classical momentum ${p}$ by the momentum operator ${P}$ with matrix elements

$\displaystyle \left\langle x^{\prime}\left|P\right|x\right\rangle =-i\hbar\delta^{\prime}\left(x-x^{\prime}\right) \ \ \ \ \ (3)$

Although we’ve posted many articles based on Griffiths’s book in which we solved the Schrödinger equation, the approach taken by Shankar is a bit different and, in some ways, a lot more elegant. We begin with a Hamiltonian that does not depend explicitly on time, and observe that, since the Schrödinger equation contains only the first derivative with respect to time, the time evolution of a state is uniquely determined by specifying only the initial state ${\left|\psi\left(0\right)\right\rangle }$. [A differential equation that is second order in time, such as the wave equation, requires both the initial position and initial velocity to be specified.]

The solution of the Schrödinger equation is then found in analogy to the approach we used in solving the coupled masses problem earlier. We find the eigenvalues and eigenvectors of the Hamiltonian in some basis and use these to construct the propagator ${U\left(t\right)}$. We can then write the solution as

$\displaystyle \left|\psi\left(t\right)\right\rangle =U\left(t\right)\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (4)$

For the case of a time-independent Hamiltonian, we can actually construct ${U\left(t\right)}$ in terms of the eigenvectors of ${H}$. The eigenvalue equation is

$\displaystyle H\left|E\right\rangle =E\left|E\right\rangle \ \ \ \ \ (5)$

where ${E}$ is an eigenvalue of ${H}$ and ${\left|E\right\rangle }$ is its corresponding eigenvector. Since the eigenvectors form a basis of the vector space, we can expand the wave function in terms of them in the usual way

$\displaystyle \left|\psi\left(t\right)\right\rangle =\sum\left|E\right\rangle \left\langle E\left|\psi\left(t\right)\right.\right\rangle \ \ \ \ \ (6)$

$\displaystyle \equiv\sum a_{E}\left(t\right)\left|E\right\rangle \ \ \ \ \ (7)$

The coefficient ${a_{E}\left(t\right)}$ is the component of ${\left|\psi\left(t\right)\right\rangle }$ along the ${\left|E\right\rangle }$ vector as a function of time. We can now apply the Schrödinger equation 1 to get (a dot over a symbol indicates a time derivative):

$\displaystyle i\hbar\frac{\partial}{\partial t}\left|\psi\left(t\right)\right\rangle =i\hbar\sum\dot{a}_{E}\left(t\right)\left|E\right\rangle \ \ \ \ \ (8)$

$\displaystyle =H\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (9)$

$\displaystyle =\sum a_{E}\left(t\right)H\left|E\right\rangle \ \ \ \ \ (10)$

$\displaystyle =\sum a_{E}\left(t\right)E\left|E\right\rangle \ \ \ \ \ (11)$

Since the eigenvectors ${\left|E\right\rangle }$ are linearly independent (as they form a basis for the vector space), each term in the sum in the first line must be equal to the corresponding term in the sum in the last line, so we have

$\displaystyle i\hbar\dot{a}_{E}\left(t\right)=a_{E}\left(t\right)E \ \ \ \ \ (12)$

The solution is

$\displaystyle a_{E}\left(t\right)=a_{E}\left(0\right)e^{-iEt/\hbar}\ \ \ \ \ (13)$

$\displaystyle =e^{-iEt/\hbar}\left\langle E\left|\psi\left(0\right)\right.\right\rangle \ \ \ \ \ (14)$

The general solution 7 is therefore

$\displaystyle \left|\psi\left(t\right)\right\rangle =\sum e^{-iEt/\hbar}\left|E\right\rangle \left\langle E\left|\psi\left(0\right)\right.\right\rangle \ \ \ \ \ (15)$

from which we can read off the propagator:

$\displaystyle U\left(t\right)=\sum e^{-iEt/\hbar}\left|E\right\rangle \left\langle E\right| \ \ \ \ \ (16)$

Thus if we can determine the eigenvalues and eigenvectors of ${H}$, we can write the propagator in terms of them and get the general solution. We can see from this that ${U\left(t\right)}$ is unitary:

$\displaystyle U^{\dagger}U=\sum_{E^{\prime}}\sum_{E}e^{i\left(E-E^{\prime}\right)t/\hbar}\left|E\right\rangle \left\langle E\left|E^{\prime}\right.\right\rangle \left\langle E^{\prime}\right|\ \ \ \ \ (17)$

$\displaystyle =\sum_{E^{\prime}}\sum_{E}e^{i\left(E-E^{\prime}\right)t/\hbar}\left|E\right\rangle \delta_{EE^{\prime}}\left\langle E^{\prime}\right|\ \ \ \ \ (18)$

$\displaystyle =\sum_{E}\left|E\right\rangle \left\langle E\right|\ \ \ \ \ (19)$

$\displaystyle =1 \ \ \ \ \ (20)$

This derivation uses the fact that the eigenvectors are orthonormal and form a complete set, so that ${\left\langle E\left|E^{\prime}\right.\right\rangle =\delta_{EE^{\prime}}}$ and ${\sum_{E}\left|E\right\rangle \left\langle E\right|=1}$. Since a unitary operator doesn’t change the norm of a vector, we see from 4 that if ${\left|\psi\left(0\right)\right\rangle }$ is normalized, then so is ${\left|\psi\left(t\right)\right\rangle }$ for all times ${t}$. Further, the probability that the state will be measured to be in eigenstate ${\left|E\right\rangle }$ is constant over time, since this probability is given by

$\displaystyle \left|a_{E}\left(t\right)\right|^{2}=\left|e^{-iEt/\hbar}\left\langle E\left|\psi\left(0\right)\right.\right\rangle \right|^{2}=\left|\left\langle E\left|\psi\left(0\right)\right.\right\rangle \right|^{2} \ \ \ \ \ (21)$
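A short numerical sketch (mine, with ${\hbar=1}$ and an arbitrary ${2\times2}$ Hermitian ${H}$) confirms this: building ${U\left(t\right)}$ from 16, the probabilities ${\left|a_{E}\left(t\right)\right|^{2}}$ come out the same at ${t=0}$ and at a later time:

```python
import numpy as np

# an arbitrary Hermitian Hamiltonian, chosen for illustration (hbar = 1)
H = np.array([[1.0, 0.3], [0.3, -0.5]])
E, V = np.linalg.eigh(H)                 # columns of V are the kets |E>

def U(t):
    # eq. (16): sum over E of exp(-iEt) |E><E|
    return sum(np.exp(-1j * E[k] * t) * np.outer(V[:, k], V[:, k].conj())
               for k in range(len(E)))

psi0 = np.array([0.6, 0.8], dtype=complex)      # a normalized initial state
p0 = np.abs(V.conj().T @ psi0) ** 2             # |<E|psi(0)>|^2
pt = np.abs(V.conj().T @ (U(5.0) @ psi0)) ** 2  # |<E|psi(t)>|^2 at t = 5

print(np.allclose(p0, pt))               # -> True: the probabilities don't change
```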

This derivation assumed that the spectrum of ${H}$ was discrete and non-degenerate. If the possible eigenvalues ${E}$ are continuous, then the sum is replaced by an integral

$\displaystyle U\left(t\right)=\int e^{-iEt/\hbar}\left|E\right\rangle \left\langle E\right|dE \ \ \ \ \ (22)$

If the spectrum is discrete and degenerate, then we need to find an orthonormal set of eigenvectors that spans each degenerate subspace, and sum over these sets. For example, if ${E_{1}}$ is degenerate, then we find a set of eigenvectors ${\left|E_{1},\alpha\right\rangle }$ that spans the subspace for which ${E_{1}}$ is the eigenvalue. The index ${\alpha}$ runs from 1 up to the degree of degeneracy of ${E_{1}}$, and the propagator is then

$\displaystyle U\left(t\right)=\sum_{\alpha}\sum_{E_{i}}e^{-iE_{i}t/\hbar}\left|E_{i},\alpha\right\rangle \left\langle E_{i},\alpha\right| \ \ \ \ \ (23)$

The sum over ${E_{i}}$ runs over all the distinct eigenvalues, and the sum over ${\alpha}$ runs over the eigenvectors for each different ${E_{i}}$.

Another form of the propagator can be written directly in terms of the time-independent Hamiltonian as

$\displaystyle U\left(t\right)=e^{-iHt/\hbar} \ \ \ \ \ (24)$

This relies on the concept of the function of an operator, so that ${e^{-iHt/\hbar}}$ is a matrix whose elements are power series of the exponent ${-\frac{iHt}{\hbar}}$. The power series must, of course, converge for this solution to be valid. Since ${H}$ is Hermitian, ${U\left(t\right)}$ is unitary. We can verify that the solution using this form of ${U\left(t\right)}$ satisfies the Schrödinger equation:

$\displaystyle \left|\psi\left(t\right)\right\rangle =U\left(t\right)\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (25)$

$\displaystyle =e^{-iHt/\hbar}\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (26)$

$\displaystyle i\hbar\left|\dot{\psi}\left(t\right)\right\rangle =i\hbar\frac{d}{dt}\left(e^{-iHt/\hbar}\right)\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (27)$

$\displaystyle =i\hbar\left(-\frac{i}{\hbar}\right)He^{-iHt/\hbar}\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (28)$

$\displaystyle =He^{-iHt/\hbar}\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (29)$

$\displaystyle =H\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (30)$

The derivative of ${U\left(t\right)}$ can be calculated from the derivatives of its matrix elements, which are all power series.

# Functions of hermitian operators

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Exercises 1.9.1 – 1.9.3.

One of the most common ways to define a function of an operator is to consider the case where the function can be expressed as a power series. That is, given an operator ${\Omega}$, a function ${f\left(\Omega\right)}$ can be defined as

$\displaystyle f\left(\Omega\right)=\sum_{n=0}^{\infty}a_{n}\Omega^{n} \ \ \ \ \ (1)$

where the coefficients ${a_{n}}$ are, in general, complex scalars. This definition can still be difficult to deal with if ${\Omega}$ is not diagonalizable since, in that case, powers of ${\Omega}$ have no simple form, so it can be hard to tell if the series converges.

We can avoid this problem by restricting ourselves to hermitian operators, since such operators are always diagonalizable according to the spectral theorem and all eigenvalues of hermitian operators are real. Then powers of ${\Omega}$ are easy to calculate, since if the ${i}$th diagonal element of ${\Omega}$ is ${\omega_{i}}$, the ${i}$th diagonal element of ${\Omega^{n}}$ is ${\omega_{i}^{n}}$. The problem of finding ${f\left(\Omega\right)}$ is then reduced to examining whether the series converges for each diagonal element.

Example 1 Suppose we have the simplest such power series

$\displaystyle f\left(\Omega\right)=\sum_{n=0}^{\infty}\Omega^{n} \ \ \ \ \ (2)$

If we look at this series in the eigenbasis (the basis of orthonormal eigenvectors that diagonalizes ${\Omega}$), then we have

$\displaystyle f\left(\Omega\right)=\left[\begin{array}{cccc} \sum_{n=0}^{\infty}\omega_{1}^{n}\\ & \sum_{n=0}^{\infty}\omega_{2}^{n}\\ & & \ddots\\ & & & \sum_{n=0}^{\infty}\omega_{m}^{n} \end{array}\right] \ \ \ \ \ (3)$

${\Omega}$ here is an ${m\times m}$ matrix with eigenvalues ${\omega_{i}}$, ${i=1,\ldots,m}$ (it’s possible that some of the eigenvalues could be equal, if ${\Omega}$ is degenerate, but that doesn’t affect the argument).

It’s known that the geometric series

$\displaystyle f\left(x\right)=\sum_{n=0}^{\infty}x^{n}=\frac{1}{1-x} \ \ \ \ \ (4)$

converges as shown, provided that ${\left|x\right|<1}$. Thus we see that ${f\left(\Omega\right)}$ converges provided all its eigenvalues satisfy ${\left|\omega_{i}\right|<1}$. The function is then

$\displaystyle f\left(\Omega\right)=\left[\begin{array}{cccc} \frac{1}{1-\omega_{1}}\\ & \frac{1}{1-\omega_{2}}\\ & & \ddots\\ & & & \frac{1}{1-\omega_{m}} \end{array}\right] \ \ \ \ \ (5)$

To see what operator it converges to, we consider the function

$\displaystyle g\left(\Omega\right)=\left(I-\Omega\right)^{-1} \ \ \ \ \ (6)$

Still working in the eigenbasis where ${\Omega}$ is diagonal, the matrix ${I-\Omega}$ is also diagonal with diagonal elements ${1-\omega_{i}}$. The inverse of a diagonal matrix is another diagonal matrix with diagonal elements equal to the reciprocal of the elements in the original matrix, so ${\left(I-\Omega\right)^{-1}}$ has diagonal elements ${\frac{1}{1-\omega_{i}}}$ so from 5 we see that

$\displaystyle f\left(\Omega\right)=\sum_{n=0}^{\infty}\Omega^{n}=\left(I-\Omega\right)^{-1} \ \ \ \ \ (7)$

provided all the eigenvalues of ${\Omega}$ satisfy ${\left|\omega_{i}\right|<1}$.
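We can check this result numerically. The sketch below (an illustration of mine) sums the first 200 powers of a small Hermitian ${\Omega}$ whose eigenvalues all have magnitude below 1, and compares with ${\left(I-\Omega\right)^{-1}}$:

```python
import numpy as np

# a Hermitian Omega chosen so that both eigenvalues have magnitude < 1
Omega = np.array([[0.2, 0.1], [0.1, 0.3]])
assert np.all(np.abs(np.linalg.eigvalsh(Omega)) < 1)

partial = np.zeros_like(Omega)
power = np.eye(2)
for _ in range(200):                      # partial sum of Omega^n, n = 0..199
    partial = partial + power
    power = power @ Omega

closed = np.linalg.inv(np.eye(2) - Omega)  # the closed form (I - Omega)^{-1}
print(np.max(np.abs(partial - closed)))    # essentially zero
```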

Example 2 If ${H}$ is a hermitian operator, then ${e^{iH}}$ is unitary. To see this, we again work in the eigenbasis of ${H}$. By expressing ${e^{iH}}$ as a power series and using the same argument as in the previous example, we see that

$\displaystyle U=e^{iH}=\left[\begin{array}{cccc} e^{i\omega_{1}}\\ & e^{i\omega_{2}}\\ & & \ddots\\ & & & e^{i\omega_{m}} \end{array}\right] \ \ \ \ \ (8)$

The adjoint of ${e^{iH}}$ is found by looking at the power series:

$\displaystyle U^{\dagger}=\left(e^{iH}\right)^{\dagger}=\left[\sum_{n=0}^{\infty}\frac{\left(iH\right)^{n}}{n!}\right]^{\dagger}\ \ \ \ \ (9)$

$\displaystyle =\sum_{n=0}^{\infty}\frac{\left(-iH^{\dagger}\right)^{n}}{n!}\ \ \ \ \ (10)$

$\displaystyle =\sum_{n=0}^{\infty}\frac{\left(-iH\right)^{n}}{n!}\ \ \ \ \ (11)$

$\displaystyle =e^{-iH} \ \ \ \ \ (12)$

where in the third line we used the hermitian property ${H^{\dagger}=H}$. Therefore

$\displaystyle \left(e^{iH}\right)^{\dagger}=e^{-iH}=\left[\begin{array}{cccc} e^{-i\omega_{1}}\\ & e^{-i\omega_{2}}\\ & & \ddots\\ & & & e^{-i\omega_{m}} \end{array}\right]\ \ \ \ \ (13)$

$\displaystyle U^{\dagger}U=\left(e^{iH}\right)^{\dagger}e^{iH}=\left[\begin{array}{cccc} e^{-i\omega_{1}}\\ & e^{-i\omega_{2}}\\ & & \ddots\\ & & & e^{-i\omega_{m}} \end{array}\right]\left[\begin{array}{cccc} e^{i\omega_{1}}\\ & e^{i\omega_{2}}\\ & & \ddots\\ & & & e^{i\omega_{m}} \end{array}\right]\ \ \ \ \ (14)$

$\displaystyle =I \ \ \ \ \ (15)$

Thus ${\left(e^{iH}\right)^{\dagger}=\left(e^{iH}\right)^{-1}}$ and ${e^{iH}}$ is unitary.

From 8 we can find the determinant of ${e^{iH}}$:

$\displaystyle \det U=\det e^{iH}=\exp\left[i\sum_{i=1}^{m}\omega_{i}\right]=\exp\left(i\mbox{Tr}H\right) \ \ \ \ \ (16)$

since the trace of a hermitian matrix is the sum of its eigenvalues.
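A quick numerical check of 16 (my own sketch, using a randomly generated Hermitian matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2                      # Hermitian by construction

E, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(1j * E)) @ V.conj().T  # U = exp(iH) via the eigenbasis

# det(exp(iH)) should equal exp(i Tr H)
print(np.allclose(np.linalg.det(U), np.exp(1j * np.trace(H))))   # -> True
```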

# Spectral theorem for normal operators

References: edX online course MIT 8.05 Week 6.

We’ll now look at a central theorem about normal operators, known as the spectral theorem.

We’ve seen that if a matrix ${M}$ has a set ${v}$ of eigenvectors that span the space, then we can diagonalize ${M}$ by means of the similarity transformation

$\displaystyle D_{M}=A^{-1}MA \ \ \ \ \ (1)$

where ${D_{T}}$ is diagonal and the columns of ${A}$ are the eigenvectors of ${M}$. In the general case, there’s no guarantee that the eigenvectors of ${M}$ are orthonormal. However, if there is an orthonormal basis in which ${M}$ is diagonal, then ${M}$ is said to be unitarily diagonalizable. Suppose we start with some arbitrary orthonormal basis ${\left(e_{1},\ldots,e_{n}\right)}$ (we can always construct such a basis using the Gram-Schmidt procedure). Then if the set of eigenvectors of ${M}$ form an orthonormal basis ${\left(u_{1},\ldots,u_{n}\right)}$, there is a unitary matrix ${U}$ that transforms the ${e_{i}}$ basis into the ${u_{i}}$ basis (since unitary operators preserve inner products):

$\displaystyle u_{i}=Ue_{i}=\sum_{j}U_{ji}e_{j} \ \ \ \ \ (2)$

Using this unitary operator, we therefore have for a unitarily diagonalizable operator ${M}$

$\displaystyle D_{M}=U^{-1}MU=U^{\dagger}MU \ \ \ \ \ (3)$

The spectral theorem now states:

Theorem An operator ${M}$ in a complex vector space has an orthonormal basis of eigenvectors (that is, it’s unitarily diagonalizable) if and only if ${M}$ is normal.

Proof: Since this is an ‘if and only if’ theorem, we need to prove it in both directions. First, suppose that ${M}$ is unitarily diagonalizable, so that 3 holds for some ${U}$. Then

$\displaystyle M=UD_{M}U^{\dagger}\ \ \ \ \ (4)$

$\displaystyle M^{\dagger}=UD_{M}^{\dagger}U^{\dagger} \ \ \ \ \ (5)$

The commutator is then, since ${U^{\dagger}U=I}$

$\displaystyle \left[M^{\dagger},M\right]=UD_{M}^{\dagger}D_{M}U^{\dagger}-UD_{M}D_{M}^{\dagger}U^{\dagger}\ \ \ \ \ (6)$

$\displaystyle =U\left[D_{M}^{\dagger},D_{M}\right]U^{\dagger}\ \ \ \ \ (7)$

$\displaystyle =0 \ \ \ \ \ (8)$

where the result follows because all diagonal matrices commute. Thus ${M}$ is normal, and this completes one direction of the proof.

Going the other way is a bit trickier. We need to show that for any normal matrix ${M}$ with elements defined on some arbitrary orthonormal basis (that is, a basis that is not necessarily composed of eigenvectors of ${M}$), there is a unitary matrix ${U}$ such that ${U^{\dagger}MU}$ is diagonal. Since we started with an orthonormal basis and ${U}$ preserves inner products, the new basis is also orthonormal which will prove the theorem.

The proof uses mathematical induction, in which we first prove that the result is true for one specific dimension of vector space, say ${\mbox{dim }V=1}$. We can then assume the result is true for some dimension ${n-1}$ and from that assumption, prove it is also true for the next higher dimension ${n}$.

Since any ${1\times1}$ matrix is diagonal (it consists of only one element), the result is true for ${\mbox{dim }V=1}$. So we now assume it’s true for a dimension of ${n-1}$ and prove it’s true for a dimension of ${n}$.

We take an arbitrary orthonormal basis of the ${n}$-dimensional ${V}$ to be ${\left(\left|1\right\rangle ,\ldots,\left|n\right\rangle \right)}$. In that basis, the matrix ${M}$ has elements ${M_{ij}=\left\langle i\left|M\right|j\right\rangle }$. We know that ${M}$ has at least one eigenvalue ${\lambda_{1}}$ with a normalized eigenvector ${\left|x_{1}\right\rangle }$:

$\displaystyle M\left|x_{1}\right\rangle =\lambda_{1}\left|x_{1}\right\rangle \ \ \ \ \ (9)$

and, since ${M}$ is normal, the eigenvector ${\left|x_{1}\right\rangle }$ is also an eigenvector of ${M^{\dagger}}$:

$\displaystyle M^{\dagger}\left|x_{1}\right\rangle =\lambda_{1}^*\left|x_{1}\right\rangle \ \ \ \ \ (10)$

Starting with a basis of ${V}$ containing ${\left|x_{1}\right\rangle }$, we can use Gram-Schmidt to generate an orthonormal basis ${\left(\left|x_{1}\right\rangle ,\ldots,\left|x_{n}\right\rangle \right)}$. We now define an operator ${U_{1}}$ as follows:

$\displaystyle U_{1}\equiv\sum_{i}\left|x_{i}\right\rangle \left\langle i\right| \ \ \ \ \ (11)$

${U_{1}}$ is unitary, since

$\displaystyle U_{1}^{\dagger}=\sum_{i}\left|i\right\rangle \left\langle x_{i}\right|\ \ \ \ \ (12)$

$\displaystyle U_{1}^{\dagger}U_{1}=\sum_{i}\sum_{j}\left|i\right\rangle \left\langle x_{i}\left|x_{j}\right.\right\rangle \left\langle j\right|\ \ \ \ \ (13)$

$\displaystyle =\sum_{i}\sum_{j}\left|i\right\rangle \delta_{ij}\left\langle j\right|\ \ \ \ \ (14)$

$\displaystyle =\sum_{i}\left|i\right\rangle \left\langle i\right|\ \ \ \ \ (15)$

$\displaystyle =I \ \ \ \ \ (16)$

From its definition

$\displaystyle U_{1}\left|1\right\rangle =\left|x_{1}\right\rangle \ \ \ \ \ (17)$

$\displaystyle U_{1}^{\dagger}\left|x_{1}\right\rangle =\left|1\right\rangle \ \ \ \ \ (18)$

Now consider the matrix ${M_{1}}$ defined as

$\displaystyle M_{1}\equiv U_{1}^{\dagger}MU_{1} \ \ \ \ \ (19)$

${M_{1}}$ is also normal, as can be verified by calculating the commutator and using ${\left[M^{\dagger},M\right]=0}$. Further

$\displaystyle M_{1}\left|1\right\rangle =U_{1}^{\dagger}MU_{1}\left|1\right\rangle \ \ \ \ \ (20)$

$\displaystyle =U_{1}^{\dagger}M\left|x_{1}\right\rangle \ \ \ \ \ (21)$

$\displaystyle =\lambda_{1}U_{1}^{\dagger}\left|x_{1}\right\rangle \ \ \ \ \ (22)$

$\displaystyle =\lambda_{1}\left|1\right\rangle \ \ \ \ \ (23)$

Thus ${\left|1\right\rangle }$ is an eigenvector of ${M_{1}}$ with eigenvalue ${\lambda_{1}}$.

The matrix elements in the first column of ${M_{1}}$ in the original basis ${\left(\left|1\right\rangle ,\ldots,\left|n\right\rangle \right)}$ are

$\displaystyle \left\langle j\left|M_{1}\right|1\right\rangle =\lambda_{1}\left\langle j\left|1\right.\right\rangle =\lambda_{1}\delta_{1j} \ \ \ \ \ (24)$

Thus all entries in the first column are zero except for the first row, where it is ${\lambda_{1}}$. How about the first row? Using 10 we have

$\displaystyle \left\langle 1\left|M_{1}\right|j\right\rangle =\left(\left\langle j\left|M_{1}^{\dagger}\right|1\right\rangle \right)^*\ \ \ \ \ (25)$

$\displaystyle =\left(\lambda_{1}^*\left\langle j\left|1\right.\right\rangle \right)^*\ \ \ \ \ (26)$

$\displaystyle =\lambda_{1}\delta_{1j} \ \ \ \ \ (27)$

Thus all entries in the first row, except the first, are also zero. Thus in the original basis ${\left(\left|1\right\rangle ,\ldots,\left|n\right\rangle \right)}$ we have

$\displaystyle M_{1}=\left[\begin{array}{cccc} \lambda_{1} & 0 & \ldots & 0\\ 0\\ \vdots & & M^{\prime}\\ 0 \end{array}\right] \ \ \ \ \ (28)$

where ${M^{\prime}}$ is an ${\left(n-1\right)\times\left(n-1\right)}$ matrix. We have

$\displaystyle M_{1}^{\dagger}=\left[\begin{array}{cccc} \lambda_{1}^* & 0 & \ldots & 0\\ 0\\ \vdots & & \left(M^{\prime}\right)^{\dagger}\\ 0 \end{array}\right]\ \ \ \ \ (29)$

$\displaystyle M_{1}^{\dagger}M_{1}=\left[\begin{array}{cccc} \left|\lambda_{1}\right|^{2} & 0 & \ldots & 0\\ 0\\ \vdots & & \left(M^{\prime}\right)^{\dagger}M^{\prime}\\ 0 \end{array}\right]\ \ \ \ \ (30)$

$\displaystyle M_{1}M_{1}^{\dagger}=\left[\begin{array}{cccc} \left|\lambda_{1}\right|^{2} & 0 & \ldots & 0\\ 0\\ \vdots & & M^{\prime}\left(M^{\prime}\right)^{\dagger}\\ 0 \end{array}\right] \ \ \ \ \ (31)$

Since ${M_{1}}$ is normal, we must have ${M_{1}M_{1}^{\dagger}=M_{1}^{\dagger}M_{1}}$, which implies

$\displaystyle \left(M^{\prime}\right)^{\dagger}M^{\prime}=M^{\prime}\left(M^{\prime}\right)^{\dagger} \ \ \ \ \ (32)$

so that ${M^{\prime}}$ is also a normal matrix. By the induction hypotheses, since ${M^{\prime}}$ is an ${\left(n-1\right)\times\left(n-1\right)}$ normal matrix, it is unitarily diagonalizable by some unitary matrix ${U^{\prime}}$, that is

$\displaystyle U^{\prime\dagger}M^{\prime}U^{\prime}=D_{M^{\prime}} \ \ \ \ \ (33)$

is diagonal. We can extend ${U^{\prime}}$ to an ${n\times n}$ unitary matrix by adding a 1 to the upper left:

$\displaystyle U=\left[\begin{array}{cccc} 1 & 0 & \ldots & 0\\ 0\\ \vdots & & U^{\prime}\\ 0 \end{array}\right] \ \ \ \ \ (34)$

We can check that ${U}$ is unitary by direct calculation, using ${\left(U^{\prime}\right)^{\dagger}U^{\prime}=I}$

$\displaystyle U^{\dagger}U=\left[\begin{array}{cccc} 1 & 0 & \ldots & 0\\ 0\\ \vdots & & \left(U^{\prime}\right)^{\dagger}\\ 0 \end{array}\right]\left[\begin{array}{cccc} 1 & 0 & \ldots & 0\\ 0\\ \vdots & & U^{\prime}\\ 0 \end{array}\right]\ \ \ \ \ (35)$

$\displaystyle =\left[\begin{array}{cccc} 1 & 0 & \ldots & 0\\ 0\\ \vdots & & \left(U^{\prime}\right)^{\dagger}U^{\prime}\\ 0 \end{array}\right]\ \ \ \ \ (36)$

$\displaystyle =\left[\begin{array}{cccc} 1\\ & 1\\ & & \ddots\\ & & & 1 \end{array}\right]=I \ \ \ \ \ (37)$

We then have, using 33

 $\displaystyle U^{\dagger}M_{1}U$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cccc} 1 & 0 & \ldots & 0\\ 0\\ \vdots & & \left(U^{\prime}\right)^{\dagger}\\ 0 \end{array}\right]\left[\begin{array}{cccc} \lambda_{1} & 0 & \ldots & 0\\ 0\\ \vdots & & M^{\prime}\\ 0 \end{array}\right]\left[\begin{array}{cccc} 1 & 0 & \ldots & 0\\ 0\\ \vdots & & U^{\prime}\\ 0 \end{array}\right]\ \ \ \ \ (38)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cccc} \lambda_{1} & 0 & \ldots & 0\\ 0\\ \vdots & & U^{\prime\dagger}M^{\prime}U^{\prime}\\ 0 \end{array}\right]\ \ \ \ \ (39)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cccc} \lambda_{1} & 0 & \ldots & 0\\ 0\\ \vdots & & D_{M^{\prime}}\\ 0 \end{array}\right] \ \ \ \ \ (40)$

That is, ${U^{\dagger}M_{1}U}$ is diagonal. From the definition 19 of ${M_{1}}$, we now have

 $\displaystyle U^{\dagger}M_{1}U$ $\displaystyle =$ $\displaystyle U^{\dagger}U_{1}^{\dagger}MU_{1}U\ \ \ \ \ (41)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left(U_{1}U\right)^{\dagger}M\left(U_{1}U\right) \ \ \ \ \ (42)$

Since the product of two unitary matrices is unitary, we have found a unitary operator ${U_{1}U}$ that diagonalizes ${M}$, which proves the result. $\Box$
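As a numerical sanity check (an illustrative sketch in NumPy, not part of Shankar's argument), we can build a normal matrix with distinct eigenvalues and verify that its unit-normalized eigenvectors form a unitary matrix that diagonalizes it:

```python
import numpy as np

# Construct a normal matrix M = V D V^dagger, with V unitary (from a QR
# decomposition of a random complex matrix) and D a complex diagonal
# with well-separated eigenvalues.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
V, _ = np.linalg.qr(A)                       # V is unitary
D = np.diag([1 + 1j, 2 - 1j, -0.5 + 2j, 3.0])
M = V @ D @ V.conj().T

# M is normal: M M^dagger = M^dagger M
assert np.allclose(M @ M.conj().T, M.conj().T @ M)

# For a normal matrix with distinct eigenvalues, the unit-normalized
# eigenvectors returned by eig are mutually orthogonal, so U is unitary
# and U^dagger M U is diagonal.
w, U = np.linalg.eig(M)
assert np.allclose(U.conj().T @ U, np.eye(n), atol=1e-8)
D_M = U.conj().T @ M @ U
assert np.allclose(D_M, np.diag(np.diag(D_M)), atol=1e-8)
```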

Notice that the proof didn’t assume that the eigenvalues are nondegenerate, so that even if there are several linearly independent eigenvectors corresponding to one eigenvalue, it is still possible to find an orthonormal basis consisting of the eigenvectors. In other words, for any hermitian or unitary operator, it is always possible to find an orthonormal basis of the vector space consisting of eigenvectors of the operator.

In the general case, a normal matrix ${M}$ in an ${n}$-dimensional vector space can have ${m}$ distinct eigenvalues, where ${1\le m\le n}$. If ${m=n}$, there is no degeneracy and each eigenvalue has a unique (up to a scalar multiple) eigenvector. If ${m<n}$, then one or more of the eigenvalues occurs more than once, and the eigenvector subspace corresponding to a degenerate eigenvalue has a dimension larger than 1. However, the spectral theorem guarantees that it is possible to choose an orthonormal basis within each subspace, and that each subspace is orthogonal to all other subspaces.

More precisely, the vector space ${V}$ can be decomposed into ${m}$ subspaces ${U_{k}}$ for ${k=1,\ldots,m}$, with the dimension ${d_{k}}$ of subspace ${U_{k}}$ equal to the degeneracy of eigenvalue ${\lambda_{k}}$. The full space ${V}$ is the direct sum of these subspaces

 $\displaystyle V$ $\displaystyle =$ $\displaystyle U_{1}\oplus U_{2}\oplus\ldots\oplus U_{m}\ \ \ \ \ (43)$ $\displaystyle n$ $\displaystyle =$ $\displaystyle \sum_{k=1}^{m}d_{k} \ \ \ \ \ (44)$

It’s usually most convenient to order the eigenvectors as follows:

$\displaystyle \left(u_{1}^{\left(1\right)},\ldots,u_{d_{1}}^{\left(1\right)},u_{1}^{\left(2\right)},\ldots,u_{d_{2}}^{\left(2\right)},\ldots,u_{1}^{\left(m\right)},\ldots,u_{d_{m}}^{\left(m\right)}\right) \ \ \ \ \ (45)$

The notation ${u_{j}^{\left(k\right)}}$ means the ${j}$th eigenvector belonging to eigenvalue ${\lambda_{k}}$.

In practice, there is a lot of freedom in choosing orthonormal eigenvectors for degenerate eigenvalues, since we can pick any ${d_{k}}$ mutually orthogonal unit vectors within the subspace of dimension ${d_{k}}$. For example, in 3-d space we usually choose the ${x,y,z}$ unit vectors as the orthonormal set, but we can rotate these three vectors about the origin, or even reflect them in a plane passing through the origin, and still get an orthonormal set of 3 vectors.

The diagonal form of the normal matrix ${M}$ in this orthonormal basis is

$\displaystyle D_{M}=\left[\begin{array}{ccccccc} \lambda_{1}\\ & \ddots\\ & & \lambda_{1}\\ & & & \ddots\\ & & & & \lambda_{m}\\ & & & & & \ddots\\ & & & & & & \lambda_{m} \end{array}\right] \ \ \ \ \ (46)$

Here, eigenvalue ${\lambda_{k}}$ occurs ${d_{k}}$ times along the diagonal.
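The degenerate case can also be checked numerically. As an illustrative sketch (the matrix below is my own choice), `numpy.linalg.eigh` returns an orthonormal eigenbasis even when an eigenvalue repeats, with each eigenvalue appearing ${d_{k}}$ times in the diagonal form:

```python
import numpy as np

# A hermitian matrix with a degenerate eigenvalue: a projector with
# eigenvalues 0 (d_1 = 2) and 1 (d_2 = 1), rotated into a non-obvious
# basis by a random unitary V.
rng = np.random.default_rng(1)
V, _ = np.linalg.qr(rng.standard_normal((3, 3))
                    + 1j * rng.standard_normal((3, 3)))
M = V @ np.diag([0.0, 0.0, 1.0]) @ V.conj().T

# eigh still returns a full orthonormal set of eigenvectors: within the
# 2-d degenerate subspace it simply picks one orthonormal pair.
w, U = np.linalg.eigh(M)
assert np.allclose(w, [0.0, 0.0, 1.0], atol=1e-8)   # eigenvalues, with degeneracy
assert np.allclose(U.conj().T @ U, np.eye(3))       # orthonormal eigenbasis
assert np.allclose(U.conj().T @ M @ U, np.diag(w), atol=1e-8)
```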

# Unitary matrices – some examples

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Exercises 1.6.3 – 1.6.6.

Here are a few more results about unitary operators.

Shankar defines a unitary operator ${U}$ as one where

$\displaystyle UU^{\dagger}=I \ \ \ \ \ (1)$

From this we can derive the other condition by which they can be defined, namely that a unitary operator preserves the norm of a vector:

$\displaystyle \left|Uv\right|=\left|v\right| \ \ \ \ \ (2)$

This follows (note that in a finite-dimensional space ${UU^{\dagger}=I}$ implies ${U^{\dagger}U=I}$ as well, since a right inverse is also a left inverse), for if we define the action of ${U}$ by

$\displaystyle \left|v_{1}^{\prime}\right\rangle =U\left|v_{1}\right\rangle \ \ \ \ \ (3)$

then

 $\displaystyle \left\langle v_{1}^{\prime}\left|v_{1}^{\prime}\right.\right\rangle$ $\displaystyle =$ $\displaystyle \left\langle Uv_{1}\left|Uv_{1}\right.\right\rangle \ \ \ \ \ (4)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle v_{1}\left|U^{\dagger}Uv_{1}\right.\right\rangle \ \ \ \ \ (5)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle v_{1}\left|v_{1}\right.\right\rangle \ \ \ \ \ (6)$

Thus ${\left|v_{1}^{\prime}\right|^{2}=\left|v_{1}\right|^{2}}$.

Theorem 1 The product of two unitary operators ${U_{1}}$ and ${U_{2}}$ is unitary.

Proof: Using Shankar’s definition 1, we have

 $\displaystyle \left(U_{1}U_{2}\right)^{\dagger}U_{1}U_{2}$ $\displaystyle =$ $\displaystyle U_{2}^{\dagger}U_{1}^{\dagger}U_{1}U_{2}\ \ \ \ \ (7)$ $\displaystyle$ $\displaystyle =$ $\displaystyle U_{2}^{\dagger}IU_{2}\ \ \ \ \ (8)$ $\displaystyle$ $\displaystyle =$ $\displaystyle U_{2}^{\dagger}U_{2}\ \ \ \ \ (9)$ $\displaystyle$ $\displaystyle =$ $\displaystyle I \ \ \ \ \ (10)$

$\Box$

Theorem 2 The determinant of a unitary matrix ${U}$ is a complex number with unit modulus.

Proof: The determinant of a hermitian conjugate is the complex conjugate of the determinant of the original matrix, since ${\det U=\det U^{T}}$ (where the superscript ${T}$ denotes the transpose) for any matrix, and the hermitian conjugate is the complex conjugate transpose. Therefore

$\displaystyle \det\left(UU^{\dagger}\right)=\left[\det U\right]\left[\det U\right]^*=\det I=1 \ \ \ \ \ (11)$

Therefore ${\left|\det U\right|^{2}=1}$ as required.$\Box$

Example 1 The rotation matrix ${R\left(\frac{\pi}{2}\mathbf{i}\right)}$, which rotates a vector by ${\frac{\pi}{2}}$ about the ${x}$ axis, is unitary. We have

$\displaystyle R\left(\frac{\pi}{2}\mathbf{i}\right)=\left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0 \end{array}\right] \ \ \ \ \ (12)$

By direct calculation

 $\displaystyle RR^{\dagger}$ $\displaystyle =$ $\displaystyle \left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0 \end{array}\right]\left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & -1 & 0 \end{array}\right]\ \ \ \ \ (13)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{array}\right]=I \ \ \ \ \ (14)$

Example 2 Consider the matrix

$\displaystyle U=\frac{1}{\sqrt{2}}\left[\begin{array}{cc} 1 & i\\ i & 1 \end{array}\right] \ \ \ \ \ (15)$

By calculating

 $\displaystyle UU^{\dagger}$ $\displaystyle =$ $\displaystyle \frac{1}{2}\left[\begin{array}{cc} 1 & i\\ i & 1 \end{array}\right]\left[\begin{array}{cc} 1 & -i\\ -i & 1 \end{array}\right]\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \frac{1}{2}\left[\begin{array}{cc} 2 & 0\\ 0 & 2 \end{array}\right]=I \ \ \ \ \ (17)$

Thus ${U}$ is unitary, but because ${U\ne U^{\dagger}}$ it is not hermitian. Its determinant is

$\displaystyle \det U=\left(\frac{1}{\sqrt{2}}\right)^{2}\left(1-i^{2}\right)=1 \ \ \ \ \ (18)$

This is of the required form ${e^{i\theta}}$ with ${\theta=0}$.

Example 3 Consider the matrix

 $\displaystyle U$ $\displaystyle =$ $\displaystyle \frac{1}{2}\left[\begin{array}{cc} 1+i & 1-i\\ 1-i & 1+i \end{array}\right]\ \ \ \ \ (19)$ $\displaystyle UU^{\dagger}$ $\displaystyle =$ $\displaystyle \frac{1}{4}\left[\begin{array}{cc} 1+i & 1-i\\ 1-i & 1+i \end{array}\right]\left[\begin{array}{cc} 1-i & 1+i\\ 1+i & 1-i \end{array}\right]\ \ \ \ \ (20)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \frac{1}{4}\left[\begin{array}{cc} 4 & 0\\ 0 & 4 \end{array}\right]=I \ \ \ \ \ (21)$

Thus ${U}$ is unitary, but because ${U\ne U^{\dagger}}$ it is not hermitian. Its determinant is

 $\displaystyle \det U$ $\displaystyle =$ $\displaystyle \left(\frac{1}{2}\right)^{2}\left[\left(1+i\right)^{2}-\left(1-i\right)^{2}\right]\ \ \ \ \ (22)$ $\displaystyle$ $\displaystyle =$ $\displaystyle i \ \ \ \ \ (23)$

This is of the required form ${e^{i\theta}}$ with ${\theta=\frac{\pi}{2}}$.
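All three example matrices, along with Theorems 1 and 2, can be verified in a few lines of NumPy (an illustrative check, not from Shankar):

```python
import numpy as np

R = np.array([[1, 0, 0],
              [0, 0, -1],
              [0, 1, 0]], dtype=complex)             # Example 1
U2 = np.array([[1, 1j],
               [1j, 1]]) / np.sqrt(2)                # Example 2
U3 = np.array([[1 + 1j, 1 - 1j],
               [1 - 1j, 1 + 1j]]) / 2                # Example 3

for M in (R, U2, U3):
    assert np.allclose(M @ M.conj().T, np.eye(M.shape[0]))  # M M^dagger = I
    assert np.isclose(abs(np.linalg.det(M)), 1.0)           # |det M| = 1 (Theorem 2)

# The specific determinants computed in the text: 1, 1 and i
assert np.isclose(np.linalg.det(R), 1.0)
assert np.isclose(np.linalg.det(U2), 1.0)
assert np.isclose(np.linalg.det(U3), 1j)

# Theorem 1: a product of unitary matrices is unitary
P = U2 @ U3
assert np.allclose(P @ P.conj().T, np.eye(2))
```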

# Unitary operators

References: edX online course MIT 8.05.1x Week 4.

Sheldon Axler (2015), Linear Algebra Done Right, 3rd edition, Springer. Chapter 7.

Another important type of operator is the unitary operator ${U}$, which is defined by the condition that it is surjective and that

$\displaystyle \left|Uu\right|=\left|u\right| \ \ \ \ \ (1)$

for all ${u\in V}$. That is, a unitary operator preserves the norm of all vectors. The identity matrix ${I}$ is a special case of a unitary operator, as it doesn’t change any vector, but multiplying ${I}$ by any complex number ${\alpha}$ with ${\left|\alpha\right|=1}$ also preserves the norm, so ${\alpha I}$ is another unitary operator.

Because ${U}$ preserves the norm of all vectors, the only vector that can be in the null space of ${U}$ is the zero vector, meaning that ${U}$ is also injective. As it is both injective and surjective, it is invertible.

Theorem 1 For a unitary operator ${U}$, ${U^{\dagger}=U^{-1}}$.

Proof: From its definition and the properties of an adjoint operator, we have

 $\displaystyle \left|Uu\right|^{2}$ $\displaystyle =$ $\displaystyle \left\langle Uu,Uu\right\rangle \ \ \ \ \ (2)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle u,U^{\dagger}Uu\right\rangle \ \ \ \ \ (3)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle u,u\right\rangle \ \ \ \ \ (4)$

Since this holds for every ${u\in V}$ (and ${U^{\dagger}U}$ is hermitian), we must have ${U^{\dagger}U=I}$, and because ${U}$ is invertible, ${U^{\dagger}=U^{-1}}$.$\Box$

Theorem 2 Unitary operators preserve inner products, meaning that ${\left\langle Uu,Uv\right\rangle =\left\langle u,v\right\rangle }$ for all ${u,v\in V}$.

Proof: Since ${U^{\dagger}=U^{-1}}$ we have

$\displaystyle \left\langle Uu,Uv\right\rangle =\left\langle u,U^{\dagger}Uv\right\rangle =\left\langle u,v\right\rangle \ \ \ \ \ (5)$

$\Box$

Theorem 3 Acting on an orthonormal basis ${\left(e_{1},\ldots,e_{n}\right)}$ with a unitary operator ${U}$ produces another orthonormal basis.

Proof: Suppose the orthonormal basis is converted to another set of vectors ${\left(f_{1},\ldots,f_{n}\right)}$ by ${U}$:

$\displaystyle f_{i}=Ue_{i} \ \ \ \ \ (6)$

Then

$\displaystyle \left\langle f_{i},f_{j}\right\rangle =\left\langle Ue_{i},Ue_{j}\right\rangle =\left\langle e_{i},e_{j}\right\rangle =\delta_{ij} \ \ \ \ \ (7)$

Thus ${\left(f_{1},\ldots,f_{n}\right)}$ form an orthonormal set. Since ${V}$ has dimension ${n}$ (it is spanned by the ${n}$ vectors ${\left(e_{1},\ldots,e_{n}\right)}$) and the set ${\left(f_{1},\ldots,f_{n}\right)}$ contains ${n}$ linearly independent (because orthonormal) vectors, ${\left(f_{1},\ldots,f_{n}\right)}$ is also an orthonormal basis for ${V}$.$\Box$

Theorem 4 If one orthonormal basis ${\left(e_{1},\ldots,e_{n}\right)}$ is converted to another ${\left(f_{1},\ldots,f_{n}\right)}$ by a unitary operator ${U}$, then the matrix elements of ${U}$ are the same in both bases.

Proof: This is just a special case of the more general theorem that states that any operator that transforms one set of basis vectors into another has the same matrix elements in both bases. In this case, the proof is especially simple:

 $\displaystyle U_{ki}\left(\left\{ e\right\} \right)$ $\displaystyle =$ $\displaystyle \left\langle e_{k},Ue_{i}\right\rangle \ \ \ \ \ (8)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle U^{-1}f_{k},f_{i}\right\rangle \ \ \ \ \ (9)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle U^{\dagger}f_{k},f_{i}\right\rangle \ \ \ \ \ (10)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle f_{k},Uf_{i}\right\rangle \ \ \ \ \ (11)$ $\displaystyle$ $\displaystyle =$ $\displaystyle U_{ki}\left(\left\{ f\right\} \right) \ \ \ \ \ (12)$

$\Box$
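Theorems 3 and 4 are easy to verify numerically. Here is a sketch, using the ${\frac{\pi}{2}}$ rotation matrix from the previous section as a sample ${U}$ (any unitary would do):

```python
import numpy as np

# A sample unitary U: the pi/2 rotation about the x axis
U = np.array([[1, 0, 0],
              [0, 0, -1],
              [0, 1, 0]], dtype=complex)

# Standard orthonormal basis e_i (columns of I) and its image f_i = U e_i
E = np.eye(3, dtype=complex)
F = U @ E

# Theorem 3: the image is again an orthonormal basis
assert np.allclose(F.conj().T @ F, np.eye(3))

# Theorem 4: the matrix elements <e_k, U e_i> and <f_k, U f_i> agree
U_in_e = E.conj().T @ U @ E
U_in_f = F.conj().T @ U @ F
assert np.allclose(U_in_e, U_in_f)
```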

# Unitary transformations and the Heisenberg picture

References: Robert D. Klauber, Student Friendly Quantum Field Theory, (Sandtrove Press, 2013) – Chapter 2, Problems 2.11 – 2.12.

Most of the quantum mechanics that we’ve done so far has used the Schrödinger picture, in which a system is described by finding its wave function ${\Psi\left(x,t\right)}$, which depends on spatial position and time. Most operators in the Schrödinger picture are independent of time, so that the time dependence of the solution is contained within the wave function.

There is another way of looking at quantum theory called the Heisenberg picture. The states and operators in the Heisenberg picture are obtained from their counterparts in the Schrödinger picture by means of a unitary transformation using the unitary operator ${U}$.

Unitary transformations

A unitary transformation in quantum mechanics is obtained by applying an operator ${U}$ to a state in a way that leaves the norm of the state (and hence all probabilities) unchanged. In classical mechanics, an orthogonal transformation rotates a 3-d vector without changing its length; in that sense a unitary transformation in quantum mechanics is an analogue of a classical orthogonal transformation, and can be thought of as a rotation of a quantum state vector in Hilbert space.

We can write this condition as

 $\displaystyle \left\langle U\psi\left|U\psi\right.\right\rangle$ $\displaystyle =$ $\displaystyle \left\langle \psi\left|U^{\dagger}U\psi\right.\right\rangle \ \ \ \ \ (1)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle \psi\left|\psi\right.\right\rangle \ \ \ \ \ (2)$

From this we see that the hermitian conjugate must also be the inverse of ${U}$:

$\displaystyle U^{\dagger}=U^{-1} \ \ \ \ \ (3)$

Any complex exponential operator ${U=e^{iA}}$ for some other (hermitian, so that ${A^{\dagger}=A}$) operator ${A}$ qualifies as a unitary operator, since

 $\displaystyle U^{\dagger}$ $\displaystyle =$ $\displaystyle \left(e^{iA}\right)^{\dagger}=e^{-iA^{\dagger}}=e^{-iA}\ \ \ \ \ (4)$ $\displaystyle U^{\dagger}U$ $\displaystyle =$ $\displaystyle e^{-iA}e^{iA}=1 \ \ \ \ \ (5)$
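We can verify this numerically for a sample hermitian ${A}$ (my own arbitrary choice), computing ${e^{iA}}$ through the spectral decomposition ${A=V\,\mathrm{diag}\left(\lambda\right)V^{\dagger}}$:

```python
import numpy as np

# An arbitrary hermitian matrix A (A = A^dagger)
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, -1.0]])
assert np.allclose(A, A.conj().T)

# e^{iA} via the spectral decomposition: A = V diag(lam) V^dagger,
# so e^{iA} = V diag(e^{i lam}) V^dagger
lam, V = np.linalg.eigh(A)
U = V @ np.diag(np.exp(1j * lam)) @ V.conj().T

assert np.allclose(U.conj().T @ U, np.eye(2))        # U is unitary
assert np.allclose(U.conj().T, np.linalg.inv(U))     # U^dagger = U^{-1}
```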

A unitary transformation can also be thought of as applied to an operator ${Q}$ rather than the state. The requirement is that the bracket ${\left\langle \psi\left|Q\right|\psi\right\rangle }$ is unchanged by the unitary transformation. Transforming the bracket gives (where ${Q^{\prime}}$ is the transformed operator)

 $\displaystyle \left\langle U\psi\left|Q^{\prime}\right|U\psi\right\rangle$ $\displaystyle =$ $\displaystyle \left\langle \psi\left|U^{\dagger}Q^{\prime}U\right|\psi\right\rangle \ \ \ \ \ (6)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle \psi\left|Q\right|\psi\right\rangle \ \ \ \ \ (7)$

Therefore, the transformed operator can be obtained from

 $\displaystyle U^{\dagger}Q^{\prime}U$ $\displaystyle =$ $\displaystyle Q\ \ \ \ \ (8)$ $\displaystyle UU^{\dagger}Q^{\prime}UU^{\dagger}$ $\displaystyle =$ $\displaystyle UQU^{\dagger}\ \ \ \ \ (9)$ $\displaystyle Q^{\prime}$ $\displaystyle =$ $\displaystyle UQU^{\dagger}=UQU^{-1} \ \ \ \ \ (10)$
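A quick numerical illustration (the observable, unitary and state below are arbitrary choices, not from Klauber): transforming the state by ${U}$ and the operator by ${Q^{\prime}=UQU^{\dagger}}$ together leaves the bracket unchanged:

```python
import numpy as np

Q = np.array([[0, 1], [1, 0]], dtype=complex)   # observable (sigma_x)
theta = 0.7
U = np.diag(np.exp(1j * theta * np.array([1.0, -1.0])))  # a sample unitary
psi = np.array([0.6, 0.8], dtype=complex)       # normalized state

Qp = U @ Q @ U.conj().T                         # transformed operator Q'

lhs = (U @ psi).conj() @ Qp @ (U @ psi)         # <U psi|Q'|U psi>
rhs = psi.conj() @ Q @ psi                      # <psi|Q|psi>
assert np.isclose(lhs, rhs)
```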

The Heisenberg picture

To make the transition to the Heisenberg picture, we first need to recall the expression for the time derivative of the expectation value of some quantum observable ${Q}$:

$\displaystyle \frac{d}{dt}\left\langle Q\right\rangle =i\left\langle \left[H,Q\right]\right\rangle +\left\langle \frac{\partial Q}{\partial t}\right\rangle \ \ \ \ \ (11)$

where the term ${\left\langle \left[H,Q\right]\right\rangle }$ is the expectation value of the commutator of ${Q}$ with the Hamiltonian ${H}$, and we’ve used natural units with ${\hbar=1}$. This is the situation in the Schrödinger picture; to make this explicit we can include the states (where a subscript or superscript ${S}$ indicates a Schrödinger state or operator):

 $\displaystyle \frac{d}{dt}\left\langle Q\right\rangle$ $\displaystyle =$ $\displaystyle \left\langle \psi_{S}\left|i\left[H,Q^{S}\right]\right|\psi_{S}\right\rangle +\left\langle \psi_{S}\left|\frac{\partial Q^{S}}{\partial t}\right|\psi_{S}\right\rangle \ \ \ \ \ (12)$

Usually the Schrödinger operator ${Q^{S}}$ is independent of time, so that the last term is zero.

Now suppose we introduce the unitary transformation

$\displaystyle U\equiv e^{-iHt} \ \ \ \ \ (13)$

and transform states and operators according to (now the superscript or subscript ${H}$ indicates ‘Heisenberg’):

 $\displaystyle U^{\dagger}\left|\psi_{S}\right\rangle$ $\displaystyle =$ $\displaystyle \left|\psi_{H}\right\rangle \ \ \ \ \ (14)$ $\displaystyle U^{\dagger}Q^{S}U$ $\displaystyle =$ $\displaystyle Q^{H} \ \ \ \ \ (15)$

For a free particle, the wave function is

$\displaystyle \left|\psi_{S}\right\rangle =e^{-iEt+i\mathbf{p}\cdot\mathbf{x}} \ \ \ \ \ (16)$

where ${\mathbf{p}}$ is the momentum. Applying ${U^{\dagger}=e^{iHt}}$ we get, since ${\left|\psi_{S}\right\rangle }$ is an eigenstate of ${H}$ with eigenvalue ${E}$:

 $\displaystyle e^{iHt}\left|\psi_{S}\right\rangle$ $\displaystyle =$ $\displaystyle e^{iEt}e^{-iEt+i\mathbf{p}\cdot\mathbf{x}}\ \ \ \ \ (17)$ $\displaystyle$ $\displaystyle =$ $\displaystyle e^{i\mathbf{p}\cdot\mathbf{x}}\ \ \ \ \ (18)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left|\psi_{H}\right\rangle \ \ \ \ \ (19)$

In the Heisenberg picture, the time dependence has been removed from the state. So where does the time dependence show up in the Heisenberg picture? Let’s go back to 12 and convert it to the Heisenberg picture. We can insert ${UU^{\dagger}=U^{\dagger}U=1}$ anywhere since it won’t change anything, so we get

 $\displaystyle \frac{d}{dt}\left\langle Q\right\rangle$ $\displaystyle =$ $\displaystyle \left\langle \psi_{S}\left|UU^{\dagger}i\left[H,Q^{S}\right]UU^{\dagger}\right|\psi_{S}\right\rangle +\left\langle \psi_{S}\left|UU^{\dagger}\frac{\partial Q^{S}}{\partial t}UU^{\dagger}\right|\psi_{S}\right\rangle \ \ \ \ \ (20)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle U^{\dagger}\psi_{S}\left|U^{\dagger}i\left[H,Q^{S}\right]U\right|U^{\dagger}\psi_{S}\right\rangle +\left\langle U^{\dagger}\psi_{S}\left|U^{\dagger}\frac{\partial Q^{S}}{\partial t}U\right|U^{\dagger}\psi_{S}\right\rangle \ \ \ \ \ (21)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle \psi_{H}\left|U^{\dagger}i\left[H,Q^{S}\right]U\right|\psi_{H}\right\rangle +\left\langle \psi_{H}\left|U^{\dagger}\frac{\partial Q^{S}}{\partial t}U\right|\psi_{H}\right\rangle \ \ \ \ \ (22)$

From 15, we see that the operators in these two terms are the Heisenberg operators:

 $\displaystyle U^{\dagger}i\left[H,Q^{S}\right]U$ $\displaystyle =$ $\displaystyle i\left[H,Q^{H}\right]\ \ \ \ \ (23)$ $\displaystyle U^{\dagger}\frac{\partial Q^{S}}{\partial t}U$ $\displaystyle =$ $\displaystyle \frac{\partial Q^{H}}{\partial t} \ \ \ \ \ (24)$

So in the Heisenberg picture

$\displaystyle \frac{d}{dt}\left\langle Q\right\rangle =\left\langle \psi_{H}\left|i\left[H,Q^{H}\right]\right|\psi_{H}\right\rangle +\left\langle \psi_{H}\left|\frac{\partial Q^{H}}{\partial t}\right|\psi_{H}\right\rangle \ \ \ \ \ (25)$

This has the same form as in the Schrödinger picture 12. The difference is that the time dependence has been shifted from the states to the operators, since the operator ${U}$ has an explicit time dependence.
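As a concrete two-level sketch of this equivalence (the Hamiltonian, observable and state below are arbitrary choices in natural units, not from Klauber), the expectation value of ${\sigma_{x}}$ comes out the same whether we evolve the state (Schrödinger) or the operator (Heisenberg):

```python
import numpy as np

H = np.diag([1.0, -1.0])                          # H = sigma_z, hbar = 1
Q = np.array([[0, 1], [1, 0]], dtype=complex)     # Q = sigma_x
psi0 = np.array([1, 1], dtype=complex) / np.sqrt(2)

def U_of(t):
    """U = e^{-iHt}; trivial to exponentiate since H is diagonal."""
    return np.diag(np.exp(-1j * np.diag(H) * t))

for t in (0.0, 0.3, 1.7):
    U = U_of(t)
    # Schrodinger picture: the state evolves, the operator is fixed
    psi_t = U @ psi0
    exp_S = psi_t.conj() @ Q @ psi_t
    # Heisenberg picture: the state is fixed, Q^H = U^dagger Q U evolves
    QH = U.conj().T @ Q @ U
    exp_H = psi0.conj() @ QH @ psi0
    assert np.isclose(exp_S, exp_H)
    # For this system <sigma_x>(t) = cos(2t)
    assert np.isclose(exp_S.real, np.cos(2 * t))
```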

Example 1 We have a state

$\displaystyle \left|\psi\right\rangle =C_{1}\left|\psi_{E_{1}}\right\rangle +C_{2}\left|\psi_{E_{2}}\right\rangle \ \ \ \ \ (26)$

where ${\left|\psi_{E_{1}}\right\rangle }$ and ${\left|\psi_{E_{2}}\right\rangle }$ are eigenstates of the Hamiltonian ${H}$ with energies ${E_{1},E_{2}}$, and ${C_{1},C_{2}}$ are constant coefficients (with ${\left|C_{1}\right|^{2}+\left|C_{2}\right|^{2}=1}$ for a normalized state). Operating on this state with ${U}$, we have

 $\displaystyle U\left|\psi\right\rangle$ $\displaystyle =$ $\displaystyle C_{1}e^{-iE_{1}t}\left|\psi_{E_{1}}\right\rangle +C_{2}e^{-iE_{2}t}\left|\psi_{E_{2}}\right\rangle \ \ \ \ \ (27)$

Example 2 Suppose we have a free particle wave function at some fixed time ${t_{0}}$:

$\displaystyle \left|\psi_{E}\right\rangle =e^{-i\left(Et_{0}-\mathbf{p}\cdot\mathbf{x}\right)} \ \ \ \ \ (28)$

In the Schrödinger picture, the time-dependent wave function for a free particle is 16. We can get the same wave function by applying the unitary operator ${U=e^{-iH\left(t-t_{0}\right)}}$ to ${\left|\psi_{E}\right\rangle }$:

 $\displaystyle U\left|\psi_{E}\right\rangle$ $\displaystyle =$ $\displaystyle e^{-iH\left(t-t_{0}\right)}e^{-i\left(Et_{0}-\mathbf{p}\cdot\mathbf{x}\right)}\ \ \ \ \ (29)$ $\displaystyle$ $\displaystyle =$ $\displaystyle e^{-i\left(Et-\mathbf{p}\cdot\mathbf{x}\right)} \ \ \ \ \ (30)$

In this sense, ${U}$ acts as an evolution operator, in that it takes a state at a fixed point in time and turns it into a dynamic state that evolves in time, giving the usual Schrödinger picture wave function. We can turn this state back into the Heisenberg picture by operating with ${U^{\dagger}}$:

 $\displaystyle U^{\dagger}e^{-i\left(Et-\mathbf{p}\cdot\mathbf{x}\right)}$ $\displaystyle =$ $\displaystyle e^{iH\left(t-t_{0}\right)}e^{-i\left(Et-\mathbf{p}\cdot\mathbf{x}\right)}\ \ \ \ \ (31)$ $\displaystyle$ $\displaystyle =$ $\displaystyle e^{-i\left(Et_{0}-\mathbf{p}\cdot\mathbf{x}\right)} \ \ \ \ \ (32)$

which is the original time-independent wave function.