Author Archives: gwrowe

Radially symmetric potentials, angular momentum and centrifugal force

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.3.5.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

We’ve seen that the eigenfunctions of two-dimensional angular momentum have the form

\displaystyle  \psi\left(\rho,\phi\right)=R\left(\rho\right)\Phi_{m}\left(\phi\right) \ \ \ \ \ (1)

where

\displaystyle  \Phi_{m}\left(\phi\right)=\frac{1}{\sqrt{2\pi}}e^{im\phi} \ \ \ \ \ (2)

In 2 dimensions and polar coordinates, the hamiltonian can be written as

\displaystyle  H=-\frac{\hbar^{2}}{2\mu}\left(\frac{\partial^{2}}{\partial\rho^{2}}+\frac{1}{\rho}\frac{\partial}{\partial\rho}+\frac{1}{\rho^{2}}\frac{\partial^{2}}{\partial\phi^{2}}\right)+V\left(\rho,\phi\right) \ \ \ \ \ (3)

If the potential is radially symmetric, that is, it doesn’t depend on {\phi}, then

\displaystyle  H=-\frac{\hbar^{2}}{2\mu}\left(\frac{\partial^{2}}{\partial\rho^{2}}+\frac{1}{\rho}\frac{\partial}{\partial\rho}+\frac{1}{\rho^{2}}\frac{\partial^{2}}{\partial\phi^{2}}\right)+V\left(\rho\right) \ \ \ \ \ (4)

In polar coordinates, the angular momentum operator has the form

\displaystyle  L_{z}=-i\hbar\frac{\partial}{\partial\phi} \ \ \ \ \ (5)

Thus {L_{z}} commutes with every term in the hamiltonian 4, so for {V=V\left(\rho\right)}, we find

\displaystyle  \left[H,L_{z}\right]=0 \ \ \ \ \ (6)

meaning that we can find a set of functions that are simultaneously eigenfunctions of both {H} and {L_{z}}. Since we already know what the most general eigenfunctions of {L_{z}} are (eqn 1), the problem is then to find the radial function {R\left(\rho\right)} so that

\displaystyle  H\left[R\left(\rho\right)\Phi_{m}\left(\phi\right)\right]=ER\left(\rho\right)\Phi_{m}\left(\phi\right) \ \ \ \ \ (7)

If we use 4 for {H} and 2 for {\Phi} we find that we must solve the differential equation

\displaystyle  -\frac{\hbar^{2}}{2\mu}\left(\frac{d^{2}R}{d\rho^{2}}+\frac{1}{\rho}\frac{dR}{d\rho}-\frac{m^{2}}{\rho^{2}}R\right)+V\left(\rho\right)R=ER \ \ \ \ \ (8)

We’ve replaced the partial derivatives in 4 by ordinary derivatives, since we now have an ODE in one independent variable, namely {\rho}.
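As a quick numerical sanity check of the radial equation 8, the sketch below plugs in a trial case that is not part of the exercise: an oscillator potential {V\left(\rho\right)=\frac{1}{2}\rho^{2}} in units where {\hbar=\mu=1}, with the trial function {R=\rho^{\left|m\right|}e^{-\rho^{2}/2}}. The function names and the finite-difference step are illustrative assumptions. If {R} solves 8, the ratio of the left-hand side to {R} should be the same constant {E} at every {\rho}; for this trial case it should come out close to {\left|m\right|+1}.

```python
import math

# Units with hbar = mu = 1 and an assumed oscillator potential (illustrative only).
def R(rho, m):
    """Trial radial function rho^|m| * exp(-rho^2 / 2)."""
    return rho ** abs(m) * math.exp(-rho * rho / 2.0)

def V(rho):
    """Assumed potential V(rho) = rho^2 / 2; not taken from the text."""
    return 0.5 * rho * rho

def radial_lhs(rho, m, h=1e-3):
    """Left-hand side of the radial equation, derivatives by central differences."""
    d1 = (R(rho + h, m) - R(rho - h, m)) / (2.0 * h)
    d2 = (R(rho + h, m) - 2.0 * R(rho, m) + R(rho - h, m)) / (h * h)
    kinetic = -0.5 * (d2 + d1 / rho - m * m * R(rho, m) / (rho * rho))
    return kinetic + V(rho) * R(rho, m)

def local_energy(rho, m):
    """Ratio (H R) / R, constant in rho when R is a solution."""
    return radial_lhs(rho, m) / R(rho, m)

for m in (0, 1, -2):
    print(m, [round(local_energy(rho, m), 4) for rho in (0.5, 1.0, 2.0)])
```

The {-m^{2}/\rho^{2}} term is what makes the constant depend on {m}; dropping it from `radial_lhs` leaves a ratio that varies with {\rho} for {m\ne0}.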

The term arising from the {\frac{1}{\rho^{2}}\frac{\partial^{2}}{\partial\phi^{2}}} term in 4 is similar to a potential term, since it doesn’t involve any derivatives of {R}. The potential term is

\displaystyle  V_{c}=\frac{\hbar^{2}}{2\mu}\frac{m^{2}}{\rho^{2}} \ \ \ \ \ (9)

We can find the force corresponding to {V_{c}} by taking the negative of the gradient, which in this case amounts to

\displaystyle  F_{c}=-\frac{\partial V_{c}}{\partial\rho}=\frac{\hbar^{2}m^{2}}{\mu\rho^{3}} \ \ \ \ \ (10)

Since the quantum angular momentum is {\ell_{z}=m\hbar}, this can be written as

\displaystyle  F_{c}=\frac{\ell_{z}^{2}}{\mu\rho^{3}} \ \ \ \ \ (11)

If the particle is in a circular orbit, then {\ell_{z}=\rho p} where {p} is its momentum, so this becomes

\displaystyle  F_{c}=\frac{p^{2}}{\mu\rho} \ \ \ \ \ (12)

Classically, {p=\mu v} so this is equivalent to

\displaystyle  F_{c}=\frac{\mu v^{2}}{\rho} \ \ \ \ \ (13)

which has the same magnitude as the centripetal force in Newtonian physics. The positive sign indicates that {F_{c}} points away from the centre of rotation, which is why Shankar calls it the centrifugal force: the term {V_{c}} acts as a repulsive barrier in the radial equation. (As in classical mechanics, the centrifugal force is a fictitious force, appearing here only because we have reduced the problem to the radial coordinate alone.)

Angular momentum: probabilities of eigenvalues in two dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercises 12.3.3 – 12.3.4.

We’ve seen that the eigenfunctions of two-dimensional angular momentum have the form

\displaystyle  \psi\left(\rho,\phi\right)=R\left(\rho\right)e^{i\ell_{z}\phi/\hbar} \ \ \ \ \ (1)

where {\ell_{z}} (the eigenvalue) is an integral multiple of {\hbar} and {R\left(\rho\right)} is some function of the radial coordinate {\rho} which depends on the particular potential function in the hamiltonian. It’s more convenient to write the angular function as

\displaystyle  \Phi_{m}\left(\phi\right)=\frac{1}{\sqrt{2\pi}}e^{im\phi} \ \ \ \ \ (2)

This set of functions is orthonormal over the interval {\phi\in\left[0,2\pi\right]}, that is

\displaystyle  \int_{0}^{2\pi}\Phi_{m}^*\left(\phi\right)\Phi_{m^{\prime}}\left(\phi\right)d\phi=\delta_{mm^{\prime}} \ \ \ \ \ (3)

This set of functions forms the angular part of the eigenfunctions of {L_{z}}, which in some cases allows us to determine the probabilities of a system being in a particular eigenstate of {L_{z}}. Here are a couple of examples.

Example 1 A particle is described by the wave function

\displaystyle  \psi\left(\rho,\phi\right)=Ae^{-\rho^{2}/2\Delta^{2}}\cos^{2}\phi \ \ \ \ \ (4)

where {A} is a normalization constant, and {\Delta} is another constant.

We can use the trig identity

\displaystyle  \cos^{2}\phi=\frac{1}{2}\left(1+\cos2\phi\right) \ \ \ \ \ (5)

to write this wave function as

\displaystyle   \psi\left(\rho,\phi\right) \displaystyle  = \displaystyle  \frac{A}{2}e^{-\rho^{2}/2\Delta^{2}}\left[1+\cos2\phi\right]\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  \frac{A}{2}e^{-\rho^{2}/2\Delta^{2}}\left(1+\frac{e^{2i\phi}+e^{-2i\phi}}{2}\right)\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  \frac{A\sqrt{2\pi}}{2}e^{-\rho^{2}/2\Delta^{2}}\left(\Phi_{0}+\frac{1}{2}\left(\Phi_{2}+\Phi_{-2}\right)\right) \ \ \ \ \ (8)

Thus the wave function has the form

\displaystyle  \psi\left(\rho,\phi\right)=c_{0}\Phi_{0}+c_{2}\Phi_{2}+c_{-2}\Phi_{-2} \ \ \ \ \ (9)

where the coefficients {c_{m}} can be found by comparison with 8. Since the {\Phi_{m}} are orthonormal functions, the probability of finding the particle with {\ell_{z}=m\hbar} is

\displaystyle  P\left(\ell_{z}=m\hbar\right)=\frac{\left|c_{m}\right|^{2}}{\sum_{j}\left|c_{j}\right|^{2}} \ \ \ \ \ (10)

We can see from this formula that the factor of {\frac{A\sqrt{2\pi}}{2}e^{-\rho^{2}/2\Delta^{2}}} cancels out of the probability formula, so we have

\displaystyle   P\left(\ell_{z}=0\right) \displaystyle  = \displaystyle  \frac{\left|c_{0}\right|^{2}}{\sum_{j}\left|c_{j}\right|^{2}}\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{1+\frac{1}{4}+\frac{1}{4}}\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \frac{2}{3}\ \ \ \ \ (13)
\displaystyle  P\left(\ell_{z}=2\hbar\right) \displaystyle  = \displaystyle  \frac{\left|c_{2}\right|^{2}}{\sum_{j}\left|c_{j}\right|^{2}}\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  \frac{\frac{1}{4}}{1+\frac{1}{4}+\frac{1}{4}}\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{6}\ \ \ \ \ (16)
\displaystyle  P\left(\ell_{z}=-2\hbar\right) \displaystyle  = \displaystyle  \frac{\left|c_{-2}\right|^{2}}{\sum_{j}\left|c_{j}\right|^{2}}\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \frac{\frac{1}{4}}{1+\frac{1}{4}+\frac{1}{4}}\ \ \ \ \ (18)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{6} \ \ \ \ \ (19)
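The probabilities in Example 1 can be cross-checked numerically by projecting the angular factor {\cos^{2}\phi} onto each {\Phi_{m}} instead of using the trig identity. This is only a sketch: the function names and the number of sample points are my own choices, and the common radial factor is omitted since it cancels.

```python
import cmath
import math

def phi_m(m, phi):
    """Normalized angular eigenfunction exp(i m phi) / sqrt(2 pi)."""
    return cmath.exp(1j * m * phi) / math.sqrt(2.0 * math.pi)

def coefficient(angular, m, n=256):
    """Overlap of Phi_m with angular(phi) over [0, 2 pi), by a uniform sum."""
    step = 2.0 * math.pi / n
    total = sum(phi_m(m, k * step).conjugate() * angular(k * step) for k in range(n))
    return total * step

def lz_probabilities(angular, m_values):
    """Normalized |c_m|^2 for each m in m_values."""
    weights = {m: abs(coefficient(angular, m)) ** 2 for m in m_values}
    norm = sum(weights.values())
    return {m: w / norm for m, w in weights.items()}

probs = lz_probabilities(lambda phi: math.cos(phi) ** 2, range(-3, 4))
for m, p in probs.items():
    print(m, round(p, 6))
```

Only {m=0,\pm2} should carry any weight, in the ratio {\frac{2}{3}:\frac{1}{6}:\frac{1}{6}} found above; the other values of {m} are included just to confirm that their coefficients vanish.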

Example 2 Now we have the wave function

\displaystyle  \psi\left(\rho,\phi\right)=Ae^{-\rho^{2}/2\Delta^{2}}\left(\frac{\rho}{\Delta}\cos\phi+\sin\phi\right) \ \ \ \ \ (20)

Again, we write the trig functions in terms of {\Phi_{m}} to get

\displaystyle   \psi\left(\rho,\phi\right) \displaystyle  = \displaystyle  Ae^{-\rho^{2}/2\Delta^{2}}\left(\frac{\rho}{\Delta}\frac{e^{i\phi}+e^{-i\phi}}{2}+\frac{e^{i\phi}-e^{-i\phi}}{2i}\right)\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  A\sqrt{2\pi}e^{-\rho^{2}/2\Delta^{2}}\left[\left(\frac{\rho}{2\Delta}+\frac{1}{2i}\right)\Phi_{1}+\left(\frac{\rho}{2\Delta}-\frac{1}{2i}\right)\Phi_{-1}\right] \ \ \ \ \ (22)

As above, the factor of {A\sqrt{2\pi}e^{-\rho^{2}/2\Delta^{2}}} cancels out when calculating probabilities, so we have

\displaystyle   P\left(\ell_{z}=\hbar\right) \displaystyle  = \displaystyle  \frac{\left|c_{1}\right|^{2}}{\left|c_{1}\right|^{2}+\left|c_{-1}\right|^{2}}\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \frac{\left|\frac{\rho}{2\Delta}+\frac{1}{2i}\right|^{2}}{\left|\frac{\rho}{2\Delta}+\frac{1}{2i}\right|^{2}+\left|\frac{\rho}{2\Delta}-\frac{1}{2i}\right|^{2}}\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  \frac{\left(\frac{\rho}{2\Delta}\right)^{2}+\frac{1}{4}}{2\left[\left(\frac{\rho}{2\Delta}\right)^{2}+\frac{1}{4}\right]}\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\ \ \ \ \ (26)
\displaystyle  P\left(\ell_{z}=-\hbar\right) \displaystyle  = \displaystyle  \frac{\left|c_{-1}\right|^{2}}{\left|c_{1}\right|^{2}+\left|c_{-1}\right|^{2}}\ \ \ \ \ (27)
\displaystyle  \displaystyle  = \displaystyle  \frac{\left|\frac{\rho}{2\Delta}-\frac{1}{2i}\right|^{2}}{\left|\frac{\rho}{2\Delta}+\frac{1}{2i}\right|^{2}+\left|\frac{\rho}{2\Delta}-\frac{1}{2i}\right|^{2}}\ \ \ \ \ (28)
\displaystyle  \displaystyle  = \displaystyle  \frac{\left(\frac{\rho}{2\Delta}\right)^{2}+\frac{1}{4}}{2\left[\left(\frac{\rho}{2\Delta}\right)^{2}+\frac{1}{4}\right]}\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2} \ \ \ \ \ (30)

Thus in this case, the {\rho} dependence cancels out when calculating the probabilities, although we can’t expect this to be true in general.
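When the coefficients depend on {\rho}, a more careful version of the calculation integrates {\left|c_{m}\left(\rho\right)\right|^{2}} over the radial coordinate with the polar measure {\rho\,d\rho} before forming the ratio. The sketch below does this for Example 2, taking the coefficients from 22; the value of {\Delta}, the integration cutoff and the function names are arbitrary choices for illustration.

```python
import math

DELTA = 1.0  # width parameter; arbitrary illustrative value

def c_plus(rho):
    """Coefficient of Phi_1 in eqn 22, without the common prefactor."""
    return rho / (2.0 * DELTA) + 1.0 / 2j

def c_minus(rho):
    """Coefficient of Phi_{-1} in eqn 22, without the common prefactor."""
    return rho / (2.0 * DELTA) - 1.0 / 2j

def radial_weight(c, rho_max=12.0, n=20000):
    """Midpoint-rule integral of |c(rho)|^2 exp(-rho^2 / DELTA^2) rho d rho."""
    h = rho_max / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * h
        total += abs(c(rho)) ** 2 * math.exp(-(rho / DELTA) ** 2) * rho * h
    return total

w_plus = radial_weight(c_plus)
w_minus = radial_weight(c_minus)
p_plus = w_plus / (w_plus + w_minus)
p_minus = w_minus / (w_plus + w_minus)
print(p_plus, p_minus)
```

Because {\left|c_{1}\left(\rho\right)\right|=\left|c_{-1}\left(\rho\right)\right|} at every {\rho}, the two radial integrals are equal and the probabilities are again {\frac{1}{2}} each.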

Eigenvalues of angular momentum

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.3.2.

One consequence of requiring the angular momentum operator {L_{z}} to be hermitian is that the eigenvalues must be integral multiples of {\hbar}, so that {\ell_{z}=m\hbar} for {m=0,\pm1,\pm2,\ldots}. Shankar proposes another method by which we might try to obtain this restriction on {\ell_{z}}. We start with a superposition of two eigenstates of {L_{z}}, so that

\displaystyle   \psi\left(\rho,\phi\right) \displaystyle  = \displaystyle  A\left(\rho\right)e^{i\phi\ell_{z}/\hbar}+B\left(\rho\right)e^{i\phi\ell_{z}^{\prime}/\hbar}\ \ \ \ \ (1)
\displaystyle  \displaystyle  = \displaystyle  e^{i\phi\ell_{z}^{\prime}/\hbar}\left[A\left(\rho\right)e^{i\phi\left(\ell_{z}-\ell_{z}^{\prime}\right)/\hbar}+B\left(\rho\right)\right] \ \ \ \ \ (2)

where {A} and {B} are two unknown functions of the radial coordinate {\rho}, and {\ell_{z}} and {\ell_{z}^{\prime}} are two eigenvalues of {L_{z}}. If we rotate the system by a complete circle, so that {\phi\rightarrow\phi+2\pi}, the physical state should remain unchanged. This means that

\displaystyle  \left|\psi\left(\rho,\phi+2\pi\right)\right|=\left|\psi\left(\rho,\phi\right)\right| \ \ \ \ \ (3)

so that {\psi\left(\rho,\phi+2\pi\right)} may differ from {\psi\left(\rho,\phi\right)} by a phase factor. From 2

\displaystyle  \psi\left(\rho,\phi+2\pi\right)=e^{i\left(\phi+2\pi\right)\ell_{z}^{\prime}/\hbar}\left[A\left(\rho\right)e^{i\left(\phi+2\pi\right)\left(\ell_{z}-\ell_{z}^{\prime}\right)/\hbar}+B\left(\rho\right)\right] \ \ \ \ \ (4)

The phase factor of {e^{i\left(\phi+2\pi\right)\ell_{z}^{\prime}/\hbar}} on the RHS can be anything (provided the exponent is purely imaginary), but the quantity in the square brackets must be numerically the same as the corresponding quantity in 2. This means that

\displaystyle  \frac{\left(\phi+2\pi\right)\left(\ell_{z}-\ell_{z}^{\prime}\right)}{\hbar}=\frac{\phi\left(\ell_{z}-\ell_{z}^{\prime}\right)}{\hbar}+2m\pi \ \ \ \ \ (5)

where {m} is an integer. This gives the condition

\displaystyle  \ell_{z}-\ell_{z}^{\prime}=m\hbar \ \ \ \ \ (6)

To proceed further, we need to argue that {\ell_{z}} is symmetric about zero, that is, if {\ell_{z}} is an eigenvalue, then so is {-\ell_{z}}. I’m not sure if Shankar expects us to prove this rigorously, but it seems plausible, since the only difference between {+\ell_{z}} and {-\ell_{z}} is (classically, anyway) that the direction of rotation is reversed. Given this condition, {\ell_{z}} must be a multiple of {\frac{1}{2}\hbar}, since any other value doesn’t satisfy both the conditions of symmetry about zero, and 6. (For example, if we try {\ell_{z}=\frac{1}{4}\hbar}, then the symmetry requirement means we must also allow {\ell_{z}=-\frac{1}{4}\hbar}, but this violates 6.) If {\ell_{z}} is an odd multiple of {\frac{1}{2}\hbar}, then we get the sequence {\ldots,-\frac{3}{2}\hbar,-\frac{1}{2}\hbar,+\frac{1}{2}\hbar,+\frac{3}{2}\hbar,\ldots} while if {\ell_{z}} is an even multiple of {\frac{1}{2}\hbar} we get the sequence {\ldots,-2\hbar,-\hbar,0,+\hbar,+2\hbar,\ldots}. In reality, only the latter sequence is correct, but we can’t show that from this argument.
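The condition 6 can be illustrated numerically by comparing {\left|\psi\right|} at {\phi} and {\phi+2\pi} for a few pairs of eigenvalues. In this sketch {\hbar=1}, the radial functions {A} and {B} are replaced by two arbitrary constants (we are looking at a single fixed {\rho}), and the function names are hypothetical.

```python
import cmath
import math

def psi(phi, lz, lz_prime, a=1.0, b=0.7):
    """Superposition a e^{i phi lz} + b e^{i phi lz'} with hbar = 1."""
    return a * cmath.exp(1j * phi * lz) + b * cmath.exp(1j * phi * lz_prime)

def modulus_is_periodic(lz, lz_prime, samples=50, tol=1e-9):
    """True if |psi(phi + 2 pi)| equals |psi(phi)| at every sampled angle."""
    for k in range(samples):
        phi = 2.0 * math.pi * k / samples
        shifted = abs(psi(phi + 2.0 * math.pi, lz, lz_prime))
        if abs(shifted - abs(psi(phi, lz, lz_prime))) > tol:
            return False
    return True

print(modulus_is_periodic(1.5, 0.5))     # eigenvalues differ by 1
print(modulus_is_periodic(0.25, -0.25))  # eigenvalues differ by 1/2
print(modulus_is_periodic(0.5, -0.5))    # half-integer pair, difference 1
```

The half-integer pair passes the test just as the integer-spaced pair does, which is the numerical counterpart of the remark that this argument alone cannot exclude the half-integer sequence.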

Eigenvalues of two-dimensional angular momentum

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.3.1.

The angular momentum operator {L_{z}} for rotations in two dimensions has the form, in polar coordinates, of

\displaystyle L_{z}=-i\hbar\frac{\partial}{\partial\phi} \ \ \ \ \ (1)

To find the eigenvalues and eigenfunctions, we need to solve

\displaystyle L_{z}\left|\ell_{z}\right\rangle =\ell_{z}\left|\ell_{z}\right\rangle \ \ \ \ \ (2)

where {\left|\ell_{z}\right\rangle } is the eigenfunction and {\ell_{z}} is the corresponding eigenvalue. Using polar coordinates, we must solve

\displaystyle -i\hbar\frac{\partial}{\partial\phi}\psi_{\ell_{z}}\left(\rho,\phi\right)=\ell_{z}\psi_{\ell_{z}}\left(\rho,\phi\right) \ \ \ \ \ (3)

 

where {\rho} is the radial coordinate. As the only derivative here is with respect to {\phi}, we can solve this using separation of variables by proposing a solution of form

\displaystyle \psi_{\ell_{z}}\left(\rho,\phi\right)=R\left(\rho\right)\Phi\left(\phi\right) \ \ \ \ \ (4)

Substituting this and cancelling off {R\left(\rho\right)} we get

\displaystyle -i\hbar\frac{\partial}{\partial\phi}\Phi\left(\phi\right)=\ell_{z}\Phi\left(\phi\right) \ \ \ \ \ (5)

which has the solution

\displaystyle \Phi\left(\phi\right)=Ae^{i\ell_{z}\phi/\hbar} \ \ \ \ \ (6)

for some constant {A}, which we can absorb into {R\left(\rho\right)} to give the general solution

\displaystyle \psi_{\ell_{z}}\left(\rho,\phi\right)=R\left(\rho\right)e^{i\ell_{z}\phi/\hbar} \ \ \ \ \ (7)

 

[This is actually the two-dimensional version of the more general 3-d case, in which the solution involved a radial function multiplied by a spherical harmonic.]

At this stage, the eigenvalue {\ell_{z}} could be any number, real or complex, since they all satisfy 3. However, since {L_{z}} is an observable, it must be hermitian, which implies that {L_{z}^{\dagger}=L_{z}}, so that

\displaystyle \left\langle \psi_{1}\left|L_{z}\right|\psi_{2}\right\rangle =\left\langle \psi_{2}\left|L_{z}\right|\psi_{1}\right\rangle ^* \ \ \ \ \ (8)

In the coordinate basis, with the polar area element {\rho\,d\rho\,d\phi}, we have

\displaystyle \int_{0}^{\infty}\int_{0}^{2\pi}\psi_{1}^*\left(-i\hbar\frac{\partial}{\partial\phi}\right)\psi_{2}\;d\phi\;\rho\,d\rho=\left[\int_{0}^{\infty}\int_{0}^{2\pi}\psi_{2}^*\left(-i\hbar\frac{\partial}{\partial\phi}\right)\psi_{1}\;d\phi\;\rho\,d\rho\right]^* \ \ \ \ \ (9)

Integrating the LHS by parts with respect to {\phi}, we have

\displaystyle \int_{0}^{\infty}\int_{0}^{2\pi}\psi_{1}^*\left(-i\hbar\frac{\partial}{\partial\phi}\right)\psi_{2}\;d\phi\;\rho\,d\rho=-i\hbar\int_{0}^{\infty}\left.\psi_{1}^*\psi_{2}\right|_{0}^{2\pi}\rho\,d\rho+i\hbar\int_{0}^{\infty}\int_{0}^{2\pi}\frac{\partial\psi_{1}^*}{\partial\phi}\psi_{2}\;d\phi\;\rho\,d\rho \ \ \ \ \ (10)

The second term on the RHS is seen to be equal to the RHS of 9, so in order for 9 to be true, we must have

\displaystyle \int_{0}^{\infty}\left.\psi_{1}^*\psi_{2}\right|_{0}^{2\pi}\rho\,d\rho=0 \ \ \ \ \ (11)

Although two different eigenfunctions {\psi_{1}} and {\psi_{2}} are orthogonal and thus would satisfy this condition automatically, the condition must also be true when {\psi_{1}=\psi_{2}}. This gives us the condition that

\displaystyle \psi_{\ell_{z}}\left(2\pi\right)=\psi_{\ell_{z}}\left(0\right) \ \ \ \ \ (12)

That is, the eigenfunctions must be periodic with period {2\pi}. Looking back at 7, we see that this forces the eigenvalues {\ell_{z}} to be integral multiples of {\hbar}:

\displaystyle \ell_{z} \displaystyle = \displaystyle m\hbar\ \ \ \ \ (13)
\displaystyle m \displaystyle = \displaystyle 0,\pm1,\pm2,\ldots \ \ \ \ \ (14)

Here {m} is the magnetic quantum number, not the mass.
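For completeness, the step from the periodicity condition 12 to the quantization 13 can be written out explicitly using the general solution 7. This is just the standard argument, spelled out:

```latex
\psi_{\ell_z}(\rho,2\pi)=\psi_{\ell_z}(\rho,0)
\;\Longrightarrow\; R(\rho)\,e^{2\pi i\ell_z/\hbar}=R(\rho)
\;\Longrightarrow\; e^{2\pi i\ell_z/\hbar}=1
\;\Longrightarrow\; \frac{\ell_z}{\hbar}=m,\qquad m=0,\pm1,\pm2,\ldots
```

Since {e^{2\pi iz}=1} only when {z} is a real integer, this condition also rules out the complex values of {\ell_{z}} that were allowed by 3 on its own.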

Combining translations and rotations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.4.

When it comes to symmetries in quantum mechanics, we’ve looked at translations and rotations in two dimensions, and found that the generators are the momenta {P_{x}} and {P_{y}} for translations, and the angular momentum {L_{z}} for rotations.

From the fact that {L_{z}} does not commute with either momentum or position operators, you might guess that if we perform some sequence of translations and rotations on a system, the order in which these operations are done matters. In fact, you can see this by considering simple two-dimensional geometry, without reference to quantum mechanics. Consider the {x} and {y} axes on a sheet of graph paper. First, translate these axes by adding the vector {\mathbf{r}} to all points, so that the new origin of coordinates lies at position {\mathbf{r}} as referenced in the original coordinates. Next, do a rotation about the original origin by some angle {\phi}. This will move the new origin around the original {z} axis. Now, do the inverse of the original translation by adding {-\mathbf{r}} to all points. Finally, do the inverse of the rotation by rotating the system by {-\phi} around the original {z} axis. You’ll find that the {xy} axes that have undergone this sequence of transformations do not coincide with the original {xy} axes. However, if you did the same set of four transformations in the order: translate by {\mathbf{r}}, translate by {-\mathbf{r}}, rotate by {\phi}, rotate by {-\phi}, the transformed axes would coincide with the original axes.

To see how this works in quantum mechanics, we can again consider infinitesimal translations and rotations. If we start with a point at location {\left[x,y\right]} and apply the four transformations described above, but now for an infinitesimal translation {\boldsymbol{\varepsilon}=\varepsilon_{x}\hat{\mathbf{x}}+\varepsilon_{y}\hat{\mathbf{y}}} and rotation {\varepsilon_{z}\hat{\mathbf{z}}}, then the successive transformations work as follows. In each case, we’ll retain terms up to order {\varepsilon_{x}\varepsilon_{z}} and {\varepsilon_{y}\varepsilon_{z}} but discard terms of order {\varepsilon_{x}^{2}}, {\varepsilon_{y}^{2}}, {\varepsilon_{z}^{2}} and higher. [I’m not quite sure of the rationale that allows us to do this, apart from the fact that it gives the right answer.]

\displaystyle   \left[\begin{array}{c} x\\ y \end{array}\right] \displaystyle  {\longrightarrow\atop T\left(\boldsymbol{\varepsilon}\right)} \displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}\\ y+\varepsilon_{y} \end{array}\right]\ \ \ \ \ (1)
\displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}\\ y+\varepsilon_{y} \end{array}\right] \displaystyle  {\longrightarrow\atop R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)} \displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]\ \ \ \ \ (2)
\displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \displaystyle  {\longrightarrow\atop T\left(-\boldsymbol{\varepsilon}\right)} \displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}-\varepsilon_{x}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z}-\varepsilon_{y} \end{array}\right]\ \ \ \ \ (3)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]\ \ \ \ \ (4)
\displaystyle  \left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \displaystyle  {\longrightarrow\atop R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)} \displaystyle  \left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}+\left[y+\left(x+\varepsilon_{x}\right)\varepsilon_{z}\right]\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z}-\left[x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\right]\varepsilon_{z} \end{array}\right]\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} x-\varepsilon_{y}\varepsilon_{z}\\ y+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (6)

Thus, to this order in the infinitesimals, the combination of translation-rotation-translation-rotation is equivalent to a single translation by the displacement {\left[-\varepsilon_{y}\varepsilon_{z},\varepsilon_{x}\varepsilon_{z}\right]}. We can write this in terms of the unitary quantum operators for translations and rotations as
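The geometric result 6 is easy to check numerically by applying the four transformations with exact rotation matrices and small but finite parameters. The sketch below is illustrative only; the starting point and the sizes of {\varepsilon_{x}}, {\varepsilon_{y}} and {\varepsilon_{z}} are arbitrary choices.

```python
import math

def translate(p, d):
    """Add the displacement d to the point p."""
    return (p[0] + d[0], p[1] + d[1])

def rotate(p, angle):
    """Rotate the point p about the origin by angle (counterclockwise)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def net_displacement(p, eps, eps_z):
    """Apply T(eps), R(eps_z), T(-eps), R(-eps_z) in that order; return q - p."""
    q = translate(p, eps)
    q = rotate(q, eps_z)
    q = translate(q, (-eps[0], -eps[1]))
    q = rotate(q, -eps_z)
    return (q[0] - p[0], q[1] - p[1])

eps = (2e-3, -1e-3)
eps_z = 3e-3
dx, dy = net_displacement((0.4, -1.3), eps, eps_z)
print(dx, -eps[1] * eps_z)  # compare with -eps_y * eps_z
print(dy, eps[0] * eps_z)   # compare with  eps_x * eps_z
```

The two columns should agree up to corrections that are cubic in the small parameters, and the result should not depend on the starting point.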

\displaystyle  U\left[R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(-\boldsymbol{\varepsilon}\right)U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(\boldsymbol{\varepsilon}\right)=T\left(-\varepsilon_{y}\varepsilon_{z}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}\right) \ \ \ \ \ (7)

Using the forms of these operators for infinitesimal transformations, we can expand both sides to give

\displaystyle   \left(I+\frac{i\varepsilon_{z}}{\hbar}L_{z}\right)\left[I+\frac{i}{\hbar}\left(\varepsilon_{x}P_{x}+\varepsilon_{y}P_{y}\right)\right]\times
\displaystyle  \left(I-\frac{i\varepsilon_{z}}{\hbar}L_{z}\right)\left[I-\frac{i}{\hbar}\left(\varepsilon_{x}P_{x}+\varepsilon_{y}P_{y}\right)\right] \displaystyle  = \displaystyle  I-\frac{i}{\hbar}\left(-\varepsilon_{y}\varepsilon_{z}P_{x}+\varepsilon_{x}\varepsilon_{z}P_{y}\right) \ \ \ \ \ (8)

Since the infinitesimal displacements are arbitrary, this equation can be valid only if the coefficients of each combination of {\varepsilon_{x},\varepsilon_{y}} and {\varepsilon_{z}} are equal on both sides. As above, we’ll discard any terms of order {\varepsilon_{x}^{2}}, {\varepsilon_{y}^{2}}, {\varepsilon_{z}^{2}} and higher. The algebra is straightforward although a bit tedious, so I’ll just give a couple of examples here.

The coefficient of {\varepsilon_{z}} on its own is, on the LHS

\displaystyle  \frac{i\varepsilon_{z}}{\hbar}L_{z}-\frac{i\varepsilon_{z}}{\hbar}L_{z}=0 \ \ \ \ \ (9)

On the RHS, there is no term in {\varepsilon_{z}}, so we get 0 on the RHS. In this case, we see the equation is consistent.

For the {\varepsilon_{x}\varepsilon_{z}} term, we get on the LHS:

\displaystyle  \varepsilon_{x}\varepsilon_{z}\frac{i^{2}}{\hbar^{2}}\left(L_{z}P_{x}-L_{z}P_{x}-P_{x}L_{z}+L_{z}P_{x}\right)=-\varepsilon_{x}\varepsilon_{z}\frac{i^{2}}{\hbar^{2}}\left[P_{x},L_{z}\right] \ \ \ \ \ (10)

On the RHS, the term is

\displaystyle  -\frac{i}{\hbar}\varepsilon_{x}\varepsilon_{z}P_{y} \ \ \ \ \ (11)

Thus the condition here becomes

\displaystyle  \left[P_{x},L_{z}\right]=-i\hbar P_{y} \ \ \ \ \ (12)

which agrees with the commutation relation we found earlier. By considering the coefficient of {\varepsilon_{y}\varepsilon_{z}}, we arrive at the other condition, which is

\displaystyle  \left[P_{y},L_{z}\right]=i\hbar P_{x} \ \ \ \ \ (13)

The result of this calculation doesn’t tell us anything new about the translation or rotation operators, but it does show that the condition 7 is consistent with what we already know about the commutators of position, momentum and angular momentum.

As Shankar points out, we might think that we need to verify the conditions for an infinite number of combinations of rotations and translations, since each such combination gives rise to a different overall transformation. He says that it has actually been shown that the example above is sufficient to guarantee that all such combinations do in fact give valid results, although he doesn’t give the details. We are, however, given the exercise of verifying this claim for one special case, which we’ll consider now.

In this example, we’ll consider the same four transformations, in the same order, as above except that we’ll take the translation to be entirely in the {x} direction so that {\varepsilon_{y}=0}. This time, we’ll retain terms up to {\varepsilon_{x}\varepsilon_{z}^{2}} and see what we get. We start by repeating the calculations in 1 through 6. However, because we’re saving higher order terms, we need to represent the infinitesimal rotations by

\displaystyle  R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)=\left[\begin{array}{cc} 1-\frac{\varepsilon_{z}^{2}}{2} & -\varepsilon_{z}\\ \varepsilon_{z} & 1-\frac{\varepsilon_{z}^{2}}{2} \end{array}\right] \ \ \ \ \ (14)

That is, we’re approximating {\cos\varepsilon_{z}} by the first two terms in its expansion. Using this, we have

\displaystyle   \left[\begin{array}{c} x\\ y \end{array}\right] \displaystyle  {\longrightarrow\atop T\left(\boldsymbol{\varepsilon}\right)} \displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}\\ y \end{array}\right]\ \ \ \ \ (15)
\displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}\\ y \end{array}\right] \displaystyle  {\longrightarrow\atop R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)} \displaystyle  \left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]\ \ \ \ \ (16)
\displaystyle  \left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \displaystyle  {\longrightarrow\atop T\left(-\boldsymbol{\varepsilon}\right)} \displaystyle  \left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\varepsilon_{x}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z} \end{array}\right]\ \ \ \ \ (18)
\displaystyle  \left[\begin{array}{c} x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \displaystyle  {\longrightarrow\atop R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)} \displaystyle  \left[\begin{array}{c} \left[x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\right]\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left[y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z}\right]\varepsilon_{z}\\ \left[y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z}\right]\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-\left[x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\right]\varepsilon_{z} \end{array}\right]\ \ \ \ \ (19)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} x\left(1+\frac{\varepsilon_{z}^{4}}{4}\right)+\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}+\frac{1}{4}\varepsilon_{x}\varepsilon_{z}^{4}\\ y\left(1+\frac{\varepsilon_{z}^{4}}{4}\right)+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (20)

To get the last line, I used Maple to do the algebra in multiplying out the terms. At this point, we can neglect the terms in {\varepsilon_{z}^{4}}, leaving us with the overall transformation:

\displaystyle  \left[\begin{array}{c} x\\ y \end{array}\right]\longrightarrow\left[\begin{array}{c} x+\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (21)
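As a cross-check that does not rely on truncating the rotation matrix, the same four steps can be followed with the exact rotation {R=R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)} acting on the position vector {\mathbf{r}=\left[x,y\right]}. This is my own restatement rather than part of Shankar's exercise:

```latex
\begin{aligned}
\mathbf{r}
 &\;\xrightarrow{T(\boldsymbol{\varepsilon})}\; \mathbf{r}+\boldsymbol{\varepsilon}
 \;\xrightarrow{R(\varepsilon_z\hat{\mathbf{z}})}\; R\mathbf{r}+R\boldsymbol{\varepsilon}
 \;\xrightarrow{T(-\boldsymbol{\varepsilon})}\; R\mathbf{r}+R\boldsymbol{\varepsilon}-\boldsymbol{\varepsilon}
 \;\xrightarrow{R(-\varepsilon_z\hat{\mathbf{z}})}\; \mathbf{r}+\boldsymbol{\varepsilon}-R^{-1}\boldsymbol{\varepsilon} \\
\boldsymbol{\varepsilon}-R^{-1}\boldsymbol{\varepsilon}
 &= \begin{bmatrix}\varepsilon_x\left(1-\cos\varepsilon_z\right)\\ \varepsilon_x\sin\varepsilon_z\end{bmatrix}
 = \begin{bmatrix}\tfrac{1}{2}\varepsilon_x\varepsilon_z^{2}+O\!\left(\varepsilon_x\varepsilon_z^{4}\right)\\ \varepsilon_x\varepsilon_z+O\!\left(\varepsilon_x\varepsilon_z^{3}\right)\end{bmatrix}
 \qquad\text{for } \boldsymbol{\varepsilon}=\varepsilon_x\hat{\mathbf{x}}
\end{aligned}
```

The net effect is a pure translation that agrees with 21 to the order retained. With exact rotations the dependence on {x} and {y} cancels identically, so the {\varepsilon_{z}^{4}} terms multiplying {x} and {y} in 20 come from truncating the cosine in 14.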

This is equivalent to a translation by {\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}}, so by analogy with 7, and with {\boldsymbol{\varepsilon}=\varepsilon_{x}\hat{\mathbf{x}}} on the LHS, we have the condition

\displaystyle  U\left[R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(-\boldsymbol{\varepsilon}\right)U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(\boldsymbol{\varepsilon}\right)=T\left(\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}\right) \ \ \ \ \ (22)

To expand the operators on the LHS and retain terms up to {\varepsilon_{x}\varepsilon_{z}^{2}}, we need to expand the rotation operators up to order {\varepsilon_{z}^{2}}. Treating the rotation operator as an exponential, this expansion is

\displaystyle  U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}+\ldots \ \ \ \ \ (23)

Using this approximation gives us

\displaystyle   \left(I+\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}\right)\left[I+\frac{i}{\hbar}\varepsilon_{x}P_{x}\right]\left(I-\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}\right)\left[I-\frac{i}{\hbar}\varepsilon_{x}P_{x}\right] \displaystyle  = \displaystyle  I-\frac{i}{\hbar}\left(\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}P_{x}+\varepsilon_{x}\varepsilon_{z}P_{y}\right) \ \ \ \ \ (24)

By equating the coefficients of {\varepsilon_{x}\varepsilon_{z}} we regain 12, so that condition checks out.

Extracting the coefficient of {\varepsilon_{x}\varepsilon_{z}^{2}} on the LHS gives

\displaystyle   \frac{i^{3}}{\hbar^{3}}\varepsilon_{x}\varepsilon_{z}^{2}\left(-L_{z}P_{x}L_{z}+\frac{L_{z}^{2}P_{x}}{2}-\frac{L_{z}^{2}P_{x}}{2}+\frac{P_{x}L_{z}^{2}}{2}-\frac{L_{z}^{2}P_{x}}{2}+L_{z}^{2}P_{x}\right) \displaystyle  = \displaystyle  \frac{i^{3}}{\hbar^{3}}\varepsilon_{x}\varepsilon_{z}^{2}\left(-L_{z}P_{x}L_{z}+\frac{L_{z}^{2}P_{x}}{2}+\frac{P_{x}L_{z}^{2}}{2}\right) \ \ \ \ \ (25)

Matching this to the {\varepsilon_{x}\varepsilon_{z}^{2}} term on the RHS of 24, we get the condition specified in Shankar’s problem:

\displaystyle  -2L_{z}P_{x}L_{z}+L_{z}^{2}P_{x}+P_{x}L_{z}^{2}=\hbar^{2}P_{x} \ \ \ \ \ (26)

We can show that this condition reduces to the already-known commutators by using the identity

\displaystyle   \left[\Lambda,\left[\Lambda,\Omega\right]\right] \displaystyle  = \displaystyle  \Lambda\left(\Lambda\Omega-\Omega\Lambda\right)-\left(\Lambda\Omega-\Omega\Lambda\right)\Lambda\ \ \ \ \ (27)
\displaystyle  \displaystyle  = \displaystyle  -2\Lambda\Omega\Lambda+\Lambda^{2}\Omega+\Omega\Lambda^{2} \ \ \ \ \ (28)

Applying this to 26 we have

\displaystyle   -2L_{z}P_{x}L_{z}+L_{z}^{2}P_{x}+P_{x}L_{z}^{2} \displaystyle  = \displaystyle  \left[L_{z},\left[L_{z},P_{x}\right]\right]\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  i\hbar\left[L_{z},P_{y}\right]\ \ \ \ \ (30)
\displaystyle  \displaystyle  = \displaystyle  i\hbar\left(-i\hbar P_{x}\right)\ \ \ \ \ (31)
\displaystyle  \displaystyle  = \displaystyle  \hbar^{2}P_{x} \ \ \ \ \ (32)

Thus the more complicated condition 26 actually reduces to existing commutators.
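The operator identity in 27 and 28 holds for any two operators, so it can be spot-checked with ordinary matrices standing in for {\Lambda} and {\Omega}. The sketch below uses two arbitrary integer matrices (my own choice) so that the comparison is exact.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, scale=1):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    return add(matmul(a, b), matmul(b, a), scale=-1)

# Arbitrary stand-ins for Lambda and Omega.
lam = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
omega = [[0, 1, 1], [2, 0, -1], [1, 3, 0]]

nested = commutator(lam, commutator(lam, omega))
lam2 = matmul(lam, lam)
expanded = add(add(matmul(lam2, omega), matmul(omega, lam2)),
               matmul(matmul(lam, omega), lam), scale=-2)
print(nested == expanded)
```

Here `expanded` is {\Lambda^{2}\Omega+\Omega\Lambda^{2}-2\Lambda\Omega\Lambda}, which should match the nested commutator entry by entry.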

Rotations through a finite angle; use of polar coordinates

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.3.

The angular momentum operator {L_{z}} is the generator of rotations in the {xy} plane. We did the derivation for infinitesimal rotations, but we can generalize this to finite rotations in a similar manner to that used for translations. The unitary transformation for an infinitesimal rotation is

\displaystyle  U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (1)

For rotation through a finite angle {\phi_{0}}, we divide up the angle into {N} small angles, so {\varepsilon_{z}=\phi_{0}/N}. Rotation through the full angle {\phi_{0}} is then given by

\displaystyle  U\left[R\left(\phi_{0}\hat{\mathbf{z}}\right)\right]=\lim_{N\rightarrow\infty}\left(I-\frac{i\phi_{0}L_{z}}{N\hbar}\right)^{N}=e^{-i\phi_{0}L_{z}/\hbar} \ \ \ \ \ (2)

The limit follows because the only non-trivial operator involved is {L_{z}}, so no commutation problems arise.
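
To see the limit in 2 at work, we can apply it to a state with a definite value of {L_{z}}, where the operator becomes an ordinary number. The sketch below assumes an eigenvalue {m\hbar} (with {m} and {\phi_{0}} chosen arbitrarily) and compares the {N}-fold product with the exponential; the function names are made up for this illustration.

```python
import cmath

def rotation_phase_product(m, phi0, n_steps):
    # Value of (I - i*phi0*L_z/(N*hbar))^N on a state with L_z = m*hbar.
    return (1 - 1j * phi0 * m / n_steps) ** n_steps

def rotation_phase_exact(m, phi0):
    # Value of exp(-i*phi0*L_z/hbar) on the same state.
    return cmath.exp(-1j * m * phi0)

m, phi0 = 2, 0.75  # arbitrary sample values (assumptions)
errors = [abs(rotation_phase_product(m, phi0, n) - rotation_phase_exact(m, phi0))
          for n in (10, 100, 1000, 10000)]
for n, err in zip((10, 100, 1000, 10000), errors):
    print(n, err)
```

The difference shrinks roughly like {1/N}, as expected for this kind of limit.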

In rectangular coordinates, {L_{z}} has the relatively non-obvious form

\displaystyle   L_{z} \displaystyle  = \displaystyle  XP_{y}-YP_{x}\ \ \ \ \ (3)
\displaystyle  \displaystyle  = \displaystyle  -i\hbar\left(x\frac{\partial}{\partial y}-y\frac{\partial}{\partial x}\right) \ \ \ \ \ (4)

so it’s not immediately clear that 2 does in fact lead to the desired rotation. Trying to calculate the exponential with {L_{z}} expressed this way is not easy, given that the two terms {x\frac{\partial}{\partial y}} and {y\frac{\partial}{\partial x}} don’t commute.

It turns out that {L_{z}} has a much simpler form in polar coordinates, and there are two ways of converting it to polar form. First, we recall the transformation equations.

\displaystyle   x \displaystyle  = \displaystyle  \rho\cos\phi\ \ \ \ \ (5)
\displaystyle  y \displaystyle  = \displaystyle  \rho\sin\phi\ \ \ \ \ (6)
\displaystyle  \rho \displaystyle  = \displaystyle  \sqrt{x^{2}+y^{2}}\ \ \ \ \ (7)
\displaystyle  \phi \displaystyle  = \displaystyle  \tan^{-1}\frac{y}{x} \ \ \ \ \ (8)

From the chain rule, we can convert the derivatives:

\displaystyle   \frac{\partial}{\partial x} \displaystyle  = \displaystyle  \frac{\partial\rho}{\partial x}\frac{\partial}{\partial\rho}+\frac{\partial\cos\phi}{\partial x}\frac{\partial}{\partial\left(\cos\phi\right)}\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \frac{\partial\rho}{\partial x}\frac{\partial}{\partial\rho}-\sin\phi\frac{\partial\phi}{\partial x}\frac{\partial}{\left(-\sin\phi\right)\partial\phi}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \frac{x}{\rho}\frac{\partial}{\partial\rho}-\sin\phi\frac{-y/x^{2}}{1+y^{2}/x^{2}}\left(\frac{-1}{\sin\phi}\right)\frac{\partial}{\partial\phi}\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \frac{x}{\rho}\frac{\partial}{\partial\rho}-\frac{y}{\rho^{2}}\frac{\partial}{\partial\phi} \ \ \ \ \ (12)

Using similar methods, we get for the other derivative

\displaystyle   \frac{\partial}{\partial y} \displaystyle  = \displaystyle  \frac{\partial\rho}{\partial y}\frac{\partial}{\partial\rho}+\frac{\partial\sin\phi}{\partial y}\frac{\partial}{\partial\left(\sin\phi\right)}\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  \frac{y}{\rho}\frac{\partial}{\partial\rho}+\frac{x}{\rho^{2}}\frac{\partial}{\partial\phi} \ \ \ \ \ (14)

Plugging these into 4 we have

\displaystyle   L_{z} \displaystyle  = \displaystyle  -i\hbar\left[x\left(\frac{y}{\rho}\frac{\partial}{\partial\rho}+\frac{x}{\rho^{2}}\frac{\partial}{\partial\phi}\right)-y\left(\frac{x}{\rho}\frac{\partial}{\partial\rho}-\frac{y}{\rho^{2}}\frac{\partial}{\partial\phi}\right)\right]\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  -i\hbar\frac{x^{2}+y^{2}}{\rho^{2}}\frac{\partial}{\partial\phi}\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  -i\hbar\frac{\partial}{\partial\phi} \ \ \ \ \ (17)
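
The equality of the rectangular form 4 and the polar form 17 can also be checked numerically. The following sketch uses central differences on an arbitrary smooth test function (the function {f} and the step size are assumptions for illustration) and drops the common factor {-i\hbar}.

```python
import math

def f(x, y):
    # Arbitrary smooth test function (an assumption for illustration).
    return x**3 * y - 2.0 * x * y**2 + math.exp(-0.5 * (x * x + y * y))

def cartesian_form(x, y, h=1e-5):
    # (x d/dy - y d/dx) f, using central differences.
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return x * df_dy - y * df_dx

def polar_form(x, y, h=1e-5):
    # d f / d phi at fixed rho, using a central difference in the angle.
    rho, phi = math.hypot(x, y), math.atan2(y, x)

    def g(p):
        return f(rho * math.cos(p), rho * math.sin(p))

    return (g(phi + h) - g(phi - h)) / (2 * h)

print(cartesian_form(0.8, -1.3), polar_form(0.8, -1.3))
```

The two numbers should agree to within the finite-difference error.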

Another method of converting {L_{z}} to polar coordinates is to consider the effect of {U\left[R\right]} for an infinitesimal rotation {\varepsilon_{z}} on a state vector expressed in polar coordinates {\psi\left(\rho,\phi\right)}. Shankar states that

\displaystyle  \left\langle \rho,\phi\left|U\left[R\right]\right|\psi\left(\rho,\phi\right)\right\rangle =\psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (18)

If you don’t believe this, it can be shown using a method similar to that for the one-dimensional translation. In this case, we’re dealing with position eigenkets in polar coordinates, so we have

\displaystyle  U\left[R\right]\left|\rho,\phi\right\rangle =\left|\rho,\phi+\varepsilon_{z}\right\rangle \ \ \ \ \ (19)

Applying this, we get

\displaystyle   \left|\psi_{\varepsilon_{z}}\right\rangle \displaystyle  = \displaystyle  U\left[R\right]\left|\psi\right\rangle \ \ \ \ \ (20)
\displaystyle  \displaystyle  = \displaystyle  U\left[R\right]\int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho,\phi\right\rangle \left\langle \rho,\phi\left|\psi\right.\right\rangle \rho d\rho\;d\phi\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  \int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho,\phi+\varepsilon_{z}\right\rangle \left\langle \rho,\phi\left|\psi\right.\right\rangle \rho d\rho\;d\phi\ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  \int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho^{\prime},\phi^{\prime}\right\rangle \left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime} \ \ \ \ \ (23)

where in the last line, we used the substitution {\phi^{\prime}=\phi+\varepsilon_{z}}. (The substitution {\rho^{\prime}=\rho} is used just to give the radial variable a different name in the integrand.) We can use the same limits of integration for {\phi} and {\phi^{\prime}}, since we just need to ensure that the integral covers the total range of angles. It then follows that

\displaystyle   \left\langle \rho,\phi\left|\psi_{\varepsilon_{z}}\right.\right\rangle \displaystyle  = \displaystyle  \int_{0}^{2\pi}\int_{0}^{\infty}\left\langle \rho,\phi\left|\rho^{\prime},\phi^{\prime}\right.\right\rangle \left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime}\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  \int_{0}^{2\pi}\int_{0}^{\infty}\frac{1}{\rho^{\prime}}\delta\left(\rho-\rho^{\prime}\right)\delta\left(\phi-\phi^{\prime}\right)\left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime}\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  \psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (26)

Combining this with 1 we have

\displaystyle  \left\langle \rho,\phi\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle =\psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (27)

Expanding the RHS to order {\varepsilon_{z}} we have

\displaystyle  \left\langle \rho,\phi\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle =\psi\left(\rho,\phi\right)-\varepsilon_{z}\frac{\partial\psi}{\partial\phi} \ \ \ \ \ (28)

from which 17 follows again.

Once we have {L_{z}} in this form, the exponential form of a finite rotation is easier to interpret, for we have, from 2

\displaystyle   e^{-i\phi_{0}L_{z}/\hbar} \displaystyle  = \displaystyle  \exp\left[-\phi_{0}\frac{\partial}{\partial\phi}\right]\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  1-\phi_{0}\frac{\partial}{\partial\phi}+\frac{\phi_{0}^{2}}{2!}\frac{\partial^{2}}{\partial\phi^{2}}+\ldots \ \ \ \ \ (30)

Applying this to a state function {\psi\left(\rho,\phi\right)}, we see that we get the Taylor series for {\psi\left(\rho,\phi-\phi_{0}\right)}, so the exponential does indeed represent a rotation through a finite angle.
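
As an illustration of the series 30, the sketch below sums it term by term for the sample angular function {\sin\phi}, whose {n}th derivative is {\sin\left(\phi+n\pi/2\right)}. The choice of test function, the number of terms and the function name are assumptions made for this example.

```python
import math

def shifted_by_series(phi, phi0, n_terms=40):
    # Sum over n of (-phi0)^n / n! times the nth derivative of sin(phi),
    # using d^n/dphi^n sin(phi) = sin(phi + n*pi/2).
    total = 0.0
    for n in range(n_terms):
        total += (-phi0) ** n / math.factorial(n) * math.sin(phi + n * math.pi / 2)
    return total

phi, phi0 = 0.9, 1.2  # arbitrary sample angles (assumptions)
print(shifted_by_series(phi, phi0), math.sin(phi - phi0))
```

The partial sum reproduces the rotated function {\sin\left(\phi-\phi_{0}\right)} to machine precision once enough terms are included.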

Rotational transformations using passive transformations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.2.

We can also derive the generator of rotations {L_{z}} by considering passive transformations of the position and momentum operators, in a way similar to that used for deriving the generator of translations. In a passive transformation, the operators are modified while the state vectors remain the same. For an infinitesimal rotation {\varepsilon_{z}\hat{\mathbf{z}}} about the {z} axis in two dimensions, the unitary operator has the form

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (1)

 

For a finite rotation by {\phi_{0}\hat{\mathbf{z}}} the transformations are given by

\displaystyle \left\langle X\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \cos\phi_{0}-\left\langle Y\right\rangle \sin\phi_{0}\ \ \ \ \ (2)
\displaystyle \left\langle Y\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \sin\phi_{0}+\left\langle Y\right\rangle \cos\phi_{0}\ \ \ \ \ (3)
\displaystyle \left\langle P_{x}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \cos\phi_{0}-\left\langle P_{y}\right\rangle \sin\phi_{0}\ \ \ \ \ (4)
\displaystyle \left\langle P_{y}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \sin\phi_{0}+\left\langle P_{y}\right\rangle \cos\phi_{0} \ \ \ \ \ (5)

For the infinitesimal transformation, {\phi_{0}=\varepsilon_{z}} and these equations reduce to

\displaystyle \left\langle X\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle -\left\langle Y\right\rangle \varepsilon_{z}\ \ \ \ \ (6)
\displaystyle \left\langle Y\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \varepsilon_{z}+\left\langle Y\right\rangle \ \ \ \ \ (7)
\displaystyle \left\langle P_{x}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle -\left\langle P_{y}\right\rangle \varepsilon_{z}\ \ \ \ \ (8)
\displaystyle \left\langle P_{y}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \varepsilon_{z}+\left\langle P_{y}\right\rangle \ \ \ \ \ (9)
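
A small numerical sketch shows how good the first-order forms 6 and 7 are. It rotates an arbitrary point exactly and to first order, and reports the largest discrepancy; the sample point and function names are assumptions for illustration.

```python
import math

def rotate_exact(x, y, angle):
    # Finite rotation, as in equations 2 and 3.
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))

def rotate_first_order(x, y, eps):
    # Infinitesimal rotation, as in equations 6 and 7.
    return (x - y * eps, x * eps + y)

def max_error(x, y, eps):
    exact = rotate_exact(x, y, eps)
    approx = rotate_first_order(x, y, eps)
    return max(abs(exact[0] - approx[0]), abs(exact[1] - approx[1]))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, max_error(1.0, 2.0, eps))
```

Reducing {\varepsilon_{z}} by a factor of 10 reduces the error by about a factor of 100, consistent with the neglected terms being second order.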

In the passive transformation scheme, we move the transformation to the operators to get

\displaystyle U^{\dagger}\left[R\right]XU\left[R\right] \displaystyle = \displaystyle X-Y\varepsilon_{z}\ \ \ \ \ (10)
\displaystyle U^{\dagger}\left[R\right]YU\left[R\right] \displaystyle = \displaystyle X\varepsilon_{z}+Y\ \ \ \ \ (11)
\displaystyle U^{\dagger}\left[R\right]P_{x}U\left[R\right] \displaystyle = \displaystyle P_{x}-P_{y}\varepsilon_{z}\ \ \ \ \ (12)
\displaystyle U^{\dagger}\left[R\right]P_{y}U\left[R\right] \displaystyle = \displaystyle P_{x}\varepsilon_{z}+P_{y} \ \ \ \ \ (13)

Substituting 1 into these equations gives us the commutation relations satisfied by {L_{z}}. For example, in the first equation we have

\displaystyle U^{\dagger}\left[R\right]XU\left[R\right] \displaystyle = \displaystyle \left(I+\frac{i\varepsilon_{z}L_{z}}{\hbar}\right)X\left(I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right)\ \ \ \ \ (14)
\displaystyle \displaystyle = \displaystyle X+\frac{i\varepsilon_{z}}{\hbar}\left(L_{z}X-XL_{z}\right)\ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle X-Y\varepsilon_{z} \ \ \ \ \ (16)

Equating the last two lines, we get

\displaystyle \left[X,L_{z}\right]=-i\hbar Y \ \ \ \ \ (17)

 

Similarly, for the other three equations we get

\displaystyle \left[Y,L_{z}\right] \displaystyle = \displaystyle i\hbar X\ \ \ \ \ (18)
\displaystyle \left[P_{x},L_{z}\right] \displaystyle = \displaystyle -i\hbar P_{y}\ \ \ \ \ (19)
\displaystyle \left[P_{y},L_{z}\right] \displaystyle = \displaystyle i\hbar P_{x} \ \ \ \ \ (20)

We can use these commutation relations to derive the form of {L_{z}} by using the commutation relations for coordinates and momenta:

\displaystyle \left[X,P_{x}\right]=\left[Y,P_{y}\right]=i\hbar \ \ \ \ \ (21)

with all other commutators involving {X,Y,P_{x}} and {P_{y}} being zero. Starting with 17, we see that

\displaystyle \left[X,L_{z}\right]=-\left[X,P_{x}\right]Y \ \ \ \ \ (22)

We can therefore deduce that

\displaystyle L_{z}=-P_{x}Y+f\left(X,Y,P_{y}\right) \ \ \ \ \ (23)

 

where {f} is some unknown function. We must include {f} since the commutators of {X} with {X,Y} and {P_{y}} are all zero, so adding on {f} still satisfies 17. (You can think of it as similar to adding on the constant in an indefinite integral.)

Now from 18, we have

\displaystyle \left[Y,L_{z}\right]=\left[Y,P_{y}\right]X \ \ \ \ \ (24)

so combining this with 23 we have

\displaystyle L_{z}=-P_{x}Y+P_{y}X+g\left(X,Y\right) \ \ \ \ \ (25)

 

The undetermined function is now a function only of {X} and {Y}, since the dependence of {L_{z}} on {P_{x}} and {P_{y}} has been determined uniquely by the commutators 17 and 18.

From 19 we have

\displaystyle \left[P_{x},L_{z}\right]=\left[P_{x},X\right]P_{y} \ \ \ \ \ (26)

We can see that this is satisfied already by 25, except that we now know that the function {g} cannot depend on {X}, since then {\left[P_{x},g\right]\ne0}. Thus we have narrowed down {L_{z}} to

\displaystyle L_{z}=-P_{x}Y+P_{y}X+h\left(Y\right) \ \ \ \ \ (27)

 

Finally, from 20 we have

\displaystyle \left[P_{y},L_{z}\right]=-\left[P_{y},Y\right]P_{x} \ \ \ \ \ (28)

This is satisfied by 27 if we take {h=0} (well, technically, we could take {h} to be some constant, but we might as well take the constant to be zero), giving us the final form for {L_{z}}:

\displaystyle L_{z}=-P_{x}Y+P_{y}X \ \ \ \ \ (29)
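
We can check that 29 does satisfy the four commutators 17 to 20 by letting the operators act on a test polynomial in position space. The sketch below is one way to do this under stated assumptions: units with {\hbar=1}, wave functions stored as dictionaries mapping exponents {(i,j)} of {x^{i}y^{j}} to coefficients, and an arbitrary test polynomial. All names are hypothetical helpers, not anything from Shankar.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def X(p):
    return {(i + 1, j): c for (i, j), c in p.items()}

def Y(p):
    return {(i, j + 1): c for (i, j), c in p.items()}

def Px(p):
    # -i hbar d/dx acting on each monomial x^i y^j.
    return {(i - 1, j): -1j * HBAR * i * c for (i, j), c in p.items() if i > 0}

def Py(p):
    # -i hbar d/dy acting on each monomial x^i y^j.
    return {(i, j - 1): -1j * HBAR * j * c for (i, j), c in p.items() if j > 0}

def combine(p, q, scale=1):
    # Return p + scale * q, dropping terms that cancel.
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + scale * c
    return {k: c for k, c in out.items() if c != 0}

def scaled(p, s):
    return {k: s * c for k, c in p.items() if s * c != 0}

def Lz(p):
    # L_z = X P_y - Y P_x
    return combine(X(Py(p)), Y(Px(p)), scale=-1)

def commutator(A, B, p):
    # [A, B] acting on p.
    return combine(A(B(p)), B(A(p)), scale=-1)

def same(p, q, tol=1e-12):
    keys = set(p) | set(q)
    return all(abs(p.get(k, 0) - q.get(k, 0)) < tol for k in keys)

# Arbitrary test polynomial: x^2 y - 2 y^3 + 0.5 x + 1.5 x^3 y^2
psi = {(2, 1): 1.0, (0, 3): -2.0, (1, 0): 0.5, (3, 2): 1.5}

print(same(commutator(X, Lz, psi), scaled(Y(psi), -1j * HBAR)))
print(same(commutator(Y, Lz, psi), scaled(X(psi), 1j * HBAR)))
print(same(commutator(Px, Lz, psi), scaled(Py(psi), -1j * HBAR)))
print(same(commutator(Py, Lz, psi), scaled(Px(psi), 1j * HBAR)))
```

A check on one polynomial is of course not a proof, but it does catch sign errors in the commutators.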

Rotational invariance in two dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.1.

As a first look at rotational invariance in quantum mechanics, we’ll look at two-dimensional rotations about the {z} axis. Classically, a rotation by an angle {\phi_{0}} about the {z} axis is given by the matrix equation for the coordinates

\displaystyle \left[\begin{array}{c} \bar{x}\\ \bar{y} \end{array}\right]=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right]\left[\begin{array}{c} x\\ y \end{array}\right] \ \ \ \ \ (1)

The momenta transform the same way, since we are merely changing the direction of the {x} and {y} axes. Thus we have also

\displaystyle \left[\begin{array}{c} \bar{p}_{x}\\ \bar{p}_{y} \end{array}\right]=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right]\left[\begin{array}{c} p_{x}\\ p_{y} \end{array}\right] \ \ \ \ \ (2)

The rotation matrix can be written as an operator, defined as

\displaystyle R\left(\phi_{0}\hat{\mathbf{z}}\right)=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right] \ \ \ \ \ (3)

In quantum mechanics, due to the uncertainty principle we cannot specify position and momentum precisely at the same time, so as with the case of translational invariance, we deal with expectation values. As usual, a rotation is represented by a unitary operator {U\left[R\left(\phi_{0}\hat{\mathbf{z}}\right)\right]} so that a quantum state transforms according to

\displaystyle \left|\psi\right\rangle \rightarrow\left|\psi_{R}\right\rangle =U\left[R\right]\left|\psi\right\rangle \ \ \ \ \ (4)

Dealing with expectation values means that the rotation operator must satisfy

\displaystyle \left\langle X\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \cos\phi_{0}-\left\langle Y\right\rangle \sin\phi_{0}\ \ \ \ \ (5)
\displaystyle \left\langle Y\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \sin\phi_{0}+\left\langle Y\right\rangle \cos\phi_{0}\ \ \ \ \ (6)
\displaystyle \left\langle P_{x}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \cos\phi_{0}-\left\langle P_{y}\right\rangle \sin\phi_{0}\ \ \ \ \ (7)
\displaystyle \left\langle P_{y}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \sin\phi_{0}+\left\langle P_{y}\right\rangle \cos\phi_{0} \ \ \ \ \ (8)

The expectation values on the LHS of these equations are calculated using the rotated state, so that

\displaystyle \left\langle X\right\rangle _{R}=\left\langle \psi_{R}\left|X\right|\psi_{R}\right\rangle \ \ \ \ \ (9)

and so on.

In two dimensions, the position eigenkets depend on the two independent coordinates {x} and {y}, and each of these eigenkets transforms under rotation in the same way as the position variables above. Operating on such an eigenket with the unitary rotation operator thus must give

\displaystyle U\left[R\right]\left|x,y\right\rangle =\left|x\cos\phi_{0}-y\sin\phi_{0},x\sin\phi_{0}+y\cos\phi_{0}\right\rangle \ \ \ \ \ (10)

 

As with the translation operator, we try to construct an explicit form for {U\left[R\right]} by considering an infinitesimal rotation {\varepsilon_{z}\hat{\mathbf{z}}} about the {z} axis. We propose that the unitary operator for this rotation is given by

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (11)

 

where {L_{z}} is, at this stage, an unknown operator called the generator of infinitesimal rotations (although, as the notation suggests, it will turn out to be the {z} component of angular momentum). Under this rotation, we have, to first order in {\varepsilon_{z}}:

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle =\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \ \ \ \ \ (12)

Note that we’ve omitted a possible phase factor in this rotation. That is, we could have written

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle =e^{i\varepsilon_{z}g\left(x,y\right)/\hbar}\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \ \ \ \ \ (13)

for some real function {g\left(x,y\right)}. Dropping the phase factor has the effect of making the momentum expectation values transform in the same way as the position expectation values, as shown by Shankar in his equation 12.2.13, so we’ll just take the phase factor to be 1 from now on.

We can now find the position space form of a general state vector {\left|\psi\right\rangle } under an infinitesimal rotation by following a similar procedure to that for a translation.

We have

\displaystyle \left|\psi_{\varepsilon_{z}}\right\rangle \displaystyle = \displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|\psi\right\rangle \ \ \ \ \ (14)
\displaystyle \displaystyle = \displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x,y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy\ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy\ \ \ \ \ (16)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy \ \ \ \ \ (17)

We can now change integration variables if we define

\displaystyle x^{\prime} \displaystyle \equiv \displaystyle x-y\varepsilon_{z}\ \ \ \ \ (18)
\displaystyle y^{\prime} \displaystyle \equiv \displaystyle x\varepsilon_{z}+y \ \ \ \ \ (19)

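The differentials transform by considering terms only up to first order in infinitesimal quantities. (More precisely, the Jacobian of the transformation is {1+\varepsilon_{z}^{2}}, so the area element {dx^{\prime}\;dy^{\prime}} equals {dx\;dy} to first order in {\varepsilon_{z}}.) Within the integral we therefore have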

\displaystyle dx^{\prime} \displaystyle = \displaystyle dx-\varepsilon_{z}dy=dx\ \ \ \ \ (20)
\displaystyle dy^{\prime} \displaystyle = \displaystyle \varepsilon_{z}dx+dy=dy \ \ \ \ \ (21)

Also, to first order in infinitesimal quantities, we can invert the variables to get

\displaystyle x^{\prime}+\varepsilon_{z}y^{\prime} \displaystyle = \displaystyle x-y\varepsilon_{z}+x\varepsilon_{z}^{2}+y\varepsilon_{z}=x\ \ \ \ \ (22)
\displaystyle y^{\prime}-\varepsilon_{z}x^{\prime} \displaystyle = \displaystyle x\varepsilon_{z}+y-x\varepsilon_{z}+y\varepsilon_{z}^{2}=y \ \ \ \ \ (23)

The ranges of integration are still {\pm\infty}, so we end up with

\displaystyle \left|\psi_{\varepsilon_{z}}\right\rangle =\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x^{\prime},y^{\prime}\right\rangle \left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime} \ \ \ \ \ (24)

Multiplying on the left by the bra {\left\langle x,y\right|} we have

\displaystyle \left\langle x,y\left|\psi_{\varepsilon_{z}}\right.\right\rangle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left\langle x,y\left|x^{\prime},y^{\prime}\right.\right\rangle \left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime}\ \ \ \ \ (25)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\delta\left(x-x^{\prime}\right)\delta\left(y-y^{\prime}\right)\left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime}\ \ \ \ \ (26)
\displaystyle \displaystyle = \displaystyle \left\langle x+\varepsilon_{z}y,y-\varepsilon_{z}x\left|\psi\right.\right\rangle \ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle \psi\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right) \ \ \ \ \ (28)

This can now be expanded in a 2-variable Taylor series to give, to first order in {\varepsilon_{z}}:

\displaystyle \psi\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right)=\psi\left(x,y\right)+y\varepsilon_{z}\frac{\partial\psi}{\partial x}-x\varepsilon_{z}\frac{\partial\psi}{\partial y} \ \ \ \ \ (29)
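
The first-order expansion 29 can be checked on a simple example. The sketch below uses the polynomial {\psi=x^{2}y+x}, chosen arbitrarily for illustration, with its derivatives written out by hand, and measures how far the rotated function is from the first-order expression.

```python
def psi(x, y):
    # Arbitrary polynomial test function (an assumption for illustration).
    return x * x * y + x

def dpsi_dx(x, y):
    return 2.0 * x * y + 1.0

def dpsi_dy(x, y):
    return x * x

def first_order(x, y, eps):
    # Right-hand side of equation 29.
    return psi(x, y) + y * eps * dpsi_dx(x, y) - x * eps * dpsi_dy(x, y)

def residual(x, y, eps):
    # Difference between the rotated function and its first-order expansion.
    return abs(psi(x + eps * y, y - eps * x) - first_order(x, y, eps))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, residual(1.0, 0.5, eps))
```

The residual falls off like {\varepsilon_{z}^{2}}, so the two sides agree to first order.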

 

We can compare this with 11 inserted into 14:

\displaystyle \left\langle x,y\left|\psi_{\varepsilon_{z}}\right.\right\rangle \displaystyle = \displaystyle \left\langle x,y\left|U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\right|\psi\right\rangle \ \ \ \ \ (30)
\displaystyle \displaystyle = \displaystyle \left\langle x,y\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle \ \ \ \ \ (31)
\displaystyle \displaystyle = \displaystyle \psi\left(x,y\right)-\frac{i\varepsilon_{z}}{\hbar}\left\langle x,y\left|L_{z}\right|\psi\right\rangle \ \ \ \ \ (32)

Setting 32 equal to 29 we have

\displaystyle -\frac{i\varepsilon_{z}}{\hbar}\left\langle x,y\left|L_{z}\right|\psi\right\rangle \displaystyle = \displaystyle y\varepsilon_{z}\frac{\partial\psi}{\partial x}-x\varepsilon_{z}\frac{\partial\psi}{\partial y}\ \ \ \ \ (33)
\displaystyle \left\langle x,y\left|L_{z}\right|\psi\right\rangle \displaystyle = \displaystyle x\left(-i\hbar\frac{\partial\psi}{\partial y}\right)-y\left(-i\hbar\frac{\partial\psi}{\partial x}\right) \ \ \ \ \ (34)

Using the position-space forms of the momenta

\displaystyle P_{x} \displaystyle = \displaystyle -i\hbar\frac{\partial}{\partial x}\ \ \ \ \ (35)
\displaystyle P_{y} \displaystyle = \displaystyle -i\hbar\frac{\partial}{\partial y} \ \ \ \ \ (36)

we see that {L_{z}} is given by

\displaystyle L_{z}=XP_{y}-YP_{x} \ \ \ \ \ (37)

which is the quantum equivalent of the {z} component of angular momentum, as promised.

Translation invariance in two dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.1.1.

In preparation for an examination of rotation invariance, we’ll have a look at translational invariance in two dimensions. We can apply much of what we did with translation in one dimension, where we showed that the momentum {P} is the generator of translations. In particular, the translation operator {T\left(\varepsilon\right)} for an infinitesimal translation {\varepsilon} is

\displaystyle  T\left(\varepsilon\right)=I-\frac{i\varepsilon}{\hbar}P \ \ \ \ \ (1)

In two dimensions, we can write an infinitesimal translation as {\boldsymbol{\delta}a} where

\displaystyle  \boldsymbol{\delta}a=\delta a_{x}\hat{\mathbf{x}}+\delta a_{y}\hat{\mathbf{y}} \ \ \ \ \ (2)

In one dimension, we showed earlier that

\displaystyle  \left\langle x\left|T\left(\varepsilon\right)\right|\psi\right\rangle =\psi\left(x-\varepsilon\right) \ \ \ \ \ (3)

The analogous relation in two dimensions is

\displaystyle  \left\langle x,y\left|T\left(\boldsymbol{\delta}a\right)\right|\psi\right\rangle =\psi\left(x-\delta a_{x},y-\delta a_{y}\right) \ \ \ \ \ (4)

We can verify that the correct form for {T\left(\boldsymbol{\delta}a\right)} is

\displaystyle   T\left(\boldsymbol{\delta}a\right) \displaystyle  = \displaystyle  I-\frac{i}{\hbar}\boldsymbol{\delta}a\cdot\mathbf{P}\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  I-\frac{i}{\hbar}\left(\delta a_{x}P_{x}+\delta a_{y}P_{y}\right) \ \ \ \ \ (6)

Using the representation of momentum in the position basis, which is

\displaystyle   P_{x} \displaystyle  = \displaystyle  -i\hbar\frac{\partial}{\partial x}\ \ \ \ \ (7)
\displaystyle  P_{y} \displaystyle  = \displaystyle  -i\hbar\frac{\partial}{\partial y} \ \ \ \ \ (8)

the LHS of 4 is, using {\left\langle x,y\left|\psi\right.\right\rangle =\psi\left(x,y\right)}:

\displaystyle   \left\langle x,y\left|T\left(\boldsymbol{\delta}a\right)\right|\psi\right\rangle \displaystyle  = \displaystyle  \left\langle x,y\left|I-\frac{i}{\hbar}\left(\delta a_{x}P_{x}+\delta a_{y}P_{y}\right)\right|\psi\right\rangle \ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \psi\left(x,y\right)-\delta a_{x}\frac{\partial\psi}{\partial x}-\delta a_{y}\frac{\partial\psi}{\partial y} \ \ \ \ \ (10)

The last line is also what we get if we expand the RHS of 4 to first order in {\boldsymbol{\delta}a}, which verifies that 5 is correct, so that the two-dimensional momentum {\mathbf{P}} is the generator of two-dimensional translations.

We can apply the exponentiation technique we used in the one-dimensional case to obtain the translation operator for a finite translation in two dimensions. We need to be careful that we don’t run into problems with non-commuting operators, but in view of 7 and 8 and the fact that derivatives with respect to different independent variables commute, we see that

\displaystyle  \left[P_{x},P_{y}\right]=0 \ \ \ \ \ (11)

We can divide a finite translation {\mathbf{a}} into {N} small steps, each of size {\frac{\mathbf{a}}{N}}, so that the translation is

\displaystyle  T\left(\mathbf{a}\right)=\left(I-\frac{i}{\hbar N}\mathbf{a}\cdot\mathbf{P}\right)^{N} \ \ \ \ \ (12)

Because the two components of momentum commute, we can take the limit of this expression to get the exponential form:

\displaystyle  T\left(\mathbf{a}\right)=\lim_{N\rightarrow\infty}\left(I-\frac{i}{\hbar N}\mathbf{a}\cdot\mathbf{P}\right)^{N}=e^{-i\mathbf{a}\cdot\mathbf{P}/\hbar} \ \ \ \ \ (13)

Again, because the two components of momentum commute, we can combine two translations, by {\mathbf{a}} and then by {\mathbf{b}}, to get

\displaystyle  T\left(\mathbf{b}\right)T\left(\mathbf{a}\right)=e^{-i\mathbf{b}\cdot\mathbf{P}/\hbar}e^{-i\mathbf{a}\cdot\mathbf{P}/\hbar}=e^{-i\left(\mathbf{a}+\mathbf{b}\right)\cdot\mathbf{P}/\hbar}=T\left(\mathbf{b}+\mathbf{a}\right) \ \ \ \ \ (14)
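
In position space, the composition rule 14 says that shifting a wave function by {\mathbf{a}} and then by {\mathbf{b}} is the same as shifting it once by {\mathbf{a}+\mathbf{b}}, in either order. The sketch below assumes the finite form of 4, that is, {T\left(\mathbf{a}\right)} sends {\psi\left(x,y\right)} to {\psi\left(x-a_{x},y-a_{y}\right)}, and uses an arbitrary test function and sample points.

```python
import math

def translate(psi, a):
    # Position-space action of T(a): (T(a) psi)(x, y) = psi(x - a_x, y - a_y).
    ax, ay = a
    return lambda x, y: psi(x - ax, y - ay)

def psi0(x, y):
    # Arbitrary test wave function (an assumption for illustration).
    return math.exp(-(x * x + 2.0 * y * y)) * (1.0 + 0.5 * x)

def max_difference(f, g, points):
    return max(abs(f(x, y) - g(x, y)) for x, y in points)

a = (0.4, -1.1)
b = (-0.7, 0.25)
ab = translate(translate(psi0, a), b)                # T(b) T(a)
ba = translate(translate(psi0, b), a)                # T(a) T(b)
total = translate(psi0, (a[0] + b[0], a[1] + b[1]))  # T(a + b)

points = [(0.0, 0.0), (0.3, -0.8), (-1.2, 0.5), (0.9, 1.4)]
print(max_difference(ab, total, points), max_difference(ab, ba, points))
```

Both differences are zero up to rounding, which reflects the fact that {P_{x}} and {P_{y}} commute.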

Time reversal, antiunitary operators and Wigner’s theorem

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 11, Section 11.5.

Zee, A. (2016), Group Theory in a Nutshell for Physicists. Section IV.6.

Parity is one of the two main discrete symmetries treated in non-relativistic quantum mechanics. The other is time reversal, which we’ll look at here.

First, we’ll have a look at what time reversal symmetry means in classical physics. The idea is that if we take a snapshot of the system at some time, each particle will have a given position {x} and a given momentum {p}. If we reverse the direction of time at that instant, the particle’s position remains the same, but its momentum reverses. In other words, {x\rightarrow x} and {p\rightarrow-p}. Note the difference between time reversal and parity: in a parity operation, both position and momentum get ‘reflected’ into their negative values, while in time reversal, only momentum gets ‘reflected’.

We can see how this works by looking at Newton’s law in the form

\displaystyle  F=m\frac{d^{2}x}{dt^{2}} \ \ \ \ \ (1)

Time reversal invariance is valid if the same equation holds when we reverse the direction of time, that is, we let {t\rightarrow-t}. Since {x\rightarrow x}, the numerator on the RHS is unchanged. For the denominator, {t\rightarrow-t} means that {dt\rightarrow-dt} and {\left(dt\right)^{2}\rightarrow\left(-dt\right)^{2}=dt^{2}}, so the acceleration is invariant. Newton’s law is therefore invariant under time reversal provided that the force on the LHS is invariant, which will be the case provided that {F} depends only on {x} and not on {\dot{x}}. This is true for forces such as Newtonian gravity and electrostatics, but is not true for the magnetic force felt by a charge {q} moving through a magnetic field {\mathbf{B}} with velocity {\mathbf{v}}, where the Lorentz force law holds:

\displaystyle  \mathbf{F}=q\mathbf{v}\times\mathbf{B} \ \ \ \ \ (2)

This follows because {\mathbf{v}\rightarrow-\mathbf{v}} so if the field {\mathbf{B}} is the same after time reversal, {\mathbf{F}\rightarrow-\mathbf{F}}. However, because all magnetic fields are produced by the motion of charges, if we expand the time reversal to include the charges giving rise to the magnetic field {\mathbf{B}}, then the motion of all these charges would reverse, which in turn would cause {\mathbf{B}\rightarrow-\mathbf{B}}. Thus if we time-reverse the entire electromagnetic system, the electromagnetic force is invariant under time reversal.
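
The classical statement can be illustrated with a small simulation. The sketch below integrates Newton’s law in one dimension forward in time, reverses the momentum, integrates for the same time again, and compares the result with the starting state. The spring force, the damping force, the Runge-Kutta integrator and all parameter values are assumptions chosen for illustration; the magnetic case is not modelled here.

```python
def rk4_step(x, v, dt, force, mass=1.0):
    # One fourth-order Runge-Kutta step for dx/dt = v, dv/dt = F(x, v)/m.
    def deriv(x_, v_):
        return v_, force(x_, v_) / mass

    k1x, k1v = deriv(x, v)
    k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
    k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
    k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x_new, v_new

def evolve(x, v, force, dt=1e-3, steps=2000):
    for _ in range(steps):
        x, v = rk4_step(x, v, dt, force)
    return x, v

def round_trip_error(force, x0=1.0, v0=0.5):
    x1, v1 = evolve(x0, v0, force)
    # Time reversal: keep the position, reverse the momentum.
    x2, v2 = evolve(x1, -v1, force)
    # Reverse the momentum once more to compare with the initial state.
    return max(abs(x2 - x0), abs(-v2 - v0))

def spring(x, v):
    return -4.0 * x            # depends only on position

def damped(x, v):
    return -4.0 * x - 0.8 * v  # also depends on velocity

print(round_trip_error(spring), round_trip_error(damped))
```

For the position-dependent force the particle retraces its path back to the starting state (up to integration error), while for the velocity-dependent force it does not.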

How does time reversal work in quantum mechanics? Shankar considers a particle in one dimension governed by a time-independent Hamiltonian, which obeys the Schrödinger equation, as usual:

\displaystyle  i\hbar\frac{\partial\psi\left(x,t\right)}{\partial t}=H\left(x\right)\psi\left(x,t\right) \ \ \ \ \ (3)

At this point, Shankar states that if we replace {\psi} by its complex conjugate {\psi^*}, we are implementing time reversal, claiming that it is ‘clear’ because {\psi^*} gives the same probability distribution as {\psi}. I cannot find any reason why this should be ‘clear’ from this statement, so let’s try looking at the problem in a bit more detail. The clearest explanation I’ve found is in Zee’s book, referenced above.

In order that the system be invariant under time reversal, we consider the transformation {t\rightarrow t^{\prime}=-t} and we wish to find some operator {T} which operates on the wave function {\psi\left(t\right)} so that

\displaystyle  T\psi\left(t\right)=\psi^{\prime}\left(t^{\prime}\right)=\psi^{\prime}\left(-t\right) \ \ \ \ \ (4)

[I’m suppressing the dependence on {x} for brevity; since time reversal doesn’t affect {x}, it stays the same throughout this argument.] We require that {\psi^{\prime}\left(t^{\prime}\right)} satisfies the Schrödinger equation in the form

\displaystyle  i\hbar\frac{\partial\psi^{\prime}\left(t^{\prime}\right)}{\partial t^{\prime}}=H\psi^{\prime}\left(t^{\prime}\right) \ \ \ \ \ (5)

From this, we get

\displaystyle  i\hbar\frac{\partial\left(T\psi\left(t\right)\right)}{\partial\left(-t\right)}=HT\psi\left(t\right) \ \ \ \ \ (6)

Whatever this unknown operator {T} is, it has an inverse, so we can multiply on the left by {T^{-1}} to get

\displaystyle  T^{-1}\left(-i\right)T\hbar\frac{\partial\psi\left(t\right)}{\partial t}=T^{-1}HT\psi\left(t\right) \ \ \ \ \ (7)

Notice that we’re not assuming that {T} has no effect on {i} (that is, we’re not assuming that we can pull {i} out of the expression on the LHS). Now we know that {T} has an effect only if what it operates on depends on time (since it’s the time reversal operator) so, since we’re assuming that {H} is time-independent, we must have {\left[H,T\right]=0}. Given this, we have

\displaystyle  T^{-1}HT=T^{-1}TH=H \ \ \ \ \ (8)

Thus, the RHS of 7 reduces to the RHS of the original Schrödinger equation 3. If the Schrödinger equation is to remain valid after time reversal, the LHS of 7 must also reduce to the LHS of 3. That is, we must have

\displaystyle  T^{-1}\left(-i\right)T=i \ \ \ \ \ (9)

Multiplying on the left by {T} we get

\displaystyle  -iT=Ti \ \ \ \ \ (10)

In other words, one of the effects of {T} is that it takes the complex conjugate of any expression that it operates on.

To find out exactly what {T} is, we can write it as the product of a unitary operator {U} and the operator {K}, whose only job is that it takes the complex conjugate. Since doing the complex conjugate operation twice in succession returns us to the original expression, {K^{2}=I}, so {K=K^{-1}}. We get

\displaystyle   T \displaystyle  = \displaystyle  UK\ \ \ \ \ (11)
\displaystyle  T^{-1} \displaystyle  = \displaystyle  K^{-1}U^{-1}=KU^{-1} \ \ \ \ \ (12)

Ordinary unitary operators are linear in the sense that {U\left(\alpha\psi\right)=\alpha U\psi}, where {\alpha} is a complex number and {\psi} is some function, with a similar relation holding for {U^{-1}}. Combining the above few equations, we have

\displaystyle   T^{-1}\left(-i\right)T \displaystyle  = \displaystyle  KU^{-1}\left(-i\right)UK\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  K\left(-i\right)U^{-1}UK\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  iK^{2}\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  i \ \ \ \ \ (16)

Thus the most general form for {T} is some unitary operator {U} multiplied by the complex conjugate operator {K}. We can see that, for such an operator, and complex constants {\alpha} and {\beta} and functions {\psi} and {\phi}:

\displaystyle   T\left(\alpha\psi+\beta\phi\right) \displaystyle  = \displaystyle  UK\left(\alpha\psi+\beta\phi\right)\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  U\left(\alpha^*K\psi+\beta^*K\phi\right)\ \ \ \ \ (18)
\displaystyle  \displaystyle  = \displaystyle  \alpha^*UK\psi+\beta^*UK\phi\ \ \ \ \ (19)
\displaystyle  \displaystyle  = \displaystyle  \alpha^*T\psi+\beta^*T\phi \ \ \ \ \ (20)

An operator that obeys this relation is called antilinear. The operator {T} has the additional property

\displaystyle   \left\langle T\psi\left|T\phi\right.\right\rangle \displaystyle  = \displaystyle  \left\langle UK\psi\left|UK\phi\right.\right\rangle \ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  \left\langle K\psi\left|K\phi\right.\right\rangle \ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  \left\langle \psi\left|\phi\right.\right\rangle ^*\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \left\langle \phi\left|\psi\right.\right\rangle \ \ \ \ \ (24)

The second line follows from the fact that a unitary operator preserves inner products, and the third from the fact that {K} takes the complex conjugate of both wave functions in the inner product. An antilinear operator that satisfies the condition {\left\langle T\psi\left|T\phi\right.\right\rangle =\left\langle \phi\left|\psi\right.\right\rangle } is called antiunitary. [The fact that time reversal is antiunitary was first derived by Eugene Wigner in 1932. A more general result, known as Wigner’s theorem, states that any symmetry in a quantum system must be represented by either a unitary or an antiunitary operator.]
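The antilinear and antiunitary relations, along with {T^{-1}\left(-i\right)T=i}, can be spot-checked numerically. The following is only a sketch: it replaces wave functions by two-component complex vectors, and the particular matrix `U`, the vectors `psi` and `phi` and the constants `alpha` and `beta` are arbitrary choices made for illustration, not anything taken from Shankar or Zee.

```python
import cmath
import math

def inner(a, b):
    """Inner product <a|b>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def matvec(M, v):
    """Apply the matrix M (list of rows) to the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dagger(M):
    """Conjugate transpose of M."""
    return [[M[j][i].conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]

def K(v):
    """Complex conjugation operator."""
    return [x.conjugate() for x in v]

# An arbitrary 2x2 unitary matrix (a rotation with a phase on each row).
theta = 0.7
U = [[cmath.exp(0.3j) * math.cos(theta), -cmath.exp(0.3j) * math.sin(theta)],
     [cmath.exp(-1.1j) * math.sin(theta), cmath.exp(-1.1j) * math.cos(theta)]]

def T(v):
    """T = U K: conjugate first, then apply U."""
    return matvec(U, K(v))

def T_inv(v):
    """T^{-1} = K U^{-1}, with U^{-1} equal to the conjugate transpose of U."""
    return K(matvec(dagger(U), v))

psi = [1 + 2j, 0.5 - 1j]
phi = [-0.3 + 0.4j, 2 - 0.5j]
alpha, beta = 2 - 1j, -0.5 + 3j

# Antilinearity: T(alpha psi + beta phi) = alpha* T psi + beta* T phi
lhs = T([alpha * a + beta * b for a, b in zip(psi, phi)])
rhs = [alpha.conjugate() * a + beta.conjugate() * b
       for a, b in zip(T(psi), T(phi))]
antilinear_error = max(abs(l - r) for l, r in zip(lhs, rhs))

# Antiunitarity: <T psi|T phi> = <phi|psi>
antiunitary_error = abs(inner(T(psi), T(phi)) - inner(phi, psi))

# T^{-1} (-i) T psi = i psi
conjugated = T_inv([-1j * c for c in T(psi)])
conj_i_error = max(abs(c - 1j * p) for c, p in zip(conjugated, psi))

print(antilinear_error, antiunitary_error, conj_i_error)
```

All three printed numbers should be zero up to floating-point rounding, whatever unitary matrix is chosen for `U`.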

To find {U} in this case, consider a plane wave state

\displaystyle  \psi\left(t\right)=e^{i\left(px-Et\right)/\hbar} \ \ \ \ \ (25)

Applying {T} to this state, we have

\displaystyle   T\psi\left(t\right) \displaystyle  = \displaystyle  UKe^{i\left(px-Et\right)/\hbar}\ \ \ \ \ (26)
\displaystyle  \displaystyle  = \displaystyle  Ue^{-i\left(px-Et\right)/\hbar} \ \ \ \ \ (27)

For a single-component wave function in one dimension, the simplest unitary operator {U} is a constant phase factor {e^{i\alpha}} for some real {\alpha} (which preserves the inner product). We can take {U=1} since the phase factor cancels out when calculating {\left|T\psi\left(t\right)\right|^{2}}. Going back to 4, we see that the time-reversed wave function is

\displaystyle   \psi^{\prime}\left(-t\right) \displaystyle  = \displaystyle  T\psi\left(t\right)=e^{-i\left(px-Et\right)/\hbar}\ \ \ \ \ (28)
\displaystyle  \psi^{\prime}\left(t\right) \displaystyle  = \displaystyle  e^{-i\left(px+Et\right)/\hbar}=e^{i\left(-px-Et\right)/\hbar} \ \ \ \ \ (29)

Since this is the same as the original wave function except that {p\rightarrow-p}, we see that it is indeed a valid time-reversed wave function. The energy is the same (the {-Et} part of the exponent still has a minus sign) but the momentum has reversed, giving a wave that moves in the opposite direction.
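As a quick numerical check of the plane wave result, the sketch below compares the time-reversed wave function {\psi^{\prime}\left(t\right)=\left[\psi\left(-t\right)\right]^*} with the original plane wave evaluated at {-p}. It assumes a free particle with {E=p^{2}/2m} and units in which {\hbar=m=1}; the function names, the value of `p` and the sample points are my own choices for illustration.

```python
import cmath

HBAR = 1.0  # natural units (an assumption of this sketch)
MASS = 1.0

def plane_wave(x, t, p):
    """Free-particle plane wave exp(i(px - Et)/hbar) with E = p^2/2m."""
    energy = p**2 / (2 * MASS)
    return cmath.exp(1j * (p * x - energy * t) / HBAR)

def time_reversed(x, t, p):
    """psi'(t) = [psi(-t)]^*, that is, T = K with U = 1."""
    return plane_wave(x, -t, p).conjugate()

p = 1.3
samples = [(-2.0, 0.0), (0.4, 1.7), (3.1, -0.9)]  # (x, t) pairs

# The time-reversed wave should equal the original wave with p -> -p.
max_diff = max(abs(time_reversed(x, t, p) - plane_wave(x, t, -p))
               for x, t in samples)
print(max_diff)
```

The difference vanishes to rounding error at every sample point, because the energy depends only on {p^{2}} and so is unchanged when the momentum is reversed.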

Another way of looking at time reversal is as follows. Suppose we start with a system in the state {\psi\left(0\right)} at {t=0}. We can let it evolve for a time {\tau} using the propagator to get the state at time {t=\tau}:

\displaystyle  \psi\left(\tau\right)=e^{-iH\tau/\hbar}\psi\left(0\right) \ \ \ \ \ (30)

Applying time reversal via the operator {T} to this state, we have (we’re assuming that {H} is time-independent, but we’re allowing it to be complex)

\displaystyle  T\psi\left(\tau\right)=e^{iH^*\tau/\hbar}\psi^*\left(0\right) \ \ \ \ \ (31)

If we now evolve this time-reversed state through the same time {\tau}, we should end up back in the (time-reversed) original state if the system is invariant under time reversal. That is,

\displaystyle  \psi\left(2\tau\right)=e^{-iH\tau/\hbar}e^{iH^*\tau/\hbar}\psi^*\left(0\right)=\psi^*\left(0\right) \ \ \ \ \ (32)

[Note that we don’t require {\psi\left(2\tau\right)=\psi\left(0\right)} since {\psi\left(2\tau\right)} is the system in its time-reversed state, where it’s moving in the opposite direction to the original state. Think about time-reversing a bouncing ball. The ball becomes effectively time-reversed when it bounces. If the ball is travelling down at some speed {v} at a height {h}, then after bouncing (assuming an elastic bounce) it will be travelling at the same speed {v} when it bounces back to the height {h}, but it will be moving in the opposite direction.]

In this equation, we’re working in the {X} basis, so the exponents are numerical functions, not operators, and we’re free to combine the exponents without worrying about commutators. This means that in order for the system to be time-reversal invariant, we must have

\displaystyle  H\left(x\right)=H^*\left(x\right) \ \ \ \ \ (33)

In other words, the Hamiltonian must be real. The usual kinetic plus potential type of Hamiltonian satisfies this since it has the form

\displaystyle  H=\frac{P^{2}}{2m}+V\left(x\right) \ \ \ \ \ (34)

and although the quantum momentum operator is {P=-i\hbar\frac{d}{dx}}, its square is real. In the magnetic force case, the presence of the charge’s velocity as a linear term (in {q\mathbf{v}\times\mathbf{B}}) means the momentum operator occurs as a linear term, making {H} complex, so time reversal invariance doesn’t hold. Again, however, if we included the charges that give rise to the magnetic field, the discrepancy disappears.
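The round trip in 32 can be tried out numerically. The sketch below is a rough illustration under several assumptions of my own: the Hamiltonian is discretized on a periodic grid with finite differences, the potential is a harmonic well, {T} is taken to be plain complex conjugation ({U=1}), and the term `0.8 * P` is only a hypothetical stand-in for a term linear in the momentum, not the actual Hamiltonian of a charge in a magnetic field. It requires NumPy.

```python
import numpy as np

HBAR = 1.0   # natural units (an assumption of this sketch)
MASS = 1.0
N = 64       # number of grid points
BOX = 10.0   # length of the periodic box

x = np.linspace(-BOX / 2, BOX / 2, N, endpoint=False)
dx = x[1] - x[0]

identity = np.eye(N)
fwd = np.roll(identity, 1, axis=1)   # (fwd @ f)[i] = f[i+1], periodic
bwd = fwd.T                          # (bwd @ f)[i] = f[i-1], periodic
D1 = (fwd - bwd) / (2 * dx)                   # central first derivative
D2 = (fwd - 2 * identity + bwd) / dx**2       # second derivative

P = -1j * HBAR * D1                  # momentum operator: Hermitian, imaginary
H_real = -(HBAR**2 / (2 * MASS)) * D2 + np.diag(0.5 * x**2)
H_complex = H_real + 0.8 * P         # hypothetical term linear in P

def propagator(H, tau):
    """exp(-i H tau / hbar) for a Hermitian matrix H, via diagonalization."""
    energies, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * energies * tau / HBAR)) @ vecs.conj().T

def round_trip_error(H, psi0, tau):
    """Evolve for tau, conjugate, evolve for tau again, compare with psi0^*."""
    U_tau = propagator(H, tau)
    reversed_state = np.conj(U_tau @ psi0)
    final = U_tau @ reversed_state
    return np.linalg.norm(final - np.conj(psi0))

# A moving Gaussian wave packet as the initial state.
psi0 = np.exp(-(x - 1.0)**2 / 2) * np.exp(1.5j * x)
psi0 /= np.linalg.norm(psi0)

err_real = round_trip_error(H_real, psi0, tau=1.0)
err_complex = round_trip_error(H_complex, psi0, tau=1.0)
print(err_real, err_complex)
```

For the real Hamiltonian the error should be at the level of rounding, so the state returns to {\psi^*\left(0\right)}. With the momentum-linear term added, {H\ne H^*} and the round trip generally fails to return to {\psi^*\left(0\right)}.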