
Welcome to Physics Pages

This blog consists of my notes and solutions to problems in various areas of mainstream physics. An index to the topics covered is contained in the links in the sidebar on the right, or in the menu at the top of the page.

This isn’t a “popular science” site, in that most posts use a fair bit of mathematics to explain their concepts. Thus this blog aims mainly to help those who are learning or reviewing physics in depth. More details on what the site contains and how to use it are on the welcome page.

Despite Stephen Hawking’s caution that every equation included in a book (or, I suppose, in a blog) would halve the readership, this blog has proved very popular since its inception in December 2010. Details of the number of visits and distinct visitors are given on the hit statistics page.

Many thanks to my loyal followers and best wishes to everyone who visits. I hope you find it useful. Constructive criticism (or even praise) is always welcome, so feel free to leave a comment in response to any of the posts.

I should point out that although I did study physics at the university level, this was back in the 1970s and by the time I started this blog in December 2010, I had forgotten pretty much everything I had learned back then. This blog represents my journey back to some level of literacy in physics. I am by no means a professional physicist or an authority on any aspect of the subject. I offer this blog as a record of my own notes and problem solutions as I worked through various books, in the hope that it will help, and possibly even inspire, others to explore this wonderful subject.

Before leaving a comment, you may find it useful to read the “Instructions for commenters”.

Combining translations and rotations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.4.


When it comes to symmetries in quantum mechanics, we’ve looked at translations and rotations in two dimensions, and found that the generators are the momenta {P_{x}} and {P_{y}} for translations, and the angular momentum {L_{z}} for rotations.

From the fact that {L_{z}} does not commute with either the momentum or the position operators, you might guess that, if we perform some sequence of translations and rotations on a system, the order in which these operations are done matters. In fact, you can see this by considering simple two-dimensional geometry, without reference to quantum mechanics. Consider the {x} and {y} axes on a sheet of graph paper. First, translate these axes by adding the vector {\mathbf{r}} to all points, so that the new origin of coordinates lies at position {\mathbf{r}} as referenced in the original coordinates. Next, do a rotation about the original origin by some angle {\phi}. This will move the new origin around the original {z} axis. Now, do the inverse of the original translation by adding {-\mathbf{r}} to all points. Finally, do the inverse of the rotation by rotating the system by {-\phi} around the original {z} axis. You’ll find that the {xy} axes that have undergone this sequence of transformations do not coincide with the original {xy} axes. However, if you did the same set of four transformations in the order: translate by {\mathbf{r}}, translate by {-\mathbf{r}}, rotate by {\phi}, rotate by {-\phi}, the transformed axes would coincide with the original axes.
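The geometric claim is easy to check numerically. The following is a minimal Python sketch of my own, not something from Shankar; the function names and the sample vector and angle are chosen purely for illustration. It applies the four operations to a single point in the two orderings described above and returns where the point ends up.

```python
import math

def translate(p, r):
    """Add the vector r to the point p."""
    return (p[0] + r[0], p[1] + r[1])

def rotate(p, phi):
    """Rotate the point p by angle phi about the original origin."""
    c, s = math.cos(phi), math.sin(phi)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def interleaved(p, r, phi):
    """translate by r, rotate by phi, translate by -r, rotate by -phi."""
    p = translate(p, r)
    p = rotate(p, phi)
    p = translate(p, (-r[0], -r[1]))
    return rotate(p, -phi)

def grouped(p, r, phi):
    """translate by r, translate by -r, rotate by phi, rotate by -phi."""
    p = translate(p, r)
    p = translate(p, (-r[0], -r[1]))
    p = rotate(p, phi)
    return rotate(p, -phi)

# Illustrative values only: unit translation along x, quarter turn.
print(interleaved((0.0, 0.0), (1.0, 0.0), math.pi / 2))
print(grouped((0.0, 0.0), (1.0, 0.0), math.pi / 2))
```

With these sample values the interleaved ordering carries the origin to roughly the point (1, 1), while the grouped ordering returns it to the origin, up to floating-point rounding.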

To see how this works in quantum mechanics, we can again consider infinitesimal translations and rotations. If we start with a point at location {\left[x,y\right]} and apply the four transformations described above, but now for an infinitesimal translation {\boldsymbol{\varepsilon}=\varepsilon_{x}\hat{\mathbf{x}}+\varepsilon_{y}\hat{\mathbf{y}}} and rotation {\varepsilon_{z}\hat{\mathbf{z}}}, then the successive transformations work as follows. In each case, we’ll retain terms up to order {\varepsilon_{x}\varepsilon_{z}} and {\varepsilon_{y}\varepsilon_{z}} but discard terms of order {\varepsilon_{x}^{2}}, {\varepsilon_{y}^{2}}, {\varepsilon_{z}^{2}} and higher. [I’m not quite sure of the rationale that allows us to do this, apart from the fact that it gives the right answer.]

\displaystyle   \left[\begin{array}{c} x\\ y \end{array}\right] \displaystyle  {\longrightarrow\atop T\left(\boldsymbol{\varepsilon}\right)} \displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}\\ y+\varepsilon_{y} \end{array}\right]\ \ \ \ \ (1)
\displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}\\ y+\varepsilon_{y} \end{array}\right] \displaystyle  {\longrightarrow\atop R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)} \displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]\ \ \ \ \ (2)
\displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \displaystyle  {\longrightarrow\atop T\left(-\boldsymbol{\varepsilon}\right)} \displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}-\varepsilon_{x}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z}-\varepsilon_{y} \end{array}\right]\ \ \ \ \ (3)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]\ \ \ \ \ (4)
\displaystyle  \left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \displaystyle  {\longrightarrow\atop R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)} \displaystyle  \left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}+\left[y+\left(x+\varepsilon_{x}\right)\varepsilon_{z}\right]\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z}-\left[x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\right]\varepsilon_{z} \end{array}\right]\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} x-\varepsilon_{y}\varepsilon_{z}\\ y+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (6)

Thus, to this order in the infinitesimals, the combination of translation-rotation-translation-rotation is equivalent to a single translation by a distance {\left[-\varepsilon_{y}\varepsilon_{z},\varepsilon_{x}\varepsilon_{z}\right]}. We can write this in terms of the unitary quantum operators for translations and rotations as

\displaystyle  U\left[R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(-\boldsymbol{\varepsilon}\right)U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(\boldsymbol{\varepsilon}\right)=T\left(-\varepsilon_{y}\varepsilon_{z}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}\right) \ \ \ \ \ (7)
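As a sanity check on the net displacement in 6 and 7, here is a small Python sketch (my own construction, with hypothetical function names and sample values) that performs the four operations using exact rotations rather than first-order ones, and measures how far the result is from the predicted shift {\left[-\varepsilon_{y}\varepsilon_{z},\varepsilon_{x}\varepsilon_{z}\right]}. If the leftover is of third order in the small quantities, halving all three infinitesimals should shrink it by a factor of about 8.

```python
import math

def net_displacement(point, eps_x, eps_y, eps_z):
    """Apply T(eps), R(eps_z), T(-eps), R(-eps_z) with exact rotations.

    Returns the overall shift of the point.
    """
    x, y = point
    x, y = x + eps_x, y + eps_y                  # T(eps)
    c, s = math.cos(eps_z), math.sin(eps_z)
    x, y = c * x - s * y, s * x + c * y          # R(eps_z)
    x, y = x - eps_x, y - eps_y                  # T(-eps)
    x, y = c * x + s * y, -s * x + c * y         # R(-eps_z)
    return (x - point[0], y - point[1])

def residual(eps_x, eps_y, eps_z, point=(0.3, -0.7)):
    """Distance between the actual shift and the prediction of equation 6."""
    dx, dy = net_displacement(point, eps_x, eps_y, eps_z)
    return math.hypot(dx + eps_y * eps_z, dy - eps_x * eps_z)

# Sample values (assumptions for illustration); the second call halves them all.
big = residual(0.01, 0.02, 0.03)
small = residual(0.005, 0.01, 0.015)
print(big / small)
```

The printed ratio should come out close to 8, which suggests that, at least for this particular sequence, the terms dropped in going from 5 to 6 are cubic in the small quantities.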

Using the forms of these operators for infinitesimal transformations, we can expand both sides to give

\displaystyle   \left(I+\frac{i\varepsilon_{z}}{\hbar}L_{z}\right)\left[I+\frac{i}{\hbar}\left(\varepsilon_{x}P_{x}+\varepsilon_{y}P_{y}\right)\right]\times
\displaystyle  \left(I-\frac{i\varepsilon_{z}}{\hbar}L_{z}\right)\left[I-\frac{i}{\hbar}\left(\varepsilon_{x}P_{x}+\varepsilon_{y}P_{y}\right)\right] \displaystyle  = \displaystyle  I-\frac{i}{\hbar}\left(-\varepsilon_{y}\varepsilon_{z}P_{x}+\varepsilon_{x}\varepsilon_{z}P_{y}\right)\ \ \ \ \ (8)

Since the infinitesimal displacements are arbitrary, this equation can be valid only if the coefficients of each combination of {\varepsilon_{x},\varepsilon_{y}} and {\varepsilon_{z}} are equal on both sides. As above, we’ll discard any terms of order {\varepsilon_{x}^{2}}, {\varepsilon_{y}^{2}}, {\varepsilon_{z}^{2}} and higher. The algebra is straightforward although a bit tedious, so I’ll just give a couple of examples here.

The coefficient of {\varepsilon_{z}} on its own is, on the LHS

\displaystyle  \frac{i\varepsilon_{z}}{\hbar}L_{z}-\frac{i\varepsilon_{z}}{\hbar}L_{z}=0 \ \ \ \ \ (9)

There is no term in {\varepsilon_{z}} alone on the RHS, so both sides are zero and the equation is consistent.

For the {\varepsilon_{x}\varepsilon_{z}} term, we get on the LHS:

\displaystyle  \varepsilon_{x}\varepsilon_{z}\frac{i^{2}}{\hbar^{2}}\left(L_{z}P_{x}-L_{z}P_{x}-P_{x}L_{z}+L_{z}P_{x}\right)=-\varepsilon_{x}\varepsilon_{z}\frac{i^{2}}{\hbar^{2}}\left[P_{x},L_{z}\right] \ \ \ \ \ (10)

On the RHS, the term is

\displaystyle  -\frac{i}{\hbar}\varepsilon_{x}\varepsilon_{z}P_{y} \ \ \ \ \ (11)

Thus the condition here becomes

\displaystyle  \left[P_{x},L_{z}\right]=-i\hbar P_{y} \ \ \ \ \ (12)

which agrees with the commutation relation we found earlier. By considering the coefficient of {\varepsilon_{y}\varepsilon_{z}}, we arrive at the other condition, which is

\displaystyle  \left[P_{y},L_{z}\right]=i\hbar P_{x} \ \ \ \ \ (13)

The result of this calculation doesn’t tell us anything new about the translation or rotation operators, but it does show that the condition 7 is consistent with what we already know about the commutators of position, momentum and angular momentum.

As Shankar points out, we might think that we need to verify the conditions for an infinite number of combinations of rotations and translations, since each such combination gives rise to a different overall transformation. He says that it has actually been shown that the example above is sufficient to guarantee that all such combinations do in fact give valid results, although he doesn’t give the details. We are, however, given the exercise of verifying this claim for one special case, which we’ll consider now.

In this example, we’ll consider the same four transformations, in the same order, as above except that we’ll take the translation to be entirely in the {x} direction so that {\varepsilon_{y}=0}. This time, we’ll retain terms up to {\varepsilon_{x}\varepsilon_{z}^{2}} and see what we get. We start by repeating the calculations in 1 through 6. However, because we’re saving higher order terms, we need to represent the infinitesimal rotations by

\displaystyle  R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)=\left[\begin{array}{cc} 1-\frac{\varepsilon_{z}^{2}}{2} & -\varepsilon_{z}\\ \varepsilon_{z} & 1-\frac{\varepsilon_{z}^{2}}{2} \end{array}\right] \ \ \ \ \ (14)

That is, we’re approximating {\cos\varepsilon_{z}} by the first two terms in its expansion. Using this, we have

\displaystyle   \left[\begin{array}{c} x\\ y \end{array}\right] \displaystyle  {\longrightarrow\atop T\left(\boldsymbol{\varepsilon}\right)} \displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}\\ y \end{array}\right]\ \ \ \ \ (15)
\displaystyle  \left[\begin{array}{c} x+\varepsilon_{x}\\ y \end{array}\right] \displaystyle  {\longrightarrow\atop R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)} \displaystyle  \left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]\ \ \ \ \ (16)
\displaystyle  \left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \displaystyle  {\longrightarrow\atop T\left(-\boldsymbol{\varepsilon}\right)} \displaystyle  \left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\varepsilon_{x}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z} \end{array}\right]\ \ \ \ \ (18)
\displaystyle  \left[\begin{array}{c} x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \displaystyle  {\longrightarrow\atop R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)} \displaystyle  \left[\begin{array}{c} \left[x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\right]\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left[y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z}\right]\varepsilon_{z}\\ \left[y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z}\right]\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-\left[x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\right]\varepsilon_{z} \end{array}\right]\ \ \ \ \ (19)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} x\left(1+\frac{\varepsilon_{z}^{4}}{4}\right)+\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}+\frac{1}{4}\varepsilon_{x}\varepsilon_{z}^{4}\\ y\left(1+\frac{\varepsilon_{z}^{4}}{4}\right)+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (20)

To get the last line, I used Maple to do the algebra in multiplying out the terms. At this point, we can neglect the terms in {\varepsilon_{z}^{4}}, leaving us with the overall transformation:

\displaystyle  \left[\begin{array}{c} x\\ y \end{array}\right]\longrightarrow\left[\begin{array}{c} x+\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (21)

This is equivalent to a translation by {\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}} (with {\boldsymbol{\varepsilon}=\varepsilon_{x}\hat{\mathbf{x}}} still denoting the original translation), so by analogy with 7, we have the condition

\displaystyle  U\left[R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(-\boldsymbol{\varepsilon}\right)U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(\boldsymbol{\varepsilon}\right)=T\left(\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}\right) \ \ \ \ \ (22)
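For anyone without Maple, the algebra leading to 20 and 21 can be checked with exact rational arithmetic. This is a minimal Python sketch using the standard `fractions` module; the function names and the sample numbers are my own, and the rotation used is the truncated matrix 14, not an exact rotation.

```python
from fractions import Fraction

def rot_truncated(p, ez):
    """Apply the truncated rotation matrix of equation 14 to the point p."""
    c = 1 - ez**2 / 2
    return (c * p[0] - ez * p[1], ez * p[0] + c * p[1])

def sequence(x, y, ex, ez):
    """T(ex) along x, R(ez), T(-ex), R(-ez), as in equations 15 to 19."""
    p = (x + ex, y)
    p = rot_truncated(p, ez)
    p = (p[0] - ex, p[1])
    return rot_truncated(p, -ez)

def equation_20(x, y, ex, ez):
    """Right-hand side of equation 20, before the ez**4 terms are dropped."""
    return (x * (1 + ez**4 / 4) + ex * ez**2 / 2 + ex * ez**4 / 4,
            y * (1 + ez**4 / 4) + ex * ez)

# Arbitrary rational sample values, chosen only for illustration.
x, y = Fraction(3, 7), Fraction(-2, 5)
ex, ez = Fraction(1, 100), Fraction(1, 50)
print(sequence(x, y, ex, ez) == equation_20(x, y, ex, ez))
```

Because `Fraction` arithmetic is exact, agreement here is a term-by-term check of 20 for these sample values rather than a floating-point approximation.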

To expand the operators on the LHS and retain terms up to {\varepsilon_{x}\varepsilon_{z}^{2}}, we need to expand the rotation operators up to order {\varepsilon_{z}^{2}}. Treating the rotation operator as an exponential, this expansion is

\displaystyle  U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}+\ldots \ \ \ \ \ (23)

Using this approximation gives us

\displaystyle   \left(I+\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}\right)\left[I+\frac{i}{\hbar}\varepsilon_{x}P_{x}\right]\left(I-\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}\right)\left[I-\frac{i}{\hbar}\varepsilon_{x}P_{x}\right] \displaystyle  = \displaystyle  I-\frac{i}{\hbar}\left(\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}P_{x}+\varepsilon_{x}\varepsilon_{z}P_{y}\right) \ \ \ \ \ (24)

By equating the coefficients of {\varepsilon_{x}\varepsilon_{z}} we regain 12, so that condition checks out.

Extracting the coefficient of {\varepsilon_{x}\varepsilon_{z}^{2}} on the LHS gives

\displaystyle   \frac{i^{3}}{\hbar^{3}}\varepsilon_{x}\varepsilon_{z}^{2}\left(-L_{z}P_{x}L_{z}+\frac{L_{z}^{2}P_{x}}{2}-\frac{L_{z}^{2}P_{x}}{2}+\frac{P_{x}L_{z}^{2}}{2}-\frac{L_{z}^{2}P_{x}}{2}+L_{z}^{2}P_{x}\right) \displaystyle  = \displaystyle  \frac{i^{3}}{\hbar^{3}}\varepsilon_{x}\varepsilon_{z}^{2}\left(-L_{z}P_{x}L_{z}+\frac{L_{z}^{2}P_{x}}{2}+\frac{P_{x}L_{z}^{2}}{2}\right) \ \ \ \ \ (25)

Matching this to the {\varepsilon_{x}\varepsilon_{z}^{2}} term on the RHS of 24, we get the condition specified in Shankar’s problem:

\displaystyle  -2L_{z}P_{x}L_{z}+L_{z}^{2}P_{x}+P_{x}L_{z}^{2}=\hbar^{2}P_{x} \ \ \ \ \ (26)

We can show that this condition reduces to the already-known commutators by using the identity

\displaystyle   \left[\Lambda,\left[\Lambda,\Omega\right]\right] \displaystyle  = \displaystyle  \Lambda\left(\Lambda\Omega-\Omega\Lambda\right)-\left(\Lambda\Omega-\Omega\Lambda\right)\Lambda\ \ \ \ \ (27)
\displaystyle  \displaystyle  = \displaystyle  -2\Lambda\Omega\Lambda+\Lambda^{2}\Omega+\Omega\Lambda^{2} \ \ \ \ \ (28)

Applying this to 26 we have

\displaystyle   -2L_{z}P_{x}L_{z}+L_{z}^{2}P_{x}+P_{x}L_{z}^{2} \displaystyle  = \displaystyle  \left[L_{z},\left[L_{z},P_{x}\right]\right]\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  i\hbar\left[L_{z},P_{y}\right]\ \ \ \ \ (30)
\displaystyle  \displaystyle  = \displaystyle  i\hbar\left(-i\hbar P_{x}\right)\ \ \ \ \ (31)
\displaystyle  \displaystyle  = \displaystyle  \hbar^{2}P_{x} \ \ \ \ \ (32)

Thus the more complicated condition 26 actually reduces to existing commutators.
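Condition 26 can also be tested directly by letting the operators act on a polynomial. The sketch below is my own illustration, not Shankar's method: it assumes units with {\hbar=1}, stores a polynomial in {x} and {y} as a dictionary of coefficients, and uses the coordinate-space forms {P_{x}=-i\hbar\partial/\partial x} and {L_{z}=-i\hbar\left(x\partial/\partial y-y\partial/\partial x\right)}. All helper names are hypothetical.

```python
# A polynomial is a dict {(i, j): coeff}, meaning coeff * x**i * y**j.
HBAR = 1  # assumption: units in which hbar = 1

def d_dx(p):
    return {(i - 1, j): c * i for (i, j), c in p.items() if i > 0}

def d_dy(p):
    return {(i, j - 1): c * j for (i, j), c in p.items() if j > 0}

def times_x(p):
    return {(i + 1, j): c for (i, j), c in p.items()}

def times_y(p):
    return {(i, j + 1): c for (i, j), c in p.items()}

def scale(a, p):
    """Multiply a polynomial by the number a, dropping zero terms."""
    return {k: a * c for k, c in p.items() if a * c != 0}

def add(p, q):
    """Add two polynomials, dropping terms that cancel."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def Px(p):
    return scale(-1j * HBAR, d_dx(p))

def Lz(p):
    return scale(-1j * HBAR, add(times_x(d_dy(p)), scale(-1, times_y(d_dx(p)))))

def lhs_26(p):
    """-2 Lz Px Lz + Lz^2 Px + Px Lz^2 acting on p."""
    return add(add(scale(-2, Lz(Px(Lz(p)))), Lz(Lz(Px(p)))), Px(Lz(Lz(p))))

def rhs_26(p):
    """hbar^2 Px acting on p."""
    return scale(HBAR**2, Px(p))

f = {(2, 1): 1, (0, 3): 2, (1, 1): -1}  # sample: x^2 y + 2 y^3 - x y
print(lhs_26(f) == rhs_26(f))
```

All coefficients here are small integers times powers of {i}, so the comparison is exact. A single test polynomial is of course only a spot check, not a proof.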

Rotations through a finite angle; use of polar coordinates

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.3.

The angular momentum operator {L_{z}} is the generator of rotations in the {xy} plane. We did the derivation for infinitesimal rotations, but we can generalize this to finite rotations in a similar manner to that used for translations. The unitary transformation for an infinitesimal rotation is

\displaystyle  U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (1)

For rotation through a finite angle {\phi_{0}}, we divide up the angle into {N} small angles, so {\varepsilon_{z}=\phi_{0}/N}. Rotation through the full angle {\phi_{0}} is then given by

\displaystyle  U\left[R\left(\phi_{0}\hat{\mathbf{z}}\right)\right]=\lim_{N\rightarrow\infty}\left(I-\frac{i\phi_{0}L_{z}}{N\hbar}\right)^{N}=e^{-i\phi_{0}L_{z}/\hbar} \ \ \ \ \ (2)

The limit follows because the only non-trivial operator involved is {L_{z}}, so no commutation problems arise.
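To get a feel for how the limit in 2 converges, here is a small numerical sketch. As a simplifying assumption that is not part of the derivation above, it replaces {L_{z}} by a plain number {m\hbar}, as would be the case when the operator acts on a state with a definite value of {L_{z}}; the function names and the values of {m}, {\phi_{0}} and {N} are illustrative only.

```python
import cmath
import math

def product_form(m, phi0, n):
    """(1 - i*phi0*m/n)**n, with L_z replaced by the number m*hbar."""
    return (1 - 1j * phi0 * m / n) ** n

def exact_form(m, phi0):
    """exp(-i*phi0*m), the claimed limit."""
    return cmath.exp(-1j * phi0 * m)

m, phi0 = 2, math.pi / 3
for n in (10, 1000, 10**6):
    print(n, abs(product_form(m, phi0, n) - exact_form(m, phi0)))
```

The printed differences should shrink as {N} grows, consistent with the product of {N} small rotations approaching the exponential.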

In rectangular coordinates, {L_{z}} has the relatively non-obvious form

\displaystyle   L_{z} \displaystyle  = \displaystyle  XP_{y}-YP_{x}\ \ \ \ \ (3)
\displaystyle  \displaystyle  = \displaystyle  -i\hbar\left(x\frac{\partial}{\partial y}-y\frac{\partial}{\partial x}\right) \ \ \ \ \ (4)

so it’s not immediately clear that 2 does in fact lead to the desired rotation. Trying to calculate the exponential with {L_{z}} expressed this way is not easy, given that the two terms {x\frac{\partial}{\partial y}} and {y\frac{\partial}{\partial x}} don’t commute.

It turns out that {L_{z}} has a much simpler form in polar coordinates, and there are two ways of converting it to polar form. First, we recall the transformation equations.

\displaystyle   x \displaystyle  = \displaystyle  \rho\cos\phi\ \ \ \ \ (5)
\displaystyle  y \displaystyle  = \displaystyle  \rho\sin\phi\ \ \ \ \ (6)
\displaystyle  \rho \displaystyle  = \displaystyle  \sqrt{x^{2}+y^{2}}\ \ \ \ \ (7)
\displaystyle  \phi \displaystyle  = \displaystyle  \tan^{-1}\frac{y}{x} \ \ \ \ \ (8)

From the chain rule, we can convert the derivatives:

\displaystyle   \frac{\partial}{\partial x} \displaystyle  = \displaystyle  \frac{\partial\rho}{\partial x}\frac{\partial}{\partial\rho}+\frac{\partial\cos\phi}{\partial x}\frac{\partial}{\partial\left(\cos\phi\right)}\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \frac{\partial\rho}{\partial x}\frac{\partial}{\partial\rho}-\sin\phi\frac{\partial\phi}{\partial x}\frac{\partial}{\left(-\sin\phi\right)\partial\phi}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \frac{x}{\rho}\frac{\partial}{\partial\rho}-\sin\phi\frac{-y/x^{2}}{1+y^{2}/x^{2}}\left(\frac{-1}{\sin\phi}\right)\frac{\partial}{\partial\phi}\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \frac{x}{\rho}\frac{\partial}{\partial\rho}-\frac{y}{\rho^{2}}\frac{\partial}{\partial\phi} \ \ \ \ \ (12)

Using similar methods, we get for the other derivative

\displaystyle   \frac{\partial}{\partial y} \displaystyle  = \displaystyle  \frac{\partial\rho}{\partial y}\frac{\partial}{\partial\rho}+\frac{\partial\sin\phi}{\partial y}\frac{\partial}{\partial\left(\sin\phi\right)}\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  \frac{y}{\rho}\frac{\partial}{\partial\rho}+\frac{x}{\rho^{2}}\frac{\partial}{\partial\phi} \ \ \ \ \ (14)

Plugging these into 4 we have

\displaystyle   L_{z} \displaystyle  = \displaystyle  -i\hbar\left[x\left(\frac{y}{\rho}\frac{\partial}{\partial\rho}+\frac{x}{\rho^{2}}\frac{\partial}{\partial\phi}\right)-y\left(\frac{x}{\rho}\frac{\partial}{\partial\rho}-\frac{y}{\rho^{2}}\frac{\partial}{\partial\phi}\right)\right]\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  -i\hbar\frac{x^{2}+y^{2}}{\rho^{2}}\frac{\partial}{\partial\phi}\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  -i\hbar\frac{\partial}{\partial\phi} \ \ \ \ \ (17)
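The equivalence of 4 and 17 can be spot-checked with finite differences. In this sketch (my own, with an arbitrary test function and hypothetical names) the common factor {-i\hbar} is dropped, so we compare {x\,\partial f/\partial y-y\,\partial f/\partial x} with {\partial f/\partial\phi} at a sample point. The angle is obtained from `math.atan2` rather than {\tan^{-1}\left(y/x\right)} so that all quadrants are handled.

```python
import math

def sample(x, y):
    """Arbitrary smooth test function, chosen only for illustration."""
    return x**2 * y + math.exp(-x) * math.sin(y)

def cartesian_form(f, x, y, h=1e-5):
    """x df/dy - y df/dx by central differences."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return x * dfdy - y * dfdx

def polar_form(f, x, y, h=1e-5):
    """df/dphi at fixed rho by central differences."""
    rho, phi = math.hypot(x, y), math.atan2(y, x)

    def g(angle):
        return f(rho * math.cos(angle), rho * math.sin(angle))

    return (g(phi + h) - g(phi - h)) / (2 * h)

print(cartesian_form(sample, 0.8, -0.5), polar_form(sample, 0.8, -0.5))
```

The two printed numbers should agree to within the accuracy of the finite-difference approximation.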

Another method of converting {L_{z}} to polar coordinates is to consider the effect of {U\left[R\right]} for an infinitesimal rotation {\varepsilon_{z}} on a state vector expressed in polar coordinates {\psi\left(\rho,\phi\right)}. Shankar states that

\displaystyle  \left\langle \rho,\phi\left|U\left[R\right]\right|\psi\left(\rho,\phi\right)\right\rangle =\psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (18)

If you don’t believe this, it can be shown using a method similar to that for the one-dimensional translation. In this case, we’re dealing with position eigenkets in polar coordinates, so we have

\displaystyle  U\left[R\right]\left|\rho,\phi\right\rangle =\left|\rho,\phi+\varepsilon_{z}\right\rangle \ \ \ \ \ (19)

Applying this, we get

\displaystyle   \left|\psi_{\varepsilon_{z}}\right\rangle \displaystyle  = \displaystyle  U\left[R\right]\left|\psi\right\rangle \ \ \ \ \ (20)
\displaystyle  \displaystyle  = \displaystyle  U\left[R\right]\int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho,\phi\right\rangle \left\langle \rho,\phi\left|\psi\right.\right\rangle \rho d\rho\;d\phi\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  \int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho,\phi+\varepsilon_{z}\right\rangle \left\langle \rho,\phi\left|\psi\right.\right\rangle \rho d\rho\;d\phi\ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  \int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho^{\prime},\phi^{\prime}\right\rangle \left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime} \ \ \ \ \ (23)

where in the last line, we used the substitution {\phi^{\prime}=\phi+\varepsilon_{z}}. (The substitution {\rho^{\prime}=\rho} is used just to give the radial variable a different name in the integrand.) We can use the same limits of integration for {\phi} and {\phi^{\prime}}, since we just need to ensure that the integral covers the total range of angles. It then follows that

\displaystyle   \left\langle \rho,\phi\left|\psi_{\varepsilon_{z}}\right.\right\rangle \displaystyle  = \displaystyle  \int_{0}^{2\pi}\int_{0}^{\infty}\left\langle \rho,\phi\left|\rho^{\prime},\phi^{\prime}\right.\right\rangle \left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime}\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  \int_{0}^{2\pi}\int_{0}^{\infty}\frac{1}{\rho^{\prime}}\delta\left(\rho-\rho^{\prime}\right)\delta\left(\phi-\phi^{\prime}\right)\left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime}\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  \psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (26)

Combining this with 1 we have

\displaystyle  \left\langle \rho,\phi\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle =\psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (27)

Expanding the RHS to order {\varepsilon_{z}} we have

\displaystyle  \left\langle \rho,\phi\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle =\psi\left(\rho,\phi\right)-\varepsilon_{z}\frac{\partial\psi}{\partial\phi} \ \ \ \ \ (28)

from which 17 follows again.

Once we have {L_{z}} in this form, the exponential form of a finite rotation is easier to interpret, for we have, from 2

\displaystyle   e^{-i\phi_{0}L_{z}/\hbar} \displaystyle  = \displaystyle  \exp\left[-\phi_{0}\frac{\partial}{\partial\phi}\right]\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  1-\phi_{0}\frac{\partial}{\partial\phi}+\frac{\phi_{0}^{2}}{2!}\frac{\partial^{2}}{\partial\phi^{2}}+\ldots \ \ \ \ \ (30)

Applying this to a state function {\psi\left(\rho,\phi\right)}, we see that we get the Taylor series for {\psi\left(\rho,\phi-\phi_{0}\right)}, so the exponential does indeed represent a rotation through a finite angle.
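The Taylor-series statement can be illustrated on a concrete function. As an assumption made purely for the example, take the angular dependence to be {e^{im\phi}}, whose {n}th derivative with respect to {\phi} is {\left(im\right)^{n}e^{im\phi}}. The sketch below sums the series 30 term by term and compares it with the shifted function; the names and sample values are hypothetical.

```python
import cmath
import math

def shift_by_series(m, phi, phi0, terms=40):
    """Sum of (-phi0)**n / n! * d^n/dphi^n applied to exp(i*m*phi)."""
    psi = cmath.exp(1j * m * phi)
    total = sum((-phi0) ** n * (1j * m) ** n / math.factorial(n)
                for n in range(terms))
    return total * psi

def shifted_exact(m, phi, phi0):
    """exp(i*m*(phi - phi0)), the function rotated through phi0."""
    return cmath.exp(1j * m * (phi - phi0))

print(abs(shift_by_series(3, 0.4, 0.8) - shifted_exact(3, 0.4, 0.8)))
```

The printed difference should be at the level of floating-point rounding, since 40 terms are far more than needed for these sample values.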

Rotational transformations using passive transformations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.2.

We can also derive the generator of rotations {L_{z}} by considering passive transformations of the position and momentum operators, in a way similar to that used for deriving the generator of translations. In a passive transformation, the operators are modified while the state vectors remain the same. For an infinitesimal rotation {\varepsilon_{z}\hat{\mathbf{z}}} about the {z} axis in two dimensions, the unitary operator has the form

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (1)


For a finite rotation by {\phi_{0}\hat{\mathbf{z}}} the transformations are given by

\displaystyle \left\langle X\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \cos\phi_{0}-\left\langle Y\right\rangle \sin\phi_{0}\ \ \ \ \ (2)
\displaystyle \left\langle Y\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \sin\phi_{0}+\left\langle Y\right\rangle \cos\phi_{0}\ \ \ \ \ (3)
\displaystyle \left\langle P_{x}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \cos\phi_{0}-\left\langle P_{y}\right\rangle \sin\phi_{0}\ \ \ \ \ (4)
\displaystyle \left\langle P_{y}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \sin\phi_{0}+\left\langle P_{y}\right\rangle \cos\phi_{0} \ \ \ \ \ (5)

For the infinitesimal transformation, {\phi_{0}=\varepsilon_{z}} and these equations reduce to

\displaystyle \left\langle X\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle -\left\langle Y\right\rangle \varepsilon_{z}\ \ \ \ \ (6)
\displaystyle \left\langle Y\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \varepsilon_{z}+\left\langle Y\right\rangle \ \ \ \ \ (7)
\displaystyle \left\langle P_{x}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle -\left\langle P_{y}\right\rangle \varepsilon_{z}\ \ \ \ \ (8)
\displaystyle \left\langle P_{y}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \varepsilon_{z}+\left\langle P_{y}\right\rangle \ \ \ \ \ (9)

In the passive transformation scheme, we move the transformation to the operators to get

\displaystyle U^{\dagger}\left[R\right]XU\left[R\right] \displaystyle = \displaystyle X-Y\varepsilon_{z}\ \ \ \ \ (10)
\displaystyle U^{\dagger}\left[R\right]YU\left[R\right] \displaystyle = \displaystyle X\varepsilon_{z}+Y\ \ \ \ \ (11)
\displaystyle U^{\dagger}\left[R\right]P_{x}U\left[R\right] \displaystyle = \displaystyle P_{x}-P_{y}\varepsilon_{z}\ \ \ \ \ (12)
\displaystyle U^{\dagger}\left[R\right]P_{y}U\left[R\right] \displaystyle = \displaystyle P_{x}\varepsilon_{z}+P_{y} \ \ \ \ \ (13)

Substituting 1 into these equations gives us the commutation relations satisfied by {L_{z}}. For example, in the first equation we have

\displaystyle U^{\dagger}\left[R\right]XU\left[R\right] \displaystyle = \displaystyle \left(I+\frac{i\varepsilon_{z}L_{z}}{\hbar}\right)X\left(I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right)\ \ \ \ \ (14)
\displaystyle \displaystyle = \displaystyle X+\frac{i\varepsilon_{z}}{\hbar}\left(L_{z}X-XL_{z}\right)\ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle X-Y\varepsilon_{z} \ \ \ \ \ (16)

Equating the last two lines, we get

\displaystyle \left[X,L_{z}\right]=-i\hbar Y \ \ \ \ \ (17)


Similarly, for the other three equations we get

\displaystyle \left[Y,L_{z}\right] \displaystyle = \displaystyle i\hbar X\ \ \ \ \ (18)
\displaystyle \left[P_{x},L_{z}\right] \displaystyle = \displaystyle -i\hbar P_{y}\ \ \ \ \ (19)
\displaystyle \left[P_{y},L_{z}\right] \displaystyle = \displaystyle i\hbar P_{x} \ \ \ \ \ (20)

We can use these commutation relations to derive the form of {L_{z}} by using the commutation relations for coordinates and momenta:

\displaystyle \left[X,P_{x}\right]=\left[Y,P_{y}\right]=i\hbar \ \ \ \ \ (21)

with all other commutators involving {X,Y,P_{x}} and {P_{y}} being zero. Starting with 17, we see that

\displaystyle \left[X,L_{z}\right]=-\left[X,P_{x}\right]Y \ \ \ \ \ (22)

We can therefore deduce that

\displaystyle L_{z}=-P_{x}Y+f\left(X,Y,P_{y}\right) \ \ \ \ \ (23)


where {f} is some unknown function. We must include {f} since the commutators of {X} with {X,Y} and {P_{y}} are all zero, so adding on {f} still satisfies 17. (You can think of it as similar to adding on the constant in an indefinite integral.)

Now from 18, we have

\displaystyle \left[Y,L_{z}\right]=\left[Y,P_{y}\right]X \ \ \ \ \ (24)

so combining this with 23 we have

\displaystyle L_{z}=-P_{x}Y+P_{y}X+g\left(X,Y\right) \ \ \ \ \ (25)


The undetermined function is now a function only of {X} and {Y}, since the dependence of {L_{z}} on {P_{x}} and {P_{y}} has been determined uniquely by the commutators 17 and 18.

From 19 we have

\displaystyle \left[P_{x},L_{z}\right]=\left[P_{x},X\right]P_{y} \ \ \ \ \ (26)

We can see that this is satisfied already by 25, except that we now know that the function {g} cannot depend on {X}, since then {\left[P_{x},g\right]\ne0}. Thus we have narrowed down {L_{z}} to

\displaystyle L_{z}=-P_{x}Y+P_{y}X+h\left(Y\right) \ \ \ \ \ (27)


Finally, from 20 we have

\displaystyle \left[P_{y},L_{z}\right]=-\left[P_{y},Y\right]P_{x} \ \ \ \ \ (28)

This is satisfied by 27 if we take {h=0} (well, technically, we could take {h} to be some constant, but we might as well take the constant to be zero), giving us the final form for {L_{z}}:

\displaystyle L_{z}=-P_{x}Y+P_{y}X \ \ \ \ \ (29)
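As a consistency check, we can confirm that the operator 29 reproduces the four commutators 17 to 20 when acting on a test polynomial. This is a minimal sketch of my own, assuming units with {\hbar=1} and the usual coordinate-space forms {P_{x}=-i\hbar\partial/\partial x}, {P_{y}=-i\hbar\partial/\partial y}; polynomials are stored as dictionaries of coefficients and all helper names are hypothetical.

```python
# A polynomial is a dict {(i, j): coeff}, meaning coeff * x**i * y**j.
HBAR = 1  # assumption: units in which hbar = 1

def scale(a, p):
    """Multiply a polynomial by the number a, dropping zero terms."""
    return {k: a * c for k, c in p.items() if a * c != 0}

def add(p, q):
    """Add two polynomials, dropping terms that cancel."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def X(p):
    return {(i + 1, j): c for (i, j), c in p.items()}

def Y(p):
    return {(i, j + 1): c for (i, j), c in p.items()}

def Px(p):
    return scale(-1j * HBAR, {(i - 1, j): c * i for (i, j), c in p.items() if i > 0})

def Py(p):
    return scale(-1j * HBAR, {(i, j - 1): c * j for (i, j), c in p.items() if j > 0})

def Lz(p):
    """Equation 29: L_z = -P_x Y + P_y X."""
    return add(scale(-1, Px(Y(p))), Py(X(p)))

def comm(A, B, p):
    """[A, B] acting on the polynomial p."""
    return add(A(B(p)), scale(-1, B(A(p))))

f = {(2, 1): 1, (0, 3): 2, (1, 1): -1}  # sample: x^2 y + 2 y^3 - x y
print(comm(X, Lz, f) == scale(-1j * HBAR, Y(f)))   # equation 17
print(comm(Y, Lz, f) == scale(1j * HBAR, X(f)))    # equation 18
print(comm(Px, Lz, f) == scale(-1j * HBAR, Py(f))) # equation 19
print(comm(Py, Lz, f) == scale(1j * HBAR, Px(f)))  # equation 20
```

Each line compares the two sides of one commutation relation exactly, since the coefficients involved are small integers times powers of {i}.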

Rotational invariance in two dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.1.

As a first look at rotational invariance in quantum mechanics, we’ll look at two-dimensional rotations about the {z} axis. Classically, a rotation by an angle {\phi_{0}} about the {z} axis is given by the matrix equation for the coordinates

\displaystyle \left[\begin{array}{c} \bar{x}\\ \bar{y} \end{array}\right]=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right]\left[\begin{array}{c} x\\ y \end{array}\right] \ \ \ \ \ (1)

The momenta transform the same way, since we are merely changing the direction of the {x} and {y} axes. Thus we have also

\displaystyle \left[\begin{array}{c} \bar{p}_{x}\\ \bar{p}_{y} \end{array}\right]=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right]\left[\begin{array}{c} p_{x}\\ p_{y} \end{array}\right] \ \ \ \ \ (2)

The rotation matrix can be written as an operator, defined as

\displaystyle R\left(\phi_{0}\hat{\mathbf{z}}\right)=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right] \ \ \ \ \ (3)

In quantum mechanics, due to the uncertainty principle we cannot specify position and momentum precisely at the same time, so as with the case of translational invariance, we deal with expectation values. As usual, a rotation is represented by a unitary operator {U\left[R\left(\phi_{0}\hat{\mathbf{z}}\right)\right]} so that a quantum state transforms according to

\displaystyle \left|\psi\right\rangle \rightarrow\left|\psi_{R}\right\rangle =U\left[R\right]\left|\psi\right\rangle \ \ \ \ \ (4)

Dealing with expectation values means that the rotation operator must satisfy

\displaystyle \left\langle X\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \cos\phi_{0}-\left\langle Y\right\rangle \sin\phi_{0}\ \ \ \ \ (5)
\displaystyle \left\langle Y\right\rangle _{R} \displaystyle = \displaystyle \left\langle X\right\rangle \sin\phi_{0}+\left\langle Y\right\rangle \cos\phi_{0}\ \ \ \ \ (6)
\displaystyle \left\langle P_{x}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \cos\phi_{0}-\left\langle P_{y}\right\rangle \sin\phi_{0}\ \ \ \ \ (7)
\displaystyle \left\langle P_{y}\right\rangle _{R} \displaystyle = \displaystyle \left\langle P_{x}\right\rangle \sin\phi_{0}+\left\langle P_{y}\right\rangle \cos\phi_{0} \ \ \ \ \ (8)

The expectation values on the LHS of these equations are calculated using the rotated state, so that

\displaystyle \left\langle X\right\rangle _{R}=\left\langle \psi_{R}\left|X\right|\psi_{R}\right\rangle \ \ \ \ \ (9)

and so on.

In two dimensions, the position eigenkets depend on the two independent coordinates {x} and {y}, and each of these eigenkets transforms under rotation in the same way as the position variables above. Operating on such an eigenket with the unitary rotation operator must therefore give

\displaystyle U\left[R\right]\left|x,y\right\rangle =\left|x\cos\phi_{0}-y\sin\phi_{0},x\sin\phi_{0}+y\cos\phi_{0}\right\rangle \ \ \ \ \ (10)


As with the translation operator, we try to construct an explicit form for {U\left[R\right]} by considering an infinitesimal rotation {\varepsilon_{z}\hat{\mathbf{z}}} about the {z} axis. We propose that the unitary operator for this rotation is given by

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (11)


where {L_{z}} is, at this stage, an unknown operator called the generator of infinitesimal rotations (although, as the notation suggests, it will turn out to be the {z} component of angular momentum). Under this rotation, we have, to first order in {\varepsilon_{z}}:

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle =\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \ \ \ \ \ (12)

Note that we’ve omitted a possible phase factor in this rotation. That is, we could have written

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle =e^{i\varepsilon_{z}g\left(x,y\right)/\hbar}\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \ \ \ \ \ (13)

for some real function {g\left(x,y\right)}. Dropping the phase factor has the effect of making the momentum expectation values transform in the same way as the position expectation values, as shown by Shankar in his equation 12.2.13, so we’ll just take the phase factor to be 1 from now on.

We can now find the position space form of a general state vector {\left|\psi\right\rangle } under an infinitesimal rotation by following a similar procedure to that for a translation.

We have

\displaystyle \left|\psi_{\varepsilon_{z}}\right\rangle \displaystyle = \displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|\psi\right\rangle \ \ \ \ \ (14)
\displaystyle \displaystyle = \displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x,y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy\ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy\ \ \ \ \ (16)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy \ \ \ \ \ (17)

We can now change integration variables if we define

\displaystyle x^{\prime} \displaystyle \equiv \displaystyle x-y\varepsilon_{z}\ \ \ \ \ (18)
\displaystyle y^{\prime} \displaystyle \equiv \displaystyle x\varepsilon_{z}+y \ \ \ \ \ (19)

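The area element is unchanged to first order in infinitesimal quantities, since the Jacobian determinant of this transformation is {1+\varepsilon_{z}^{2}}. Keeping only the terms that contribute to {dx^{\prime}\;dy^{\prime}} at first order, we can write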

\displaystyle dx^{\prime} \displaystyle = \displaystyle dx-\varepsilon_{z}dy=dx\ \ \ \ \ (20)
\displaystyle dy^{\prime} \displaystyle = \displaystyle \varepsilon_{z}dx+dy=dy \ \ \ \ \ (21)

Also, to first order in infinitesimal quantities, we can invert the variables to get

\displaystyle x^{\prime}+\varepsilon_{z}y^{\prime} \displaystyle = \displaystyle x-y\varepsilon_{z}+x\varepsilon_{z}^{2}+y\varepsilon_{z}=x\ \ \ \ \ (22)
\displaystyle y^{\prime}-\varepsilon_{z}x^{\prime} \displaystyle = \displaystyle x\varepsilon_{z}+y-x\varepsilon_{z}+y\varepsilon_{z}^{2}=y \ \ \ \ \ (23)

The ranges of integration are still {\pm\infty}, so we end up with

\displaystyle \left|\psi_{\varepsilon_{z}}\right\rangle =\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x^{\prime},y^{\prime}\right\rangle \left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime} \ \ \ \ \ (24)

Multiplying on the left by the bra {\left\langle x,y\right|} we have

\displaystyle \left\langle x,y\left|\psi_{\varepsilon_{z}}\right.\right\rangle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left\langle x,y\left|x^{\prime},y^{\prime}\right.\right\rangle \left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime}\ \ \ \ \ (25)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\delta\left(x-x^{\prime}\right)\delta\left(y-y^{\prime}\right)\left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime}\ \ \ \ \ (26)
\displaystyle \displaystyle = \displaystyle \left\langle x+\varepsilon_{z}y,y-\varepsilon_{z}x\left|\psi\right.\right\rangle \ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle \psi\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right) \ \ \ \ \ (28)

This can now be expanded in a 2-variable Taylor series to give, to first order in {\varepsilon_{z}}:

\displaystyle \psi\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right)=\psi\left(x,y\right)+y\varepsilon_{z}\frac{\partial\psi}{\partial x}-x\varepsilon_{z}\frac{\partial\psi}{\partial y} \ \ \ \ \ (29)


We can compare this with 11 inserted into 14:

\displaystyle \left\langle x,y\left|\psi_{\varepsilon_{z}}\right.\right\rangle \displaystyle = \displaystyle \left\langle x,y\left|U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\right|\psi\right\rangle \ \ \ \ \ (30)
\displaystyle \displaystyle = \displaystyle \left\langle x,y\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle \ \ \ \ \ (31)
\displaystyle \displaystyle = \displaystyle \psi\left(x,y\right)-\frac{i\varepsilon_{z}}{\hbar}\left\langle x,y\left|L_{z}\right|\psi\right\rangle \ \ \ \ \ (32)

Setting 32 equal to 29 we have

\displaystyle -\frac{i\varepsilon_{z}}{\hbar}\left\langle x,y\left|L_{z}\right|\psi\right\rangle \displaystyle = \displaystyle y\varepsilon_{z}\frac{\partial\psi}{\partial x}-x\varepsilon_{z}\frac{\partial\psi}{\partial y}\ \ \ \ \ (33)
\displaystyle \left\langle x,y\left|L_{z}\right|\psi\right\rangle \displaystyle = \displaystyle x\left(-i\hbar\frac{\partial\psi}{\partial y}\right)-y\left(-i\hbar\frac{\partial\psi}{\partial x}\right) \ \ \ \ \ (34)

Using the position-space forms of the momenta

\displaystyle P_{x} \displaystyle = \displaystyle -i\hbar\frac{\partial}{\partial x}\ \ \ \ \ (35)
\displaystyle P_{y} \displaystyle = \displaystyle -i\hbar\frac{\partial}{\partial y} \ \ \ \ \ (36)

we see that {L_{z}} is given by

\displaystyle L_{z}=XP_{y}-YP_{x} \ \ \ \ \ (37)

which is the quantum equivalent of the {z} component of angular momentum, as promised.
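The first-order argument above can be checked numerically. The sketch below compares the exactly rotated wave function 28 with the generator form 32, using 34 for the action of {L_{z}}. The test wave function, the sample point and {\hbar=1} are arbitrary assumptions chosen for illustration; any smooth function with known derivatives would do.

```python
import cmath

HBAR = 1.0  # assumed units

def psi(x, y):
    """Arbitrary smooth test wave function (an assumption, not from the text)."""
    return cmath.exp(-(x - 1.0) ** 2 - 0.5 * (y + 0.5) ** 2 + 0.7j * x)

def dpsi_dx(x, y):
    return (-2.0 * (x - 1.0) + 0.7j) * psi(x, y)

def dpsi_dy(x, y):
    return -(y + 0.5) * psi(x, y)

def Lz_psi(x, y):
    """Equation 34: x (-i hbar d/dy) psi - y (-i hbar d/dx) psi."""
    return x * (-1j * HBAR * dpsi_dy(x, y)) - y * (-1j * HBAR * dpsi_dx(x, y))

def rotated_exact(x, y, eps):
    """Equation 28: psi(x + eps*y, y - eps*x)."""
    return psi(x + eps * y, y - eps * x)

def rotated_first_order(x, y, eps):
    """Equation 32: psi - (i eps / hbar) L_z psi."""
    return psi(x, y) - 1j * eps / HBAR * Lz_psi(x, y)

x0, y0 = 0.4, -0.3
errors = [abs(rotated_exact(x0, y0, e) - rotated_first_order(x0, y0, e))
          for e in (1e-2, 1e-3)]
ratio = errors[0] / errors[1]
print(errors, ratio)
```

Reducing {\varepsilon_{z}} by a factor of 10 should reduce the discrepancy by a factor of roughly 100, which is what we expect if the two expressions agree through first order and differ only at order {\varepsilon_{z}^{2}}.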

Translation invariance in two dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.1.1.

In preparation for an examination of rotation invariance, we’ll have a look at translational invariance in two dimensions. We can apply much of what we did with translation in one dimension, where we showed that the momentum {P} is the generator of translations. In particular, the translation operator {T\left(\varepsilon\right)} for an infinitesimal translation {\varepsilon} is

\displaystyle  T\left(\varepsilon\right)=I-\frac{i\varepsilon}{\hbar}P \ \ \ \ \ (1)

In two dimensions, we can write an infinitesimal translation as {\boldsymbol{\delta}a} where

\displaystyle  \boldsymbol{\delta}a=\delta a_{x}\hat{\mathbf{x}}+\delta a_{y}\hat{\mathbf{y}} \ \ \ \ \ (2)

In one dimension, we showed earlier that

\displaystyle  \left\langle x\left|T\left(\varepsilon\right)\right|\psi\right\rangle =\psi\left(x-\varepsilon\right) \ \ \ \ \ (3)

The analogous relation in two dimensions is

\displaystyle  \left\langle x,y\left|T\left(\boldsymbol{\delta}a\right)\right|\psi\right\rangle =\psi\left(x-\delta a_{x},y-\delta a_{y}\right) \ \ \ \ \ (4)

We can verify that the correct form for {T\left(\boldsymbol{\delta}a\right)} is

\displaystyle   T\left(\boldsymbol{\delta}a\right) \displaystyle  = \displaystyle  I-\frac{i}{\hbar}\boldsymbol{\delta}a\cdot\mathbf{P}\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  I-\frac{i}{\hbar}\left(\delta a_{x}P_{x}+\delta a_{y}P_{y}\right) \ \ \ \ \ (6)

Using the representation of momentum in the position basis, which is

\displaystyle   P_{x} \displaystyle  = \displaystyle  -i\hbar\frac{\partial}{\partial x}\ \ \ \ \ (7)
\displaystyle  P_{y} \displaystyle  = \displaystyle  -i\hbar\frac{\partial}{\partial y} \ \ \ \ \ (8)

the LHS of 4 is, using {\left\langle x,y\left|\psi\right.\right\rangle =\psi\left(x,y\right)}:

\displaystyle   \left\langle x,y\left|T\left(\boldsymbol{\delta}a\right)\right|\psi\right\rangle \displaystyle  = \displaystyle  \left\langle x,y\left|I-\frac{i}{\hbar}\left(\delta a_{x}P_{x}+\delta a_{y}P_{y}\right)\right|\psi\right\rangle \ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \psi\left(x,y\right)-\delta a_{x}\frac{\partial\psi}{\partial x}-\delta a_{y}\frac{\partial\psi}{\partial y} \ \ \ \ \ (10)

The last line is also what we get if we expand the RHS of 4 to first order in {\boldsymbol{\delta}a}, which verifies that 5 is correct, so that the two-dimensional momentum {\mathbf{P}} is the generator of two-dimensional translations.

We can apply the exponentiation technique we used in the one-dimensional case to obtain the translation operator for a finite translation in two dimensions. We need to be careful that we don’t run into problems with non-commuting operators, but in view of 7 and 8 and the fact that derivatives with respect to different independent variables commute, we see that

\displaystyle  \left[P_{x},P_{y}\right]=0 \ \ \ \ \ (11)

We can divide a finite translation {\mathbf{a}} into {N} small steps, each of size {\frac{\mathbf{a}}{N}}, so that the translation is

\displaystyle  T\left(\mathbf{a}\right)=\left(I-\frac{i}{\hbar N}\mathbf{a}\cdot\mathbf{P}\right)^{N} \ \ \ \ \ (12)

Because the two components of momentum commute, we can take the limit of this expression to get the exponential form:

\displaystyle  T\left(\mathbf{a}\right)=\lim_{N\rightarrow\infty}\left(I-\frac{i}{\hbar N}\mathbf{a}\cdot\mathbf{P}\right)^{N}=e^{-i\mathbf{a}\cdot\mathbf{P}/\hbar} \ \ \ \ \ (13)

Again, because the two components of momentum commute, we can combine two translations, by {\mathbf{a}} and then by {\mathbf{b}}, to get

\displaystyle  T\left(\mathbf{b}\right)T\left(\mathbf{a}\right)=e^{-i\mathbf{b}\cdot\mathbf{P}/\hbar}e^{-i\mathbf{a}\cdot\mathbf{P}/\hbar}=e^{-i\left(\mathbf{a}+\mathbf{b}\right)\cdot\mathbf{P}/\hbar}=T\left(\mathbf{b}+\mathbf{a}\right) \ \ \ \ \ (14)
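A simple way to see 12 to 14 at work is to apply them to a plane wave {e^{i\mathbf{k}\cdot\mathbf{r}}}, which is an eigenfunction of {\mathbf{P}} with eigenvalue {\hbar\mathbf{k}}, so that every operator reduces to an ordinary complex number. The following is a minimal sketch under that assumption; the particular vectors and the function names are arbitrary choices for illustration.

```python
import cmath

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def plane_wave(k, r):
    """Momentum eigenfunction exp(i k.r), with momentum hbar*k."""
    return cmath.exp(1j * dot(k, r))

def T_product(a, k, N):
    """Eigenvalue of (I - i a.P / (hbar N))^N on the plane wave (equation 12)."""
    return (1 - 1j * dot(a, k) / N) ** N   # a.P/hbar acts as a.k on this state

def T_exponential(a, k):
    """Eigenvalue of exp(-i a.P / hbar) on the plane wave (equation 13)."""
    return cmath.exp(-1j * dot(a, k))

a, b = (1.5, -0.75), (-0.4, 2.0)
k = (0.8, 1.2)
r = (0.3, -1.1)

# Convergence of the N-step product to the exponential
errors = [abs(T_product(a, k, N) - T_exponential(a, k)) for N in (10, 1000, 100000)]

# The finite translation shifts the argument: T(a) psi(r) = psi(r - a)
shifted = plane_wave(k, (r[0] - a[0], r[1] - a[1]))
shift_error = abs(T_exponential(a, k) * plane_wave(k, r) - shifted)

# Composition (equation 14): T(b) T(a) = T(a + b)
a_plus_b = (a[0] + b[0], a[1] + b[1])
compose_error = abs(T_exponential(b, k) * T_exponential(a, k)
                    - T_exponential(a_plus_b, k))
print(errors, shift_error, compose_error)
```

The entries of `errors` should shrink roughly in proportion to {1/N}, while the shift and composition checks should hold to rounding error.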

Time reversal, antiunitary operators and Wigner’s theorem

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 11, Section 11.5.

Zee, A. (2016), Group Theory in a Nutshell for Physicists. Section IV.6.

Parity is one of the two main discrete symmetries treated in non-relativistic quantum mechanics. The other is time reversal, which we’ll look at here.

First, we’ll have a look at what time reversal symmetry means in classical physics. The idea is that if we can take a snapshot of the system at some time, each particle will have a given position {x} and a given momentum {p}. If we reverse the direction of time at that instant, the particle’s position remains the same, but its momentum reverses. In other words {x\rightarrow x} and {p\rightarrow-p}. Note the difference between time reversal and parity: in a parity operation, both position and momentum get ‘reflected’ into their negative values, while in time reversal, only momentum gets ‘reflected’.

We can see how this works by looking at Newton’s law in the form

\displaystyle  F=m\frac{d^{2}x}{dt^{2}} \ \ \ \ \ (1)

Time reversal invariance is valid if the same equation holds when we reverse the direction of time, that is, we let {t\rightarrow-t}. Since {x\rightarrow x}, the numerator on the RHS is unchanged. For the denominator {t\rightarrow-t} means that {dt\rightarrow-dt} and {\left(dt\right)^{2}\rightarrow\left(-dt\right)^{2}=dt^{2}}, so the acceleration is invariant. Newton’s law is invariant under time reversal provided that the force on the LHS is invariant, which will be the case provided that {F} depends only on {x} and not on {\dot{x}}. This is true for forces such as Newtonian gravity and electrostatics, but is not true for the magnetic force felt by a charge {q} moving through a magnetic field {\mathbf{B}} with velocity {\mathbf{v}}, where the Lorentz force law holds:

\displaystyle  \mathbf{F}=q\mathbf{v}\times\mathbf{B} \ \ \ \ \ (2)

This follows because {\mathbf{v}\rightarrow-\mathbf{v}} so if the field {\mathbf{B}} is the same after time reversal, {\mathbf{F}\rightarrow-\mathbf{F}}. However, because all magnetic fields are produced by the motion of charges, if we expand the time reversal to include the charges giving rise to the magnetic field {\mathbf{B}}, then the motion of all these charges would reverse, which in turn would cause {\mathbf{B}\rightarrow-\mathbf{B}}. Thus if we time-reverse the entire electromagnetic system, the electromagnetic force is invariant under time reversal.

How does time reversal work in quantum mechanics? Shankar considers a particle in one dimension governed by a time-independent Hamiltonian, which obeys the Schrödinger equation, as usual:

\displaystyle  i\hbar\frac{\partial\psi\left(x,t\right)}{\partial t}=H\left(x\right)\psi\left(x,t\right) \ \ \ \ \ (3)

At this point, Shankar states that if we replace {\psi} by its complex conjugate {\psi^*}, we are implementing time reversal, claiming that it is ‘clear’ because {\psi^*} gives the same probability distribution as {\psi}. I cannot find any reason why this should be ‘clear’ from this statement, so let’s try looking at the problem in a bit more detail. The clearest explanation I’ve found is in Zee’s book, referenced above.

In order that the system be invariant under time reversal, we consider the transformation {t\rightarrow t^{\prime}=-t} and we wish to find some operator {T} which operates on the wave function {\psi\left(t\right)} so that

\displaystyle  T\psi\left(t\right)=\psi^{\prime}\left(t^{\prime}\right)=\psi^{\prime}\left(-t\right) \ \ \ \ \ (4)

[I’m suppressing the dependence on {x} for brevity; since time reversal doesn’t affect {x}, it stays the same throughout this argument.] We require that this transformed wave function satisfy the Schrödinger equation in the form

\displaystyle  i\hbar\frac{\partial\psi^{\prime}\left(t^{\prime}\right)}{\partial t^{\prime}}=H\psi^{\prime}\left(t^{\prime}\right) \ \ \ \ \ (5)

From this, we get

\displaystyle  i\hbar\frac{\partial\left(T\psi\left(t\right)\right)}{\partial\left(-t\right)}=HT\psi\left(t\right) \ \ \ \ \ (6)

Whatever this unknown operator {T} is, it has an inverse, so we can multiply on the left by {T^{-1}} to get

\displaystyle  T^{-1}\left(-i\right)T\hbar\frac{\partial\psi\left(t\right)}{\partial t}=T^{-1}HT\psi\left(t\right) \ \ \ \ \ (7)

Notice that we’re not assuming that {T} has no effect on {i} (that is, we’re not assuming that we can pull {i} out of the expression on the LHS). Now we know that {T} has an effect only if what it operates on depends on time (since it’s the time reversal operator) so, since we’re assuming that {H} is time-independent, we must have {\left[H,T\right]=0}. Given this, we have

\displaystyle  T^{-1}HT=T^{-1}TH=H \ \ \ \ \ (8)

Thus, the RHS of 7 reduces to the RHS of the original Schrödinger equation 3. If the Schrödinger equation is to remain valid after time reversal, the LHS of 7 must also reduce to the LHS of 3. That is, we must have

\displaystyle  T^{-1}\left(-i\right)T=i \ \ \ \ \ (9)

Multiplying on the left by {T} we get

\displaystyle  -iT=Ti \ \ \ \ \ (10)

In other words, one of the effects of {T} is that it takes the complex conjugate of any expression that it operates on.

To find out exactly what {T} is, we can write it as the product of a unitary operator {U} and the operator {K}, whose only job is that it takes the complex conjugate. Since doing the complex conjugate operation twice in succession returns us to the original expression, {K^{2}=I}, so {K=K^{-1}}. We get

\displaystyle   T \displaystyle  = \displaystyle  UK\ \ \ \ \ (11)
\displaystyle  T^{-1} \displaystyle  = \displaystyle  K^{-1}U^{-1}=KU^{-1} \ \ \ \ \ (12)

Ordinary unitary operators are linear in the sense that {U\left(\alpha\psi\right)=\alpha U\psi}, where {\alpha} is a complex number and {\psi} is some function, with a similar relation holding for {U^{-1}}. Combining the above few equations, we have

\displaystyle   T^{-1}\left(-i\right)T \displaystyle  = \displaystyle  KU^{-1}\left(-i\right)UK\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  K\left(-i\right)U^{-1}UK\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  iK^{2}\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  i \ \ \ \ \ (16)

Thus the most general form for {T} is some unitary operator {U} multiplied by the complex conjugate operator {K}. We can see that, for such an operator, and complex constants {\alpha} and {\beta} and functions {\psi} and {\phi}:

\displaystyle   T\left(\alpha\psi+\beta\phi\right) \displaystyle  = \displaystyle  UK\left(\alpha\psi+\beta\phi\right)\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  U\left(\alpha^*K\psi+\beta^*K\phi\right)\ \ \ \ \ (18)
\displaystyle  \displaystyle  = \displaystyle  \alpha^*UK\psi+\beta^*UK\phi\ \ \ \ \ (19)
\displaystyle  \displaystyle  = \displaystyle  \alpha^*T\psi+\beta^*T\phi \ \ \ \ \ (20)

An operator that obeys this relation is called antilinear. The operator {T} has the additional property

\displaystyle   \left\langle T\psi\left|T\phi\right.\right\rangle \displaystyle  = \displaystyle  \left\langle UK\psi\left|UK\phi\right.\right\rangle \ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  \left\langle U\psi\left|U\phi\right.\right\rangle ^*\ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  \left\langle \psi\left|\phi\right.\right\rangle ^*\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \left\langle \phi\left|\psi\right.\right\rangle \ \ \ \ \ (24)

The third line follows from the fact that a unitary operator preserves inner products. An antilinear operator that satisfies the condition {\left\langle T\psi\left|T\phi\right.\right\rangle =\left\langle \phi\left|\psi\right.\right\rangle } is called antiunitary. [The fact that time reversal is antiunitary was first derived by Eugene Wigner in 1932. A more general result, known as Wigner’s theorem, states that any symmetry in a quantum system must be represented by either a unitary or an antiunitary operator.]
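Both properties are easy to check in a finite-dimensional toy model. The sketch below takes {T=UK} acting on two-component complex vectors, with {K} implemented as componentwise conjugation in the chosen basis. The particular unitary matrix, the vectors and the constants are arbitrary assumptions for illustration only.

```python
import cmath
import math

def inner(u, v):
    """<u|v> = sum over n of conj(u_n) v_n."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def apply(M, v):
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def K(v):
    """Complex conjugation in the chosen basis."""
    return [x.conjugate() for x in v]

# An arbitrary 2x2 unitary matrix (illustrative choice)
theta, delta = 0.6, 1.1
U = [[math.cos(theta), -cmath.exp(1j * delta) * math.sin(theta)],
     [cmath.exp(-1j * delta) * math.sin(theta), math.cos(theta)]]

def T(v):
    """T = U K: conjugate first, then apply the unitary."""
    return apply(U, K(v))

psi = [1 + 2j, -0.5j]
phi = [0.3 - 1j, 2 + 0.25j]
alpha, beta = 2 - 1j, -0.5 + 3j

# Antilinearity (equations 17 to 20)
combo = [alpha * p + beta * q for p, q in zip(psi, phi)]
lhs = T(combo)
rhs = [alpha.conjugate() * p + beta.conjugate() * q for p, q in zip(T(psi), T(phi))]
antilinear_error = max(abs(x - y) for x, y in zip(lhs, rhs))

# Antiunitarity (equations 21 to 24): <T psi|T phi> = <phi|psi>
antiunitary_error = abs(inner(T(psi), T(phi)) - inner(phi, psi))
# For comparison, a unitary operator would give <psi|phi> instead
unitary_mismatch = abs(inner(T(psi), T(phi)) - inner(psi, phi))
print(antilinear_error, antiunitary_error, unitary_mismatch)
```

The first two numbers should vanish to rounding error, while the last one is non-zero whenever {\left\langle \psi\left|\phi\right.\right\rangle } has an imaginary part.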

To find {U} in this case, consider a plane wave state

\displaystyle  \psi\left(t\right)=e^{i\left(px-Et\right)/\hbar} \ \ \ \ \ (25)

Applying {T} to this state, we have

\displaystyle   T\psi\left(t\right) \displaystyle  = \displaystyle  UKe^{i\left(px-Et\right)/\hbar}\ \ \ \ \ (26)
\displaystyle  \displaystyle  = \displaystyle  Ue^{-i\left(px-Et\right)/\hbar} \ \ \ \ \ (27)

In one dimension, the only unitary operator {U} is a phase factor like {e^{i\alpha}} for some real {\alpha} (since {U} has to preserve the inner product). We can take {U=1} since the phase factor cancels out when calculating {\left|T\psi\left(t\right)\right|^{2}}. Going back to 4, we see that the time-reversed wave function is

\displaystyle   \psi^{\prime}\left(-t\right) \displaystyle  = \displaystyle  T\psi\left(t\right)=e^{-i\left(px-Et\right)/\hbar}\ \ \ \ \ (28)
\displaystyle  \psi^{\prime}\left(t\right) \displaystyle  = \displaystyle  e^{-i\left(px+Et\right)/\hbar}=e^{i\left(-px-Et\right)/\hbar} \ \ \ \ \ (29)

Since this is the same as the original wave function except that {p\rightarrow-p}, we see that it is indeed a valid time-reversed wave function. The energy is the same (the {-Et} part of the exponent still has a minus sign) but the momentum has reversed, giving a wave that moves in the opposite direction.
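The plane wave result can be confirmed with a few lines of arithmetic. In this sketch we take {T=K} (that is, {U=1}), build {\psi^{\prime}\left(t\right)} from the relation {T\psi\left(t\right)=\psi^{\prime}\left(-t\right)}, and compare it with the original plane wave with {p} replaced by {-p}. The numerical values of {p}, {E} and the sample points are arbitrary assumptions, as is {\hbar=1}.

```python
import cmath

HBAR = 1.0        # assumed units
p, E = 1.3, 0.85  # arbitrary momentum and energy

def psi(x, t, p=p):
    """Plane wave of equation 25."""
    return cmath.exp(1j * (p * x - E * t) / HBAR)

def psi_reversed(x, t):
    """psi'(t): since T psi(t) = psi'(-t) with T = K, psi'(t) = conj(psi(-t))."""
    return psi(x, -t).conjugate()

samples = [(0.0, 0.0), (0.7, 1.9), (-2.4, 0.3)]
max_diff = max(abs(psi_reversed(x, t) - psi(x, t, p=-p)) for x, t in samples)
print(max_diff)
```

The difference should be zero to rounding error at every sample point, consistent with the time-reversed state having the same energy and the opposite momentum.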

Another way of looking at time reversal is as follows. Suppose we start with a system in the state {\psi\left(0\right)} at {t=0}. We can let it evolve for a time {\tau} using the propagator to get the state at time {t=\tau}:

\displaystyle  \psi\left(\tau\right)=e^{-iH\tau/\hbar}\psi\left(0\right) \ \ \ \ \ (30)

Applying time reversal via the operator {T} to this state, we have (we’re assuming that {H} is time-independent, but we’re allowing it to be complex)

\displaystyle  T\psi\left(\tau\right)=e^{iH^*\tau/\hbar}\psi^*\left(0\right) \ \ \ \ \ (31)

If we now evolve this time-reversed state through the same time {\tau}, we should end up back in the (time-reversed) original state if the system is invariant under time reversal. That is,

\displaystyle  \psi\left(2\tau\right)=e^{-iH\tau/\hbar}e^{iH^*\tau/\hbar}\psi^*\left(0\right)=\psi^*\left(0\right) \ \ \ \ \ (32)

[Note that we don’t require {\psi\left(2\tau\right)=\psi\left(0\right)} since {\psi\left(2\tau\right)} is the system in its time-reversed state, where it’s moving in the opposite direction to the original state. Think about time-reversing a bouncing ball. The ball becomes effectively time-reversed when it bounces. If the ball is travelling down at some speed {v} at a height {h}, then after bouncing (assuming an elastic bounce) it will be travelling at the same speed {v} when it bounces back to the height {h}, but it will be moving in the opposite direction.]

In this equation, we’re working in the {X} basis, so the exponents are numerical functions, not operators, and we’re free to combine the exponents without worrying about commutators. This means that in order for the system to be time-reversal invariant, we must have

\displaystyle  H\left(x\right)=H^*\left(x\right) \ \ \ \ \ (33)

In other words, the Hamiltonian must be real. The usual kinetic plus potential type of Hamiltonian satisfies this since it has the form

\displaystyle  H=\frac{P^{2}}{2m}+V\left(x\right) \ \ \ \ \ (34)

and although the quantum momentum operator is {P=-i\hbar\frac{d}{dx}}, its square is real. In the magnetic force case, the presence of the charge’s velocity as a linear term (in {q\mathbf{v}\times\mathbf{B}}) means the momentum operator occurs as a linear term, making {H} complex, so time reversal invariance doesn’t hold. Again, however, if we included the charges that give rise to the magnetic field, the discrepancy disappears.
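To make the reality condition concrete, we can discretise {P} with a central difference on a small grid and look at the matrix elements of {H} directly. This is only a rough sketch: the grid, the potential {V=x^{2}/2}, the units {\hbar=m=1} and the coefficient 0.3 of the linear term are all assumptions, and the linear term is merely a stand-in for a velocity-dependent interaction, not the actual magnetic Hamiltonian.

```python
HBAR, MASS = 1.0, 1.0   # assumed units
N, DX = 5, 0.5          # small illustrative grid

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Central-difference momentum matrix, P = -i hbar d/dx
P = [[0j] * N for _ in range(N)]
for j in range(N - 1):
    P[j][j + 1] = -1j * HBAR / (2 * DX)
    P[j + 1][j] = +1j * HBAR / (2 * DX)

xs = [(j - N // 2) * DX for j in range(N)]
P2 = matmul(P, P)

# Equation 34 with V = x^2 / 2 (an arbitrary real potential)
H_real = [[P2[i][j] / (2 * MASS) + (0.5 * xs[i] ** 2 if i == j else 0.0)
           for j in range(N)] for i in range(N)]

# Adding a term linear in P, standing in for a velocity-dependent force
H_linear = [[H_real[i][j] + 0.3 * P[i][j] for j in range(N)] for i in range(N)]

def max_imag(M):
    return max(abs(z.imag) for row in M for z in row)

print(max_imag(H_real), max_imag(H_linear))
```

In this discretisation the kinetic-plus-potential matrix has no imaginary part at all, whereas the term linear in {P} introduces one, in line with the argument above.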

Parity transformations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 11, Exercises 11.4.1 – 11.4.4.

A parity transformation reflects all the coordinate axes through the origin, so that, in one dimension {x\rightarrow-x} and in three dimensions the position vector {\mathbf{r}\rightarrow-\mathbf{r}}. In one dimension, a parity transformation is the same as reflection in a point-sized mirror placed at the origin. It might seem that in three dimensions, parity is more than just a reflection in a plane mirror, but in fact it can be shown that it is equivalent to such a reflection followed by a rotation. To see this, suppose we place a mirror in the {xy} plane, so that the {z} axis gets reflected into {-z}. This converts a right-handed rectangular coordinate system (where the direction of the {z} axis is determined by the direction of your thumb on your right hand when you curl your fingers through the right angle between the positive {x} and {y} axes) into a left-handed coordinate system (the direction of the new {+z} axis is found by doing the finger-curling maneuver with your left hand). However, merely reflecting the {z} axis in the {xy} plane leaves the {x} and {y} axes unchanged. Now if we rotate the {xy} plane by an angle {\pi} (or {180^{\circ}}) about the {z} axis, then the {+x} axis gets rotated into the {-x} axis, and the {+y} axis gets rotated into the {-y} axis. In this sense, the 3-d parity transformation is equivalent to a reflection (since pretty well every physical phenomenon is invariant under a rotation).

To apply parity to quantum state vectors, we define a parity operator {\Pi} to have the following action on the {X} basis:

\displaystyle \Pi\left|x\right\rangle =\left|-x\right\rangle \ \ \ \ \ (1)

From this definition we can see the effect on an arbitrary state {\left|\psi\right\rangle } by inserting a complete set of {X} states:

\displaystyle \Pi\left|\psi\right\rangle \displaystyle = \displaystyle \Pi\int_{-\infty}^{\infty}\left|x\right\rangle \left\langle x\left|\psi\right.\right\rangle dx\ \ \ \ \ (2)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\left|-x\right\rangle \left\langle x\left|\psi\right.\right\rangle dx\ \ \ \ \ (3)
\displaystyle \displaystyle = \displaystyle \int_{\infty}^{-\infty}\left|x^{\prime}\right\rangle \left\langle -x^{\prime}\left|\psi\right.\right\rangle \left(-dx^{\prime}\right)\ \ \ \ \ (4)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\left|x^{\prime}\right\rangle \left\langle -x^{\prime}\left|\psi\right.\right\rangle dx^{\prime} \ \ \ \ \ (5)

In the third line we made the substitution {x^{\prime}=-x}, so that {dx=-dx^{\prime}} and the limits of integration get swapped. As a result of this, the effect of parity in the {X} basis representation {\left\langle x\left|\psi\right.\right\rangle =\psi\left(x\right)} of a state vector {\left|\psi\right\rangle } is

\displaystyle \left\langle x\left|\Pi\right|\psi\right\rangle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\left\langle x\left|x^{\prime}\right.\right\rangle \left\langle -x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\ \ \ \ \ (6)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\delta\left(x-x^{\prime}\right)\left\langle -x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\ \ \ \ \ (7)
\displaystyle \displaystyle = \displaystyle \psi\left(-x\right) \ \ \ \ \ (8)

Parity therefore simply converts {x\rightarrow-x} wherever it occurs in the function {\psi\left(x\right)}.

One special case of this is the momentum eigenstate {\left|p\right\rangle } which has the form in the {X} basis of

\displaystyle \left\langle x\left|p\right.\right\rangle =\frac{1}{\sqrt{2\pi\hbar}}e^{ipx/\hbar} \ \ \ \ \ (9)

The parity transformation gives

\displaystyle \left\langle x\left|\Pi\right|p\right\rangle =\frac{1}{\sqrt{2\pi\hbar}}e^{-ipx/\hbar} \ \ \ \ \ (10)

Another way of looking at this is that parity changes {p} to {-p} and leaves the {x} alone, so that

\displaystyle \Pi\left|p\right\rangle =\left|-p\right\rangle \ \ \ \ \ (11)

[You might think that if parity transforms {x\rightarrow-x} and {p\rightarrow-p} then the effect on {e^{ipx/\hbar}} should be to switch the signs of both {x} and {p} and thus leave the state unchanged. However, this isn’t correct, as we can express a state vector in either the {X} basis (in which {x\rightarrow-x}) or in the {P} basis (in which {p\rightarrow-p}) but not both at the same time.]

A few properties of {\Pi} can be derived fairly easily. First, since applying {\Pi} twice in succession to the same state swaps {x\rightarrow-x} and back again, it leaves that state unchanged. Since this is true for all states, we must have

\displaystyle \Pi^{2}=I \ \ \ \ \ (12)

from which we see that {\Pi} is its own inverse, so

\displaystyle \Pi^{-1}=\Pi \ \ \ \ \ (13)

We can also see that {\Pi} is unitary, and hence (using 13) Hermitian, by considering

\displaystyle \left\langle \psi\left|\Pi^{\dagger}\Pi\right|\psi\right\rangle =\left\langle \Pi\psi\left|\Pi\psi\right.\right\rangle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\psi^*\left(-x\right)\psi\left(-x\right)dx\ \ \ \ \ (14)
\displaystyle \displaystyle = \displaystyle \int_{-\infty}^{\infty}\psi^*\left(x^{\prime}\right)\psi\left(x^{\prime}\right)dx^{\prime}\ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle \left\langle \psi\left|\psi\right.\right\rangle \ \ \ \ \ (16)

In the second line we used the same trick as in the derivation of 5 to substitute {x^{\prime}=-x}. Thus we see that

\displaystyle \Pi^{\dagger}\Pi \displaystyle = \displaystyle I\ \ \ \ \ (17)
\displaystyle \Pi^{\dagger} \displaystyle = \displaystyle \Pi^{-1}=\Pi \ \ \ \ \ (18)

The condition {\Pi^{\dagger}=\Pi} shows that {\Pi} is Hermitian, and the condition {\Pi^{\dagger}=\Pi^{-1}} shows that {\Pi} is unitary.

Finally, any operator whose square is the identity operator has eigenvalues {\pm1}, as we can see as follows. Suppose {\left|\psi\right\rangle } is an eigenvector of {\Pi} with eigenvalue {\alpha}. Then

\displaystyle \Pi\left|\psi\right\rangle \displaystyle = \displaystyle \alpha\left|\psi\right\rangle \ \ \ \ \ (19)
\displaystyle \Pi^{2}\left|\psi\right\rangle \displaystyle = \displaystyle \alpha\Pi\left|\psi\right\rangle \ \ \ \ \ (20)
\displaystyle \displaystyle = \displaystyle \alpha^{2}\left|\psi\right\rangle \ \ \ \ \ (21)
\displaystyle \displaystyle = \displaystyle I\left|\psi\right\rangle \ \ \ \ \ (22)
\displaystyle \displaystyle = \displaystyle \left|\psi\right\rangle \ \ \ \ \ (23)

Therefore {\alpha^{2}=1}, so {\alpha=\pm1}.
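These properties can be illustrated on a grid that is symmetric about {x=0}, where {\psi\left(x\right)\rightarrow\psi\left(-x\right)} simply reverses the list of samples. The sketch below is an illustration under that assumption; the grid, the two test states and the function names are arbitrary choices. It also uses the combinations {\psi\left(x\right)\pm\psi\left(-x\right)} as examples of eigenvectors with eigenvalues {\pm1}.

```python
import math

N, DX = 7, 0.5                       # odd N: grid symmetric about x = 0
xs = [(j - N // 2) * DX for j in range(N)]

def parity(psi):
    """(Pi psi)(x) = psi(-x): on a symmetric grid this reverses the samples."""
    return psi[::-1]

def inner(u, v):
    """Discrete stand-in for <u|v>."""
    return sum(a.conjugate() * b for a, b in zip(u, v)) * DX

# Two arbitrary states with no definite parity
psi = [complex(math.exp(-(x - 0.6) ** 2), 0.2 * x) for x in xs]
phi = [complex(math.cos(x), math.exp(x / 3)) for x in xs]

squares_to_identity = parity(parity(psi)) == psi                          # equation 12
hermitian_error = abs(inner(phi, parity(psi)) - inner(parity(phi), psi))  # Pi^dagger = Pi
unitary_error = abs(inner(parity(phi), parity(psi)) - inner(phi, psi))    # equation 17

# psi(x) + psi(-x) and psi(x) - psi(-x) are eigenvectors with alpha = +1 and -1
plus = [a + b for a, b in zip(psi, parity(psi))]
minus = [a - b for a, b in zip(psi, parity(psi))]
plus_error = max(abs(a - b) for a, b in zip(parity(plus), plus))
minus_error = max(abs(a + b) for a, b in zip(parity(minus), minus))
print(squares_to_identity, hermitian_error, unitary_error, plus_error, minus_error)
```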

We can also define {\Pi} by examining its effect on operators, rather than states. Consider

\displaystyle \left\langle \Pi x^{\prime}\left|X\right|\Pi x\right\rangle \displaystyle = \displaystyle \left\langle -x^{\prime}\left|X\right|-x\right\rangle \ \ \ \ \ (24)
\displaystyle \displaystyle = \displaystyle -x\delta\left(x^{\prime}-x\right) \ \ \ \ \ (25)

However, this is equivalent to

\displaystyle \left\langle \Pi x^{\prime}\left|X\right|\Pi x\right\rangle =\left\langle x^{\prime}\left|\Pi^{\dagger}X\Pi\right|x\right\rangle =-x\delta\left(x^{\prime}-x\right) \ \ \ \ \ (26)

Thus we can write

\displaystyle \Pi^{\dagger}X\Pi=-X \ \ \ \ \ (27)

and similarly for the momentum

\displaystyle \Pi^{\dagger}P\Pi=-P \ \ \ \ \ (28)
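Relations 27 and 28 can also be checked with small matrices. In this sketch, {\Pi} is the exchange matrix on a grid symmetric about the origin, {X} is diagonal, and {P} is a central-difference approximation to {-i\hbar\,d/dx}. The grid size, spacing, units and helper names are assumptions made purely for illustration.

```python
HBAR = 1.0       # assumed units
N, DX = 5, 0.5   # odd N gives a grid symmetric about x = 0

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

xs = [(j - N // 2) * DX for j in range(N)]

# Parity: Pi |x_j> = |-x_j>, which is the exchange matrix on this grid
Pi = [[1.0 if i + j == N - 1 else 0.0 for j in range(N)] for i in range(N)]
X = [[xs[i] if i == j else 0.0 for j in range(N)] for i in range(N)]

# Central-difference momentum matrix, P = -i hbar d/dx
P = [[0j] * N for _ in range(N)]
for j in range(N - 1):
    P[j][j + 1] = -1j * HBAR / (2 * DX)
    P[j + 1][j] = +1j * HBAR / (2 * DX)

def conjugate_by_parity(A):
    """Return Pi^dagger A Pi."""
    return matmul(dagger(Pi), matmul(A, Pi))

def max_abs_sum(A, B):
    """Largest |A_ij + B_ij|; zero when A = -B."""
    return max(abs(A[i][j] + B[i][j]) for i in range(N) for j in range(N))

x_error = max_abs_sum(conjugate_by_parity(X), X)   # equation 27
p_error = max_abs_sum(conjugate_by_parity(P), P)   # equation 28
print(x_error, p_error)
```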

Eigenstates of parity are said to be even if the eigenvalue is {+1} and odd if the eigenvalue is {-1}. Mathematically, the {X} basis representations of such eigenstates are even or odd functions of {x}, respectively.

The Hamiltonian is parity invariant if a parity transformation leaves it unchanged, so that

\displaystyle \Pi^{\dagger}H\left(X,P\right)\Pi=H\left(-X,-P\right)=H\left(X,P\right) \ \ \ \ \ (29)

Since {\Pi^{\dagger}=\Pi}, this condition is equivalent to

\displaystyle \left[\Pi,H\right]=0 \ \ \ \ \ (30)

Using the same argument as with conservation of momentum, if this commutator vanishes at all times (if {H} is time-independent this is automatic; if {H} is time-dependent, then we must impose the condition at all times), then {\Pi} must also commute with the propagator {U\left(t\right)}, since {U} depends only on {H}. In this case, if we start with a system in a definite parity state (even or odd), then the parity of the state doesn’t change with time. To see this, suppose that {\left[\Pi,U\left(t\right)\right]=0} and that {\Pi\left|\psi\left(0\right)\right\rangle =\alpha\left|\psi\left(0\right)\right\rangle } (where {\alpha=\pm1}). We can let the state evolve in time by applying the propagator to it, so that we have

\displaystyle \left|\psi\left(t\right)\right\rangle =U\left(t\right)\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (31)

Applying the parity operator to this and using the commutator, we have

\displaystyle \Pi\left|\psi\left(t\right)\right\rangle =\Pi U\left(t\right)\left|\psi\left(0\right)\right\rangle =U\left(t\right)\Pi\left|\psi\left(0\right)\right\rangle =\alpha U\left(t\right)\left|\psi\left(0\right)\right\rangle =\alpha\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (32)

Thus the parity of the evolved state is the same as the parity of the initial state.
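The argument in 31 and 32 can be checked numerically with a discretised Hamiltonian. The sketch below assumes a harmonic potential {V=x^{2}/2} (any even potential would do), a finite-difference kinetic term and units with {\hbar=m=1}. These choices are mine, for illustration only.

```python
import numpy as np

hbar = m = 1.0
N = 201
x = np.linspace(-8.0, 8.0, N)            # symmetric grid
h = x[1] - x[0]

# Kinetic term: second-difference Laplacian with psi = 0 beyond the grid ends
lap = (np.diag(np.full(N - 1, 1.0), 1) + np.diag(np.full(N - 1, 1.0), -1)
       - 2.0 * np.eye(N)) / h**2
H = -(hbar**2 / (2 * m)) * lap + np.diag(0.5 * x**2)   # even potential
Pi = np.fliplr(np.eye(N))                # parity: reverses the grid

def propagator(t):
    """U(t) = exp(-i H t / hbar), built from the eigen-decomposition of H."""
    E, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * E * t / hbar)) @ V.conj().T

U = propagator(1.3)
commutes = np.allclose(Pi @ U, U @ Pi, atol=1e-8)

psi0 = x * np.exp(-x**2 / 2)             # odd initial state (parity -1)
psi0 = psi0 / np.linalg.norm(psi0)
psi_t = U @ psi0
still_odd = np.allclose(Pi @ psi_t, -psi_t, atol=1e-8)
print(commutes, still_odd)
```

Replacing the potential with one that is not even (say, adding a term linear in {x}) should make both checks fail, since then {\left[\Pi,H\right]\ne0}.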

Parity is not always conserved in physics. A notable parity-violating reaction is a decay involving the weak nuclear force. Shankar describes one such case with the decay of an isotope of cobalt: {^{60}\mbox{Co}\rightarrow^{60}\mbox{Ni}+e^{-}+\bar{\nu}}. Another example is in Shankar’s exercise 11.4.3.

Suppose that in one particular reaction which emits an electron, the electron’s spin is observed to be always parallel to its momentum. For the purposes of this argument, we can regard an electron’s spin as being caused by some physical rotation of the electron. Suppose in one such reaction, the electron’s spin is in the {+z} direction (using the right-hand rule for calculating the direction of angular momentum, so that viewed from above, the electron is rotating counterclockwise) and therefore its momentum is also in the {+z} direction. Now reflect this reaction in a mirror lying in the {yz} plane. This reflection will invert the direction of rotation (think of viewing a spinning top in a mirror) so that the spin direction will now point in the {-z} direction, but since the momentum vector is parallel to the plane of the mirror, it will not be inverted. Thus the spin and momentum are now anti-parallel after a parity transformation, showing that parity in this case is not conserved.
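The mirror argument can be mimicked with ordinary vectors, treating the spin classically as {\mathbf{r}\times\mathbf{p}} for a point on the rotating electron. This is the same heuristic picture as in the text, and the particular vectors chosen below are assumptions for illustration.

```python
import numpy as np

mirror = np.diag([-1.0, 1.0, 1.0])       # reflection in the yz plane: x -> -x

# Classical picture: a point on the electron circulating in the xy plane
r = np.array([1.0, 0.0, 0.0])            # position of the point
p_rot = np.array([0.0, 1.0, 0.0])        # counterclockwise when viewed from +z
spin = np.cross(r, p_rot)                # points along +z
momentum = np.array([0.0, 0.0, 1.0])     # electron's linear momentum along +z

# Reflect the ordinary vectors, then recompute the angular momentum
spin_mirrored = np.cross(mirror @ r, mirror @ p_rot)
momentum_mirrored = mirror @ momentum

helicity_before = np.dot(spin, momentum)
helicity_after = np.dot(spin_mirrored, momentum_mirrored)
print(helicity_before, helicity_after)
```

The sign of the spin-momentum dot product flips under the reflection, which is the statement that the mirrored process has spin antiparallel to momentum.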

Finally, Shankar includes a curious problem (11.4.2) which, as far as I can tell, doesn’t have anything to do with parity, but I’ll include it here for completeness. Suppose we have a particle that moves in a potential

\displaystyle V\left(x\right)=V_{0}\sin\left(\frac{2\pi x}{a}\right) \ \ \ \ \ (33)

This potential is periodic with a period of {a}, so if we translate the system according to {x\rightarrow x+ma} for some integer {m}, the potential is unchanged. The problem is to show that momentum is not conserved in this case. The conservation of momentum argument, valid for infinitesimal translations, relied on Ehrenfest’s theorem, which states that

\displaystyle \left\langle \dot{P}\right\rangle =-\frac{i}{\hbar}\left\langle \left[P,H\right]\right\rangle \ \ \ \ \ (34)

If the momentum commutes with the Hamiltonian, then, on average, the momentum is conserved. Now in this case we can calculate the commutator {\left[V,P\right]=-\left[P,V\right]} using the result

\displaystyle \left[X^{n},P\right]=i\hbar nX^{n-1} \ \ \ \ \ (35)

We can write the potential as a series:

\displaystyle V\left(X\right)=V_{0}\left[\frac{2\pi X}{a}-\frac{1}{3!}\left(\frac{2\pi X}{a}\right)^{3}+\ldots\right] \ \ \ \ \ (36)

The commutator is therefore

\displaystyle \left[V,P\right]=\frac{2\pi i\hbar V_{0}}{a}\left[1-\frac{1}{2!}\left(\frac{2\pi X}{a}\right)^{2}+\ldots\right]=\frac{2\pi i\hbar V_{0}}{a}\cos\left(\frac{2\pi X}{a}\right) \ \ \ \ \ (37)

Therefore, Ehrenfest’s theorem gives us (since {H} presumably is of the form {H=T+V} with the kinetic energy depending only on {P}, so it commutes with {P}):

\displaystyle \left\langle \dot{P}\right\rangle =-\frac{2\pi V_{0}}{a}\left\langle \cos\left(\frac{2\pi X}{a}\right)\right\rangle \ \ \ \ \ (38)

Since the cosine is periodic, we can’t actually calculate a unique value for its average, although if we do the average over an exact number of periods, the average is still zero. I have a feeling that I’m missing something obvious here, so any suggestions are welcome.
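A numerical check may help here. The expectation value in 38 is taken in the particle's state, so it is weighted by {\left|\psi\left(x\right)\right|^{2}} rather than averaged uniformly over {x}. As a sketch, assume a Gaussian probability density centred at {x_{0}} with standard deviation {\sigma} (both values below are arbitrary choices of mine). For such a density, {\left\langle \cos kx\right\rangle =\cos\left(kx_{0}\right)e^{-k^{2}\sigma^{2}/2}}, which is generally non-zero.

```python
import numpy as np

a, V0 = 1.0, 1.0                         # period and strength of the potential
x0, sigma = 0.1, 0.2                     # assumed packet centre and width
k = 2 * np.pi / a

x = np.linspace(x0 - 12 * sigma, x0 + 12 * sigma, 20001)
dx = x[1] - x[0]
prob = np.exp(-(x - x0)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

mean_cos = np.sum(prob * np.cos(k * x)) * dx          # <cos(2 pi X / a)>
closed_form = np.cos(k * x0) * np.exp(-(k * sigma)**2 / 2)
dP_dt = -(2 * np.pi * V0 / a) * mean_cos              # right-hand side of 38
print(mean_cos, closed_form, dP_dt)
```

For this packet {\left\langle \dot{P}\right\rangle } comes out non-zero, so the average momentum changes even though the potential is invariant under the discrete translations {x\rightarrow x+ma}.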

Time translation and conservation of energy

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 11, Section 11.3.

We can investigate the effect of a quantum system being invariant under time translation by considering the evolution of a state vector using a propagator. A state at time {t} is given in terms of the state at time {t=0} according to

\displaystyle \left|\psi\left(t\right)\right\rangle =U\left(t\right)\left|\psi\left(0\right)\right\rangle =e^{-itH/\hbar}\left|\psi\left(0\right)\right\rangle \ \ \ \ \ (1)

Strictly speaking, this equation is true only if {H} is time-independent, since in the time-dependent case, we need to express the propagator as a time-ordered integral. However, if we let the system evolve for only an infinitesimal time {\varepsilon}, we can ignore the complexities of the time-ordered integral and write, to first order in {\varepsilon}

\displaystyle U\left(\varepsilon\right)=e^{-i\varepsilon H\left(0\right)/\hbar}=I-\frac{i\varepsilon H\left(0\right)}{\hbar} \ \ \ \ \ (2)

Note that it doesn’t matter if we use the value of {H} at time {t=0} or {t=\varepsilon} or at some time in between, since the differences between these values are of order {\varepsilon}, and thus make no difference to {U\left(\varepsilon\right)} to first order in {\varepsilon}.
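To see what "to first order in {\varepsilon}" means in practice, we can compare the exact exponential with the linear approximation 2 for a small matrix. The random {4\times4} Hermitian matrix below is merely a stand-in for {H\left(0\right)}, assumed for illustration.

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2                 # arbitrary Hermitian stand-in for H(0)
E, V = np.linalg.eigh(H)

def exact_U(eps):
    """exp(-i eps H / hbar) via the eigen-decomposition of H."""
    return (V * np.exp(-1j * eps * E / hbar)) @ V.conj().T

def first_order_U(eps):
    """The linear approximation I - i eps H / hbar."""
    return np.eye(4) - 1j * eps * H / hbar

errors = [np.linalg.norm(exact_U(eps) - first_order_U(eps), 2)
          for eps in (1e-2, 5e-3, 2.5e-3)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(errors, ratios)                    # halving eps should cut the error by about 4
```

The error falls by a factor of roughly four each time {\varepsilon} is halved, as expected for a remainder of order {\varepsilon^{2}}.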

Now suppose we prepare the system in the same state (which we’ll call {\left|\psi_{0}\right\rangle }) at some time {t=t_{1}} and consider how it evolves over an infinitesimal time {\varepsilon} starting from {t=t_{1}}. Writing {U\left(t_{1}+\varepsilon,t_{1}\right)} for the propagator from {t_{1}} to {t_{1}+\varepsilon}, we then have

\displaystyle \left|\psi\left(t_{1}+\varepsilon\right)\right\rangle \displaystyle = \displaystyle U\left(t_{1}+\varepsilon,t_{1}\right)\left|\psi_{0}\right\rangle \ \ \ \ \ (3)
\displaystyle \displaystyle = \displaystyle \left(I-\frac{i\varepsilon H\left(t_{1}\right)}{\hbar}\right)\left|\psi_{0}\right\rangle \ \ \ \ \ (4)

The idea behind time translation invariance is that it shouldn’t make any difference at what time we prepare a system, provided that the system is prepared identically at whatever time we actually do prepare it. In other words, if we had prepared our system above at {t=t_{2}} instead of {t=t_{1}} and then let it evolve for an infinitesimal time {\varepsilon}, we should end up with exactly the same state. That is, we require that

\displaystyle \left|\psi\left(t_{2}+\varepsilon\right)\right\rangle \displaystyle = \displaystyle \left(I-\frac{i\varepsilon H\left(t_{2}\right)}{\hbar}\right)\left|\psi_{0}\right\rangle \ \ \ \ \ (5)
\displaystyle \displaystyle = \displaystyle \left(I-\frac{i\varepsilon H\left(t_{1}\right)}{\hbar}\right)\left|\psi_{0}\right\rangle \ \ \ \ \ (6)

Rearranging things, we get

\displaystyle -\frac{i\varepsilon}{\hbar}\left(H\left(t_{2}\right)-H\left(t_{1}\right)\right)\left|\psi_{0}\right\rangle =0 \ \ \ \ \ (7)

The initial state can be anything we like, so in order for this condition to be always true, we must have

\displaystyle H\left(t_{2}\right)=H\left(t_{1}\right) \ \ \ \ \ (8)

Again, the two times {t_{1}} and {t_{2}} at which we prepared the system are arbitrary (and not necessarily separated by an infinitesimal time, so they could be years apart), so this condition implies that {H} itself must be constant in time. For a time-independent operator {A}, Ehrenfest’s theorem says that

\displaystyle \left\langle \dot{A}\right\rangle =-\frac{i}{\hbar}\left\langle \left[A,H\right]\right\rangle \ \ \ \ \ (9)

If {A=H}, then the commutator is {\left[H,H\right]=0}, so time translation invariance implies that

\displaystyle \left\langle \dot{H}\right\rangle =0 \ \ \ \ \ (10)

That is, time translation invariance implies that the average energy of the system is conserved.

Clearly, energy is conserved if the system is in an energy eigenstate, since then the energy has a single, unchanging value. However, if we prepare the state as a combination of energy eigenstates, then the system has the form

\displaystyle \psi\left(x,t\right)=\sum_{k}c_{k}e^{-iE_{k}t/\hbar}\psi_{k}\left(x\right) \ \ \ \ \ (11)

where the {c_{k}} are constant coefficients. A measurement of the energy on such a system can yield any of the energies {E_{k}} for which {c_{k}\ne0}, so it might seem that we’re violating the conservation of energy. The point is that, on average, the energy is

\displaystyle \left\langle E\right\rangle =\sum_{k}\left|c_{k}\right|^{2}E_{k} \ \ \ \ \ (12)

and it is this average that doesn’t change with time. In dealing with averages, we’re also retaining consistency with the infamous energy-time uncertainty relation.
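A small example makes the distinction concrete. The three energy levels, the coefficients and the second observable {A} below are all assumed values chosen for illustration. The point is only that {\left\langle H\right\rangle } stays fixed at the value given by 12 while the average of an operator that doesn't commute with {H} oscillates.

```python
import numpy as np

hbar = 1.0
E = np.array([0.5, 1.5, 2.5])            # assumed energy eigenvalues E_k
c = np.array([0.6, 0.8, 0.0])            # constant coefficients c_k (normalised)
H = np.diag(E)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])          # an observable that does not commute with H

def state(t):
    """Coefficients c_k exp(-i E_k t / hbar) in the energy basis."""
    return c * np.exp(-1j * E * t / hbar)

def mean(op, t):
    v = state(t)
    return np.vdot(v, op @ v).real       # <psi(t)| op |psi(t)>

times = np.linspace(0.0, 5.0, 6)
energy = [mean(H, t) for t in times]     # constant in time
other = [mean(A, t) for t in times]      # varies as 0.96 cos(t) for these values
expected_energy = np.sum(np.abs(c)**2 * E)
print(energy, other, expected_energy)
```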

Finite transformations: correspondence between classical and quantum

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 11, Exercise 11.2.3.

The translation operator for an infinitesimal translation {\varepsilon} is, to first order in {\varepsilon}:

\displaystyle  T\left(\varepsilon\right)=I-\frac{i\varepsilon}{\hbar}P \ \ \ \ \ (1)

where {P}, the momentum operator, serves as the generator of translations. To derive a formula for a finite (non-infinitesimal) translation over a distance {a}, we divide the interval {a} into {N} segments, each of width {a/N}, so that for very large {N}, the width becomes infinitesimal. Then we have

\displaystyle  T\left(a\right)=\left(I-\frac{ia}{\hbar N}P\right)^{N} \ \ \ \ \ (2)

This formula is reminiscent of one definition of the exponential function (which can be found in most introductory calculus texts):

\displaystyle  e^{-ax}=\lim_{N\rightarrow\infty}\left(1-\frac{ax}{N}\right)^{N} \ \ \ \ \ (3)

When we try to apply a formula that is valid for ordinary numbers to a case containing operators, we need to take care that any commutation relations involving the operators are taken into account. In this case, 2 contains only the momentum operator and the identity operator, which commute with each other, so we can in fact apply the limit formula directly to the operator case. We therefore have

\displaystyle  T\left(a\right)=\lim_{N\rightarrow\infty}\left(I-\frac{ia}{\hbar N}P\right)^{N}=e^{-iaP/\hbar} \ \ \ \ \ (4)
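The limit in 4 can be tested numerically if we replace {P} by a finite matrix. The random Hermitian {4\times4} matrix below is only an assumed stand-in for the momentum operator, which of course has no finite-dimensional representation.

```python
import numpy as np

hbar, a = 1.0, 0.7
rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
P = (M + M.conj().T) / 2                 # Hermitian stand-in for the momentum operator
p, V = np.linalg.eigh(P)
T_exact = (V * np.exp(-1j * a * p / hbar)) @ V.conj().T   # exp(-i a P / hbar)

def T_product(N):
    """(I - i a P / (hbar N))^N, i.e. N successive small translations."""
    step = np.eye(4) - 1j * a * P / (hbar * N)
    return np.linalg.matrix_power(step, N)

errors = {N: np.linalg.norm(T_product(N) - T_exact, 2) for N in (10, 1000, 100000)}
print(errors)                            # the error shrinks roughly like 1/N
```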

In the position basis, {P=-i\hbar\frac{d}{dx}}, so if we apply {T\left(a\right)} to a state {\left|\psi\right\rangle } with wave function {\psi\left(x\right)=\left\langle x\left|\psi\right.\right\rangle } we can expand the exponential in a Taylor series to get

\displaystyle  \left\langle x\left|T\left(a\right)\right|\psi\right\rangle =\psi\left(x\right)-a\frac{d\psi}{dx}+\frac{a^{2}}{2!}\frac{d^{2}\psi}{dx^{2}}-\ldots \ \ \ \ \ (5)

This is just the Taylor series of {\psi\left(x-a\right)} about {x}.
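As a quick check, the series can be summed exactly for a polynomial, where it terminates. A polynomial is not a normalisable wave function, so the cubic below is purely a test function assumed for illustration, as is the value of {a}.

```python
import numpy as np
from math import factorial
from numpy.polynomial import Polynomial

a = 0.75
psi = Polynomial([1.0, -2.0, 0.5, 3.0])  # psi(x) = 1 - 2x + 0.5x^2 + 3x^3

# Sum of (-a)^k / k! * d^k psi / dx^k, which stops at k = degree of psi
shifted = psi
for k in range(1, psi.degree() + 1):
    shifted = shifted + ((-a)**k / factorial(k)) * psi.deriv(k)

xs = np.linspace(-2.0, 2.0, 9)
matches_translation = np.allclose(shifted(xs), psi(xs - a))
print(matches_translation)
```

The summed series agrees with the original function evaluated at {x-a}, so {T\left(a\right)} shifts the wave function a distance {a} in the {+x} direction.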

We can extend our analysis of the correspondence between classical and quantum versions of translations. In the passive transformation model, the transformation is applied to operators rather than state vectors, so for a finite translation of an operator {\Omega} we have

\displaystyle  \Omega\rightarrow T^{\dagger}\left(a\right)\Omega T\left(a\right)=e^{iaP/\hbar}\Omega e^{-iaP/\hbar} \ \ \ \ \ (6)

The operator expression on the RHS can be expanded using Hadamard’s lemma, which for two operators {A} and {B} is

\displaystyle  e^{-A}Be^{A}=B+\left[B,A\right]+\frac{1}{2!}\left[\left[B,A\right],A\right]+\ldots \ \ \ \ \ (7)

where each term contains the commutator of the previous term’s commutator with {A}.
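Hadamard's lemma in the form 7 is easy to test with small matrices. The random {3\times3} matrices and the truncation orders below are assumptions for this sketch. The matrix exponential is summed directly from its Taylor series, which is adequate only because {A} is kept small.

```python
import numpy as np

rng = np.random.default_rng(2)
A = 0.3 * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))

def expm_series(M, terms=40):
    """Matrix exponential by direct Taylor summation (fine for small ||M||)."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

lhs = expm_series(-A) @ B @ expm_series(A)

rhs = np.zeros_like(B)
nested = B.copy()                        # B, [B,A], [[B,A],A], ...
k_factorial = 1.0
for k in range(30):
    if k > 0:
        nested = nested @ A - A @ nested # commutator of the previous term with A
        k_factorial *= k
    rhs = rhs + nested / k_factorial

lemma_holds = np.allclose(lhs, rhs)
print(lemma_holds)
```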

Taking {A=-iaP/\hbar} and {B=\Omega}, this gives us

\displaystyle  e^{iaP/\hbar}\Omega e^{-iaP/\hbar}=\Omega+a\left(-\frac{i}{\hbar}\right)\left[\Omega,P\right]+\frac{a^{2}}{2!}\left(-\frac{i}{\hbar}\right)^{2}\left[\left[\Omega,P\right],P\right]+\ldots \ \ \ \ \ (8)

For example, in the case {\Omega=X}, {\left[X,P\right]=i\hbar I} and all higher commutators are zero (since they involve the commutator of a constant with {P}), so we get

\displaystyle  e^{iaP/\hbar}Xe^{-iaP/\hbar}=X+aI \ \ \ \ \ (9)

so the system is translated by a distance {a}, as we’d expect.

For higher powers of {X}, we can use the result

\displaystyle  \left[X^{n},P\right]=i\hbar nX^{n-1} \ \ \ \ \ (10)

We therefore get

\displaystyle   e^{iaP/\hbar}X^{n}e^{-iaP/\hbar} \displaystyle  = \displaystyle  X^{n}+anX^{n-1}+\frac{a^{2}}{2!}n\left(n-1\right)X^{n-2}+\ldots+\frac{a^{n}}{n!}n!I\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \sum_{m=0}^{n}\binom{n}{m}X^{n-m}\left(aI\right)^{m}\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \left(X+aI\right)^{n} \ \ \ \ \ (13)

We’re allowed to treat {X} as an ordinary number in these equations since it is (apart from {I}) the only operator present, so all terms commute.
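The result 13 can be checked by letting the operators act on polynomials, with {X} as multiplication by {x} and {P=-i\hbar\,d/dx}. With this choice {-\frac{i}{\hbar}\left[\Omega,P\right]=\left[\frac{d}{dx},\Omega\right]}, so the factors of {i} and {\hbar} drop out. The helper names, the test polynomial and the values of {a} and {n} below are my own assumptions for this sketch.

```python
import numpy as np
from math import factorial
from numpy.polynomial import Polynomial

a, n = 0.5, 3
x = Polynomial([0.0, 1.0])

def D(f):
    """d/dx acting on a polynomial (P = -i*hbar*D in the position basis)."""
    return f.deriv()

def commutator_with_D(op):
    """Return the operator [D, op], which equals (-i/hbar)[op, P]."""
    return lambda f: D(op(f)) - op(D(f))

def X_power(f):
    """The operator X^n: multiplication by x^n."""
    return x**n * f

psi = Polynomial([2.0, -1.0, 0.0, 4.0])  # arbitrary test function

# Sum the series 8 term by term; it stops after the n-th nested commutator
result = Polynomial([0.0])
op = X_power
for k in range(n + 1):
    result = result + (a**k / factorial(k)) * op(psi)
    op = commutator_with_D(op)

expected = (x + a)**n * psi              # (X + aI)^n acting on psi
xs = np.linspace(-2.0, 2.0, 11)
agrees = np.allclose(result(xs), expected(xs))
print(agrees)
```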

In the classical case, the infinitesimal change {\delta\omega} of a variable {\omega} under an infinitesimal displacement {\delta a} generated by the momentum {p} is given by the Poisson bracket

\displaystyle  \delta\omega=\delta a\left\{ \omega,p\right\} \ \ \ \ \ (14)

We can write this as a derivative:

\displaystyle  \frac{d\omega}{da}=\left\{ \omega,p\right\}  \ \ \ \ \ (15)

For a finite translation by an amount {a}, we can write the value of {\omega} as a Taylor series relative to some starting point {a_{0}} as

\displaystyle  \omega\left(a_{0}+a\right)=\omega\left(a_{0}\right)+a\frac{d\omega}{da}+\frac{a^{2}}{2!}\frac{d^{2}\omega}{da^{2}}+\ldots \ \ \ \ \ (16)

where all derivatives are evaluated at {a=a_{0}}.

We can write all the derivatives in terms of Poisson brackets by using 15. For example

\displaystyle  \frac{d^{2}\omega}{da^{2}}=\frac{d}{da}\left(\frac{d\omega}{da}\right)=\left\{ \frac{d\omega}{da},p\right\} =\left\{ \left\{ \omega,p\right\} ,p\right\} \ \ \ \ \ (17)

Thus the variable {\omega} transforms according to

\displaystyle  \omega\left(a_{0}+a\right)=\omega+a\left\{ \omega,p\right\} +\frac{a^{2}}{2!}\left\{ \left\{ \omega,p\right\} ,p\right\} +\ldots \ \ \ \ \ (18)

Comparing this with 8, we see that the two expressions match if we use the usual recipe for converting classical Poisson brackets to quantum commutators, namely {\left\{ \omega,\lambda\right\} \rightarrow-\frac{i}{\hbar}\left[\Omega,\Lambda\right]}.

Although we’ve worked this out for the special case of translations, the same principle can be used for other transformations. For example, the angular momentum about the {z} axis is

\displaystyle  \ell_{z}=xp_{y}-yp_{x} \ \ \ \ \ (19)

and serves as the generator of rotations about the {z} axis. Suppose we have a rotation through an angle {\theta} and we want to see how the two coordinates {x} and {y} transform. The expansion 18 becomes

\displaystyle  \bar{x}=x+\theta\left\{ x,\ell_{z}\right\} +\frac{\theta^{2}}{2!}\left\{ \left\{ x,\ell_{z}\right\} ,\ell_{z}\right\} +\ldots \ \ \ \ \ (20)

The relevant Poisson brackets are (using the generic term {q_{i}} to represent the two coordinates {x} and {y}):

\displaystyle   \left\{ x,\ell_{z}\right\} \displaystyle  = \displaystyle  \sum_{i}\left[\frac{\partial x}{\partial q_{i}}\frac{\partial\ell_{z}}{\partial p_{i}}-\frac{\partial x}{\partial p_{i}}\frac{\partial\ell_{z}}{\partial q_{i}}\right]\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  -y\ \ \ \ \ (22)
\displaystyle  \left\{ y,\ell_{z}\right\} \displaystyle  = \displaystyle  \sum_{i}\left[\frac{\partial y}{\partial q_{i}}\frac{\partial\ell_{z}}{\partial p_{i}}-\frac{\partial y}{\partial p_{i}}\frac{\partial\ell_{z}}{\partial q_{i}}\right]\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  x \ \ \ \ \ (24)

Looking at how {x} transforms, we see that the Poisson brackets in 20 will cycle through the four values

\displaystyle   \left\{ x,\ell_{z}\right\} \displaystyle  = \displaystyle  -y\ \ \ \ \ (25)
\displaystyle  \left\{ \left\{ x,\ell_{z}\right\} ,\ell_{z}\right\} \displaystyle  = \displaystyle  -\left\{ y,\ell_{z}\right\} =-x\ \ \ \ \ (26)
\displaystyle  \left\{ \left\{ \left\{ x,\ell_{z}\right\} ,\ell_{z}\right\} ,\ell_{z}\right\} \displaystyle  = \displaystyle  -\left\{ x,\ell_{z}\right\} =y\ \ \ \ \ (27)
\displaystyle  \left\{ \left\{ \left\{ \left\{ x,\ell_{z}\right\} ,\ell_{z}\right\} ,\ell_{z}\right\} ,\ell_{z}\right\} \displaystyle  = \displaystyle  \left\{ y,\ell_{z}\right\} =x \ \ \ \ \ (28)

The series 20 thus expands to

\displaystyle   \bar{x} \displaystyle  = \displaystyle  x\left[1-\frac{\theta^{2}}{2!}+\frac{\theta^{4}}{4!}-\ldots\right]-y\left[\theta-\frac{\theta^{3}}{3!}+\frac{\theta^{5}}{5!}-\ldots\right]\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  x\cos\theta-y\sin\theta \ \ \ \ \ (30)

We can do the same calculation for {\bar{y}} to get

\displaystyle  \bar{y}=x\sin\theta+y\cos\theta \ \ \ \ \ (31)
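The Poisson bracket series can also be summed numerically. Since {\left\{ x,\ell_{z}\right\} =-y} and {\left\{ y,\ell_{z}\right\} =x}, taking the bracket of a linear function {c_{x}x+c_{y}y} with {\ell_{z}} just maps the coefficients {\left(c_{x},c_{y}\right)} to {\left(c_{y},-c_{x}\right)}. The sketch below uses that representation. The angle and the number of terms are arbitrary choices.

```python
import numpy as np
from math import factorial

theta = 0.9
# A linear function f = cx*x + cy*y is stored as the pair (cx, cy).
# {f, l_z} = cx*(-y) + cy*(x), i.e. coefficients (cy, -cx).
bracket = np.array([[0.0, 1.0],
                    [-1.0, 0.0]])

def rotate(coeffs, theta, terms=30):
    """Sum theta^k / k! times the k-fold nested bracket with l_z."""
    total = np.zeros(2)
    term = np.array(coeffs, dtype=float)
    for k in range(terms):
        total += theta**k / factorial(k) * term
        term = bracket @ term            # next nested Poisson bracket
    return total

x_bar = rotate([1.0, 0.0], theta)        # coefficients of x and y in x-bar
y_bar = rotate([0.0, 1.0], theta)        # coefficients of x and y in y-bar
print(x_bar, y_bar)
```

The coefficients come out as {\left(\cos\theta,-\sin\theta\right)} for {\bar{x}} and {\left(\sin\theta,\cos\theta\right)} for {\bar{y}}, in agreement with 30 and 31.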

Translational invariance and conservation of momentum

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 11.

One consequence of the invariance of the Hamiltonian under translation is that the momentum and Hamiltonian commute:

\displaystyle \left[P,H\right]=0 \ \ \ \ \ (1)

In quantum mechanics, commuting quantities are simultaneously observable, and we can find a basis for the Hilbert space consisting of eigenstates of both {P} and {H}. We’ve seen that Ehrenfest’s theorem allows us to conclude that for such a system, the average momentum is conserved so that {\left\langle \dot{P}\right\rangle =0}. We can go a step further and state that if a system starts out in an eigenstate of {P}, then it remains in that eigenstate for all time.

First, we need to make a rather subtle observation, which is that

\displaystyle \left[P,H\right]=0\rightarrow\left[P,U\left(t\right)\right]=0 \ \ \ \ \ (2)


That is, if {P} and {H} commute, then {P} also commutes with the propagator {U\left(t\right)}. For a time-independent Hamiltonian, the propagator is

\displaystyle U\left(t\right)=e^{-iHt/\hbar} \ \ \ \ \ (3)

Since this can be expanded in a power series in the Hamiltonian, condition 2 follows easily enough. What if the Hamiltonian is time-dependent? In this case, the propagator comes out to a time-ordered integral

\displaystyle U\left(t\right)=T\left\{ \exp\left[-\frac{i}{\hbar}\int_{0}^{t}H\left(t^{\prime}\right)dt^{\prime}\right]\right\} \equiv\lim_{N\rightarrow\infty}\prod_{n=0}^{N-1}e^{-i\Delta H\left(n\Delta\right)/\hbar} \ \ \ \ \ (4)


Here the time interval {\left[0,t\right]} is divided into {N} time slices, each of length {\Delta=t/N}. As explained in the earlier post, the reason we can’t just integrate the RHS directly by summing the exponents is that such a procedure works only if the operators in the exponents all commute with each other. If {H} is time-dependent, its forms at different times may not commute, so we can’t get a simple closed form for {U\left(t\right)}.

However, if {\left[P,H\left(t\right)\right]=0} for all times, then {P} commutes with all the exponents on the RHS of 4, so we still get {\left[P,U\left(t\right)\right]=0}. Another way of looking at this is that, by imposing the condition {\left[P,H\left(t\right)\right]=0}, we’re saying that if {H\left(t\right)} can be expanded in a power series in {X} and {P}, it depends only on {P}, and not on {X}. This follows from the fact that

\displaystyle \left[X^{n},P\right]=i\hbar nX^{n-1} \ \ \ \ \ (5)

so that {P} does not commute with any positive power of {X}.

Given that 2 holds for any Hamiltonian that commutes with {P} at all times, whether time-dependent or not, if we start in an eigenstate {\left|p\right\rangle } of {P}, then

\displaystyle P\left|p\right\rangle \displaystyle = \displaystyle p\left|p\right\rangle \ \ \ \ \ (6)
\displaystyle PU\left(t\right)\left|p\right\rangle \displaystyle = \displaystyle U\left(t\right)P\left|p\right\rangle \ \ \ \ \ (7)
\displaystyle \displaystyle = \displaystyle U\left(t\right)p\left|p\right\rangle \ \ \ \ \ (8)
\displaystyle \displaystyle = \displaystyle pU\left(t\right)\left|p\right\rangle \ \ \ \ \ (9)

Thus {U\left(t\right)\left|p\right\rangle } remains an eigenstate of {P} with the same eigenvalue {p} for all time. For a single particle moving in one dimension, the state {\left|p\right\rangle } describes a free particle with momentum {p} (and thus a completely undetermined position).
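The time-dependent case can be illustrated with a toy model in which {H\left(t\right)} commutes with {P} at every time but does not commute with itself at different times. The model below is entirely an assumption made for illustration: the momentum takes two values, each doubly degenerate because of an internal two-state degree of freedom, and {H\left(t\right)} acts only within each momentum block. The propagator is built as the ordered product of time slices in 4.

```python
import numpy as np

hbar = 1.0
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Toy momentum operator: eigenvalues 1 and -2, each doubly degenerate
P = np.kron(np.diag([1.0, -2.0]), np.eye(2)).astype(complex)

def H(t):
    """Block-diagonal H(t): acts within each momentum block, never mixes them."""
    out = np.zeros((4, 4), dtype=complex)
    out[:2, :2] = np.cos(t) * sx + np.sin(t) * sz
    out[2:, 2:] = 0.5 * sx + t * sz
    return out

def slice_propagator(Ht, dt):
    """exp(-i dt H / hbar) for one time slice."""
    E, V = np.linalg.eigh(Ht)
    return (V * np.exp(-1j * E * dt / hbar)) @ V.conj().T

def propagator(t, N=1000):
    dt = t / N
    U = np.eye(4, dtype=complex)
    for n in range(N):
        U = slice_propagator(H(n * dt), dt) @ U   # later slices act on the left
    return U

U = propagator(1.0)
H_noncommuting = not np.allclose(H(0.0) @ H(1.0), H(1.0) @ H(0.0))
P_commutes_with_U = np.allclose(P @ U, U @ P, atol=1e-10)

p_state = np.array([1, 0, 0, 0], dtype=complex)   # eigenstate of P with p = 1
evolved = U @ p_state
still_eigenstate = np.allclose(P @ evolved, 1.0 * evolved, atol=1e-10)
print(H_noncommuting, P_commutes_with_U, still_eigenstate)
```

The evolved state can move around within the degenerate subspace with {p=1}, but it never acquires a component with a different momentum eigenvalue.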