Vector operators; transformation under rotation

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.4.4.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

A vector operator ${\mathbf{V}}$ is defined as an operator whose components transform under rotation according to

$\displaystyle U^{\dagger}\left[R\right]V_{i}U\left[R\right]=\sum_{j}R_{ij}V_{j} \ \ \ \ \ (1)$

where ${R}$ is the rotation matrix in either 2 or 3 dimensions. We’ve seen that, for an infinitesimal rotation about an arbitrary axis ${\delta\boldsymbol{\theta}}$, a vector transforms like

$\displaystyle \mathbf{V}\rightarrow\mathbf{V}+\delta\boldsymbol{\theta}\times\mathbf{V} \ \ \ \ \ (2)$

This can be written more compactly using the Levi-Civita tensor, since component ${i}$ of a cross product is

$\displaystyle \left(\delta\boldsymbol{\theta}\times\mathbf{V}\right)_{i}=\sum_{j,k}\varepsilon_{ijk}\left(\delta\theta\right)_{j}V_{k} \ \ \ \ \ (3)$

We get

$\displaystyle \sum_{j}R_{ij}V_{j}=V_{i}+\sum_{j,k}\varepsilon_{ijk}\left(\delta\theta\right)_{j}V_{k} \ \ \ \ \ (4)$

The operator ${U\left[R\right]}$ is given by

$\displaystyle U\left[R\left(\delta\boldsymbol{\theta}\right)\right]=I-\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L} \ \ \ \ \ (5)$

where ${\mathbf{L}}$ is the angular momentum. Plugging this into 1, we have, to first order in ${\delta\boldsymbol{\theta}}$ (remembering that the components of ${\mathbf{L}}$ do not commute with each other and, in general, do not commute with the components of ${\mathbf{V}}$ either):

$\displaystyle \left(I+\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L}\right)V_{i}\left(I-\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L}\right)=V_{i}+\frac{i}{\hbar}\sum_{j}\left(\delta\theta_{j}L_{j}\right)V_{i}-\frac{i}{\hbar}V_{i}\sum_{j}\left(\delta\theta_{j}L_{j}\right) \ \ \ \ \ (6)$

$\displaystyle =V_{i}+\frac{i}{\hbar}\sum_{j}\delta\theta_{j}\left[L_{j},V_{i}\right] \ \ \ \ \ (7)$

Setting this equal to the RHS of 4 we have, equating coefficients of ${\delta\theta_{j}}$:

$\displaystyle \frac{i}{\hbar}\left[L_{j},V_{i}\right]=\sum_{k}\varepsilon_{ijk}V_{k} \ \ \ \ \ (8)$

$\displaystyle \left[V_{i},L_{j}\right]=i\hbar\sum_{k}\varepsilon_{ijk}V_{k} \ \ \ \ \ (9)$

With ${\mathbf{V}=\mathbf{L}}$, we regain the commutation relations for the components of angular momentum

$\displaystyle \left[L_{x},L_{y}\right]=i\hbar L_{z} \ \ \ \ \ (10)$

$\displaystyle \left[L_{y},L_{z}\right]=i\hbar L_{x} \ \ \ \ \ (11)$

$\displaystyle \left[L_{z},L_{x}\right]=i\hbar L_{y} \ \ \ \ \ (12)$
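As a numerical sanity check of the defining relation 1 for a finite rotation, the sketch below takes ${\mathbf{V}=\mathbf{L}}$ and represents the components by the ${3\times3}$ matrices ${\left(L_{i}\right)_{jk}=-i\hbar\varepsilon_{ijk}}$. This matrix representation, the choice ${\hbar=1}$ and all function names are assumptions made for illustration; they are not part of Shankar's exercise. The code builds ${U=e^{-i\theta L_{z}/\hbar}}$ from its power series and compares ${U^{\dagger}L_{i}U}$ with ${\sum_{j}R_{ij}L_{j}}$.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def dagger(a):
    """Hermitian conjugate of a 3x3 matrix."""
    return [[a[j][i].conjugate() for j in range(3)] for i in range(3)]


def expm(a, terms=40):
    """Matrix exponential from a truncated power series."""
    result = [[complex(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result


# Illustrative representation: (L_i)_{jk} = -i hbar eps_{ijk}
L = [[[-1j * HBAR * levi_civita(i, j, k) for k in range(3)]
      for j in range(3)] for i in range(3)]


def max_deviation(theta):
    """Largest entry of U^dagger L_i U - sum_j R_ij L_j for a z rotation."""
    U = expm([[-1j * theta / HBAR * x for x in row] for row in L[2]])
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    worst = 0.0
    for i in range(3):
        lhs = matmul(dagger(U), matmul(L[i], U))
        for a in range(3):
            for b in range(3):
                rhs = sum(R[i][j] * L[j][a][b] for j in range(3))
                worst = max(worst, abs(lhs[a][b] - rhs))
    return worst


print(max_deviation(0.7))
```

The printed deviation should be at the level of floating-point rounding, which is what 1 predicts for any vector operator.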

By the way, it is possible to write these commutation relations in the compact form

$\displaystyle \mathbf{L}\times\mathbf{L}=i\hbar\mathbf{L} \ \ \ \ \ (13)$

This looks wrong if you’re used to the standard definition of the cross product for vectors whose components are ordinary numbers, since for such a vector ${\mathbf{a}}$, we always have ${\mathbf{a}\times\mathbf{a}=0}$. However, if the components of the vector are operators that don’t commute, then the result is not zero, as we can see:

$\displaystyle \left(\mathbf{L}\times\mathbf{L}\right)_{i}=\sum_{j,k}\varepsilon_{ijk}L_{j}L_{k} \ \ \ \ \ (14)$

If ${i=x}$, for example, then the sum on the RHS gives

$\displaystyle \left(\mathbf{L}\times\mathbf{L}\right)_{x}=\sum_{j,k}\varepsilon_{xjk}L_{j}L_{k} \ \ \ \ \ (15)$

$\displaystyle =L_{y}L_{z}-L_{z}L_{y} \ \ \ \ \ (16)$

$\displaystyle =\left[L_{y},L_{z}\right] \ \ \ \ \ (17)$

From 13, this gives

$\displaystyle \left[L_{y},L_{z}\right]=i\hbar L_{x} \ \ \ \ \ (18)$
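The relation 13 can also be checked with explicit matrices. The sketch below again assumes the ${3\times3}$ representation ${\left(L_{i}\right)_{jk}=-i\hbar\varepsilon_{ijk}}$ with ${\hbar=1}$ (an illustrative choice, not something taken from the text) and evaluates the sum 14 for each component.

```python
HBAR = 1.0  # assumption: units with hbar = 1


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Illustrative representation: (L_i)_{jk} = -i hbar eps_{ijk}
L = [[[-1j * HBAR * levi_civita(i, j, k) for k in range(3)]
      for j in range(3)] for i in range(3)]


def cross_component(i):
    """(L x L)_i = sum_{j,k} eps_{ijk} L_j L_k, as a 3x3 matrix."""
    out = [[0j] * 3 for _ in range(3)]
    for j in range(3):
        for k in range(3):
            e = levi_civita(i, j, k)
            if e:
                prod = matmul(L[j], L[k])
                out = [[out[a][b] + e * prod[a][b] for b in range(3)]
                       for a in range(3)]
    return out


def cross_matches(tol=1e-12):
    """True if (L x L)_i equals i hbar L_i for every component."""
    for i in range(3):
        lxl = cross_component(i)
        for a in range(3):
            for b in range(3):
                if abs(lxl[a][b] - 1j * HBAR * L[i][a][b]) > tol:
                    return False
    return True


print(cross_matches())
```

For ordinary commuting numbers in place of the matrices, `cross_component` would return zero, which is the point made above about ${\mathbf{a}\times\mathbf{a}=0}$.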

Finite rotations about an arbitrary axis in three dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.4.3.


The operators for an infinitesimal rotation in 3-d are

$\displaystyle U\left[R\left(\varepsilon_{x}\hat{\mathbf{x}}\right)\right]=I-\frac{i\varepsilon_{x}L_{x}}{\hbar} \ \ \ \ \ (1)$

$\displaystyle U\left[R\left(\varepsilon_{y}\hat{\mathbf{y}}\right)\right]=I-\frac{i\varepsilon_{y}L_{y}}{\hbar} \ \ \ \ \ (2)$

$\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (3)$

If we have a finite (larger than infinitesimal) rotation about one of the coordinate axes, we can create the operator by dividing up the finite rotation angle ${\theta}$ into ${N}$ small increments and taking the limit as ${N\rightarrow\infty}$, just as we did with finite translations. For example, for a finite rotation about the ${x}$ axis, we have

$\displaystyle U\left[R\left(\theta\hat{\mathbf{x}}\right)\right]=\lim_{N\rightarrow\infty}\left(I-\frac{i\theta L_{x}}{N\hbar}\right)^{N}=e^{-i\theta L_{x}/\hbar} \ \ \ \ \ (4)$

What if we have a finite rotation about some arbitrarily directed axis? Suppose we have a vector ${\mathbf{r}}$ as shown in the figure:

The vector ${\mathbf{r}}$ makes an angle ${\alpha}$ with the ${z}$ axis, and we wish to rotate ${\mathbf{r}}$ about the ${z}$ axis by an angle ${\delta\theta}$. Note that this argument is completely general, since if the axis of rotation is not the ${z}$ axis, we can rotate the entire coordinate system so that the axis of rotation is the ${z}$ axis. The generality enters through the fact that we’re keeping the angle ${\alpha}$ arbitrary.

The rotation by ${\delta\theta\hat{\mathbf{z}}\equiv\delta\boldsymbol{\theta}}$ shifts the tip of ${\mathbf{r}}$ along the circle shown by a distance ${\left(r\sin\alpha\right)\delta\theta}$ in a counterclockwise direction (looking down the ${z}$ axis). This shift is in a direction that is perpendicular to both ${\hat{\mathbf{z}}}$ and ${\mathbf{r}}$, so the little vector representing the shift in ${\mathbf{r}}$ is

$\displaystyle \delta\mathbf{r}=\left(\delta\boldsymbol{\theta}\right)\times\mathbf{r} \ \ \ \ \ (5)$

Thus under the rotation ${\delta\boldsymbol{\theta}}$, a vector transforms as

$\displaystyle \mathbf{r}\rightarrow\mathbf{r}+\left(\delta\boldsymbol{\theta}\right)\times\mathbf{r} \ \ \ \ \ (6)$

Just as with translations, if we rotate the coordinate system by an amount ${\delta\boldsymbol{\theta}}$, this is equivalent to rotating the wave function ${\psi\left(\mathbf{r}\right)}$ by the same angle, but in the opposite direction, so we require

$\displaystyle \psi\left(\mathbf{r}\right)\rightarrow\psi\left(\mathbf{r}-\left(\delta\boldsymbol{\theta}\right)\times\mathbf{r}\right) \ \ \ \ \ (7)$

A first order Taylor expansion of the quantity on the RHS gives

$\displaystyle \psi\left(\mathbf{r}-\left(\delta\boldsymbol{\theta}\right)\times\mathbf{r}\right)=\psi\left(\mathbf{r}\right)-\left(\delta\boldsymbol{\theta}\times\mathbf{r}\right)\cdot\nabla\psi \ \ \ \ \ (8)$

The operator generating this rotation will have the form (in analogy with the forms for the coordinate axes above):

$\displaystyle U\left[R\left(\delta\boldsymbol{\theta}\right)\right]=I-\frac{i\delta\theta}{\hbar}L_{\hat{\theta}} \ \ \ \ \ (9)$

where ${L_{\hat{\theta}}}$ is an angular momentum operator to be determined.

Writing out the RHS of 8, we have

$\displaystyle \psi\left(\mathbf{r}\right)-\left(\delta\boldsymbol{\theta}\times\mathbf{r}\right)\cdot\nabla\psi=\psi\left(\mathbf{r}\right)-\left(\delta\theta_{y}z-\delta\theta_{z}y\right)\frac{\partial\psi}{\partial x}+\left(\delta\theta_{x}z-\delta\theta_{z}x\right)\frac{\partial\psi}{\partial y}-\left(\delta\theta_{x}y-\delta\theta_{y}x\right)\frac{\partial\psi}{\partial z} \ \ \ \ \ (10)$

$\displaystyle =\psi\left(\mathbf{r}\right)-\delta\theta_{x}\left(y\frac{\partial\psi}{\partial z}-z\frac{\partial\psi}{\partial y}\right)-\delta\theta_{y}\left(z\frac{\partial\psi}{\partial x}-x\frac{\partial\psi}{\partial z}\right)-\delta\theta_{z}\left(x\frac{\partial\psi}{\partial y}-y\frac{\partial\psi}{\partial x}\right) \ \ \ \ \ (11)$

$\displaystyle =\psi\left(\mathbf{r}\right)-\delta\boldsymbol{\theta}\cdot\frac{i}{\hbar}\mathbf{r}\times\mathbf{p}\psi \ \ \ \ \ (12)$

$\displaystyle =\psi\left(\mathbf{r}\right)-\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L}\psi \ \ \ \ \ (13)$

$\displaystyle =U\left[R\left(\delta\boldsymbol{\theta}\right)\right]\psi \ \ \ \ \ (14)$

Comparing this with 9, we see that

$\displaystyle L_{\hat{\theta}}=\hat{\boldsymbol{\theta}}\cdot\mathbf{L} \ \ \ \ \ (15)$

where ${\hat{\boldsymbol{\theta}}}$ is the unit vector along the axis of rotation. Since all rotations about the same axis commute, we can use the same procedure as above to generate a finite rotation ${\boldsymbol{\theta}}$ about an arbitrary axis and get

$\displaystyle U\left[R\left(\boldsymbol{\theta}\right)\right]=e^{-i\boldsymbol{\theta}\cdot\mathbf{L}/\hbar} \ \ \ \ \ (16)$
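The limiting procedure can be illustrated on ordinary vectors rather than on wave functions. The sketch below applies the infinitesimal rule 6 a total of ${N}$ times with angle ${\theta/N}$ and compares the result with Rodrigues' rotation formula for the exact finite rotation. The axis, vector, angle and function names are arbitrary choices made for this illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_in_steps(r, axis, theta, n_steps):
    """Apply r -> r + (theta/N) n x r a total of N times (n is a unit axis)."""
    d = theta / n_steps
    for _ in range(n_steps):
        c = cross(axis, r)
        r = tuple(ri + d * ci for ri, ci in zip(r, c))
    return r


def rotate_exact(r, axis, theta):
    """Rodrigues' formula for rotation by theta about the unit vector axis."""
    c, s = math.cos(theta), math.sin(theta)
    n_dot_r = sum(n * x for n, x in zip(axis, r))
    n_cross_r = cross(axis, r)
    return tuple(x * c + ncr * s + n * n_dot_r * (1 - c)
                 for x, ncr, n in zip(r, n_cross_r, axis))


axis = (1 / math.sqrt(3),) * 3   # arbitrary unit axis
r = (1.0, 2.0, 3.0)              # arbitrary vector
theta = 1.2                      # arbitrary finite angle

exact = rotate_exact(r, axis, theta)
errors = []
for n in (10, 100, 1000, 10000):
    approx = rotate_in_steps(r, axis, theta, n)
    errors.append(max(abs(a - b) for a, b in zip(approx, exact)))
print(errors)
```

The error shrinks roughly in proportion to ${1/N}$, consistent with the product of infinitesimal rotations converging to the exponential form in the limit.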

Combining translations and rotations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.4.


When it comes to symmetries in quantum mechanics, we’ve looked at translations and rotations in two dimensions, and found that the generators are the momenta ${P_{x}}$ and ${P_{y}}$ for translations, and the angular momentum ${L_{z}}$ for rotations.

From the fact that ${L_{z}}$ does not commute with either momentum or position operators, you might guess that, if we performed some sequence of translations and rotations on a system, the order in which these operations are done would matter. In fact, you can see this by considering simple two-dimensional geometry, without reference to quantum mechanics. Consider the ${x}$ and ${y}$ axes on a sheet of graph paper. First, translate these axes by adding the vector ${\mathbf{r}}$ to all points, so that the new origin of coordinates lies at position ${\mathbf{r}}$ as referenced in the original coordinates. Next, do a rotation about the original origin by some angle ${\phi}$. This will move the new origin around the original ${z}$ axis. Now, do the inverse of the original translation by adding ${-\mathbf{r}}$ to all points. Finally, do the inverse of the rotation by rotating the system by ${-\phi}$ around the original ${z}$ axis. You’ll find that the ${xy}$ axes that have undergone this sequence of transformations do not coincide with the original ${xy}$ axes. However, if you did the same set of four transformations in the order: translate by ${\mathbf{r}}$, translate by ${-\mathbf{r}}$, rotate by ${\phi}$, rotate by ${-\phi}$, the transformed axes would coincide with the original axes.

To see how this works in quantum mechanics, we can again consider infinitesimal translations and rotations. If we start with a point at location ${\left[x,y\right]}$ and apply the four transformations described above, but now for an infinitesimal translation ${\boldsymbol{\varepsilon}=\varepsilon_{x}\hat{\mathbf{x}}+\varepsilon_{y}\hat{\mathbf{y}}}$ and rotation ${\varepsilon_{z}\hat{\mathbf{z}}}$, then the successive transformations work as follows. In each case, we’ll retain terms up to order ${\varepsilon_{x}\varepsilon_{z}}$ and ${\varepsilon_{y}\varepsilon_{z}}$ but discard terms of order ${\varepsilon_{x}^{2}}$, ${\varepsilon_{y}^{2}}$, ${\varepsilon_{z}^{2}}$ and higher. [I’m not quite sure of the rationale that allows us to do this, apart from the fact that it gives the right answer.]

$\displaystyle \left[\begin{array}{c} x\\ y \end{array}\right]{\longrightarrow\atop T\left(\boldsymbol{\varepsilon}\right)}\left[\begin{array}{c} x+\varepsilon_{x}\\ y+\varepsilon_{y} \end{array}\right] \ \ \ \ \ (1)$

$\displaystyle \left[\begin{array}{c} x+\varepsilon_{x}\\ y+\varepsilon_{y} \end{array}\right]{\longrightarrow\atop R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)}\left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \ \ \ \ \ (2)$

$\displaystyle \left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]{\longrightarrow\atop T\left(-\boldsymbol{\varepsilon}\right)}\left[\begin{array}{c} x+\varepsilon_{x}-\left(y+\varepsilon_{y}\right)\varepsilon_{z}-\varepsilon_{x}\\ y+\varepsilon_{y}+\left(x+\varepsilon_{x}\right)\varepsilon_{z}-\varepsilon_{y} \end{array}\right] \ \ \ \ \ (3)$

$\displaystyle =\left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \ \ \ \ \ (4)$

$\displaystyle \left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]{\longrightarrow\atop R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)}\left[\begin{array}{c} x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}+\left[y+\left(x+\varepsilon_{x}\right)\varepsilon_{z}\right]\varepsilon_{z}\\ y+\left(x+\varepsilon_{x}\right)\varepsilon_{z}-\left[x-\left(y+\varepsilon_{y}\right)\varepsilon_{z}\right]\varepsilon_{z} \end{array}\right] \ \ \ \ \ (5)$

$\displaystyle =\left[\begin{array}{c} x-\varepsilon_{y}\varepsilon_{z}\\ y+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (6)$

Thus, to this order in the infinitesimals, the combination of translation-rotation-translation-rotation is equivalent to a single translation by a distance ${\left[-\varepsilon_{y}\varepsilon_{z},\varepsilon_{x}\varepsilon_{z}\right]}$. We can write this in terms of the unitary quantum operators for translations and rotations as

$\displaystyle U\left[R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(-\boldsymbol{\varepsilon}\right)U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(\boldsymbol{\varepsilon}\right)=T\left(-\varepsilon_{y}\varepsilon_{z}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}\right) \ \ \ \ \ (7)$
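The geometric claim behind 7 is easy to test numerically. The sketch below applies the four transformations with exact rotations (using ${\cos}$ and ${\sin}$ rather than their first-order forms) and measures how far the net shift is from ${\left[-\varepsilon_{y}\varepsilon_{z},\varepsilon_{x}\varepsilon_{z}\right]}$. The test point, the relative sizes of the three infinitesimals and the function names are arbitrary choices for this illustration.

```python
import math


def translate(p, eps):
    """Add the displacement eps to the point p."""
    return (p[0] + eps[0], p[1] + eps[1])


def rotate(p, angle):
    """Rotate the point p counterclockwise about the origin by angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


def net_shift(p, eps, eps_z):
    """Net displacement after T(eps), R(eps_z), T(-eps), R(-eps_z)."""
    q = translate(p, eps)
    q = rotate(q, eps_z)
    q = translate(q, (-eps[0], -eps[1]))
    q = rotate(q, -eps_z)
    return (q[0] - p[0], q[1] - p[1])


def residual(scale):
    """Distance of the net shift from (-eps_y eps_z, eps_x eps_z)."""
    eps = (0.3 * scale, 0.5 * scale)   # arbitrary small translation
    eps_z = 0.7 * scale                # arbitrary small rotation angle
    dx, dy = net_shift((1.0, 2.0), eps, eps_z)
    return max(abs(dx + eps[1] * eps_z), abs(dy - eps[0] * eps_z))


for scale in (0.1, 0.01):
    print(scale, residual(scale))
```

Shrinking all three infinitesimals by a factor of 10 reduces the residual by roughly a factor of 1000, so the discrepancy is third order and the second-order result 6 is the leading term.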

Using the forms of these operators for infinitesimal transformations, we can expand both sides to give

$\displaystyle \left(I+\frac{i\varepsilon_{z}}{\hbar}L_{z}\right)\left[I+\frac{i}{\hbar}\left(\varepsilon_{x}P_{x}+\varepsilon_{y}P_{y}\right)\right]\left(I-\frac{i\varepsilon_{z}}{\hbar}L_{z}\right)\left[I-\frac{i}{\hbar}\left(\varepsilon_{x}P_{x}+\varepsilon_{y}P_{y}\right)\right]=I-\frac{i}{\hbar}\left(-\varepsilon_{y}\varepsilon_{z}P_{x}+\varepsilon_{x}\varepsilon_{z}P_{y}\right) \ \ \ \ \ (8)$

Since the infinitesimal displacements are arbitrary, this equation can be valid only if the coefficients of each combination of ${\varepsilon_{x},\varepsilon_{y}}$ and ${\varepsilon_{z}}$ are equal on both sides. As above, we’ll discard any terms of order ${\varepsilon_{x}^{2}}$, ${\varepsilon_{y}^{2}}$, ${\varepsilon_{z}^{2}}$ and higher. The algebra is straightforward although a bit tedious, so I’ll just give a couple of examples here.

The coefficient of ${\varepsilon_{z}}$ on its own is, on the LHS

$\displaystyle \frac{i\varepsilon_{z}}{\hbar}L_{z}-\frac{i\varepsilon_{z}}{\hbar}L_{z}=0 \ \ \ \ \ (9)$

The RHS has no term in ${\varepsilon_{z}}$ alone, so both sides give 0 and the equation is consistent at this order.

For the ${\varepsilon_{x}\varepsilon_{z}}$ term, we get on the LHS:

$\displaystyle \varepsilon_{x}\varepsilon_{z}\frac{i^{2}}{\hbar^{2}}\left(L_{z}P_{x}-L_{z}P_{x}-P_{x}L_{z}+L_{z}P_{x}\right)=-\varepsilon_{x}\varepsilon_{z}\frac{i^{2}}{\hbar^{2}}\left[P_{x},L_{z}\right] \ \ \ \ \ (10)$

On the RHS, the term is

$\displaystyle -\frac{i}{\hbar}\varepsilon_{x}\varepsilon_{z}P_{y} \ \ \ \ \ (11)$

Thus the condition here becomes

$\displaystyle \left[P_{x},L_{z}\right]=-i\hbar P_{y} \ \ \ \ \ (12)$

which agrees with the commutation relation we found earlier. By considering the coefficient of ${\varepsilon_{y}\varepsilon_{z}}$, we arrive at the other condition, which is

$\displaystyle \left[P_{y},L_{z}\right]=i\hbar P_{x} \ \ \ \ \ (13)$

The result of this calculation doesn’t tell us anything new about the translation or rotation operators, but it does show that the condition 7 is consistent with what we already know about the commutators of position, momentum and angular momentum.

As Shankar points out, we might think that we need to verify the conditions for an infinite number of combinations of rotations and translations, since each such combination gives rise to a different overall transformation. He says that it has actually been shown that the example above is sufficient to guarantee that all such combinations do in fact give valid results, although he doesn’t give the details. We are, however, given the exercise of verifying this claim for one special case, which we’ll consider now.

In this example, we’ll consider the same four transformations, in the same order, as above except that we’ll take the translation to be entirely in the ${x}$ direction so that ${\varepsilon_{y}=0}$. This time, we’ll retain terms up to ${\varepsilon_{x}\varepsilon_{z}^{2}}$ and see what we get. We start by repeating the calculations in 1 through 6. However, because we’re saving higher order terms, we need to represent the infinitesimal rotations by

$\displaystyle R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)=\left[\begin{array}{cc} 1-\frac{\varepsilon_{z}^{2}}{2} & -\varepsilon_{z}\\ \varepsilon_{z} & 1-\frac{\varepsilon_{z}^{2}}{2} \end{array}\right] \ \ \ \ \ (14)$

That is, we’re approximating ${\cos\varepsilon_{z}}$ by the first two terms in its expansion. Using this, we have

$\displaystyle \left[\begin{array}{c} x\\ y \end{array}\right]{\longrightarrow\atop T\left(\boldsymbol{\varepsilon}\right)}\left[\begin{array}{c} x+\varepsilon_{x}\\ y \end{array}\right] \ \ \ \ \ (15)$

$\displaystyle \left[\begin{array}{c} x+\varepsilon_{x}\\ y \end{array}\right]{\longrightarrow\atop R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)}\left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \ \ \ \ \ (16)$

$\displaystyle \left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right]{\longrightarrow\atop T\left(-\boldsymbol{\varepsilon}\right)}\left[\begin{array}{c} \left(x+\varepsilon_{x}\right)\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\varepsilon_{x}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left(x+\varepsilon_{x}\right)\varepsilon_{z} \end{array}\right] \ \ \ \ \ (17)$

$\displaystyle =\left[\begin{array}{c} x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (18)$

$\displaystyle \left[\begin{array}{c} x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z} \end{array}\right]{\longrightarrow\atop R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)}\left[\begin{array}{c} \left[x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\right]\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\left[y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z}\right]\varepsilon_{z}\\ \left[y\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)+\varepsilon_{z}x+\varepsilon_{x}\varepsilon_{z}\right]\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-\left[x\left(1-\frac{\varepsilon_{z}^{2}}{2}\right)-y\varepsilon_{z}-\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\right]\varepsilon_{z} \end{array}\right] \ \ \ \ \ (19)$

$\displaystyle =\left[\begin{array}{c} x\left(1+\frac{\varepsilon_{z}^{4}}{4}\right)+\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}+\frac{1}{4}\varepsilon_{x}\varepsilon_{z}^{4}\\ y\left(1+\frac{\varepsilon_{z}^{4}}{4}\right)+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (20)$

To get the last line, I used Maple to do the algebra in multiplying out the terms. At this point, we can neglect the terms in ${\varepsilon_{z}^{4}}$, leaving us with the overall transformation:

$\displaystyle \left[\begin{array}{c} x\\ y \end{array}\right]\longrightarrow\left[\begin{array}{c} x+\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\\ y+\varepsilon_{x}\varepsilon_{z} \end{array}\right] \ \ \ \ \ (21)$

This is equivalent to a translation by ${\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}}$, so by analogy with 7, and with ${\boldsymbol{\varepsilon}=\varepsilon_{x}\hat{\mathbf{x}}}$ still denoting the original translation, we have the condition

$\displaystyle U\left[R\left(-\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(-\boldsymbol{\varepsilon}\right)U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]T\left(\boldsymbol{\varepsilon}\right)=T\left(\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}\hat{\mathbf{x}}+\varepsilon_{x}\varepsilon_{z}\hat{\mathbf{y}}\right) \ \ \ \ \ (22)$
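As a cross-check on 21, the same sequence can be composed with exact rotations instead of the truncated matrix 14 (a short sketch, writing ${R\left(\varepsilon_{z}\right)}$ for the exact two-dimensional rotation matrix, which is notation introduced here). A point ${\mathbf{p}}$ is carried to ${R\left(-\varepsilon_{z}\right)\left[R\left(\varepsilon_{z}\right)\left(\mathbf{p}+\boldsymbol{\varepsilon}\right)-\boldsymbol{\varepsilon}\right]=\mathbf{p}+\boldsymbol{\varepsilon}-R\left(-\varepsilon_{z}\right)\boldsymbol{\varepsilon}}$, so for ${\boldsymbol{\varepsilon}=\varepsilon_{x}\hat{\mathbf{x}}}$ the net displacement is

$\displaystyle \boldsymbol{\varepsilon}-R\left(-\varepsilon_{z}\right)\boldsymbol{\varepsilon}=\left[\begin{array}{c} \varepsilon_{x}\left(1-\cos\varepsilon_{z}\right)\\ \varepsilon_{x}\sin\varepsilon_{z} \end{array}\right]=\left[\begin{array}{c} \frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}+\ldots\\ \varepsilon_{x}\varepsilon_{z}+\ldots \end{array}\right]$

where the omitted terms are of order ${\varepsilon_{x}\varepsilon_{z}^{3}}$ and higher. This agrees with 21 to the order retained. It also shows that the ${\varepsilon_{z}^{4}}$ terms multiplying ${x}$ and ${y}$ in 20 come only from truncating the cosine, and that the net displacement is exactly linear in ${\boldsymbol{\varepsilon}}$.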

To expand the operators on the LHS and retain terms up to ${\varepsilon_{x}\varepsilon_{z}^{2}}$, we need to expand the rotation operators up to order ${\varepsilon_{z}^{2}}$. Treating the rotation operator as an exponential, this expansion is

$\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}+\ldots \ \ \ \ \ (23)$

Using this approximation gives us

$\displaystyle \left(I+\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}\right)\left[I+\frac{i}{\hbar}\varepsilon_{x}P_{x}\right]\left(I-\frac{i\varepsilon_{z}}{\hbar}L_{z}+\frac{i^{2}\varepsilon_{z}^{2}}{2\hbar^{2}}L_{z}^{2}\right)\left[I-\frac{i}{\hbar}\varepsilon_{x}P_{x}\right]=I-\frac{i}{\hbar}\left(\frac{1}{2}\varepsilon_{x}\varepsilon_{z}^{2}P_{x}+\varepsilon_{x}\varepsilon_{z}P_{y}\right) \ \ \ \ \ (24)$

By equating the coefficients of ${\varepsilon_{x}\varepsilon_{z}}$ we regain 12, so that condition checks out.

Extracting the coefficient of ${\varepsilon_{x}\varepsilon_{z}^{2}}$ on the LHS gives

$\displaystyle \frac{i^{3}}{\hbar^{3}}\varepsilon_{x}\varepsilon_{z}^{2}\left(-L_{z}P_{x}L_{z}+\frac{L_{z}^{2}P_{x}}{2}-\frac{L_{z}^{2}P_{x}}{2}+\frac{P_{x}L_{z}^{2}}{2}-\frac{L_{z}^{2}P_{x}}{2}+L_{z}^{2}P_{x}\right)=\frac{i^{3}}{\hbar^{3}}\varepsilon_{x}\varepsilon_{z}^{2}\left(-L_{z}P_{x}L_{z}+\frac{L_{z}^{2}P_{x}}{2}+\frac{P_{x}L_{z}^{2}}{2}\right) \ \ \ \ \ (25)$

Matching this to the ${\varepsilon_{x}\varepsilon_{z}^{2}}$ term on the RHS of 24, we get the condition specified in Shankar’s problem:

$\displaystyle -2L_{z}P_{x}L_{z}+L_{z}^{2}P_{x}+P_{x}L_{z}^{2}=\hbar^{2}P_{x} \ \ \ \ \ (26)$

We can show that this condition reduces to the already-known commutators by using the identity

$\displaystyle \left[\Lambda,\left[\Lambda,\Omega\right]\right]=\Lambda\left(\Lambda\Omega-\Omega\Lambda\right)-\left(\Lambda\Omega-\Omega\Lambda\right)\Lambda \ \ \ \ \ (27)$

$\displaystyle =-2\Lambda\Omega\Lambda+\Lambda^{2}\Omega+\Omega\Lambda^{2} \ \ \ \ \ (28)$

Applying this to 26 we have

$\displaystyle -2L_{z}P_{x}L_{z}+L_{z}^{2}P_{x}+P_{x}L_{z}^{2}=\left[L_{z},\left[L_{z},P_{x}\right]\right] \ \ \ \ \ (29)$

$\displaystyle =i\hbar\left[L_{z},P_{y}\right] \ \ \ \ \ (30)$

$\displaystyle =i\hbar\left(-i\hbar P_{x}\right) \ \ \ \ \ (31)$

$\displaystyle =\hbar^{2}P_{x} \ \ \ \ \ (32)$

Thus the more complicated condition 26 actually reduces to existing commutators.
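Since 26 relies only on the commutators of a vector operator with ${L_{z}}$, the same relation should hold with ${P_{x}}$ replaced by ${L_{x}}$. The sketch below checks this stand-in version, and the identity 28, using the ${3\times3}$ matrices ${\left(L_{i}\right)_{jk}=-i\hbar\varepsilon_{ijk}}$ with ${\hbar=1}$. The substitution of ${L_{x}}$ for ${P_{x}}$, the matrix representation and the helper names are assumptions made for illustration; this is not a check of the momentum operator itself.

```python
HBAR = 1.0  # assumption: units with hbar = 1


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def combine(terms):
    """Linear combination from a list of (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(3)]
            for i in range(3)]


def max_abs_diff(a, b):
    """Largest absolute entry of a - b."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


# Illustrative representation: (L_i)_{jk} = -i hbar eps_{ijk}
Lx, Ly, Lz = [[[-1j * HBAR * levi_civita(i, j, k) for k in range(3)]
               for j in range(3)] for i in range(3)]

Lz2 = matmul(Lz, Lz)

# Left side of 26 with L_x standing in for P_x
lhs = combine([(-2, matmul(Lz, matmul(Lx, Lz))),
               (1, matmul(Lz2, Lx)),
               (1, matmul(Lx, Lz2))])

# Double commutator [L_z, [L_z, L_x]]
inner = combine([(1, matmul(Lz, Lx)), (-1, matmul(Lx, Lz))])
double_comm = combine([(1, matmul(Lz, inner)), (-1, matmul(inner, Lz))])

identity_gap = max_abs_diff(lhs, double_comm)                 # tests 28
condition_gap = max_abs_diff(lhs, combine([(HBAR**2, Lx)]))   # tests 26
print(identity_gap, condition_gap)
```

Both gaps should vanish to rounding error, mirroring the steps 29 to 32.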

Rotations through a finite angle; use of polar coordinates

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.3.

The angular momentum operator ${L_{z}}$ is the generator of rotations in the ${xy}$ plane. We did the derivation for infinitesimal rotations, but we can generalize this to finite rotations in a similar manner to that used for translations. The unitary transformation for an infinitesimal rotation is

$\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (1)$

For rotation through a finite angle ${\phi_{0}}$, we divide up the angle into ${N}$ small angles, so ${\varepsilon_{z}=\phi_{0}/N}$. Rotation through the full angle ${\phi_{0}}$ is then given by

$\displaystyle U\left[R\left(\phi_{0}\hat{\mathbf{z}}\right)\right]=\lim_{N\rightarrow\infty}\left(I-\frac{i\phi_{0}L_{z}}{N\hbar}\right)^{N}=e^{-i\phi_{0}L_{z}/\hbar} \ \ \ \ \ (2)$

The limit follows because the only non-trivial operator involved is ${L_{z}}$, so no commutation problems arise.

In rectangular coordinates, ${L_{z}}$ has the relatively non-obvious form

$\displaystyle L_{z}=XP_{y}-YP_{x} \ \ \ \ \ (3)$

$\displaystyle =-i\hbar\left(x\frac{\partial}{\partial y}-y\frac{\partial}{\partial x}\right) \ \ \ \ \ (4)$

so it’s not immediately clear that 2 does in fact lead to the desired rotation. Trying to calculate the exponential with ${L_{z}}$ expressed this way is not easy, given that the two terms ${x\frac{\partial}{\partial y}}$ and ${y\frac{\partial}{\partial x}}$ don’t commute.

It turns out that ${L_{z}}$ has a much simpler form in polar coordinates, and there are two ways of converting it to polar form. First, we recall the transformation equations.

$\displaystyle x=\rho\cos\phi \ \ \ \ \ (5)$

$\displaystyle y=\rho\sin\phi \ \ \ \ \ (6)$

$\displaystyle \rho=\sqrt{x^{2}+y^{2}} \ \ \ \ \ (7)$

$\displaystyle \phi=\tan^{-1}\frac{y}{x} \ \ \ \ \ (8)$

From the chain rule, we can convert the derivatives:

$\displaystyle \frac{\partial}{\partial x}=\frac{\partial\rho}{\partial x}\frac{\partial}{\partial\rho}+\frac{\partial\cos\phi}{\partial x}\frac{\partial}{\partial\left(\cos\phi\right)} \ \ \ \ \ (9)$

$\displaystyle =\frac{\partial\rho}{\partial x}\frac{\partial}{\partial\rho}-\sin\phi\frac{\partial\phi}{\partial x}\frac{\partial}{\left(-\sin\phi\right)\partial\phi} \ \ \ \ \ (10)$

$\displaystyle =\frac{x}{\rho}\frac{\partial}{\partial\rho}-\sin\phi\frac{-y/x^{2}}{1+y^{2}/x^{2}}\left(\frac{-1}{\sin\phi}\right)\frac{\partial}{\partial\phi} \ \ \ \ \ (11)$

$\displaystyle =\frac{x}{\rho}\frac{\partial}{\partial\rho}-\frac{y}{\rho^{2}}\frac{\partial}{\partial\phi} \ \ \ \ \ (12)$

Using similar methods, we get for the other derivative

$\displaystyle \frac{\partial}{\partial y}=\frac{\partial\rho}{\partial y}\frac{\partial}{\partial\rho}+\frac{\partial\sin\phi}{\partial y}\frac{\partial}{\partial\left(\sin\phi\right)} \ \ \ \ \ (13)$

$\displaystyle =\frac{y}{\rho}\frac{\partial}{\partial\rho}+\frac{x}{\rho^{2}}\frac{\partial}{\partial\phi} \ \ \ \ \ (14)$

Plugging these into 4 we have

$\displaystyle L_{z}=-i\hbar\left[x\left(\frac{y}{\rho}\frac{\partial}{\partial\rho}+\frac{x}{\rho^{2}}\frac{\partial}{\partial\phi}\right)-y\left(\frac{x}{\rho}\frac{\partial}{\partial\rho}-\frac{y}{\rho^{2}}\frac{\partial}{\partial\phi}\right)\right] \ \ \ \ \ (15)$

$\displaystyle =-i\hbar\frac{x^{2}+y^{2}}{\rho^{2}}\frac{\partial}{\partial\phi} \ \ \ \ \ (16)$

$\displaystyle =-i\hbar\frac{\partial}{\partial\phi} \ \ \ \ \ (17)$
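The equivalence of 4 and 17 can be spot-checked with finite differences. The sketch below applies both forms of ${L_{z}}$ to an arbitrary smooth test function at one point and compares the results. The test function, the sample point, the step size and the choice ${\hbar=1}$ are all assumptions made for illustration.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def psi(x, y):
    """Arbitrary smooth, non-rotationally-symmetric test function."""
    return (x + 2j * y) * math.exp(-(x * x + 0.5 * y * y))


def lz_cartesian(x, y, h=1e-5):
    """-i hbar (x d/dy - y d/dx) psi, using central differences."""
    dpsi_dx = (psi(x + h, y) - psi(x - h, y)) / (2 * h)
    dpsi_dy = (psi(x, y + h) - psi(x, y - h)) / (2 * h)
    return -1j * HBAR * (x * dpsi_dy - y * dpsi_dx)


def lz_polar(x, y, h=1e-5):
    """-i hbar d(psi)/d(phi) at fixed rho, using central differences."""
    rho, phi = math.hypot(x, y), math.atan2(y, x)

    def psi_polar(angle):
        return psi(rho * math.cos(angle), rho * math.sin(angle))

    return -1j * HBAR * (psi_polar(phi + h) - psi_polar(phi - h)) / (2 * h)


cartesian_value = lz_cartesian(0.8, -0.3)
polar_value = lz_polar(0.8, -0.3)
print(cartesian_value, polar_value)
```

The two values agree up to the finite-difference error, as expected if the ${\partial/\partial\rho}$ terms cancel as in 15 and 16.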

Another method of converting ${L_{z}}$ to polar coordinates is to consider the effect of ${U\left[R\right]}$ for an infinitesimal rotation ${\varepsilon_{z}}$ on a state ${\left|\psi\right\rangle }$ whose wave function in polar coordinates is ${\psi\left(\rho,\phi\right)}$. Shankar states that

$\displaystyle \left\langle \rho,\phi\left|U\left[R\right]\right|\psi\right\rangle =\psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (18)$

If you don’t believe this, it can be shown using a method similar to that for the one-dimensional translation. In this case, we’re dealing with position eigenkets in polar coordinates, so we have

$\displaystyle U\left[R\right]\left|\rho,\phi\right\rangle =\left|\rho,\phi+\varepsilon_{z}\right\rangle \ \ \ \ \ (19)$

Applying this, we get

$\displaystyle \left|\psi_{\varepsilon_{z}}\right\rangle =U\left[R\right]\left|\psi\right\rangle \ \ \ \ \ (20)$

$\displaystyle =U\left[R\right]\int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho,\phi\right\rangle \left\langle \rho,\phi\left|\psi\right.\right\rangle \rho d\rho\;d\phi \ \ \ \ \ (21)$

$\displaystyle =\int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho,\phi+\varepsilon_{z}\right\rangle \left\langle \rho,\phi\left|\psi\right.\right\rangle \rho d\rho\;d\phi \ \ \ \ \ (22)$

$\displaystyle =\int_{0}^{2\pi}\int_{0}^{\infty}\left|\rho^{\prime},\phi^{\prime}\right\rangle \left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime} \ \ \ \ \ (23)$

where in the last line, we used the substitution ${\phi^{\prime}=\phi+\varepsilon_{z}}$. (The substitution ${\rho^{\prime}=\rho}$ is used just to give the radial variable a different name in the integrand.) We can use the same limits of integration for ${\phi}$ and ${\phi^{\prime}}$, since the integrand is periodic in the angle and we just need to ensure that the integral covers the total range of angles. Because the integration measure is ${\rho\,d\rho\,d\phi}$, the position eigenkets are normalized so that ${\left\langle \rho,\phi\left|\rho^{\prime},\phi^{\prime}\right.\right\rangle =\delta\left(\rho-\rho^{\prime}\right)\delta\left(\phi-\phi^{\prime}\right)/\rho^{\prime}}$. It then follows that

$\displaystyle \left\langle \rho,\phi\left|\psi_{\varepsilon_{z}}\right.\right\rangle =\int_{0}^{2\pi}\int_{0}^{\infty}\left\langle \rho,\phi\left|\rho^{\prime},\phi^{\prime}\right.\right\rangle \left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime} \ \ \ \ \ (24)$

$\displaystyle =\int_{0}^{2\pi}\int_{0}^{\infty}\frac{\delta\left(\rho-\rho^{\prime}\right)\delta\left(\phi-\phi^{\prime}\right)}{\rho^{\prime}}\left\langle \rho^{\prime},\phi^{\prime}-\varepsilon_{z}\left|\psi\right.\right\rangle \rho^{\prime}d\rho^{\prime}\;d\phi^{\prime} \ \ \ \ \ (25)$

$\displaystyle =\psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (26)$

Combining this with 1 we have

$\displaystyle \left\langle \rho,\phi\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle =\psi\left(\rho,\phi-\varepsilon_{z}\right) \ \ \ \ \ (27)$

Expanding the RHS to order ${\varepsilon_{z}}$ we have

$\displaystyle \left\langle \rho,\phi\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle =\psi\left(\rho,\phi\right)-\varepsilon_{z}\frac{\partial\psi}{\partial\phi} \ \ \ \ \ (28)$

from which 17 follows again.

Once we have ${L_{z}}$ in this form, the exponential form of a finite rotation is easier to interpret, for we have, from 2

 $\displaystyle e^{-i\phi_{0}L_{z}/\hbar}$ $\displaystyle =$ $\displaystyle \exp\left[-\phi_{0}\frac{\partial}{\partial\phi}\right]\ \ \ \ \ (29)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 1-\phi_{0}\frac{\partial}{\partial\phi}+\frac{\phi_{0}^{2}}{2!}\frac{\partial^{2}}{\partial\phi^{2}}+\ldots \ \ \ \ \ (30)$

Applying this to a state function ${\psi\left(\rho,\phi\right)}$, we see that we get the Taylor series for ${\psi\left(\rho,\phi-\phi_{0}\right)}$, so the exponential does indeed represent a rotation through a finite angle.
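As a quick numerical check of this statement, here is a minimal sketch that applies the truncated series 30 to a test function. The choice of ${\sin\phi}$ as the test function (its derivatives are known in closed form), the function names and the number of terms are all assumptions made for illustration.

```python
import math

def nth_derivative_of_sin(n, phi):
    """n-th derivative of sin(phi); each derivative shifts the argument by pi/2."""
    return math.sin(phi + n * math.pi / 2)

def rotate_by_series(phi, phi0, terms=30):
    """Apply the truncated series for exp(-phi0 d/dphi) to sin(phi)."""
    total = 0.0
    for n in range(terms):
        total += (-phi0) ** n / math.factorial(n) * nth_derivative_of_sin(n, phi)
    return total

phi, phi0 = 0.7, 1.3  # arbitrary test angle and rotation angle
series_value = rotate_by_series(phi, phi0)
shifted_value = math.sin(phi - phi0)  # the function evaluated at phi - phi0
print(series_value, shifted_value)
```

The two printed numbers should agree to within rounding error, which is what we expect if the exponential shifts the argument of the function by ${-\phi_{0}}$.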

Rotational transformations using passive transformations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.2.

We can also derive the generator of rotations ${L_{z}}$ by considering passive transformations of the position and momentum operators, in a way similar to that used for deriving the generator of translations. In a passive transformation, the operators are modified while the state vectors remain the same. For an infinitesimal rotation ${\varepsilon_{z}\hat{\mathbf{z}}}$ about the ${z}$ axis in two dimensions, the unitary operator has the form

$\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (1)$

For a finite rotation by ${\phi_{0}\hat{\mathbf{z}}}$ the transformations are given by

 $\displaystyle \left\langle X\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle X\right\rangle \cos\phi_{0}-\left\langle Y\right\rangle \sin\phi_{0}\ \ \ \ \ (2)$ $\displaystyle \left\langle Y\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle X\right\rangle \sin\phi_{0}+\left\langle Y\right\rangle \cos\phi_{0}\ \ \ \ \ (3)$ $\displaystyle \left\langle P_{x}\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle P_{x}\right\rangle \cos\phi_{0}-\left\langle P_{y}\right\rangle \sin\phi_{0}\ \ \ \ \ (4)$ $\displaystyle \left\langle P_{y}\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle P_{x}\right\rangle \sin\phi_{0}+\left\langle P_{y}\right\rangle \cos\phi_{0} \ \ \ \ \ (5)$

For the infinitesimal transformation, ${\phi_{0}=\varepsilon_{z}}$ and these equations reduce to

 $\displaystyle \left\langle X\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle X\right\rangle -\left\langle Y\right\rangle \varepsilon_{z}\ \ \ \ \ (6)$ $\displaystyle \left\langle Y\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle X\right\rangle \varepsilon_{z}+\left\langle Y\right\rangle \ \ \ \ \ (7)$ $\displaystyle \left\langle P_{x}\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle P_{x}\right\rangle -\left\langle P_{y}\right\rangle \varepsilon_{z}\ \ \ \ \ (8)$ $\displaystyle \left\langle P_{y}\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle P_{x}\right\rangle \varepsilon_{z}+\left\langle P_{y}\right\rangle \ \ \ \ \ (9)$

In the passive transformation scheme, we move the transformation to the operators to get

 $\displaystyle U^{\dagger}\left[R\right]XU\left[R\right]$ $\displaystyle =$ $\displaystyle X-Y\varepsilon_{z}\ \ \ \ \ (10)$ $\displaystyle U^{\dagger}\left[R\right]YU\left[R\right]$ $\displaystyle =$ $\displaystyle X\varepsilon_{z}+Y\ \ \ \ \ (11)$ $\displaystyle U^{\dagger}\left[R\right]P_{x}U\left[R\right]$ $\displaystyle =$ $\displaystyle P_{x}-P_{y}\varepsilon_{z}\ \ \ \ \ (12)$ $\displaystyle U^{\dagger}\left[R\right]P_{y}U\left[R\right]$ $\displaystyle =$ $\displaystyle P_{x}\varepsilon_{z}+P_{y} \ \ \ \ \ (13)$

Substituting 1 into these equations gives us the commutation relations satisfied by ${L_{z}}$. For example, in the first equation we have

 $\displaystyle U^{\dagger}\left[R\right]XU\left[R\right]$ $\displaystyle =$ $\displaystyle \left(I+\frac{i\varepsilon_{z}L_{z}}{\hbar}\right)X\left(I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right)\ \ \ \ \ (14)$ $\displaystyle$ $\displaystyle =$ $\displaystyle X+\frac{i\varepsilon_{z}}{\hbar}\left(L_{z}X-XL_{z}\right)\ \ \ \ \ (15)$ $\displaystyle$ $\displaystyle =$ $\displaystyle X-Y\varepsilon_{z} \ \ \ \ \ (16)$

Equating the last two lines, we get

$\displaystyle \left[X,L_{z}\right]=-i\hbar Y \ \ \ \ \ (17)$

Similarly, for the other three equations we get

 $\displaystyle \left[Y,L_{z}\right]$ $\displaystyle =$ $\displaystyle i\hbar X\ \ \ \ \ (18)$ $\displaystyle \left[P_{x},L_{z}\right]$ $\displaystyle =$ $\displaystyle -i\hbar P_{y}\ \ \ \ \ (19)$ $\displaystyle \left[P_{y},L_{z}\right]$ $\displaystyle =$ $\displaystyle i\hbar P_{x} \ \ \ \ \ (20)$

We can use these commutation relations to derive the form of ${L_{z}}$ by using the commutation relations for coordinates and momenta:

$\displaystyle \left[X,P_{x}\right]=\left[Y,P_{y}\right]=i\hbar \ \ \ \ \ (21)$

with all other commutators involving ${X,Y,P_{x}}$ and ${P_{y}}$ being zero. Starting with 17, we see that

$\displaystyle \left[X,L_{z}\right]=-\left[X,P_{x}\right]Y \ \ \ \ \ (22)$

We can therefore deduce that

$\displaystyle L_{z}=-P_{x}Y+f\left(X,Y,P_{y}\right) \ \ \ \ \ (23)$

where ${f}$ is some unknown function. We must include ${f}$ since the commutators of ${X}$ with ${X,Y}$ and ${P_{y}}$ are all zero, so adding on ${f}$ still satisfies 17. (You can think of it as similar to adding on the constant in an indefinite integral.)

Now from 18, we have

$\displaystyle \left[Y,L_{z}\right]=\left[Y,P_{y}\right]X \ \ \ \ \ (24)$

so combining this with 23 we have

$\displaystyle L_{z}=-P_{x}Y+P_{y}X+g\left(X,Y\right) \ \ \ \ \ (25)$

The undetermined function is now a function only of ${X}$ and ${Y}$, since the dependence of ${L_{z}}$ on ${P_{x}}$ and ${P_{y}}$ has been determined uniquely by the commutators 17 and 18.

From 19 we have

$\displaystyle \left[P_{x},L_{z}\right]=\left[P_{x},X\right]P_{y} \ \ \ \ \ (26)$

We can see that this is satisfied already by 25, except that we now know that the function ${g}$ cannot depend on ${X}$, since then ${\left[P_{x},g\right]\ne0}$. Thus we have narrowed down ${L_{z}}$ to

$\displaystyle L_{z}=-P_{x}Y+P_{y}X+h\left(Y\right) \ \ \ \ \ (27)$

Finally, from 20 we have

$\displaystyle \left[P_{y},L_{z}\right]=-\left[P_{y},Y\right]P_{x} \ \ \ \ \ (28)$

This is satisfied by 27 if we take ${h=0}$ (well, technically, we could take ${h}$ to be some constant, but we might as well take the constant to be zero), giving us the final form for ${L_{z}}$:

$\displaystyle L_{z}=-P_{x}Y+P_{y}X \ \ \ \ \ (29)$
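As a consistency check (not a substitute for the derivation), the sketch below verifies that this ${L_{z}}$ reproduces the four commutators 17 to 20 when the operators act on a polynomial wave function in the position representation. The dictionary representation of polynomials, the helper names, the test polynomial and the choice ${\hbar=1}$ are all assumptions made for illustration.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

# A polynomial in x and y is a dict {(power_of_x, power_of_y): coefficient}.

def combine(p, q, scale=1):
    """Return p + scale*q, dropping terms whose coefficients cancel."""
    out = dict(p)
    for key, coeff in q.items():
        out[key] = out.get(key, 0) + scale * coeff
    return {key: c for key, c in out.items() if abs(c) > 1e-12}

def X(p):   # multiply by x
    return {(a + 1, b): c for (a, b), c in p.items()}

def Y(p):   # multiply by y
    return {(a, b + 1): c for (a, b), c in p.items()}

def Px(p):  # -i hbar d/dx
    return {(a - 1, b): -1j * HBAR * a * c for (a, b), c in p.items() if a > 0}

def Py(p):  # -i hbar d/dy
    return {(a, b - 1): -1j * HBAR * b * c for (a, b), c in p.items() if b > 0}

def Lz(p):  # X P_y - Y P_x, the form found in equation 29
    return combine(X(Py(p)), Y(Px(p)), scale=-1)

def commutator(A, B, p):
    """Apply [A, B] = AB - BA to the polynomial p."""
    return combine(A(B(p)), B(A(p)), scale=-1)

def times(s, p):
    return {key: s * c for key, c in p.items()}

def equal(p, q):
    return combine(p, q, scale=-1) == {}

psi = {(2, 1): 1.0, (0, 3): -2.0, (1, 0): 0.5}  # x^2 y - 2 y^3 + x/2

checks = {
    "[X, Lz] = -i hbar Y": equal(commutator(X, Lz, psi), times(-1j * HBAR, Y(psi))),
    "[Y, Lz] = +i hbar X": equal(commutator(Y, Lz, psi), times(1j * HBAR, X(psi))),
    "[Px, Lz] = -i hbar Py": equal(commutator(Px, Lz, psi), times(-1j * HBAR, Py(psi))),
    "[Py, Lz] = +i hbar Px": equal(commutator(Py, Lz, psi), times(1j * HBAR, Px(psi))),
}
for name, ok in checks.items():
    print(name, ok)
```

Since every operator here is linear and maps polynomials to polynomials, checking on a polynomial with several distinct terms is a reasonable spot test, though of course it is not a proof.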

Rotational invariance in two dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.2.1.

As a first look at rotational invariance in quantum mechanics, we’ll look at two-dimensional rotations about the ${z}$ axis. Classically, a rotation by an angle ${\phi_{0}}$ about the ${z}$ axis is given by the matrix equation for the coordinates

$\displaystyle \left[\begin{array}{c} \bar{x}\\ \bar{y} \end{array}\right]=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right]\left[\begin{array}{c} x\\ y \end{array}\right] \ \ \ \ \ (1)$

The momenta transform the same way, since we are merely changing the direction of the ${x}$ and ${y}$ axes. Thus we have also

$\displaystyle \left[\begin{array}{c} \bar{p}_{x}\\ \bar{p}_{y} \end{array}\right]=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right]\left[\begin{array}{c} p_{x}\\ p_{y} \end{array}\right] \ \ \ \ \ (2)$

The rotation matrix can be written as an operator, defined as

$\displaystyle R\left(\phi_{0}\hat{\mathbf{z}}\right)=\left[\begin{array}{cc} \cos\phi_{0} & -\sin\phi_{0}\\ \sin\phi_{0} & \cos\phi_{0} \end{array}\right] \ \ \ \ \ (3)$

In quantum mechanics, due to the uncertainty principle we cannot specify position and momentum precisely at the same time, so as with the case of translational invariance, we deal with expectation values. As usual, a rotation is represented by a unitary operator ${U\left[R\left(\phi_{0}\hat{\mathbf{z}}\right)\right]}$ so that a quantum state transforms according to

$\displaystyle \left|\psi\right\rangle \rightarrow\left|\psi_{R}\right\rangle =U\left[R\right]\left|\psi\right\rangle \ \ \ \ \ (4)$

Dealing with expectation values means that the rotation operator must satisfy

 $\displaystyle \left\langle X\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle X\right\rangle \cos\phi_{0}-\left\langle Y\right\rangle \sin\phi_{0}\ \ \ \ \ (5)$ $\displaystyle \left\langle Y\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle X\right\rangle \sin\phi_{0}+\left\langle Y\right\rangle \cos\phi_{0}\ \ \ \ \ (6)$ $\displaystyle \left\langle P_{x}\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle P_{x}\right\rangle \cos\phi_{0}-\left\langle P_{y}\right\rangle \sin\phi_{0}\ \ \ \ \ (7)$ $\displaystyle \left\langle P_{y}\right\rangle _{R}$ $\displaystyle =$ $\displaystyle \left\langle P_{x}\right\rangle \sin\phi_{0}+\left\langle P_{y}\right\rangle \cos\phi_{0} \ \ \ \ \ (8)$

The expectation values on the LHS of these equations are calculated using the rotated state, so that

$\displaystyle \left\langle X\right\rangle _{R}=\left\langle \psi_{R}\left|X\right|\psi_{R}\right\rangle \ \ \ \ \ (9)$

and so on.

In two dimensions, the position eigenkets depend on the two independent coordinates ${x}$ and ${y}$, and each of these eigenkets transforms under rotation in the same way as the position variables above. Operating on such an eigenket with the unitary rotation operator thus must give

$\displaystyle U\left[R\right]\left|x,y\right\rangle =\left|x\cos\phi_{0}-y\sin\phi_{0},x\sin\phi_{0}+y\cos\phi_{0}\right\rangle \ \ \ \ \ (10)$

As with the translation operator, we try to construct an explicit form for ${U\left[R\right]}$ by considering an infinitesimal rotation ${\varepsilon_{z}\hat{\mathbf{z}}}$ about the ${z}$ axis. We propose that the unitary operator for this rotation is given by

$\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (11)$

where ${L_{z}}$ is, at this stage, an unknown operator called the generator of infinitesimal rotations (although, as the notation suggests, it will turn out to be the ${z}$ component of angular momentum). Under this rotation, we have, to first order in ${\varepsilon_{z}}$:

$\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle =\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \ \ \ \ \ (12)$

Note that we’ve omitted a possible phase factor in this rotation. That is, we could have written

$\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle =e^{i\varepsilon_{z}g\left(x,y\right)/\hbar}\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \ \ \ \ \ (13)$

for some real function ${g\left(x,y\right)}$. Dropping the phase factor has the effect of making the momentum expectation values transform in the same way as the position expectation values, as shown by Shankar in his equation 12.2.13, so we’ll just take the phase factor to be 1 from now on.

We can now find the position space form of a general state vector ${\left|\psi\right\rangle }$ under an infinitesimal rotation by following a similar procedure to that for a translation.

We have

 $\displaystyle \left|\psi_{\varepsilon_{z}}\right\rangle$ $\displaystyle =$ $\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|\psi\right\rangle \ \ \ \ \ (14)$ $\displaystyle$ $\displaystyle =$ $\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x,y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy\ \ \ \ \ (15)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\left|x,y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x-y\varepsilon_{z},x\varepsilon_{z}+y\right\rangle \left\langle x,y\left|\psi\right.\right\rangle dx\;dy \ \ \ \ \ (17)$

We can now change integration variables if we define

 $\displaystyle x^{\prime}$ $\displaystyle \equiv$ $\displaystyle x-y\varepsilon_{z}\ \ \ \ \ (18)$ $\displaystyle y^{\prime}$ $\displaystyle \equiv$ $\displaystyle x\varepsilon_{z}+y \ \ \ \ \ (19)$

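To transform the differentials, we keep terms only up to first order in infinitesimal quantities. (More precisely, the Jacobian determinant of the transformation is ${1+\varepsilon_{z}^{2}}$, so the area element satisfies ${dx^{\prime}\;dy^{\prime}=dx\;dy}$ to first order in ${\varepsilon_{z}}$.) We have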

 $\displaystyle dx^{\prime}$ $\displaystyle =$ $\displaystyle dx-\varepsilon_{z}dy=dx\ \ \ \ \ (20)$ $\displaystyle dy^{\prime}$ $\displaystyle =$ $\displaystyle \varepsilon_{z}dx+dy=dy \ \ \ \ \ (21)$

Also, to first order in infinitesimal quantities, we can invert the variables to get

 $\displaystyle x^{\prime}+\varepsilon_{z}y^{\prime}$ $\displaystyle =$ $\displaystyle x-y\varepsilon_{z}+x\varepsilon_{z}^{2}+y\varepsilon_{z}=x\ \ \ \ \ (22)$ $\displaystyle y^{\prime}-\varepsilon_{z}x^{\prime}$ $\displaystyle =$ $\displaystyle x\varepsilon_{z}+y-x\varepsilon_{z}+y\varepsilon_{z}^{2}=y \ \ \ \ \ (23)$

The ranges of integration are still ${\pm\infty}$, so we end up with

$\displaystyle \left|\psi_{\varepsilon_{z}}\right\rangle =\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left|x^{\prime},y^{\prime}\right\rangle \left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime} \ \ \ \ \ (24)$

Multiplying on the left by the bra ${\left\langle x,y\right|}$ we have

 $\displaystyle \left\langle x,y\left|\psi_{\varepsilon_{z}}\right.\right\rangle$ $\displaystyle =$ $\displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left\langle x,y\left|x^{\prime},y^{\prime}\right.\right\rangle \left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime}\ \ \ \ \ (25)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\delta\left(x-x^{\prime}\right)\delta\left(y-y^{\prime}\right)\left\langle x^{\prime}+\varepsilon_{z}y^{\prime},y^{\prime}-\varepsilon_{z}x^{\prime}\left|\psi\right.\right\rangle dx^{\prime}\;dy^{\prime}\ \ \ \ \ (26)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle x+\varepsilon_{z}y,y-\varepsilon_{z}x\left|\psi\right.\right\rangle \ \ \ \ \ (27)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \psi\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right) \ \ \ \ \ (28)$

This can now be expanded in a 2-variable Taylor series to give, to first order in ${\varepsilon_{z}}$:

$\displaystyle \psi\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right)=\psi\left(x,y\right)+y\varepsilon_{z}\frac{\partial\psi}{\partial x}-x\varepsilon_{z}\frac{\partial\psi}{\partial y} \ \ \ \ \ (29)$

We can compare this with 11 inserted into 14:

 $\displaystyle \left\langle x,y\left|\psi_{\varepsilon_{z}}\right.\right\rangle$ $\displaystyle =$ $\displaystyle \left\langle x,y\left|U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]\right|\psi\right\rangle \ \ \ \ \ (30)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\langle x,y\left|I-\frac{i\varepsilon_{z}L_{z}}{\hbar}\right|\psi\right\rangle \ \ \ \ \ (31)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \psi\left(x,y\right)-\frac{i\varepsilon_{z}}{\hbar}\left\langle x,y\left|L_{z}\right|\psi\right\rangle \ \ \ \ \ (32)$

Setting 32 equal to 29 we have

 $\displaystyle -\frac{i\varepsilon_{z}}{\hbar}\left\langle x,y\left|L_{z}\right|\psi\right\rangle$ $\displaystyle =$ $\displaystyle y\varepsilon_{z}\frac{\partial\psi}{\partial x}-x\varepsilon_{z}\frac{\partial\psi}{\partial y}\ \ \ \ \ (33)$ $\displaystyle \left\langle x,y\left|L_{z}\right|\psi\right\rangle$ $\displaystyle =$ $\displaystyle x\left(-i\hbar\frac{\partial\psi}{\partial y}\right)-y\left(-i\hbar\frac{\partial\psi}{\partial x}\right) \ \ \ \ \ (34)$

Using the position-space forms of the momenta

 $\displaystyle P_{x}$ $\displaystyle =$ $\displaystyle -i\hbar\frac{\partial}{\partial x}\ \ \ \ \ (35)$ $\displaystyle P_{y}$ $\displaystyle =$ $\displaystyle -i\hbar\frac{\partial}{\partial y} \ \ \ \ \ (36)$

we see that ${L_{z}}$ is given by

$\displaystyle L_{z}=XP_{y}-YP_{x} \ \ \ \ \ (37)$

which is the quantum equivalent of the ${z}$ component of angular momentum, as promised.
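To see the first-order agreement between 28 and 32 numerically, here is a minimal sketch that compares the wave function evaluated at the shifted arguments with the result of applying ${I-i\varepsilon_{z}L_{z}/\hbar}$. The Gaussian test function, the sample point, the function names and the choice ${\hbar=1}$ are assumptions made for illustration.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1

def psi(x, y):
    """Illustrative test wave function (not normalized)."""
    return math.exp(-(x - 1.0) ** 2 - 2.0 * y ** 2)

def dpsi_dx(x, y):
    return -2.0 * (x - 1.0) * psi(x, y)

def dpsi_dy(x, y):
    return -4.0 * y * psi(x, y)

def Lz_psi(x, y):
    """<x,y|L_z|psi> = x (-i hbar dpsi/dy) - y (-i hbar dpsi/dx), as in equation 34."""
    return -1j * HBAR * (x * dpsi_dy(x, y) - y * dpsi_dx(x, y))

def shifted_arguments(x, y, eps):
    """psi(x + eps*y, y - eps*x), as in equation 28."""
    return psi(x + eps * y, y - eps * x)

def first_order_operator(x, y, eps):
    """(I - i eps L_z / hbar) acting on psi, as in equation 32."""
    return psi(x, y) - 1j * eps / HBAR * Lz_psi(x, y)

x, y = 0.4, -0.3  # arbitrary sample point
errors = [abs(shifted_arguments(x, y, eps) - first_order_operator(x, y, eps))
          for eps in (1e-2, 1e-3)]
print(errors)
```

If the two expressions agree to first order, the difference should be of order ${\varepsilon_{z}^{2}}$, so reducing ${\varepsilon_{z}}$ by a factor of 10 should reduce the error by a factor of about 100.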

Infinitesimal rotations in canonical and noncanonical transformations

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.8; Exercises 2.8.3 – 2.8.4.

Here are a couple of examples of transformations of variables and their consequences with regard to conservation laws.

First, we look at the 2-d harmonic oscillator where the Hamiltonian is

$\displaystyle H=\frac{1}{2m}\left(p_{x}^{2}+p_{y}^{2}\right)+\frac{1}{2}m\omega^{2}\left(x^{2}+y^{2}\right) \ \ \ \ \ (1)$

If we rotate the system so that both the coordinates and momenta get rotated, then

 $\displaystyle \bar{x}$ $\displaystyle =$ $\displaystyle x\cos\theta-y\sin\theta\ \ \ \ \ (2)$ $\displaystyle \bar{y}$ $\displaystyle =$ $\displaystyle x\sin\theta+y\cos\theta\ \ \ \ \ (3)$ $\displaystyle \bar{p}_{x}$ $\displaystyle =$ $\displaystyle p_{x}\cos\theta-p_{y}\sin\theta\ \ \ \ \ (4)$ $\displaystyle \bar{p}_{y}$ $\displaystyle =$ $\displaystyle p_{x}\sin\theta+p_{y}\cos\theta \ \ \ \ \ (5)$

We can show by direct calculation that ${H}$ is invariant under this transformation, and we can verify that this is a canonical transformation. Shankar shows in his equation 2.8.8 that the generator of this transformation is the angular momentum ${\ell_{z}=xp_{y}-yp_{x}}$.

However, if we rotate only the coordinates and not the momenta, we get the transformation:

 $\displaystyle \bar{x}$ $\displaystyle =$ $\displaystyle x\cos\theta-y\sin\theta\ \ \ \ \ (6)$ $\displaystyle \bar{y}$ $\displaystyle =$ $\displaystyle x\sin\theta+y\cos\theta\ \ \ \ \ (7)$ $\displaystyle \bar{p}_{x}$ $\displaystyle =$ $\displaystyle p_{x}\ \ \ \ \ (8)$ $\displaystyle \bar{p}_{y}$ $\displaystyle =$ $\displaystyle p_{y} \ \ \ \ \ (9)$

Again, we can show by direct calculation that

$\displaystyle \bar{x}^{2}+\bar{y}^{2}=x^{2}+y^{2} \ \ \ \ \ (10)$

so ${H}$ is also invariant under this transformation. However, this transformation is noncanonical, as we can see by calculating one of the Poisson brackets:

 $\displaystyle \left\{ \bar{x},\bar{p}_{x}\right\}$ $\displaystyle =$ $\displaystyle \sum_{i}\left(\frac{\partial\overline{x}}{\partial q_{i}}\frac{\partial\bar{p}_{x}}{\partial p_{i}}-\frac{\partial\overline{x}}{\partial p_{i}}\frac{\partial\bar{p}_{x}}{\partial q_{i}}\right)\ \ \ \ \ (11)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \cos\theta\ne1 \ \ \ \ \ (12)$

The other mixed brackets (with a coordinate and a momentum) also fail to take the values 0 or 1 that would be required if the transformation were canonical; for example, ${\left\{ \bar{x},\bar{p}_{y}\right\} =-\sin\theta\ne0}$.
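The Poisson brackets for the two transformations can also be checked numerically. The sketch below estimates the partial derivatives by central differences, which is an assumption of convenience (the brackets here are simple enough to do by hand); the function names, the sample point and the value of ${\theta}$ are likewise illustrative.

```python
import math

def poisson_bracket(f, g, point, h=1e-5):
    """Numerical {f, g} at point = (x, y, px, py), using central differences."""
    def partial(func, index):
        up, down = list(point), list(point)
        up[index] += h
        down[index] -= h
        return (func(*up) - func(*down)) / (2 * h)

    total = 0.0
    for q_index, p_index in ((0, 2), (1, 3)):  # the pairs (x, px) and (y, py)
        total += partial(f, q_index) * partial(g, p_index)
        total -= partial(f, p_index) * partial(g, q_index)
    return total

theta = 0.6  # arbitrary rotation angle

def x_bar(x, y, px, py):
    return x * math.cos(theta) - y * math.sin(theta)

def px_bar_rotated(x, y, px, py):
    """Momentum rotated along with the coordinates (equation 4)."""
    return px * math.cos(theta) - py * math.sin(theta)

def px_bar_unrotated(x, y, px, py):
    """Momentum left alone (equation 8)."""
    return px

point = (0.3, -1.1, 0.8, 0.5)  # arbitrary phase space point
full_rotation = poisson_bracket(x_bar, px_bar_rotated, point)
coordinates_only = poisson_bracket(x_bar, px_bar_unrotated, point)
print(full_rotation, coordinates_only, math.cos(theta))
```

For the full rotation the bracket should come out as 1, while for the coordinates-only rotation it should match ${\cos\theta}$.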

In order for this transformation to give rise to a conservation law, we would need to find a generator ${g}$ that satisfied, for an infinitesimal rotation ${\varepsilon}$:

 $\displaystyle \bar{q}_{i}$ $\displaystyle =$ $\displaystyle q_{i}+\varepsilon\frac{\partial g}{\partial p_{i}}\equiv q_{i}+\delta q_{i}\ \ \ \ \ (13)$ $\displaystyle \bar{p}_{i}$ $\displaystyle =$ $\displaystyle p_{i}-\varepsilon\frac{\partial g}{\partial q_{i}}\equiv p_{i}+\delta p_{i} \ \ \ \ \ (14)$

For an infinitesimal rotation, the transformation 6 becomes

 $\displaystyle \bar{x}$ $\displaystyle =$ $\displaystyle x-\varepsilon y\ \ \ \ \ (15)$ $\displaystyle \bar{y}$ $\displaystyle =$ $\displaystyle y+\varepsilon x\ \ \ \ \ (16)$ $\displaystyle \bar{p}_{x}$ $\displaystyle =$ $\displaystyle p_{x}\ \ \ \ \ (17)$ $\displaystyle \bar{p}_{y}$ $\displaystyle =$ $\displaystyle p_{y} \ \ \ \ \ (18)$

Therefore, the generator would have to satisfy

 $\displaystyle \frac{\partial g}{\partial p_{x}}$ $\displaystyle =$ $\displaystyle -y\ \ \ \ \ (19)$ $\displaystyle \frac{\partial g}{\partial p_{y}}$ $\displaystyle =$ $\displaystyle x\ \ \ \ \ (20)$ $\displaystyle \frac{\partial g}{\partial x}$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (21)$ $\displaystyle \frac{\partial g}{\partial y}$ $\displaystyle =$ $\displaystyle 0 \ \ \ \ \ (22)$

The last two conditions state that ${g}$ cannot depend on ${x}$ or ${y}$, but integrating the first two conditions, we get

$\displaystyle g=-yp_{x}+xp_{y}+f\left(x,y\right) \ \ \ \ \ (23)$

where ${f}$ is a function that depends only on ${x}$ and/or ${y}$. The terms ${-yp_{x}+xp_{y}}$ necessarily depend on ${x}$ and ${y}$, and no choice of ${f\left(x,y\right)}$ can cancel this dependence, so 21 and 22 cannot be satisfied. Thus there is no ${g}$ that satisfies all four conditions, so there is no conservation law associated with a rotation of the coordinates only, even though the Hamiltonian is invariant under this transformation. Only canonical transformations that leave ${H}$ invariant give rise to conservation laws.

As another example, suppose we have the one-dimensional system with

$\displaystyle H=\frac{1}{2}\left(p^{2}+x^{2}\right) \ \ \ \ \ (24)$

and perform a rotation in phase space, that is, in the ${x-p}$ plane:

 $\displaystyle \bar{x}$ $\displaystyle =$ $\displaystyle x\cos\theta-p\sin\theta\ \ \ \ \ (25)$ $\displaystyle \bar{p}$ $\displaystyle =$ $\displaystyle x\sin\theta+p\cos\theta \ \ \ \ \ (26)$

The Hamiltonian is invariant:

 $\displaystyle \bar{p}^{2}+\bar{x}^{2}$ $\displaystyle =$ $\displaystyle x^{2}\sin^{2}\theta+2xp\sin\theta\cos\theta+p^{2}\cos^{2}\theta+\ \ \ \ \ (27)$ $\displaystyle$ $\displaystyle$ $\displaystyle x^{2}\cos^{2}\theta-2xp\sin\theta\cos\theta+p^{2}\sin^{2}\theta\ \ \ \ \ (28)$ $\displaystyle$ $\displaystyle =$ $\displaystyle x^{2}+p^{2} \ \ \ \ \ (29)$

The transformation is canonical as we can verify by calculating the Poisson bracket

 $\displaystyle \left\{ \bar{x},\bar{p}\right\}$ $\displaystyle =$ $\displaystyle \frac{\partial\overline{x}}{\partial x}\frac{\partial\bar{p}}{\partial p}-\frac{\partial\overline{x}}{\partial p}\frac{\partial\bar{p}}{\partial x}\ \ \ \ \ (30)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \cos^{2}\theta-\left(-\sin^{2}\theta\right)\ \ \ \ \ (31)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 1 \ \ \ \ \ (32)$

An infinitesimal rotation gives the transformation

 $\displaystyle \bar{x}$ $\displaystyle =$ $\displaystyle x-\varepsilon p\ \ \ \ \ (33)$ $\displaystyle \bar{p}$ $\displaystyle =$ $\displaystyle p+\varepsilon x \ \ \ \ \ (34)$

To find the generator, we need to solve 13 and 14:

 $\displaystyle \frac{\partial g}{\partial p}$ $\displaystyle =$ $\displaystyle -p\ \ \ \ \ (35)$ $\displaystyle \frac{\partial g}{\partial x}$ $\displaystyle =$ $\displaystyle -x \ \ \ \ \ (36)$

These can be integrated to give

$\displaystyle g\left(x,p\right)=-\frac{1}{2}\left(p^{2}+x^{2}\right)+C \ \ \ \ \ (37)$

where ${C}$ is a constant of integration. Thus the conserved quantity (apart from the minus sign, which we could eliminate by rotating through ${-\theta}$ instead of ${\theta}$) is just the original Hamiltonian, or total energy.
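To illustrate how the infinitesimal transformation 33 and 34 builds up the finite rotation 25 and 26, the following sketch composes a large number of infinitesimal steps and compares the result with the finite rotation. The step count, the starting point and the function names are assumptions chosen for illustration.

```python
import math

def compose_infinitesimal_rotations(x, p, theta, steps):
    """Apply x -> x - eps*p, p -> p + eps*x repeatedly with eps = theta/steps."""
    eps = theta / steps
    for _ in range(steps):
        # Tuple assignment evaluates the right side with the old x and p.
        x, p = x - eps * p, p + eps * x
    return x, p

def finite_rotation(x, p, theta):
    """The finite phase space rotation of equations 25 and 26."""
    return (x * math.cos(theta) - p * math.sin(theta),
            x * math.sin(theta) + p * math.cos(theta))

def hamiltonian(x, p):
    return 0.5 * (p * p + x * x)

x0, p0, theta = 1.2, -0.4, 0.9  # arbitrary starting point and angle
x_steps, p_steps = compose_infinitesimal_rotations(x0, p0, theta, steps=100_000)
x_exact, p_exact = finite_rotation(x0, p0, theta)
print(x_steps - x_exact, p_steps - p_exact)
print(hamiltonian(x0, p0), hamiltonian(x_exact, p_exact), hamiltonian(x_steps, p_steps))
```

The stepwise result is only approximate, since each step is accurate to first order in ${\varepsilon}$, but the discrepancy should shrink as the number of steps grows, and the Hamiltonian should be unchanged by the finite rotation.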

Angular momentum as a generator of rotations

Required math: calculus

Required physics: 3-d Schrödinger equation

Reference: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 4.56.

An interesting property of the operator ${L_{z}}$ is that it can act as a generator of rotations about the ${z}$ axis.

Using the series expansion of the exponential, and the form of ${L_{z}}$ in spherical coordinates, ${L_{z}=(\hbar/i)\partial/\partial\phi}$, we get

$\displaystyle e^{iL_{z}\varphi/\hbar}f(\phi)=\sum_{j=0}^{\infty}\frac{\varphi^{j}}{j!}\frac{\partial^{j}f}{\partial\phi^{j}} \ \ \ \ \ (1)$

which is the Taylor series for ${f(\phi+\varphi)}$. Thus the operator ${e^{iL_{z}\varphi/\hbar}}$ effectively rotates ${f\left(\phi\right)}$ through an angle ${\varphi}$.

In general, ${e^{i\mathbf{L}\cdot\hat{n}\varphi/\hbar}}$ is an operator that will rotate a function through an angle ${\varphi}$ about the axis ${\hat{n}}$.

The use of ${\mathbf{L}}$ causes rotations in ordinary 3-d space. If we want to rotate spinors, we can use the spin operator ${\mathbf{S}}$, and in the case of spin 1/2, we can use the Pauli matrices to produce the operator ${e^{i\sigma\cdot\hat{n}\varphi/2}}$, which will rotate a spin 1/2 spinor ${\chi_{\pm}}$.

To see what this means, we can work out the exponential in a more convenient form. We start with

$\displaystyle \hat{n}\cdot\mathbf{\sigma}=\sigma_{x}\sin\theta\cos\phi+\sigma_{y}\sin\theta\sin\phi+\sigma_{z}\cos\theta \ \ \ \ \ (2)$

Substituting the spin matrices, we get

$\displaystyle \hat{n}\cdot\mathbf{\sigma}=\left(\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right)\sin\theta\cos\phi+\left(\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right)\sin\theta\sin\phi+\left(\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right)\cos\theta=\left(\begin{array}{cc} \cos\theta & \sin\theta e^{-i\phi}\\ \sin\theta e^{i\phi} & -\cos\theta \end{array}\right) \ \ \ \ \ (3)$

Note that (by direct multiplication):

 $\displaystyle (\hat{n}\cdot\mathbf{\sigma})^{2j}$ $\displaystyle =$ $\displaystyle I\ \ \ \ \ (4)$ $\displaystyle (\hat{n}\cdot\mathbf{\sigma})^{2j+1}$ $\displaystyle =$ $\displaystyle \hat{n}\cdot\mathbf{\sigma} \ \ \ \ \ (5)$

where ${j=0,1,2,3,\ldots}$ That is, all even powers of ${\hat{n}\cdot\mathbf{\sigma}}$ are the identity matrix ${I}$ and all odd powers are ${\hat{n}\cdot\mathbf{\sigma}}$ itself. We can plug this into the expression ${e^{i\sigma\cdot\hat{n}\varphi/2}}$ for spinor rotations and use the series expansion of the exponential:

 $\displaystyle e^{i(\hat{n}\cdot\mathbf{\sigma})\varphi/2}$ $\displaystyle =$ $\displaystyle \sum_{j=0}^{\infty}\frac{(i(\hat{n}\cdot\mathbf{\sigma})\varphi/2)^{j}}{j!}\ \ \ \ \ (6)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \sum_{j=0}^{\infty}\frac{(i\varphi/2)^{2j}}{(2j)!}+(\hat{n}\cdot\mathbf{\sigma})\sum_{j=0}^{\infty}\frac{(i\varphi/2)^{2j+1}}{(2j+1)!}\ \ \ \ \ (7)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 1-\frac{(\varphi/2)^{2}}{2!}+\frac{(\varphi/2)^{4}}{4!}-\ldots+i(\hat{n}\cdot\mathbf{\sigma})\left[\frac{(\varphi/2)}{1!}-\frac{(\varphi/2)^{3}}{3!}+\frac{(\varphi/2)^{5}}{5!}-\ldots\right]\ \ \ \ \ (8)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \cos(\varphi/2)+i(\hat{n}\cdot\mathbf{\sigma})\sin(\varphi/2) \ \ \ \ \ (9)$

where we have used the standard series expansions for cos and sin to get the last line.
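As a numerical check of 9, the sketch below sums the exponential series term by term for an arbitrary axis and angle and compares the result with the closed form. It uses plain nested lists for the ${2\times2}$ matrices; the helper names, the axis angles and the number of series terms are assumptions made for illustration.

```python
import math

IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(*terms):
    """Sum of coefficient * matrix over the given (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def n_dot_sigma(theta, phi):
    """Equation 2, with the unit vector given by spherical angles theta and phi."""
    nx = math.sin(theta) * math.cos(phi)
    ny = math.sin(theta) * math.sin(phi)
    nz = math.cos(theta)
    return lincomb((nx, SIGMA_X), (ny, SIGMA_Y), (nz, SIGMA_Z))

def rotation_by_series(ns, angle, terms=40):
    """Truncated version of the series in equation 6."""
    result = [[0, 0], [0, 0]]
    power = IDENTITY                         # (i ns angle/2)^j, starting at j = 0
    step = lincomb((1j * angle / 2, ns))
    for j in range(terms):
        result = lincomb((1, result), (1 / math.factorial(j), power))
        power = matmul(power, step)
    return result

def rotation_closed_form(ns, angle):
    """Equation 9: cos(angle/2) I + i sin(angle/2) n.sigma."""
    return lincomb((math.cos(angle / 2), IDENTITY), (1j * math.sin(angle / 2), ns))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

ns = n_dot_sigma(theta=0.8, phi=2.1)  # arbitrary axis
angle = 1.7                           # arbitrary rotation angle
square_error = max_abs_diff(matmul(ns, ns), IDENTITY)
series_error = max_abs_diff(rotation_by_series(ns, angle), rotation_closed_form(ns, angle))
print(square_error, series_error)
```

Both printed errors should be at the level of floating point rounding, consistent with 4 and 9.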

If the axis of rotation is the ${x}$-axis, then ${\hat{n}=[1,0,0]}$ and ${\hat{n}\cdot\mathbf{\sigma}=\sigma_{x}}$ so for a rotation of ${\varphi=\pi}$ we get for the rotation matrix ${R}$:

$\displaystyle R=e^{i\sigma\cdot\hat{n}\varphi/2}=\left(\begin{array}{cc} 0 & i\\ i & 0 \end{array}\right)=i\sigma_{x} \ \ \ \ \ (10)$

which swaps ${\chi_{+}}$ and ${\chi_{-}}$; that is, it converts spin up into spin down, and vice versa, as you would expect. The extra factor of ${i}$ is a phase shift in the wave function and can produce interference effects between particles.

With ${\hat{n}=[0,1,0]}$ and ${\varphi=\pi/2}$ we get ${\hat{n}\cdot\mathbf{\sigma}=\sigma_{y}}$ and

$\displaystyle R=\frac{\sqrt{2}}{2}(I+i\sigma_{y})=\frac{\sqrt{2}}{2}\left(\begin{array}{cc} 1 & 1\\ -1 & 1 \end{array}\right) \ \ \ \ \ (11)$

When applied to ${\chi_{+}}$ we get

$\displaystyle R\chi_{+}=\frac{\sqrt{2}}{2}\left(\begin{array}{cc} 1 & 1\\ -1 & 1 \end{array}\right)\left(\begin{array}{c} 1\\ 0 \end{array}\right)=\frac{\sqrt{2}}{2}\left(\begin{array}{c} 1\\ -1 \end{array}\right) \ \ \ \ \ (12)$

This is an eigenspinor of ${\sigma_{x}}$ (with eigenvalue ${-1}$), which again is what you’d expect, since a rotation by ${\pi/2}$ about the ${y}$ axis takes the ${z}$ direction into a direction along the ${x}$ axis.

With ${\hat{n}=[0,0,1]}$ and ${\varphi=2\pi}$ we get ${\hat{n}\cdot\mathbf{\sigma}=\sigma_{z}}$ and

$\displaystyle R=-I=\left(\begin{array}{cc} -1 & 0\\ 0 & -1 \end{array}\right) \ \ \ \ \ (13)$

The fact that a rotation through ${2\pi}$ produces a factor of ${-1}$ is another phase shift effect, as in the first example above, and does actually produce interference effects, for example, when experiments involving rotation in a magnetic field are done.
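The three special cases above can be reproduced with a few lines of code. This is a minimal sketch that writes out the closed form 9 directly in terms of the components of ${\hat{n}}$; the function and variable names are assumptions made for illustration.

```python
import math

def spinor_rotation(n, angle):
    """cos(angle/2) I + i sin(angle/2) n.sigma for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c + 1j * s * nz, 1j * s * (nx - 1j * ny)],
            [1j * s * (nx + 1j * ny), c - 1j * s * nz]]

def apply(matrix, spinor):
    """Multiply a 2x2 matrix into a two-component spinor."""
    return [matrix[0][0] * spinor[0] + matrix[0][1] * spinor[1],
            matrix[1][0] * spinor[0] + matrix[1][1] * spinor[1]]

x_axis_pi = spinor_rotation((1, 0, 0), math.pi)           # compare with equation 10
y_axis_half_pi = spinor_rotation((0, 1, 0), math.pi / 2)  # compare with equation 11
z_axis_two_pi = spinor_rotation((0, 0, 1), 2 * math.pi)   # compare with equation 13

chi_plus = (1, 0)
rotated_chi_plus = apply(y_axis_half_pi, chi_plus)        # compare with equation 12

for name, value in [("x axis, pi", x_axis_pi),
                    ("y axis, pi/2", y_axis_half_pi),
                    ("z axis, 2 pi", z_axis_two_pi),
                    ("R chi_plus", rotated_chi_plus)]:
    print(name, value)
```

Because of floating point rounding, entries that are exactly zero in the formulas may print as numbers of order ${10^{-16}}$.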

Extra bit

Irrelevant to the question, but a cool proof so I thought I’d include it anyway.

The ${k}$th derivative of ${x^{n}f(x)}$ is given by

$\displaystyle \frac{d^{k}}{dx^{k}}(x^{n}f(x))=\sum_{j=0}^{k}\frac{n!}{(n-j)!}\left(\begin{array}{c} k\\ j \end{array}\right)x^{n-j}f^{(k-j)} \ \ \ \ \ (14)$

where ${\left(\begin{array}{c} k\\ j \end{array}\right)=\frac{k!}{j!(k-j)!}}$ is the binomial coefficient, and ${f^{(k-j)}}$ is the ${(k-j)}$th derivative of ${f}$.

We can prove this by induction. First, we prove the anchor step, for ${k=0}$. Setting ${k=0}$ in 14, both sides of the equation give ${x^{n}f(x)}$, so the formula is valid here.

Next, we assume the above equation is valid for ${k}$ and prove this implies it is valid also for ${k+1}$. Taking the derivative of both sides gives

$\displaystyle \frac{d^{k+1}}{dx^{k+1}}(x^{n}f(x))=\sum_{j=0}^{k}\frac{n!}{(n-j)!}\left(\begin{array}{c} k\\ j \end{array}\right)(n-j)x^{n-j-1}f^{(k-j)}+\sum_{j=0}^{k}\frac{n!}{(n-j)!}\left(\begin{array}{c} k\\ j \end{array}\right)x^{n-j}f^{(k-j+1)} \ \ \ \ \ (15)$

$\displaystyle =\frac{n!}{(n-k)!}(n-k)x^{n-k-1}f+\sum_{j=1}^{k}x^{n-j}f^{(k-j+1)}\left[\frac{n!}{(n-j+1)!}\left(\begin{array}{c} k\\ j-1 \end{array}\right)(n-j+1)+\frac{n!}{(n-j)!}\left(\begin{array}{c} k\\ j \end{array}\right)\right]+x^{n}f^{(k+1)} \ \ \ \ \ (16)$

$\displaystyle =\frac{n!}{(n-k-1)!}x^{n-k-1}f+\sum_{j=1}^{k}x^{n-j}f^{(k-j+1)}\frac{n!}{(n-j)!}\left[\left(\begin{array}{c} k\\ j-1 \end{array}\right)+\left(\begin{array}{c} k\\ j \end{array}\right)\right]+x^{n}f^{(k+1)} \ \ \ \ \ (17)$

$\displaystyle =\sum_{j=0}^{k+1}\frac{n!}{(n-j)!}\left(\begin{array}{c} k+1\\ j \end{array}\right)x^{n-j}f^{(k+1-j)} \ \ \ \ \ (18)$

In going from 15 to 16, we have separated out the ${j=k}$ term from the first sum and the ${j=0}$ term from the second sum. Then we replaced ${j}$ by ${j-1}$ in the first sum so we could group together common powers of ${x}$ in the two sums.

The last step uses the formula

$\displaystyle \left(\begin{array}{c} k\\ j-1 \end{array}\right)+\left(\begin{array}{c} k\\ j \end{array}\right)=\left(\begin{array}{c} k+1\\ j \end{array}\right) \ \ \ \ \ (19)$

which can be proved by putting the LHS over a common denominator and adding.
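The formula 14 can also be spot-checked by machine. The sketch below takes ${f}$ to be a polynomial with integer coefficients, so that all arithmetic is exact, and compares a direct ${k}$th derivative of ${x^{n}f(x)}$ with the sum in 14. The coefficient-list representation, the test polynomial and the function names are assumptions made for illustration.

```python
import math

def differentiate(coeffs, times=1):
    """Differentiate a polynomial given as [c0, c1, c2, ...] the stated number of times."""
    for _ in range(times):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    return coeffs

def evaluate(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def direct(n, f, k, x):
    """k-th derivative of x^n f(x): shift the coefficients by n, then differentiate."""
    return evaluate(differentiate([0] * n + f, k), x)

def by_formula(n, f, k, x):
    """Right-hand side of equation 14; terms with j > n vanish and are skipped."""
    total = 0
    for j in range(min(k, n) + 1):
        falling = math.factorial(n) // math.factorial(n - j)  # n!/(n-j)!
        total += (falling * math.comb(k, j) * x ** (n - j)
                  * evaluate(differentiate(f, k - j), x))
    return total

f = [3, -1, 0, 2, 5]  # 3 - x + 2x^3 + 5x^4
n, x = 3, 2
results = [(k, direct(n, f, k, x), by_formula(n, f, k, x)) for k in range(7)]
for row in results:
    print(row)
```

Each printed row contains ${k}$ followed by the direct derivative and the value from the formula, and the last two entries should be equal in every row.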