Welcome to Physics Pages

This blog consists of my notes and solutions to problems in various areas of mainstream physics. An index to the topics covered is contained in the links in the sidebar on the right, or in the menu at the top of the page.

This isn’t a “popular science” site, in that most posts use a fair bit of mathematics to explain their concepts. Thus this blog aims mainly to help those who are learning or reviewing physics in depth. More details on what the site contains and how to use it are on the welcome page.

Despite Stephen Hawking’s caution that every equation included in a book (or, I suppose, in a blog) would halve the readership, this blog has proved very popular since its inception in December 2010. Details of the number of visits and distinct visitors are given on the hit statistics page.

Many thanks to my loyal followers and best wishes to everyone who visits. I hope you find it useful. Constructive criticism (or even praise) is always welcome, so feel free to leave a comment in response to any of the posts.

I should point out that although I did study physics at the university level, this was back in the 1970s and by the time I started this blog in December 2010, I had forgotten pretty much everything I had learned back then. This blog represents my journey back to some level of literacy in physics. I am by no means a professional physicist or an authority on any aspect of the subject. I offer this blog as a record of my own notes and problem solutions as I worked through various books, in the hope that it will help, and possibly even inspire, others to explore this wonderful subject.

Before leaving a comment, you may find it useful to read the “Instructions for commenters”.

Total angular momentum – finite rotations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercises 12.5.4 – 12.5.5.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

For infinitesimal 3-d rotations, we’ve seen that the generator is {\hat{\boldsymbol{\theta}}\cdot\mathbf{L}} where {\hat{\boldsymbol{\theta}}} is a unit vector along the axis of rotation. Generalizing this to the total angular momentum {\mathbf{J}} we have the operator for a general 3-d rotation through an infinitesimal angle:

\displaystyle U\left[R\left(\delta\boldsymbol{\theta}\right)\right]=I-\frac{i\delta\boldsymbol{\theta}\cdot\mathbf{J}}{\hbar} \ \ \ \ \ (1)

In principle ‘all’ we need to do to get the operator for a finite 3-d rotation is take the exponential, in the form

\displaystyle e^{-i\boldsymbol{\theta}\cdot\mathbf{J}/\hbar} \ \ \ \ \ (2)

 

The problem is that in this case, {\mathbf{J}} is infinite dimensional, so the exponential of the full matrix cannot be calculated directly. However, because the components of {\mathbf{J}} are block diagonal (see Shankar’s equations 12.5.23 and 12.5.24), all powers of these components are also block diagonal, and thus so is the exponential. For a given value of the total angular momentum quantum number {j}, the corresponding block is a {\left(2j+1\right)\times\left(2j+1\right)} sub-matrix {J_{i}^{\left(j\right)}} (where the suffix {i} refers to {x}, {y} or {z}), so the block in the exponential, defined as {D^{\left(j\right)}\left[R\left(\boldsymbol{\theta}\right)\right]}, is calculated as

\displaystyle D^{\left(j\right)}\left[R\left(\boldsymbol{\theta}\right)\right]=\sum_{n=0}^{\infty}\frac{1}{n!}\left(\frac{-i\theta}{\hbar}\right)^{n}\left(\hat{\boldsymbol{\theta}}\cdot\mathbf{J}\right)^{n} \ \ \ \ \ (3)

 

This may still look pretty hopeless in terms of actual calculation, but for small values of {j}, we can actually get closed-form solutions.

First, we look at the eigenvalues of {\hat{\boldsymbol{\theta}}\cdot\mathbf{J}}. If we review the calculations by which we found that the eigenvalues of {L_{z}} (and thus also {J_{z}}) were {-j,-j+1,\ldots,j-1,j} (multiplied by {\hbar}), we see that there’s nothing special about the fact that we chose the {z} direction over any other direction as the component of {\mathbf{J}} for which we calculated the eigenvalues. We could, for example, go through exactly the same calculations taking {L_{x}} to be the chosen component. We would then define raising and lowering operators as {L_{\pm}=L_{y}\pm iL_{z}} and come out with the conclusion that the eigenvalues of {L_{x}} are also {-j,-j+1,\ldots,j-1,j} (multiplied by {\hbar}). We can generalize even further and choose the ‘special’ direction to be the axis of rotation, however that axis may be oriented in space. This would lead us to the conclusion that the eigenvalues of {\hat{\boldsymbol{\theta}}\cdot\mathbf{J}} are the same as those of {J_{z}}.

Now consider the operator (where {J\equiv\hat{\boldsymbol{\theta}}\cdot\mathbf{J}}):

\displaystyle \left(J-j\hbar\right)\left(J-\left(j-1\right)\hbar\right)\left(J-\left(j-2\right)\hbar\right)\ldots\left(J+\left(j-1\right)\hbar\right)\left(J+j\hbar\right) \ \ \ \ \ (4)

 

First, suppose that {J=J_{z}} (so that {\hat{\boldsymbol{\theta}}} is along the {z} axis). Then if we’re in an eigenstate {\left|jm\right\rangle } of {J_{z}}, the term {\left(J-m\hbar\right)} in this operator will give zero when operating on this state. Thus the operator 4 will always give zero when operating on an eigenstate of {J_{z}}. However, since the set of eigenstates of {J_{z}} spans the space in which the total angular momentum number is {j}, any state in this space can be expressed as a linear combination of eigenstates of {J_{z}}, so when 4 operates on this state, there is always one factor in the operator that gives zero for each term in the linear combination. Thus this operator always gives zero when operating on any state with angular momentum {j}. [Note that the order in which we write the factors in 4 doesn’t matter; the only operator in the expression is {J}, so all the factors commute with each other.] That is, we have

\displaystyle \left(J-j\hbar\right)\left(J-\left(j-1\right)\hbar\right)\left(J-\left(j-2\right)\hbar\right)\ldots\left(J+\left(j-1\right)\hbar\right)\left(J+j\hbar\right)=0 \ \ \ \ \ (5)

 

If we multiply out this operator, we get a polynomial of degree {2j+1} in {J}. The highest power can thus be written as a linear combination of lower powers:

\displaystyle J^{2j+1}=\sum_{n=0}^{2j}a_{n}J^{n} \ \ \ \ \ (6)

where the coefficients {a_{n}} can be found by expanding the formula (which we won’t need to do here). But this implies that all higher powers of {J} can also be written as linear combinations of powers up to {J^{2j}}. To see this, consider

\displaystyle J^{2j+2} \displaystyle = \displaystyle J\times J^{2j+1}\ \ \ \ \ (7)
\displaystyle \displaystyle = \displaystyle \sum_{n=0}^{2j}a_{n}J^{n+1}\ \ \ \ \ (8)
\displaystyle \displaystyle = \displaystyle \sum_{n=1}^{2j}a_{n-1}J^{n}+a_{2j}J^{2j+1}\ \ \ \ \ (9)
\displaystyle \displaystyle = \displaystyle \sum_{n=1}^{2j}a_{n-1}J^{n}+a_{2j}\sum_{n=0}^{2j}a_{n}J^{n} \ \ \ \ \ (10)

Thus {J^{2j+2}} can be written as a linear combination of powers of {J} up to {J^{2j}}. By iterating this process, we can express all higher powers of {J} as a linear combination of powers of {J} up to {J^{2j}}. Here are a couple of examples. [Shankar marks these as ‘hard’, though I can’t see that they are any more difficult than most of his other problems, so hopefully I’m not missing anything.]
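Before working through those examples, the power-reduction idea is easy to check numerically. Below is a sketch (Python with NumPy, setting {\hbar=1}; not part of Shankar’s problems) using the {j=1} block of {J_{x}} from Shankar’s equation 12.5.23, for which the reduction (derived later in this post) is {J^{3}=\hbar^{2}J}:

```python
import numpy as np

hbar = 1.0  # work in units where hbar = 1

# j = 1 block of Jx (Shankar's 12.5.23), basis ordered m = +1, 0, -1
J = (hbar / np.sqrt(2)) * np.array([[0.0, 1.0, 0.0],
                                    [1.0, 0.0, 1.0],
                                    [0.0, 1.0, 0.0]])

# (J - hbar) J (J + hbar) = 0 for j = 1, so J^3 = hbar^2 J ...
print(np.allclose(np.linalg.matrix_power(J, 3), hbar**2 * J))  # True
# ... and every higher power folds back: J^5 = hbar^2 J^3 = hbar^4 J
print(np.allclose(np.linalg.matrix_power(J, 5), hbar**4 * J))  # True
```

Since {J^{3}=\hbar^{2}J}, every higher power reduces to a combination of {I}, {J} and {J^{2}}, which is what makes the closed-form sums below possible.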

Consider {D^{\left(1/2\right)}\left[R\right]}, starting from 3. We first use 5 with {j=\frac{1}{2}}:

\displaystyle \left(J-\frac{\hbar}{2}\right)\left(J+\frac{\hbar}{2}\right) \displaystyle = \displaystyle 0\ \ \ \ \ (11)
\displaystyle J^{2} \displaystyle = \displaystyle \frac{\hbar^{2}}{4} \ \ \ \ \ (12)

We can now iterate this formula as described above to get the following (to be accurate, all the {I} and {J} terms should have a superscript {\left(1/2\right)} to indicate that they refer to the subspace with {j=\frac{1}{2}}, but this would clutter the notation):

\displaystyle J^{0} \displaystyle = \displaystyle I\ \ \ \ \ (13)
\displaystyle J^{1} \displaystyle = \displaystyle J\ \ \ \ \ (14)
\displaystyle J^{2} \displaystyle = \displaystyle \left(\frac{\hbar}{2}\right)^{2}I\ \ \ \ \ (15)
\displaystyle J^{3} \displaystyle = \displaystyle \left(\frac{\hbar}{2}\right)^{2}J\ \ \ \ \ (16)
\displaystyle J^{4} \displaystyle = \displaystyle \left(\frac{\hbar}{2}\right)^{4}I\ \ \ \ \ (17)
\displaystyle \displaystyle \vdots

From 3 we have

\displaystyle D^{\left(1/2\right)}\left[R\right] \displaystyle = \displaystyle \sum_{n=0}^{\infty}\frac{1}{n!}\left(\frac{-i\theta}{\hbar}\right)^{n}J^{n} \ \ \ \ \ (18)

We can consider the even and odd terms in this sum separately. For the evens:

\displaystyle \left(D^{\left(1/2\right)}\left[R\right]\right)_{even} \displaystyle = \displaystyle \sum_{n\;even}\frac{1}{n!}\left(\frac{-i\theta}{\hbar}\right)^{n}\left(\frac{\hbar}{2}\right)^{n}I\ \ \ \ \ (19)
\displaystyle \displaystyle = \displaystyle \sum_{n\;even}\frac{1}{n!}\left(\frac{-i\theta}{2}\right)^{n}I\ \ \ \ \ (20)
\displaystyle \displaystyle = \displaystyle \left[1-\left(\frac{\theta}{2}\right)^{2}\frac{1}{2!}+\left(\frac{\theta}{2}\right)^{4}\frac{1}{4!}-\ldots\right]I\ \ \ \ \ (21)
\displaystyle \displaystyle = \displaystyle I\cos\frac{\theta}{2} \ \ \ \ \ (22)

For the odds:

\displaystyle \left(D^{\left(1/2\right)}\left[R\right]\right)_{odd} \displaystyle = \displaystyle \sum_{n\;odd}\frac{1}{n!}\left(\frac{-i\theta}{\hbar}\right)^{n}\left(\frac{\hbar}{2}\right)^{n-1}J\ \ \ \ \ (23)
\displaystyle \displaystyle = \displaystyle \frac{2J}{\hbar}\sum_{n\;odd}\frac{\left(-i\right)^{n}}{n!}\left(\frac{\theta}{2}\right)^{n}\ \ \ \ \ (24)
\displaystyle \displaystyle = \displaystyle \frac{2J}{\hbar}\left[-\frac{\theta}{2}i+\left(\frac{\theta}{2}\right)^{3}\frac{i}{3!}-\left(\frac{\theta}{2}\right)^{5}\frac{i}{5!}+\ldots\right]\ \ \ \ \ (25)
\displaystyle \displaystyle = \displaystyle -\frac{2iJ}{\hbar}\sin\frac{\theta}{2} \ \ \ \ \ (26)

Thus we get

\displaystyle D^{\left(1/2\right)}\left[R\right] \displaystyle = \displaystyle I\cos\frac{\theta}{2}-\frac{2iJ}{\hbar}\sin\frac{\theta}{2}\ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle I^{\left(1/2\right)}\cos\frac{\theta}{2}-\frac{2i\hat{\boldsymbol{\theta}}\cdot\mathbf{J}^{\left(1/2\right)}}{\hbar}\sin\frac{\theta}{2} \ \ \ \ \ (28)

(I’ve restored the superscript {\left(1/2\right)}.)
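As a numerical cross-check of 28, we can compare the closed form with a truncated version of the series 18. This is a sketch (Python with NumPy, setting {\hbar=1} and taking the rotation axis along {x}, so that {\hat{\boldsymbol{\theta}}\cdot\mathbf{J}} is the {j=\frac{1}{2}} block of {J_{x}}):

```python
import numpy as np

hbar = 1.0
theta = 0.7  # an arbitrary rotation angle
# theta_hat . J for an axis along x: the j = 1/2 block (hbar/2) sigma_x
J = (hbar / 2) * np.array([[0, 1], [1, 0]], dtype=complex)

# direct sum of the series in 18, truncated at 40 terms
D_series = np.zeros((2, 2), dtype=complex)
term = np.eye(2, dtype=complex)
for n in range(40):
    D_series += term                               # add X^n / n!
    term = term @ ((-1j * theta / hbar) * J) / (n + 1)

# closed form 28
D_closed = np.cos(theta / 2) * np.eye(2) - (2j / hbar) * np.sin(theta / 2) * J

print(np.allclose(D_series, D_closed))  # True
```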

Going through the same process for {j=1}, we first look at 5 to get

\displaystyle \left(J-\hbar\right)J\left(J+\hbar\right) \displaystyle = \displaystyle 0\ \ \ \ \ (29)
\displaystyle J^{3} \displaystyle = \displaystyle \hbar^{2}J \ \ \ \ \ (30)

Again, by iterating we find the pattern:

\displaystyle J^{0} \displaystyle = \displaystyle I\ \ \ \ \ (31)
\displaystyle J^{1} \displaystyle = \displaystyle J\ \ \ \ \ (32)
\displaystyle J^{2} \displaystyle = \displaystyle J^{2}\ \ \ \ \ (33)
\displaystyle J^{3} \displaystyle = \displaystyle \hbar^{2}J\ \ \ \ \ (34)
\displaystyle J^{4} \displaystyle = \displaystyle \hbar^{2}J^{2}\ \ \ \ \ (35)
\displaystyle \displaystyle \vdots

We then have

\displaystyle D^{\left(1\right)}\left[R\right]=\sum_{n=0}^{\infty}\frac{1}{n!}\left(\frac{-i\theta}{\hbar}\right)^{n}J^{n} \ \ \ \ \ (36)

Again, we can consider evens and odds separately:

\displaystyle \left(D^{\left(1\right)}\left[R\right]\right)_{even} \displaystyle = \displaystyle \sum_{n\;even}\frac{1}{n!}\left(\frac{-i\theta}{\hbar}\right)^{n}J^{n}\ \ \ \ \ (37)
\displaystyle \displaystyle = \displaystyle I+\sum_{n=2,4,\ldots}\frac{1}{n!}\left(\frac{-i\theta}{\hbar}\right)^{n}\hbar^{n-2}J^{2}\ \ \ \ \ (38)
\displaystyle \displaystyle = \displaystyle I+\frac{J^{2}}{\hbar^{2}}\sum_{n=2,4,\ldots}\frac{\left(-i\theta\right)^{n}}{n!}\ \ \ \ \ (39)
\displaystyle \displaystyle = \displaystyle I+\frac{J^{2}}{\hbar^{2}}\left(\cos\theta-1\right) \ \ \ \ \ (40)

For the odds:

\displaystyle \left(D^{\left(1\right)}\left[R\right]\right)_{odd} \displaystyle = \displaystyle \sum_{n\;odd}\frac{1}{n!}\left(\frac{-i\theta}{\hbar}\right)^{n}\hbar^{n-1}J\ \ \ \ \ (41)
\displaystyle \displaystyle = \displaystyle \frac{J}{\hbar}\sum_{n\;odd}\frac{\left(-i\theta\right)^{n}}{n!}\ \ \ \ \ (42)
\displaystyle \displaystyle = \displaystyle -\frac{iJ}{\hbar}\sin\theta \ \ \ \ \ (43)

We have

\displaystyle D^{\left(1\right)}\left[R\right] \displaystyle = \displaystyle I+\frac{J^{2}}{\hbar^{2}}\left(\cos\theta-1\right)-\frac{iJ}{\hbar}\sin\theta\ \ \ \ \ (44)
\displaystyle \displaystyle = \displaystyle I^{\left(1\right)}+\frac{\left(\hat{\boldsymbol{\theta}}\cdot\mathbf{J}^{\left(1\right)}\right)^{2}}{\hbar^{2}}\left(\cos\theta-1\right)-\frac{i\hat{\boldsymbol{\theta}}\cdot\mathbf{J}^{\left(1\right)}}{\hbar}\sin\theta \ \ \ \ \ (45)

[I’m not sure why Shankar restricts this problem to the {x} axis, or, for that matter, why he expects us to use the matrix for {J_{x}}.]
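The {j=1} result 45 can be checked the same way. The sketch below (Python with NumPy, {\hbar=1}, rotation axis along {x} so that {\hat{\boldsymbol{\theta}}\cdot\mathbf{J}} is the {j=1} block of {J_{x}}) compares the closed form with a truncated sum of the series 36:

```python
import numpy as np

hbar = 1.0
theta = 1.2  # an arbitrary rotation angle
# j = 1 block of Jx (Shankar's 12.5.23), basis ordered m = +1, 0, -1
J = (hbar / np.sqrt(2)) * np.array([[0, 1, 0],
                                    [1, 0, 1],
                                    [0, 1, 0]], dtype=complex)

# direct sum of the series in 36, truncated at 60 terms
D_series = np.zeros((3, 3), dtype=complex)
term = np.eye(3, dtype=complex)
for n in range(60):
    D_series += term
    term = term @ ((-1j * theta / hbar) * J) / (n + 1)

# closed form 45
D_closed = np.eye(3) + (J @ J) / hbar**2 * (np.cos(theta) - 1) \
           - 1j * J / hbar * np.sin(theta)

print(np.allclose(D_series, D_closed))  # True
```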

Angular momentum in 3-d: expectation values and uncertainty principle

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.5.3.

For 3-d angular momentum, we’ve seen that the components {J_{x}} and {J_{y}} can be written in terms of raising and lowering operators

\displaystyle J_{\pm}\equiv J_{x}\pm iJ_{y} \ \ \ \ \ (1)

In the basis of eigenvectors of {J^{2}} and {J_{z}} (that is, the states {\left|jm\right\rangle }) the raising and lowering operators have the following effects:

\displaystyle J_{\pm}\left|jm\right\rangle =\hbar\sqrt{\left(j\mp m\right)\left(j\pm m+1\right)}\left|j,m\pm1\right\rangle \ \ \ \ \ (2)

 

We can use these relations to construct the matrix elements of {J_{x}} and {J_{y}} in this basis. We can also use these relations to work out expectation values and uncertainties for the angular momentum components in this basis.

First, since the diagonals of both the {J_{x}} and {J_{y}} matrices contain only zero elements,

\displaystyle \left\langle J_{x}\right\rangle \displaystyle = \displaystyle \left\langle jm\left|J_{x}\right|jm\right\rangle =0\ \ \ \ \ (3)
\displaystyle \left\langle J_{y}\right\rangle \displaystyle = \displaystyle \left\langle jm\left|J_{y}\right|jm\right\rangle =0 \ \ \ \ \ (4)

To work out {\left\langle J_{x}^{2}\right\rangle } and {\left\langle J_{y}^{2}\right\rangle }, we can write these operators in terms of the raising and lowering operators:

\displaystyle J_{x} \displaystyle = \displaystyle \frac{1}{2}\left(J_{+}+J_{-}\right)\ \ \ \ \ (5)
\displaystyle J_{y} \displaystyle = \displaystyle \frac{1}{2i}\left(J_{+}-J_{-}\right) \ \ \ \ \ (6)

We can then use the fact that the basis states are orthonormal, so that

\displaystyle \left\langle j^{\prime}m^{\prime}\left|jm\right.\right\rangle =\delta_{j^{\prime}j}\delta_{m^{\prime}m} \ \ \ \ \ (7)

The required squares are

\displaystyle J_{x}^{2} \displaystyle = \displaystyle \frac{1}{4}\left(J_{+}^{2}+J_{+}J_{-}+J_{-}J_{+}+J_{-}^{2}\right)\ \ \ \ \ (8)
\displaystyle J_{y}^{2} \displaystyle = \displaystyle -\frac{1}{4}\left(J_{+}^{2}-J_{+}J_{-}-J_{-}J_{+}+J_{-}^{2}\right)\ \ \ \ \ (9)
\displaystyle \displaystyle = \displaystyle \frac{1}{4}\left(-J_{+}^{2}+J_{+}J_{-}+J_{-}J_{+}-J_{-}^{2}\right) \ \ \ \ \ (10)

The diagonal matrix elements {\left\langle jm\left|J_{x}^{2}\right|jm\right\rangle } and {\left\langle jm\left|J_{y}^{2}\right|jm\right\rangle } will get non-zero contributions only from those terms that leave {j} and {m} unchanged when operating on {\left|jm\right\rangle }. This means that only the terms that contain an equal number of {J_{+}} and {J_{-}} terms will contribute. We therefore have

\displaystyle \left\langle jm\left|J_{x}^{2}\right|jm\right\rangle \displaystyle = \displaystyle \frac{1}{4}\left\langle jm\left|J_{+}J_{-}+J_{-}J_{+}\right|jm\right\rangle \ \ \ \ \ (11)
\displaystyle \displaystyle = \displaystyle \frac{\hbar}{4}\sqrt{\left(j+m\right)\left(j-m+1\right)}\left\langle jm\left|J_{+}\right|j,m-1\right\rangle +\ \ \ \ \ (12)
\displaystyle \displaystyle \displaystyle \frac{\hbar}{4}\sqrt{\left(j-m\right)\left(j+m+1\right)}\left\langle jm\left|J_{-}\right|j,m+1\right\rangle \ \ \ \ \ (13)
\displaystyle \displaystyle = \displaystyle \frac{\hbar^{2}}{4}\sqrt{\left(j+m\right)\left(j-m+1\right)}\sqrt{\left(j-m+1\right)\left(j+m\right)}+\ \ \ \ \ (14)
\displaystyle \displaystyle \displaystyle \frac{\hbar^{2}}{4}\sqrt{\left(j-m\right)\left(j+m+1\right)}\sqrt{\left(j+m+1\right)\left(j-m\right)}\ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle \frac{\hbar^{2}}{4}\left(\left(j+m\right)\left(j-m+1\right)+\left(j-m\right)\left(j+m+1\right)\right)\ \ \ \ \ (16)
\displaystyle \displaystyle = \displaystyle \frac{\hbar^{2}}{4}\left(j^{2}-m^{2}+j+m+j^{2}-m^{2}+j-m\right)\ \ \ \ \ (17)
\displaystyle \displaystyle = \displaystyle \frac{\hbar^{2}}{2}\left(j\left(j+1\right)-m^{2}\right) \ \ \ \ \ (18)

From 10 we see that the only terms that contribute to {\left\langle jm\left|J_{y}^{2}\right|jm\right\rangle } are the same as the corresponding terms in {\left\langle jm\left|J_{x}^{2}\right|jm\right\rangle }, so the result is the same:

\displaystyle \left\langle jm\left|J_{y}^{2}\right|jm\right\rangle =\frac{\hbar^{2}}{2}\left(j\left(j+1\right)-m^{2}\right) \ \ \ \ \ (19)
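These expectation values are easy to verify numerically. The sketch below (Python with NumPy, {\hbar=1}; not part of Shankar’s exercise) uses the {j=1} blocks of {J_{x}} and {J_{y}} from Shankar’s equations 12.5.23 and 12.5.24, and compares each diagonal element of {J_{x}^{2}} and {J_{y}^{2}} with {\frac{\hbar^{2}}{2}\left(j\left(j+1\right)-m^{2}\right)}:

```python
import numpy as np

hbar = 1.0
j = 1
# j = 1 blocks of Jx and Jy, basis ordered m = +1, 0, -1
Jx = (hbar / np.sqrt(2)) * np.array([[0, 1, 0],
                                     [1, 0, 1],
                                     [0, 1, 0]], dtype=complex)
Jy = (hbar / np.sqrt(2)) * np.array([[0, -1j, 0],
                                     [1j, 0, -1j],
                                     [0, 1j, 0]], dtype=complex)

for row, m in enumerate([1, 0, -1]):
    expected = hbar**2 / 2 * (j * (j + 1) - m**2)
    # <jm|Jx^2|jm> is just the diagonal element of Jx @ Jx (and likewise for Jy)
    print(m,
          np.isclose((Jx @ Jx)[row, row].real, expected),
          np.isclose((Jy @ Jy)[row, row].real, expected))
```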

We can check that {J_{x}} and {J_{y}} satisfy the uncertainty principle, as derived by Shankar. That is, we want to verify that

\displaystyle \Delta J_{x}\cdot\Delta J_{y}\ge\left|\left\langle jm\left|\left(J_{x}-\left\langle J_{x}\right\rangle \right)\left(J_{y}-\left\langle J_{y}\right\rangle \right)\right|jm\right\rangle \right| \ \ \ \ \ (20)

On the LHS

\displaystyle \Delta J_{x} \displaystyle = \displaystyle \sqrt{\left\langle J_{x}^{2}\right\rangle -\left\langle J_{x}\right\rangle ^{2}}\ \ \ \ \ (21)
\displaystyle \displaystyle = \displaystyle \sqrt{\left\langle J_{x}^{2}\right\rangle }\ \ \ \ \ (22)
\displaystyle \displaystyle = \displaystyle \sqrt{\frac{\hbar^{2}}{2}\left(j\left(j+1\right)-m^{2}\right)}\ \ \ \ \ (23)
\displaystyle \Delta J_{y} \displaystyle = \displaystyle \sqrt{\frac{\hbar^{2}}{2}\left(j\left(j+1\right)-m^{2}\right)}\ \ \ \ \ (24)
\displaystyle \Delta J_{x}\cdot\Delta J_{y} \displaystyle = \displaystyle \frac{\hbar^{2}}{2}\left(j\left(j+1\right)-m^{2}\right) \ \ \ \ \ (25)

On the RHS

\displaystyle \left|\left\langle jm\left|\left(J_{x}-\left\langle J_{x}\right\rangle \right)\left(J_{y}-\left\langle J_{y}\right\rangle \right)\right|jm\right\rangle \right|=\left|\left\langle jm\left|J_{x}J_{y}\right|jm\right\rangle \right| \ \ \ \ \ (26)

Using the same technique as that above for deriving {\left\langle jm\left|J_{x}^{2}\right|jm\right\rangle } we have

\displaystyle \left\langle jm\left|J_{x}J_{y}\right|jm\right\rangle \displaystyle = \displaystyle \frac{1}{4i}\left\langle jm\left|\left(J_{+}+J_{-}\right)\left(J_{+}-J_{-}\right)\right|jm\right\rangle \ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle \frac{1}{4i}\left\langle jm\left|J_{-}J_{+}-J_{+}J_{-}\right|jm\right\rangle \ \ \ \ \ (28)
\displaystyle \displaystyle = \displaystyle \frac{\hbar^{2}}{4i}\left(\left(j-m\right)\left(j+m+1\right)-\left(j+m\right)\left(j-m+1\right)\right)\ \ \ \ \ (29)
\displaystyle \displaystyle = \displaystyle -\frac{\hbar^{2}m}{2i} \ \ \ \ \ (30)

We therefore need to verify that

\displaystyle j\left(j+1\right)-m^{2}\ge\left|m\right| \ \ \ \ \ (31)

for all allowed values of {m}. We know that {-j\le m\le+j}, so

\displaystyle j\left(j+1\right)-m^{2}\ge j^{2}+j-j^{2}=j\ge\left|m\right| \ \ \ \ \ (32)

Thus the inequality is indeed satisfied.

In the case {\left|m\right|=j} we have

\displaystyle j\left(j+1\right)-j^{2}=j=\left|m\right| \ \ \ \ \ (33)

so the inequality saturates (becomes an equality) in that case.
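We can also confirm the inequality 20 and its saturation at {\left|m\right|=j} numerically. This sketch (Python with NumPy, {\hbar=1}) does it for the {j=1} blocks:

```python
import numpy as np

hbar = 1.0
# j = 1 blocks of Jx and Jy, basis ordered m = +1, 0, -1
Jx = (hbar / np.sqrt(2)) * np.array([[0, 1, 0],
                                     [1, 0, 1],
                                     [0, 1, 0]], dtype=complex)
Jy = (hbar / np.sqrt(2)) * np.array([[0, -1j, 0],
                                     [1j, 0, -1j],
                                     [0, 1j, 0]], dtype=complex)

for row, m in enumerate([1, 0, -1]):
    # <Jx> = <Jy> = 0 in |jm>, so each uncertainty is sqrt(<J^2>)
    product = np.sqrt((Jx @ Jx)[row, row].real) * np.sqrt((Jy @ Jy)[row, row].real)
    bound = abs((Jx @ Jy)[row, row])   # |<jm|Jx Jy|jm>|
    print(f"m = {m:+d}: product = {product:.2f}, bound = {bound:.2f}")
```

For {m=\pm1} the product equals the bound (saturation), while for {m=0} the bound is zero and the inequality is strict.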

Uncertainty principle – Shankar’s more general treatment

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 9.

Shankar’s derivation of the general uncertainty principle relating the variances of two Hermitian operators actually gives a different result from that in Griffiths. To follow this post, you should first review the earlier post. To keep things consistent I’ll use the original Griffiths notation up to equation 11, which is a summary of the earlier post.

Shankar’s derivation is the same as Griffiths’s up to equation (13) in the earlier post. To summarize, we have two operators {\hat{A}} and {\hat{B}} and calculate their variances as

\displaystyle   \sigma_{A}^{2} \displaystyle  = \displaystyle  \left\langle \Psi|(\hat{A}-\left\langle A\right\rangle )^{2}\Psi\right\rangle \ \ \ \ \ (1)
\displaystyle  \displaystyle  = \displaystyle  \left\langle \left(\hat{A}-\left\langle A\right\rangle \right)\Psi|\left(\hat{A}-\left\langle A\right\rangle \right)\Psi\right\rangle \ \ \ \ \ (2)
\displaystyle  \displaystyle  \equiv \displaystyle  \left\langle f|f\right\rangle \ \ \ \ \ (3)

where the function {f} is defined by this equation.

Similarly, for {\hat{B}}:

\displaystyle   \sigma_{B}^{2} \displaystyle  = \displaystyle  \left\langle \Psi|(\hat{B}-\left\langle B\right\rangle )^{2}\Psi\right\rangle \ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  \left\langle \left(\hat{B}-\left\langle B\right\rangle \right)\Psi|\left(\hat{B}-\left\langle B\right\rangle \right)\Psi\right\rangle \ \ \ \ \ (5)
\displaystyle  \displaystyle  \equiv \displaystyle  \left\langle g|g\right\rangle \ \ \ \ \ (6)

We now invoke the Schwarz inequality to say

\displaystyle   \sigma_{A}^{2}\sigma_{B}^{2} \displaystyle  = \displaystyle  \left\langle f|f\right\rangle \left\langle g|g\right\rangle \ \ \ \ \ (7)
\displaystyle  \displaystyle  \ge \displaystyle  |\left\langle f|g\right\rangle |^{2} \ \ \ \ \ (8)

At this point, Griffiths continues by saying that

\displaystyle  \left|\left\langle f|g\right\rangle \right|^{2}\ge\left(\Im\left\langle f\left|g\right.\right\rangle \right)^{2} \ \ \ \ \ (9)

That is, he throws away the real part of {\left\langle f\left|g\right.\right\rangle } to get another inequality. Shankar retains the full complex number and thus states that

\displaystyle   |\left\langle f|g\right\rangle |^{2} \displaystyle  = \displaystyle  \left|\left\langle \left(\hat{A}-\left\langle A\right\rangle \right)\Psi|\left(\hat{B}-\left\langle B\right\rangle \right)\Psi\right\rangle \right|^{2}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \left|\left\langle \Psi\left|\left(\hat{A}-\left\langle A\right\rangle \right)\left(\hat{B}-\left\langle B\right\rangle \right)\right|\Psi\right\rangle \right|^{2} \ \ \ \ \ (11)

Defining the operators

\displaystyle   \hat{\Omega} \displaystyle  \equiv \displaystyle  \hat{A}-\left\langle A\right\rangle \ \ \ \ \ (12)
\displaystyle  \hat{\Lambda} \displaystyle  \equiv \displaystyle  \hat{B}-\left\langle B\right\rangle \ \ \ \ \ (13)

we have

\displaystyle   |\left\langle f|g\right\rangle |^{2} \displaystyle  = \displaystyle  \left|\left\langle \Psi\left|\hat{\Omega}\hat{\Lambda}\right|\Psi\right\rangle \right|^{2}\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{4}\left|\left\langle \Psi\left|\left[\hat{\Omega},\hat{\Lambda}\right]_{+}+\left[\hat{\Omega},\hat{\Lambda}\right]\right|\Psi\right\rangle \right|^{2} \ \ \ \ \ (15)

where

\displaystyle  \left[\hat{\Omega},\hat{\Lambda}\right]_{+}\equiv\hat{\Omega}\hat{\Lambda}+\hat{\Lambda}\hat{\Omega} \ \ \ \ \ (16)

is the anticommutator. For two Hermitian operators, the expectation value of the commutator is the difference between a complex number and its conjugate, so it is always pure imaginary (and, similarly, the expectation value of the anticommutator is the sum of a number and its conjugate, so it is always real). We can therefore write the commutator as

\displaystyle  \left[\hat{\Omega},\hat{\Lambda}\right]=i\Gamma \ \ \ \ \ (17)

for some Hermitian operator {\Gamma}. The expectation value inside the modulus in 15 is thus the sum of a real number (from the anticommutator) and a pure imaginary number (from the commutator), so its squared modulus is the sum of the squares of these two parts, and we arrive at

\displaystyle  \sigma_{A}^{2}\sigma_{B}^{2}\ge|\left\langle f|g\right\rangle |^{2}\ge\frac{1}{4}\left\langle \Psi\left|\left[\hat{\Omega},\hat{\Lambda}\right]_{+}\right|\Psi\right\rangle ^{2}+\frac{1}{4}\left\langle \Psi\left|\Gamma\right|\Psi\right\rangle ^{2} \ \ \ \ \ (18)

Comparing this with Griffiths’s result, he had

\displaystyle  \sigma_{A}^{2}\sigma_{B}^{2}\ge\left(\frac{1}{2i}\left\langle [\hat{A},\hat{B}]\right\rangle \right)^{2}=\frac{1}{4}\left\langle \Psi\left|\Gamma\right|\Psi\right\rangle ^{2} \ \ \ \ \ (19)

That is, Griffiths’s uncertainty principle is actually weaker than Shankar’s as he includes only the last term in 18. For canonically conjugate operators (such as {X} and {P}) the commutator is always

\displaystyle  \left[X,P\right]=i\hbar \ \ \ \ \ (20)

so the last term in 18 is always {\hbar^{2}/4} for any wave function {\Psi}. The first term in 18, which involves the anticommutator, will, in general, depend on the wave function {\Psi}, but it is always positive (or zero), so we can still state that, for such operators

\displaystyle  \sigma_{A}^{2}\sigma_{B}^{2}\ge\frac{\hbar^{2}}{4} \ \ \ \ \ (21)
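Shankar’s stronger bound 18 is easy to test numerically for finite-dimensional operators. The sketch below (Python with NumPy, {\hbar=1}; my own illustration, not from Shankar) uses the spin-{\frac{1}{2}} matrices for {J_{x}} and {J_{y}} and a randomly chosen state:

```python
import numpy as np

rng = np.random.default_rng(42)

# two Hermitian operators: the spin-1/2 Jx and Jy blocks (hbar = 1)
A = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
B = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)

# a random normalized state
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

def ev(Op):
    """Expectation value <psi|Op|psi>."""
    return psi.conj() @ Op @ psi

varA = ev(A @ A).real - ev(A).real**2
varB = ev(B @ B).real - ev(B).real**2

Om = A - ev(A).real * np.eye(2)   # centered operators Omega and Lambda
La = B - ev(B).real * np.eye(2)

anti = ev(Om @ La + La @ Om).real            # <[Omega, Lambda]_+>, real
gamma = (ev(Om @ La - La @ Om) / 1j).real    # <Gamma>, where [Omega, Lambda] = i Gamma

bound = 0.25 * anti**2 + 0.25 * gamma**2
print(varA * varB >= bound - 1e-12)  # True
```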

Total angular momentum – matrix elements and commutation relations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.5.2.

In Shankar’s Chapter 12 treatment of the eigenvalues of the angular momentum operators {L^{2}} and {L_{z}}, he retraces much of what we’ve already covered as a result of working through Griffiths’s book. He defines raising and lowering operators for angular momentum as

\displaystyle L_{\pm}\equiv L_{x}\pm iL_{y} \ \ \ \ \ (1)

These operators can be used to show that the eigenvalues of {L^{2}} are {\ell\left(\ell+1\right)\hbar^{2}}, where {\ell=0,\frac{1}{2},1,\frac{3}{2},\ldots}, and that the eigenvalues of {L_{z}} are {m\hbar}, where {m} ranges from {-\ell} to {+\ell} in integer steps. The action of the raising and lowering operators on the eigenstates is found to satisfy

\displaystyle L_{\pm}\left|\ell m\right\rangle =\hbar\sqrt{\left(\ell\mp m\right)\left(\ell\pm m+1\right)}\left|\ell,m\pm1\right\rangle \ \ \ \ \ (2)

When dealing with vector wave functions (as opposed to scalar ones) in two dimensions, we found that a quantity {J_{z}} is the generator of infinitesimal rotations about the {z} axis, where

\displaystyle J_{z}=L_{z}+S_{z} \ \ \ \ \ (3)

and the operator producing the rotation by {\varepsilon_{z}} is

\displaystyle U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right]=I-\frac{i\varepsilon_{z}}{\hbar}J_{z} \ \ \ \ \ (4)

For a scalar wave function in three dimensions, we found that the properties of two successive rotations by {\varepsilon_{x}} about the {x} axis and {\varepsilon_{y}} about the {y} axis led to the commutation relations

\displaystyle \left[L_{x},L_{y}\right] \displaystyle = \displaystyle i\hbar L_{z}\ \ \ \ \ (5)
\displaystyle \left[L_{y},L_{z}\right] \displaystyle = \displaystyle i\hbar L_{x}\ \ \ \ \ (6)
\displaystyle \left[L_{z},L_{x}\right] \displaystyle = \displaystyle i\hbar L_{y} \ \ \ \ \ (7)

For a vector wave function, the rotation is generated by {J_{i}} rather than {L_{i}} but because the effects of rotations are the same, the {J_{i}} must have the same commutation relations, so that

\displaystyle \left[J_{x},J_{y}\right] \displaystyle = \displaystyle i\hbar J_{z}\ \ \ \ \ (8)
\displaystyle \left[J_{y},J_{z}\right] \displaystyle = \displaystyle i\hbar J_{x}\ \ \ \ \ (9)
\displaystyle \left[J_{z},J_{x}\right] \displaystyle = \displaystyle i\hbar J_{y} \ \ \ \ \ (10)

We can do the same analysis on {J} as we did above with {L} to define the raising and lowering operators

\displaystyle J_{\pm}\equiv J_{x}\pm iJ_{y} \ \ \ \ \ (11)

and get the same eigenvalue relations

\displaystyle J_{\pm}\left|jm\right\rangle =\hbar\sqrt{\left(j\mp m\right)\left(j\pm m+1\right)}\left|j,m\pm1\right\rangle \ \ \ \ \ (12)

 

The three components of {\mathbf{J}} are then {J_{z}} and

\displaystyle J_{x} \displaystyle = \displaystyle \frac{1}{2}\left(J_{+}+J_{-}\right)\ \ \ \ \ (13)
\displaystyle J_{y} \displaystyle = \displaystyle \frac{1}{2i}\left(J_{+}-J_{-}\right) \ \ \ \ \ (14)

Using these three equations, we can generate the matrix elements of the components of {\mathbf{J}} in the orthonormal basis {\left|jm\right\rangle } (that is, the basis consisting of eigenfunctions with total angular momentum number {j} and {J_{z}} number {m}). These matrix elements are

\displaystyle \left\langle j^{\prime}m^{\prime}\left|J_{x}\right|jm\right\rangle \displaystyle = \displaystyle \frac{1}{2}\left\langle j^{\prime}m^{\prime}\left|J_{+}+J_{-}\right|jm\right\rangle \ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle \frac{\hbar}{2}\sqrt{\left(j-m\right)\left(j+m+1\right)}\left\langle j^{\prime}m^{\prime}\left|j,m+1\right.\right\rangle +\ \ \ \ \ (16)
\displaystyle \displaystyle \displaystyle \frac{\hbar}{2}\sqrt{\left(j+m\right)\left(j-m+1\right)}\left\langle j^{\prime}m^{\prime}\left|j,m-1\right.\right\rangle \ \ \ \ \ (17)
\displaystyle \displaystyle = \displaystyle \frac{\hbar}{2}\left[\sqrt{\left(j-m\right)\left(j+m+1\right)}\delta_{j^{\prime}j}\delta_{m^{\prime},m+1}+\sqrt{\left(j+m\right)\left(j-m+1\right)}\delta_{j^{\prime}j}\delta_{m^{\prime},m-1}\right] \ \ \ \ \ (18)
\displaystyle \left\langle j^{\prime}m^{\prime}\left|J_{y}\right|jm\right\rangle \displaystyle = \displaystyle \frac{1}{2i}\left\langle j^{\prime}m^{\prime}\left|J_{+}-J_{-}\right|jm\right\rangle \ \ \ \ \ (19)
\displaystyle \displaystyle = \displaystyle \frac{\hbar}{2i}\sqrt{\left(j-m\right)\left(j+m+1\right)}\left\langle j^{\prime}m^{\prime}\left|j,m+1\right.\right\rangle -\ \ \ \ \ (20)
\displaystyle \displaystyle \displaystyle \frac{\hbar}{2i}\sqrt{\left(j+m\right)\left(j-m+1\right)}\left\langle j^{\prime}m^{\prime}\left|j,m-1\right.\right\rangle \ \ \ \ \ (21)
\displaystyle \displaystyle = \displaystyle \frac{\hbar}{2i}\left[\sqrt{\left(j-m\right)\left(j+m+1\right)}\delta_{j^{\prime}j}\delta_{m^{\prime},m+1}-\sqrt{\left(j+m\right)\left(j-m+1\right)}\delta_{j^{\prime}j}\delta_{m^{\prime},m-1}\right] \ \ \ \ \ (22)

\displaystyle \left\langle j^{\prime}m^{\prime}\left|J_{z}\right|jm\right\rangle =m\hbar\delta_{j^{\prime}j}\delta_{m^{\prime},m} \ \ \ \ \ (23)

The full matrix for each component {J_{i}} is actually infinite-dimensional, since {j} can be any non-negative integer or half-integer. However, every matrix element between states with different {j} values vanishes, so the complete matrix for each {J_{i}} is block diagonal, with one {\left(2j+1\right)\times\left(2j+1\right)} sub-matrix for each value of {j}. Shankar gives the matrices for {J_{x}} and {J_{y}} up to {j=1} in his equations 12.5.23 and 12.5.24. This means that the commutation relations 8, 9 and 10 should be obeyed separately by each set of sub-matrices corresponding to a particular {j} value.
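As a check on what follows, we can build the blocks for any {j} directly from the matrix elements 18, 22 and 23 and verify the commutation relation numerically. Below is a sketch in Python with NumPy ({\hbar=1}); the helper function J_matrices is my own name for this illustration:

```python
import numpy as np

hbar = 1.0

def J_matrices(j):
    """Build the (2j+1)x(2j+1) blocks of Jx, Jy, Jz in the |jm> basis,
    ordered m = j, j-1, ..., -j (so row/column 0 is m = j)."""
    dim = int(round(2 * j + 1))
    m = j - np.arange(dim)                    # m value for each basis state
    Jz = hbar * np.diag(m.astype(complex))
    Jp = np.zeros((dim, dim), dtype=complex)  # raising operator J+
    for col in range(1, dim):
        mm = m[col]                           # J+|j,mm> lands one row up
        Jp[col - 1, col] = hbar * np.sqrt((j - mm) * (j + mm + 1))
    Jm = Jp.conj().T                          # lowering operator J- = (J+)^dagger
    Jx = (Jp + Jm) / 2
    Jy = (Jp - Jm) / (2 * 1j)
    return Jx, Jy, Jz

# check [Jx, Jy] = i*hbar*Jz block by block
for j in (0.5, 1.0, 1.5):
    Jx, Jy, Jz = J_matrices(j)
    print(j, np.allclose(Jx @ Jy - Jy @ Jx, 1j * hbar * Jz))
```

The same function reproduces Shankar’s explicit matrices, e.g. the {\sqrt{3}} entries in the {j=\frac{3}{2}} block below.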

For {j=\frac{1}{2}}, the sub-matrices are as follows (we can copy these from Shankar or work them out from the formulas above). The values of {m} are {+\frac{1}{2}} and {-\frac{1}{2}} in that order, from top to bottom and left to right.

\displaystyle J_{x}^{\left(1/2\right)} \displaystyle = \displaystyle \frac{\hbar}{2}\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right]\ \ \ \ \ (24)
\displaystyle J_{y}^{\left(1/2\right)} \displaystyle = \displaystyle \frac{\hbar}{2i}\left[\begin{array}{cc} 0 & 1\\ -1 & 0 \end{array}\right]\ \ \ \ \ (25)
\displaystyle \displaystyle = \displaystyle \frac{i\hbar}{2}\left[\begin{array}{cc} 0 & -1\\ 1 & 0 \end{array}\right]\ \ \ \ \ (26)
\displaystyle \left[J_{x}^{\left(1/2\right)},J_{y}^{\left(1/2\right)}\right] \displaystyle = \displaystyle \frac{i\hbar^{2}}{4}\left(\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]-\left[\begin{array}{cc} -1 & 0\\ 0 & 1 \end{array}\right]\right)\ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle \frac{i\hbar^{2}}{2}\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]\ \ \ \ \ (28)
\displaystyle \displaystyle = \displaystyle i\hbar J_{z}^{\left(1/2\right)} \ \ \ \ \ (29)

For {j=1}, the sub-matrices are

\displaystyle J_{x}^{\left(1\right)} \displaystyle = \displaystyle \frac{\hbar}{\sqrt{2}}\left[\begin{array}{ccc} 0 & 1 & 0\\ 1 & 0 & 1\\ 0 & 1 & 0 \end{array}\right]\ \ \ \ \ (30)
\displaystyle J_{y}^{\left(1\right)} \displaystyle = \displaystyle \frac{i\hbar}{\sqrt{2}}\left[\begin{array}{ccc} 0 & -1 & 0\\ 1 & 0 & -1\\ 0 & 1 & 0 \end{array}\right]\ \ \ \ \ (31)
\displaystyle \left[J_{x}^{\left(1\right)},J_{y}^{\left(1\right)}\right] \displaystyle = \displaystyle \frac{i\hbar^{2}}{2}\left(\left[\begin{array}{ccc} 1 & 0 & -1\\ 0 & 0 & 0\\ 1 & 0 & -1 \end{array}\right]-\left[\begin{array}{ccc} -1 & 0 & -1\\ 0 & 0 & 0\\ 1 & 0 & 1 \end{array}\right]\right)\ \ \ \ \ (32)
\displaystyle \displaystyle = \displaystyle i\hbar^{2}\left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & -1 \end{array}\right]\ \ \ \ \ (33)
\displaystyle \displaystyle = \displaystyle i\hbar J_{z}^{\left(1\right)} \ \ \ \ \ (34)

For {j=\frac{3}{2}} we need to work out the matrices from the formulas above for the matrix elements. Ordering the values {m=\frac{3}{2},\frac{1}{2},-\frac{1}{2},-\frac{3}{2}} from left to right (columns) and top to bottom (rows), we get

\displaystyle J_{x}^{\left(3/2\right)} \displaystyle = \displaystyle \frac{\hbar}{2}\left[\begin{array}{cccc} 0 & \sqrt{3} & 0 & 0\\ \sqrt{3} & 0 & 2 & 0\\ 0 & 2 & 0 & \sqrt{3}\\ 0 & 0 & \sqrt{3} & 0 \end{array}\right]\ \ \ \ \ (35)
\displaystyle J_{y}^{\left(3/2\right)} \displaystyle = \displaystyle \frac{\hbar}{2i}\left[\begin{array}{cccc} 0 & \sqrt{3} & 0 & 0\\ -\sqrt{3} & 0 & 2 & 0\\ 0 & -2 & 0 & \sqrt{3}\\ 0 & 0 & -\sqrt{3} & 0 \end{array}\right]\ \ \ \ \ (36)
\displaystyle \displaystyle = \displaystyle \frac{i\hbar}{2}\left[\begin{array}{cccc} 0 & -\sqrt{3} & 0 & 0\\ \sqrt{3} & 0 & -2 & 0\\ 0 & 2 & 0 & -\sqrt{3}\\ 0 & 0 & \sqrt{3} & 0 \end{array}\right]\ \ \ \ \ (37)
\displaystyle \left[J_{x}^{\left(3/2\right)},J_{y}^{\left(3/2\right)}\right] \displaystyle = \displaystyle \frac{i\hbar^{2}}{4}\left(\left[\begin{array}{cccc} 3 & 0 & -2\sqrt{3} & 0\\ 0 & 1 & 0 & -2\sqrt{3}\\ 2\sqrt{3} & 0 & -1 & 0\\ 0 & 2\sqrt{3} & 0 & -3 \end{array}\right]-\left[\begin{array}{cccc} -3 & 0 & -2\sqrt{3} & 0\\ 0 & -1 & 0 & -2\sqrt{3}\\ 2\sqrt{3} & 0 & 1 & 0\\ 0 & 2\sqrt{3} & 0 & 3 \end{array}\right]\right)\ \ \ \ \ (38)
\displaystyle \displaystyle = \displaystyle i\hbar^{2}\left[\begin{array}{cccc} \frac{3}{2} & 0 & 0 & 0\\ 0 & \frac{1}{2} & 0 & 0\\ 0 & 0 & -\frac{1}{2} & 0\\ 0 & 0 & 0 & -\frac{3}{2} \end{array}\right]\ \ \ \ \ (39)
\displaystyle \displaystyle = \displaystyle i\hbar J_{z}^{\left(3/2\right)} \ \ \ \ \ (40)

Thus the commutation relation {\left[J_{x},J_{y}\right]=i\hbar J_{z}} is satisfied for these three sets of sub-matrices.
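These checks are easy to automate. Below is a minimal numerical sketch (assuming NumPy is available, and setting {\hbar=1}) that builds {J_{x}}, {J_{y}} and {J_{z}} for any {j} from the matrix elements quoted above, via the raising and lowering operators {J_{\pm}=J_{x}\pm iJ_{y}}, and verifies the commutation relation for several {j} values:

```python
import numpy as np

def angular_momentum_matrices(j, hbar=1.0):
    """Return (Jx, Jy, Jz) in the |j m> basis, with m = j, j-1, ..., -j."""
    dim = int(round(2 * j)) + 1
    ms = [j - k for k in range(dim)]           # m values, top to bottom
    Jplus = np.zeros((dim, dim), dtype=complex)
    for col, m in enumerate(ms):
        if col > 0:  # <j, m+1| J+ |j, m> = hbar * sqrt((j - m)(j + m + 1))
            Jplus[col - 1, col] = hbar * np.sqrt((j - m) * (j + m + 1))
    Jminus = Jplus.conj().T                    # J- is the adjoint of J+
    Jx = (Jplus + Jminus) / 2
    Jy = (Jplus - Jminus) / (2 * 1j)
    Jz = hbar * np.diag(ms).astype(complex)
    return Jx, Jy, Jz

# Check [Jx, Jy] = i * hbar * Jz for the first few j values:
for j in (0.5, 1, 1.5, 2):
    Jx, Jy, Jz = angular_momentum_matrices(j)
    assert np.allclose(Jx @ Jy - Jy @ Jx, 1j * Jz)
```

For {j=\frac{3}{2}} this reproduces the {4\times4} matrices worked out above; for example the top-right block of {J_{x}^{\left(3/2\right)}} contains the {\frac{\sqrt{3}}{2}} entry from 35.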

Rotation of a vector wave function

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.5.1.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

We’ve seen that, for a rotation by an infinitesimal angle {\varepsilon_{z}} about the {z} axis, a scalar wave function transforms according to

\displaystyle \psi\left(x,y\right)\rightarrow\psi\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right) \ \ \ \ \ (1)

 

The meaning of this transformation can be seen in the figure:

The physical system represented by the wave function {\Psi} is rigidly rotated by the angle {\varepsilon_{z}}, so that the value of {\Psi} at point {A} is now sitting over the point {B}. However, in the primed (rotated) coordinate system, the numerical values of the coordinates of the point {B} in the figure are the same as the numerical values that the point {A} had in the original, unrotated coordinates. That is

\displaystyle \left(x_{B}^{\prime},y_{B}^{\prime}\right)=\left(x_{A},y_{A}\right) \ \ \ \ \ (2)

Just as {B} is obtained from {A} by rotating {A} by {+\varepsilon_{z}}, we can obtain {A} from {B} by rotating by {-\varepsilon_{z}}. For any given point, the primed (rotated) and unprimed (unrotated) coordinates are related by (all relations are to first order in {\varepsilon_{z}}):

\displaystyle x^{\prime} \displaystyle = \displaystyle x-y\varepsilon_{z}\ \ \ \ \ (3)
\displaystyle y^{\prime} \displaystyle = \displaystyle y+x\varepsilon_{z} \ \ \ \ \ (4)

The inverse relations are obtained by a rotation by {-\varepsilon_{z}}:

\displaystyle x \displaystyle = \displaystyle x^{\prime}+y^{\prime}\varepsilon_{z}\ \ \ \ \ (5)
\displaystyle y \displaystyle = \displaystyle y^{\prime}-x^{\prime}\varepsilon_{z} \ \ \ \ \ (6)

After rotation, the values of {\Psi^{\prime}} are related to the values {\Psi} before rotation by rotating through the angle {-\varepsilon_{z}}, so that

\displaystyle \Psi^{\prime}\left(x,y\right)=\Psi\left(x+y\varepsilon_{z},y-x\varepsilon_{z}\right) \ \ \ \ \ (7)

Now suppose the wave function is a vector {\mathbf{V}=V_{x}\hat{\mathbf{x}}+V_{y}\hat{\mathbf{y}}}. The situation is as shown:

The initial unrotated vector {\mathbf{V}} is the value of the wave function at point {A} (and is taken to lie entirely in the {x} direction for convenience). After the rotation, the vector gets moved to {B} and is also rotated so that it now makes an angle {\varepsilon_{z}} with the original {x} axis; that is, it now lies along the {x^{\prime}} axis.

In this case, each component of {\mathbf{V}} still gets transformed in the same way as the scalar function above, but the vector itself is also rotated. If the components {V_{x}} and {V_{y}} of the vector were constants, then the rotated vector is given by applying the 2-d rotation matrix

\displaystyle R=\left[\begin{array}{cc} 1 & -\varepsilon_{z}\\ \varepsilon_{z} & 1 \end{array}\right] \ \ \ \ \ (8)

so we get {\mathbf{V}^{\prime}=R\mathbf{V}}, or, in components:

\displaystyle V_{x}^{\prime} \displaystyle = \displaystyle V_{x}-V_{y}\varepsilon_{z}\ \ \ \ \ (9)
\displaystyle V_{y}^{\prime} \displaystyle = \displaystyle V_{y}+V_{x}\varepsilon_{z} \ \ \ \ \ (10)

If {V_{x}} and {V_{y}} vary from point to point, then we must apply the transformation 1 to each component, so that the overall transformation is

\displaystyle V_{x}^{\prime} \displaystyle = \displaystyle V_{x}\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right)-V_{y}\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right)\varepsilon_{z}\ \ \ \ \ (11)
\displaystyle V_{y}^{\prime} \displaystyle = \displaystyle V_{y}\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right)+V_{x}\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right)\varepsilon_{z} \ \ \ \ \ (12)

The operator that generates the transformation of a scalar function by an infinitesimal angle {\delta\boldsymbol{\theta}} is

\displaystyle U\left[R\left(\delta\boldsymbol{\theta}\right)\right]=I-\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L} \ \ \ \ \ (13)

 

In this case, the rotation is about the {z} axis so

\displaystyle \delta\boldsymbol{\theta} \displaystyle = \displaystyle \varepsilon_{z}\hat{\mathbf{z}}\ \ \ \ \ (14)
\displaystyle \delta\boldsymbol{\theta}\cdot\mathbf{L} \displaystyle = \displaystyle \varepsilon_{z}L_{z} \ \ \ \ \ (15)

Thus we have

\displaystyle V_{x,y}\left(x+\varepsilon_{z}y,y-\varepsilon_{z}x\right)=\left(I-\frac{i}{\hbar}\varepsilon_{z}L_{z}\right)V_{x,y}\left(x,y\right) \ \ \ \ \ (16)
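As a spot check of this relation, we can compare both sides symbolically for a sample polynomial component, using the position-space representation {L_{z}=-i\hbar\left(x\partial_{y}-y\partial_{x}\right)}. This is a sketch assuming SymPy is available; the test function is my own choice, not from Shankar:

```python
import sympy as sp

x, y, eps, hbar = sp.symbols('x y epsilon hbar')
psi = x**2 * y                                     # sample component (arbitrary choice)

# LHS: the component evaluated at the rotated arguments
lhs = psi.subs([(x, x + eps * y), (y, y - eps * x)], simultaneous=True)

# RHS: (I - i*eps*Lz/hbar) acting on the component
Lz_psi = -sp.I * hbar * (x * sp.diff(psi, y) - y * sp.diff(psi, x))
rhs = psi - sp.I * eps / hbar * Lz_psi

# The two sides agree through first order in eps:
delta = sp.expand(lhs - rhs)
assert delta.subs(eps, 0) == 0                     # zeroth order
assert sp.diff(delta, eps).subs(eps, 0) == 0       # first order
```

The leftover difference is of order {\varepsilon_{z}^{2}}, consistent with the first-order expansion used throughout.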

Plugging this into 11 and keeping terms only up to order {\varepsilon_{z}} we have

\displaystyle V_{x}^{\prime} \displaystyle = \displaystyle \left(I-\frac{i}{\hbar}\varepsilon_{z}L_{z}\right)V_{x}-V_{y}\varepsilon_{z}\ \ \ \ \ (17)
\displaystyle V_{y}^{\prime} \displaystyle = \displaystyle \left(I-\frac{i}{\hbar}\varepsilon_{z}L_{z}\right)V_{y}+V_{x}\varepsilon_{z} \ \ \ \ \ (18)

In matrix form, this is

\displaystyle \left[\begin{array}{c} V_{x}^{\prime}\\ V_{y}^{\prime} \end{array}\right] \displaystyle = \displaystyle \left(\left[\begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right]-\frac{i\varepsilon_{z}}{\hbar}\left[\begin{array}{cc} L_{z} & 0\\ 0 & L_{z} \end{array}\right]-\varepsilon_{z}\left[\begin{array}{cc} 0 & 1\\ -1 & 0 \end{array}\right]\right)\left[\begin{array}{c} V_{x}\\ V_{y} \end{array}\right]\ \ \ \ \ (19)
\displaystyle \displaystyle = \displaystyle \left(\left[\begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right]-\frac{i\varepsilon_{z}}{\hbar}\left[\begin{array}{cc} L_{z} & 0\\ 0 & L_{z} \end{array}\right]-\frac{i\varepsilon_{z}}{\hbar}\left[\begin{array}{cc} 0 & -i\hbar\\ i\hbar & 0 \end{array}\right]\right)\left[\begin{array}{c} V_{x}\\ V_{y} \end{array}\right]\ \ \ \ \ (20)
\displaystyle \displaystyle = \displaystyle \left(I-\frac{i\varepsilon_{z}}{\hbar}J_{z}\right)\left[\begin{array}{c} V_{x}\\ V_{y} \end{array}\right] \ \ \ \ \ (21)

This has the same form as 13, except that the angular momentum generator is now the sum of {L_{z}} and the final matrix on the RHS above, which Shankar suggestively calls {S_{z}}, in anticipation of spin, which he hasn’t yet introduced at this stage. That is,

\displaystyle J_{z} \displaystyle = \displaystyle L_{z}+S_{z}\ \ \ \ \ (22)
\displaystyle \displaystyle = \displaystyle \left[\begin{array}{cc} L_{z} & 0\\ 0 & L_{z} \end{array}\right]+\left[\begin{array}{cc} 0 & -i\hbar\\ i\hbar & 0 \end{array}\right] \ \ \ \ \ (23)

The eigenvalues of the second matrix here are just {\pm\hbar}, so we haven’t yet encountered half-integral values of angular momentum.
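The eigenvalue claim is easy to confirm numerically (a small sketch assuming NumPy, with {\hbar=1}):

```python
import numpy as np

hbar = 1.0
Sz = np.array([[0, -1j * hbar], [1j * hbar, 0]])

# Sz is Hermitian, so eigvalsh applies; eigenvalues are returned in ascending order.
eigvals = np.linalg.eigvalsh(Sz)
assert np.allclose(eigvals, [-hbar, hbar])
```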

Vector operators; transformation under rotation

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.4.4.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

A vector operator {\mathbf{V}} is defined as an operator whose components transform under rotation according to

\displaystyle U^{\dagger}\left[R\right]V_{i}U\left[R\right]=\sum_{j}R_{ij}V_{j} \ \ \ \ \ (1)

 

where {R} is the rotation matrix in either 2 or 3 dimensions. We’ve seen that, for an infinitesimal rotation about an arbitrary axis {\delta\boldsymbol{\theta}}, a vector transforms like

\displaystyle \mathbf{V}\rightarrow\mathbf{V}+\delta\boldsymbol{\theta}\times\mathbf{V} \ \ \ \ \ (2)

This can be written more compactly using the Levi-Civita tensor, since component {i} of a cross product is

\displaystyle \left(\delta\boldsymbol{\theta}\times\mathbf{V}\right)_{i}=\sum_{j,k}\varepsilon_{ijk}\left(\delta\theta\right)_{j}V_{k} \ \ \ \ \ (3)

We get

\displaystyle \sum_{j}R_{ij}V_{j}=V_{i}+\sum_{j,k}\varepsilon_{ijk}\left(\delta\theta\right)_{j}V_{k} \ \ \ \ \ (4)

 

The operator {U\left[R\right]} is given by

\displaystyle U\left[R\left(\delta\boldsymbol{\theta}\right)\right]=I-\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L} \ \ \ \ \ (5)

 

where {\mathbf{L}} is the angular momentum. Plugging this into 1, we have, to first order in {\delta\boldsymbol{\theta}} (remembering that the components of {\mathbf{L}} do not commute with each other and, in general also do not commute with the components of {\mathbf{V}}):

\displaystyle \left(I+\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L}\right)V_{i}\left(I-\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L}\right) \displaystyle = \displaystyle V_{i}+\frac{i}{\hbar}\sum_{j}\left(\delta\theta_{j}L_{j}\right)V_{i}-\frac{i}{\hbar}V_{i}\sum_{j}\left(\delta\theta_{j}L_{j}\right)\ \ \ \ \ (6)
\displaystyle \displaystyle = \displaystyle V_{i}+\frac{i}{\hbar}\sum_{j}\delta\theta_{j}\left[L_{j},V_{i}\right] \ \ \ \ \ (7)

Setting this equal to the RHS of 4 we have, equating coefficients of {\delta\theta_{j}}:

\displaystyle \frac{i}{\hbar}\left[L_{j},V_{i}\right] \displaystyle = \displaystyle \sum_{k}\varepsilon_{ijk}V_{k}\ \ \ \ \ (8)
\displaystyle \left[V_{i},L_{j}\right] \displaystyle = \displaystyle i\hbar\sum_{k}\varepsilon_{ijk}V_{k} \ \ \ \ \ (9)
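As a spot check of this result, we can apply both sides to a sample function with {\mathbf{V}=\mathbf{r}} (the position operator, which is a vector operator), using the differential-operator form of {\mathbf{L}}. This is a sketch assuming SymPy; the test function is an arbitrary choice:

```python
import sympy as sp

x, y, z, hbar = sp.symbols('x y z hbar')
r = [x, y, z]

def L(j, f):
    """Apply L_j = -i*hbar*(r x grad)_j to f, with indices 0, 1, 2 = x, y, z."""
    a, b = (j + 1) % 3, (j + 2) % 3
    return -sp.I * hbar * (r[a] * sp.diff(f, r[b]) - r[b] * sp.diff(f, r[a]))

def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

f = x * y * z**2   # arbitrary test function (my choice, not Shankar's)
for i in range(3):
    for j in range(3):
        lhs = sp.expand(r[i] * L(j, f) - L(j, r[i] * f))
        rhs = sp.expand(sp.I * hbar * sum(eps(i, j, k) * r[k] * f for k in range(3)))
        assert sp.simplify(lhs - rhs) == 0
```

For example, with {i=x} and {j=y} this confirms {\left[X,L_{y}\right]=i\hbar Z}.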

With {\mathbf{V}=\mathbf{L}}, we regain the commutation relations for the components of angular momentum

\displaystyle \left[L_{x},L_{y}\right] \displaystyle = \displaystyle i\hbar L_{z}\ \ \ \ \ (10)
\displaystyle \left[L_{y},L_{z}\right] \displaystyle = \displaystyle i\hbar L_{x}\ \ \ \ \ (11)
\displaystyle \left[L_{z},L_{x}\right] \displaystyle = \displaystyle i\hbar L_{y} \ \ \ \ \ (12)

By the way, it is possible to write these commutation relations in the compact form

\displaystyle \mathbf{L}\times\mathbf{L}=i\hbar\mathbf{L} \ \ \ \ \ (13)

 

This looks wrong if you’re used to the standard definition of the cross product for vectors whose components are ordinary numbers, since for such a vector {\mathbf{a}}, we always have {\mathbf{a}\times\mathbf{a}=0}. However, if the components of the vector are operators that don’t commute, then the result is not zero, as we can see:

\displaystyle \left(\mathbf{L}\times\mathbf{L}\right)_{i} \displaystyle = \displaystyle \sum_{j,k}\varepsilon_{ijk}L_{j}L_{k} \ \ \ \ \ (14)

If {i=x}, for example, then the sum on the RHS gives

\displaystyle \left(\mathbf{L}\times\mathbf{L}\right)_{x} \displaystyle = \displaystyle \sum_{j,k}\varepsilon_{xjk}L_{j}L_{k}\ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle L_{y}L_{z}-L_{z}L_{y}\ \ \ \ \ (16)
\displaystyle \displaystyle = \displaystyle \left[L_{y},L_{z}\right] \ \ \ \ \ (17)

From 13, this gives

\displaystyle \left[L_{y},L_{z}\right]=i\hbar L_{x} \ \ \ \ \ (18)
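We can also verify {\mathbf{L}\times\mathbf{L}=i\hbar\mathbf{L}} numerically, using the {3\times3} angular momentum matrices for {j=1} quoted earlier (a sketch assuming NumPy, with {\hbar=1}):

```python
import numpy as np

s = 1 / np.sqrt(2)
Lx = np.array([[0, s, 0], [s, 0, s], [0, s, 0]], dtype=complex)
Ly = 1j * np.array([[0, -s, 0], [s, 0, -s], [0, s, 0]])
Lz = np.diag([1.0, 0.0, -1.0]).astype(complex)
L = [Lx, Ly, Lz]

# Levi-Civita tensor for indices 0, 1, 2:
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

# (L x L)_i = sum_{jk} eps_{ijk} L_j L_k, using operator (matrix) products:
for i in range(3):
    cross_i = sum(eps[i, j, k] * L[j] @ L[k]
                  for j in range(3) for k in range(3))
    assert np.allclose(cross_i, 1j * L[i])
```

The nonzero result comes entirely from the non-commuting matrix products, exactly as the argument above shows.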

Finite rotations about an arbitrary axis in three dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.4.3.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

The operators for an infinitesimal rotation in 3-d are

\displaystyle   U\left[R\left(\varepsilon_{x}\hat{\mathbf{x}}\right)\right] \displaystyle  = \displaystyle  I-\frac{i\varepsilon_{x}L_{x}}{\hbar}\ \ \ \ \ (1)
\displaystyle  U\left[R\left(\varepsilon_{y}\hat{\mathbf{y}}\right)\right] \displaystyle  = \displaystyle  I-\frac{i\varepsilon_{y}L_{y}}{\hbar}\ \ \ \ \ (2)
\displaystyle  U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right] \displaystyle  = \displaystyle  I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (3)

If we have a finite (larger than infinitesimal) rotation about one of the coordinate axes, we can create the operator by dividing up the finite rotation angle {\theta} into {N} small increments and take the limit as {N\rightarrow\infty}, just as we did with finite translations. For example, for a finite rotation about the {x} axis, we have

\displaystyle  U\left[R\left(\theta\hat{\mathbf{x}}\right)\right]=\lim_{N\rightarrow\infty}\left(I-\frac{i\theta L_{x}}{N\hbar}\right)^{N}=e^{-i\theta L_{x}/\hbar} \ \ \ \ \ (4)

What if we have a finite rotation about some arbitrarily directed axis? Suppose we have a vector {\mathbf{r}} as shown in the figure:

The vector {\mathbf{r}} makes an angle {\alpha} with the {z} axis, and we wish to rotate {\mathbf{r}} about the {z} axis by an angle {\delta\theta}. Note that this argument is completely general, since if the axis of rotation is not the {z} axis, we can rotate the entire coordinate system so that the axis of rotation is the {z} axis. The generality enters through the fact that we’re keeping the angle {\alpha} arbitrary.

The rotation by {\delta\theta\hat{\mathbf{z}}\equiv\delta\boldsymbol{\theta}} shifts the tip of {\mathbf{r}} along the circle shown by a distance {\left(r\sin\alpha\right)\delta\theta} in a counterclockwise direction (looking down the {z} axis). This shift is in a direction that is perpendicular to both {\hat{\mathbf{z}}} and {\mathbf{r}}, so the little vector representing the shift in {\mathbf{r}} is

\displaystyle  \delta\mathbf{r}=\left(\delta\boldsymbol{\theta}\right)\times\mathbf{r} \ \ \ \ \ (5)

Thus under the rotation {\delta\boldsymbol{\theta}}, a vector transforms as

\displaystyle  \mathbf{r}\rightarrow\mathbf{r}+\left(\delta\boldsymbol{\theta}\right)\times\mathbf{r} \ \ \ \ \ (6)

Just as with translations, if we rotate the coordinate system by an amount {\delta\boldsymbol{\theta}}, this is equivalent to rotating the wave function {\psi\left(\mathbf{r}\right)} by the same angle, but in the opposite direction, so we require

\displaystyle  \psi\left(\mathbf{r}\right)\rightarrow\psi\left(\mathbf{r}-\left(\delta\boldsymbol{\theta}\right)\times\mathbf{r}\right) \ \ \ \ \ (7)

A first order Taylor expansion of the quantity on the RHS gives

\displaystyle  \psi\left(\mathbf{r}-\left(\delta\boldsymbol{\theta}\right)\times\mathbf{r}\right)=\psi\left(\mathbf{r}\right)-\left(\delta\boldsymbol{\theta}\times\mathbf{r}\right)\cdot\nabla\psi \ \ \ \ \ (8)

The operator generating this rotation will have the form (in analogy with the forms for the coordinate axes above):

\displaystyle  U\left[R\left(\delta\boldsymbol{\theta}\right)\right]=I-\frac{i\delta\theta}{\hbar}L_{\hat{\theta}} \ \ \ \ \ (9)

where {L_{\hat{\theta}}} is an angular momentum operator to be determined.

Writing out the RHS of 8, we have

\displaystyle   \psi\left(\mathbf{r}\right)-\left(\delta\boldsymbol{\theta}\times\mathbf{r}\right)\cdot\nabla\psi \displaystyle  = \displaystyle  \psi\left(\mathbf{r}\right)-\left(\delta\theta_{y}z-\delta\theta_{z}y\right)\frac{\partial\psi}{\partial x}+\left(\delta\theta_{x}z-\delta\theta_{z}x\right)\frac{\partial\psi}{\partial y}-\left(\delta\theta_{x}y-\delta\theta_{y}x\right)\frac{\partial\psi}{\partial z}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \psi\left(\mathbf{r}\right)-\delta\theta_{x}\left(y\frac{\partial\psi}{\partial z}-z\frac{\partial\psi}{\partial y}\right)-\delta\theta_{y}\left(z\frac{\partial\psi}{\partial x}-x\frac{\partial\psi}{\partial z}\right)-\delta\theta_{z}\left(x\frac{\partial\psi}{\partial y}-y\frac{\partial\psi}{\partial x}\right)\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \psi\left(\mathbf{r}\right)-\delta\boldsymbol{\theta}\cdot\frac{i}{\hbar}\mathbf{r}\times\mathbf{p}\psi\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \psi\left(\mathbf{r}\right)-\frac{i}{\hbar}\delta\boldsymbol{\theta}\cdot\mathbf{L}\psi\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  U\left[R\left(\delta\boldsymbol{\theta}\right)\right]\psi \ \ \ \ \ (14)

Comparing this with 9, we see that

\displaystyle  L_{\hat{\theta}}=\hat{\boldsymbol{\theta}}\cdot\mathbf{L} \ \ \ \ \ (15)

where {\hat{\boldsymbol{\theta}}} is the unit vector along the axis of rotation. Since all rotations about the same axis commute, we can use the same procedure as above to generate a finite rotation {\boldsymbol{\theta}} about an arbitrary axis and get

\displaystyle  U\left[R\left(\boldsymbol{\theta}\right)\right]=e^{-i\boldsymbol{\theta}\cdot\mathbf{L}/\hbar} \ \ \ \ \ (16)
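As a concrete check, in the defining (vector) representation the generators have matrix elements {\left(L_{i}\right)_{jk}=-i\hbar\varepsilon_{ijk}}, and exponentiating {L_{z}} should reproduce the finite rotation matrix {R\left(\theta\hat{\mathbf{z}}\right)}. A sketch assuming NumPy (with {\hbar=1}), evaluating the exponential by a truncated Taylor series:

```python
import numpy as np

def expm_series(A, terms=40):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

# (Lz)_{jk} = -i * eps_{zjk} in the 3-d vector representation:
Lz = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])

theta = 0.7
U = expm_series(-1j * theta * Lz)
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
assert np.allclose(U, R)
```

A quick sanity check: a rotation by {\pi/2} about {z} should carry {\hat{\mathbf{x}}} into {\hat{\mathbf{y}}}.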

Angular momentum in three dimensions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.4.2.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

We can now generalize our treatment of rotation, originally studied in two dimensions, to three dimensions. We’ll view a 3-d rotation as a combination of rotations about the {x}, {y} and {z} axes, each of which can be represented by a {3\times3} matrix. These matrices are as follows:

\displaystyle   R\left(\theta\hat{\mathbf{x}}\right) \displaystyle  = \displaystyle  \left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & \cos\theta & -\sin\theta\\ 0 & \sin\theta & \cos\theta \end{array}\right]\ \ \ \ \ (1)
\displaystyle  R\left(\theta\hat{\mathbf{y}}\right) \displaystyle  = \displaystyle  \left[\begin{array}{ccc} \cos\theta & 0 & \sin\theta\\ 0 & 1 & 0\\ -\sin\theta & 0 & \cos\theta \end{array}\right]\ \ \ \ \ (2)
\displaystyle  R\left(\theta\hat{\mathbf{z}}\right) \displaystyle  = \displaystyle  \left[\begin{array}{ccc} \cos\theta & -\sin\theta & 0\\ \sin\theta & \cos\theta & 0\\ 0 & 0 & 1 \end{array}\right] \ \ \ \ \ (3)

We’re interested in infinitesimal rotations, for which we retain terms up to first order in the rotation angle {\varepsilon_{i}}, so that {\cos\varepsilon_{i}\approx1} and {\sin\varepsilon_{i}\approx\varepsilon_{i}}. This gives the infinitesimal rotation matrices as

\displaystyle   R\left(\varepsilon_{x}\hat{\mathbf{x}}\right) \displaystyle  = \displaystyle  \left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & 1 & -\varepsilon_{x}\\ 0 & \varepsilon_{x} & 1 \end{array}\right]\ \ \ \ \ (4)
\displaystyle  R\left(\varepsilon_{y}\hat{\mathbf{y}}\right) \displaystyle  = \displaystyle  \left[\begin{array}{ccc} 1 & 0 & \varepsilon_{y}\\ 0 & 1 & 0\\ -\varepsilon_{y} & 0 & 1 \end{array}\right]\ \ \ \ \ (5)
\displaystyle  R\left(\varepsilon_{z}\hat{\mathbf{z}}\right) \displaystyle  = \displaystyle  \left[\begin{array}{ccc} 1 & -\varepsilon_{z} & 0\\ \varepsilon_{z} & 1 & 0\\ 0 & 0 & 1 \end{array}\right] \ \ \ \ \ (6)

We now consider the series of rotations as follows: first, a rotation by {\varepsilon_{x}\hat{\mathbf{x}}}, then by {\varepsilon_{y}\hat{\mathbf{y}}}, then by {-\varepsilon_{x}\hat{\mathbf{x}}} and finally by {-\varepsilon_{y}\hat{\mathbf{y}}}. Because the various rotations don’t commute, we don’t end up back where we started. We can calculate the matrix products to find the final rotation.

\displaystyle   R \displaystyle  = \displaystyle  R\left(-\varepsilon_{y}\hat{\mathbf{y}}\right)R\left(-\varepsilon_{x}\hat{\mathbf{x}}\right)R\left(\varepsilon_{y}\hat{\mathbf{y}}\right)R\left(\varepsilon_{x}\hat{\mathbf{x}}\right)\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{ccc} 1 & 0 & -\varepsilon_{y}\\ 0 & 1 & 0\\ \varepsilon_{y} & 0 & 1 \end{array}\right]\left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & 1 & \varepsilon_{x}\\ 0 & -\varepsilon_{x} & 1 \end{array}\right]\left[\begin{array}{ccc} 1 & 0 & \varepsilon_{y}\\ 0 & 1 & 0\\ -\varepsilon_{y} & 0 & 1 \end{array}\right]\left[\begin{array}{ccc} 1 & 0 & 0\\ 0 & 1 & -\varepsilon_{x}\\ 0 & \varepsilon_{x} & 1 \end{array}\right]\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{ccc} 1 & \varepsilon_{x}\varepsilon_{y} & -\varepsilon_{y}\\ 0 & 1 & \varepsilon_{x}\\ \varepsilon_{y} & -\varepsilon_{x} & 1 \end{array}\right]\left[\begin{array}{ccc} 1 & \varepsilon_{x}\varepsilon_{y} & \varepsilon_{y}\\ 0 & 1 & -\varepsilon_{x}\\ -\varepsilon_{y} & \varepsilon_{x} & 1 \end{array}\right]\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{ccc} 1+\varepsilon_{y}^{2} & \varepsilon_{x}\varepsilon_{y} & -\varepsilon_{x}^{2}\varepsilon_{y}\\ -\varepsilon_{x}\varepsilon_{y} & 1+\varepsilon_{x}^{2} & 0\\ 0 & \varepsilon_{x}\varepsilon_{y}^{2} & 1+\varepsilon_{x}^{2}+\varepsilon_{y}^{2} \end{array}\right] \ \ \ \ \ (10)

To get the third line, we multiplied the first two matrices in the second line, and the last two matrices in the second line. In the final result, we can discard terms containing {\varepsilon_{x}^{2}} or {\varepsilon_{y}^{2}} to get

\displaystyle  R=\left[\begin{array}{ccc} 1 & \varepsilon_{x}\varepsilon_{y} & 0\\ -\varepsilon_{x}\varepsilon_{y} & 1 & 0\\ 0 & 0 & 1 \end{array}\right]=R\left(-\varepsilon_{x}\varepsilon_{y}\hat{\mathbf{z}}\right) \ \ \ \ \ (11)

Thus the result of the four rotations about the {x} and {y} axes is a single rotation about the {z} axis.
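This composition can be checked numerically with the exact (finite-angle) rotation matrices; the leftover discrepancy is of third order in the small angles. A sketch assuming NumPy:

```python
import numpy as np

def Rx(t):
    return np.array([[1, 0, 0],
                     [0, np.cos(t), -np.sin(t)],
                     [0, np.sin(t), np.cos(t)]])

def Ry(t):
    return np.array([[np.cos(t), 0, np.sin(t)],
                     [0, 1, 0],
                     [-np.sin(t), 0, np.cos(t)]])

def Rz(t):
    return np.array([[np.cos(t), -np.sin(t), 0],
                     [np.sin(t), np.cos(t), 0],
                     [0, 0, 1]])

ex, ey = 1e-4, 1e-4
composed = Ry(-ey) @ Rx(-ex) @ Ry(ey) @ Rx(ex)

# The four rotations combine into a single rotation by -ex*ey about z,
# up to corrections of third order (~1e-12 here):
assert np.allclose(composed, Rz(-ex * ey), atol=1e-11)
```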

To convert this to quantum operators, we define the operator {U\left[R\right]} by analogy with the procedure we used for 2-d rotations. That is, each {U} is written in terms of the corresponding angular momentum operator {L_{x}}, {L_{y}} or {L_{z}} as

\displaystyle   U\left[R\left(\varepsilon_{x}\hat{\mathbf{x}}\right)\right] \displaystyle  = \displaystyle  I-\frac{i\varepsilon_{x}L_{x}}{\hbar}\ \ \ \ \ (12)
\displaystyle  U\left[R\left(\varepsilon_{y}\hat{\mathbf{y}}\right)\right] \displaystyle  = \displaystyle  I-\frac{i\varepsilon_{y}L_{y}}{\hbar}\ \ \ \ \ (13)
\displaystyle  U\left[R\left(\varepsilon_{z}\hat{\mathbf{z}}\right)\right] \displaystyle  = \displaystyle  I-\frac{i\varepsilon_{z}L_{z}}{\hbar} \ \ \ \ \ (14)

By comparing 7 and 11 we thus require these {U} operators to satisfy

\displaystyle  U\left[R\left(-\varepsilon_{y}\hat{\mathbf{y}}\right)\right]U\left[R\left(-\varepsilon_{x}\hat{\mathbf{x}}\right)\right]U\left[R\left(\varepsilon_{y}\hat{\mathbf{y}}\right)\right]U\left[R\left(\varepsilon_{x}\hat{\mathbf{x}}\right)\right]=U\left[R\left(-\varepsilon_{x}\varepsilon_{y}\hat{\mathbf{z}}\right)\right] \ \ \ \ \ (15)

We can get the commutation relation {\left[L_{x},L_{y}\right]} by matching coefficients of {\varepsilon_{x}\varepsilon_{y}} on each side of this equation. On the RHS, the coefficient is {\frac{iL_{z}}{\hbar}}. On the LHS, we can pick out the terms involving {\varepsilon_{x}\varepsilon_{y}} to get

\displaystyle  -\frac{1}{\hbar^{2}}\left(L_{y}L_{x}-L_{y}L_{x}-L_{x}L_{y}+L_{y}L_{x}\right)=\frac{1}{\hbar^{2}}\left[L_{x},L_{y}\right] \ \ \ \ \ (16)

The first term on the LHS comes from the {\varepsilon_{x}} term in the first {U} in 15 multiplied by the {\varepsilon_{y}} term in the second {U} (with the {I} term in the other two {U}s); the second term on the LHS comes from the {\varepsilon_{x}} term in the first {U} in 15 multiplied by the {\varepsilon_{y}} term in the fourth {U}, and so on.

Matching the two sides, we get

\displaystyle  \left[L_{x},L_{y}\right]=i\hbar L_{z} \ \ \ \ \ (17)

By comparison with the classical definitions of the three components of {\mathbf{L}}, we can write the quantum operators in terms of position and momentum operators as

\displaystyle   L_{x} \displaystyle  = \displaystyle  YP_{z}-ZP_{y}\ \ \ \ \ (18)
\displaystyle  L_{y} \displaystyle  = \displaystyle  ZP_{x}-XP_{z}\ \ \ \ \ (19)
\displaystyle  L_{z} \displaystyle  = \displaystyle  XP_{y}-YP_{x} \ \ \ \ \ (20)

From the commutators of position and momentum {\left[X,P_{x}\right]=i\hbar} and so on, we can verify 17 from these relations as well.

\displaystyle   \left[L_{x},L_{y}\right] \displaystyle  = \displaystyle  \left[YP_{z}-ZP_{y},ZP_{x}-XP_{z}\right]\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  \left[YP_{z},ZP_{x}-XP_{z}\right]-\left[ZP_{y},ZP_{x}-XP_{z}\right]\ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  -i\hbar YP_{x}+i\hbar P_{y}X\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  i\hbar\left(XP_{y}-YP_{x}\right)\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  i\hbar L_{z} \ \ \ \ \ (25)

The third line follows because {\left[YP_{z},XP_{z}\right]=\left[ZP_{y},ZP_{x}\right]=0}. The other two commutation relations follow by cyclic permutation of {x}, {y} and {z}:

\displaystyle   \left[L_{y},L_{z}\right] \displaystyle  = \displaystyle  i\hbar L_{x}\ \ \ \ \ (26)
\displaystyle  \left[L_{z},L_{x}\right] \displaystyle  = \displaystyle  i\hbar L_{y} \ \ \ \ \ (27)
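These commutators can be spot-checked with the differential-operator forms {L_{x}=-i\hbar\left(y\partial_{z}-z\partial_{y}\right)} and its cyclic permutations, applied to a sample function (a sketch assuming SymPy; the test function is an arbitrary choice, not from Shankar):

```python
import sympy as sp

x, y, z, hbar = sp.symbols('x y z hbar')

# L_i as differential operators, from L = r x p with p = -i*hbar*grad:
def Lx(f): return -sp.I * hbar * (y * sp.diff(f, z) - z * sp.diff(f, y))
def Ly(f): return -sp.I * hbar * (z * sp.diff(f, x) - x * sp.diff(f, z))
def Lz(f): return -sp.I * hbar * (x * sp.diff(f, y) - y * sp.diff(f, x))

f = x * y**2 * z**3                       # arbitrary polynomial test function
commutator = sp.expand(Lx(Ly(f)) - Ly(Lx(f)))
assert sp.simplify(commutator - sp.I * hbar * Lz(f)) == 0
```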

Levi-Civita antisymmetric tensor, vector products and systems of 3 fermions

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.4.1.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

The Levi-Civita symbol {\varepsilon_{ijk}} is defined as {+1} if {i,j,k} have the values 1,2,3 (in that order), 2,3,1 or 3,1,2. Swapping any pair of indices multiplies the value by {-1}, so that, for example, {\varepsilon_{123}=+1} and {\varepsilon_{213}=-1}. If two indices are the same, such as {i=j=1}, then swapping them leaves {\varepsilon_{11k}} unchanged so the requirement that {\varepsilon_{ijk}=-\varepsilon_{jik}} means that {\varepsilon_{ijk}=0} if any two of its indices are equal.

The symbol is actually an antisymmetric tensor of rank 3, and is found frequently in physical and mathematical equations. One example is in the cross product of two 3-d vectors. If

\displaystyle  \mathbf{c}=\mathbf{a}\times\mathbf{b} \ \ \ \ \ (1)

we can work out the components of {\mathbf{c}} in the usual way by calculating the determinant:

\displaystyle   \mathbf{c} \displaystyle  = \displaystyle  \left|\begin{array}{ccc} \hat{\mathbf{x}}_{1} & \hat{\mathbf{x}}_{2} & \hat{\mathbf{x}}_{3}\\ a_{1} & a_{2} & a_{3}\\ b_{1} & b_{2} & b_{3} \end{array}\right|\ \ \ \ \ (2)
\displaystyle  \displaystyle  = \displaystyle  \left(a_{2}b_{3}-b_{2}a_{3}\right)\hat{\mathbf{x}}_{1}-\left(a_{1}b_{3}-b_{1}a_{3}\right)\hat{\mathbf{x}}_{2}+\left(a_{1}b_{2}-b_{1}a_{2}\right)\hat{\mathbf{x}}_{3} \ \ \ \ \ (3)

where I’ve used {\hat{\mathbf{x}}_{1}=\hat{\mathbf{x}}}, {\hat{\mathbf{x}}_{2}=\hat{\mathbf{y}}} and {\hat{\mathbf{x}}_{3}=\hat{\mathbf{z}}}.

Using {\varepsilon_{ijk}} we can write this in the compact form

\displaystyle  \mathbf{c}=\sum_{i,j,k}\varepsilon_{ijk}\hat{\mathbf{x}}_{i}a_{j}b_{k} \ \ \ \ \ (4)

as can be verified by expanding the sum and comparing with 3.
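The expansion is easy to verify by direct computation (a sketch assuming NumPy; the closed form {\varepsilon_{ijk}=\frac{1}{2}\left(i-j\right)\left(j-k\right)\left(k-i\right)} for indices 0, 1, 2 is a standard convenience, not from Shankar):

```python
import numpy as np

def levi_civita(i, j, k):
    """+1 for even permutations of (0, 1, 2), -1 for odd, 0 otherwise."""
    return (i - j) * (j - k) * (k - i) // 2

a = np.array([1.0, -2.0, 3.0])
b = np.array([4.0, 0.5, -1.0])

# Component i of the cross product: sum_{jk} eps_{ijk} a_j b_k
c = np.zeros(3)
for i in range(3):
    c[i] = sum(levi_civita(i, j, k) * a[j] * b[k]
               for j in range(3) for k in range(3))

assert np.allclose(c, np.cross(a, b))
```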

The Levi-Civita symbol can be used to write a completely antisymmetric wave function for a set of three fermions. Suppose the wave function for a single fermion in state {n} with coordinate {x_{a}} is {U_{n}\left(x_{a}\right)} (where both {n} and {a} can take values 1, 2 or 3). Then a completely antisymmetric wave function is

\displaystyle  \psi_{A}\left(x_{1},x_{2},x_{3}\right)=\frac{1}{\sqrt{6}}\sum_{i,j,k}\varepsilon_{ijk}U_{i}\left(x_{1}\right)U_{j}\left(x_{2}\right)U_{k}\left(x_{3}\right) \ \ \ \ \ (5)

The factor of {\frac{1}{\sqrt{6}}} is for normalization and assumes that the {U_{n}} are all normalized wave functions.

Swapping the locations {x_{1}} and {x_{2}}, for example, is equivalent to swapping {i} and {j} in the sum, which produces the negative of the original sum. That is

\displaystyle   \psi_{A}\left(x_{2},x_{1},x_{3}\right) \displaystyle  = \displaystyle  \frac{1}{\sqrt{6}}\sum_{i,j,k}\varepsilon_{ijk}U_{i}\left(x_{2}\right)U_{j}\left(x_{1}\right)U_{k}\left(x_{3}\right)\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{\sqrt{6}}\sum_{i,j,k}\varepsilon_{jik}U_{i}\left(x_{1}\right)U_{j}\left(x_{2}\right)U_{k}\left(x_{3}\right)\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  -\frac{1}{\sqrt{6}}\sum_{i,j,k}\varepsilon_{ijk}U_{i}\left(x_{1}\right)U_{j}\left(x_{2}\right)U_{k}\left(x_{3}\right)\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  -\psi_{A}\left(x_{1},x_{2},x_{3}\right) \ \ \ \ \ (9)

The same argument applies to swapping the other pairs of locations.
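On a discrete grid, where each {U_{n}} becomes an orthonormal vector, the construction and its antisymmetry can be checked directly (a sketch assuming NumPy; the random orthonormal states are my own illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
# Three orthonormal single-particle "wave functions" on a 5-point grid,
# from the QR decomposition of a random matrix:
U, _ = np.linalg.qr(rng.normal(size=(5, 3)))

# Levi-Civita tensor for indices 0, 1, 2:
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

# psi[a, b, c] = (1/sqrt(6)) sum_{ijk} eps_{ijk} U_i(a) U_j(b) U_k(c)
psi = np.einsum('ijk,ai,bj,ck->abc', eps, U, U, U) / np.sqrt(6)

# Swapping any pair of particle coordinates flips the sign:
assert np.allclose(psi.transpose(1, 0, 2), -psi)   # x1 <-> x2
assert np.allclose(psi.transpose(0, 2, 1), -psi)   # x2 <-> x3
# The state is normalized (the grid sum replaces the integral):
assert np.isclose(np.sum(psi**2), 1.0)
```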

Angular momentum of circular motion

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 12, Exercise 12.3.6.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

A particle of mass {\mu} constrained to move (at constant speed {v}, we assume) on a circle of radius {a} centred at the origin in the {xy} plane has a constant kinetic energy of {\frac{1}{2}\mu v^{2}}. As its momentum {\mathbf{p}} is always perpendicular to the radius vector {\mathbf{r}}, the angular momentum is given by

\displaystyle  \mathbf{L}=\mathbf{r}\times\mathbf{p}=\mu av\hat{\mathbf{z}}=L_{z}\hat{\mathbf{z}} \ \ \ \ \ (1)

The energy can thus be written as

\displaystyle  H=\frac{1}{2}\mu v^{2}=\frac{L_{z}^{2}}{2\mu a^{2}} \ \ \ \ \ (2)

In polar coordinates, the angular momentum operator is

\displaystyle  L_{z}=-i\hbar\frac{\partial}{\partial\phi} \ \ \ \ \ (3)

The eigenvalue problem for this system is therefore

\displaystyle   H\psi \displaystyle  = \displaystyle  E\psi\ \ \ \ \ (4)
\displaystyle  -\frac{\hbar^{2}}{2\mu a^{2}}\frac{\partial^{2}\psi}{\partial\phi^{2}} \displaystyle  = \displaystyle  E\psi \ \ \ \ \ (5)

The eigenvalues {\ell_{z}} of {L_{z}} are the solutions of

\displaystyle   -i\hbar\frac{\partial\psi}{\partial\phi} \displaystyle  = \displaystyle  \ell_{z}\psi \ \ \ \ \ (6)

which are

\displaystyle  \psi=Ae^{i\ell_{z}\phi/\hbar}=Ae^{im\phi} \ \ \ \ \ (7)

for some constant {A}, with the quantization condition (arising from the requirement that {\psi\left(\phi+2\pi\right)=\psi\left(\phi\right)})

\displaystyle  \ell_{z}=m\hbar \ \ \ \ \ (8)

where {m} is an integer (positive, negative or zero). Plugging this into 5 we find

\displaystyle  E=\frac{\hbar^{2}m^{2}}{2\mu a^{2}} \ \ \ \ \ (9)

Each energy with {m\ne0} is two-fold degenerate, since {+m} and {-m} give the same energy. This corresponds to the particle moving around the circle in either the counterclockwise or the clockwise direction.
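The spectrum 9, including the double degeneracy for {m\ne0}, can be reproduced numerically by diagonalizing the Hamiltonian 5 on a discretized ring (a sketch assuming NumPy, with {\hbar=\mu=a=1}):

```python
import numpy as np

N = 400                      # grid points around the circle
h = 2 * np.pi / N            # grid spacing in phi

# Second-derivative matrix with periodic (ring) boundary conditions:
D2 = (-2 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)) / h**2
D2[0, -1] = D2[-1, 0] = 1 / h**2

H = -0.5 * D2                # H = -(1/2) d^2/dphi^2 with hbar = mu = a = 1
E = np.sort(np.linalg.eigvalsh(H))

# Expected: E_m = m^2/2, i.e. 0, then doubly degenerate 1/2, 2, 9/2, ...
expected = [0.0, 0.5, 0.5, 2.0, 2.0, 4.5, 4.5]
assert np.allclose(E[:7], expected, atol=1e-2)
```

The lowest level ({m=0}) is non-degenerate, while each higher level appears twice, matching the clockwise/counterclockwise pairs.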