Featured post

Welcome to Physics Pages

This blog consists of my notes and solutions to problems in various areas of mainstream physics. An index to the topics covered is contained in the links in the sidebar on the right, or in the menu at the top of the page.

This isn’t a “popular science” site, in that most posts use a fair bit of mathematics to explain their concepts. Thus this blog aims mainly to help those who are learning or reviewing physics in depth. More details on what the site contains and how to use it are on the welcome page.

Despite Stephen Hawking’s caution that every equation included in a book (or, I suppose, in a blog) would halve the readership, this blog has proved very popular since its inception in December 2010. Details of the number of visits and distinct visitors are given on the hit statistics page.

Many thanks to my loyal followers and best wishes to everyone who visits. I hope you find it useful. Constructive criticism (or even praise) is always welcome, so feel free to leave a comment in response to any of the posts.

I should point out that although I did study physics at the university level, this was back in the 1970s and by the time I started this blog in December 2010, I had forgotten pretty much everything I had learned back then. This blog represents my journey back to some level of literacy in physics. I am by no means a professional physicist or an authority on any aspect of the subject. I offer this blog as a record of my own notes and problem solutions as I worked through various books, in the hope that it will help, and possibly even inspire, others to explore this wonderful subject.

Before leaving a comment, you may find it useful to read the “Instructions for commenters”.

Spin in a precessing magnetic field – part 2

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.4.3, Part 2.

[If some equations are too small to read easily, use your browser’s magnifying option (Ctrl + on Chrome, probably something similar on other browsers).]

In the first part of this article, we saw that a particle with spin placed in a precessing magnetic field can be analyzed by moving to a frame rotating with the same frequency as the field. In this rotating frame, the magnetic field is independent of time and looks like this:

\displaystyle  \mathbf{B}_{r}=B\hat{\mathbf{x}}_{r}+\left(B_{0}-\frac{\omega}{\gamma}\right)\hat{\mathbf{z}} \ \ \ \ \ (1)

where {\hat{\mathbf{x}}_{r}} is a unit vector along the rotating {x} axis. In this frame, the Schrödinger equation has the form

\displaystyle   i\hbar\frac{\partial}{\partial t}\left|\psi_{r}\left(t\right)\right\rangle \displaystyle  = \displaystyle  -\gamma\mathbf{S}\cdot\mathbf{B}_{r}\left|\psi_{r}\left(t\right)\right\rangle \ \ \ \ \ (2)
\displaystyle  \displaystyle  = \displaystyle  \left[\left(\omega-\gamma B_{0}\right)S_{z}-\gamma BS_{x}\right]\left|\psi_{r}\left(t\right)\right\rangle \ \ \ \ \ (3)

where {\left|\psi_{r}\left(t\right)\right\rangle } is the state vector in the rotating frame, in the {S_{z}} basis. The Hamiltonian in the rotating frame is thus

\displaystyle   H \displaystyle  = \displaystyle  \left(\omega-\gamma B_{0}\right)S_{z}-\gamma BS_{x}\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  \frac{\hbar}{2}\left(\omega-\gamma B_{0}\right)\sigma_{z}-\frac{\hbar}{2}\gamma B\sigma_{x} \ \ \ \ \ (5)

Given the initial state {\left|\psi_{r}\left(0\right)\right\rangle } we can find the state at other times if we can find the propagator in the rotating frame

\displaystyle  U_{r}\left(t\right)=e^{-iHt/\hbar} \ \ \ \ \ (6)

The propagator is complicated by the fact that the Hamiltonian 5 contains two operators ({\sigma_{x}} and {\sigma_{z}}) that don’t commute, so we can’t split the exponential into the product of two simpler exponentials. However, if we expand the exponential in a power series, we see that it does actually have a fairly simple form. We have

\displaystyle   e^{-iHt/\hbar} \displaystyle  = \displaystyle  e^{-i\left[\left(\omega-\gamma B_{0}\right)\sigma_{z}-\gamma B\sigma_{x}\right]t/2}\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  e^{i\left[\left(\gamma B_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right]t/2} \ \ \ \ \ (8)

We can expand this in a power series, but first it’s useful to introduce some shorthand. We have

\displaystyle   \omega_{0} \displaystyle  \equiv \displaystyle  \gamma B_{0}\ \ \ \ \ (9)
\displaystyle  \omega_{r} \displaystyle  \equiv \displaystyle  \sqrt{\left(\gamma B_{0}-\omega\right)^{2}+\gamma^{2}B^{2}}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \sqrt{\left(\omega_{0}-\omega\right)^{2}+\gamma^{2}B^{2}} \ \ \ \ \ (11)

We get

\displaystyle   e^{-iHt/\hbar} \displaystyle  = \displaystyle  I+\frac{it}{2}\left[\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right]\ \ \ \ \ (12)
\displaystyle  \displaystyle  \displaystyle  -\frac{1}{2!}\frac{t^{2}}{2^{2}}\left[\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right]^{2}\ \ \ \ \ (13)
\displaystyle  \displaystyle  \displaystyle  -\frac{1}{3!}\frac{it^{3}}{2^{3}}\left[\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right]^{3}+\ldots \ \ \ \ \ (14)

Consider the square term in the second line. Multiplying it out, we get

\displaystyle   \left(\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right)^{2} \displaystyle  = \displaystyle  \left(\omega_{0}-\omega\right)^{2}\sigma_{z}^{2}+\gamma^{2}B^{2}\sigma_{x}^{2}+\ \ \ \ \ (15)
\displaystyle  \displaystyle  \displaystyle  \left(\omega_{0}-\omega\right)\gamma B\left(\sigma_{z}\sigma_{x}+\sigma_{x}\sigma_{z}\right) \ \ \ \ \ (16)

Using a couple of identities for Pauli matrices:

\displaystyle   \sigma_{i}^{2} \displaystyle  = \displaystyle  I\ \ \ \ \ (17)
\displaystyle  \left[\sigma_{z},\sigma_{x}\right]_{+} \displaystyle  = \displaystyle  0 \ \ \ \ \ (18)

we see that the last term vanishes and the first two terms can be combined, so we get

\displaystyle   \left(\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right)^{2} \displaystyle  = \displaystyle  \left[\left(\omega_{0}-\omega\right)^{2}+\gamma^{2}B^{2}\right]I\ \ \ \ \ (19)
\displaystyle  \displaystyle  = \displaystyle  \omega_{r}^{2}I \ \ \ \ \ (20)
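As a quick sanity check on 19 and 20, here is a minimal sketch in plain Python that squares the matrix {\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}} numerically. The names `delta` (for {\omega_{0}-\omega}) and `rabi` (for {\gamma B}) and the sample values are my own choices for illustration, not anything from Shankar.

```python
# Pauli matrices as nested lists (pure Python, no external libraries).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def field_matrix(delta, rabi):
    """Return delta*sigma_z + rabi*sigma_x (delta = omega_0 - omega, rabi = gamma*B)."""
    return [[delta * SIGMA_Z[i][j] + rabi * SIGMA_X[i][j] for j in range(2)]
            for i in range(2)]


# Illustrative values only, in arbitrary units.
delta, rabi = 3.0, 4.0
A = field_matrix(delta, rabi)
A_squared = matmul(A, A)
omega_r_squared = delta ** 2 + rabi ** 2

print(A_squared)        # [[25.0, 0.0], [0.0, 25.0]]
print(omega_r_squared)  # 25.0
```

The off-diagonal entries cancel because {\sigma_{z}} and {\sigma_{x}} anticommute, which is the content of 18.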

This simple form means that all higher terms in the power series 12 are easy to calculate. If we call the {n}th term in the series {a_{n}}, the terms with an even exponent are

\displaystyle  a_{2n}=\left(-1\right)^{n}\frac{t^{2n}}{\left(2n\right)!2^{2n}}\omega_{r}^{2n}I \ \ \ \ \ (21)

The {\left(-1\right)^{n}} comes in because of the {i} in the exponent which gets raised to successively higher powers in the series. The series of even terms is therefore a cosine:

\displaystyle  \sum_{n=0}^{\infty}a_{2n}=\cos\frac{\omega_{r}t}{2}I \ \ \ \ \ (22)

For odd terms, we have

\displaystyle  a_{2n+1}=\left(-1\right)^{n}i\omega_{r}^{2n}\frac{t^{2n+1}}{\left(2n+1\right)!2^{2n+1}}\left[\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right] \ \ \ \ \ (23)

The series of odd terms comes out to

\displaystyle   \sum_{n=0}^{\infty}a_{2n+1} \displaystyle  = \displaystyle  \frac{i}{\omega_{r}}\left[\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right]\sum_{n=0}^{\infty}\frac{\left(-1\right)^{n}t^{2n+1}\omega_{r}^{2n+1}}{\left(2n+1\right)!2^{2n+1}}\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  \frac{i}{\omega_{r}}\left[\left(\omega_{0}-\omega\right)\sigma_{z}+\gamma B\sigma_{x}\right]\sin\frac{\omega_{r}t}{2} \ \ \ \ \ (25)

We can therefore write out {U} as a matrix by using the Pauli matrices:

\displaystyle  \sigma_{x}=\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right];\quad\sigma_{z}=\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (26)

\displaystyle  U_{r}\left(t\right)=\left[\begin{array}{cc} \cos\frac{\omega_{r}t}{2}+\frac{\omega_{0}-\omega}{\omega_{r}}i\sin\frac{\omega_{r}t}{2} & \frac{i\gamma B}{\omega_{r}}\sin\frac{\omega_{r}t}{2}\\ \frac{i\gamma B}{\omega_{r}}\sin\frac{\omega_{r}t}{2} & \cos\frac{\omega_{r}t}{2}-\frac{\omega_{0}-\omega}{\omega_{r}}i\sin\frac{\omega_{r}t}{2} \end{array}\right] \ \ \ \ \ (27)
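The closed form 27 can be compared with a direct, truncated evaluation of the power series 12. The following is a rough sketch in plain Python; the function names, the sample parameters and the choice of 40 terms are assumptions made for illustration only.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def u_closed(delta, rabi, t):
    """Closed-form U_r(t) of equation 27 (delta = omega_0 - omega, rabi = gamma*B)."""
    omega_r = math.hypot(delta, rabi)
    c = math.cos(omega_r * t / 2)
    s = math.sin(omega_r * t / 2)
    return [[c + 1j * delta / omega_r * s, 1j * rabi / omega_r * s],
            [1j * rabi / omega_r * s, c - 1j * delta / omega_r * s]]


def u_series(delta, rabi, t, terms=40):
    """Truncated power series of exp(i*A*t/2) with A = delta*sigma_z + rabi*sigma_x."""
    x = [[1j * t / 2 * delta, 1j * t / 2 * rabi],
         [1j * t / 2 * rabi, -1j * t / 2 * delta]]
    total = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        # Build x^n / n! from the previous term.
        term = [[entry / n for entry in row] for row in matmul(term, x)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total


def max_difference(a, b):
    """Largest absolute difference between corresponding matrix entries."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Illustrative values only, in arbitrary units.
diff = max_difference(u_closed(0.7, 1.3, 2.0), u_series(0.7, 1.3, 2.0))
print(diff)  # should be at the level of floating-point rounding
```

For these values the series converges quickly, since {\omega_{r}t/2} is of order 1. For much larger times more terms would be needed, which is one reason the closed form is preferable.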

To rotate this back to the lab frame, we apply the inverse rotation operator

\displaystyle   e^{+i\omega tS_{z}/\hbar} \displaystyle  = \displaystyle  e^{i\omega t\sigma_{z}/2}\ \ \ \ \ (28)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{cc} e^{i\omega t/2} & 0\\ 0 & e^{-i\omega t/2} \end{array}\right] \ \ \ \ \ (29)

Consider a particle that starts in the spin up state

\displaystyle  \left|\psi\left(0\right)\right\rangle =\left[\begin{array}{c} 1\\ 0 \end{array}\right] \ \ \ \ \ (30)

Since the spin {z} direction is also the axis of rotation for the rotating frame, this state looks the same in the rotating frame, except for a phase factor that isn’t observable physically:

\displaystyle  \left|\psi_{r}\left(0\right)\right\rangle =e^{-i\omega t/2}\left[\begin{array}{c} 1\\ 0 \end{array}\right] \ \ \ \ \ (31)

The general state at time {t} is

\displaystyle   \left|\psi\left(t\right)\right\rangle \displaystyle  = \displaystyle  e^{i\omega t\sigma_{z}/2}U_{r}\left(t\right)\left|\psi_{r}\left(0\right)\right\rangle \ \ \ \ \ (32)
\displaystyle  \displaystyle  = \displaystyle  e^{-i\omega t/2}\left[\begin{array}{c} \left[\cos\frac{\omega_{r}t}{2}+\frac{\omega_{0}-\omega}{\omega_{r}}i\sin\frac{\omega_{r}t}{2}\right]e^{i\omega t/2}\\ \frac{i\gamma B}{\omega_{r}}\sin\frac{\omega_{r}t}{2}e^{-i\omega t/2} \end{array}\right] \ \ \ \ \ (33)

In the case {\omega=\omega_{0}=\gamma B_{0}}, we have {\omega_{r}=\gamma B} from 11, so the state vector becomes

\displaystyle  \left|\psi\left(t\right)\right\rangle =e^{-i\omega t/2}\left[\begin{array}{c} \cos\frac{\gamma Bt}{2}e^{i\omega t/2}\\ i\sin\frac{\gamma Bt}{2}e^{-i\omega t/2} \end{array}\right] \ \ \ \ \ (34)

If we compare this to the eigenvector {\left|\hat{n}+\right\rangle } for spin up along a general direction given by the spherical angles {\theta} and {\phi}, which is

\displaystyle  \left|\hat{n}+\right\rangle =\left[\begin{array}{c} \cos\frac{\theta}{2}e^{-i\phi/2}\\ \sin\frac{\theta}{2}e^{i\phi/2} \end{array}\right] \ \ \ \ \ (35)

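we see that the state {\left|\psi\left(t\right)\right\rangle } is the spin-up state for spherical angles {\theta=\gamma Bt}, {\phi=\frac{\pi}{2}-\omega t}, up to an overall phase. The extra {i=e^{i\pi/2}} in the sine term is what adds the {\frac{\pi}{2}} to {\phi}, since the ratio of the lower to the upper component is {\tan\frac{\theta}{2}e^{i\phi}}. The probability of finding an up or down state is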

\displaystyle   P_{up} \displaystyle  = \displaystyle  \left|\cos\frac{\gamma Bt}{2}e^{i\omega t/2}\right|^{2}=\cos^{2}\frac{\gamma Bt}{2}\ \ \ \ \ (36)
\displaystyle  P_{down} \displaystyle  = \displaystyle  \left|i\sin\frac{\gamma Bt}{2}e^{-i\omega t/2}\right|^{2}=\sin^{2}\frac{\gamma Bt}{2} \ \ \ \ \ (37)

The spin oscillates between a pure up state, when {\gamma Bt/2} is a multiple of {\pi}, and a pure down state, when {\gamma Bt/2} is an odd multiple of {\frac{\pi}{2}}.
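To make the oscillation concrete, here is a small sketch that tabulates 36 and 37 at a few times. The value of `rabi` (standing for {\gamma B}) is arbitrary and chosen only for illustration, and `flip_time` is my own name for the first time at which the spin is purely down.

```python
import math


def spin_probabilities(rabi, t):
    """P_up and P_down at resonance (omega = omega_0), with rabi = gamma*B."""
    p_up = math.cos(rabi * t / 2) ** 2
    p_down = math.sin(rabi * t / 2) ** 2
    return p_up, p_down


rabi = 2.0                   # illustrative gamma*B, arbitrary units
flip_time = math.pi / rabi   # gamma*B*t/2 = pi/2: first pure spin-down time

for t in (0.0, flip_time / 2, flip_time, 2 * flip_time):
    p_up, p_down = spin_probabilities(rabi, t)
    print(f"t = {t:.4f}: P_up = {p_up:.4f}, P_down = {p_down:.4f}")
```

Note that the time to flip the spin depends only on {\gamma B}, not on {B_{0}}, so at resonance even a weak rotating field eventually flips the spin completely.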

Finally, we can check that {\left\langle \mu_{z}\left(t\right)\right\rangle } agrees with the classical result

\displaystyle  \mu_{z}\left(t\right)=\mu_{z}\left(0\right)\left[\frac{\left(\omega_{0}-\omega\right)^{2}}{\gamma^{2}B^{2}+\left(\omega_{0}-\omega\right)^{2}}+\frac{\gamma^{2}B^{2}\cos\omega_{r}t}{\gamma^{2}B^{2}+\left(\omega_{0}-\omega\right)^{2}}\right] \ \ \ \ \ (38)

To find {\left\langle \mu_{z}\left(t\right)\right\rangle } we evaluate as follows.

\displaystyle   \left\langle \mu_{z}\left(t\right)\right\rangle \displaystyle  = \displaystyle  \left\langle \psi\left(t\right)\left|\mu_{z}\right|\psi\left(t\right)\right\rangle \ \ \ \ \ (39)
\displaystyle  \displaystyle  = \displaystyle  \gamma\left\langle \psi\left(t\right)\left|S_{z}\right|\psi\left(t\right)\right\rangle \ \ \ \ \ (40)
\displaystyle  \displaystyle  = \displaystyle  \frac{\gamma\hbar}{2}\left\langle \psi\left(t\right)\left|\sigma_{z}\right|\psi\left(t\right)\right\rangle \ \ \ \ \ (41)
\displaystyle  \displaystyle  = \displaystyle  \frac{\gamma\hbar}{2}\left[\begin{array}{cc} \left(\cos\frac{\omega_{r}t}{2}-\frac{\omega_{0}-\omega}{\omega_{r}}i\sin\frac{\omega_{r}t}{2}\right)e^{-i\omega t/2} & -\frac{i\gamma B}{\omega_{r}}\sin\frac{\omega_{r}t}{2}e^{i\omega t/2}\end{array}\right]\times\ \ \ \ \ (42)
\displaystyle  \displaystyle  \displaystyle  \left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]\left[\begin{array}{c} \left(\cos\frac{\omega_{r}t}{2}+\frac{\omega_{0}-\omega}{\omega_{r}}i\sin\frac{\omega_{r}t}{2}\right)e^{i\omega t/2}\\ \frac{i\gamma B}{\omega_{r}}\sin\frac{\omega_{r}t}{2}e^{-i\omega t/2} \end{array}\right] \ \ \ \ \ (43)

We introduce shorthand for the trig functions:

\displaystyle   c \displaystyle  \equiv \displaystyle  \cos\frac{\omega_{r}t}{2}\ \ \ \ \ (44)
\displaystyle  s \displaystyle  \equiv \displaystyle  \sin\frac{\omega_{r}t}{2} \ \ \ \ \ (45)

Then we have (note that complex exponentials cancel out):

\displaystyle   \frac{2}{\gamma\hbar}\left\langle \mu_{z}\left(t\right)\right\rangle \displaystyle  = \displaystyle  \left[\begin{array}{cc} \left[c-\frac{\omega_{0}-\omega}{\omega_{r}}is\right]e^{-i\omega t/2} & -\frac{i\gamma B}{\omega_{r}}se^{i\omega t/2}\end{array}\right]\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]\left[\begin{array}{c} \left[c+\frac{\omega_{0}-\omega}{\omega_{r}}is\right]e^{i\omega t/2}\\ \frac{i\gamma B}{\omega_{r}}se^{-i\omega t/2} \end{array}\right]\ \ \ \ \ (46)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{cc} c-\frac{\omega_{0}-\omega}{\omega_{r}}is & -\frac{i\gamma B}{\omega_{r}}s\end{array}\right]\left[\begin{array}{c} c+\frac{\omega_{0}-\omega}{\omega_{r}}is\\ -\frac{i\gamma B}{\omega_{r}}s \end{array}\right]\ \ \ \ \ (47)
\displaystyle  \displaystyle  = \displaystyle  c^{2}+\left(\frac{\omega_{0}-\omega}{\omega_{r}}\right)^{2}s^{2}-\left(\frac{\gamma B}{\omega_{r}}\right)^{2}s^{2}\ \ \ \ \ (48)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{\omega_{r}^{2}}\left[\omega_{r}^{2}c^{2}+\left(\left(\omega_{0}-\omega\right)^{2}-\gamma^{2}B^{2}\right)s^{2}\right]\ \ \ \ \ (49)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{\omega_{r}^{2}}\left[\left(\left(\omega_{0}-\omega\right)^{2}+\gamma^{2}B^{2}\right)c^{2}+\left(\left(\omega_{0}-\omega\right)^{2}-\gamma^{2}B^{2}\right)s^{2}\right]\ \ \ \ \ (50)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{\omega_{r}^{2}}\left(\left(\omega_{0}-\omega\right)^{2}+\gamma^{2}B^{2}\left(c^{2}-s^{2}\right)\right) \ \ \ \ \ (51)

where we used 11 to substitute for {\omega_{r}^{2}} in 50.

Using the trig identity

\displaystyle  \cos2\theta=\cos^{2}\theta-\sin^{2}\theta \ \ \ \ \ (52)

we see that

\displaystyle  c^{2}-s^{2}=\cos^{2}\frac{\omega_{r}t}{2}-\sin^{2}\frac{\omega_{r}t}{2}=\cos\omega_{r}t \ \ \ \ \ (53)

So we have

\displaystyle   \left\langle \mu_{z}\left(t\right)\right\rangle \displaystyle  = \displaystyle  \frac{\gamma\hbar}{2}\frac{\left(\omega_{0}-\omega\right)^{2}+\gamma^{2}B^{2}\cos\omega_{r}t}{\omega_{r}^{2}}\ \ \ \ \ (54)
\displaystyle  \displaystyle  = \displaystyle  \frac{\gamma\hbar}{2}\frac{\left(\omega_{0}-\omega\right)^{2}+\gamma^{2}B^{2}\cos\omega_{r}t}{\left(\omega_{0}-\omega\right)^{2}+\gamma^{2}B^{2}} \ \ \ \ \ (55)

This agrees with 38 provided {\mu_{z}\left(0\right)=\frac{\gamma\hbar}{2}}. This is indeed the case, since the magnitude of the magnetic moment is {\frac{\gamma\hbar}{2}} and the particle starts in the spin up state.
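As a numerical cross-check of 55, the sketch below builds the two components of the state 33 (dropping the unobservable overall phase), computes {\left\langle \sigma_{z}\right\rangle } directly, and compares it with the bracket in 55. The function names and the sample parameters are my own and are only meant as an illustration.

```python
import cmath
import math


def lab_state(delta, rabi, omega, t):
    """Components of state 33 without the overall phase.

    delta = omega_0 - omega, rabi = gamma*B (names chosen for this sketch).
    """
    omega_r = math.hypot(delta, rabi)
    c = math.cos(omega_r * t / 2)
    s = math.sin(omega_r * t / 2)
    up = (c + 1j * delta / omega_r * s) * cmath.exp(1j * omega * t / 2)
    down = 1j * rabi / omega_r * s * cmath.exp(-1j * omega * t / 2)
    return up, down


def sigma_z_from_state(delta, rabi, omega, t):
    """<sigma_z> = |up|^2 - |down|^2, i.e. <mu_z> divided by gamma*hbar/2."""
    up, down = lab_state(delta, rabi, omega, t)
    return abs(up) ** 2 - abs(down) ** 2


def sigma_z_closed(delta, rabi, t):
    """The fraction in equation 55."""
    omega_r_sq = delta ** 2 + rabi ** 2
    return (delta ** 2 + rabi ** 2 * math.cos(math.sqrt(omega_r_sq) * t)) / omega_r_sq


# Illustrative values only, in arbitrary units.
for t in (0.0, 0.4, 1.1, 2.5):
    direct = sigma_z_from_state(0.8, 1.5, 3.0, t)
    closed = sigma_z_closed(0.8, 1.5, t)
    print(f"t = {t}: direct = {direct:.12f}, closed form = {closed:.12f}")
```

The field frequency {\omega} enters `lab_state` only through phases, which drop out of the absolute values, so the result depends on {\omega} only through {\omega_{0}-\omega}.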

Spin in a precessing magnetic field – part 1

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.4.3, Part 1.

Classically, if a magnetic moment {\boldsymbol{\mu}} is placed in a magnetic field that precesses about the {z} axis, the magnetic moment itself precesses. If the field is given as

\displaystyle \mathbf{B}=B\cos\omega t\hat{\mathbf{x}}-B\sin\omega t\hat{\mathbf{y}}+B_{0}\hat{\mathbf{z}} \ \ \ \ \ (1)

 

then in a frame that rotates with the same frequency as the field, the magnetic field appears to be constant with value

\displaystyle \mathbf{B}_{r}=B\hat{\mathbf{x}}_{r}+\left(B_{0}-\frac{\omega}{\gamma}\right)\hat{\mathbf{z}} \ \ \ \ \ (2)

 

where

\displaystyle \hat{\mathbf{x}}_{r}=\cos\omega t\hat{\mathbf{x}}-\sin\omega t\hat{\mathbf{y}} \ \ \ \ \ (3)

is a unit vector along the {x} axis in the rotating frame. We now want to see how this result transfers into quantum mechanics.

We begin with the Schrödinger equation for the state {\left|\psi\left(t\right)\right\rangle } in the lab (non-rotating) frame, which is, as usual

\displaystyle i\hbar\frac{\partial}{\partial t}\left|\psi\left(t\right)\right\rangle =H\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (4)

We’ll study the case where {\left|\psi\left(t\right)\right\rangle } is a spin {\frac{1}{2}} state, for which the Hamiltonian is

\displaystyle H=-\gamma\mathbf{S}\cdot\mathbf{B} \ \ \ \ \ (5)

We can analyze the situation in the rotating frame by applying a unitary rotation operator to the lab state. That is

\displaystyle \left|\psi_{r}\left(t\right)\right\rangle \displaystyle = \displaystyle e^{-i\omega tS_{z}/\hbar}\left|\psi\left(t\right)\right\rangle =e^{-i\omega t\sigma_{z}/2}\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (6)
\displaystyle \displaystyle = \displaystyle \left[\cos\frac{\omega t}{2}I-i\sin\frac{\omega t}{2}\sigma_{z}\right]\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (7)

[It seems to me that this unitary operator is for a rotation by an angle {\omega t}, and since the rotation of the field in 1 is given by a frequency {-\omega\hat{\mathbf{z}}}, we should really be using the rotation operator {e^{i\omega t\sigma_{z}/2}}. However if we do this (I tried) we get the wrong answer, so presumably the transformation 6 is correct.]

Our first goal is to find the Schrödinger equation for {\left|\psi_{r}\left(t\right)\right\rangle }, which involves finding the corresponding Hamiltonian. The Schrödinger equation is

\displaystyle i\hbar\frac{\partial}{\partial t}\left|\psi_{r}\left(t\right)\right\rangle =H_{r}\left|\psi_{r}\left(t\right)\right\rangle \ \ \ \ \ (8)

 

Inserting 6 into the LHS and differentiating, we get

\displaystyle i\hbar\frac{\partial}{\partial t}\left|\psi_{r}\left(t\right)\right\rangle \displaystyle = \displaystyle \frac{\hbar\omega\sigma_{z}}{2}e^{-i\omega t\sigma_{z}/2}\left|\psi\left(t\right)\right\rangle +i\hbar e^{-i\omega t\sigma_{z}/2}\frac{\partial}{\partial t}\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (9)
\displaystyle \displaystyle = \displaystyle \frac{\hbar\omega\sigma_{z}}{2}\left|\psi_{r}\left(t\right)\right\rangle +e^{-i\omega t\sigma_{z}/2}H\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (10)
\displaystyle \displaystyle = \displaystyle \frac{\hbar\omega\sigma_{z}}{2}\left|\psi_{r}\left(t\right)\right\rangle -e^{-i\omega t\sigma_{z}/2}\gamma\mathbf{S}\cdot\mathbf{B}\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (11)

We would like the RHS to be in the form of the RHS of 8, but in the second term, the problem is that {e^{-i\omega t\sigma_{z}/2}} does not commute with {\mathbf{S}} so we can’t just swap the {e^{-i\omega t\sigma_{z}/2}} and {\mathbf{S}\cdot\mathbf{B}} factors. We need to multiply out the terms and see what simplifications we can do.

In what follows, it’s easier to work with the Pauli matrices defined by

\displaystyle \mathbf{S}=\frac{\hbar}{2}\boldsymbol{\sigma} \ \ \ \ \ (12)

We’ll also need a few theorems involving {\sigma_{i}}

\displaystyle \sigma_{i}\sigma_{j} \displaystyle = \displaystyle -\sigma_{j}\sigma_{i}\quad\left(i\ne j\right)\ \ \ \ \ (13)
\displaystyle \sigma_{i}\sigma_{j} \displaystyle = \displaystyle \delta_{ij}I+i\sum_{k}\varepsilon_{ijk}\sigma_{k} \ \ \ \ \ (14)
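Both identities are easy to confirm by brute force over all index pairs. Here is a minimal sketch in plain Python; the helper names, and the use of indices 0, 1, 2 for {x}, {y}, {z}, are my own conventions for this example.

```python
# Pauli matrices indexed 0, 1, 2 for x, y, z.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices drawn from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def identity_rhs(i, j):
    """Right-hand side of 14: delta_ij * I + i * sum_k eps_ijk * sigma_k."""
    rhs = [[(1 if i == j and r == c else 0) for c in range(2)] for r in range(2)]
    for k in range(3):
        eps = levi_civita(i, j, k)
        for r in range(2):
            for c in range(2):
                rhs[r][c] += 1j * eps * SIGMA[k][r][c]
    return rhs


def negate(m):
    """Multiply every entry of a 2x2 matrix by -1."""
    return [[-entry for entry in row] for row in m]


product_rule_holds = all(
    matmul(SIGMA[i], SIGMA[j]) == identity_rhs(i, j)
    for i in range(3) for j in range(3)
)
anticommute = all(
    matmul(SIGMA[i], SIGMA[j]) == negate(matmul(SIGMA[j], SIGMA[i]))
    for i in range(3) for j in range(3) if i != j
)
print(product_rule_holds, anticommute)  # True True
```

The comparisons are exact here because every entry is built from 0, 1 and {i}, so no rounding is involved.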

We’ll also define some shorthand for the trig functions:

\displaystyle c \displaystyle \equiv \displaystyle \cos\frac{\omega t}{2}\ \ \ \ \ (15)
\displaystyle s \displaystyle \equiv \displaystyle \sin\frac{\omega t}{2}\ \ \ \ \ (16)
\displaystyle c_{1} \displaystyle \equiv \displaystyle \cos\omega t\ \ \ \ \ (17)
\displaystyle s_{1} \displaystyle \equiv \displaystyle \sin\omega t \ \ \ \ \ (18)

The standard double-angle formulas are

\displaystyle c_{1} \displaystyle = \displaystyle c^{2}-s^{2}\ \ \ \ \ (19)
\displaystyle s_{1} \displaystyle = \displaystyle 2sc \ \ \ \ \ (20)

Using 1 and 7 we have

\displaystyle -e^{-i\omega t\sigma_{z}/2}\gamma\mathbf{S}\cdot\mathbf{B} \displaystyle = \displaystyle -\frac{\gamma\hbar}{2}\left[B\left(c-is\sigma_{z}\right)\left(\sigma_{x}c_{1}-\sigma_{y}s_{1}\right)+B_{0}\left(c-is\sigma_{z}\right)\sigma_{z}\right] \ \ \ \ \ (21)

The last term on the RHS is in the correct form since there are no commutation problems here. So we need to work on the first term, which we’ll isolate here:

\displaystyle \left(c-is\sigma_{z}\right)\left(\sigma_{x}c_{1}-\sigma_{y}s_{1}\right)=c_{1}c\sigma_{x}+ic_{1}s\sigma_{x}\sigma_{z}-s_{1}c\sigma_{y}-is_{1}s\sigma_{y}\sigma_{z} \ \ \ \ \ (22)

We can now use the identities 13 and 14 and the trig identities above to get

\displaystyle \left(c-is\sigma_{z}\right)\left(\sigma_{x}c_{1}-\sigma_{y}s_{1}\right) \displaystyle = \displaystyle \left(c^{2}-s^{2}\right)c\sigma_{x}+i\left(c^{2}-s^{2}\right)s\sigma_{x}\sigma_{z}-2sc^{2}\sigma_{y}-2is^{2}c\sigma_{y}\sigma_{z}\ \ \ \ \ (23)
\displaystyle \displaystyle = \displaystyle \left(c^{2}-s^{2}\right)c\sigma_{x}+i\left(c^{2}-s^{2}\right)s\sigma_{x}\sigma_{z}-2isc^{2}\sigma_{x}\sigma_{z}+2s^{2}c\sigma_{x}\ \ \ \ \ (24)
\displaystyle \displaystyle = \displaystyle \left(c^{3}-s^{2}c+2s^{2}c\right)\sigma_{x}+i\left(-s^{3}+c^{2}s-2sc^{2}\right)\sigma_{x}\sigma_{z}\ \ \ \ \ (25)
\displaystyle \displaystyle = \displaystyle \left(c^{2}+s^{2}\right)c\sigma_{x}-i\left(c^{2}+s^{2}\right)s\sigma_{x}\sigma_{z}\ \ \ \ \ (26)
\displaystyle \displaystyle = \displaystyle \sigma_{x}\left(c-is\sigma_{z}\right)\ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle \sigma_{x}e^{-i\omega t\sigma_{z}/2} \ \ \ \ \ (28)
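Since the algebra from 22 to 28 is a bit fiddly, a numerical check is reassuring. The sketch below evaluates both sides of the final result, {\left(c-is\sigma_{z}\right)\left(\sigma_{x}c_{1}-\sigma_{y}s_{1}\right)=\sigma_{x}e^{-i\omega t\sigma_{z}/2}}, for a range of values of {\omega t}, using 7 for the exponential. The helper names are my own.

```python
import math

IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(coeff_a, a, coeff_b, b):
    """Linear combination coeff_a*a + coeff_b*b of two 2x2 matrices."""
    return [[coeff_a * a[i][j] + coeff_b * b[i][j] for j in range(2)]
            for i in range(2)]


def both_sides(omega_t):
    """Return the left and right sides of the identity for a given omega*t."""
    c, s = math.cos(omega_t / 2), math.sin(omega_t / 2)
    c1, s1 = math.cos(omega_t), math.sin(omega_t)
    rotation = combine(c, IDENTITY, -1j * s, SIGMA_Z)   # exp(-i*omega*t*sigma_z/2)
    transverse = combine(c1, SIGMA_X, -s1, SIGMA_Y)     # sigma_x*c1 - sigma_y*s1
    return matmul(rotation, transverse), matmul(SIGMA_X, rotation)


def max_difference(a, b):
    """Largest absolute difference between corresponding matrix entries."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


worst = max(max_difference(*both_sides(0.1 * n)) for n in range(100))
print(worst)  # should be at the level of floating-point rounding
```

In effect, moving the rotation operator through the transverse part of the field is what removes the time dependence of {\mathbf{B}}.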

Plugging this into 21 and then back into 11 we get

\displaystyle i\hbar\frac{\partial}{\partial t}\left|\psi_{r}\left(t\right)\right\rangle \displaystyle = \displaystyle \frac{\hbar\omega\sigma_{z}}{2}\left|\psi_{r}\left(t\right)\right\rangle -\frac{\gamma\hbar}{2}\left[B\sigma_{x}+B_{0}\sigma_{z}\right]e^{-i\omega t\sigma_{z}/2}\left|\psi\left(t\right)\right\rangle \ \ \ \ \ (29)
\displaystyle \displaystyle = \displaystyle \left[\left(\omega-\gamma B_{0}\right)S_{z}-\gamma BS_{x}\right]\left|\psi_{r}\left(t\right)\right\rangle \ \ \ \ \ (30)

Comparing this with 2, we see that we can write the result as

\displaystyle i\hbar\frac{\partial}{\partial t}\left|\psi_{r}\left(t\right)\right\rangle =-\gamma\mathbf{S}\cdot\mathbf{B}_{r}\left|\psi_{r}\left(t\right)\right\rangle \ \ \ \ \ (31)

Thus in the rotating frame, the Schrödinger equation has the same form as the classical relation, with a time-independent magnetic field {\mathbf{B}_{r}}.

Magnetic moment in precessing magnetic field

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.4.2.

In classical electromagnetism, a magnetic moment precesses if placed in a constant magnetic field whose direction is not parallel to that of the magnetic moment. For a magnetic moment {\boldsymbol{\mu}} in a constant field {\mathbf{B}_{0}}, the precession has a frequency of

\displaystyle \boldsymbol{\omega}_{0}=-\gamma\mathbf{B}_{0} \ \ \ \ \ (1)

 

where {\gamma} is the gyromagnetic ratio.

Now suppose we view this precession in a frame of reference that is rotating about the same axis as {\boldsymbol{\omega}_{0}}, but with a frequency {\boldsymbol{\omega}} that may not be the same as {\boldsymbol{\omega}_{0}}. The precession frequency will now appear to be

\displaystyle \boldsymbol{\omega}_{r}=\boldsymbol{\omega}_{0}-\boldsymbol{\omega} \ \ \ \ \ (2)

[Although this is a vector equation, all vectors in it have the same direction.] Comparing this with 1, we see that, in the rotating frame, the effective magnetic field is

\displaystyle \mathbf{B}_{r}=-\frac{1}{\gamma}\boldsymbol{\omega}_{r}=\mathbf{B}_{0}+\frac{\boldsymbol{\omega}}{\gamma} \ \ \ \ \ (3)

Now suppose the magnetic field is taken to be constant in the {z} direction with component {B_{0}\hat{\mathbf{z}}}, but with a small oscillating component in the {xy} plane, so that the total field is

\displaystyle \mathbf{B}=B\cos\omega t\hat{\mathbf{x}}-B\sin\omega t\hat{\mathbf{y}}+B_{0}\hat{\mathbf{z}} \ \ \ \ \ (4)

where {B\ll B_{0}}.

This is a magnetic field that precesses about the {z} axis, so it’s similar to the case we treated earlier, although in the earlier post we were concerned only with the behaviour of an electron in such a field, so we were interested in the quantum mechanics. The present treatment is purely classical.

If we place a magnetic moment in this field so that at {t=0} it’s pointing in the {+z} direction, we want to find how the magnetic moment varies with time. To analyze the problem, it’s easiest to transform to a rotating frame with frequency {\boldsymbol{\omega}=-\omega\hat{\mathbf{z}}} (minus, because it’s precessing in a clockwise direction). Since the frame is rotating at the same rate as the magnetic field, the field appears frozen in this rotating frame. For simplicity, we’ll assume that the field’s horizontal component lies along the {+x} direction, so the field lies in the {xz} plane. The {z} component of the field is thus effectively reduced to

\displaystyle B_{z}=B_{0}-\frac{\omega}{\gamma} \ \ \ \ \ (5)

In this frame, we therefore have a constant magnetic field given by

\displaystyle \mathbf{B}_{r}=B\hat{\mathbf{x}}+\left(B_{0}-\frac{\omega}{\gamma}\right)\hat{\mathbf{z}} \ \ \ \ \ (6)

 

The magnetic moment should then precess about {\mathbf{B}_{r}}. To get the frequency {\boldsymbol{\omega}_{r}} of this precession, we get the magnitude of the magnetic field:

\displaystyle B_{r}=\sqrt{B^{2}+\left(B_{0}-\frac{\omega}{\gamma}\right)^{2}} \ \ \ \ \ (7)

The precession frequency is then

\displaystyle \boldsymbol{\omega}_{r}=-\gamma\mathbf{B}_{r} \ \ \ \ \ (8)

Refer to the following figure (similar to Shankar’s Fig. 14.3, but with a few added points) for what follows.

In the figure {\boldsymbol{\mu}\left(0\right)} is given by the vector {OE}, so it starts off pointing in the {+z} direction. [Just as in Shankar’s figure, we’ve drawn this vector so it’s not quite parallel to the {z} axis, although in the problem {\boldsymbol{\mu}\left(0\right)} does actually point directly along the {z} axis. Drawing it this way makes the figure a bit easier to follow.] To get the {z} component of {\boldsymbol{\mu}} as it precesses about {\mathbf{B}_{r}}, suppose we look at {\boldsymbol{\mu}} at time {t}, when it has precessed through an angle {\omega_{r}t}, where {\omega_{r}=\gamma B_{r}} is the magnitude of the precession frequency 8, so {\boldsymbol{\mu}} now lies along the vector {OD}. (I haven’t drawn the vector in the diagram since it would get too cluttered, but you can imagine the vector.) To get the {z} component of this vector, we look at its components parallel and perpendicular to the plane followed by the tip of {\boldsymbol{\mu}} as it precesses. This is the plane occupied by the circle in the diagram (well, ok, in the diagram it’s an ellipse because we’re looking at the circle from an angle). If the angle between {\boldsymbol{\mu}} and {\mathbf{B}_{r}} is {\alpha}, then the components of {\boldsymbol{\mu}\left(t\right)} are

\displaystyle AD \displaystyle = \displaystyle \mu\sin\alpha\ \ \ \ \ (9)
\displaystyle OA \displaystyle = \displaystyle \mu\cos\alpha \ \ \ \ \ (10)

Note that the magnitude of {\boldsymbol{\mu}} is constant; only its direction changes by precession. The angle {\alpha} between {\boldsymbol{\mu}} and {\mathbf{B}_{r}} is also constant.

To get the projections of these two segments onto the {z} axis, we look first at the projection of {OA} since {OA} always lies in the {xz} plane. From the diagram

\displaystyle OA_{z}=\left(\mu\cos\alpha\right)\cos\alpha=\mu\cos^{2}\alpha \ \ \ \ \ (11)

 

To get the {z} projection of {AD}, we first project it onto the {xz} plane by projecting it onto {AE}, giving the segment {AC}. The angle between {AD} and {AE} is the angle through which {\boldsymbol{\mu}} has precessed about {\mathbf{B}_{r}}, which is {\omega_{r}t} with {\omega_{r}=\gamma B_{r}} from 8:

\displaystyle AC \displaystyle = \displaystyle AD\cos\omega_{r}t\ \ \ \ \ (12)
\displaystyle \displaystyle = \displaystyle \mu\sin\alpha\cos\omega_{r}t \ \ \ \ \ (13)

We then project {AC} onto the {z} axis. The line {AC} makes an angle {\alpha} with the {x} axis, so the projection introduces another factor of {\sin\alpha}:

\displaystyle AC_{z}=AC\sin\alpha=\mu\sin^{2}\alpha\cos\omega_{r}t \ \ \ \ \ (14)

The {z} component of {\boldsymbol{\mu}} is therefore the sum of 11 and 14:

\displaystyle \mu_{z}=\mu\cos^{2}\alpha+\mu\sin^{2}\alpha\cos\omega_{r}t \ \ \ \ \ (15)
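The geometric argument leading to 15 can be checked without the figure by rotating the vector explicitly. The sketch below uses Rodrigues’ rotation formula (my choice of method, not Shankar’s) to rotate a unit vector along {z} about an axis tilted by {\alpha} in the {xz} plane, treating the precession angle as a free parameter `angle`, and compares the resulting {z} component with {\cos^{2}\alpha+\sin^{2}\alpha\cos\left(\mathrm{angle}\right)}.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def rotate(v, axis, angle):
    """Rodrigues' formula: rotate v about the unit vector axis by angle."""
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return [v[i] * math.cos(angle)
            + k_cross_v[i] * math.sin(angle)
            + axis[i] * k_dot_v * (1 - math.cos(angle))
            for i in range(3)]


def mu_z_by_rotation(alpha, angle):
    """z component of a unit moment, initially along z, after precessing by angle."""
    axis = [math.sin(alpha), 0.0, math.cos(alpha)]  # direction of B_r in the xz plane
    return rotate([0.0, 0.0, 1.0], axis, angle)[2]


def mu_z_geometric(alpha, angle):
    """The geometric result: cos^2(alpha) + sin^2(alpha) * cos(angle)."""
    return math.cos(alpha) ** 2 + math.sin(alpha) ** 2 * math.cos(angle)


# Illustrative tilt and precession angles, in radians.
for alpha, angle in ((0.3, 0.0), (0.3, 1.0), (1.2, 2.5), (0.8, math.pi)):
    print(mu_z_by_rotation(alpha, angle), mu_z_geometric(alpha, angle))
```

The sense of rotation (clockwise or counterclockwise about {\mathbf{B}_{r}}) doesn’t affect the {z} component, which is why the sign in the precession frequency never shows up in the final result.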

 

To get the final form, we need to eliminate {\alpha} which we can do from 6, since {\alpha} is the angle between {\mathbf{B}_{r}} and the {z} axis. Therefore

\displaystyle \sin\alpha \displaystyle = \displaystyle \frac{B}{B_{r}}\ \ \ \ \ (16)
\displaystyle \displaystyle = \displaystyle \frac{B}{\sqrt{B^{2}+\left(B_{0}-\frac{\omega}{\gamma}\right)^{2}}}\ \ \ \ \ (17)
\displaystyle \displaystyle = \displaystyle \frac{\gamma B}{\sqrt{\gamma^{2}B^{2}+\left(\gamma B_{0}-\omega\right)^{2}}}\ \ \ \ \ (18)
\displaystyle \cos\alpha \displaystyle = \displaystyle \frac{B_{0}-\frac{\omega}{\gamma}}{\sqrt{B^{2}+\left(B_{0}-\frac{\omega}{\gamma}\right)^{2}}}\ \ \ \ \ (19)
\displaystyle \displaystyle = \displaystyle \frac{\gamma B_{0}-\omega}{\sqrt{\gamma^{2}B^{2}+\left(\gamma B_{0}-\omega\right)^{2}}} \ \ \ \ \ (20)

We can write this in terms of the frequency {\omega_{0}} by which the magnetic moment would precess if the field were constant, which is

\displaystyle \omega_{0}=\left|\boldsymbol{\omega}_{0}\right|=\gamma B_{0} \ \ \ \ \ (21)

So we get

\displaystyle \sin\alpha \displaystyle = \displaystyle \frac{\gamma B}{\sqrt{\gamma^{2}B^{2}+\left(\omega_{0}-\omega\right)^{2}}}\ \ \ \ \ (22)
\displaystyle \cos\alpha \displaystyle = \displaystyle \frac{\omega_{0}-\omega}{\sqrt{\gamma^{2}B^{2}+\left(\omega_{0}-\omega\right)^{2}}} \ \ \ \ \ (23)

Plugging this into 15 we get

\displaystyle \mu_{z}\left(t\right)=\mu_{z}\left(0\right)\left[\frac{\left(\omega_{0}-\omega\right)^{2}}{\gamma^{2}B^{2}+\left(\omega_{0}-\omega\right)^{2}}+\frac{\gamma^{2}B^{2}\cos\omega_{r}t}{\gamma^{2}B^{2}+\left(\omega_{0}-\omega\right)^{2}}\right] \ \ \ \ \ (24)

Precession of angular momentum in a magnetic field

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.4.1.

In classical electrodynamics, the torque on a magnetic moment {\boldsymbol{\mu}} in a constant magnetic field {\mathbf{B}} is given by (using Shankar’s notation):

\displaystyle \mathbf{T}=\boldsymbol{\mu}\times\mathbf{B} \ \ \ \ \ (1)

 

We can relate the magnetic moment to the angular momentum of a (classically) spinning charged object by introducing the gyromagnetic ratio

\displaystyle \gamma\equiv\frac{\mu}{l} \ \ \ \ \ (2)

If we apply this to a single particle of charge {q} and mass {m} travelling at constant speed {v} around a circular orbit, then its angular momentum is

\displaystyle l=mvr \ \ \ \ \ (3)

The magnetic moment can be calculated by taking the charge {q} to be smeared out over the circumference of the circle, giving a linear charge density of

\displaystyle \lambda=\frac{q}{2\pi r} \ \ \ \ \ (4)

Since the loop is spinning with speed {v}, the current (rate at which charge passes a fixed point on the circle) is

\displaystyle I=\lambda v=\frac{q}{2\pi r}r\omega=\frac{q}{2\pi}\omega \ \ \ \ \ (5)

where

\displaystyle \omega=\frac{2\pi}{P}=\frac{2\pi v}{2\pi r}=\frac{v}{r} \ \ \ \ \ (6)

is the angular frequency ({P} is the period, or time it takes for one complete orbit).

The magnetic moment is defined as

\displaystyle \boldsymbol{\mu}\equiv\frac{I}{c}\mathbf{a} \ \ \ \ \ (7)

where {\mathbf{a}} is the area of the loop, whose direction is determined by using the right-hand rule on the direction of the current around the loop. Thus if the current is travelling counterclockwise when viewed from above, {\mathbf{a}} points upwards. (The speed of light {c} enters because Shankar is using CGS units.) The magnetic moment here is then

\displaystyle \boldsymbol{\mu} \displaystyle = \displaystyle \frac{qv}{2\pi r}\frac{\pi r^{2}}{c}\hat{\mathbf{a}}\ \ \ \ \ (8)
\displaystyle \displaystyle = \displaystyle \left(\frac{q}{2mc}\right)\left(mvr\hat{\mathbf{a}}\right)\ \ \ \ \ (9)
\displaystyle \displaystyle = \displaystyle \left(\frac{q}{2mc}\right)\mathbf{l} \ \ \ \ \ (10)

where {\mathbf{l}} is the angular momentum vector. In this case, the gyromagnetic ratio is

\displaystyle \gamma=\frac{q}{2mc} \ \ \ \ \ (11)

The torque 1 then becomes

\displaystyle \mathbf{T}=\gamma\mathbf{l}\times\mathbf{B} \ \ \ \ \ (12)

The interaction energy (between the angular momentum and magnetic field) is given by

\displaystyle H_{int}=\int T\left(\theta\right)d\theta \ \ \ \ \ (13)

where the torque is given as a function of the angle between {\boldsymbol{\mu}} and {\mathbf{B}} in 1, so that

\displaystyle T=\mu B\sin\theta \ \ \ \ \ (14)

Doing the integral (neglecting the constant of integration) we have

\displaystyle H_{int}=-\mu B\cos\theta=-\boldsymbol{\mu}\cdot\mathbf{B} \ \ \ \ \ (15)

 

{H_{int}} is minimized when {\boldsymbol{\mu}} and {\mathbf{B}} are parallel, so the torque’s effect is to try to bring these two vectors into alignment. This assumes that the magnetic moment doesn’t actually involve any angular momentum, which obviously isn’t the case with our rotating loop example above. In that case, the torque causes a precession about the direction of {\mathbf{B}}, which we can see as follows.

The angular version of Newton’s law, relating torque and angular momentum, is

\displaystyle \mathbf{T}=\frac{d\mathbf{l}}{dt}=\gamma\mathbf{l}\times\mathbf{B} \ \ \ \ \ (16)

Since the cross product is perpendicular to both its constituent vectors, the change in {\mathbf{l}} is always perpendicular to {\mathbf{l}} itself. The effect can be seen by looking at Shankar’s Figure 14.2 (too much effort to reproduce that here), in which we can see that

\displaystyle \Delta\mathbf{l} \displaystyle = \displaystyle \gamma\left(\mathbf{l}\times\mathbf{B}\right)\Delta t\ \ \ \ \ (17)
\displaystyle \Delta l \displaystyle = \displaystyle \gamma lB\sin\theta\Delta t \ \ \ \ \ (18)

where {\theta} is the angle between {\mathbf{l}} and {\mathbf{B}}, and {\Delta\mathbf{l}} is tangent to the circle of radius {l\sin\theta} that lies in the plane perpendicular to {\mathbf{B}}. The net effect is that {\mathbf{l}} precesses about the direction of {\mathbf{B}}, so that the magnitude of angular momentum remains constant, but its direction changes at a constant rate. The change in azimuthal angle {\Delta\phi} in time {\Delta t} is

\displaystyle \Delta\phi=\frac{-\Delta l}{l\sin\theta}=-\gamma B\Delta t \ \ \ \ \ (19)

where the minus sign is because the angular momentum precesses clockwise (as seen from above) around {\mathbf{B}}. The angular frequency of precession is therefore

\displaystyle \omega_{0}=\frac{\Delta\phi}{\Delta t}=-\gamma B \ \ \ \ \ (20)

If we include the direction of the axis of precession, which is parallel to {\mathbf{B}}, then

\displaystyle \boldsymbol{\omega}_{0}=-\gamma\mathbf{B} \ \ \ \ \ (21)
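As a rough numerical check of 19 to 21, the sketch below integrates 16 for a field along {z} with a fourth-order Runge-Kutta step and compares the resulting azimuthal angle with {-\gamma Bt}. The function names and the values of {\gamma}, {B} and {\mathbf{l}} are arbitrary choices made for illustration; they are not taken from Shankar.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def precess(l0, B, gamma, t_total, steps):
    """Integrate dl/dt = gamma * (l x B) with classical RK4 steps."""
    def rate(l):
        return tuple(gamma * c for c in cross(l, B))

    l = tuple(l0)
    dt = t_total / steps
    for _ in range(steps):
        k1 = rate(l)
        k2 = rate(tuple(l[i] + 0.5 * dt * k1[i] for i in range(3)))
        k3 = rate(tuple(l[i] + 0.5 * dt * k2[i] for i in range(3)))
        k4 = rate(tuple(l[i] + dt * k3[i] for i in range(3)))
        l = tuple(l[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                  for i in range(3))
    return l

# Illustrative values only (assumed, in arbitrary units)
gamma, B0, t = 2.0, 1.5, 0.4
l0 = (1.0, 0.0, 1.0)
l1 = precess(l0, (0.0, 0.0, B0), gamma, t, 1000)

# Change in azimuthal angle; equation 19 predicts -gamma * B0 * t
phi = math.atan2(l1[1], l1[0]) - math.atan2(l0[1], l0[0])
length_change = math.sqrt(sum(c * c for c in l1)) - math.sqrt(sum(c * c for c in l0))
```

Here `phi` should agree with `-gamma * B0 * t` and `length_change` should be zero to within the integration error, which is the statement that {\mathbf{l}} precesses at constant magnitude.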

We can see that these results transfer over to quantum mechanics if we use Ehrenfest’s theorem. Using {\boldsymbol{\mu}=\gamma\mathbf{L}}, the interaction Hamiltonian 15 becomes

\displaystyle H=-\gamma\mathbf{L}\cdot\mathbf{B} \ \ \ \ \ (22)

We want to find the average of the angular momentum over time, so we use Ehrenfest’s theorem to write

\displaystyle \frac{d\left\langle \mathbf{L}\right\rangle }{dt}=-\frac{i}{\hbar}\left\langle \left[\mathbf{L},H\right]\right\rangle \ \ \ \ \ (23)

We can work out the RHS using the commutators of angular momentum:

\displaystyle \left[L_{i},L_{j}\right]=i\hbar\sum_{k}\varepsilon_{ijk}L_{k} \ \ \ \ \ (24)

As we’re dealing with a vector operator, we can work out each component separately. For {L_{x}} we get, assuming that {\mathbf{B}} is independent of position (and thus commutes with {\mathbf{L}}):

\displaystyle -\frac{i}{\hbar}\left[L_{x},H\right] \displaystyle = \displaystyle \frac{i\gamma}{\hbar}\left[L_{x},L_{x}B_{x}+L_{y}B_{y}+L_{z}B_{z}\right]\ \ \ \ \ (25)
\displaystyle \displaystyle = \displaystyle \frac{i\gamma}{\hbar}\left(\left[L_{x},L_{x}\right]B_{x}+\left[L_{x},L_{y}\right]B_{y}+\left[L_{x},L_{z}\right]B_{z}\right)\ \ \ \ \ (26)
\displaystyle \displaystyle = \displaystyle -\gamma\left(0+L_{z}B_{y}-L_{y}B_{z}\right)\ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle \gamma\left(\mathbf{L}\times\mathbf{B}\right)_{x}\ \ \ \ \ (28)
\displaystyle \displaystyle = \displaystyle \left(\boldsymbol{\mu}\times\mathbf{B}\right)_{x} \ \ \ \ \ (29)

The other two components work out similarly, so we have

\displaystyle -\frac{i}{\hbar}\left[\mathbf{L},H\right]=\boldsymbol{\mu}\times\mathbf{B} \ \ \ \ \ (30)

As {\mathbf{B}} doesn’t depend on position, when we take the average over space we get

\displaystyle \frac{d\left\langle \mathbf{L}\right\rangle }{dt}=\left\langle \boldsymbol{\mu}\right\rangle \times\mathbf{B} \ \ \ \ \ (31)

Thus the mean of the quantum angular momentum also precesses about {\mathbf{B}}. Since the only assumption we made was that {\mathbf{B}} is independent of position, and all that was used in the derivation was the commutation relations of angular momentum, the result is also valid for spin angular momentum, and for time-varying magnetic fields, provided they are uniform over all space.
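Since the derivation uses only the commutation relations, it can be checked directly for spin {\frac{1}{2}}, where {S_{i}=\frac{\hbar}{2}\sigma_{i}}. The sketch below compares {-\frac{i}{\hbar}\left[S_{i},H\right]} with {\gamma\left(\mathbf{S}\times\mathbf{B}\right)_{i}} for all three components. It assumes units with {\hbar=1}, and the values of {\gamma} and {\mathbf{B}} as well as the helper names are made up for the example.

```python
HBAR = 1.0  # assumption: units with hbar = 1

# Pauli matrices sigma_x, sigma_y, sigma_z as nested lists
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(coeffs, mats):
    """Return sum_n coeffs[n] * mats[n] for 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    return lincomb([1, -1], [matmul(a, b), matmul(b, a)])

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

gamma = 0.7            # illustrative gyromagnetic ratio
B = [0.3, -1.1, 2.0]   # illustrative uniform field
S = [lincomb([HBAR / 2], [s]) for s in SIGMA]
H = lincomb([-gamma * b for b in B], S)   # H = -gamma S.B

def lhs(i):
    """-(i/hbar) [S_i, H]"""
    return lincomb([-1j / HBAR], [commutator(S[i], H)])

def rhs(i):
    """gamma (S x B)_i = gamma (S_j B_k - S_k B_j) with (i, j, k) cyclic"""
    j, k = (i + 1) % 3, (i + 2) % 3
    return lincomb([gamma * B[k], -gamma * B[j]], [S[j], S[k]])

errors = [max_abs_diff(lhs(i), rhs(i)) for i in range(3)]
```

All three entries of `errors` should vanish to rounding error, for any choice of `gamma` and `B`.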

Pauli matrices: a couple of theorems about commutators and anticommutators

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.3.8.

Here are a couple of theorems concerning the Pauli matrices {\boldsymbol{\sigma}}, the components of which are

\displaystyle \sigma_{x}=\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right];\quad\sigma_{y}=\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right];\quad\sigma_{z}=\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (1)

Both theorems arise from the fact that an arbitrary {2\times2} matrix can be written as a linear combination of the Pauli matrices and the unit matrix:

\displaystyle M \displaystyle = \displaystyle \left[\begin{array}{cc} \alpha & \beta\\ \gamma & \delta \end{array}\right]\ \ \ \ \ (2)
\displaystyle \displaystyle = \displaystyle \frac{1}{2}\left[\left(\alpha+\delta\right)I+\left(\beta+\gamma\right)\sigma_{x}+i\left(\beta-\gamma\right)\sigma_{y}+\left(\alpha-\delta\right)\sigma_{z}\right] \ \ \ \ \ (3)

We’ll also need the anticommutation and commutation relations

\displaystyle \left[\sigma_{i},\sigma_{j}\right]_{+} \displaystyle = \displaystyle 2\delta_{ij}I\ \ \ \ \ (4)
\displaystyle \left[\sigma_{i},\sigma_{j}\right] \displaystyle = \displaystyle 2i\sum_{k}\varepsilon_{ijk}\sigma_{k} \ \ \ \ \ (5)

Theorem 1 Any matrix that commutes with {\boldsymbol{\sigma}} (that is, it commutes with all 3 components of {\boldsymbol{\sigma}}) is a multiple of the unit matrix.

Proof: First, since {I} commutes with every matrix, it commutes with {\boldsymbol{\sigma}}. Now, from 5, any one of the Pauli matrices does not commute with the other two Pauli matrices, so {M} cannot have any component that is one of the Pauli matrices. From 3, this means that

\displaystyle \beta+\gamma \displaystyle = \displaystyle 0\ \ \ \ \ (6)
\displaystyle \beta-\gamma \displaystyle = \displaystyle 0\ \ \ \ \ (7)
\displaystyle \alpha-\delta \displaystyle = \displaystyle 0 \ \ \ \ \ (8)

The first two conditions say that {\beta=-\gamma} and {\beta=\gamma}, which together imply {\beta=\gamma=0}, and the last condition gives us {\alpha=\delta}, so {M} must be a multiple of the unit matrix. \Box
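For anyone who wants the step from 5 to 6 through 8 spelled out, here is one way of doing it. The coefficients {m_{0},m_{x},m_{y},m_{z}} are just shorthand (introduced here) for the four coefficients in 3.

```latex
\begin{align*}
M &= m_{0}I+m_{x}\sigma_{x}+m_{y}\sigma_{y}+m_{z}\sigma_{z}\\
\left[M,\sigma_{x}\right] &= m_{y}\left[\sigma_{y},\sigma_{x}\right]+m_{z}\left[\sigma_{z},\sigma_{x}\right]
  = -2im_{y}\sigma_{z}+2im_{z}\sigma_{y}\\
\left[M,\sigma_{y}\right] &= m_{x}\left[\sigma_{x},\sigma_{y}\right]+m_{z}\left[\sigma_{z},\sigma_{y}\right]
  = 2im_{x}\sigma_{z}-2im_{z}\sigma_{x}
\end{align*}
```

Because the Pauli matrices are linearly independent, the first commutator vanishes only if {m_{y}=m_{z}=0} and the second then requires {m_{x}=0}, which leaves {M=m_{0}I}.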

Theorem 2 There is no matrix (apart from the zero matrix) that anticommutes with all 3 Pauli matrices.

Proof: Since {I} doesn’t anticommute with any non-zero matrix, {M} cannot contain a component with {I}. From 4, the anticommutator of two Pauli matrices is zero only if the two matrices are different. Therefore, if {M} contains a non-zero component for any one, say {\sigma_{x}}, of the Pauli matrices then {M} will not anticommute with {\sigma_{x}}. The same argument applies to the other two Pauli matrices, so there is no non-zero {M} that anticommutes with all 3 Pauli matrices. \Box
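The same argument can be written as a single line of algebra. As a sketch, with {m_{0}} and {m_{i}} again denoting the coefficients of {I} and {\sigma_{i}} in 3 (a notation introduced only for this check), 4 gives

```latex
\begin{align*}
\left[M,\sigma_{x}\right]_{+} &= m_{0}\left[I,\sigma_{x}\right]_{+}+\sum_{i}m_{i}\left[\sigma_{i},\sigma_{x}\right]_{+}\\
 &= 2m_{0}\sigma_{x}+2m_{x}I
\end{align*}
```

This vanishes only if {m_{0}=0} and {m_{x}=0}. Repeating the calculation with {\sigma_{y}} and {\sigma_{z}} forces {m_{y}=m_{z}=0} as well, so only {M=0} survives.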

Pauli matrices: a few example calculations

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.3.7.

Here are a few examples of calculations using the Pauli matrices {\boldsymbol{\sigma}}, the components of which are

\displaystyle \sigma_{x}=\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right];\quad\sigma_{y}=\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right];\quad\sigma_{z}=\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (1)

From Shankar’s equation 14.3.44, we know that the unitary rotation operator can be written as

\displaystyle U\left[R\left(\boldsymbol{\theta}\right)\right] \displaystyle = \displaystyle e^{-i\boldsymbol{\theta}\cdot\boldsymbol{\sigma}/2}\ \ \ \ \ (2)
\displaystyle \displaystyle = \displaystyle \cos\frac{\theta}{2}I-i\sin\frac{\theta}{2}\left(\hat{\theta}\cdot\boldsymbol{\sigma}\right) \ \ \ \ \ (3)

Example 1 Find {\left(I+i\sigma_{x}\right)^{1/2}}. As usual, the square root of a matrix {M} is the matrix {M^{1/2}} such that {M^{1/2}M^{1/2}=M}. To solve this, we would like to express {I+i\sigma_{x}} in the form 2, from which we can find the square root by simply dividing the exponent by 2. We first express it in the form 3, from which we see that we need an angle {\theta} such that

\displaystyle \cos\frac{\theta}{2}=-\sin\frac{\theta}{2} \ \ \ \ \ (4)

This is valid if

\displaystyle \theta \displaystyle = \displaystyle \frac{3\pi}{2}\ \ \ \ \ (5)
\displaystyle \cos\frac{\theta}{2} \displaystyle = \displaystyle -\frac{\sqrt{2}}{2}=-\sin\frac{\theta}{2} \ \ \ \ \ (6)

This gives

\displaystyle U \displaystyle = \displaystyle -\frac{\sqrt{2}}{2}\left(I+i\sigma_{x}\right)\ \ \ \ \ (7)
\displaystyle I+i\sigma_{x} \displaystyle = \displaystyle -\sqrt{2}e^{-i\sigma_{x}3\pi/4}\ \ \ \ \ (8)
\displaystyle \displaystyle = \displaystyle \sqrt{2}e^{i\pi}e^{-i\sigma_{x}3\pi/4} \ \ \ \ \ (9)

Therefore

\displaystyle \left(I+i\sigma_{x}\right)^{1/2} \displaystyle = \displaystyle 2^{1/4}e^{i\pi/2}e^{-i\sigma_{x}3\pi/8}\ \ \ \ \ (10)
\displaystyle \displaystyle = \displaystyle 2^{1/4}i\left(\cos\frac{3\pi}{8}I-i\sin\frac{3\pi}{8}\sigma_{x}\right) \ \ \ \ \ (11)

We can check this by evaluating the cos and sin using the half-angle formulas

\displaystyle \sin\frac{\theta}{2} \displaystyle = \displaystyle \sqrt{\frac{1-\cos\theta}{2}}\ \ \ \ \ (12)
\displaystyle \cos\frac{\theta}{2} \displaystyle = \displaystyle \sqrt{\frac{1+\cos\theta}{2}} \ \ \ \ \ (13)

We therefore have

\displaystyle \sin\frac{3\pi}{8} \displaystyle = \displaystyle \frac{1}{2}\sqrt{2+\sqrt{2}}\ \ \ \ \ (14)
\displaystyle \cos\frac{3\pi}{8} \displaystyle = \displaystyle \frac{1}{2}\sqrt{2-\sqrt{2}} \ \ \ \ \ (15)

Plugging these into 11 we have

\displaystyle \left(I+i\sigma_{x}\right)^{1/2}=\frac{1}{2^{3/4}}\left[\begin{array}{cc} i\sqrt{2-\sqrt{2}} & \sqrt{2+\sqrt{2}}\\ \sqrt{2+\sqrt{2}} & i\sqrt{2-\sqrt{2}} \end{array}\right] \ \ \ \ \ (16)

Squaring this gives

\displaystyle I+i\sigma_{x} \displaystyle = \displaystyle \frac{1}{2^{3/2}}\left[\begin{array}{cc} -\left(2-\sqrt{2}\right)+2+\sqrt{2} & 2i\sqrt{2-\sqrt{2}}\sqrt{2+\sqrt{2}}\\ 2i\sqrt{2-\sqrt{2}}\sqrt{2+\sqrt{2}} & 2+\sqrt{2}-\left(2-\sqrt{2}\right) \end{array}\right]\ \ \ \ \ (17)
\displaystyle \displaystyle = \displaystyle \frac{1}{2\sqrt{2}}\left[\begin{array}{cc} 2\sqrt{2} & 2\sqrt{2}i\\ 2\sqrt{2}i & 2\sqrt{2} \end{array}\right]\ \ \ \ \ (18)
\displaystyle \displaystyle = \displaystyle \left[\begin{array}{cc} 1 & i\\ i & 1 \end{array}\right] \ \ \ \ \ (19)

which is correct.

[Incidentally, 11 is different from Shankar’s answer in the back of the book, but both are correct as can be verified by squaring Shankar’s answer. Unlike ordinary complex numbers, a {2\times2} matrix can have more than 2 square roots.]
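The claim that there is more than one square root is easy to test numerically. The sketch below squares the root in 11 and also a second root, obtained by choosing {\theta=-\pi/2} instead of {3\pi/2} in 4. That second root is worked out here only as an illustration, and I make no claim that it is the form given in the back of the book. The helper names are my own.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def identity_plus_sigma_x(c0, cx):
    """Return c0*I + cx*sigma_x as a nested list."""
    return [[c0, cx], [cx, c0]]

target = identity_plus_sigma_x(1, 1j)   # I + i sigma_x

r = 2 ** 0.25

# Root from equation 11: 2^(1/4) i (cos(3pi/8) I - i sin(3pi/8) sigma_x)
root_a = identity_plus_sigma_x(r * 1j * math.cos(3 * math.pi / 8),
                               r * math.sin(3 * math.pi / 8))

# Second root (theta = -pi/2): 2^(1/4) (cos(pi/8) I + i sin(pi/8) sigma_x)
root_b = identity_plus_sigma_x(r * math.cos(math.pi / 8),
                               r * 1j * math.sin(math.pi / 8))

err_a = max_abs_diff(matmul(root_a, root_a), target)
err_b = max_abs_diff(matmul(root_b, root_b), target)
roots_differ_by = max_abs_diff(root_a, root_b)
```

Both `err_a` and `err_b` should be zero to rounding error, while `roots_differ_by` is clearly non-zero, and the two roots are not simply negatives of each other.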

Example 2 Find {\left(2I+\sigma_{x}\right)^{-1}}. In principle, we could try to solve this the same way as in Example 1, but this time we would need to find {\theta} such that {\cos\frac{\theta}{2}=-2i\sin\frac{\theta}{2}}. This has no real solution, so it certainly doesn’t give a ‘nice’ value of {\theta} (that is, a value that is some nice multiple of {\pi}). It seems easier to just calculate the matrix and then take its inverse using the standard formula for the inverse of a {2\times2} matrix. We can then convert this back to a linear combination of Pauli matrices using the formula for a matrix {M}:

\displaystyle M \displaystyle = \displaystyle \left[\begin{array}{cc} \alpha & \beta\\ \gamma & \delta \end{array}\right]\ \ \ \ \ (20)
\displaystyle \displaystyle = \displaystyle \frac{1}{2}\left[\left(\alpha+\delta\right)I+\left(\beta+\gamma\right)\sigma_{x}+i\left(\beta-\gamma\right)\sigma_{y}+\left(\alpha-\delta\right)\sigma_{z}\right] \ \ \ \ \ (21)

We get

\displaystyle 2I+\sigma_{x}=\left[\begin{array}{cc} 2 & 1\\ 1 & 2 \end{array}\right] \ \ \ \ \ (22)

The inverse of a {2\times2} matrix is given by

\displaystyle \left[\begin{array}{cc} a & b\\ c & d \end{array}\right]^{-1}=\frac{1}{ad-bc}\left[\begin{array}{cc} d & -b\\ -c & a \end{array}\right] \ \ \ \ \ (23)

so

\displaystyle \left(2I+\sigma_{x}\right)^{-1}=\left[\begin{array}{cc} \frac{2}{3} & -\frac{1}{3}\\ -\frac{1}{3} & \frac{2}{3} \end{array}\right] \ \ \ \ \ (24)

Using 21 we find

\displaystyle \left(2I+\sigma_{x}\right)^{-1}=\frac{1}{6}\left(4I-2\sigma_{x}\right)=\frac{1}{3}\left(2I-\sigma_{x}\right) \ \ \ \ \ (25)

We can check this by multiplication

\displaystyle \frac{1}{3}\left(2I-\sigma_{x}\right)\left(2I+\sigma_{x}\right) \displaystyle = \displaystyle \frac{1}{3}\left(4I-\sigma_{x}^{2}\right)\ \ \ \ \ (26)
\displaystyle \displaystyle = \displaystyle \frac{1}{3}\left(4I-I\right)\ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle I \ \ \ \ \ (28)

where we used {\sigma_{x}^{2}=I} to get the second line.
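The same check can be done in exact arithmetic. This is only a small sketch using Python’s `fractions` module; the function name `inverse_2x2` is my own and simply implements formula 23.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula (equation 23)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# 2I + sigma_x, with exact rational entries
M = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]
M_inv = inverse_2x2(M)

# (2I - sigma_x) / 3, the result in equation 25
expected = [[Fraction(2, 3), Fraction(-1, 3)],
            [Fraction(-1, 3), Fraction(2, 3)]]

product = matmul(M_inv, M)
```

With rational entries there is no rounding, so `M_inv` can be compared with `expected` for exact equality, and `product` is exactly the unit matrix.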

Example 3 Find {\sigma_{x}^{-1}}. Since {\sigma_{x}^{2}=I}, {\sigma_{x}^{-1}=\sigma_{x}}.

Rotation of the spin axis

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.3.6.

Just as the orbital angular momentum operator {\mathbf{L}} is the generator of rotations, the spin operator {\mathbf{S}} can also be used as the generator of rotations in spin space by means of the unitary operator

\displaystyle  U\left[R\left(\boldsymbol{\theta}\right)\right]=e^{-i\boldsymbol{\theta}\cdot\mathbf{S}/\hbar}=e^{-i\boldsymbol{\theta}\cdot\boldsymbol{\sigma}/2} \ \ \ \ \ (1)

where we’ve written the operator in terms of the Pauli matrices {\boldsymbol{\sigma}}, the components of which are

\displaystyle  \sigma_{x}=\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right];\quad\sigma_{y}=\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right];\quad\sigma_{z}=\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (2)

For a spin pointing in the direction {\hat{n}}, where {\hat{n}} is defined in terms of the spherical angles as

\displaystyle  \hat{n}=\sin\theta\cos\phi\hat{\mathbf{x}}+\sin\theta\sin\phi\hat{\mathbf{y}}+\cos\theta\hat{\mathbf{z}} \ \ \ \ \ (3)

the corresponding eigenvectors of the operator {\hat{n}\cdot\mathbf{S}} are

\displaystyle   \left|\hat{n}+\right\rangle \displaystyle  = \displaystyle  \left[\begin{array}{c} \cos\frac{\theta}{2}e^{-i\phi/2}\\ \sin\frac{\theta}{2}e^{i\phi/2} \end{array}\right]\ \ \ \ \ (4)
\displaystyle  \left|\hat{n}-\right\rangle \displaystyle  = \displaystyle  \left[\begin{array}{c} -\sin\frac{\theta}{2}e^{-i\phi/2}\\ \cos\frac{\theta}{2}e^{i\phi/2} \end{array}\right] \ \ \ \ \ (5)

If we start with the spin pointing in the {+z} direction, so that it is in the state

\displaystyle  \left|s_{z}=\frac{\hbar}{2}\right\rangle =\left[\begin{array}{c} 1\\ 0 \end{array}\right] \ \ \ \ \ (6)

then it should be possible to rotate this state into the general state 4 by applying the correct rotation operators in sequence.

Suppose we first rotate the state by an angle {\theta} about the {y} axis. This rotates the axis of spin so that it lies in the {xz} plane in the first quadrant (that is, positive {x} and positive {z}), making an angle {\theta} with the {z} axis. We can now rotate again by an angle {\phi} about the (original) {z} axis. The axis of spin now points in the direction given by {\hat{n}} in 3. That is, it should be true that

\displaystyle  \left|\hat{n}+\right\rangle =U\left[R\left(\phi\hat{\mathbf{z}}\right)\right]U\left[R\left(\theta\hat{\mathbf{y}}\right)\right]\left[\begin{array}{c} 1\\ 0 \end{array}\right] \ \ \ \ \ (7)

In order to verify this by direct calculation, we need an explicit form for {U}. This is derived by Shankar in his equation 14.3.44 so we won’t repeat the derivation here. Basically, it uses the fact that {\left(\hat{n}\cdot\boldsymbol{\sigma}\right)^{2}=I} and expands the exponential 1 as a power series, with the result

\displaystyle  U\left[R\left(\boldsymbol{\theta}\right)\right]=\cos\frac{\theta}{2}I-i\sin\frac{\theta}{2}\left(\hat{\theta}\cdot\boldsymbol{\sigma}\right) \ \ \ \ \ (8)

We can use this formula to do the calculation.

\displaystyle   U\left[R\left(\theta\hat{\mathbf{y}}\right)\right]\left[\begin{array}{c} 1\\ 0 \end{array}\right] \displaystyle  = \displaystyle  \left[\cos\frac{\theta}{2}I-i\sin\frac{\theta}{2}\sigma_{y}\right]\left[\begin{array}{c} 1\\ 0 \end{array}\right]\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} \cos\frac{\theta}{2}\\ 0 \end{array}\right]-i\sin\frac{\theta}{2}\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\left[\begin{array}{c} 1\\ 0 \end{array}\right]\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} \cos\frac{\theta}{2}\\ \sin\frac{\theta}{2} \end{array}\right] \ \ \ \ \ (11)

Applying the second rotation we get

\displaystyle   U\left[R\left(\phi\hat{\mathbf{z}}\right)\right]\left[\begin{array}{c} \cos\frac{\theta}{2}\\ \sin\frac{\theta}{2} \end{array}\right] \displaystyle  = \displaystyle  \left[\cos\frac{\phi}{2}I-i\sin\frac{\phi}{2}\sigma_{z}\right]\left[\begin{array}{c} \cos\frac{\theta}{2}\\ \sin\frac{\theta}{2} \end{array}\right]\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} \cos\frac{\theta}{2}\cos\frac{\phi}{2}\\ \sin\frac{\theta}{2}\cos\frac{\phi}{2} \end{array}\right]-i\sin\frac{\phi}{2}\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]\left[\begin{array}{c} \cos\frac{\theta}{2}\\ \sin\frac{\theta}{2} \end{array}\right]\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} \cos\frac{\theta}{2}\left(\cos\frac{\phi}{2}-i\sin\frac{\phi}{2}\right)\\ \sin\frac{\theta}{2}\left(\cos\frac{\phi}{2}+i\sin\frac{\phi}{2}\right) \end{array}\right]\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{c} \cos\frac{\theta}{2}e^{-i\phi/2}\\ \sin\frac{\theta}{2}e^{i\phi/2} \end{array}\right] \ \ \ \ \ (15)

which agrees with 4.
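The two rotations can also be applied numerically for particular angles. In the sketch below, the function `rotation` implements 8 for a rotation about a single coordinate axis, and the values of {\theta} and {\phi} are arbitrary. As an extra check, it confirms that the rotated state is an eigenvector of {\hat{n}\cdot\boldsymbol{\sigma}} with eigenvalue {+1}. All names are my own.

```python
import cmath
import math

SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

def rotation(angle, axis):
    """U = cos(angle/2) I - i sin(angle/2) sigma_axis, for axis 'x', 'y' or 'z'."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    sig = SIGMA[axis]
    return [[c * (i == j) - 1j * s * sig[i][j] for j in range(2)]
            for i in range(2)]

def apply(m, v):
    """Apply a 2x2 matrix to a 2-component spinor."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

theta, phi = 1.1, 2.3   # illustrative angles in radians

# Rotate spin-up about y by theta, then about z by phi (equation 7)
rotated = apply(rotation(phi, "z"), apply(rotation(theta, "y"), [1, 0]))

# Expected eigenvector from equation 4
expected = [math.cos(theta / 2) * cmath.exp(-1j * phi / 2),
            math.sin(theta / 2) * cmath.exp(1j * phi / 2)]

# n.sigma for the direction n of equation 3
n = (math.sin(theta) * math.cos(phi),
     math.sin(theta) * math.sin(phi),
     math.cos(theta))
n_dot_sigma = [[sum(n[a] * SIGMA[ax][i][j] for a, ax in enumerate("xyz"))
                for j in range(2)] for i in range(2)]

state_error = max(abs(rotated[k] - expected[k]) for k in range(2))
eigen_error = max(abs(apply(n_dot_sigma, rotated)[k] - rotated[k])
                  for k in range(2))
```

Both `state_error` and `eigen_error` should be zero to rounding error for any choice of the two angles.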

Arbitrary 2×2 matrix as a linear combination of Pauli matrices

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.3.5.

Any {2\times2} matrix can be written as a linear combination of the three Pauli matrices and the unit matrix. That is, for an arbitrary matrix {M} we have

\displaystyle  M=\sum_{\alpha}m_{\alpha}\sigma_{\alpha} \ \ \ \ \ (1)

where the coefficients are found from

\displaystyle  m_{\alpha}=\frac{1}{2}\mbox{Tr}\left(M\sigma_{\alpha}\right) \ \ \ \ \ (2)

We can write this out explicitly as follows

\displaystyle  M=\left[\begin{array}{cc} \alpha & \beta\\ \gamma & \delta \end{array}\right] \ \ \ \ \ (3)

We then get

\displaystyle   m_{0} \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(M\sigma_{0}\right)\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(MI\right)\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(M\right)\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  \frac{\alpha+\delta}{2}\ \ \ \ \ (7)
\displaystyle  m_{1} \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(M\sigma_{1}\right)\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(\left[\begin{array}{cc} \alpha & \beta\\ \gamma & \delta \end{array}\right]\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right]\right)\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(\left[\begin{array}{cc} \beta & \alpha\\ \delta & \gamma \end{array}\right]\right)\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \frac{\beta+\gamma}{2}\ \ \ \ \ (11)
\displaystyle  m_{2} \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(M\sigma_{2}\right)\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(\left[\begin{array}{cc} \alpha & \beta\\ \gamma & \delta \end{array}\right]\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\right)\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(\left[\begin{array}{cc} i\beta & -i\alpha\\ i\delta & -i\gamma \end{array}\right]\right)\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  i\frac{\beta-\gamma}{2}\ \ \ \ \ (15)
\displaystyle  m_{3} \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(M\sigma_{3}\right)\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(\left[\begin{array}{cc} \alpha & \beta\\ \gamma & \delta \end{array}\right]\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]\right)\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mbox{Tr}\left(\left[\begin{array}{cc} \alpha & -\beta\\ \gamma & -\delta \end{array}\right]\right)\ \ \ \ \ (18)
\displaystyle  \displaystyle  = \displaystyle  \frac{\alpha-\delta}{2} \ \ \ \ \ (19)

Thus, in more conventional notation

\displaystyle  M=\frac{1}{2}\left[\left(\alpha+\delta\right)I+\left(\beta+\gamma\right)\sigma_{x}+i\left(\beta-\gamma\right)\sigma_{y}+\left(\alpha-\delta\right)\sigma_{z}\right] \ \ \ \ \ (20)
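A quick way to convince yourself of 20 is to try it on a matrix with arbitrary complex entries. The sketch below computes the coefficients from the trace formula 2, compares them with the closed forms 7, 11, 15 and 19, and rebuilds {M} from them. The example matrix and the function names are arbitrary choices of mine.

```python
# sigma_0 = I followed by sigma_x, sigma_y, sigma_z
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def pauli_coefficients(m):
    """m_alpha = Tr(M sigma_alpha) / 2 for alpha = 0..3."""
    return [trace(matmul(m, s)) / 2 for s in SIGMA]

def reconstruct(coeffs):
    """Return sum_alpha m_alpha sigma_alpha."""
    return [[sum(c * s[i][j] for c, s in zip(coeffs, SIGMA)) for j in range(2)]
            for i in range(2)]

# Arbitrary example matrix [[alpha, beta], [gamma, delta]]
alpha, beta, gamma, delta = 1 + 2j, 3, -4j, 5 - 1j
M = [[alpha, beta], [gamma, delta]]

m = pauli_coefficients(M)
closed_form = [(alpha + delta) / 2, (beta + gamma) / 2,
               1j * (beta - gamma) / 2, (alpha - delta) / 2]

coeff_error = max(abs(x - y) for x, y in zip(m, closed_form))
rebuilt = reconstruct(m)
rebuild_error = max(abs(rebuilt[i][j] - M[i][j])
                    for i in range(2) for j in range(2))
```

Both `coeff_error` and `rebuild_error` should vanish to rounding error.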

Pauli matrices: a useful identity

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.3.4.

The three components of the spin operator {\mathbf{S}} for spin {\frac{1}{2}} can be expressed in terms of the Pauli matrices

\displaystyle \sigma_{x}=\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right];\quad\sigma_{y}=\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right];\quad\sigma_{z}=\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (1)

We can derive an identity involving the Pauli matrices:

\displaystyle \left(\mathbf{A}\cdot\boldsymbol{\sigma}\right)\left(\mathbf{B}\cdot\boldsymbol{\sigma}\right)=\left(\mathbf{A}\cdot\mathbf{B}\right)I+i\left(\mathbf{A}\times\mathbf{B}\right)\cdot\boldsymbol{\sigma} \ \ \ \ \ (2)

 

One way of proving this is to use the commutation relations for the Pauli matrices. We have

\displaystyle \left[\sigma_{i},\sigma_{j}\right]_{+} \displaystyle = \displaystyle 2\delta_{ij}I\ \ \ \ \ (3)
\displaystyle \left[\sigma_{i},\sigma_{j}\right] \displaystyle = \displaystyle 2i\sum_{k}\varepsilon_{ijk}\sigma_{k} \ \ \ \ \ (4)

where {\varepsilon_{ijk}} is the Levi-Civita antisymmetric tensor.

We therefore have

\displaystyle \sigma_{i}\sigma_{j} \displaystyle = \displaystyle \frac{1}{2}\left(\left[\sigma_{i},\sigma_{j}\right]_{+}+\left[\sigma_{i},\sigma_{j}\right]\right)\ \ \ \ \ (5)
\displaystyle \displaystyle = \displaystyle \delta_{ij}I+i\sum_{k}\varepsilon_{ijk}\sigma_{k} \ \ \ \ \ (6)

Using the summation convention where repeated indices are summed from 1 to 3 (that is, over {x}, {y} and {z}):

\displaystyle \left(\mathbf{A}\cdot\boldsymbol{\sigma}\right)\left(\mathbf{B}\cdot\boldsymbol{\sigma}\right) \displaystyle = \displaystyle A_{i}\sigma_{i}B_{j}\sigma_{j}\ \ \ \ \ (7)
\displaystyle \displaystyle = \displaystyle A_{i}B_{j}\sigma_{i}\sigma_{j}\ \ \ \ \ (8)
\displaystyle \displaystyle = \displaystyle A_{i}B_{j}\left(\delta_{ij}I+i\varepsilon_{ijk}\sigma_{k}\right)\ \ \ \ \ (9)
\displaystyle \displaystyle = \displaystyle A_{i}B_{i}I+i\varepsilon_{ijk}A_{i}B_{j}\sigma_{k}\ \ \ \ \ (10)
\displaystyle \displaystyle = \displaystyle \left(\mathbf{A}\cdot\mathbf{B}\right)I+i\left(\mathbf{A}\times\mathbf{B}\right)\cdot\boldsymbol{\sigma} \ \ \ \ \ (11)

where the last term on the RHS follows from writing the vector cross product in terms of {\varepsilon_{ijk}}. [Note that in the second line, we’ve assumed that {\mathbf{B}} commutes with {\boldsymbol{\sigma}}.]
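For ordinary numerical vectors {\mathbf{A}} and {\mathbf{B}} (which certainly commute with {\boldsymbol{\sigma}}), the identity 2 can be checked by brute force. The following is a minimal sketch with two arbitrarily chosen real vectors; the helper names are mine.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dot_sigma(v):
    """Return v.sigma = v_x sigma_x + v_y sigma_y + v_z sigma_z."""
    return [[sum(c * s[i][j] for c, s in zip(v, SIGMA)) for j in range(2)]
            for i in range(2)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

A = [0.5, -1.2, 2.0]   # illustrative vectors
B = [1.5, 0.3, -0.7]

# Left side: (A.sigma)(B.sigma)
lhs = matmul(dot_sigma(A), dot_sigma(B))

# Right side: (A.B) I + i (A x B).sigma
a_dot_b = sum(x * y for x, y in zip(A, B))
cross_term = dot_sigma(cross(A, B))
rhs = [[a_dot_b * (i == j) + 1j * cross_term[i][j] for j in range(2)]
       for i in range(2)]

err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The value of `err` should be zero to rounding error whatever vectors are used.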

Another way of deriving this result is as follows. First, we add the {2\times2} identity matrix {I} to the set of Pauli matrices, calling it {\sigma_{0}\equiv I}. Then, because we have four independent matrices (Shankar shows they are linearly independent in his equations 14.3.40-41) each with 4 entries, we can write any {2\times2} complex matrix as a linear combination of the {\sigma_{\alpha}} (where a Greek subscript ranges from 0 to 3). That is, for a general {2\times2} matrix {M}

\displaystyle M=\sum_{\alpha}m_{\alpha}\sigma_{\alpha} \ \ \ \ \ (12)

From the trace identities

\displaystyle \mbox{Tr}\left(\sigma_{\alpha}\sigma_{\beta}\right)=2\delta_{\alpha\beta} \ \ \ \ \ (13)

 

we can find {m_{\alpha}} by right-multiplying by {\sigma_{\beta}} and taking the trace:

\displaystyle \mbox{Tr}\left(M\sigma_{\beta}\right) \displaystyle = \displaystyle \sum_{\alpha}m_{\alpha}\mbox{Tr}\left(\sigma_{\alpha}\sigma_{\beta}\right)\ \ \ \ \ (14)
\displaystyle \displaystyle = \displaystyle 2\sum_{\alpha}m_{\alpha}\delta_{\alpha\beta}\ \ \ \ \ (15)
\displaystyle \displaystyle = \displaystyle 2m_{\beta} \ \ \ \ \ (16)

Thus

\displaystyle m_{\alpha}=\frac{1}{2}\mbox{Tr}\left(M\sigma_{\alpha}\right) \ \ \ \ \ (17)

Returning to 2, we can identify (again using the summation convention):

\displaystyle M \displaystyle = \displaystyle \left(\mathbf{A}\cdot\boldsymbol{\sigma}\right)\left(\mathbf{B}\cdot\boldsymbol{\sigma}\right)\ \ \ \ \ (18)
\displaystyle \displaystyle = \displaystyle A_{i}\sigma_{i}B_{j}\sigma_{j}\ \ \ \ \ (19)
\displaystyle \displaystyle = \displaystyle m_{\alpha}\sigma_{\alpha} \ \ \ \ \ (20)

For {\alpha=0} we have

\displaystyle m_{0} \displaystyle = \displaystyle \frac{1}{2}\mbox{Tr}\left(M\sigma_{0}\right)\ \ \ \ \ (21)
\displaystyle \displaystyle = \displaystyle \frac{1}{2}\mbox{Tr}\left(M\right)\ \ \ \ \ (22)
\displaystyle \displaystyle = \displaystyle \frac{1}{2}A_{i}B_{j}\mbox{Tr}\left(\sigma_{i}\sigma_{j}\right)\ \ \ \ \ (23)
\displaystyle \displaystyle = \displaystyle \frac{1}{2}A_{i}B_{j}\left(2\delta_{ij}\right)\ \ \ \ \ (24)
\displaystyle \displaystyle = \displaystyle A_{i}B_{i}\ \ \ \ \ (25)
\displaystyle \displaystyle = \displaystyle \mathbf{A}\cdot\mathbf{B} \ \ \ \ \ (26)

where we used 13 to get the fourth line. This gives us the first term on the RHS of 2.

For the other three {\sigma_{i}} coefficients, we can use a similar argument. Consider {\sigma_{x}}.

\displaystyle m_{x} \displaystyle = \displaystyle \frac{1}{2}\mbox{Tr}\left(M\sigma_{x}\right)\ \ \ \ \ (27)
\displaystyle \displaystyle = \displaystyle \frac{1}{2}A_{i}B_{j}\mbox{Tr}\left(\sigma_{i}\sigma_{j}\sigma_{x}\right) \ \ \ \ \ (28)

From 6 we see that {\sigma_{i}\sigma_{j}} can always be written as a multiple of a single matrix {\sigma_{\alpha}}: it is {I=\sigma_{0}} if {i=j}, and {\pm i\sigma_{k}} if {i\ne j}. Thus the product of 3 Pauli matrices {\sigma_{i}\sigma_{j}\sigma_{x}} can be reduced to a multiple of a product of 2, {\sigma_{\alpha}\sigma_{x}} (for {i\ne j}, the sign of the factor {\pm i} is determined by the order in which we multiply the two matrices {\sigma_{i}} and {\sigma_{j}}). However, from 13, we see that the trace of {\sigma_{\alpha}\sigma_{x}} is non-zero only if {\alpha=x}. The only way this can happen is if either {i=y} and {j=z} or {i=z} and {j=y}. Therefore we have

\displaystyle m_{x}=\frac{1}{2}A_{y}B_{z}\mbox{Tr}\left(\sigma_{y}\sigma_{z}\sigma_{x}\right)+\frac{1}{2}A_{z}B_{y}\mbox{Tr}\left(\sigma_{z}\sigma_{y}\sigma_{x}\right) \ \ \ \ \ (29)

(Repeated indices are not summed here!) From 6 we have

\displaystyle \sigma_{y}\sigma_{z}=-\sigma_{z}\sigma_{y}=i\sigma_{x} \ \ \ \ \ (30)

Thus

\displaystyle \mbox{Tr}\left(\sigma_{y}\sigma_{z}\sigma_{x}\right)=i\mbox{Tr}\left(\sigma_{x}^{2}\right)=2i

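Similarly, {\mbox{Tr}\left(\sigma_{z}\sigma_{y}\sigma_{x}\right)=-i\mbox{Tr}\left(\sigma_{x}^{2}\right)=-2i}. Therefore, from 29,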

\displaystyle m_{x}=i\left(A_{y}B_{z}-A_{z}B_{y}\right) \ \ \ \ \ (31)

and {m_{x}} is the {x} component of {i\left(\mathbf{A}\times\mathbf{B}\right)}. A similar argument gives {m_{y}} and {m_{z}}, so putting everything together we again arrive at 2.

Pauli matrices: trace

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Chapter 14, Exercise 14.3.3.

The three components of the spin operator {\mathbf{S}} for spin {\frac{1}{2}} can be expressed in terms of the Pauli matrices

\displaystyle  \sigma_{x}=\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right];\quad\sigma_{y}=\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right];\quad\sigma_{z}=\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (1)

as

\displaystyle  S_{i}=\frac{\hbar}{2}\sigma_{i} \ \ \ \ \ (2)

As the trace of a matrix is the sum of its diagonal elements, it’s obvious from their definitions that the {\sigma_{i}} are traceless, but for some reason Shankar wants us to show this by a roundabout method.

We can show by direct calculation that the Pauli matrices anticommute with each other. For example

\displaystyle   \sigma_{x}\sigma_{y} \displaystyle  = \displaystyle  \left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right]\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\ \ \ \ \ (3)
\displaystyle  \displaystyle  = \displaystyle  \left[\begin{array}{cc} i & 0\\ 0 & -i \end{array}\right]\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  -\left[\begin{array}{cc} -i & 0\\ 0 & i \end{array}\right]\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  -\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right]\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  -\sigma_{y}\sigma_{x} \ \ \ \ \ (7)

In general, we have, for {i\ne j}:

\displaystyle   \left[\sigma_{i},\sigma_{j}\right]_{+} \displaystyle  = \displaystyle  0\ \ \ \ \ (8)
\displaystyle  \sigma_{i}\sigma_{j} \displaystyle  = \displaystyle  -\sigma_{j}\sigma_{i} \ \ \ \ \ (9)

Also, by direct calculation (or by using the commutation relations for {S_{i}}) we can show that

\displaystyle   \left[\sigma_{x},\sigma_{y}\right] \displaystyle  = \displaystyle  \sigma_{x}\sigma_{y}-\sigma_{y}\sigma_{x}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  2\sigma_{x}\sigma_{y}\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  2\left[\begin{array}{cc} i & 0\\ 0 & -i \end{array}\right]\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  2i\sigma_{z} \ \ \ \ \ (13)

This gives the relation

\displaystyle  \sigma_{x}\sigma_{y}=i\sigma_{z} \ \ \ \ \ (14)

and also for cyclic permutations of {x}, {y} and {z}. Also by direct calculation we can see that

\displaystyle  \sigma_{i}^{2}=I \ \ \ \ \ (15)

We can write this more generally as

\displaystyle  \sigma_{i}\sigma_{j}=\delta_{ij}I+i\sum_{k}\varepsilon_{ijk}\sigma_{k} \ \ \ \ \ (16)

where {\varepsilon_{ijk}} is the Levi-Civita antisymmetric tensor.

Returning to the trace, we can use the theorem for the trace of a product:

\displaystyle  \mbox{Tr}\left(AB\right)=\mbox{Tr}\left(BA\right) \ \ \ \ \ (17)

Applying this to 9 we have

\displaystyle  \mbox{Tr}\left(\sigma_{x}\sigma_{y}\right)=\mbox{Tr}\left(\sigma_{y}\sigma_{x}\right)=-\mbox{Tr}\left(\sigma_{x}\sigma_{y}\right) \ \ \ \ \ (18)

Any quantity equal to its negative must be zero, so

\displaystyle  \mbox{Tr}\left(\sigma_{x}\sigma_{y}\right)=0 \ \ \ \ \ (19)

Thus from 14 we get

\displaystyle  \mbox{Tr}\sigma_{z}=0 \ \ \ \ \ (20)

We can use the same argument for {\sigma_{x}} and {\sigma_{y}} by cyclic permutation.
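All of the relations used here can be confirmed by direct multiplication, as this short sketch does for the product rule 16 and for the traces. The compact formula used for {\varepsilon_{ijk}} is valid only for indices running over 0, 1, 2 (standing for {x}, {y}, {z}), and the names are my own.

```python
IDENTITY = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def levi_civita(i, j, k):
    """Antisymmetric symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def product_rule_rhs(i, j):
    """delta_ij I + i sum_k eps_ijk sigma_k (right side of equation 16)."""
    delta = 1 if i == j else 0
    return [[delta * IDENTITY[r][c]
             + 1j * sum(levi_civita(i, j, k) * SIGMA[k][r][c] for k in range(3))
             for c in range(2)] for r in range(2)]

# Largest deviation between sigma_i sigma_j and the right side of equation 16
product_rule_error = max(
    abs(matmul(SIGMA[i], SIGMA[j])[r][c] - product_rule_rhs(i, j)[r][c])
    for i in range(3) for j in range(3) for r in range(2) for c in range(2))

# Traces of the Pauli matrices and of their pairwise products
traces = [trace(s) for s in SIGMA]
product_traces = [[trace(matmul(SIGMA[i], SIGMA[j])) for j in range(3)]
                  for i in range(3)]
```

The entries of `traces` are all zero, and `product_traces` is 2 on the diagonal and zero elsewhere, which is the content of 19 together with {\sigma_{i}^{2}=I}.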