# ‘Latex path not specified’ errors

I’ve been getting a few reports of readers seeing the message “Latex path not specified” in place of mathematical formulas in some of my posts. It appears that this problem is not unique to my site and that it started around 19 November (see here and here for discussions on WordPress blogs).

I don’t see the error myself so I can’t test any solutions. However, I view my pages using only a Windows 7 desktop and a Windows 8.1 laptop, and there have been some reports that this error occurs only on smaller devices such as smartphones (which I don’t have). If you do see the error and want to report it, please let me know what device and operating system you are using.

In any case, it doesn’t look like there’s anything I can do to fix it. I haven’t changed anything in the way I generate equations in my posts recently, anyway. I guess we’ll just have to wait for someone at WordPress to fix it.

Update (24 November): it appears that this error has now been fixed by WordPress staff. If you still see “Latex path not specified” errors on my pages, try refreshing the page. If that doesn’t work, try clearing your browser’s cache (instructions here) and then refreshing the page.

# Welcome to Physics Pages

This blog consists of my notes and solutions to problems in various areas of mainstream physics. An index to the topics covered is contained in the links in the sidebar on the right, or in the menu at the top of the page.

This isn’t a “popular science” site, in that most posts use a fair bit of mathematics to explain their concepts. Thus this blog aims mainly to help those who are learning or reviewing physics in depth. More details on what the site contains and how to use it are in the Welcome menu above.

Despite Stephen Hawking’s caution that every equation included in a book (or, I suppose in a blog) would halve the readership, this blog has proved very popular since its inception in December 2010. Details of the number of visits and distinct visitors are given on the hit statistics page.

Physicspages.com changed hosts around the middle of May, 2015. If you subscribed to get email notifications of new posts before that date, you’ll need to subscribe again as I couldn’t port the list of subscribers over to the new host. Please use the subscribe form in the sidebar on the right. Sorry for the inconvenience.

Many thanks to my loyal followers and best wishes to everyone who visits. I hope you find it useful. Constructive criticism (or even praise) is always welcome, so feel free to leave a comment in response to any of the posts.

# Maxwell’s equations using the electromagnetic field tensor

References: Amitabha Lahiri & P. B. Pal, A First Book of Quantum Field Theory, Second Edition (Alpha Science International, 2004) – Chapter 1, Problem 1.7.

We can summarize the electromagnetic field in tensor form by means of the field tensor

$\displaystyle F_{\mu\nu}=\left[\begin{array}{cccc} 0 & -E_{x} & -E_{y} & -E_{z}\\ E_{x} & 0 & B_{z} & -B_{y}\\ E_{y} & -B_{z} & 0 & B_{x}\\ E_{z} & B_{y} & -B_{x} & 0 \end{array}\right] \ \ \ \ \ (1)$

[This tensor is written using relativistic units with ${c=1}$ so that ${\mathbf{E}}$ and ${\mathbf{B}}$ have the same dimensions.]

We’ve already seen that the pair of homogeneous Maxwell’s equations can be written in terms of this tensor as follows:

$\displaystyle \partial_{\mu}F_{\nu\sigma}+\partial_{\sigma}F_{\mu\nu}+\partial_{\nu}F_{\sigma\mu}=0 \ \ \ \ \ (2)$

With the usual ordering of coordinates ${\left(x^{0},x^{1},x^{2},x^{3}\right)=\left(t,x,y,z\right)}$, if we set ${\mu=2}$, ${\nu=1}$ and ${\sigma=3}$ we get

$\displaystyle \begin{aligned} -\partial_{y}B_{y}-\partial_{z}B_{z}-\partial_{x}B_{x} & =0\ \ \ \ \ (3)\\ \nabla\cdot\mathbf{B} & =0\ \ \ \ \ (4) \end{aligned}$

Selecting ${\mu=2}$, ${\nu=1}$ and ${\sigma=0}$ gives

$\displaystyle \partial_{y}E_{x}-\partial_{t}B_{z}-\partial_{x}E_{y}=0 \ \ \ \ \ (5)$

This is the ${z}$ component of

$\displaystyle \nabla\times\mathbf{E}+\frac{\partial\mathbf{B}}{\partial t}=0 \ \ \ \ \ (6)$

We can get the ${x}$ component by choosing ${\mu=0}$, ${\nu=2}$ and ${\sigma=3}$:

$\displaystyle \partial_{t}B_{x}-\partial_{z}E_{y}+\partial_{y}E_{z}=0 \ \ \ \ \ (7)$

The ${y}$ component comes from ${\mu=0}$, ${\nu=1}$ and ${\sigma=3}$:

$\displaystyle -\partial_{t}B_{y}-\partial_{z}E_{x}+\partial_{x}E_{z}=0 \ \ \ \ \ (8)$
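The index bookkeeping in these four cases is easy to get wrong by hand, so here is a minimal sketch that reads the three cyclic terms of equation 2 directly off the tensor in equation 1. The representation of each tensor entry as a (sign, field name) pair and the names `F_LOWER`, `COORDS` and `cyclic_terms` are my own illustrative choices, not anything from Lahiri & Pal.

```python
# Covariant field tensor F_{mu nu} from eq. (1). Each entry is (sign, field name);
# None stands for a zero entry. This encoding is an illustrative assumption.
F_LOWER = [
    [None,        (-1, "E_x"), (-1, "E_y"), (-1, "E_z")],
    [(+1, "E_x"), None,        (+1, "B_z"), (-1, "B_y")],
    [(+1, "E_y"), (-1, "B_z"), None,        (+1, "B_x")],
    [(+1, "E_z"), (+1, "B_y"), (-1, "B_x"), None],
]
COORDS = ["t", "x", "y", "z"]  # x^0 .. x^3


def cyclic_terms(mu, nu, sigma):
    """Return the non-zero terms of eq. (2) as signed strings such as '-d_y B_y'."""
    terms = []
    # (derivative index, first tensor index, second tensor index), in the order of eq. (2)
    for d, a, b in [(mu, nu, sigma), (sigma, mu, nu), (nu, sigma, mu)]:
        entry = F_LOWER[a][b]
        if entry is None:
            continue
        sign, field = entry
        terms.append(("+" if sign > 0 else "-") + "d_" + COORDS[d] + " " + field)
    return terms


# The four index choices used in the text
for indices in [(2, 1, 3), (2, 1, 0), (0, 2, 3), (0, 1, 3)]:
    print(indices, " ".join(cyclic_terms(*indices)), "= 0")
```

Each printed line should reproduce the left-hand side of equations 3, 5, 7 and 8 respectively, with `d_t` standing for ${\partial_{t}}$ and so on.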

The two inhomogeneous Maxwell’s equations are

$\displaystyle \begin{aligned} \nabla\cdot\mathbf{E} & =\frac{\rho}{\epsilon_{0}}\ \ \ \ \ (9)\\ \nabla\times\mathbf{B} & =\mu_{0}\mathbf{J}+\mu_{0}\epsilon_{0}\frac{\partial\mathbf{E}}{\partial t}\ \ \ \ \ (10)\\ & =\frac{1}{c^{2}}\left(\frac{\mathbf{J}}{\epsilon_{0}}+\frac{\partial\mathbf{E}}{\partial t}\right)\ \ \ \ \ (11)\\ & =\frac{\mathbf{J}}{\epsilon_{0}}+\frac{\partial\mathbf{E}}{\partial t}\ \ \ \ \ (12) \end{aligned}$

where we used ${\mu_{0}\epsilon_{0}=1/c^{2}}$ and the last line uses relativistic units with ${c=1}$.

We need to introduce the four-current to put these in four-vector form. This is

$\displaystyle J^{\mu}=\left[\rho,\mathbf{J}\right] \ \ \ \ \ (13)$

where ${\rho}$ is the charge density and ${\mathbf{J}}$ is the three-current. Then if we look at Gauss’s law 9 we see that this can be written as

$\displaystyle \partial_{\nu}F^{0\nu}=\frac{J^{0}}{\epsilon_{0}} \ \ \ \ \ (14)$

where ${F^{\mu\nu}}$ is the raised version of the tensor

$\displaystyle F^{\mu\nu}=\left[\begin{array}{cccc} 0 & E_{x} & E_{y} & E_{z}\\ -E_{x} & 0 & B_{z} & -B_{y}\\ -E_{y} & -B_{z} & 0 & B_{x}\\ -E_{z} & B_{y} & -B_{x} & 0 \end{array}\right] \ \ \ \ \ (15)$

If we generalize this formula we get

$\displaystyle \partial_{\nu}F^{\mu\nu}=\frac{J^{\mu}}{\epsilon_{0}} \ \ \ \ \ (16)$

For ${\mu=1}$ we get

$\displaystyle -\partial_{t}E_{x}+\partial_{y}B_{z}-\partial_{z}B_{y}=\frac{J^{x}}{\epsilon_{0}} \ \ \ \ \ (17)$

This is the ${x}$ component of 12. Choosing ${\mu=2}$ and ${\mu=3}$ gives the ${y}$ and ${z}$ components respectively.
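As a check on the raised tensor 15 and on the components of 16, the following sketch raises both indices of ${F_{\mu\nu}}$ with a diagonal metric and then lists the terms of ${\partial_{\nu}F^{\mu\nu}}$ for a chosen ${\mu}$. I assume the metric ${\mbox{diag}\left(+1,-1,-1,-1\right)}$ here; since two indices are raised, the opposite overall sign convention gives the same result. The (sign, field name) encoding and the function names are hypothetical helpers of mine.

```python
G = [1, -1, -1, -1]  # diagonal of the metric (assumed signature +,-,-,-)
COORDS = ["t", "x", "y", "z"]

# F_{mu nu} from eq. (1): (sign, field name), None for zero (illustrative encoding)
F_LOWER = [
    [None,        (-1, "E_x"), (-1, "E_y"), (-1, "E_z")],
    [(+1, "E_x"), None,        (+1, "B_z"), (-1, "B_y")],
    [(+1, "E_y"), (-1, "B_z"), None,        (+1, "B_x")],
    [(+1, "E_z"), (+1, "B_y"), (-1, "B_x"), None],
]


def raise_both(f_lower):
    """F^{mu nu} = g^{mu alpha} F_{alpha beta} g^{beta nu} for a diagonal metric."""
    result = []
    for mu in range(4):
        row = []
        for nu in range(4):
            entry = f_lower[mu][nu]
            if entry is None:
                row.append(None)
            else:
                sign, field = entry
                row.append((G[mu] * sign * G[nu], field))
        result.append(row)
    return result


def divergence_terms(f_upper, mu):
    """Non-zero terms of d_nu F^{mu nu} (sum over nu) as signed strings."""
    terms = []
    for nu in range(4):
        entry = f_upper[mu][nu]
        if entry is None:
            continue
        sign, field = entry
        terms.append(("+" if sign > 0 else "-") + "d_" + COORDS[nu] + " " + field)
    return terms


F_UPPER = raise_both(F_LOWER)
for mu in range(4):
    print("mu =", mu, ":", " ".join(divergence_terms(F_UPPER, mu)))
```

The ${\mu=0}$ line is ${\nabla\cdot\mathbf{E}}$ and the ${\mu=1}$ line is the left-hand side of 17; the ${\mu=2}$ and ${\mu=3}$ lines give the remaining two components of 12 in the same way.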

From our examination of the electromagnetic tensor, we saw the four-vector form of the Lorentz force law for a charge ${q}$:

$\displaystyle \frac{dp^{\mu}}{d\tau}=qF^{\mu\nu}u_{\nu} \ \ \ \ \ (18)$

where ${\tau}$ is the proper time, ${p^{\mu}}$ is the four-momentum and ${u_{\nu}}$ is the four-velocity.

To summarize, Maxwell’s equations can be written as

$\displaystyle \begin{aligned} \partial_{\mu}F_{\nu\sigma}+\partial_{\sigma}F_{\mu\nu}+\partial_{\nu}F_{\sigma\mu} & =0\ \ \ \ \ (19)\\ \partial_{\nu}F^{\mu\nu} & =\frac{J^{\mu}}{\epsilon_{0}}\ \ \ \ \ (20) \end{aligned}$

The Lorentz force law can be written as

$\displaystyle \frac{dp^{\mu}}{d\tau}=qF^{\mu\nu}u_{\nu} \ \ \ \ \ (21)$

# Lorentz transformation for infinitesimal relative velocity

References: Amitabha Lahiri & P. B. Pal, A First Book of Quantum Field Theory, Second Edition (Alpha Science International, 2004) – Chapter 1, Problems 1.5 – 1.6.

In special relativity, Lahiri & Pal use the opposite metric to the one we’ve been using so far, in that ${g_{\mu\nu}=\mbox{diag}\left(+1,-1,-1,-1\right)}$, that is, the time component is positive and the spatial components are negative. With this definition, lowering or raising the 0 index of a tensor has no effect on the sign, while lowering or raising index 1, 2 or 3 changes the sign.

With the usual spacetime four-vector

$\displaystyle x^{\mu}\equiv\left(x^{0},x^{i}\right)=\left(ct,\mathbf{x}\right) \ \ \ \ \ (1)$

the lowered version is

$\displaystyle x_{\mu}=g_{\mu\nu}x^{\nu}=\left(ct,-\mathbf{x}\right) \ \ \ \ \ (2)$

Under a Lorentz transformation, the ${x^{\mu}}$ transform as

$\displaystyle x^{\prime\mu}=\Lambda_{\;\nu}^{\mu}x^{\nu} \ \ \ \ \ (3)$

The transformation for ${x_{\mu}}$ is therefore

$\displaystyle \begin{aligned} x_{\mu}^{\prime} & =g_{\mu\nu}x^{\prime\nu}\ \ \ \ \ (4)\\ & =g_{\mu\nu}\Lambda_{\;\sigma}^{\nu}x^{\sigma}\ \ \ \ \ (5)\\ & =\Lambda_{\mu\sigma}g^{\sigma\rho}x_{\rho}\ \ \ \ \ (6)\\ & =\Lambda_{\mu}^{\;\rho}x_{\rho}\ \ \ \ \ (7) \end{aligned}$

The matrix ${\Lambda_{\mu}^{\;\rho}}$ is the original matrix ${\Lambda_{\;\rho}^{\mu}}$ with the first index lowered and second raised. If ${\mu=\rho=0}$ or if both ${\mu}$ and ${\rho}$ are spatial indices, the matrix element remains unchanged: ${\Lambda_{\mu}^{\;\rho}=\Lambda_{\;\rho}^{\mu}}$. If, however, exactly one index is zero (with the other index being spatial), the element changes sign: ${\Lambda_{\mu}^{\;\rho}=-\Lambda_{\;\rho}^{\mu}}$.
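The sign rule can be checked numerically. The sketch below builds a boost along ${x}$ with an arbitrarily chosen ${\beta=0.6}$ (in units with ${c=1}$), forms ${\Lambda_{\mu}^{\;\rho}=g_{\mu\nu}\Lambda_{\;\sigma}^{\nu}g^{\sigma\rho}}$, and also confirms that ${x_{\mu}x^{\mu}}$ is unchanged when ${x^{\mu}}$ and ${x_{\mu}}$ are transformed with their respective matrices. The function names and the test vector are assumptions made for illustration.

```python
import math

G = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu} (and of g^{mu nu})


def boost_x(beta):
    """Lambda^mu_nu for a boost along x with speed beta (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def lower_raise(lam):
    """Lambda_mu^rho = g_{mu nu} Lambda^nu_sigma g^{sigma rho} for a diagonal metric."""
    return [[G[mu] * lam[mu][rho] * G[rho] for rho in range(4)] for mu in range(4)]


def apply(matrix, vector):
    """Matrix-vector product with the second matrix index summed."""
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]


lam = boost_x(0.6)
lam_lr = lower_raise(lam)

x_up = [2.0, 1.0, 0.5, -0.3]                 # arbitrary x^mu
x_low = [G[i] * x_up[i] for i in range(4)]   # x_mu = g_{mu nu} x^nu

xp_up = apply(lam, x_up)       # x'^mu = Lambda^mu_nu x^nu
xp_low = apply(lam_lr, x_low)  # x'_mu = Lambda_mu^rho x_rho

interval_before = sum(x_low[i] * x_up[i] for i in range(4))
interval_after = sum(xp_low[i] * xp_up[i] for i in range(4))
print(lam_lr[0][1], lam[0][1])          # mixed time-space element: opposite signs
print(interval_before, interval_after)  # should agree to rounding error
```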

Infinitesimal relative velocity

In the standard case where the primed frame is moving relative to the unprimed frame at speed ${v}$ along the ${x}$ axis, the Lorentz transformations are

$\displaystyle \begin{aligned} t^{\prime} & =\gamma\left(t-\frac{vx}{c^{2}}\right)\ \ \ \ \ (8)\\ x^{\prime} & =\gamma\left(x-vt\right)\ \ \ \ \ (9)\\ y^{\prime} & =y\ \ \ \ \ (10)\\ z^{\prime} & =z\ \ \ \ \ (11) \end{aligned}$

with

$\displaystyle \gamma\equiv\frac{1}{\sqrt{1-v^{2}/c^{2}}} \ \ \ \ \ (12)$

If ${\frac{v}{c}}$ is very small we can expand these equations to first order in ${\beta\equiv\frac{v}{c}}$. To this order

$\displaystyle \begin{aligned} \gamma & =1+\frac{\beta^{2}}{2}+\ldots\ \ \ \ \ (13)\\ & \approx1\ \ \ \ \ (14) \end{aligned}$

and

$\displaystyle \begin{aligned} ct^{\prime} & =ct-x\beta\ \ \ \ \ (15)\\ x^{\prime} & =x-ct\beta\ \ \ \ \ (16) \end{aligned}$

so

$\displaystyle \Lambda_{\;\nu}^{\mu}=\left[\begin{array}{cccc} 1 & -\beta & 0 & 0\\ -\beta & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{array}\right] \ \ \ \ \ (17)$

Lowering the first index we get

$\displaystyle \begin{aligned} \Lambda_{\mu\nu} & =g_{\mu\rho}\Lambda_{\;\nu}^{\rho}\ \ \ \ \ (18)\\ & =\left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{array}\right]\left[\begin{array}{cccc} 1 & -\beta & 0 & 0\\ -\beta & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{array}\right]\ \ \ \ \ (19)\\ & =\left[\begin{array}{cccc} 1 & -\beta & 0 & 0\\ \beta & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{array}\right]\ \ \ \ \ (20) \end{aligned}$

We can write this as the sum of ${g_{\mu\nu}}$ and an antisymmetric matrix ${\omega_{\mu\nu}=-\omega_{\nu\mu}}$:

$\displaystyle \begin{aligned} \Lambda_{\mu\nu} & =g_{\mu\nu}+\omega_{\mu\nu}\ \ \ \ \ (21)\\ & =\left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{array}\right]+\left[\begin{array}{cccc} 0 & -\beta & 0 & 0\\ \beta & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{array}\right]\ \ \ \ \ (22) \end{aligned}$
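To see how good the first-order approximation is, and that ${\omega_{\mu\nu}}$ really is antisymmetric, here is a small numerical sketch. The value ${\beta=10^{-3}}$ is an arbitrary choice of mine, and the function names are hypothetical; the exact boost is the standard one from equations 8 to 11 with ${c=1}$.

```python
import math

G = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}


def exact_boost(beta):
    """Exact Lambda^mu_nu for a boost along x (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def first_order_boost(beta):
    """Lambda^mu_nu to first order in beta, as in eq. (17)."""
    return [
        [1.0, -beta, 0.0, 0.0],
        [-beta, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def omega(beta):
    """omega_{mu nu} = g_{mu rho} Lambda^rho_nu - g_{mu nu}, using eq. (17)."""
    lam = first_order_boost(beta)
    return [
        [G[mu] * lam[mu][nu] - (G[mu] if mu == nu else 0.0) for nu in range(4)]
        for mu in range(4)
    ]


beta = 1e-3
w = omega(beta)
antisymmetric = all(w[m][n] == -w[n][m] for m in range(4) for n in range(4))

exact, approx = exact_boost(beta), first_order_boost(beta)
max_error = max(abs(exact[m][n] - approx[m][n]) for m in range(4) for n in range(4))

print("omega antisymmetric:", antisymmetric)
print("largest neglected term:", max_error, "compared with beta^2 =", beta ** 2)
```

The largest discrepancy comes from ${\gamma-1\approx\beta^{2}/2}$, consistent with the terms dropped in 14.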

# Creation and annihilation operators: commutators and anticommutators

References: Amitabha Lahiri & P. B. Pal, A First Book of Quantum Field Theory, Second Edition (Alpha Science International, 2004) – Chapter 1, Problems 1.1 – 1.2.

As a bit of background to the quantum field theoretic use of creation and annihilation operators we’ll look again at the harmonic oscillator. The creation and annihilation operators (called raising and lowering operators by Griffiths) are defined in terms of the position and momentum operators as

$\displaystyle \begin{aligned} a^{\dagger} & =\frac{1}{\sqrt{2\hbar m\omega}}\left[-ip+m\omega x\right]\ \ \ \ \ (1)\\ a & =\frac{1}{\sqrt{2\hbar m\omega}}\left[ip+m\omega x\right]\ \ \ \ \ (2) \end{aligned}$

From the commutator ${\left[x,p\right]=i\hbar}$ we can work out

$\displaystyle \begin{aligned} \left[a,a^{\dagger}\right] & =\frac{1}{2\hbar m\omega}\left(-2im\omega\left[x,p\right]\right)\ \ \ \ \ (3)\\ & =1\ \ \ \ \ (4) \end{aligned}$

The annihilation operator ${a}$ acting on the vacuum or ground state ${\left|0\right\rangle }$ gives 0, and the creation operator ${a^{\dagger}}$ produces a state ${a^{\dagger}\left|0\right\rangle =\left|1\right\rangle }$ with energy eigenvalue ${\frac{3}{2}\hbar\omega}$. Successive applications of ${a^{\dagger}}$ produce states with higher energy, where each quantum of energy is ${\hbar\omega}$.

Normalization

Given that the ground state is normalized so that ${\left\langle \left.0\right|0\right\rangle =1}$, we can find the factor required to normalize higher states so that ${\left\langle \left.n\right|n\right\rangle =1}$. Consider ${n=2}$. We have

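$\displaystyle \left|2\right\rangle =Aa^{\dagger}a^{\dagger}\left|0\right\rangle \ \ \ \ \ (5)$

where ${A}$ is to be determined. Requiring ${\left\langle \left.2\right|2\right\rangle =1}$ means that ${\left\langle 0\left|aaa^{\dagger}a^{\dagger}\right|0\right\rangle =1/A^{2}}$. We have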

$\displaystyle \begin{aligned} \left\langle 0\left|aaa^{\dagger}a^{\dagger}\right|0\right\rangle  & =\left\langle 0\left|a\left(1+a^{\dagger}a\right)a^{\dagger}\right|0\right\rangle \ \ \ \ \ (6)\\ & =\left\langle 0\left|aa^{\dagger}\right|0\right\rangle +\left\langle 0\left|aa^{\dagger}aa^{\dagger}\right|0\right\rangle \ \ \ \ \ (7)\\ & =\left\langle 0\left|\left(1+a^{\dagger}a\right)\right|0\right\rangle +\left\langle 0\left|aa^{\dagger}\left(1+a^{\dagger}a\right)\right|0\right\rangle \ \ \ \ \ (8)\\ & =\left\langle \left.0\right|0\right\rangle +\left\langle 0\left|aa^{\dagger}\right|0\right\rangle \ \ \ \ \ (9)\\ & =\left\langle \left.0\right|0\right\rangle +\left\langle 0\left|\left(1+a^{\dagger}a\right)\right|0\right\rangle \ \ \ \ \ (10)\\ & =\left\langle \left.0\right|0\right\rangle +\left\langle \left.0\right|0\right\rangle \ \ \ \ \ (11)\\ & =2\ \ \ \ \ (12)\\ & =\frac{1}{A^{2}}\ \ \ \ \ (13)\\ A & =\frac{1}{\sqrt{2}}\ \ \ \ \ (14) \end{aligned}$

For ${n=3}$ we get ${\left\langle 0\left|aaaa^{\dagger}a^{\dagger}a^{\dagger}\right|0\right\rangle }$. We need to commute each ${a}$ through the ${a^{\dagger}}$ operators to its right. The first ${a}$ will generate the factor ${\left(1+a^{\dagger}a\right)}$ 3 times as it commutes with each ${a^{\dagger}}$ operator. Each of these terms will be ${\left\langle 0\left|aaa^{\dagger}a^{\dagger}\right|0\right\rangle }$ and we already know that this term produces a factor of 2. Therefore

$\displaystyle \left\langle 0\left|aaaa^{\dagger}a^{\dagger}a^{\dagger}\right|0\right\rangle =3\times2=6 \ \ \ \ \ (15)$

We can extend this result to the general case:

$\displaystyle \left\langle 0\left|a^{n}\left(a^{\dagger}\right)^{n}\right|0\right\rangle =n! \ \ \ \ \ (16)$

The normalization must then be

$\displaystyle \left|n\right\rangle =\frac{1}{\sqrt{n!}}\left(a^{\dagger}\right)^{n}\left|0\right\rangle \ \ \ \ \ (17)$
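The general result 16 can be checked mechanically using nothing but the commutator ${aa^{\dagger}=1+a^{\dagger}a}$ and the facts that ${a\left|0\right\rangle =0}$ and ${\left\langle 0\right|a^{\dagger}=0}$. The sketch below is one way of doing this; the encoding of an operator product as a tuple of the letters `"a"` and `"c"` (for ${a^{\dagger}}$) and the function name `vev` are my own assumptions for illustration.

```python
import math


def vev(word):
    """Vacuum expectation value <0| word |0> for a product of a ('a') and a-dagger ('c').

    Uses only a a^dagger = 1 + a^dagger a, a|0> = 0 and <0|a^dagger = 0,
    with the vacuum normalized so that <0|0> = 1.
    """
    if not word:
        return 1  # <0|0>
    if word[-1] == "a" or word[0] == "c":
        return 0  # rightmost a kills |0>, or leftmost a^dagger kills <0|
    # The word starts with 'a' and ends with 'c', so some 'a' sits directly left of a 'c'.
    for i in range(len(word) - 1):
        if word[i] == "a" and word[i + 1] == "c":
            removed = word[:i] + word[i + 2:]               # the '1' term
            swapped = word[:i] + ("c", "a") + word[i + 2:]  # the 'a^dagger a' term
            return vev(removed) + vev(swapped)
    return 0  # not reached for words of the form handled above


for n in range(6):
    word = ("a",) * n + ("c",) * n
    print(n, vev(word), math.factorial(n))
```

Each call replaces one adjacent pair ${aa^{\dagger}}$ by ${1+a^{\dagger}a}$, which is exactly the step used in equations 6 to 11, so the recursion terminates once every ${a}$ has been moved to the right.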

Number operator

We’ve met the number operator ${N}$ in the field case, but there is an analogous operator for the harmonic oscillator. We have

$\displaystyle N\equiv a^{\dagger}a \ \ \ \ \ (18)$

As with the field case, we can work out its commutators:

$\displaystyle \begin{aligned} \left[N,a^{\dagger}\right] & =a^{\dagger}aa^{\dagger}-a^{\dagger}a^{\dagger}a\ \ \ \ \ (19)\\ & =a^{\dagger}a^{\dagger}a+a^{\dagger}-a^{\dagger}a^{\dagger}a\ \ \ \ \ (20)\\ & =a^{\dagger}\ \ \ \ \ (21)\\ \left[N,a\right] & =a^{\dagger}aa-aa^{\dagger}a\ \ \ \ \ (22)\\ & =a^{\dagger}aa-a-a^{\dagger}aa\ \ \ \ \ (23)\\ & =-a\ \ \ \ \ (24) \end{aligned}$

Applying this to ${\left|n\right\rangle }$ we get

$\displaystyle N\left|n\right\rangle =\frac{1}{\sqrt{n!}}N\left(a^{\dagger}\right)^{n}\left|0\right\rangle \ \ \ \ \ (25)$

We get

$\displaystyle \begin{aligned} N\left(a^{\dagger}\right)^{n} & =\left[a^{\dagger}+a^{\dagger}N\right]\left(a^{\dagger}\right)^{n-1}\ \ \ \ \ (26)\\ & =\left(a^{\dagger}\right)^{n}+\left(a^{\dagger}\right)^{2}\left(1+N\right)\left(a^{\dagger}\right)^{n-2}\ \ \ \ \ (27)\\ & =\ldots\ \ \ \ \ (28)\\ & =n\left(a^{\dagger}\right)^{n}+\left(a^{\dagger}\right)^{n}N\ \ \ \ \ (29)\\ & =n\left(a^{\dagger}\right)^{n}+\left(a^{\dagger}\right)^{n}a^{\dagger}a\ \ \ \ \ (30) \end{aligned}$

When operating on ${\left|0\right\rangle }$, the last term gives 0, so

$\displaystyle N\left|n\right\rangle =\frac{n}{\sqrt{n!}}\left(a^{\dagger}\right)^{n}\left|0\right\rangle =n\left|n\right\rangle \ \ \ \ \ (31)$
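A quick numerical check of the number operator uses the standard matrix representation ${\left\langle m\left|a\right|n\right\rangle =\sqrt{n}\,\delta_{m,n-1}}$, which follows from the normalization 17. The sketch below truncates the basis at six states (an arbitrary choice of mine), so it is only an approximation to the infinite-dimensional operators; in particular ${\left[a,a^{\dagger}\right]=1}$ fails in the last diagonal entry of a truncated matrix, although the two commutators with ${N}$ still hold exactly.

```python
import math

DIM = 6  # truncation size (assumption): keep the states |0> .. |5>


def matmul(p, q):
    """Product of two DIM x DIM matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]


def commutator(p, q):
    pq, qp = matmul(p, q), matmul(q, p)
    return [[pq[i][j] - qp[i][j] for j in range(DIM)] for i in range(DIM)]


def close(p, q, tol=1e-12):
    """True if two matrices agree entry by entry to within tol."""
    return all(abs(p[i][j] - q[i][j]) < tol for i in range(DIM) for j in range(DIM))


# <m|a|n> = sqrt(n) if m == n - 1, else 0
A = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(DIM)] for m in range(DIM)]
A_DAG = [[A[n][m] for n in range(DIM)] for m in range(DIM)]  # transpose (entries are real)
N_OP = matmul(A_DAG, A)                                      # N = a^dagger a

minus_a = [[-x for x in row] for row in A]

print("diagonal of N:", [round(N_OP[n][n], 12) for n in range(DIM)])
print("[N, a^dagger] = a^dagger:", close(commutator(N_OP, A_DAG), A_DAG))
print("[N, a] = -a:", close(commutator(N_OP, A), minus_a))
```

The diagonal of ${N}$ should list the occupation numbers of the retained states, in line with 31.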

Multiple oscillators

If we now have a system of ${N}$ non-interacting harmonic oscillators with equal masses and frequencies ${\omega_{i}}$, ${i=1,\ldots,N}$, the Hamiltonian is

$\displaystyle H=\frac{1}{2m}\sum_{i}\left(p_{i}^{2}+m^{2}\omega_{i}^{2}x_{i}^{2}\right) \ \ \ \ \ (32)$

Since the oscillators are not coupled, the creation and annihilation operators for different oscillators all commute, so that

$\displaystyle \left[a_{i},a_{j}^{\dagger}\right]=\delta_{ij} \ \ \ \ \ (33)$

so the normalized state where oscillator ${i}$ is in the ${n_{i}}$th excited state is

$\displaystyle \left|n_{1}n_{2}\ldots n_{N}\right\rangle =\prod_{i=1}^{N}\frac{\left(a_{i}^{\dagger}\right)^{n_{i}}}{\sqrt{n_{i}!}}\left|0\right\rangle \ \ \ \ \ (34)$

The number operator in this case is

$\displaystyle \mathcal{N}=\sum_{i=1}^{N}\left(a_{i}^{\dagger}a_{i}\right) \ \ \ \ \ (35)$

This works because the commutation relation 33 allows each term ${a_{i}^{\dagger}a_{i}}$ in the sum to pick out the number of quanta of oscillator ${i}$.

Anticommutators

Now suppose that instead of the commutation relations 33 we have anticommutation relations as follows:

$\displaystyle \begin{aligned} \left\{ a_{i},a_{j}^{\dagger}\right\}  & \equiv a_{i}a_{j}^{\dagger}+a_{j}^{\dagger}a_{i}=\delta_{ij}\ \ \ \ \ (36)\\ \left\{ a_{i}^{\dagger},a_{j}^{\dagger}\right\}  & =\left\{ a_{i},a_{j}\right\} =0\ \ \ \ \ (37) \end{aligned}$

If we start with the vacuum state ${\left|0\right\rangle }$ and require ${a_{i}^{\dagger}\left|0\right\rangle =\left|0\ldots1_{i}\ldots0\right\rangle }$ (that is, ${a_{i}^{\dagger}}$ creates one quantum in category ${i}$), then if we try to create another quantum in the same state, we get

$\displaystyle \begin{aligned} \left\langle 0\left|a_{i}a_{i}a_{i}^{\dagger}a_{i}^{\dagger}\right|0\right\rangle  & =\left\langle 0\left|a_{i}\left(1-a_{i}^{\dagger}a_{i}\right)a_{i}^{\dagger}\right|0\right\rangle \ \ \ \ \ (38)\\ & =\left\langle 0\left|a_{i}a_{i}^{\dagger}\right|0\right\rangle -\left\langle 0\left|a_{i}a_{i}^{\dagger}a_{i}a_{i}^{\dagger}\right|0\right\rangle \ \ \ \ \ (39)\\ & =\left\langle 0\left|a_{i}a_{i}^{\dagger}\right|0\right\rangle -\left\langle 0\left|a_{i}a_{i}^{\dagger}\left(1-a_{i}^{\dagger}a_{i}\right)\right|0\right\rangle \ \ \ \ \ (40)\\ & =\left\langle 0\left|a_{i}a_{i}^{\dagger}\right|0\right\rangle -\left\langle 0\left|a_{i}a_{i}^{\dagger}\right|0\right\rangle +\left\langle 0\left|a_{i}a_{i}^{\dagger}a_{i}^{\dagger}a_{i}\right|0\right\rangle \ \ \ \ \ (41)\\ & =0\ \ \ \ \ (42) \end{aligned}$

Thus, attempting to create two quanta in the same state produces zero, so at most one quantum can occupy each state. The commutator case 33 thus behaves like bosons and the anticommutator case like fermions.
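For a single mode, the anticommutation relations have a simple ${2\times2}$ matrix realization in the basis ${\left(\left|0\right\rangle ,\left|1\right\rangle \right)}$, which makes the exclusion of double occupancy explicit. The matrices below are a standard choice, but treat the sketch as an illustration under that assumption rather than as part of Lahiri & Pal’s argument; the helper names are mine.

```python
# Single fermionic mode in the basis (|0>, |1>); assumed 2x2 representation
A = [[0, 1],
     [0, 0]]      # a|1> = |0>, a|0> = 0
A_DAG = [[0, 0],
         [1, 0]]  # a^dagger|0> = |1>, a^dagger|1> = 0


def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]


anticommutator = add(matmul(A, A_DAG), matmul(A_DAG, A))  # {a, a^dagger}
create_twice = matmul(A_DAG, A_DAG)                       # a^dagger a^dagger
destroy_twice = matmul(A, A)                              # a a
number = matmul(A_DAG, A)                                 # a^dagger a

print("{a, a^dagger} =", anticommutator)
print("a^dagger a^dagger =", create_twice)
print("a a =", destroy_twice)
print("a^dagger a =", number)
```

Since ${a^{\dagger}a^{\dagger}}$ is the zero matrix, the state with two quanta in the same mode does not exist, and the number operator has only the eigenvalues 0 and 1.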

# Euler-Lagrange equations for particle & field theories; Lagrangian density

References: Robert D. Klauber, Student Friendly Quantum Field Theory, (Sandtrove Press, 2013) – Chapter 2, Problem 2.6.

It’s important to understand the distinction between a particle theory and a field theory. To see how this works, we’ll start by looking at classical theories of particles and fields using the Lagrangian formalism.

Classical particle theory

First, let’s revisit the Euler-Lagrange equations for a system of classical particles. Suppose we have ${N}$ particles in 3-d space, for a total of ${3N}$ degrees of freedom. We define the Lagrangian as

$\displaystyle L\equiv T\left(\dot{q}_{i}\right)-V\left(q_{i}\right) \ \ \ \ \ (1)$

where ${T}$ is the kinetic energy (that depends only on velocities ${\dot{q}_{i}}$) and ${V}$ is the potential energy (that depends only on positions ${q_{i}}$). We want to find the path followed by the system between times ${t_{1}}$ and ${t_{2}}$, that is, we want to find ${q_{i}\left(t\right)}$ between those times, subject to the constraint that ${q_{i}\left(t_{1}\right)}$ and ${q_{i}\left(t_{2}\right)}$ are fixed at some known values. In general, there is an infinite number of paths the system could take between these two times, and each path is specified by choosing the functions ${q_{i}\left(t\right)}$ (which in turn determines ${\dot{q}_{i}\left(t\right)}$). Each choice of path gives a different form for the Lagrangian.

The principle of least action states that the action ${S}$, defined as a functional of the paths that can be followed, is an extremum (in practice, almost always a minimum, hence the principle of least action). The action is defined as

$\displaystyle S\equiv\int_{t_{1}}^{t_{2}}L\;dt \ \ \ \ \ (2)$

The condition that ${S}$ be an extremum is specified by requiring ${\delta S=0}$, which means that if the Lagrangian ${L_{0}}$ gives a minimum (we’ll assume the extremum is always a minimum from here on to avoid confusion), then any slight variation of the paths that make up ${L_{0}}$ increases ${S}$. Thus the condition ${\delta S=0}$ is just an extension of the usual condition that the first derivative of an ordinary function be zero in order for that function to have a minimum.

To calculate ${\delta S}$, we need to vary the paths slightly. Using the chain rule (actually, we should justify that the chain rule works when calculating variations in functions, but we’ll trust the mathematicians on this point) we get

$\displaystyle \begin{aligned} \delta S & =\delta\left[\int_{t_{1}}^{t_{2}}L\;dt\right]\ \ \ \ \ (3)\\ & =\int_{t_{1}}^{t_{2}}\delta L\;dt\ \ \ \ \ (4)\\ & =\int_{t_{1}}^{t_{2}}\left(\frac{\partial L}{\partial q_{i}}\delta q_{i}+\frac{\partial L}{\partial\dot{q}_{i}}\delta\dot{q}_{i}\right)dt\ \ \ \ \ (5)\\ & =\int_{t_{1}}^{t_{2}}\left(\frac{\partial L}{\partial q_{i}}\delta q_{i}+\frac{\partial L}{\partial\dot{q}_{i}}\frac{d\left(\delta q_{i}\right)}{dt}\right)dt\ \ \ \ \ (6) \end{aligned}$

where the repeated index ${i}$ is summed.

We can now integrate the second term by parts to get

$\displaystyle \delta S=\int_{t_{1}}^{t_{2}}\frac{\partial L}{\partial q_{i}}\delta q_{i}dt+\left.\frac{\partial L}{\partial\dot{q}_{i}}\delta q_{i}\right|_{t_{1}}^{t_{2}}-\int_{t_{1}}^{t_{2}}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_{i}}\right)\delta q_{i}dt \ \ \ \ \ (7)$

The requirement that ${q_{i}\left(t_{1}\right)}$ and ${q_{i}\left(t_{2}\right)}$ are fixed means that ${\delta q_{i}=0}$ at the limits of integration, so the middle term is zero. We’re then left with

$\displaystyle \delta S=\int_{t_{1}}^{t_{2}}\left[\frac{\partial L}{\partial q_{i}}-\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_{i}}\right)\right]\delta q_{i}dt=0 \ \ \ \ \ (8)$

This must be true for all possible variations ${\delta q_{i}}$ so the quantity in brackets must be zero, which gives us the Euler-Lagrange equations:

$\displaystyle \frac{\partial L}{\partial q_{i}}-\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_{i}}\right)=0 \ \ \ \ \ (9)$

It’s worth digressing at this point to explain why we can take ${q_{i}}$ and ${\dot{q}_{i}}$ as independent variables. It would seem that they are not independent since once you’ve specified ${q_{i}}$ you can get ${\dot{q}_{i}}$ by just taking the derivative. The point is that when we specify the Lagrangian ${L}$, we don’t know what ${q_{i}\left(t\right)}$ is; all we have is the function ${L}$ which depends on both ${q_{i}}$ and ${\dot{q}_{i}}$. The goal of minimizing the action is to find the curves ${q_{i}\left(t\right)}$ such that these curves together with their derivatives minimize the integral of ${L\left(q_{i},\dot{q}_{i}\right)}$. The physics comes in specifying the Lagrangian; the mathematics then allows us to determine the paths ${q_{i}\left(t\right)}$ followed by the particles. In principle, we can specify ${L}$ to be any old function of ${q_{i}}$ and ${\dot{q}_{i}}$, but once we’ve done this, the form of ${L}$ is fixed and we can then solve the Euler-Lagrange equations to find the particle paths. In other words, the Euler-Lagrange equations specify the ${q_{i}}$ so that the ${q_{i}}$ together with their derivatives ${\dot{q}_{i}}$ minimize the action ${S}$.

In an alternative universe, we could conceive of a Lagrangian that depended on ${q_{i}}$, ${\dot{q}_{i}}$ and ${\ddot{q}_{i}}$, say. In that case all of ${q_{i}}$, ${\dot{q}_{i}}$ and ${\ddot{q}_{i}}$ would be independent variables in the derivation above, and we’d end up with a different form of the Euler-Lagrange equations. The fact that physical Lagrangians depend only on ${q_{i}}$ and ${\dot{q}_{i}}$ is a consequence of Newton’s second law ${F=ma}$, since this allows only the positions and velocities to be specified as independent variables.

All this is fine, except how do we know that these equations, when solved for ${q_{i}\left(t\right)}$, actually do give the path followed by the system? The key is to look back at the definition of ${L}$ in 1. Then

$\displaystyle \begin{aligned} \frac{\partial L}{\partial q_{i}} & =-\frac{\partial V}{\partial q_{i}}\ \ \ \ \ (10)\\ \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_{i}}\right) & =\frac{d}{dt}\left(\frac{\partial T}{\partial\dot{q}_{i}}\right)\ \ \ \ \ (11)\\ & =\frac{d}{dt}\left(\frac{\partial}{\partial\dot{q}_{i}}\sum_{j}\frac{1}{2}m_{j}\dot{q}_{j}^{2}\right)\ \ \ \ \ (12)\\ & =\frac{d}{dt}m_{i}\dot{q}_{i}\ \ \ \ \ (13)\\ & =\dot{p}_{i}\ \ \ \ \ (14) \end{aligned}$

where ${p_{i}}$ is the momentum of degree of freedom ${i}$. Therefore, the Euler-Lagrange equations are equivalent to

$\displaystyle \dot{p}_{i}=-\frac{\partial V}{\partial q_{i}}=F_{i} \ \ \ \ \ (15)$

where ${F_{i}}$ is the force acting on degree of freedom ${i}$. This is just Newton’s second law, so the Euler-Lagrange formulation is indeed equivalent to Newton’s laws.
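To see the principle of least action at work numerically, we can take a single harmonic oscillator with ${L=\frac{1}{2}m\dot{q}^{2}-\frac{1}{2}kq^{2}}$, discretize the action 2, and compare the path that solves Newton’s law with nearby paths that share the same endpoints. Everything specific here is an assumption chosen for illustration: the values ${m=k=1}$, the time interval ${\left[0,1\right]}$ (shorter than half a period, so the extremum is a genuine minimum), the sine-shaped perturbation and the simple midpoint discretization.

```python
import math

M, K = 1.0, 1.0             # mass and spring constant (assumed values)
T_END, STEPS = 1.0, 2000    # time interval [0, T_END] and number of time steps


def action(path):
    """Discretized S = sum of (T - V) dt, using midpoint positions on each step."""
    dt = T_END / STEPS
    total = 0.0
    for k in range(STEPS):
        q0, q1 = path(k * dt), path((k + 1) * dt)
        velocity = (q1 - q0) / dt
        midpoint = 0.5 * (q0 + q1)
        total += (0.5 * M * velocity ** 2 - 0.5 * K * midpoint ** 2) * dt
    return total


def classical(t):
    """Solution of m q'' = -k q with q(0) = 0 and unit amplitude."""
    return math.sin(math.sqrt(K / M) * t)


def perturbed(eps):
    """Classical path plus a bump that vanishes at t = 0 and t = T_END."""
    return lambda t: classical(t) + eps * math.sin(math.pi * t / T_END)


s_classical = action(classical)
print("classical action:", s_classical)
for eps in (-0.1, -0.01, 0.01, 0.1):
    print(eps, action(perturbed(eps)) - s_classical)
```

The differences printed in the loop should all be positive and should scale roughly as the square of the perturbation size, which is what ${\delta S=0}$ at a minimum implies: the first-order change vanishes and only the second-order change survives.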

Classical field theory

The main difference between particle theory and field theory is that the variables ${q_{i}}$ no longer describe the motion of anything, that is, they are no longer functions of time. Rather, they become fixed labels for points in space. The position variables ${q_{i}}$ become independent variables in the same way that the time ${t}$ is independent. Taken together, they label points in spacetime.

A field is some quantity that has a value for each point in spacetime, and it is this quantity that can change as we move from place to place or forward in time. For a scalar field such as temperature or density, the field consists of a single value ${\phi\left(q^{\mu}\right)}$ attached to each point in spacetime, where we now use the notation ${q^{\mu}}$ to represent the space components together with time. (That is, ${q^{\mu}}$ is a four-vector in special relativity, with ${q^{0}=t}$, ${q^{1}=x}$ and so on.) A vector field, such as the electric field ${\mathbf{E}}$, is actually composed of three separate fields, one for each spatial coordinate. Each of these fields again has a single value for each point ${q^{\mu}}$.

To work out the Euler-Lagrange equations for classical field theory, we need to think about what is meant by a ‘path’ that the system follows. Because the spacetime coordinates ${q^{\mu}}$ are no longer dynamical variables, it doesn’t make sense to ask how ${q^{\mu}}$ changes with time. What does change is the value of the field ${\phi}$ (or ${\phi^{r}}$ if we have more than one field, as with the electric field, in which case the index ${r}$ ranges over all the fields), so it is the field ${\phi}$ that is the dynamical variable. As such, the path followed is determined by a function of the field values. By analogy with the Lagrangian in the particle case, we define the Lagrangian density ${\mathcal{L}\left(\phi^{r},\phi_{,\mu}^{r},q^{\mu}\right)}$. The notation ${\phi_{,\mu}^{r}}$ is defined as

$\displaystyle \phi_{,\mu}^{r}\equiv\frac{\partial\phi^{r}}{\partial q^{\mu}} \ \ \ \ \ (16)$

The Lagrangian density is the Lagrangian per unit volume, and each infinitesimal volume element ${d^{3}x=dq^{1}dq^{2}dq^{3}}$ follows a path through time, so the action element of this volume element between times ${t_{1}}$ and ${t_{2}}$ is

$\displaystyle dS=\int_{t_{1}}^{t_{2}}\mathcal{L}\left(\phi^{r},\phi_{,\mu}^{r},q^{\mu}\right)dt \ \ \ \ \ (17)$

The total action of the entire system is the integral of this over some spacetime volume ${\Omega}$ that encloses the entire system spatially during the time interval, so

$\displaystyle S=\int_{\Omega}\mathcal{L}\left(\phi^{r},\phi_{,\mu}^{r},q^{\mu}\right)d^{4}q \ \ \ \ \ (18)$

The idea now is to apply the calculus of variations to this integral and require ${\delta S=0}$ as in the particle case. Remember that we’re varying the fields at each spacetime point and not the coordinates ${q^{\mu}}$. Therefore (I’ll drop the superscript ${r}$ to avoid confusion with the summation convention, so the following should be taken to apply to each field ${\phi^{r}}$ separately. A summation over ${\mu}$ is implied):

$\displaystyle \delta S=\int_{\Omega}\left[\frac{\partial\mathcal{L}}{\partial\phi}\delta\phi+\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\delta\phi_{,\mu}\right]d^{4}q \ \ \ \ \ (19)$

To work out the second term, we write out the derivative explicitly:

$\displaystyle \begin{aligned} \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\delta\phi_{,\mu} & =\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\frac{\partial\left(\delta\phi\right)}{\partial q^{\mu}}\ \ \ \ \ (20)\\ & =\frac{\partial}{\partial q^{\mu}}\left[\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\delta\phi\right]-\frac{\partial}{\partial q^{\mu}}\left[\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\right]\delta\phi\ \ \ \ \ (21) \end{aligned}$

where the last line follows from the product rule. We therefore get

$\displaystyle \delta S=\int_{\Omega}\left[\frac{\partial\mathcal{L}}{\partial\phi}-\frac{\partial}{\partial q^{\mu}}\left(\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\right)\right]\delta\phi d^{4}q+\int_{\Omega}\frac{\partial}{\partial q^{\mu}}\left[\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\delta\phi\right]d^{4}q \ \ \ \ \ (22)$

The last term is the integral of a 4-d divergence over a 4-d volume and (trusting the mathematicians again) we can use a 4-d analog of Gauss’s theorem to convert this to a surface integral over a 3-d surface ${\Sigma}$ that bounds ${\Omega}$. Making the usual assumption that this surface can be removed to infinity and that our system is finite so that ${\mathcal{L}\rightarrow0}$ at infinity, this integral goes to zero. We’re left with

$\displaystyle \delta S=\int_{\Omega}\left[\frac{\partial\mathcal{L}}{\partial\phi}-\frac{\partial}{\partial q^{\mu}}\left(\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\right)\right]\delta\phi d^{4}q=0 \ \ \ \ \ (23)$

The requirement that this is valid for all variations ${\delta\phi}$ in the field gives us the field theory version of the Euler-Lagrange equations (where I’ve restored the index ${r}$ indicating which field we’re talking about; note that ${\mu}$ is still summed):

$\displaystyle \frac{\partial\mathcal{L}}{\partial\phi^{r}}-\frac{\partial}{\partial q^{\mu}}\left(\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}^{r}}\right)=0 \ \ \ \ \ (24)$

Example Suppose we have a classical, non-relativistic field of dust particles. The dust is sparse enough that there is no appreciable inter-particle interaction and there is no external force (such as gravity). Thus the potential energy density is ${\mathcal{V}\left(q^{\mu}\right)=0}$. Further, suppose that the particle mass density is ${\rho\left(q^{\mu}\right)}$ and is constant in time.

Suppose the dust particles move about their initial positions. We can describe this motion as a displacement field ${\phi^{r}\left(q^{\mu}\right)}$, where ${r=1,2,3}$ describes the direction of displacement. Note that ${q^{\mu}}$ still describes a fixed point in spacetime, while ${\phi^{r}}$ can vary with space and time. The fields ${\phi^{r}}$ have the dimensions of length, since they measure the displacement of a particle from its initial position.

The kinetic energy density ${\mathcal{T}}$ can be described in terms of ${\phi^{r}}$ as

$\displaystyle \mathcal{T}=\frac{1}{2}\rho\dot{\phi}^{r}\dot{\phi}_{r} \ \ \ \ \ (25)$

where here there is a sum over ${r}$, since we’re adding up the kinetic energy contributions from the three spatial directions. A dot indicates a derivative with respect to time, as usual.

Taking

$\displaystyle \mathcal{L}=\mathcal{T}-\mathcal{V}=\mathcal{T}=\frac{1}{2}\rho\dot{\phi}^{r}\dot{\phi}_{r} \ \ \ \ \ (26)$

the Euler-Lagrange equations 24 give us

 $\displaystyle \frac{\partial\mathcal{L}}{\partial\phi^{r}}$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (27)$ $\displaystyle \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}^{r}}$ $\displaystyle =$ $\displaystyle \rho\dot{\phi}^{r}\delta_{\mu t}\ \ \ \ \ (28)$ $\displaystyle -\frac{\partial}{\partial q^{\mu}}\left(\frac{\partial\mathcal{L}}{\partial\phi_{,\mu}^{r}}\right)$ $\displaystyle =$ $\displaystyle -\frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\phi_{,t}^{r}}\right)\ \ \ \ \ (29)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\rho\ddot{\phi}^{r}\ \ \ \ \ (30)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0 \ \ \ \ \ (31)$

So

$\displaystyle \rho\ddot{\phi}^{r}=0 \ \ \ \ \ (32)$

In other words, the acceleration of the field (and hence of the dust particles) is zero, so they move with constant velocity. This is just an expression of Newton’s law ${F=\frac{dp}{dt}}$ applied to a continuous medium.

# Hamilton’s equations and Poisson brackets

References: Robert D. Klauber, Student Friendly Quantum Field Theory, (Sandtrove Press, 2013) – Chapter 2.

Because Klauber’s approach to QFT depends on generalizing classical mechanics, it’s worth seeing how the basic equations of classical mechanics are derived. We’ve already seen that the Euler-Lagrange equation is derived from the principle of least action using the calculus of variations. The equation for a single particle in one dimension is

$\displaystyle \frac{\partial L}{\partial q}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}}=0 \ \ \ \ \ (1)$

where ${L}$ is the Lagrangian

$\displaystyle L=L\left(q,\dot{q}\right)=T-V \ \ \ \ \ (2)$

We can generalize this to a system with ${d}$ degrees of freedom (the number of degrees of freedom is the number of particles multiplied by the number of dimensions) as

$\displaystyle \frac{\partial L}{\partial q_{i}}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{i}}=0 \ \ \ \ \ (3)$

where ${i=1,\ldots,d}$. Solving this system of differential equations gives the particle trajectories as functions of time.

The Euler-Lagrange equations can be put into a different form by means of a Legendre transformation as follows. We define the conjugate momenta as

$\displaystyle p_{k}\equiv\frac{\partial L}{\partial\dot{q}_{k}} \ \ \ \ \ (4)$

We also define the Hamiltonian

$\displaystyle H\equiv\sum_{k}p_{k}\dot{q}_{k}-L \ \ \ \ \ (5)$

Taking derivatives of ${H}$ gives (treating ${p_{k}}$ and ${q_{k}}$ as the independent variables):

 $\displaystyle \frac{\partial H}{\partial p_{i}}$ $\displaystyle =$ $\displaystyle \dot{q}_{i}+\sum_{k}p_{k}\frac{\partial\dot{q}_{k}}{\partial p_{i}}-\sum_{k}\frac{\partial L}{\partial\dot{q}_{k}}\frac{\partial\dot{q}_{k}}{\partial p_{i}}\ \ \ \ \ (6)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \dot{q}_{i} \ \ \ \ \ (7)$

where we used 4 to cancel off the last two sums.

Similarly, we get

 $\displaystyle \frac{\partial H}{\partial q_{i}}$ $\displaystyle =$ $\displaystyle \sum_{k}p_{k}\frac{\partial\dot{q}_{k}}{\partial q_{i}}-\frac{\partial L}{\partial q_{i}}-\sum_{k}\frac{\partial L}{\partial\dot{q}_{k}}\frac{\partial\dot{q}_{k}}{\partial q_{i}}\ \ \ \ \ (8)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\frac{\partial L}{\partial q_{i}} \ \ \ \ \ (9)$

Comparing this with 4 and 3 we see that

 $\displaystyle \frac{\partial L}{\partial q_{i}}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{i}}$ $\displaystyle =$ $\displaystyle -\frac{\partial H}{\partial q_{i}}-\frac{d}{dt}p_{i}\ \ \ \ \ (10)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (11)$ $\displaystyle \dot{p}_{i}$ $\displaystyle =$ $\displaystyle -\frac{\partial H}{\partial q_{i}} \ \ \ \ \ (12)$

We thus get Hamilton’s equations which are equivalent to the Euler-Lagrange equations:

 $\displaystyle \dot{p}_{i}$ $\displaystyle =$ $\displaystyle -\frac{\partial H}{\partial q_{i}}\ \ \ \ \ (13)$ $\displaystyle \dot{q}_{i}$ $\displaystyle =$ $\displaystyle \frac{\partial H}{\partial p_{i}} \ \ \ \ \ (14)$
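As a quick numerical illustration of equations 13 and 14, here is a minimal sketch that integrates Hamilton’s equations for a one-dimensional harmonic oscillator. The Hamiltonian ${H=p^{2}/2m+kq^{2}/2}$ with ${m=k=1}$, the function name `leapfrog` and the step size are all my own assumptions for the example, not anything from Klauber. The leapfrog scheme used here is just one common choice, and it works as written only when ${\partial H/\partial q}$ depends on ${q}$ alone and ${\partial H/\partial p}$ on ${p}$ alone.

```python
import math

def leapfrog(q, p, dH_dq, dH_dp, dt, steps):
    """Integrate q' = dH/dp, p' = -dH/dq with a kick-drift-kick leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q)   # half step of p' = -dH/dq
        q += dt * dH_dp(p)         # full step of q' = +dH/dp
        p -= 0.5 * dt * dH_dq(q)   # second half step of p
    return q, p

# Assumed example system: H = p^2/(2m) + k q^2/2 with m = k = 1
m, k = 1.0, 1.0

def H(q, p):
    return p * p / (2 * m) + 0.5 * k * q * q

def dH_dq(q):
    return k * q

def dH_dp(p):
    return p / m

q0, p0 = 1.0, 0.0          # start at rest, displaced by one unit
dt, steps = 0.001, 1000    # integrate up to t = 1
q1, p1 = leapfrog(q0, p0, dH_dq, dH_dp, dt, steps)

# For this system the exact solution is q = cos(t), p = -sin(t)
print(q1, math.cos(1.0))
print(p1, -math.sin(1.0))
print(H(q1, p1) - H(q0, p0))   # energy drift, which should be tiny
```

The same function should work for any Hamiltonian of the form ${T\left(p\right)+V\left(q\right)}$ by swapping in different derivative functions.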

For a general function ${u\left(q_{i},p_{i},t\right)}$ of the generalized coordinates ${q_{i}}$, conjugate momenta ${p_{i}}$ and time ${t}$, its time derivative is

 $\displaystyle \frac{du}{dt}$ $\displaystyle =$ $\displaystyle \sum_{k}\frac{\partial u}{\partial q_{k}}\dot{q}_{k}+\sum_{k}\frac{\partial u}{\partial p_{k}}\dot{p}_{k}+\frac{\partial u}{\partial t}\ \ \ \ \ (15)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \sum_{k}\left(\frac{\partial u}{\partial q_{k}}\frac{\partial H}{\partial p_{k}}-\frac{\partial u}{\partial p_{k}}\frac{\partial H}{\partial q_{k}}\right)+\frac{\partial u}{\partial t} \ \ \ \ \ (16)$

The sum in the last line is called the Poisson bracket and written as

$\displaystyle \left\{ u,H\right\} \equiv\sum_{k}\left(\frac{\partial u}{\partial q_{k}}\frac{\partial H}{\partial p_{k}}-\frac{\partial u}{\partial p_{k}}\frac{\partial H}{\partial q_{k}}\right) \ \ \ \ \ (17)$

Hamilton’s equations can be written in terms of Poisson brackets (remember that the independent variables are ${p_{i}}$ and ${q_{i}}$):

 $\displaystyle \dot{p}_{i}$ $\displaystyle =$ $\displaystyle \left\{ p_{i},H\right\} \ \ \ \ \ (18)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \sum_{k}\left(\frac{\partial p_{i}}{\partial q_{k}}\frac{\partial H}{\partial p_{k}}-\frac{\partial p_{i}}{\partial p_{k}}\frac{\partial H}{\partial q_{k}}\right)\ \ \ \ \ (19)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\frac{\partial H}{\partial q_{i}}\ \ \ \ \ (20)$ $\displaystyle \dot{q}_{i}$ $\displaystyle =$ $\displaystyle \left\{ q_{i},H\right\} \ \ \ \ \ (21)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \sum_{k}\left(\frac{\partial q_{i}}{\partial q_{k}}\frac{\partial H}{\partial p_{k}}-\frac{\partial q_{i}}{\partial p_{k}}\frac{\partial H}{\partial q_{k}}\right)\ \ \ \ \ (22)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \frac{\partial H}{\partial p_{i}} \ \ \ \ \ (23)$

Finally, we’ll have a look at the Poisson brackets for conjugate variables:

 $\displaystyle \left\{ q_{i},p_{j}\right\}$ $\displaystyle =$ $\displaystyle \sum_{k}\left(\frac{\partial q_{i}}{\partial q_{k}}\frac{\partial p_{j}}{\partial p_{k}}-\frac{\partial q_{i}}{\partial p_{k}}\frac{\partial p_{j}}{\partial q_{k}}\right)\ \ \ \ \ (24)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \delta_{ij}\ \ \ \ \ (25)$ $\displaystyle \left\{ q_{i},q_{j}\right\}$ $\displaystyle =$ $\displaystyle \sum_{k}\left(\frac{\partial q_{i}}{\partial q_{k}}\frac{\partial q_{j}}{\partial p_{k}}-\frac{\partial q_{i}}{\partial p_{k}}\frac{\partial q_{j}}{\partial q_{k}}\right)\ \ \ \ \ (26)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (27)$ $\displaystyle \left\{ p_{i},p_{j}\right\}$ $\displaystyle =$ $\displaystyle \sum_{k}\left(\frac{\partial p_{i}}{\partial q_{k}}\frac{\partial p_{j}}{\partial p_{k}}-\frac{\partial p_{i}}{\partial p_{k}}\frac{\partial p_{j}}{\partial q_{k}}\right)\ \ \ \ \ (28)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0 \ \ \ \ \ (29)$

The Poisson brackets for position ${q_{i}}$ and momentum ${p_{i}}$ bear an uncanny resemblance to the commutators of position and momentum in quantum mechanics (written here in units where ${\hbar=1}$):

 $\displaystyle \left[x_{i},p_{j}\right]$ $\displaystyle =$ $\displaystyle i\delta_{ij}\ \ \ \ \ (30)$ $\displaystyle \left[x_{i},x_{j}\right]$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (31)$ $\displaystyle \left[p_{i},p_{j}\right]$ $\displaystyle =$ $\displaystyle 0 \ \ \ \ \ (32)$

In fact, the correspondence between Poisson brackets and commutators was used in the development of quantum mechanics as a guide to quantizing classical theory.
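The classical brackets 24 to 29, and Hamilton’s equations in the form 18 and 21, can be checked numerically. The following is only a sketch: it approximates the partial derivatives in definition 17 by central differences, and the helper names (`partial`, `poisson_bracket`, `coord_q`, `coord_p`), the sample point and the two-oscillator Hamiltonian are all assumptions I’ve made for the example.

```python
def partial(f, q, p, which, k, h=1e-5):
    """Central-difference derivative of f(q, p) with respect to q[k] ('q') or p[k] ('p')."""
    q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
    if which == "q":
        q_plus[k] += h
        q_minus[k] -= h
    else:
        p_plus[k] += h
        p_minus[k] -= h
    return (f(q_plus, p_plus) - f(q_minus, p_minus)) / (2 * h)

def poisson_bracket(u, v, q, p):
    """{u, v} = sum_k (du/dq_k dv/dp_k - du/dp_k dv/dq_k), evaluated at the point (q, p)."""
    return sum(
        partial(u, q, p, "q", k) * partial(v, q, p, "p", k)
        - partial(u, q, p, "p", k) * partial(v, q, p, "q", k)
        for k in range(len(q))
    )

def coord_q(i):
    """The function that simply returns the coordinate q_i."""
    return lambda q, p: q[i]

def coord_p(i):
    """The function that simply returns the momentum p_i."""
    return lambda q, p: p[i]

def hamiltonian(q, p):
    # Assumed example: two uncoupled oscillators with unit mass and frequency
    return sum(0.5 * pk * pk + 0.5 * qk * qk for qk, pk in zip(q, p))

# An arbitrary point in a 2 degree-of-freedom phase space
q_point, p_point = [0.3, -1.2], [0.7, 2.0]

print(poisson_bracket(coord_q(0), coord_p(0), q_point, p_point))   # {q_0, p_0}
print(poisson_bracket(coord_q(0), coord_p(1), q_point, p_point))   # {q_0, p_1}
print(poisson_bracket(coord_q(0), hamiltonian, q_point, p_point))  # {q_0, H}
print(poisson_bracket(coord_p(0), hamiltonian, q_point, p_point))  # {p_0, H}
```

Because the functions involved are at most quadratic, the central differences are exact up to rounding, so the results should agree with ${\delta_{ij}}$, ${\partial H/\partial p_{i}}$ and ${-\partial H/\partial q_{i}}$ to high accuracy.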

# Natural units

References: Robert D. Klauber, Student Friendly Quantum Field Theory, (Sandtrove Press, 2013) – Chapter 2, Problems 2.1-2.2.

In quantum field theory, the quantities ${c}$ (speed of light) and ${\hbar}$ (Planck’s constant ${h}$ divided by ${2\pi}$) occur frequently. Both of these quantities are (or at least are believed to be) absolute fundamental constants of nature, so expressing their values in terms of arbitrary units such as those in the MKS or CGS systems seems rather artificial and unnecessary. It is common practice in QFT, therefore, to take both ${c}$ and ${\hbar}$ as basic units by which everything else is measured. As such, we set ${c=1}$ (no units) and ${\hbar=1}$ (also no units). We’ve already seen that taking ${c=1}$ is common practice when using relativity theory, but there we weren’t concerned with quantum mechanics so no mention was made of ${\hbar}$. Here we’ll explore the consequences of setting ${c=\hbar=1}$, a system known as natural units.

To work out what this choice means, we need to relate these units to those with which we’re more familiar. Klauber treats the CGS system, but since I’ve used MKS for most of my posts, we’ll relate things back to that instead.

In MKS, ${c}$ is a velocity so it has dimensions of ${\mbox{length}\div\mbox{time}}$. Making ${c}$ dimensionless means that all velocities are dimensionless, so that the unit of length is also the unit of time.

What about ${\hbar}$? In MKS, its units are those of action, or ${\mbox{energy}\times\mbox{time}=\mbox{mass}\times\left(\mbox{length}\right)^{2}\div\mbox{time}}$. In natural units, the units of length and time are the same, so ${\hbar}$ has units of ${\mbox{mass}\times\mbox{length}}$. Making this dimensionless means that the dimension of length (and thus also time) is the inverse of the dimension of mass. We’ve thus managed to reduce the three distinct units (length, mass and time) in the MKS system to a single unit (mass). We therefore need some basic unit of mass. There are various ways we could choose the mass unit, but the most commonly used unit in QFT is the MeV (mega-electron-volt), which is the energy an electron gains by being accelerated through a potential difference of ${10^{6}}$ volts. This isn’t technically a ‘natural’ unit, since although the energy is expressed in terms of a fundamental constant (the charge on the electron), it also uses a unit (the volt) that is derived from the MKS system of units. However, it’s what’s in common use.

From these definitions, it’s possible to work out the units of any physical quantity entirely in terms of powers of mass. Klauber’s Wholeness Chart 2-1 shows many of these quantities. Energy, mass and acceleration all have units of mass, while length and time have units of inverse mass. Area has units of ${\left(\mbox{mass}\right)^{-2}}$ and volume of ${\left(\mbox{mass}\right)^{-3}}$.

To convert from natural units to so-called hybrid units, in which length and time have units taken from either the MKS or CGS systems, but mass is still given in terms of MeV, we first write out ${c}$ and ${\hbar}$ in hybrid units (here I’m using MKS):

 $\displaystyle c$ $\displaystyle =$ $\displaystyle 2.998\times10^{8}\mbox{ m s}^{-1}\ \ \ \ \ (1)$ $\displaystyle \hbar$ $\displaystyle =$ $\displaystyle 6.58\times10^{-22}\mbox{ MeV s}\ \ \ \ \ (2)$ $\displaystyle \hbar c$ $\displaystyle =$ $\displaystyle 1.973\times10^{-13}\mbox{ MeV m} \ \ \ \ \ (3)$

Next, we multiply the quantity in natural units by factors of ${c}$ and/or ${\hbar}$ to make the units come out right in the hybrid system. Note that in the hybrid system, energy (in MeV) is still a fundamental unit, so that mass is expressed in units of ${\mbox{energy}\div c^{2}=\mbox{MeV s}^{2}\mbox{m}^{-2}}$.

For example, in natural units, length has dimensions of ${\mbox{MeV}^{-1}}$, so to get a length in metres, we multiply it by ${\hbar c}$. Time also has dimensions of ${\mbox{MeV}^{-1}}$ so to get a time in seconds, multiply it by ${\hbar}$. Force has units of ${\mbox{energy}\div\mbox{length}}$ (which is ${\mbox{MeV}^{2}}$ in natural units) and energy is the same in natural and hybrid units, so to get force as ${\mbox{MeV m}^{-1}}$ we divide it by ${\hbar c}$. And so on.

To make the final conversion to MKS units, we need the conversion

$\displaystyle 1\mbox{ MeV}=1.60218\times10^{-13}\mbox{ J} \ \ \ \ \ (4)$

Thus any hybrid quantity containing a power of MeV gets multiplied by the same power of this conversion factor to get the final result in MKS.
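The conversion recipe above can be written as a few lines of Python. This is only a sketch: the constants are the rounded values quoted in equations 2 to 4, the function names are made up for the example, and the electron mass of 0.511 MeV is an assumed input that doesn’t appear in the text above.

```python
HBAR_MEV_S = 6.58e-22        # hbar in MeV s (equation 2)
HBAR_C_MEV_M = 1.973e-13     # hbar*c in MeV m (equation 3)
MEV_TO_JOULE = 1.60218e-13   # equation 4

def length_to_metres(length_in_inverse_mev):
    """A length in natural units (MeV^-1) becomes metres on multiplying by hbar*c."""
    return length_in_inverse_mev * HBAR_C_MEV_M

def time_to_seconds(time_in_inverse_mev):
    """A time in natural units (MeV^-1) becomes seconds on multiplying by hbar."""
    return time_in_inverse_mev * HBAR_MEV_S

def force_to_mev_per_metre(force_in_mev_squared):
    """A force in natural units (MeV^2) becomes MeV/m on dividing by hbar*c."""
    return force_in_mev_squared / HBAR_C_MEV_M

def energy_to_joules(energy_in_mev):
    """The final hybrid-to-MKS step for one power of MeV."""
    return energy_in_mev * MEV_TO_JOULE

# Assumed example: the inverse of the electron mass (0.511 MeV), read as a length and as a time
electron_mass_mev = 0.511
length_m = length_to_metres(1.0 / electron_mass_mev)
time_s = time_to_seconds(1.0 / electron_mass_mev)
rest_energy_j = energy_to_joules(electron_mass_mev)

print(length_m)        # about 3.86e-13 m, the reduced Compton wavelength of the electron
print(time_s)          # about 1.29e-21 s
print(rest_energy_j)   # about 8.19e-14 J
```

Quantities with other powers of mass follow the same pattern, with the corresponding powers of ${\hbar}$ and ${c}$.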

We can in fact define a system of units based on any appropriate set of physical constants that we like. The MKS system uses the metre (ultimately based on the size of the Earth; one early definition was that 1 metre is ${10^{-7}}$ times the distance from the north pole to the equator), the kilogram (the mass of 1 litre of water at ${4^{\circ}\mbox{ C}}$, which might sound fundamental, but the litre, of course, is defined from the metre, so again, this unit depends on properties of the Earth), and the second (a unit of time ultimately based on the Earth’s rotation period). From the point of view of fundamental physics, all three of these units are arbitrary as none of them is based on any fundamental constant of nature.

We could, for example, define a system of units in which one of the ‘fundamental’ units is the speed of sound. We would need to define precisely how the speed of sound is to be measured, however, since it depends on the substance transmitting the sound. In general, sound travels faster through stiffer materials, which is why it is usually faster in solids and liquids than in gases. Suppose we define the material in which the speed is to be measured (and its temperature and pressure), and set this speed to be ${s=1}$ (dimensionless). As with ${c=1}$ above, setting a velocity to be a dimensionless quantity implies that length and time have the same units. If we retained the second as the unit of time, then length is also measured in seconds.

As another example, the fine structure constant that turns up in the spectrum of hydrogen is (in MKS):

$\displaystyle \alpha=\frac{e^{2}}{4\pi\epsilon_{0}\hbar c}=\frac{1}{137.036} \ \ \ \ \ (5)$

To convert this to natural units, we need to know how to handle electric charge. There are two commonly used ways of doing this. In strict CGS or Gaussian units, the factor ${4\pi\epsilon_{0}}$ is defined to be 1 and dimensionless, which makes Coulomb’s law take the form

$\displaystyle F=\frac{q_{1}q_{2}}{r^{2}} \ \ \ \ \ (6)$

for the force ${F}$ between two charges separated by a distance ${r}$. This means that the units of charge can actually be expressed in terms of length, mass and time, since

 $\displaystyle \frac{\mbox{mass}\times\mbox{length}}{\left(\mbox{time}\right)^{2}}$ $\displaystyle =$ $\displaystyle \frac{\left(\mbox{charge}\right)^{2}}{\left(\mbox{length}\right)^{2}}\ \ \ \ \ (7)$ $\displaystyle \mbox{charge}$ $\displaystyle =$ $\displaystyle \sqrt{\frac{\mbox{mass}\times\left(\mbox{length}\right)^{3}}{\left(\mbox{time}\right)^{2}}} \ \ \ \ \ (8)$

However, another system defines just ${\epsilon_{0}}$ on its own (without the ${4\pi}$) to be 1 and dimensionless. The units of charge come out the same in terms of mass, length and time, but the numerical values are, of course, different. This latter system is the more common in QFT, so in those units

$\displaystyle \alpha=\frac{e^{2}}{4\pi\hbar c}=\frac{1}{137.036} \ \ \ \ \ (9)$

In natural units, this becomes

$\displaystyle \alpha=\frac{e^{2}}{4\pi}=\frac{1}{137.036} \ \ \ \ \ (10)$

which gives a value for the electron charge of

$\displaystyle e=\sqrt{\frac{4\pi}{137.036}}=0.3028\mbox{ (dimensionless)} \ \ \ \ \ (11)$
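The arithmetic in equation 11 is easy to check, and it’s also a handy way of seeing how much the numerical value of ${e}$ depends on the convention for ${\epsilon_{0}}$. In this sketch the variable names are my own, and the second value simply applies the ${4\pi\epsilon_{0}=1}$ convention described above, for which ${\alpha=e^{2}}$ in natural units.

```python
import math

ALPHA = 1 / 137.036   # fine structure constant, as quoted in equation 5

# Convention with epsilon_0 = 1: alpha = e^2 / (4 pi) in natural units
e_epsilon0_is_one = math.sqrt(4 * math.pi * ALPHA)

# Convention with 4 pi epsilon_0 = 1: alpha = e^2 in natural units
e_four_pi_epsilon0_is_one = math.sqrt(ALPHA)

print(e_epsilon0_is_one)           # about 0.3028, as in equation 11
print(e_four_pi_epsilon0_is_one)   # about 0.0854
print(e_epsilon0_is_one / e_four_pi_epsilon0_is_one)   # the ratio is sqrt(4 pi)
```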

# Number operator

References: Mark Srednicki, Quantum Field Theory, (Cambridge University Press, 2007) – Chapter 1, Problem 1.3.

The number operator is defined as

$\displaystyle N\equiv\int d^{3}x\;a^{\dagger}\left(\mathbf{x}\right)a\left(\mathbf{x}\right) \ \ \ \ \ (1)$

Applied to a quantum state, it counts the number of particles in that state:

$\displaystyle Na^{\dagger}\left(\mathbf{x}_{1}\right)\ldots a^{\dagger}\left(\mathbf{x}_{n}\right)\left|0\right\rangle =na^{\dagger}\left(\mathbf{x}_{1}\right)\ldots a^{\dagger}\left(\mathbf{x}_{n}\right)\left|0\right\rangle \ \ \ \ \ (2)$

Another property of ${N}$ is that it commutes with any other operator that contains an equal number of creation and annihilation operators. To see this, look at the individual commutators as follows (where ${a_{i}\equiv a\left(\mathbf{x}_{i}\right)}$).

 $\displaystyle \left[N,a_{i}^{\dagger}\right]$ $\displaystyle =$ $\displaystyle \int d^{3}x\;\left(a^{\dagger}\left(\mathbf{x}\right)a\left(\mathbf{x}\right)a_{i}^{\dagger}-a_{i}^{\dagger}a^{\dagger}\left(\mathbf{x}\right)a\left(\mathbf{x}\right)\right)\ \ \ \ \ (3)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \int d^{3}x\;\left[a^{\dagger}\left(\mathbf{x}\right)\left(\delta^{3}\left(\mathbf{x}-\mathbf{x}_{i}\right)+a_{i}^{\dagger}a\left(\mathbf{x}\right)\right)-a_{i}^{\dagger}a^{\dagger}\left(\mathbf{x}\right)a\left(\mathbf{x}\right)\right]\ \ \ \ \ (4)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \int d^{3}x\;a^{\dagger}\left(\mathbf{x}\right)\delta^{3}\left(\mathbf{x}-\mathbf{x}_{i}\right)\ \ \ \ \ (5)$ $\displaystyle$ $\displaystyle =$ $\displaystyle a_{i}^{\dagger}\ \ \ \ \ (6)$ $\displaystyle \left[N,a_{i}\right]$ $\displaystyle =$ $\displaystyle \int d^{3}x\;\left(a^{\dagger}\left(\mathbf{x}\right)a\left(\mathbf{x}\right)a_{i}-a_{i}a^{\dagger}\left(\mathbf{x}\right)a\left(\mathbf{x}\right)\right)\ \ \ \ \ (7)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \int d^{3}x\;\left[a^{\dagger}\left(\mathbf{x}\right)a\left(\mathbf{x}\right)a_{i}-\left(\delta^{3}\left(\mathbf{x}-\mathbf{x}_{i}\right)+a^{\dagger}\left(\mathbf{x}\right)a_{i}\right)a\left(\mathbf{x}\right)\right]\ \ \ \ \ (8)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\int d^{3}x\;a\left(\mathbf{x}\right)\delta^{3}\left(\mathbf{x}-\mathbf{x}_{i}\right)\ \ \ \ \ (9)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -a_{i} \ \ \ \ \ (10)$

Here we’ve used the commutation relations

 $\displaystyle \left[a\left(\mathbf{x}\right),a\left(\mathbf{x}^{\prime}\right)\right]$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (11)$ $\displaystyle \left[a^{\dagger}\left(\mathbf{x}\right),a^{\dagger}\left(\mathbf{x}^{\prime}\right)\right]$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (12)$ $\displaystyle \left[a\left(\mathbf{x}\right),a^{\dagger}\left(\mathbf{x}^{\prime}\right)\right]$ $\displaystyle =$ $\displaystyle \delta^{3}\left(\mathbf{x}-\mathbf{x}^{\prime}\right) \ \ \ \ \ (13)$

Now suppose we have an operator ${X}$ which contains ${n}$ creation operators ${a_{i}^{\dagger}}$, ${i=1,\ldots,n}$ and ${m}$ annihilation operators ${a_{j}}$, ${j=1,\ldots,m}$:

$\displaystyle X=a_{i1}^{\dagger}\ldots a_{in}^{\dagger}a_{j1}\ldots a_{jm} \ \ \ \ \ (14)$

Then

 $\displaystyle \left[N,X\right]$ $\displaystyle =$ $\displaystyle Na_{i1}^{\dagger}\ldots a_{in}^{\dagger}a_{j1}\ldots a_{jm}-a_{i1}^{\dagger}\ldots a_{in}^{\dagger}a_{j1}\ldots a_{jm}N\ \ \ \ \ (15)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left(a_{i1}^{\dagger}N+a_{i1}^{\dagger}\right)a_{i2}^{\dagger}\ldots a_{in}^{\dagger}a_{j1}\ldots a_{jm}-a_{i1}^{\dagger}\ldots a_{in}^{\dagger}a_{j1}\ldots a_{jm}N\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle X+a_{i1}^{\dagger}\left[N,a_{i2}^{\dagger}\ldots a_{in}^{\dagger}a_{j1}\ldots a_{jm}\right] \ \ \ \ \ (17)$

We can see that the commutator in the last line can be worked out recursively until we’ve processed all the creation operators up to ${a_{in}^{\dagger}}$, giving

$\displaystyle \left[N,X\right]=nX+a_{i1}^{\dagger}\ldots a_{in}^{\dagger}\left[N,a_{j1}\ldots a_{jm}\right] \ \ \ \ \ (18)$

The last commutator gives us

 $\displaystyle \left[N,a_{j1}\ldots a_{jm}\right]$ $\displaystyle =$ $\displaystyle Na_{j1}\ldots a_{jm}-a_{j1}\ldots a_{jm}N\ \ \ \ \ (19)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left(a_{j1}N-a_{j1}\right)a_{j2}\ldots a_{jm}-a_{j1}\ldots a_{jm}N\ \ \ \ \ (20)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -a_{j1}\ldots a_{jm}+a_{j1}\left[N,a_{j2}\ldots a_{jm}\right]\ \ \ \ \ (21)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -m\left(a_{j1}\ldots a_{jm}\right) \ \ \ \ \ (22)$

Therefore

 $\displaystyle a_{i1}^{\dagger}\ldots a_{in}^{\dagger}\left[N,a_{j1}\ldots a_{jm}\right]$ $\displaystyle =$ $\displaystyle -m\left(a_{i1}^{\dagger}\ldots a_{in}^{\dagger}a_{j1}\ldots a_{jm}\right)\ \ \ \ \ (23)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -mX\ \ \ \ \ (24)$ $\displaystyle \left[N,X\right]$ $\displaystyle =$ $\displaystyle \left(n-m\right)X \ \ \ \ \ (25)$

So if ${n=m}$ (the numbers of creation and annihilation operators are equal), the operator ${X}$ commutes with ${N}$. In particular, the hamiltonian we met last time satisfies this criterion, so ${\left[N,H\right]=0}$ and this hamiltonian conserves particle numbers.
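The result ${\left[N,X\right]=\left(n-m\right)X}$ can be illustrated with small matrices. Be aware that this sketch swaps the continuum operators ${a\left(\mathbf{x}\right)}$ for a single discrete mode in a Fock space truncated at an assumed dimension `D = 8`, with the usual harmonic oscillator matrix elements ${a\left|n\right\rangle =\sqrt{n}\left|n-1\right\rangle }$. This is a simplification of my own and not Srednicki’s setup. All the helper names are likewise made up for the example.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def scale(c, A):
    """Multiply every element of A by the number c."""
    return [[c * x for x in row] for row in A]

def close(A, B, tol=1e-9):
    """True if A and B agree element by element to within tol."""
    n = len(A)
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(n) for j in range(n))

D = 8  # assumed truncation: occupation numbers 0, 1, ..., D-1

# Single-mode annihilation operator: the only non-zero elements are a[n-1][n] = sqrt(n)
a = [[math.sqrt(j) if i == j - 1 else 0.0 for j in range(D)] for i in range(D)]
adag = [[a[j][i] for j in range(D)] for i in range(D)]   # transpose (the elements are real)
N = matmul(adag, a)                                      # diagonal, with entries 0, 1, ..., D-1

X = matmul(matmul(adag, adag), a)   # n = 2 creation, m = 1 annihilation operators
Y = matmul(X, a)                    # n = 2 creation, m = 2 annihilation operators
zero = [[0.0] * D for _ in range(D)]

print(close(commutator(N, adag), adag))       # [N, a^dagger] = a^dagger
print(close(commutator(N, a), scale(-1, a)))  # [N, a] = -a
print(close(commutator(N, X), X))             # (n - m) X with n - m = 1
print(close(commutator(N, Y), zero))          # n = m, so Y commutes with N
```

In this truncated space ${N}$ is diagonal and each factor of ${a^{\dagger}}$ or ${a}$ shifts the occupation number by exactly one, so the relations hold up to rounding error, even though ${\left[a,a^{\dagger}\right]=1}$ itself fails in the last row and column of a truncated matrix.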

# Dirac equation

References: Mark Srednicki, Quantum Field Theory, (Cambridge University Press, 2007) – Chapter 1, Problem 1.1.

The Klein-Gordon equation is an early attempt at a relativistic quantum theory, but it contains a second-order time derivative which leads to probability not being conserved over time. Dirac proposed another equation that attempts to solve this problem for particles of spin 1/2. The Dirac equation is essentially a modification of the Schrödinger equation:

$\displaystyle i\hbar\frac{\partial}{\partial t}\psi_{a}\left(x\right)=\left[-i\hbar c\left(\alpha^{j}\right)_{ab}\partial_{j}+mc^{2}\left(\beta\right)_{ab}\right]\psi_{b}\left(x\right) \ \ \ \ \ (1)$

Here, ${\psi}$ is now a vector in spin space with components ${\psi_{a}}$. The objects ${\beta}$ and ${\alpha^{j}}$ (for ${j=1,2,3}$) are square matrices (where the subscript ${ab}$ indicates the component of the matrix being considered), also in spin space. Repeated indices are summed, with ${j}$ running over the spatial coordinates only and ${b}$ running over the spin components. [We won’t worry about how Dirac arrived at this equation for now; we’ll just accept it and see where it leads.]

To make this equation formally equivalent to the Schrödinger equation, the hamiltonian operator ${H}$ on the RHS must now be a matrix. We can also use the definition of the momentum operator ${P_{j}=-i\hbar\partial_{j}}$ to get

$\displaystyle H_{ab}=cP_{j}\left(\alpha^{j}\right)_{ab}+mc^{2}\left(\beta\right)_{ab} \ \ \ \ \ (2)$

This might not look much like the relativistic energy:

$\displaystyle E=\sqrt{p^{2}c^{2}+m^{2}c^{4}} \ \ \ \ \ (3)$

but if we square 2 (remembering that matrix products need not commute), we have

$\displaystyle \left(H^{2}\right)_{ab}=c^{2}P_{j}P_{k}\left(\alpha^{j}\alpha^{k}\right)_{ab}+mc^{3}P_{j}\left(\alpha^{j}\beta+\beta\alpha^{j}\right)_{ab}+m^{2}c^{4}\left(\beta^{2}\right)_{ab} \ \ \ \ \ (4)$

We can define the anticommutator as

$\displaystyle \left\{ A,B\right\} \equiv AB+BA \ \ \ \ \ (5)$

We can write the first term on the RHS of 4 as

$\displaystyle c^{2}P_{j}P_{k}\left(\alpha^{j}\alpha^{k}\right)_{ab}=\frac{1}{2}c^{2}P_{j}P_{k}\left\{ \alpha^{j},\alpha^{k}\right\} _{ab} \ \ \ \ \ (6)$

so we get

$\displaystyle \left(H^{2}\right)_{ab}=\frac{1}{2}c^{2}P_{j}P_{k}\left\{ \alpha^{j},\alpha^{k}\right\} _{ab}+mc^{3}P_{j}\left\{ \alpha^{j},\beta\right\} _{ab}+m^{2}c^{4}\left(\beta^{2}\right)_{ab} \ \ \ \ \ (7)$

In order to make this equal to ${E^{2}}$, we need the matrices ${\alpha^{j}}$ and ${\beta}$ to satisfy the conditions:

 $\displaystyle \left\{ \alpha^{j},\alpha^{k}\right\} _{ab}$ $\displaystyle =$ $\displaystyle 2\delta^{jk}\delta_{ab}\ \ \ \ \ (8)$ $\displaystyle \left\{ \alpha^{j},\beta\right\} _{ab}$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (9)$ $\displaystyle \left(\beta^{2}\right)_{ab}$ $\displaystyle =$ $\displaystyle \delta_{ab} \ \ \ \ \ (10)$

The first condition requires the anticommutator of ${\alpha^{j}}$ and ${\alpha^{k}}$ to be zero unless ${j=k}$, in which case the anticommutator gives the identity matrix. Remember that the superscripts ${j}$ and ${k}$ specify which matrix we’re talking about, while the subscripts ${ab}$ indicate the component of the matrix. The conditions aren’t derived; rather they are imposed to make the energy come out right. With these conditions, we have

$\displaystyle \left(H^{2}\right)_{ab}=\left(\mathbf{P}^{2}c^{2}+m^{2}c^{4}\right)\delta_{ab} \ \ \ \ \ (11)$

which gives the correct operator for the square of the energy.

The question arises as to what these matrices ${\alpha^{j}}$ and ${\beta}$ are. One candidate is the set of three Pauli spin matrices

 $\displaystyle \sigma_{x}$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right]\ \ \ \ \ (12)$ $\displaystyle \sigma_{y}$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\ \ \ \ \ (13)$ $\displaystyle \sigma_{z}$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (14)$

By direct calculation, we see that ${\left\{ \sigma_{i},\sigma_{j}\right\} =2\delta_{ij}I}$, where ${I}$ is the ${2\times2}$ identity matrix. For example

 $\displaystyle \left\{ \sigma_{x},\sigma_{y}\right\}$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right]\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]+\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right]\ \ \ \ \ (15)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} i & 0\\ 0 & -i \end{array}\right]+\left[\begin{array}{cc} -i & 0\\ 0 & i \end{array}\right]\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (17)$ $\displaystyle \left\{ \sigma_{x},\sigma_{x}\right\}$ $\displaystyle =$ $\displaystyle 2\sigma_{x}^{2}\ \ \ \ \ (18)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 2\left[\begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right] \ \ \ \ \ (19)$

and so on. However, in order to satisfy 9, we need to find a single matrix that anticommutes with all 3 spin matrices. We get

 $\displaystyle \left\{ \sigma_{x},\beta\right\}$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} \beta_{21} & \beta_{22}\\ \beta_{11} & \beta_{12} \end{array}\right]+\left[\begin{array}{cc} \beta_{12} & \beta_{11}\\ \beta_{22} & \beta_{21} \end{array}\right]\ \ \ \ \ (20)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 0 & 0\\ 0 & 0 \end{array}\right] \ \ \ \ \ (21)$

This gives

 $\displaystyle \beta_{12}$ $\displaystyle =$ $\displaystyle -\beta_{21}\equiv\gamma\ \ \ \ \ (22)$ $\displaystyle \beta_{11}$ $\displaystyle =$ $\displaystyle -\beta_{22}\equiv\epsilon \ \ \ \ \ (23)$

We then get

 $\displaystyle \left\{ \sigma_{z},\beta\right\}$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]\left[\begin{array}{cc} \epsilon & \gamma\\ -\gamma & -\epsilon \end{array}\right]+\left[\begin{array}{cc} \epsilon & \gamma\\ -\gamma & -\epsilon \end{array}\right]\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]\ \ \ \ \ (24)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} \epsilon & \gamma\\ \gamma & \epsilon \end{array}\right]+\left[\begin{array}{cc} \epsilon & -\gamma\\ -\gamma & \epsilon \end{array}\right]\ \ \ \ \ (25)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 0 & 0\\ 0 & 0 \end{array}\right] \ \ \ \ \ (26)$

This gives

$\displaystyle \epsilon=0 \ \ \ \ \ (27)$

So finally

 $\displaystyle \left\{ \sigma_{y},\beta\right\}$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\left[\begin{array}{cc} 0 & \gamma\\ -\gamma & 0 \end{array}\right]+\left[\begin{array}{cc} 0 & \gamma\\ -\gamma & 0 \end{array}\right]\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\ \ \ \ \ (28)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} i\gamma & 0\\ 0 & i\gamma \end{array}\right]+\left[\begin{array}{cc} i\gamma & 0\\ 0 & i\gamma \end{array}\right]\ \ \ \ \ (29)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\begin{array}{cc} 0 & 0\\ 0 & 0 \end{array}\right] \ \ \ \ \ (30)$

So ${\gamma=0}$ resulting in ${\left(\beta\right)_{ab}=0}$. Thus there is no non-zero matrix ${\beta}$ that anticommutes with all 3 of the Pauli spin matrices.

So what can we say about the Dirac matrices? From 10, we see that the eigenvalues of ${\beta^{2}=I}$ are all 1, so the eigenvalues of ${\beta}$ must be ${\pm1}$.

The trace (sum of the diagonal elements) of a matrix is equal to the sum of its eigenvalues (theorem from matrix algebra). To find the trace of ${\beta}$, we can use the anticommutators 8 and 9, together with another theorem from matrix algebra which states that ${\mbox{tr}\left(AB\right)=\mbox{tr}\left(BA\right)}$ for any square matrices ${A}$ and ${B}$ of the same order.

 $\displaystyle \mbox{tr}\left(\alpha_{1}^{2}\beta\right)$ $\displaystyle =$ $\displaystyle \mbox{tr}\left(\alpha_{1}\left(\alpha_{1}\beta\right)\right)\ \ \ \ \ (31)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \mbox{tr}\left(\left(\alpha_{1}\beta\right)\alpha_{1}\right) \ \ \ \ \ (32)$

However, from 9, ${\alpha_{1}\beta=-\beta\alpha_{1}}$ and from 8, ${\alpha_{1}^{2}=I}$ (the identity matrix), so

 $\displaystyle \mbox{tr}\left(\alpha_{1}^{2}\beta\right)$ $\displaystyle =$ $\displaystyle \mbox{tr}\left(\beta\right)\ \ \ \ \ (33)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \mbox{tr}\left(\left(\alpha_{1}\beta\right)\alpha_{1}\right)\ \ \ \ \ (34)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\mbox{tr}\left(\left(\beta\alpha_{1}\right)\alpha_{1}\right)\ \ \ \ \ (35)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\mbox{tr}\left(\beta\alpha_{1}^{2}\right)\ \ \ \ \ (36)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\mbox{tr}\left(\beta\right) \ \ \ \ \ (37)$

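Hence ${\mbox{tr}\left(\beta\right)=-\mbox{tr}\left(\beta\right)=0}$, so ${\beta}$ must have an equal number of ${+1}$ and ${-1}$ eigenvalues. In other words, ${\beta}$ must be even dimensional. Since we saw above that ${2\times2}$ matrices don’t work (no non-zero ${\beta}$ anticommutes with all three Pauli matrices), the smallest size is ${4\times4}$.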

We can also find the trace of ${\alpha^{j}}$ by starting with ${\mbox{tr}\left(\alpha^{j}\beta^{2}\right)}$ and following through the same steps as above (using ${\beta^{2}=I}$) to show that ${\mbox{tr}\left(\alpha^{j}\right)=-\mbox{tr}\left(\alpha^{j}\right)=0}$.
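To see that ${4\times4}$ matrices can actually satisfy conditions 8 to 10, we can test a candidate set numerically. The sketch below uses the standard Dirac representation, in which ${\beta}$ has the ${2\times2}$ blocks ${I}$ and ${-I}$ on its diagonal and each ${\alpha^{j}}$ has ${\sigma_{j}}$ in its two off-diagonal blocks. That representation is a textbook choice I’m bringing in from outside this post (it isn’t derived above), and the helper names are made up for the example.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def anticommutator(A, B):
    """{A, B} = AB + BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] + BA[i][j] for j in range(n)] for i in range(n)]

def is_multiple_of_identity(A, c, tol=1e-12):
    """True if A equals c times the identity matrix."""
    n = len(A)
    return all(abs(A[i][j] - (c if i == j else 0)) < tol for i in range(n) for j in range(n))

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def block(A, B, C, Dm):
    """Assemble the 4x4 matrix [[A, B], [C, Dm]] from four 2x2 blocks."""
    return [A[r] + B[r] for r in range(2)] + [C[r] + Dm[r] for r in range(2)]

# Pauli matrices (equations 12 to 14)
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]
paulis = [sx, sy, sz]

I2 = [[1, 0], [0, 1]]
minus_I2 = [[-1, 0], [0, -1]]
Z2 = [[0, 0], [0, 0]]

# Assumed candidate: the standard Dirac representation
alphas = [block(Z2, s, s, Z2) for s in paulis]
beta = block(I2, Z2, Z2, minus_I2)

def satisfies_dirac_conditions(alphas, beta):
    """Check conditions 8, 9 and 10 for a candidate set of matrices."""
    for j, aj in enumerate(alphas):
        for k, ak in enumerate(alphas):
            if not is_multiple_of_identity(anticommutator(aj, ak), 2 if j == k else 0):
                return False
        if not is_multiple_of_identity(anticommutator(aj, beta), 0):
            return False
    return is_multiple_of_identity(matmul(beta, beta), 1)

print(satisfies_dirac_conditions(alphas, beta))
print(trace(beta), [trace(aj) for aj in alphas])   # all the traces should vanish
```

The same `anticommutator` function applied to the Pauli matrices reproduces the ${2\times2}$ results in equations 15 to 19.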

# Klein-Gordon equation

References: Mark Srednicki, Quantum Field Theory, (Cambridge University Press, 2007) – Chapter 1.

One possibility for converting the non-relativistic Schrödinger equation for a free particle to a relativistic equation is to replace the classical energy ${H=\frac{\mathbf{p}^{2}}{2m}}$ by the relativistic energy

$\displaystyle H=\sqrt{\mathbf{p}^{2}c^{2}+m^{2}c^{4}} \ \ \ \ \ (1)$

Using the quantum mechanical momentum operator ${\mathbf{p}=-i\hbar\nabla}$, the new equation becomes

$\displaystyle i\hbar\frac{\partial\psi\left(\mathbf{x},t\right)}{\partial t}=\sqrt{-\hbar^{2}c^{2}\nabla^{2}+m^{2}c^{4}}\psi\left(\mathbf{x},t\right) \ \ \ \ \ (2)$

As it stands, this equation has the square root of a differential operator on the RHS, which makes it very difficult to solve, and in fact it gives rise to a number of other problems we won’t go into here.

However, this equation says that, when acting on ${\psi}$, the operator ${i\hbar\frac{\partial}{\partial t}}$ on the LHS is formally equivalent to the operator ${\sqrt{-\hbar^{2}c^{2}\nabla^{2}+m^{2}c^{4}}}$ on the RHS. So we can, in effect, apply each operator a second time to its own side of the equation, which has the effect of squaring the operators on both sides, giving

$\displaystyle -\hbar^{2}\frac{\partial^{2}}{\partial t^{2}}\psi\left(\mathbf{x},t\right)=\left(-\hbar^{2}c^{2}\nabla^{2}+m^{2}c^{4}\right)\psi\left(\mathbf{x},t\right) \ \ \ \ \ (3)$
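To see what equation 3 demands of a solution, here is a rough numerical sketch in Python. It assumes one spatial dimension, natural units ${\hbar=c=1}$, and a plane wave ${\psi=e^{i\left(px-Et\right)/\hbar}}$, none of which are specified in the text above. The second derivatives are approximated by central finite differences, and the function names are illustrative only.

```python
import cmath
import math

HBAR = 1.0  # natural units (an assumption of this sketch)
C = 1.0

def plane_wave(x, t, p, energy):
    """Plane wave exp(i (p x - E t) / hbar) in one spatial dimension."""
    return cmath.exp(1j * (p * x - energy * t) / HBAR)

def kg_residual(p, m, energy, x=0.3, t=0.2, h=1e-3):
    """|LHS - RHS| of equation 3 at one point, using central differences."""
    psi = lambda xx, tt: plane_wave(xx, tt, p, energy)
    d2t = (psi(x, t + h) - 2 * psi(x, t) + psi(x, t - h)) / h**2
    d2x = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h**2
    lhs = -HBAR**2 * d2t
    rhs = -HBAR**2 * C**2 * d2x + m**2 * C**4 * psi(x, t)
    return abs(lhs - rhs)

p, m = 1.3, 0.7
E = math.sqrt(p**2 * C**2 + m**2 * C**4)

res_pos = kg_residual(p, m, E)                  # relativistic energy
res_neg = kg_residual(p, m, -E)                 # negative root also works
res_nonrel = kg_residual(p, m, p**2 / (2 * m))  # classical energy does not
print(res_pos, res_neg, res_nonrel)
```

The residual is close to zero only when ${E^{2}=p^{2}c^{2}+m^{2}c^{4}}$. Because the squared equation depends on ${E^{2}}$, the negative root ${E=-\sqrt{p^{2}c^{2}+m^{2}c^{4}}}$ passes the check as well, even though equation 2 admits only the positive root.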

This is the Klein-Gordon equation. It is consistent with special relativity provided it is Lorentz invariant, that is, provided it keeps the same form under a Lorentz transformation. The most general such transformation (including a translation of the origin) is

$\displaystyle \overline{x}^{\mu}=\Lambda_{\;\nu}^{\mu}x^{\nu}+a^{\mu} \ \ \ \ \ (4)$

where ${\Lambda}$ is the usual Lorentz transformation matrix which depends on the relative velocity of the two inertial frames and the vector ${a^{\mu}}$ is a constant translation of coordinates. The notation ${x^{\mu}}$ denotes an event in 4-d spacetime, so that

$\displaystyle x^{\mu}=\left(ct,x,y,z\right) \ \ \ \ \ (5)$

One of the principles of relativity is that physics should look the same in all inertial frames. This means that the wave function ${\psi}$ must have the same value for a given event (a specific set of time and space coordinate values) in both inertial frames. In other words, if ${\bar{\psi}\left(\bar{x}\right)}$ is the wave function in the barred system at an event with coordinates ${\bar{x}}$ in the barred system, we must have

$\displaystyle \bar{\psi}\left(\bar{x}\right)=\psi\left(x\right) \ \ \ \ \ (6)$

where ${x}$ consists of the spacetime coordinates for that event in the unbarred system.

Since we’re dealing only with special relativity, the metric tensor is the flat space metric ${\eta_{\mu\nu}}$

$\displaystyle \eta_{\mu\nu}=\left[\begin{array}{cccc} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{array}\right] \ \ \ \ \ (7)$

which we’ve seen is invariant under Lorentz transformations. That is

$\displaystyle \begin{aligned}
\eta_{\rho\sigma} &= \eta_{\mu\nu}\Lambda_{\;\rho}^{\mu}\Lambda_{\;\sigma}^{\nu} \ \ \ \ \ (8)\\
\eta^{\mu\nu} &= \eta^{\rho\sigma}\Lambda_{\;\rho}^{\mu}\Lambda_{\;\sigma}^{\nu} \ \ \ \ \ (9)
\end{aligned}$

This condition is derived from the requirement that the interval ${ds^{2}=\eta_{\mu\nu}dx^{\mu}dx^{\nu}}$ is invariant.
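Conditions 8 and 9 are easy to check numerically for a particular ${\Lambda}$. The sketch below assumes a boost along the ${x}$ axis with speed ${0.6c}$ (an arbitrary choice for illustration) and stores ${\Lambda_{\;\nu}^{\mu}}$ as `lam[mu][nu]`, with the upper index as the row. The function names are hypothetical.

```python
import math

# Flat-space metric from equation 7; eta with upper indices has the same components.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
R4 = range(4)

def boost_x(beta):
    """Boost along x with speed beta = v/c; lam[mu][nu] = Lambda^mu_nu."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0, 0],
        [-gamma * beta, gamma, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ]

def rhs_eq8(lam):
    """eta_{mu nu} Lambda^mu_rho Lambda^nu_sigma, indexed [rho][sigma]."""
    return [[sum(ETA[mu][nu] * lam[mu][rho] * lam[nu][sigma]
                 for mu in R4 for nu in R4)
             for sigma in R4] for rho in R4]

def rhs_eq9(lam):
    """eta^{rho sigma} Lambda^mu_rho Lambda^nu_sigma, indexed [mu][nu]."""
    return [[sum(ETA[rho][sigma] * lam[mu][rho] * lam[nu][sigma]
                 for rho in R4 for sigma in R4)
             for nu in R4] for mu in R4]

def close(a, b, tol=1e-12):
    """True if two 4x4 matrices agree element by element within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in R4 for j in R4)

lam = boost_x(0.6)
eq8_ok = close(rhs_eq8(lam), ETA)
eq9_ok = close(rhs_eq9(lam), ETA)
print(eq8_ok, eq9_ok)
```

Both sums reproduce the metric to rounding error, as 8 and 9 require.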

To show that 3 is Lorentz invariant, we need to see how the derivatives transform. Using 5 we can rewrite the Klein-Gordon equation in a more compact form:

$\displaystyle \hbar^{2}\partial_{\mu}\partial^{\mu}\psi\left(x\right)=m^{2}c^{2}\psi\left(x\right) \ \ \ \ \ (10)$

where the ${x}$ in ${\psi\left(x\right)}$ stands for the four components ${x^{\mu}}$ and

$\displaystyle \begin{aligned}
\partial_{\mu} &\equiv \frac{\partial}{\partial x^{\mu}}=\left[\frac{\partial}{c\partial t},\nabla\right] \ \ \ \ \ (11)\\
\partial^{\mu} &= \eta^{\mu\nu}\partial_{\nu}=\left[-\frac{\partial}{c\partial t},\nabla\right] \ \ \ \ \ (12)
\end{aligned}$

If we transform 10 into the barred frame, we have (since ${\hbar}$, ${c}$ and ${m}$ are all invariants)

$\displaystyle \hbar^{2}\bar{\partial}_{\mu}\bar{\partial}^{\mu}\bar{\psi}\left(\bar{x}\right)=m^{2}c^{2}\bar{\psi}\left(\bar{x}\right) \ \ \ \ \ (13)$

Using 6 we have

$\displaystyle \hbar^{2}\bar{\partial}_{\mu}\bar{\partial}^{\mu}\psi\left(x\right)=m^{2}c^{2}\psi\left(x\right) \ \ \ \ \ (14)$

so in order for this equation to be invariant, we need to show that ${\bar{\partial}_{\mu}\bar{\partial}^{\mu}=\partial_{\mu}\partial^{\mu}}$. Suppose that a single derivative transforms in the same way as the spacetime vector, that is

$\displaystyle \bar{\partial}^{\mu}=\Lambda_{\;\nu}^{\mu}\partial^{\nu} \ \ \ \ \ (15)$

Then

$\displaystyle \begin{aligned}
\bar{\partial}^{\rho}\bar{x}^{\sigma} &= \left(\Lambda_{\;\mu}^{\rho}\partial^{\mu}\right)\left(\Lambda_{\;\nu}^{\sigma}x^{\nu}+a^{\sigma}\right) \ \ \ \ \ (16)\\
 &= \Lambda_{\;\mu}^{\rho}\Lambda_{\;\nu}^{\sigma}\partial^{\mu}x^{\nu} \ \ \ \ \ (17)
\end{aligned}$

[Since ${\Lambda}$ and ${a}$ are constants, their derivatives are zero.] Now we can use ${\partial^{\mu}x^{\nu}=\eta^{\mu\nu}}$ (remember the minus sign in ${\partial^{0}}$ from 12) and 9:

$\displaystyle \begin{aligned}
\bar{\partial}^{\rho}\bar{x}^{\sigma} &= \Lambda_{\;\mu}^{\rho}\Lambda_{\;\nu}^{\sigma}\partial^{\mu}x^{\nu} \ \ \ \ \ (18)\\
 &= \Lambda_{\;\mu}^{\rho}\Lambda_{\;\nu}^{\sigma}\eta^{\mu\nu} \ \ \ \ \ (19)\\
 &= \eta^{\rho\sigma} \ \ \ \ \ (20)
\end{aligned}$

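Therefore, the relation 15 gives the correct value for ${\bar{\partial}^{\rho}\bar{x}^{\sigma}}$, namely ${\eta^{\rho\sigma}}$, which is the same result that ${\partial^{\mu}x^{\nu}=\eta^{\mu\nu}}$ gives in the unbarred frame. This means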

$\displaystyle \begin{aligned}
\bar{\partial}_{\mu}\bar{\partial}^{\mu} &= \eta_{\mu\nu}\bar{\partial}^{\nu}\bar{\partial}^{\mu} \ \ \ \ \ (21)\\
 &= \eta_{\mu\nu}\Lambda_{\;\rho}^{\nu}\partial^{\rho}\Lambda_{\;\sigma}^{\mu}\partial^{\sigma} \ \ \ \ \ (22)\\
 &= \eta_{\mu\nu}\Lambda_{\;\rho}^{\nu}\Lambda_{\;\sigma}^{\mu}\partial^{\rho}\partial^{\sigma} \ \ \ \ \ (23)\\
 &= \eta_{\rho\sigma}\partial^{\rho}\partial^{\sigma} \ \ \ \ \ (24)\\
 &= \partial_{\sigma}\partial^{\sigma} \ \ \ \ \ (25)
\end{aligned}$

Thus the transformed Klein-Gordon equation 14 is equivalent to the original version 3, and the equation is Lorentz invariant.
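A concrete way to see this invariance is to look at a plane wave ${\psi=e^{ip_{\mu}x^{\mu}/\hbar}}$, for which ${\hbar^{2}\partial_{\mu}\partial^{\mu}\psi=-p_{\mu}p^{\mu}\psi}$, so that equation 10 reduces to ${-p_{\mu}p^{\mu}=m^{2}c^{2}}$. The following Python sketch checks that this combination has the same value before and after a boost. The plane wave, the natural units ${c=1}$, the sample momentum and the boost speed are all assumptions made for illustration.

```python
import math

C = 1.0  # natural units (an assumption of this sketch)
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
R4 = range(4)

def boost_x(beta):
    """Boost along x with speed beta = v/c; lam[mu][nu] = Lambda^mu_nu."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0, 0],
        [-gamma * beta, gamma, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ]

def apply(lam, v):
    """Transform an upper-index four-vector: vbar^mu = Lambda^mu_nu v^nu."""
    return [sum(lam[mu][nu] * v[nu] for nu in R4) for mu in R4]

def contract(v):
    """eta_{mu nu} v^mu v^nu, i.e. v_mu v^mu."""
    return sum(ETA[mu][nu] * v[mu] * v[nu] for mu in R4 for nu in R4)

m = 0.7
p3 = (1.3, -0.4, 2.0)  # arbitrary sample three-momentum
energy = math.sqrt(sum(pi * pi for pi in p3) * C**2 + m**2 * C**4)

p_unbarred = [energy / C, *p3]                   # p^mu = (E/c, p)
p_barred = apply(boost_x(0.6), p_unbarred)       # same vector in the barred frame

inv_unbarred = contract(p_unbarred)
inv_barred = contract(p_barred)
print(inv_unbarred, inv_barred, -m**2 * C**2)
```

The energy and momentum components change under the boost, but ${p_{\mu}p^{\mu}}$ stays equal to ${-m^{2}c^{2}}$ in both frames, which is the plane-wave counterpart of steps 21 to 25.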

The problem with the Klein-Gordon equation is that, because of the second-order time derivative on the LHS, it doesn’t have the same form as the Schrödinger equation, where the time derivative is first order. One consequence of this is that the normalization of the wave function is no longer guaranteed to be conserved. If you review the notion of probability current you’ll see that, for a solution of the Schrödinger equation, the rate of change of the probability of a particle being in the interval ${x\in\left[a,b\right]}$ is

$\displaystyle \frac{dP_{ab}}{dt}=\frac{i\hbar}{2m}\left.\left[\frac{\partial\Psi}{\partial x}\Psi^*-\frac{\partial\Psi^*}{\partial x}\Psi\right]\right|_{a}^{b} \ \ \ \ \ (26)$

In any physical situation, the wave function goes to zero at infinity, so as we extend ${a\rightarrow-\infty}$ and ${b\rightarrow+\infty}$, we get ${\frac{dP}{dt}=0}$ which says simply that the probability of the particle being somewhere is constant (that is, 1). If you review the derivation of this result, it came about because we could replace the first order derivative with respect to time by the second order derivative with respect to ${x}$ by using the Schrödinger equation. With the Klein-Gordon equation and its second order time derivative, this derivation doesn’t work any more, with the result that we can’t state categorically that ${\frac{dP}{dt}=0}$. This is a fundamental violation of the statistical interpretation of the wave function.
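One simple illustration of how normalization can fail is sketched below in Python. It assumes one spatial dimension, natural units ${\hbar=c=1}$, and plane waves in a periodic box one wavelength long as a normalizable stand-in for a physical wave function. None of these choices come from the text above, and the names are hypothetical. The Klein-Gordon equation admits both ${E=+\sqrt{p^{2}+m^{2}}}$ and ${E=-\sqrt{p^{2}+m^{2}}}$ for the same ${p}$, so the sum of those two plane waves is also a solution, and its norm ${\int\left|\psi\right|^{2}dx}$ can be compared with that of a superposition of two Schrödinger plane waves.

```python
import cmath
import math

def norm_in_box(psi, t, length, n=2000):
    """Riemann-sum approximation of the integral of |psi|^2 over [0, length)."""
    dx = length / n
    return sum(abs(psi(i * dx, t))**2 for i in range(n)) * dx

p, m = 2.0, 1.0              # natural units hbar = c = 1 (assumption)
E = math.sqrt(p * p + m * m)
L = 2 * math.pi / p          # box length: one spatial period

def psi_kg(x, t):
    """Sum of the +E and -E plane waves with the same p; each solves Klein-Gordon."""
    return cmath.exp(1j * (p * x - E * t)) + cmath.exp(1j * (p * x + E * t))

def psi_schrodinger(x, t):
    """Sum of two free Schrodinger plane waves with momenta p and 2p, E = p^2 / 2m."""
    e1 = p**2 / (2 * m)
    e2 = (2 * p)**2 / (2 * m)
    return cmath.exp(1j * (p * x - e1 * t)) + cmath.exp(1j * (2 * p * x - e2 * t))

t_quarter = math.pi / (2 * E)  # a quarter of the Klein-Gordon oscillation period

kg_start = norm_in_box(psi_kg, 0.0, L)
kg_later = norm_in_box(psi_kg, t_quarter, L)
s_start = norm_in_box(psi_schrodinger, 0.0, L)
s_later = norm_in_box(psi_schrodinger, t_quarter, L)
print(kg_start, kg_later, s_start, s_later)
```

For the Klein-Gordon superposition ${\left|\psi\right|^{2}=4\cos^{2}\left(Et/\hbar\right)}$, so the norm oscillates between ${4L}$ and zero, while the Schrödinger superposition keeps the same norm at both times.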