# Dirac equation

References: Mark Srednicki, Quantum Field Theory, (Cambridge University Press, 2007) – Chapter 1, Problem 1.1.

The Klein-Gordon equation is an early attempt at a relativistic quantum theory, but because it is second order in the time derivative, the norm of a state (and hence the total probability) is not in general conserved over time. Dirac proposed another equation that attempts to solve this problem for particles of spin 1/2. The Dirac equation is essentially a modification of the Schrödinger equation:

$\displaystyle i\hbar\frac{\partial}{\partial t}\psi_{a}\left(x\right)=\left[-i\hbar c\left(\alpha^{j}\right)_{ab}\partial_{j}+mc^{2}\left(\beta\right)_{ab}\right]\psi_{b}\left(x\right) \ \ \ \ \ (1)$

Here, ${\psi}$ is now a vector in spin space with components ${\psi_{a}}$. The objects ${\beta}$ and ${\alpha^{j}}$ (for ${j=1,2,3}$) are square matrices, also in spin space, where the subscript ${ab}$ indicates the component of the matrix being considered. Repeated indices are summed: ${j}$ over the three spatial coordinates only, and ${b}$ over the spin components. [We won’t worry about how Dirac arrived at this equation for now; we’ll just accept it and see where it leads.]

To make this equation formally equivalent to the Schrödinger equation, the Hamiltonian operator ${H}$ on the RHS must now be a matrix. Using the definition of the momentum operator ${P_{j}=-i\hbar\partial_{j}}$, we get

$\displaystyle H_{ab}=cP_{j}\left(\alpha^{j}\right)_{ab}+mc^{2}\left(\beta\right)_{ab} \ \ \ \ \ (2)$

This might not look much like the relativistic energy:

$\displaystyle E=\sqrt{p^{2}c^{2}+m^{2}c^{4}} \ \ \ \ \ (3)$

but if we square 2 (remembering that the matrices ${\alpha^{j}}$ and ${\beta}$ need not commute with one another), we have

$\displaystyle \left(H^{2}\right)_{ab}=c^{2}P_{j}P_{k}\left(\alpha^{j}\alpha^{k}\right)_{ab}+mc^{3}P_{j}\left(\alpha^{j}\beta+\beta\alpha^{j}\right)_{ab}+m^{2}c^{4}\left(\beta^{2}\right)_{ab} \ \ \ \ \ (4)$

We can define the anticommutator as

$\displaystyle \left\{ A,B\right\} \equiv AB+BA \ \ \ \ \ (5)$

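Since the momentum components commute with each other, ${P_{j}P_{k}}$ is symmetric in ${j}$ and ${k}$, so only the symmetric part of ${\alpha^{j}\alpha^{k}}$ contributes to the sum. We can therefore write the first term on the RHS of 4 as

$\displaystyle c^{2}P_{j}P_{k}\left(\alpha^{j}\alpha^{k}\right)_{ab}=\frac{1}{2}c^{2}P_{j}P_{k}\left\{ \alpha^{j},\alpha^{k}\right\} _{ab} \ \ \ \ \ (6)$

so we get

$\displaystyle \left(H^{2}\right)_{ab}=\frac{1}{2}c^{2}P_{j}P_{k}\left\{ \alpha^{j},\alpha^{k}\right\} _{ab}+mc^{3}P_{j}\left\{ \alpha^{j},\beta\right\} _{ab}+m^{2}c^{4}\left(\beta^{2}\right)_{ab} \ \ \ \ \ (7)$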

In order to make this equal to ${E^{2}}$, we need the matrices ${\alpha^{j}}$ and ${\beta}$ to satisfy the conditions:

$\displaystyle \left\{ \alpha^{j},\alpha^{k}\right\} _{ab}=2\delta^{jk}\delta_{ab} \ \ \ \ \ (8)$

$\displaystyle \left\{ \alpha^{j},\beta\right\} _{ab}=0 \ \ \ \ \ (9)$

$\displaystyle \left(\beta^{2}\right)_{ab}=\delta_{ab} \ \ \ \ \ (10)$

The first condition requires the anticommutator of ${\alpha^{j}}$ and ${\alpha^{k}}$ to be zero unless ${j=k}$, in which case the anticommutator gives twice the identity matrix, so that each ${\alpha^{j}}$ squares to the identity. Remember that the superscripts ${j}$ and ${k}$ specify which matrix we’re talking about, while the subscripts ${ab}$ indicate the component of the matrix. The conditions aren’t derived; rather they are imposed to make the energy come out right. With these conditions, we have

$\displaystyle \left(H^{2}\right)_{ab}=\left(\mathbf{P}^{2}c^{2}+m^{2}c^{4}\right)\delta_{ab} \ \ \ \ \ (11)$

which gives the correct operator for the square of the energy.
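To spell out the step from 7 to 11, here is a short worked substitution that uses only conditions 8 to 10. Each of the three terms in 7 is handled separately, with the Kronecker delta in 8 collapsing the double sum over ${j}$ and ${k}$:

$\displaystyle \frac{1}{2}c^{2}P_{j}P_{k}\left(2\delta^{jk}\delta_{ab}\right)=c^{2}P_{j}P_{j}\delta_{ab}=c^{2}\mathbf{P}^{2}\delta_{ab}$

$\displaystyle mc^{3}P_{j}\left\{ \alpha^{j},\beta\right\} _{ab}=0$

$\displaystyle m^{2}c^{4}\left(\beta^{2}\right)_{ab}=m^{2}c^{4}\delta_{ab}$

Adding the three pieces reproduces 11.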

The question arises as to what these matrices ${\alpha^{j}}$ and ${\beta}$ are. An obvious candidate for the three ${\alpha^{j}}$ is the set of three Pauli spin matrices

$\displaystyle \sigma_{x}=\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right] \ \ \ \ \ (12)$

$\displaystyle \sigma_{y}=\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right] \ \ \ \ \ (13)$

$\displaystyle \sigma_{z}=\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (14)$

By direct calculation, we see that ${\left\{ \sigma_{i},\sigma_{j}\right\} =2\delta_{ij}I}$, where ${i}$ and ${j}$ run over ${x,y,z}$ and ${I}$ is the ${2\times2}$ identity matrix. For example

$\displaystyle \left\{ \sigma_{x},\sigma_{y}\right\} =\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right]\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]+\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\left[\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right] \ \ \ \ \ (15)$

$\displaystyle =\left[\begin{array}{cc} i & 0\\ 0 & -i \end{array}\right]+\left[\begin{array}{cc} -i & 0\\ 0 & i \end{array}\right] \ \ \ \ \ (16)$

$\displaystyle =0 \ \ \ \ \ (17)$

$\displaystyle \left\{ \sigma_{x},\sigma_{x}\right\} =2\sigma_{x}^{2} \ \ \ \ \ (18)$

$\displaystyle =2\left[\begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right] \ \ \ \ \ (19)$

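and so on. However, in order to satisfy 9, we need to find a single matrix ${\beta}$ that anticommutes with all 3 spin matrices. For a general ${2\times2}$ matrix with components ${\beta_{ab}}$, the anticommutator with ${\sigma_{x}}$ must satisfy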

$\displaystyle \left\{ \sigma_{x},\beta\right\} =\left[\begin{array}{cc} \beta_{21} & \beta_{22}\\ \beta_{11} & \beta_{12} \end{array}\right]+\left[\begin{array}{cc} \beta_{12} & \beta_{11}\\ \beta_{22} & \beta_{21} \end{array}\right] \ \ \ \ \ (20)$

$\displaystyle =\left[\begin{array}{cc} 0 & 0\\ 0 & 0 \end{array}\right] \ \ \ \ \ (21)$

This gives

$\displaystyle \beta_{12}=-\beta_{21}\equiv\gamma \ \ \ \ \ (22)$

$\displaystyle \beta_{11}=-\beta_{22}\equiv\epsilon \ \ \ \ \ (23)$

We then get

$\displaystyle \left\{ \sigma_{z},\beta\right\} =\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right]\left[\begin{array}{cc} \epsilon & \gamma\\ -\gamma & -\epsilon \end{array}\right]+\left[\begin{array}{cc} \epsilon & \gamma\\ -\gamma & -\epsilon \end{array}\right]\left[\begin{array}{cc} 1 & 0\\ 0 & -1 \end{array}\right] \ \ \ \ \ (24)$

$\displaystyle =\left[\begin{array}{cc} \epsilon & \gamma\\ \gamma & \epsilon \end{array}\right]+\left[\begin{array}{cc} \epsilon & -\gamma\\ -\gamma & \epsilon \end{array}\right] \ \ \ \ \ (25)$

$\displaystyle =\left[\begin{array}{cc} 0 & 0\\ 0 & 0 \end{array}\right] \ \ \ \ \ (26)$

This gives

$\displaystyle \epsilon=0 \ \ \ \ \ (27)$

So finally

$\displaystyle \left\{ \sigma_{y},\beta\right\} =\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right]\left[\begin{array}{cc} 0 & \gamma\\ -\gamma & 0 \end{array}\right]+\left[\begin{array}{cc} 0 & \gamma\\ -\gamma & 0 \end{array}\right]\left[\begin{array}{cc} 0 & -i\\ i & 0 \end{array}\right] \ \ \ \ \ (28)$

$\displaystyle =\left[\begin{array}{cc} i\gamma & 0\\ 0 & i\gamma \end{array}\right]+\left[\begin{array}{cc} i\gamma & 0\\ 0 & i\gamma \end{array}\right] \ \ \ \ \ (29)$

$\displaystyle =\left[\begin{array}{cc} 0 & 0\\ 0 & 0 \end{array}\right] \ \ \ \ \ (30)$

So ${\gamma=0}$, which results in ${\left(\beta\right)_{ab}=0}$ for every component. Thus there is no non-zero matrix ${\beta}$ that anticommutes with all 3 of the Pauli spin matrices.
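The two results above can be checked numerically. The following is a minimal pure-Python sketch, not part of the original argument: it verifies the Pauli anticommutators and then treats ${\beta\rightarrow\left(\left\{ \sigma_{x},\beta\right\} ,\left\{ \sigma_{y},\beta\right\} ,\left\{ \sigma_{z},\beta\right\} \right)}$ as a linear map on the four components of ${\beta}$. If that map has full rank, the only solution of all three conditions is ${\beta=0}$. The helper names (`matmul`, `anticomm`, `rank`) are illustrative choices.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticomm(A, B):
    """Anticommutator {A, B} = AB + BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

def rank(rows, tol=1e-12):
    """Rank of a complex matrix by Gaussian elimination with pivoting."""
    M = [[complex(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        if r == len(M):
            break
        pivot = max(range(r, len(M)), key=lambda i: abs(M[i][col]))
        if abs(M[pivot][col]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]
sigma = [sx, sy, sz]

# {sigma_i, sigma_j} should equal 2 * delta_ij * identity.
pauli_ok = all(
    anticomm(sigma[i], sigma[j]) == [[2 * (i == j), 0], [0, 2 * (i == j)]]
    for i in range(3) for j in range(3))

# Basis of 2x2 matrices: E_kl has a single 1 in row k, column l.
basis = [[[int((r, s) == (k, l)) for s in range(2)] for r in range(2)]
         for k in range(2) for l in range(2)]

# Image of each basis matrix under the three anticommutators,
# flattened into a row of 12 numbers.
images = [[x for s in sigma for row in anticomm(s, E) for x in row]
          for E in basis]

# Dimension of the space of matrices anticommuting with all three sigmas.
kernel_dim = len(basis) - rank(images)
print(pauli_ok, kernel_dim)  # True 0
```

A kernel dimension of 0 is the numerical counterpart of the conclusion ${\gamma=\epsilon=0}$ reached by hand.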

So what can we say about the Dirac matrices? From 10, we see that the eigenvalues of ${\beta^{2}=I}$ are all 1, so the eigenvalues of ${\beta}$ must be ${\pm1}$.

The trace (sum of the diagonal elements) of a matrix is equal to the sum of its eigenvalues (theorem from matrix algebra). To find the trace of ${\beta}$, we can use the anticommutators 8 and 9, together with another theorem from matrix algebra which states that ${\mbox{tr}\left(AB\right)=\mbox{tr}\left(BA\right)}$ for any square matrices ${A}$ and ${B}$ of the same order. In what follows we write ${\alpha_{1}}$ for ${\alpha^{1}}$ so that the index does not clash with exponents.

$\displaystyle \mbox{tr}\left(\alpha_{1}^{2}\beta\right)=\mbox{tr}\left(\alpha_{1}\left(\alpha_{1}\beta\right)\right) \ \ \ \ \ (31)$

$\displaystyle =\mbox{tr}\left(\left(\alpha_{1}\beta\right)\alpha_{1}\right) \ \ \ \ \ (32)$

However, from 9, ${\alpha_{1}\beta=-\beta\alpha_{1}}$ and from 8, ${\alpha_{1}^{2}=I}$ (the identity matrix), so

$\displaystyle \mbox{tr}\left(\alpha_{1}^{2}\beta\right)=\mbox{tr}\left(\beta\right) \ \ \ \ \ (33)$

$\displaystyle =\mbox{tr}\left(\left(\alpha_{1}\beta\right)\alpha_{1}\right) \ \ \ \ \ (34)$

$\displaystyle =-\mbox{tr}\left(\left(\beta\alpha_{1}\right)\alpha_{1}\right) \ \ \ \ \ (35)$

$\displaystyle =-\mbox{tr}\left(\beta\alpha_{1}^{2}\right) \ \ \ \ \ (36)$

$\displaystyle =-\mbox{tr}\left(\beta\right) \ \ \ \ \ (37)$

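Hence ${\mbox{tr}\left(\beta\right)=-\mbox{tr}\left(\beta\right)}$, so ${\mbox{tr}\left(\beta\right)=0}$ and ${\beta}$ must have an equal number of ${+1}$ and ${-1}$ eigenvalues. In other words, ${\beta}$ must be even dimensional. The ${2\times2}$ case is ruled out: the four matrices ${\alpha^{j}}$ and ${\beta}$ are all traceless (the ${\alpha^{j}}$ as shown below) and, because they mutually anticommute and each squares to the identity, they are linearly independent, whereas the traceless ${2\times2}$ matrices form only a three-dimensional space spanned by the Pauli matrices. So the smallest size is ${4\times4}$.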
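As an illustration that ${4\times4}$ matrices with the required properties do exist, here is a minimal pure-Python sketch. It assumes the standard (Dirac) representation, ${\beta=\sigma_{z}\otimes I}$ and ${\alpha^{j}=\sigma_{x}\otimes\sigma_{j}}$, which is a well-known choice but is not derived in this post. The sample values of ${\mathbf{p}}$, ${m}$ and ${c}$ and the helper names are arbitrary choices for the check.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(s, A):
    return [[s * a for a in row] for row in A]

def kron(A, B):
    """Kronecker product, used to build 4x4 matrices from 2x2 blocks."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def anticomm(A, B):
    return add(matmul(A, B), matmul(B, A))

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices within a tolerance."""
    return all(abs(a - b) < tol
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))

sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
I4 = kron(I2, I2)
zero4 = scale(0, I4)

# Assumed representation (standard Dirac form), not derived in the text.
alphas = [kron(sx, s) for s in (sx, sy, sz)]
beta = kron(sz, I2)

# Conditions 8, 9 and 10.
cond8 = all(close(anticomm(alphas[j], alphas[k]), scale(2 * (j == k), I4))
            for j in range(3) for k in range(3))
cond9 = all(close(anticomm(a, beta), zero4) for a in alphas)
cond10 = close(matmul(beta, beta), I4)

# Trace results: all four matrices traceless; beta is diagonal here, so
# its eigenvalues are its diagonal entries.
traceless = all(abs(trace(M)) < 1e-12 for M in alphas + [beta])
n_plus = sum(1 for a in range(4) if beta[a][a] == 1)
n_minus = sum(1 for a in range(4) if beta[a][a] == -1)

# H^2 for sample numerical momentum, mass and c (arbitrary values).
p, m, c = (1.0, 2.0, 2.0), 3.0, 2.0
H = scale(m * c**2, beta)
for p_j, alpha_j in zip(p, alphas):
    H = add(H, scale(c * p_j, alpha_j))
E2 = c**2 * sum(x * x for x in p) + m**2 * c**4
h2_ok = close(matmul(H, H), scale(E2, I4))

print(cond8, cond9, cond10, traceless, n_plus, n_minus, h2_ok)
```

All four checks should come out true, with two eigenvalues of each sign for ${\beta}$, and the square of the numerical Hamiltonian should equal ${E^{2}}$ times the identity, in line with 11. Other representations related to this one by a change of basis would pass the same checks.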

We can also find the trace of ${\alpha^{j}}$ by starting with ${\mbox{tr}\left(\alpha^{j}\beta^{2}\right)}$ and following through the same steps as above (using ${\beta^{2}=I}$) to show that ${\mbox{tr}\left(\alpha^{j}\right)=-\mbox{tr}\left(\alpha^{j}\right)=0}$.
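For concreteness, the steps referred to above can be written out as follows. This is a sketch that uses only 9, 10 and the cyclic property ${\mbox{tr}\left(AB\right)=\mbox{tr}\left(BA\right)}$:

$\displaystyle \mbox{tr}\left(\alpha^{j}\right)=\mbox{tr}\left(\alpha^{j}\beta^{2}\right)=\mbox{tr}\left(\left(\alpha^{j}\beta\right)\beta\right)=\mbox{tr}\left(\beta\left(\alpha^{j}\beta\right)\right)$

$\displaystyle =-\mbox{tr}\left(\beta\left(\beta\alpha^{j}\right)\right)=-\mbox{tr}\left(\beta^{2}\alpha^{j}\right)=-\mbox{tr}\left(\alpha^{j}\right)$

The first line uses ${\beta^{2}=I}$ and the cyclic property, and the second uses ${\alpha^{j}\beta=-\beta\alpha^{j}}$ followed by ${\beta^{2}=I}$ again.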