Klein-Gordon equation

References: Mark Srednicki, Quantum Field Theory (Cambridge University Press, 2007), Chapter 1.

One possibility for converting the non-relativistic Schrödinger equation for a free particle to a relativistic equation is to replace the classical energy {H=\frac{\mathbf{p}^{2}}{2m}} by the relativistic energy

\displaystyle  H=\sqrt{\mathbf{p}^{2}c^{2}+m^{2}c^{4}} \ \ \ \ \ (1)

Using the quantum mechanical momentum operator {\mathbf{p}=-i\hbar\nabla}, the new equation becomes

\displaystyle  i\hbar\frac{\partial\psi\left(\mathbf{x},t\right)}{\partial t}=\sqrt{-\hbar^{2}c^{2}\nabla^{2}+m^{2}c^{4}}\psi\left(\mathbf{x},t\right) \ \ \ \ \ (2)

As it stands, this equation has the square root of a differential operator on the RHS. Such an operator makes the differential equation very difficult to solve, and in fact gives rise to a number of other problems we won’t go into here.
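To see where the difficulty comes from, it may help to expand the square root formally. This is only a sketch, assuming the binomial series {\sqrt{1+u}=1+\frac{u}{2}-\frac{u^{2}}{8}+\ldots} can be applied with {u=-\hbar^{2}\nabla^{2}/m^{2}c^{2}}:

\displaystyle  \sqrt{-\hbar^{2}c^{2}\nabla^{2}+m^{2}c^{4}}=mc^{2}\sqrt{1-\frac{\hbar^{2}\nabla^{2}}{m^{2}c^{2}}}=mc^{2}-\frac{\hbar^{2}\nabla^{2}}{2m}-\frac{\hbar^{4}\nabla^{4}}{8m^{3}c^{2}}+\ldots

The first term is the rest energy and the second is the usual non-relativistic kinetic energy operator. The remaining terms contain spatial derivatives of ever higher order, while the time derivative on the LHS of 2 stays first order.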

However, taken at face value, this equation says that the operator {i\hbar\frac{\partial}{\partial t}} on the LHS has the same effect on {\psi} as the operator {\sqrt{-\hbar^{2}c^{2}\nabla^{2}+m^{2}c^{4}}} on the RHS. We can therefore apply the same operator to each side a second time, which has the effect of squaring the operators on both sides of the equation, giving

\displaystyle  -\hbar^{2}\frac{\partial^{2}}{\partial t^{2}}\psi\left(\mathbf{x},t\right)=\left(-\hbar^{2}c^{2}\nabla^{2}+m^{2}c^{4}\right)\psi\left(\mathbf{x},t\right) \ \ \ \ \ (3)

This is the Klein-Gordon equation. To be consistent with special relativity it must be Lorentz invariant, that is, it must have the same form in every inertial frame. To show this, we start from the most general Lorentz transformation (here including a translation), which is

\displaystyle  \overline{x}^{\mu}=\Lambda_{\;\nu}^{\mu}x^{\nu}+a^{\mu} \ \ \ \ \ (4)

where {\Lambda} is the usual Lorentz transformation matrix, which depends on the relative velocity (and relative orientation) of the two inertial frames, and the vector {a^{\mu}} is a constant translation of the coordinates. The notation {x^{\mu}} denotes an event in 4-d spacetime, so that

\displaystyle  x^{\mu}=\left(ct,x,y,z\right) \ \ \ \ \ (5)

One of the principles of relativity is that physics should look the same in all inertial frames. This means that the wave function {\psi} must have the same value for a given event (a specific set of time and space coordinate values) in both inertial frames. In other words, if {\bar{\psi}\left(\bar{x}\right)} is the wave function in the barred system at an event with coordinates {\bar{x}} in the barred system, we must have

\displaystyle  \bar{\psi}\left(\bar{x}\right)=\psi\left(x\right) \ \ \ \ \ (6)

where {x} consists of the spacetime coordinates for that event in the unbarred system.

Since we’re dealing only with special relativity, the metric tensor is the flat space metric {\eta_{\mu\nu}}

\displaystyle  \eta_{\mu\nu}=\left[\begin{array}{cccc} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{array}\right] \ \ \ \ \ (7)

which we’ve seen is invariant under Lorentz transformations. That is

\displaystyle   \eta_{\rho\sigma} \displaystyle  = \displaystyle  \eta_{\mu\nu}\Lambda_{\;\rho}^{\mu}\Lambda_{\;\sigma}^{\nu}\ \ \ \ \ (8)
\displaystyle  \eta^{\mu\nu} \displaystyle  = \displaystyle  \eta^{\rho\sigma}\Lambda_{\;\rho}^{\mu}\Lambda_{\;\sigma}^{\nu} \ \ \ \ \ (9)

This condition is derived from the requirement that the interval {ds^{2}=\eta_{\mu\nu}dx^{\mu}dx^{\nu}} is invariant.
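Conditions 8 and 9 are easy to check numerically for a particular {\Lambda}. The following is a minimal sketch in plain Python; the helper names (`boost_x`, `rotation_z`, `preserves_metric`) and the sample values of the boost speed and rotation angle are illustrative assumptions, not taken from the reference.

```python
import math

# Flat-space metric of equation 7 (signature -+++). Its inverse has the
# same components, so ETA also serves as eta^{mu nu}.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def boost_x(beta):
    """Boost along x with speed beta = v/c; entry [mu][nu] is Lambda^mu_nu."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_z(theta):
    """Spatial rotation about the z axis by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def close(a, b, tol):
    return all(abs(a[i][j] - b[i][j]) <= tol
               for i in range(4) for j in range(4))


def preserves_metric(lam, tol=1e-12):
    """Check eq. 8 (Lambda^T eta Lambda = eta) and eq. 9 (Lambda eta Lambda^T = eta)."""
    eq8 = matmul(matmul(transpose(lam), ETA), lam)
    eq9 = matmul(matmul(lam, ETA), transpose(lam))
    return close(eq8, ETA, tol) and close(eq9, ETA, tol)


print(preserves_metric(boost_x(0.6)))
print(preserves_metric(matmul(boost_x(0.6), rotation_z(0.3))))
```

A boost followed by a rotation passes the check as well, since the product of two matrices satisfying 8 also satisfies it. A matrix that merely rescales the time coordinate does not.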

To show that 3 is Lorentz invariant, we need to see how the derivatives transform. Using 5 we can rewrite the Klein-Gordon equation in a more compact form:

\displaystyle  \hbar^{2}\partial_{\mu}\partial^{\mu}\psi\left(x\right)=m^{2}c^{2}\psi\left(x\right) \ \ \ \ \ (10)

where the {x} in {\psi\left(x\right)} stands for the four components {x^{\mu}} and

\displaystyle   \partial_{\mu} \displaystyle  \equiv \displaystyle  \frac{\partial}{\partial x^{\mu}}=\left[\frac{\partial}{c\partial t},\nabla\right]\ \ \ \ \ (11)
\displaystyle  \partial^{\mu} \displaystyle  = \displaystyle  \eta^{\mu\nu}\partial_{\nu}=\left[-\frac{\partial}{c\partial t},\nabla\right] \ \ \ \ \ (12)
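As a check on the signs, which are easy to get wrong with this metric convention, contracting 11 with 12 gives

\displaystyle  \partial_{\mu}\partial^{\mu}=-\frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}}+\nabla^{2}

so that multiplying 10 through by {c^{2}} gives

\displaystyle  -\hbar^{2}\frac{\partial^{2}\psi}{\partial t^{2}}+\hbar^{2}c^{2}\nabla^{2}\psi=m^{2}c^{4}\psi

which is 3 with the {\nabla^{2}} term moved to the LHS.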

If we transform 10 into the barred frame, we have (since {\hbar}, {c} and {m} are all invariants)

\displaystyle  \hbar^{2}\bar{\partial}_{\mu}\bar{\partial}^{\mu}\bar{\psi}\left(\bar{x}\right)=m^{2}c^{2}\bar{\psi}\left(\bar{x}\right) \ \ \ \ \ (13)

Using 6 we have

\displaystyle  \hbar^{2}\bar{\partial}_{\mu}\bar{\partial}^{\mu}\psi\left(x\right)=m^{2}c^{2}\psi\left(x\right) \ \ \ \ \ (14)

so in order for this equation to be invariant, we need to show that {\bar{\partial}_{\mu}\bar{\partial}^{\mu}=\partial_{\mu}\partial^{\mu}}. Suppose that a single derivative transforms in the same way as the spacetime vector, that is

\displaystyle  \bar{\partial}^{\mu}=\Lambda_{\;\nu}^{\mu}\partial^{\nu} \ \ \ \ \ (15)

Then

\displaystyle   \bar{\partial}^{\rho}\bar{x}^{\sigma} \displaystyle  = \displaystyle  \left(\Lambda_{\;\mu}^{\rho}\partial^{\mu}\right)\left(\Lambda_{\;\nu}^{\sigma}x^{\nu}+a^{\sigma}\right)\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  \Lambda_{\;\mu}^{\rho}\Lambda_{\;\nu}^{\sigma}\partial^{\mu}x^{\nu} \ \ \ \ \ (17)

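[Since {\Lambda} and {a} are constants, their derivatives are zero.] Now we can use {\partial^{\mu}x^{\nu}=\eta^{\mu\rho}\partial_{\rho}x^{\nu}=\eta^{\mu\rho}\delta_{\rho}^{\;\nu}=\eta^{\mu\nu}} (the minus sign in {\partial^{0}} from 12 gives {\partial^{0}x^{0}=-1=\eta^{00}}) and 9: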

\displaystyle   \bar{\partial}^{\rho}\bar{x}^{\sigma} \displaystyle  = \displaystyle  \Lambda_{\;\mu}^{\rho}\Lambda_{\;\nu}^{\sigma}\partial^{\mu}x^{\nu}\ \ \ \ \ (18)
\displaystyle  \displaystyle  = \displaystyle  \Lambda_{\;\mu}^{\rho}\Lambda_{\;\nu}^{\sigma}\eta^{\mu\nu}\ \ \ \ \ (19)
\displaystyle  \displaystyle  = \displaystyle  \eta^{\rho\sigma} \ \ \ \ \ (20)

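Therefore, the relation 15 gives the correct value for {\bar{\partial}^{\rho}\bar{x}^{\sigma}}: the barred derivative acting on the barred coordinates gives {\eta^{\rho\sigma}}, just as the unbarred derivative does on the unbarred coordinates. This means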

\displaystyle   \bar{\partial}_{\mu}\bar{\partial}^{\mu} \displaystyle  = \displaystyle  \eta_{\mu\nu}\bar{\partial}^{\nu}\bar{\partial}^{\mu}\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  \eta_{\mu\nu}\Lambda_{\;\rho}^{\nu}\partial^{\rho}\Lambda_{\;\sigma}^{\mu}\partial^{\sigma}\ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  \eta_{\mu\nu}\Lambda_{\;\rho}^{\nu}\Lambda_{\;\sigma}^{\mu}\partial^{\rho}\partial^{\sigma}\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \eta_{\rho\sigma}\partial^{\rho}\partial^{\sigma}\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  \partial_{\sigma}\partial^{\sigma} \ \ \ \ \ (25)

Thus the transformed Klein-Gordon equation 14 is equivalent to the original version 10 (and hence to 3), and the equation is Lorentz invariant.
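The result {\bar{\partial}_{\mu}\bar{\partial}^{\mu}\bar{\psi}\left(\bar{x}\right)=\partial_{\mu}\partial^{\mu}\psi\left(x\right)} can also be checked numerically. The sketch below is illustrative only and makes several assumptions: it works in one space dimension with {c=1}, it uses an arbitrary smooth test function rather than a solution of the Klein-Gordon equation, it estimates the derivatives by central differences, and the boost speed, translation and sample event are made-up values.

```python
import math

BETA = 0.6                                # illustrative boost speed v/c
GAMMA = 1.0 / math.sqrt(1.0 - BETA ** 2)
A0, A1 = 0.7, -1.1                        # illustrative translation a^mu


def psi(x0, x1):
    """Arbitrary smooth test function of x0 = ct and x1 = x."""
    return math.sin(1.3 * x0 + 0.4 * x1) * math.exp(-0.5 * x1 * x1)


def to_barred(x0, x1):
    """Equation 4 in 1+1 dimensions: boost along x plus a translation."""
    return (GAMMA * (x0 - BETA * x1) + A0,
            GAMMA * (x1 - BETA * x0) + A1)


def to_unbarred(xb0, xb1):
    """Inverse of to_barred."""
    y0, y1 = xb0 - A0, xb1 - A1
    return (GAMMA * (y0 + BETA * y1),
            GAMMA * (y1 + BETA * y0))


def psi_bar(xb0, xb1):
    """Equation 6: psi_bar at barred coordinates equals psi at the same event."""
    return psi(*to_unbarred(xb0, xb1))


def dalembertian(f, x0, x1, h=1e-3):
    """Central-difference estimate of d_mu d^mu f = -d0^2 f + d1^2 f."""
    d00 = (f(x0 + h, x1) - 2.0 * f(x0, x1) + f(x0 - h, x1)) / h ** 2
    d11 = (f(x0, x1 + h) - 2.0 * f(x0, x1) + f(x0, x1 - h)) / h ** 2
    return -d00 + d11


event = (0.3, -0.2)                       # sample event in the unbarred frame
box_unbarred = dalembertian(psi, *event)
box_barred = dalembertian(psi_bar, *to_barred(*event))
print(box_unbarred, box_barred)
```

The two printed numbers should agree up to the finite-difference error, even though the separate time and space second derivatives differ between the two frames.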

The problem with the Klein-Gordon equation is that, because of the second-order time derivative on the LHS, it doesn’t have the same form as the Schrödinger equation, where the time derivative is first order. One consequence of this is that the normalization of the wave function isn’t guaranteed to be conserved. If you review the notion of probability current for the non-relativistic Schrödinger equation in one dimension, you’ll see that the rate of change of the probability of a particle being in the interval {x\in\left[a,b\right]} is

\displaystyle  \frac{dP_{ab}}{dt}=\frac{i\hbar}{2m}\left.\left[\Psi^*\frac{\partial\Psi}{\partial x}-\Psi\frac{\partial\Psi^*}{\partial x}\right]\right|_{a}^{b} \ \ \ \ \ (26)

In any physical situation, the wave function goes to zero at infinity, so as we extend {a\rightarrow-\infty} and {b\rightarrow+\infty}, we get {\frac{dP}{dt}=0}, which says simply that the probability of the particle being somewhere is constant (that is, 1). The derivation of this result relies on using the Schrödinger equation to replace the first order derivative with respect to time by the second order derivative with respect to {x}. With the Klein-Gordon equation and its second order time derivative, this substitution is no longer available, with the result that we can’t state categorically that {\frac{dP}{dt}=0}. This is a fundamental violation of the statistical interpretation of the wave function.
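A simple illustration may make this concrete. It is only a sketch, and it assumes the particle is confined to a periodic box of length {L} in one dimension so that plane waves can be normalized, which is not part of the argument above. Since 3 involves only the square of the energy, a plane wave with either sign of {E=\pm\sqrt{p^{2}c^{2}+m^{2}c^{4}}} is a solution, and so is the sum

\displaystyle  \psi\left(x,t\right)=e^{i\left(px-Et\right)/\hbar}+e^{i\left(px+Et\right)/\hbar}=2e^{ipx/\hbar}\cos\frac{Et}{\hbar}

For this solution

\displaystyle  \int_{0}^{L}\left|\psi\right|^{2}dx=4L\cos^{2}\frac{Et}{\hbar}

which oscillates in time rather than staying constant. The first-order Schrödinger equation admits only the {e^{-iEt/\hbar}} time dependence for a given energy, so this kind of superposition cannot arise there.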