# The uncertainty principle

Required math: calculus, complex numbers

Required physics: basics of quantum mechanics

Reference: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Sec 3.5.

The uncertainty principle is probably the most famous of the predictions of quantum mechanics, and it is usually known in the specific case of position and momentum: it is impossible to measure both position and momentum exactly at the same time.

In fact, this is just one example of a wider uncertainty principle, which can be derived algebraically. The principle relies on the generalized statistical interpretation of the wave function, which is that, for an operator ${\hat{A}}$, the expectation value (that is, the average value over a large (essentially infinite) number of measurements) is given by the integral

$\displaystyle \left\langle A\right\rangle =\int\Psi^*\hat{A}\Psi dx \ \ \ \ \ (1)$

where the integral extends over all space (or at least over all space accessible to the system). This integral is the one-dimensional case, but the three-dimensional case is easily written down by integrating over all three spatial coordinates.
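The expectation-value integral (1) is easy to check numerically. The following is a minimal sketch for the position operator ${\hat{A}=x}$ acting on a normalized Gaussian wave packet; the centre ${x_0}$ and width ${s}$ are illustrative parameters, not from the text.

```python
import numpy as np

# Minimal numerical sketch of (1): compute <x> for a normalized Gaussian
# wave packet Psi(x) = (2*pi*s^2)^(-1/4) * exp(-(x - x0)^2 / (4 s^2)).
# The centre x0 and width s are illustrative, not from the text.
x0, s = 1.5, 0.7
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
psi = (2 * np.pi * s**2) ** -0.25 * np.exp(-((x - x0) ** 2) / (4 * s**2))

# Normalization: the integral of |Psi|^2 dx should be 1.
norm = np.sum(np.abs(psi) ** 2) * dx

# <x> = integral of Psi* x Psi dx; for this packet it equals x0.
exp_x = np.sum(np.conj(psi) * x * psi).real * dx
print(norm, exp_x)
```

On a fine enough grid the sums reproduce the analytic results: the norm is 1 and the expectation value is the centre of the packet.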

This integral is usually written in the bra-ket notation to save space:

$\displaystyle \int\Psi^*\hat{A}\Psi dx\equiv\left\langle \Psi|\hat{A}\Psi\right\rangle \ \ \ \ \ (2)$

A function placed on the ‘bra’ side (the left half of the bra-ket) is understood to be complex conjugated, while the function on the ‘ket’ side appears unmodified.

For an observable, the operator ${\hat{A}}$ is Hermitian, which means that the integral above is equivalent to

$\displaystyle \begin{aligned}\int\Psi^*\hat{A}\Psi dx &= \int(\hat{A}\Psi)^*\Psi dx \ \ \ \ \ (3)\\ \left\langle \Psi|\hat{A}\Psi\right\rangle &= \left\langle \hat{A}\Psi|\Psi\right\rangle \ \ \ \ \ (4)\end{aligned}$

Using this principle, we can write down an expression for the variance of an observable. In statistics, the variance is defined as the average of the square of the difference from the mean. That is

$\displaystyle \sigma_{A}^{2}\equiv\left\langle (\hat{A}-\left\langle A\right\rangle )^{2}\right\rangle \ \ \ \ \ (5)$

Note that here the angle brackets denote the average, as opposed to the bra-ket notation where the angle brackets denote an integral. Averages can be distinguished from bra-kets since in the latter there is always a vertical bar in the middle to separate the bra from the ket.

Using the generalized statistical interpretation, together with the Hermiticity of ${\hat{A}}$ (and hence of ${\hat{A}-\left\langle A\right\rangle }$), this can be calculated as

$\displaystyle \begin{aligned}\sigma_{A}^{2} &= \left\langle \Psi|(\hat{A}-\left\langle A\right\rangle )^{2}\Psi\right\rangle \ \ \ \ \ (6)\\ &= \left\langle (\hat{A}-\left\langle A\right\rangle )\Psi|(\hat{A}-\left\langle A\right\rangle )\Psi\right\rangle \ \ \ \ \ (7)\\ &\equiv \left\langle f|f\right\rangle \ \ \ \ \ (8)\end{aligned}$

where the function ${f}$ is defined by this equation.
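The equality of (6) and (8) can be verified numerically. The sketch below uses an illustrative Gaussian wave packet (its width ${s}$ is an assumption, not from the text) and compares the variance of ${x}$ computed directly as ${\left\langle (x-\left\langle x\right\rangle )^{2}\right\rangle }$ with ${\left\langle f|f\right\rangle }$ where ${f=(x-\left\langle x\right\rangle )\Psi}$.

```python
import numpy as np

# Sketch of (5)-(8): for an illustrative Gaussian wave packet, the variance
# of x computed directly as <(x - <x>)^2> matches <f|f> with f = (x - <x>) Psi.
s = 0.7  # illustrative width
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
psi = (2 * np.pi * s**2) ** -0.25 * np.exp(-(x**2) / (4 * s**2))

mean_x = np.sum(np.conj(psi) * x * psi).real * dx
f = (x - mean_x) * psi  # f = (A - <A>) Psi, with A = x

var_direct = np.sum(np.conj(psi) * (x - mean_x) ** 2 * psi).real * dx
var_ff = np.sum(np.conj(f) * f).real * dx
print(var_direct, var_ff)  # both equal s^2 = 0.49
```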

Similarly, we can define the variance for another observable ${\hat{B}}$:

$\displaystyle \begin{aligned}\sigma_{B}^{2} &= \left\langle \Psi|(\hat{B}-\left\langle B\right\rangle )^{2}\Psi\right\rangle \ \ \ \ \ (9)\\ &= \left\langle (\hat{B}-\left\langle B\right\rangle )\Psi|(\hat{B}-\left\langle B\right\rangle )\Psi\right\rangle \ \ \ \ \ (10)\\ &\equiv \left\langle g|g\right\rangle \ \ \ \ \ (11)\end{aligned}$

Using the integral form of the Schwarz inequality (the proof of which would take us too far afield here), we can write

$\displaystyle \begin{aligned}\sigma_{A}^{2}\sigma_{B}^{2} &= \left\langle f|f\right\rangle \left\langle g|g\right\rangle \ \ \ \ \ (12)\\ &\ge \left|\left\langle f|g\right\rangle \right|^{2} \ \ \ \ \ (13)\end{aligned}$

For any complex number ${z=x+iy}$, we have

$\displaystyle \begin{aligned}\left|z\right|^{2} &= x^{2}+y^{2} \ \ \ \ \ (14)\\ &\ge y^{2} \ \ \ \ \ (15)\\ &= \left[\frac{1}{2i}(z-z^*)\right]^{2} \ \ \ \ \ (16)\end{aligned}$
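A quick numerical check of the inequality for the modulus and imaginary part, using an arbitrary illustrative value of ${z}$:

```python
# Quick check of (14)-(16) for a complex number z = x + iy:
# |z|^2 = x^2 + y^2 >= y^2, where y = (z - z*)/(2i).
z = 3.0 - 4.0j  # illustrative value: x = 3, y = -4
mod_sq = abs(z) ** 2                            # x^2 + y^2 = 25
im_sq = (((z - z.conjugate()) / 2j).real) ** 2  # y^2 = 16
print(mod_sq, im_sq)
```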

Letting ${z=\left\langle f|g\right\rangle }$, we can combine this with the Schwarz inequality and get

$\displaystyle \sigma_{A}^{2}\sigma_{B}^{2}\ge\left[\frac{1}{2i}(\left\langle f|g\right\rangle -\left\langle g|f\right\rangle )\right]^{2} \ \ \ \ \ (17)$

Now we need to work out the two bra-ket terms in terms of the original operators. Remember that the mean value ${\left\langle A\right\rangle }$ of an operator is just a number, so it can be taken outside the bra-ket.

$\displaystyle \begin{aligned}\left\langle f|g\right\rangle &= \left\langle (\hat{A}-\left\langle A\right\rangle )\Psi|(\hat{B}-\left\langle B\right\rangle )\Psi\right\rangle \ \ \ \ \ (18)\\ &= \left\langle \Psi|(\hat{A}-\left\langle A\right\rangle )(\hat{B}-\left\langle B\right\rangle )\Psi\right\rangle \ \ \ \ \ (19)\\ &= \left\langle \Psi|(\hat{A}\hat{B}-\hat{A}\left\langle B\right\rangle -\left\langle A\right\rangle \hat{B}+\left\langle A\right\rangle \left\langle B\right\rangle )\Psi\right\rangle \ \ \ \ \ (20)\\ &= \left\langle \hat{A}\hat{B}\right\rangle -\left\langle A\right\rangle \left\langle B\right\rangle -\left\langle A\right\rangle \left\langle B\right\rangle +\left\langle A\right\rangle \left\langle B\right\rangle \ \ \ \ \ (21)\\ &= \left\langle \hat{A}\hat{B}\right\rangle -\left\langle A\right\rangle \left\langle B\right\rangle \ \ \ \ \ (22)\end{aligned}$

By the same argument, we can work out ${\left\langle g|f\right\rangle }$, or we can obtain it merely by switching ${A}$ and ${B}$ in the result above.

$\displaystyle \left\langle g|f\right\rangle =\left\langle \hat{B}\hat{A}\right\rangle -\left\langle A\right\rangle \left\langle B\right\rangle \ \ \ \ \ (23)$

Note that ${\left\langle \hat{B}\hat{A}\right\rangle }$ is not necessarily the same as ${\left\langle \hat{A}\hat{B}\right\rangle }$, since in general the two operators do not commute. In fact, this is the nub of the argument, since plugging these results back into the Schwarz inequality we get

$\displaystyle \begin{aligned}\sigma_{A}^{2}\sigma_{B}^{2} &\ge \left[\frac{1}{2i}\left(\left\langle f|g\right\rangle -\left\langle g|f\right\rangle \right)\right]^{2} \ \ \ \ \ (24)\\ &= \left[\frac{1}{2i}\left(\left\langle \hat{A}\hat{B}\right\rangle -\left\langle \hat{B}\hat{A}\right\rangle \right)\right]^{2} \ \ \ \ \ (25)\end{aligned}$

In terms of the commutator of the two operators:

$\displaystyle \left[\hat{A},\hat{B}\right]\equiv\hat{A}\hat{B}-\hat{B}\hat{A} \ \ \ \ \ (26)$

we have the generalized uncertainty principle:

$\displaystyle \sigma_{A}^{2}\sigma_{B}^{2}\ge\left(\frac{1}{2i}\left\langle [\hat{A},\hat{B}]\right\rangle \right)^{2} \ \ \ \ \ (27)$
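The generalized relation (27) can be checked in a finite-dimensional example. The sketch below uses the spin-1/2 operators ${S_x}$ and ${S_y}$ in units where ${\hbar=1}$; the operators and the spin-up state are illustrative choices, not taken from the text.

```python
import numpy as np

# Sketch: check the generalized uncertainty relation (27) for the
# spin-1/2 operators Sx, Sy (units where hbar = 1), in the spin-up state.
# Operators and state are illustrative, not from the text.
Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
psi = np.array([1.0, 0.0], dtype=complex)  # spin-up along z

def expval(op):
    # <psi| op |psi> for a normalized state vector (real for Hermitian op)
    return (psi.conj() @ op @ psi).real

var_A = expval(Sx @ Sx) - expval(Sx) ** 2
var_B = expval(Sy @ Sy) - expval(Sy) ** 2
comm = Sx @ Sy - Sy @ Sx                         # equals i*Sz
rhs = ((psi.conj() @ comm @ psi) / 2j).real ** 2  # (1/(2i) <[A,B]>)^2
print(var_A * var_B, rhs)
```

For this particular state the two sides are equal (both ${1/16}$), so the inequality is saturated; a general state would give a strict inequality.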

Note that the mean value of a commutator is the difference between a quantity ${\left\langle f|g\right\rangle }$ and its complex conjugate ${\left\langle g|f\right\rangle }$, so it is always a purely imaginary number. The quantity ${\frac{1}{2i}\left\langle [\hat{A},\hat{B}]\right\rangle }$ is therefore always real, and its square is always non-negative (it is zero if the two operators commute). Thus this inequality says that for any two observable operators that do not commute, there is always a lower bound on the product of the uncertainties with which the corresponding observables can be measured simultaneously.

As an example, we can work out the most fundamental commutator of all: that of position ${x}$ and momentum ${p}$. Since ${p}$ is a differential operator, we need a test function for it to operate on.

$\displaystyle \begin{aligned}\left[x,p\right]f &= x\frac{\hbar}{i}\frac{df}{dx}-\frac{\hbar}{i}\frac{d(xf)}{dx} \ \ \ \ \ (28)\\ &= \frac{\hbar}{i}\left(x\frac{df}{dx}-x\frac{df}{dx}-f\right) \ \ \ \ \ (29)\\ &= i\hbar f \ \ \ \ \ (30)\end{aligned}$

Thus the commutator on its own is

$\displaystyle \left[x,p\right]=i\hbar \ \ \ \ \ (31)$
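The calculation (28)–(31) is easy to reproduce symbolically. A minimal sketch, applying ${\left[x,p\right]}$ to an arbitrary test function with ${p=\frac{\hbar}{i}\frac{d}{dx}}$:

```python
import sympy as sp

# Symbolic check of (28)-(31): apply [x, p] = x p - p x to a test function
# f(x), with p = (hbar/i) d/dx, and confirm the result is i*hbar*f(x).
x = sp.Symbol('x')
hbar = sp.Symbol('hbar', positive=True)
f = sp.Function('f')(x)

def p(g):
    # the momentum operator acting on a function g
    return (hbar / sp.I) * sp.diff(g, x)

commutator = x * p(f) - p(x * f)
print(sp.simplify(commutator))  # simplifies to i*hbar*f(x)
```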

Plugging this into the uncertainty principle, we get the well-known result

$\displaystyle \begin{aligned}\sigma_{x}^{2}\sigma_{p}^{2} &\ge \left(\frac{1}{2i}\left\langle [x,p]\right\rangle \right)^{2} \ \ \ \ \ (32)\\ &= \frac{\hbar^{2}}{4} \ \ \ \ \ (33)\end{aligned}$

or in terms of the standard deviation (the square root of the variance):

$\displaystyle \sigma_{x}\sigma_{p}\ge\frac{\hbar}{2} \ \ \ \ \ (34)$
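A Gaussian wave packet in fact saturates this bound, with ${\sigma_{x}\sigma_{p}=\hbar/2}$ exactly. The sketch below verifies this numerically, in units where ${\hbar=1}$ and with an illustrative width ${s}$.

```python
import numpy as np

# Numerical sketch of (34): a Gaussian wave packet saturates the bound,
# sigma_x * sigma_p = hbar/2. Units with hbar = 1; the width s is illustrative.
hbar = 1.0
s = 0.7
x = np.linspace(-12.0, 12.0, 40001)
dx = x[1] - x[0]
psi = (2 * np.pi * s**2) ** -0.25 * np.exp(-(x**2) / (4 * s**2))

# sigma_x^2 = <x^2>, since <x> = 0 for this packet
sigma_x = np.sqrt(np.sum(x**2 * np.abs(psi) ** 2) * dx)

# sigma_p^2 = <p^2> = hbar^2 * integral of |dPsi/dx|^2 dx (since <p> = 0),
# using integration by parts on <Psi|p^2 Psi>
dpsi = np.gradient(psi, x)
sigma_p = hbar * np.sqrt(np.sum(np.abs(dpsi) ** 2) * dx)

print(sigma_x * sigma_p)  # ~0.5 = hbar/2
```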

Thus Planck’s constant (divided by ${4\pi}$) serves as a lower bound on the accuracy with which position and momentum can be measured at the same time.

Similar relations exist for any pair of observables whose operators don’t commute, and in any attempt to conceptualize a result in quantum mechanics it must be remembered that no such pair can be visualized simultaneously. In the case of angular momentum, for example, no two of its components commute, so the three components can never all have definite values at once: there is no such thing as a strict angular momentum vector.