# Poisson brackets to commutators: classical to quantum

Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 7.4, Exercise 7.4.7.

The postulates of quantum mechanics that we described earlier included specifications for the matrix elements of position ${X}$ and momentum ${P}$ in position space:

 $\displaystyle \left\langle x\left|X\right|x^{\prime}\right\rangle$ $\displaystyle =$ $\displaystyle x\delta\left(x-x^{\prime}\right)\ \ \ \ \ (1)$ $\displaystyle \left\langle x\left|P\right|x^{\prime}\right\rangle$ $\displaystyle =$ $\displaystyle -i\hbar\delta^{\prime}\left(x-x^{\prime}\right) \ \ \ \ \ (2)$

A more fundamental form of this postulate is to specify the commutation relation between ${X}$ and ${P}$, which is independent of the basis and is

$\displaystyle \left[X,P\right]=i\hbar \ \ \ \ \ (3)$

This allows the construction of explicit forms of the operators in other bases, such as the momentum basis, where

 $\displaystyle X$ $\displaystyle =$ $\displaystyle i\hbar\frac{d}{dp}\ \ \ \ \ (4)$ $\displaystyle P$ $\displaystyle =$ $\displaystyle p \ \ \ \ \ (5)$

We can verify this by applying the commutator to a function ${f\left(p\right)}$:

 $\displaystyle \left[X,P\right]f$ $\displaystyle =$ $\displaystyle i\hbar\frac{d}{dp}\left(pf\left(p\right)\right)-i\hbar p\frac{d}{dp}f\left(p\right)\ \ \ \ \ (6)$ $\displaystyle$ $\displaystyle =$ $\displaystyle i\hbar f\left(p\right)+i\hbar p\frac{d}{dp}f\left(p\right)-i\hbar p\frac{d}{dp}f\left(p\right)\ \ \ \ \ (7)$ $\displaystyle$ $\displaystyle =$ $\displaystyle i\hbar f\left(p\right) \ \ \ \ \ (8)$

Thus 3 is satisfied in the momentum basis as well.
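We can also let a computer algebra system do this check. The sketch below (using sympy; the operator definitions follow 4 and 5) applies the commutator to an arbitrary function and confirms that the ${pf^{\prime}}$ terms cancel:

```python
import sympy as sp

p, hbar = sp.symbols('p hbar')
f = sp.Function('f')(p)

# Momentum-basis operators: X = i*hbar*d/dp, P = multiplication by p
X = lambda g: sp.I * hbar * sp.diff(g, p)
P = lambda g: p * g

# [X, P] f = X(P f) - P(X f); the i*hbar*p*f' terms cancel
commutator = sp.expand(X(P(f)) - P(X(f)))
print(commutator)  # I*hbar*f(p)
```

Since ${f}$ is left completely general, this verifies the operator identity, not just a special case.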

The standard recipe for converting a classical system to a quantum one is to first calculate the Poisson bracket of two physical quantities in the classical system, defined as

$\displaystyle \left\{ \omega,\lambda\right\} =\sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial\lambda}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial\lambda}{\partial q_{i}}\right) \ \ \ \ \ (9)$

where ${q_{i}}$ and ${p_{i}}$ are the canonical coordinates and momenta. To convert to a quantum commutator, we replace the classical quantities by their quantum operator equivalents and the Poisson bracket by the commutator divided by ${i\hbar}$. That is

$\displaystyle \left[\Omega,\Lambda\right]=i\hbar\left\{ \omega,\lambda\right\} \ \ \ \ \ (10)$

For the case of ${X}$ and ${P}$, we have, in classical mechanics in one dimension

$\displaystyle \left\{ x,p\right\} =\frac{\partial x}{\partial x}\frac{\partial p}{\partial p}-\frac{\partial x}{\partial p}\frac{\partial p}{\partial x}=1 \ \ \ \ \ (11)$

so the quantum commutator is given by 3.

For other quantities, we can use the standard identities for Poisson brackets to reduce them:

 $\displaystyle \left\{ \omega,\lambda\right\}$ $\displaystyle =$ $\displaystyle -\left\{ \lambda,\omega\right\} \ \ \ \ \ (12)$ $\displaystyle \left\{ \omega,\lambda+\sigma\right\}$ $\displaystyle =$ $\displaystyle \left\{ \omega,\lambda\right\} +\left\{ \omega,\sigma\right\} \ \ \ \ \ (13)$ $\displaystyle \left\{ \omega,\lambda\sigma\right\}$ $\displaystyle =$ $\displaystyle \left\{ \omega,\lambda\right\} \sigma+\left\{ \omega,\sigma\right\} \lambda \ \ \ \ \ (14)$

Quantum commutators obey similar rules:

 $\displaystyle \left[\Omega,\Lambda\right]$ $\displaystyle =$ $\displaystyle -\left[\Lambda,\Omega\right]\ \ \ \ \ (15)$ $\displaystyle \left[\Omega,\Lambda+\Gamma\right]$ $\displaystyle =$ $\displaystyle \left[\Omega,\Lambda\right]+\left[\Omega,\Gamma\right]\ \ \ \ \ (16)$ $\displaystyle \left[\Omega\Lambda,\Gamma\right]$ $\displaystyle =$ $\displaystyle \Omega\left[\Lambda,\Gamma\right]+\left[\Omega,\Gamma\right]\Lambda \ \ \ \ \ (17)$

The main difference between Poisson brackets and commutators is that, for the latter, the order of the operators in the last equation can make a difference. That is, in 14 we could also have written

$\displaystyle \left\{ \omega,\lambda\sigma\right\} =\sigma\left\{ \omega,\lambda\right\} +\lambda\left\{ \omega,\sigma\right\} \ \ \ \ \ (18)$

since all three quantities are ordinary numbers (not operators), whose multiplication commutes. In 17, however, it is not true in general that, for example

$\displaystyle \Omega\left[\Lambda,\Gamma\right]+\left[\Omega,\Gamma\right]\Lambda=\left[\Lambda,\Gamma\right]\Omega+\left[\Omega,\Gamma\right]\Lambda \ \ \ \ \ (19)$
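The ordering issue is easy to see with matrices, which also fail to commute. In the following sketch (numpy, with arbitrary random matrices standing in for ${\Omega,\Lambda,\Gamma}$), the product rule 17 holds exactly, but moving ${\Omega}$ to the other side of the commutator changes the result:

```python
import numpy as np

rng = np.random.default_rng(0)

def comm(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return A @ B - B @ A

# Three random (non-commuting) 3x3 matrices standing in for Omega, Lambda, Gamma
O, L, G = (rng.standard_normal((3, 3)) for _ in range(3))

# The product rule [OL, G] = O[L, G] + [O, G]L is an exact identity...
lhs = comm(O @ L, G)
rhs = O @ comm(L, G) + comm(O, G) @ L
print(np.allclose(lhs, rhs))  # True

# ...but moving O to the other side of the first commutator changes the result
wrong = comm(L, G) @ O + comm(O, G) @ L
print(np.allclose(lhs, wrong))  # False (generically)
```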

The conversion from classical to quantum mechanics can then be achieved in general by replacing

$\displaystyle \left\{ \omega\left(x,p\right),\lambda\left(x,p\right)\right\} =\gamma\left(x,p\right) \ \ \ \ \ (20)$

by

$\displaystyle \left[\Omega\left(X,P\right),\Lambda\left(X,P\right)\right]=i\hbar\Gamma\left(X,P\right) \ \ \ \ \ (21)$

where each of the operators in the last equation is obtained by replacing ${x}$ in the first equation by ${X}$ and ${p}$ by ${P}$. We do need to be careful with the ordering of the operators in the quantum version, however.

As an example, suppose we have

 $\displaystyle \Omega$ $\displaystyle =$ $\displaystyle X\ \ \ \ \ (22)$ $\displaystyle \Lambda$ $\displaystyle =$ $\displaystyle X^{2}+P^{2} \ \ \ \ \ (23)$

In the classical version, we calculate the Poisson bracket

 $\displaystyle \left\{ \omega,\lambda\right\}$ $\displaystyle =$ $\displaystyle \left\{ x,x^{2}+p^{2}\right\} \ \ \ \ \ (24)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left\{ x,x^{2}\right\} +\left\{ x,p^{2}\right\} \ \ \ \ \ (25)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0+2\left\{ x,p\right\} p\ \ \ \ \ (26)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 2p \ \ \ \ \ (27)$

Thus, by our rule above, the quantum version should be

$\displaystyle \left[\Omega,\Lambda\right]=2i\hbar P \ \ \ \ \ (28)$

We can verify this using 17

 $\displaystyle \left[X,X^{2}+P^{2}\right]$ $\displaystyle =$ $\displaystyle \left[X,X^{2}\right]+\left[X,P^{2}\right]\ \ \ \ \ (29)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0-\left[P^{2},X\right]\ \ \ \ \ (30)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -P\left[P,X\right]-\left[P,X\right]P\ \ \ \ \ (31)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -P\left(-i\hbar\right)-\left(-i\hbar\right)P\ \ \ \ \ (32)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 2i\hbar P \ \ \ \ \ (33)$

In this case, there is no ordering ambiguity in the quantum version, since ${\left[X,P\right]=i\hbar}$ is just a number.
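We can repeat this check symbolically in the position basis, where ${X}$ multiplies by ${x}$ and ${P=-i\hbar\,d/dx}$ (a sympy sketch acting on an arbitrary function ${f}$):

```python
import sympy as sp

x, hbar = sp.symbols('x hbar')
f = sp.Function('f')(x)

# Position-basis operators: X multiplies by x, P = -i*hbar*d/dx
X = lambda g: x * g
P = lambda g: -sp.I * hbar * sp.diff(g, x)
Lam = lambda g: X(X(g)) + P(P(g))  # Lambda = X^2 + P^2

# [X, Lambda] f - 2*i*hbar*P f should vanish identically
diff = sp.simplify(X(Lam(f)) - Lam(X(f)) - 2*sp.I*hbar*P(f))
print(diff)  # 0
```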

For a second example, suppose we have

 $\displaystyle \Omega$ $\displaystyle =$ $\displaystyle X^{2}\ \ \ \ \ (34)$ $\displaystyle \Lambda$ $\displaystyle =$ $\displaystyle P^{2} \ \ \ \ \ (35)$

The classical version gives us, using the relations 14, 11 and 27

 $\displaystyle \left\{ x^{2},p^{2}\right\}$ $\displaystyle =$ $\displaystyle -\left\{ p^{2},x^{2}\right\} \ \ \ \ \ (36)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -2\left\{ p^{2},x\right\} x\ \ \ \ \ (37)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 2\left\{ x,p^{2}\right\} x\ \ \ \ \ (38)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 4px \ \ \ \ \ (39)$

In the classical case, this result is the same as ${4xp}$, but because ${X}$ and ${P}$ don’t commute in the quantum form, we need to be careful about the ordering.

We can do the calculation:

$\displaystyle \left[X^{2},P^{2}\right]=X\left[X,P^{2}\right]+\left[X,P^{2}\right]X \ \ \ \ \ (40)$

From 33 we have

$\displaystyle \left[X,P^{2}\right]=2i\hbar P \ \ \ \ \ (41)$

so we get

$\displaystyle \left[X^{2},P^{2}\right]=2i\hbar\left(XP+PX\right) \ \ \ \ \ (42)$

Thus if the Poisson bracket involves a product of ${p}$ and ${x}$, this should be replaced by

$\displaystyle xp\mbox{ or }px\rightarrow\frac{1}{2}\left(XP+PX\right) \ \ \ \ \ (43)$

in the quantum version.
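The result 42 can be confirmed the same way, again with a sympy sketch in the position basis acting on an arbitrary function:

```python
import sympy as sp

x, hbar = sp.symbols('x hbar')
f = sp.Function('f')(x)

X = lambda g: x * g
P = lambda g: -sp.I * hbar * sp.diff(g, x)

lhs = X(X(P(P(f)))) - P(P(X(X(f))))    # [X^2, P^2] f
rhs = 2*sp.I*hbar*(X(P(f)) + P(X(f)))  # 2 i hbar (XP + PX) f
print(sp.simplify(lhs - rhs))  # 0
```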

# Linear operators & commutators

References: edX online course MIT 8.05.1x Week 3.

Sheldon Axler (2015), Linear Algebra Done Right, 3rd edition, Springer. Chapter 3.

Having looked at some of the properties of a vector space, we can now look at linear maps. A linear map ${T}$ is defined as a function that maps one vector space ${V}$ into another (possibly the same) vector space ${W}$, written as

$\displaystyle T:V\rightarrow W \ \ \ \ \ (1)$

The linear map ${T}$ must satisfy the two properties

1. Additivity: ${T\left(u+v\right)=Tu+Tv}$ for all ${u,v\in V}$.
2. Homogeneity: ${T\left(\lambda v\right)=\lambda\left(Tv\right)}$ for all ${\lambda\in\mathbb{F}}$ and all ${v\in V}$. As usual, the field ${\mathbb{F}}$ is either the set of real or complex numbers.

This definition of a linear map is general in the sense that the two vector spaces ${V}$ and ${W}$ can be any two vector spaces. In physics, it’s more common to have ${V=W}$, and in such a case, the linear map ${T}$ is called a linear operator.

With a couple of extra definitions, the set ${\mathcal{L}\left(V\right)}$ of all linear operators on ${V}$ is itself a vector space, with the operators being the vectors. In order for this to be true, we need the following:

1. Zero operator: A zero operator, written as just 0 (the same symbol is now used for three distinct objects: the scalar 0, the vector 0 and the operator 0; again, the correct meaning is usually easy to deduce from the context), which has the property that acting with 0 on any vector produces the zero vector. That is, ${0v=0}$, where the 0 on the LHS is the zero operator and the 0 on the RHS is the zero vector.
2. Identity operator: An identity operator ${I}$ (sometimes written as 1) leaves any vector unchanged, so that ${Iv=v}$ for all ${v\in V}$.

With these definitions, ${\mathcal{L}\left(V\right)}$ is now a vector space, since it satisfies the distributive (additivity) and scalar multiplication (homogeneity) properties, and contains both an additive identity (the zero operator) and a multiplicative identity (the identity operator).

In addition, there is a natural definition of the multiplication of two linear operators ${S}$ and ${T}$, written as ${ST}$. When a product operates on a vector ${v\in V}$, we just operate from right to left in succession, so that

$\displaystyle \left(ST\right)v=S\left(Tv\right) \ \ \ \ \ (2)$

The product of two operators produces another operator also in ${\mathcal{L}\left(V\right)}$, since this product also satisfies additivity and homogeneity:

 $\displaystyle \left(ST\right)\left(u+v\right)$ $\displaystyle =$ $\displaystyle S\left(T\left(u+v\right)\right)\ \ \ \ \ (3)$ $\displaystyle$ $\displaystyle =$ $\displaystyle S\left(Tu+Tv\right)\ \ \ \ \ (4)$ $\displaystyle$ $\displaystyle =$ $\displaystyle STu+STv\ \ \ \ \ (5)$ $\displaystyle \left(ST\right)\left(\lambda v\right)$ $\displaystyle =$ $\displaystyle S\left(T\left(\lambda v\right)\right)\ \ \ \ \ (6)$ $\displaystyle$ $\displaystyle =$ $\displaystyle S\left(\lambda Tv\right)\ \ \ \ \ (7)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \lambda S\left(Tv\right)\ \ \ \ \ (8)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \lambda\left(ST\right)v \ \ \ \ \ (9)$

A very important property of operator multiplication is that it is not commutative. We’ve already seen many examples of this in our journey through quantum mechanics, with operators such as position and momentum, angular momentum and so on. The non-commutativity is a fundamental mathematical property, however, and can be seen in other examples that have nothing to do with quantum theory.

For example, consider the left shift operator ${L}$ and right shift operator ${R}$, defined to act on the vector space consisting of infinite sequences of numbers. That is, our vector space ${V}$ is such that

$\displaystyle v=\left(x_{1},x_{2},x_{3},\ldots\right) \ \ \ \ \ (10)$

where ${x_{i}\in\mathbb{F}}$. The shift operators have the following effects:

 $\displaystyle Lv$ $\displaystyle =$ $\displaystyle \left(x_{2},x_{3},\ldots\right)\ \ \ \ \ (11)$ $\displaystyle Rv$ $\displaystyle =$ $\displaystyle \left(0,x_{1},x_{2},x_{3},\ldots\right) \ \ \ \ \ (12)$

The ${L}$ operator removes the first element in the sequence, while the ${R}$ operator inserts a 0 (number!) as the new first element in the sequence. Note that 0 is the only number we could insert into a sequence in order that ${R}$ be a linear operator, since from additivity above, we must have ${R0=0}$. That is, if we start with ${v=0}$ (the vector all of whose elements ${x_{i}=0}$), then ${R0}$ must also give the zero vector.

The two products ${LR}$ and ${RL}$ produce different results:

 $\displaystyle LRv$ $\displaystyle =$ $\displaystyle L\left(0,x_{1},x_{2},x_{3},\ldots\right)=\left(x_{1},x_{2},x_{3},\ldots\right)=v\ \ \ \ \ (13)$ $\displaystyle RLv$ $\displaystyle =$ $\displaystyle R\left(x_{2},x_{3},\ldots\right)=\left(0,x_{2},x_{3},\ldots\right)\ne v \ \ \ \ \ (14)$

The difference ${\left[L,R\right]\equiv LR-RL}$ is called the commutator of the two operators ${L}$ and ${R}$. If we introduce the operator which projects out the first element in the sequence:

 $\displaystyle P_{1}v$ $\displaystyle \equiv$ $\displaystyle P_{1}\left(x_{1},x_{2},x_{3},\ldots\right)=\left(x_{1},0,0,\ldots\right)\ \ \ \ \ (15)$ $\displaystyle Iv-P_{1}v$ $\displaystyle =$ $\displaystyle \left(x_{1},x_{2},x_{3},\ldots\right)-\left(x_{1},0,0,\ldots\right)\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left(0,x_{2},x_{3},\ldots\right) \ \ \ \ \ (17)$

then we have

 $\displaystyle \left[L,R\right]v$ $\displaystyle =$ $\displaystyle Iv-\left(Iv-P_{1}v\right)\ \ \ \ \ (18)$ $\displaystyle$ $\displaystyle =$ $\displaystyle P_{1}v \ \ \ \ \ (19)$
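A finite truncation of the sequence space makes this concrete. In the sketch below (numpy, with the shift operators written as ${n\times n}$ matrices), we pick a vector with a zero tail so the truncation introduces no edge effects in the slots we look at, and the commutator acts exactly as the projector ${P_{1}}$:

```python
import numpy as np

n = 8  # truncate the infinite sequences to n slots for the demo
L = np.eye(n, k=1)   # left shift: drops the first entry
R = np.eye(n, k=-1)  # right shift: inserts a 0 in front

v = np.array([3., 1., 4., 1., 5., 0., 0., 0.])  # zero tail, so truncation is exact

print(L @ R @ v)            # [3. 1. 4. 1. 5. 0. 0. 0.] : LR v = v
print(R @ L @ v)            # [0. 1. 4. 1. 5. 0. 0. 0.] : first entry zeroed
print((L @ R - R @ L) @ v)  # [3. 0. 0. 0. 0. 0. 0. 0.] : P1 v
```

On a finite truncation there is a boundary artifact in the very last slot (the true ${L}$ and ${R}$ act on infinite sequences); a vector whose tail is zero avoids it.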

# Lie rotations in higher dimensions

References: Anthony Zee, Einstein Gravity in a Nutshell, (Princeton University Press, 2013) – Chapter I.3, Problem 4.

We can generalize Lie’s method of generating rotations to any number ${D}$ of dimensions. [This post follows Appendix 2 in Zee’s chapter I.3, which I believe contains a few typos. I’ll try to correct them here.]

To review the last post, a rotation should leave the dot product of two vectors unchanged. If we consider an infinitesimal rotation given by the matrix ${R=I+A}$, where ${I}$ is the identity matrix and ${A}$ is a matrix containing infinitesimal quantities, then the invariance of the dot product leads to the conclusion that ${R^{T}R=I}$ which requires (to first order in ${A}$) that ${A^{T}=-A}$, that is, that ${A}$ is antisymmetric. In ${D}$ dimensions, there are ${\frac{1}{2}D\left(D-1\right)}$ independent antisymmetric matrices, which can be written down by choosing a row ${m}$ and a column ${n (since diagonal elements are all zero in an antisymmetric matrix), setting the element ${\mathcal{J}^{mn}=1}$ and the element ${\mathcal{J}^{nm}=-1}$. Then any antisymmetric matrix ${A}$ can be decomposed into a linear combination of the ${\mathcal{J}}$ matrices. To distinguish the ${\mathcal{J}}$s, we’ll given them a subscript ${\left(mn\right)}$ to label which row and column are non-zero. For example in 3-d, we have

 $\displaystyle \mathcal{J}_{\left(32\right)}$ $\displaystyle =$ $\displaystyle \left(\begin{array}{ccc} 0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0 \end{array}\right)\ \ \ \ \ (1)$ $\displaystyle \mathcal{J}_{\left(31\right)}$ $\displaystyle =$ $\displaystyle \left(\begin{array}{ccc} 0 & 0 & -1\\ 0 & 0 & 0\\ 1 & 0 & 0 \end{array}\right)\ \ \ \ \ (2)$ $\displaystyle \mathcal{J}_{\left(21\right)}$ $\displaystyle =$ $\displaystyle \left(\begin{array}{ccc} 0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0 \end{array}\right) \ \ \ \ \ (3)$

[Note that these matrices aren’t quite the same as those we used in the last post, since ${\mathcal{J}_{\left(32\right)}=-\mathcal{J}_{x}}$ and ${\mathcal{J}_{\left(21\right)}=-\mathcal{J}_{z}}$.] The subscript labels the entire matrix and not just a single element within a matrix. To pick out an individual element, we’ll use superscript indices, so that, for example, ${\mathcal{J}_{\left(32\right)}^{23}=-1}$ is the element of ${\mathcal{J}_{\left(32\right)}}$ in row 2 and column 3.

Because these matrices are related to the angular momentum operators in quantum mechanics, it’s customary to make them into hermitian operators, which, for a matrix, means that the complex conjugate of the transpose (the hermitian conjugate) is the same as the original matrix. That is, for a matrix ${J_{\left(mn\right)}}$ we have ${\left(J_{\left(mn\right)}^{T}\right)^*=J_{\left(mn\right)}}$. Since the ${\mathcal{J}_{\left(mn\right)}}$ are antisymmetric and real, their hermitian conjugates are equal to their negatives (they are anti-hermitian), so they aren’t hermitian matrices. We can convert them into hermitian matrices by multiplying them by a multiple of ${i=\sqrt{-1}}$; by convention this multiple is taken to be ${-i}$. Thus we have the hermitian matrices

$\displaystyle J_{\left(mn\right)}\equiv-i\mathcal{J}_{\left(mn\right)} \ \ \ \ \ (4)$

We can write out the ${J_{\left(mn\right)}}$ in a general formula using the Kronecker delta:

$\displaystyle J_{\left(mn\right)}^{ij}=-i\left(\delta^{mi}\delta^{nj}-\delta^{mj}\delta^{ni}\right) \ \ \ \ \ (5)$

To verify this formula, remember that ${J_{\left(mn\right)}^{mn}=-J_{\left(mn\right)}^{nm}=-i}$ with all other elements being zero. The first term in 5 is non-zero only if ${m=i}$ and ${n=j}$, while the second term is non-zero only if ${m=j}$ and ${n=i}$, so the formula works.
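The formula 5 is also easy to check by machine. The sketch below (numpy; the helper `J` is ours, with 1-based indices ${m,n}$ to match the notation) builds ${J_{\left(mn\right)}}$ from the Kronecker deltas and recovers the matrix ${\mathcal{J}_{\left(32\right)}}$ of equation 1:

```python
import numpy as np

def J(m, n, D):
    """Hermitian generator J_(mn) = -i * script-J_(mn), built from
    the Kronecker-delta formula 5 (indices m, n are 1-based)."""
    Jmn = np.zeros((D, D), dtype=complex)
    for i in range(1, D + 1):
        for j in range(1, D + 1):
            Jmn[i-1, j-1] = -1j * ((m == i)*(n == j) - (m == j)*(n == i))
    return Jmn

# script-J_(32) = i * J_(32) should reproduce the matrix in equation 1
scriptJ32 = (1j * J(3, 2, 3)).real
print(scriptJ32)  # rows: [0 0 0], [0 0 -1], [0 1 0]
```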

[In the paragraph following Zee’s equation 19, he says “there are only ${\frac{1}{2}D\left(D-1\right)}$ real antisymmetric ${D}$-by-${D}$ matrices ${J_{\left(mn\right)}}$“. The last ${J_{\left(mn\right)}}$ should be ${\mathcal{J}_{\left(mn\right)}}$ since the ${J_{\left(mn\right)}}$ contain purely imaginary elements, not real ones.]

To generate a rotation in ${D}$ dimensions, we can use Lie’s method of considering infinitesimal rotations ${R=I+A}$, where ${A}$ is an infinitesimal linear combination of the ${J_{\left(mn\right)}}$. Since ${A}$ is real, we have

$\displaystyle A=i\sum_{i}\theta_{i}J_{i} \ \ \ \ \ (6)$

for some real values ${\theta_{i}}$.

The next stage in the argument isn’t entirely clear to me, probably because I haven’t seen the use to which rotations are put in the rest of Zee’s book. However, let’s plow on for the moment.

We saw in the previous post that in 3-d, rotations about different axes do not commute; a rotation about ${x}$ and then ${y}$ will leave you in a different orientation than a rotation about ${y}$ and then ${x}$. Generalizing to ${D}$ dimensions, suppose we have two infinitesimal rotations ${R_{1}=I+A+\mathcal{O}\left(A^{2}\right)}$ and ${R_{2}=I+B+\mathcal{O}\left(B^{2}\right)}$, where ${A}$ and ${B}$ are infinitesimal, antisymmetric matrices as before. Then if we apply ${R_{2}}$ first, then ${R_{1}}$, the overall rotation is given by

 $\displaystyle R_{1}R_{2}$ $\displaystyle =$ $\displaystyle \left(I+A+\mathcal{O}\left(A^{2}\right)\right)\left(I+B+\mathcal{O}\left(B^{2}\right)\right)\ \ \ \ \ (7)$ $\displaystyle$ $\displaystyle =$ $\displaystyle I+A+B+AB+\mathcal{O}\left(A^{2},B^{2}\right) \ \ \ \ \ (8)$

Switching the order [note that Zee has a typo here: he says ${R_{2}R_{1}\simeq\left(I+A\right)\left(I+B\right)}$; it should be ${R_{2}R_{1}\simeq\left(I+B\right)\left(I+A\right)}$] we get

 $\displaystyle R_{2}R_{1}$ $\displaystyle =$ $\displaystyle \left(I+B+\mathcal{O}\left(B^{2}\right)\right)\left(I+A+\mathcal{O}\left(A^{2}\right)\right)\ \ \ \ \ (9)$ $\displaystyle$ $\displaystyle =$ $\displaystyle I+B+A+BA+\mathcal{O}\left(A^{2},B^{2}\right) \ \ \ \ \ (10)$

Taking the difference, we get

 $\displaystyle R_{1}R_{2}-R_{2}R_{1}$ $\displaystyle =$ $\displaystyle AB-BA\ \ \ \ \ (11)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[A,B\right] \ \ \ \ \ (12)$

where ${\left[A,B\right]\equiv AB-BA}$ is the commutator of ${A}$ and ${B}$ (the same commutators that show up in quantum mechanics). This derivation seems to be a bit of a fudge, since the commutator is, by definition, of second order in ${A}$ and ${B}$, so by separating it out from the general ${\mathcal{O}\left(A^{2},B^{2}\right)}$ term, it seems we’re implicitly assuming that the ${\mathcal{O}\left(A^{2},B^{2}\right)}$ is the same in both ${R_{1}R_{2}}$ and ${R_{2}R_{1}}$, so it cancels out when taking the difference.

Zee gives a second argument for measuring the difference between the two compound rotations ${R_{1}R_{2}}$ and ${R_{2}R_{1}}$. If the two orders of rotation commuted, then the inverse of one rotation should also be the inverse of the other. That is, we should have ${\left(R_{2}R_{1}\right)^{-1}\left(R_{1}R_{2}\right)=I}$. A measure of how different the two orders of rotation are from each other can then be found by seeing how much ${\left(R_{2}R_{1}\right)^{-1}\left(R_{1}R_{2}\right)}$ differs from ${I}$.

For an infinitesimal rotation ${R=I+A}$, then to first order ${R^{-1}=I-A}$ since ${R^{-1}R=I+A-A-A^{2}=I+\mathcal{O}\left(A^{2}\right)}$. Therefore

 $\displaystyle \left(R_{2}R_{1}\right)^{-1}\left(R_{1}R_{2}\right)$ $\displaystyle =$ $\displaystyle \left[I-\left(B+A+BA+\mathcal{O}\left(A^{2},B^{2}\right)\right)\right]\left[I+A+B+AB+\mathcal{O}\left(A^{2},B^{2}\right)\right]\ \ \ \ \ (13)$ $\displaystyle$ $\displaystyle =$ $\displaystyle I+\left[A,B\right]-\left(A+B\right)^{2}+\mathcal{O}\left(A^{2},B^{2}\right) \ \ \ \ \ (14)$

Again, the ${\left(A+B\right)^{2}}$ term is neglected along with the ${\mathcal{O}\left(A^{2},B^{2}\right)}$ terms, even though it contains the terms ${AB+BA}$ which are the same terms that appear in ${\left[A,B\right]}$. It’s not clear to me how we can justify ignoring ${AB+BA}$ but not ${\left[A,B\right]}$.

In any case, we can observe that the transpose of a commutator gives

 $\displaystyle \left[A,B\right]^{T}$ $\displaystyle =$ $\displaystyle \left(AB-BA\right)^{T}\ \ \ \ \ (15)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left(BA-AB\right)\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\left[A,B\right] \ \ \ \ \ (17)$

so a commutator is always an antisymmetric matrix. In particular, for the ${J_{i}}$ matrices in 5, we have

$\displaystyle \left[J_{i},J_{j}\right]=ic_{ijk}J_{k} \ \ \ \ \ (18)$

with an implied sum over ${k}$ on the RHS. This follows because any antisymmetric matrix can be written as a linear combination of the ${J_{k}}$s.

Since the ${J_{i}}$s are purely imaginary, a product of two of them is always real. Therefore the ${c_{ijk}}$ coefficients must be real, since ${iJ_{k}}$ is a real matrix.

We can find the ${c_{ijk}}$ coefficients by a brute force calculation starting with 5. We get (with an implied sum over the index ${i}$):

 $\displaystyle \left[J_{\left(mn\right)},J_{\left(pq\right)}\right]^{k\ell}$ $\displaystyle =$ $\displaystyle J_{\left(mn\right)}^{ki}J_{\left(pq\right)}^{i\ell}-J_{\left(pq\right)}^{ki}J_{\left(mn\right)}^{i\ell}\ \ \ \ \ (19)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\left[\delta^{mk}\delta^{ni}-\delta^{mi}\delta^{nk}\right]\left[\delta^{pi}\delta^{q\ell}-\delta^{qi}\delta^{p\ell}\right]\ \ \ \ \ (20)$ $\displaystyle$ $\displaystyle$ $\displaystyle +\left[\delta^{pk}\delta^{qi}-\delta^{pi}\delta^{qk}\right]\left[\delta^{mi}\delta^{n\ell}-\delta^{ni}\delta^{m\ell}\right]\nonumber$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\delta^{mk}\delta^{np}\delta^{q\ell}+\delta^{mk}\delta^{nq}\delta^{p\ell}+\delta^{mp}\delta^{nk}\delta^{q\ell}-\delta^{mq}\delta^{nk}\delta^{p\ell}\ \ \ \ \ (21)$ $\displaystyle$ $\displaystyle$ $\displaystyle +\delta^{pk}\delta^{mq}\delta^{n\ell}-\delta^{mp}\delta^{qk}\delta^{n\ell}-\delta^{pk}\delta^{qn}\delta^{m\ell}+\delta^{np}\delta^{qk}\delta^{m\ell}\nonumber$ $\displaystyle$ $\displaystyle =$ $\displaystyle \delta^{np}\left(\delta^{qk}\delta^{m\ell}-\delta^{mk}\delta^{q\ell}\right)+\delta^{nq}\left(\delta^{mk}\delta^{p\ell}-\delta^{m\ell}\delta^{pk}\right)\ \ \ \ \ (22)$ $\displaystyle$ $\displaystyle$ $\displaystyle +\delta^{mp}\left(\delta^{nk}\delta^{q\ell}-\delta^{qk}\delta^{n\ell}\right)+\delta^{mq}\left(\delta^{pk}\delta^{n\ell}-\delta^{nk}\delta^{p\ell}\right)\nonumber$ $\displaystyle$ $\displaystyle =$ $\displaystyle i\left[\delta^{mp}J_{\left(nq\right)}+\delta^{nq}J_{\left(mp\right)}-\delta^{np}J_{\left(mq\right)}-\delta^{mq}J_{\left(np\right)}\right]^{k\ell} \ \ \ \ \ (23)$

So the commutator is

$\displaystyle \left[J_{\left(mn\right)},J_{\left(pq\right)}\right]=i\left[\delta^{mp}J_{\left(nq\right)}+\delta^{nq}J_{\left(mp\right)}-\delta^{np}J_{\left(mq\right)}-\delta^{mq}J_{\left(np\right)}\right] \ \ \ \ \ (24)$
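Rather than trusting the index gymnastics, we can verify 24 numerically for every pair of generators (a numpy sketch; the helper `J` implements 5 with 1-based labels, and `d` is the Kronecker delta on those labels):

```python
import numpy as np

def J(m, n, D):
    """J_(mn)^{ij} = -i(delta^{mi}delta^{nj} - delta^{mj}delta^{ni}), 1-based m, n."""
    i, j = np.indices((D, D)) + 1
    return -1j * (((m == i) & (n == j)).astype(float)
                  - ((m == j) & (n == i)).astype(float))

def comm(A, B):
    return A @ B - B @ A

D = 5
d = lambda a, b: 1.0 if a == b else 0.0  # Kronecker delta on the labels

# Check the commutation relation for every pair of generators in D = 5
for m in range(1, D + 1):
    for n in range(1, m):
        for p in range(1, D + 1):
            for q in range(1, p):
                lhs = comm(J(m, n, D), J(p, q, D))
                rhs = 1j*(d(m, p)*J(n, q, D) + d(n, q)*J(m, p, D)
                          - d(n, p)*J(m, q, D) - d(m, q)*J(n, p, D))
                assert np.allclose(lhs, rhs)
print("commutation relation verified for all generator pairs in D =", D)
```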

For general infinitesimal rotations from 6

 $\displaystyle A$ $\displaystyle =$ $\displaystyle i\sum_{i}\theta_{i}J_{i}\ \ \ \ \ (25)$ $\displaystyle B$ $\displaystyle =$ $\displaystyle i\sum_{j}\theta_{j}^{\prime}J_{j} \ \ \ \ \ (26)$

The commutator is therefore

 $\displaystyle \left[A,B\right]$ $\displaystyle =$ $\displaystyle i^{2}\left[\sum_{i,j}\theta_{i}\theta_{j}^{\prime}J_{i}J_{j}-\sum_{i,j}\theta_{i}\theta_{j}^{\prime}J_{j}J_{i}\right]\ \ \ \ \ (27)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\sum_{i,j}\theta_{i}\theta_{j}^{\prime}\left[J_{i},J_{j}\right] \ \ \ \ \ (28)$

Thus if we know the commutators of the generator matrices ${J_{i}}$ we can work out the commutators of any antisymmetric matrix pair.

# Riemann tensor – commutator of rank 2 tensor

Required math: algebra, calculus

Required physics: none

Reference: d’Inverno, Ray, Introducing Einstein’s Relativity (1992), Oxford Uni Press. – Section 6.5; Problem 6.10.

The covariant derivative of a contravariant vector is defined as

$\displaystyle \nabla_{b}V^{a}\equiv V_{\;;b}^{a}\equiv\frac{\partial V^{a}}{\partial x^{b}}+V^{c}\Gamma_{cb}^{a} \ \ \ \ \ (1)$

This is generalized to the covariant derivative of a higher-rank tensor by the formula

$\displaystyle T_{cd\ldots;e}^{ab\ldots}=\partial_{e}T_{cd\ldots}^{ab\ldots}+T_{cd\ldots}^{fb\ldots}\Gamma_{fe}^{a}+T_{cd\ldots}^{af\ldots}\Gamma_{fe}^{b}+\ldots-T_{fd\ldots}^{ab\ldots}\Gamma_{ce}^{f}-T_{cf\ldots}^{ab\ldots}\Gamma_{de}^{f}-\ldots \ \ \ \ \ (2)$

Ordinary partial derivatives, for a continuously differentiable function ${f\left(x^{a}\right)}$, are commutative, that is

$\displaystyle \frac{\partial}{\partial x^{b}}\left(\frac{\partial f}{\partial x^{a}}\right)=\frac{\partial}{\partial x^{a}}\left(\frac{\partial f}{\partial x^{b}}\right) \ \ \ \ \ (3)$

The covariant derivative, however, is not in general commutative, as we can verify by direct calculation. We want to find

$\displaystyle X_{\;\; b;c;d}^{a}-X_{\;\; b;d;c}^{a} \ \ \ \ \ (4)$

which is known as the commutator of the tensor ${X_{\;\; b}^{a}}$. For the first term, we get, using 2

 $\displaystyle X_{\;\; b;c;d}^{a}$ $\displaystyle =$ $\displaystyle \partial_{d}X_{\;\; b;c}^{a}+X_{\;\; b;c}^{e}\Gamma_{ed}^{a}-X_{\;\; e;c}^{a}\Gamma_{bd}^{e}-X_{\;\; b;e}^{a}\Gamma_{cd}^{e}\ \ \ \ \ (5)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \partial_{d}\left(\partial_{c}X_{\;\; b}^{a}+X_{\;\; b}^{e}\Gamma_{ec}^{a}-X_{\;\; e}^{a}\Gamma_{bc}^{e}\right)+\ \ \ \ \ (6)$ $\displaystyle$ $\displaystyle$ $\displaystyle \Gamma_{ed}^{a}\left(\partial_{c}X_{\;\; b}^{e}+X_{\;\; b}^{f}\Gamma_{fc}^{e}-X_{\;\; f}^{e}\Gamma_{bc}^{f}\right)-\ \ \ \ \ (7)$ $\displaystyle$ $\displaystyle$ $\displaystyle \Gamma_{bd}^{e}\left(\partial_{c}X_{\;\; e}^{a}+X_{\;\; e}^{f}\Gamma_{fc}^{a}-X_{\;\; f}^{a}\Gamma_{ec}^{f}\right)-\ \ \ \ \ (8)$ $\displaystyle$ $\displaystyle$ $\displaystyle \Gamma_{cd}^{e}\left(\partial_{e}X_{\;\; b}^{a}+X_{\;\; b}^{f}\Gamma_{fe}^{a}-X_{\;\; f}^{a}\Gamma_{be}^{f}\right) \ \ \ \ \ (9)$

The other term can be obtained by simply swapping the indices ${c}$ and ${d}$:

 $\displaystyle X_{\;\; b;d;c}^{a}$ $\displaystyle =$ $\displaystyle \partial_{c}X_{\;\; b;d}^{a}+X_{\;\; b;d}^{e}\Gamma_{ec}^{a}-X_{\;\; e;d}^{a}\Gamma_{bc}^{e}-X_{\;\; b;e}^{a}\Gamma_{dc}^{e}\ \ \ \ \ (10)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \partial_{c}\left(\partial_{d}X_{\;\; b}^{a}+X_{\;\; b}^{e}\Gamma_{ed}^{a}-X_{\;\; e}^{a}\Gamma_{bd}^{e}\right)+\ \ \ \ \ (11)$ $\displaystyle$ $\displaystyle$ $\displaystyle \Gamma_{ec}^{a}\left(\partial_{d}X_{\;\; b}^{e}+X_{\;\; b}^{f}\Gamma_{fd}^{e}-X_{\;\; f}^{e}\Gamma_{bd}^{f}\right)-\ \ \ \ \ (12)$ $\displaystyle$ $\displaystyle$ $\displaystyle \Gamma_{bc}^{e}\left(\partial_{d}X_{\;\; e}^{a}+X_{\;\; e}^{f}\Gamma_{fd}^{a}-X_{\;\; f}^{a}\Gamma_{ed}^{f}\right)-\ \ \ \ \ (13)$ $\displaystyle$ $\displaystyle$ $\displaystyle \Gamma_{dc}^{e}\left(\partial_{e}X_{\;\; b}^{a}+X_{\;\; b}^{f}\Gamma_{fe}^{a}-X_{\;\; f}^{a}\Gamma_{be}^{f}\right) \ \ \ \ \ (14)$

Now we need to take the difference. Assuming the ordinary partial derivatives commute and using the product rule, we get

 $\displaystyle X_{\;\; b;c;d}^{a}-X_{\;\; b;d;c}^{a}$ $\displaystyle =$ $\displaystyle X_{\;\; b}^{e}\left(\partial_{d}\Gamma_{ec}^{a}-\partial_{c}\Gamma_{ed}^{a}\right)-X_{\;\; e}^{a}\left(\partial_{d}\Gamma_{bc}^{e}-\partial_{c}\Gamma_{bd}^{e}\right)+\ \ \ \ \ (15)$ $\displaystyle$ $\displaystyle$ $\displaystyle X_{\;\; b}^{f}\left(\Gamma_{ed}^{a}\Gamma_{fc}^{e}-\Gamma_{ec}^{a}\Gamma_{fd}^{e}\right)-X_{\;\; f}^{e}\left(\Gamma_{ed}^{a}\Gamma_{bc}^{f}-\Gamma_{ec}^{a}\Gamma_{bd}^{f}\right)-\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle$ $\displaystyle X_{\;\; e}^{f}\left(\Gamma_{bd}^{e}\Gamma_{fc}^{a}-\Gamma_{bc}^{e}\Gamma_{fd}^{a}\right)+X_{\;\; f}^{a}\left(\Gamma_{bd}^{e}\Gamma_{ec}^{f}-\Gamma_{bc}^{e}\Gamma_{ed}^{f}\right)-\ \ \ \ \ (17)$ $\displaystyle$ $\displaystyle$ $\displaystyle \left(\partial_{e}X_{\;\; b}^{a}+X_{\;\; b}^{f}\Gamma_{fe}^{a}-X_{\;\; f}^{a}\Gamma_{be}^{f}\right)\left(\Gamma_{cd}^{e}-\Gamma_{dc}^{e}\right) \ \ \ \ \ (18)$

We can now swap the indices ${e}$ and ${f}$ in the first term in the third line (since they are both dummy indices) to get

$\displaystyle X_{\;\; e}^{f}\left(\Gamma_{bd}^{e}\Gamma_{fc}^{a}-\Gamma_{bc}^{e}\Gamma_{fd}^{a}\right)=X_{\;\; f}^{e}\left(\Gamma_{bd}^{f}\Gamma_{ec}^{a}-\Gamma_{bc}^{f}\Gamma_{ed}^{a}\right) \ \ \ \ \ (19)$

We can now see that this term cancels the last term on the second line. If we also assume that the affine connections are symmetric, so that

$\displaystyle \Gamma_{cd}^{e}=\Gamma_{dc}^{e} \ \ \ \ \ (20)$

then the last line disappears and we are left with

 $\displaystyle X_{\;\; b;c;d}^{a}-X_{\;\; b;d;c}^{a}$ $\displaystyle =$ $\displaystyle X_{\;\; b}^{e}\left(\partial_{d}\Gamma_{ec}^{a}-\partial_{c}\Gamma_{ed}^{a}\right)-X_{\;\; e}^{a}\left(\partial_{d}\Gamma_{bc}^{e}-\partial_{c}\Gamma_{bd}^{e}\right)+\ \ \ \ \ (21)$ $\displaystyle$ $\displaystyle$ $\displaystyle X_{\;\; b}^{e}\left(\Gamma_{fd}^{a}\Gamma_{ec}^{f}-\Gamma_{fc}^{a}\Gamma_{ed}^{f}\right)-X_{\;\; e}^{a}\left(\Gamma_{bc}^{f}\Gamma_{fd}^{e}-\Gamma_{bd}^{f}\Gamma_{fc}^{e}\right)\ \ \ \ \ (22)$ $\displaystyle$ $\displaystyle =$ $\displaystyle X_{\;\; b}^{e}\left(\partial_{d}\Gamma_{ec}^{a}-\partial_{c}\Gamma_{ed}^{a}+\Gamma_{fd}^{a}\Gamma_{ec}^{f}-\Gamma_{fc}^{a}\Gamma_{ed}^{f}\right)-\ \ \ \ \ (23)$ $\displaystyle$ $\displaystyle$ $\displaystyle X_{\;\; e}^{a}\left(\partial_{d}\Gamma_{bc}^{e}-\partial_{c}\Gamma_{bd}^{e}+\Gamma_{bc}^{f}\Gamma_{fd}^{e}-\Gamma_{bd}^{f}\Gamma_{fc}^{e}\right) \ \ \ \ \ (24)$

where again we have swapped ${e}$ and ${f}$ in the second line.

The two terms in parentheses have the same form, and they are known as the Riemann tensor or curvature tensor, defined by

$\displaystyle R_{\;\; edc}^{a}\equiv\partial_{d}\Gamma_{ec}^{a}-\partial_{c}\Gamma_{ed}^{a}+\Gamma_{fd}^{a}\Gamma_{ec}^{f}-\Gamma_{fc}^{a}\Gamma_{ed}^{f} \ \ \ \ \ (25)$

In terms of the Riemann tensor, we get for the commutator:

$\displaystyle X_{\;\; b;c;d}^{a}-X_{\;\; b;d;c}^{a}=X_{\;\; b}^{e}R_{\;\; edc}^{a}-X_{\;\; e}^{a}R_{\;\; bdc}^{e} \ \ \ \ \ (26)$

This is actually the same result as given in d’Inverno’s problem 6.10, with ${c}$ and ${d}$ swapped around; I just took the original covariant derivatives in the opposite order to d’Inverno and can’t be bothered going through the whole derivation again to change it.
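We can test 26 numerically in the special case where ${X_{\;\; b}^{a}}$ and ${\Gamma_{bc}^{a}}$ are constant, so every partial-derivative term vanishes and only the algebraic ${\Gamma\Gamma}$ structure of the identity is exercised. A numpy sketch, with random values for the tensor and the (symmetric) connection:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4

# Constant random tensor X^a_b and symmetric connection Gamma^a_{bc};
# with constant fields all partial-derivative terms drop out.
X = rng.standard_normal((D, D))
G = rng.standard_normal((D, D, D))
G = 0.5 * (G + G.transpose(0, 2, 1))  # Gamma^a_{bc} = Gamma^a_{cb}

# First covariant derivative: Y^a_{bc} = X^e_b Gamma^a_{ec} - X^a_e Gamma^e_{bc}
Y = np.einsum('eb,aec->abc', X, G) - np.einsum('ae,ebc->abc', X, G)

# Second: Z^a_{bcd} = X^a_{b;c;d} (the partial term is zero for constant Y)
Z = (np.einsum('ebc,aed->abcd', Y, G)
     - np.einsum('aec,ebd->abcd', Y, G)
     - np.einsum('abe,ecd->abcd', Y, G))

# Riemann tensor 25 for constant Gamma: R^a_{edc} = G^a_{fd}G^f_{ec} - G^a_{fc}G^f_{ed}
R = np.einsum('afd,fec->aedc', G, G) - np.einsum('afc,fed->aedc', G, G)

lhs = Z - Z.transpose(0, 1, 3, 2)  # X^a_{b;c;d} - X^a_{b;d;c}
rhs = np.einsum('eb,aedc->abcd', X, R) - np.einsum('ae,ebdc->abcd', X, R)
print(np.allclose(lhs, rhs))  # True
```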

# Angular momentum – commutators with position and momentum

Required math: calculus

Required physics: 3-d Schrödinger equation

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition, Pearson Education. Problem 4.19.

We’ve worked out the commutators of the components of angular momentum with each other, but it’s also instructive to see what the commutators of angular momentum with position and linear momentum are.

The commutators are all derived similarly, so here are a couple of them:

 $\displaystyle \left[L_{z},x\right]$ $\displaystyle =$ $\displaystyle \left[xp_{y}-yp_{x},x\right]\ \ \ \ \ (1)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -y\left[p_{x},x\right]\ \ \ \ \ (2)$ $\displaystyle$ $\displaystyle =$ $\displaystyle i\hbar y\ \ \ \ \ (3)$ $\displaystyle \left[L_{z},p_{x}\right]$ $\displaystyle =$ $\displaystyle \left[xp_{y}-yp_{x},p_{x}\right]\ \ \ \ \ (4)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[xp_{y},p_{x}\right]\ \ \ \ \ (5)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[x,p_{x}\right]p_{y}\ \ \ \ \ (6)$ $\displaystyle$ $\displaystyle =$ $\displaystyle i\hbar p_{y} \ \ \ \ \ (7)$

We’ve used the position-momentum commutator: ${\left[x,p_{x}\right]=i\hbar}$, with similar expressions for ${y}$ and ${z}$. The complete set of results is

 $\displaystyle \left[L_{z},x\right]$ $\displaystyle =$ $\displaystyle i\hbar y\ \ \ \ \ (8)$ $\displaystyle \left[L_{z},y\right]$ $\displaystyle =$ $\displaystyle -i\hbar x\ \ \ \ \ (9)$ $\displaystyle \left[L_{z},z\right]$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (10)$ $\displaystyle \left[L_{z},p_{x}\right]$ $\displaystyle =$ $\displaystyle i\hbar p_{y}\ \ \ \ \ (11)$ $\displaystyle \left[L_{z},p_{y}\right]$ $\displaystyle =$ $\displaystyle -i\hbar p_{x}\ \ \ \ \ (12)$ $\displaystyle \left[L_{z},p_{z}\right]$ $\displaystyle =$ $\displaystyle 0 \ \ \ \ \ (13)$
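These commutators are easy to spot-check numerically by writing ${L_{z}=-i\hbar\left(x\partial_{y}-y\partial_{x}\right)}$ and approximating the derivatives by central finite differences. A minimal sketch (with ${\hbar=1}$; the test function ${f}$ and the evaluation point are arbitrary choices):

```python
# Spot-check [L_z, x] f = i*hbar*y f with hbar = 1, using central finite
# differences; the test function f and the point pt are arbitrary choices.
h = 1e-5

def d(g, pt, axis):
    """Central-difference partial derivative of g at pt along axis."""
    lo, hi = list(pt), list(pt)
    lo[axis] -= h
    hi[axis] += h
    return (g(hi) - g(lo)) / (2 * h)

def Lz(g):
    """L_z = x p_y - y p_x = -i (x d/dy - y d/dx) acting on g (hbar = 1)."""
    return lambda pt: -1j * (pt[0] * d(g, pt, 1) - pt[1] * d(g, pt, 0))

f = lambda pt: pt[0]**2 * pt[1] + pt[1] * pt[2]**2   # smooth test function
pt = (0.7, 1.3, -0.5)

lhs = Lz(lambda q: q[0] * f(q))(pt) - pt[0] * Lz(f)(pt)   # [L_z, x] f at pt
rhs = 1j * pt[1] * f(pt)                                  # i*hbar*y*f at pt
assert abs(lhs - rhs) < 1e-6
```

The other entries in the list can be checked the same way by changing which coordinate or momentum component appears in the commutator.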

We can use these results to derive the original commutator:

 $\displaystyle \left[L_{z},L_{x}\right]$ $\displaystyle =$ $\displaystyle \left[L_{z},yp_{z}-zp_{y}\right]\ \ \ \ \ (14)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[L_{z},y\right]p_{z}-z\left[L_{z},p_{y}\right]\ \ \ \ \ (15)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -i\hbar xp_{z}+i\hbar zp_{x}\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle i\hbar L_{y} \ \ \ \ \ (17)$

We can now find the commutator of ${L_{z}}$ with the square of the position ${r^{2}}$. To find the commutator, we apply it to some function ${f}$. Subscripts on ${f}$ indicate derivatives w.r.t. that variable; thus ${\partial f/\partial x\equiv f_{x}}$, etc.

 $\displaystyle \left[L_{z},r^{2}\right]f$ $\displaystyle =$ $\displaystyle \left[xp_{y}-yp_{x},r^{2}\right]f\ \ \ \ \ (18)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -i\hbar\left(2xyf+xr^{2}f_{y}-2xyf-yr^{2}f_{x}-r^{2}xf_{y}+r^{2}yf_{x}\right)\ \ \ \ \ (19)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0 \ \ \ \ \ (20)$
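Since ${r^{2}}$ acts by multiplication, this result needs only first derivatives and lends itself to the same kind of numerical check. A sketch with ${\hbar=1}$ and an arbitrary test function and point:

```python
# Numerical check that [L_z, r^2] f = 0 (hbar = 1); f and pt are
# arbitrary choices, and derivatives are central finite differences.
h = 1e-5

def d(g, pt, axis):
    lo, hi = list(pt), list(pt)
    lo[axis] -= h
    hi[axis] += h
    return (g(hi) - g(lo)) / (2 * h)

def Lz(g):
    # L_z = -i (x d/dy - y d/dx), with hbar = 1
    return lambda pt: -1j * (pt[0] * d(g, pt, 1) - pt[1] * d(g, pt, 0))

r2 = lambda pt: pt[0]**2 + pt[1]**2 + pt[2]**2
f = lambda pt: pt[0] * pt[1] + pt[2]**3              # smooth test function
pt = (0.4, -1.1, 0.9)

# [L_z, r^2] f = L_z(r^2 f) - r^2 L_z f, which should vanish
comm = Lz(lambda q: r2(q) * f(q))(pt) - r2(pt) * Lz(f)(pt)
assert abs(comm) < 1e-6
```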

For ${p^{2}}$ we have

 $\displaystyle \left[L_{z},p^{2}\right]f$ $\displaystyle =$ $\displaystyle \left[xp_{y}-yp_{x},p^{2}\right]f\ \ \ \ \ (21)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[xp_{y}-yp_{x},p_{x}^{2}\right]f+\left[xp_{y}-yp_{x},p_{y}^{2}\right]f\ \ \ \ \ (22)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[xp_{y},p_{x}^{2}\right]f-\left[yp_{x},p_{y}^{2}\right]f \ \ \ \ \ (23)$

In the second line we have eliminated ${p_{z}^{2}}$ since it commutes with ${L_{z}}$ as ${L_{z}}$ contains no reference to ${z}$. Similarly, to get the third line, we have eliminated those terms in the second line that have a zero commutator.

To evaluate the last line, we note that

 $\displaystyle \left[xp_{y},p_{x}^{2}\right]f$ $\displaystyle =$ $\displaystyle -\frac{\hbar^{3}}{i}\left(xf_{yxx}-\frac{\partial^{2}}{\partial x^{2}}(xf_{y})\right)\ \ \ \ \ (24)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\frac{\hbar^{3}}{i}\left(xf_{yxx}-f_{yx}-f_{xy}-xf_{yxx}\right)\ \ \ \ \ (25)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\frac{\hbar^{3}}{i}\left(-f_{yx}-f_{xy}\right)\ \ \ \ \ (26)$ $\displaystyle \left[yp_{x},p_{y}^{2}\right]f$ $\displaystyle =$ $\displaystyle -\frac{\hbar^{3}}{i}\left(yf_{xyy}-f_{yx}-f_{xy}-yf_{xyy}\right)\ \ \ \ \ (27)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\frac{\hbar^{3}}{i}\left(-f_{yx}-f_{xy}\right) \ \ \ \ \ (28)$

Combining all the terms we get

$\displaystyle \left[L_{z},p^{2}\right]=0 \ \ \ \ \ (29)$

By symmetry, the same argument shows that ${L_{x}}$ and ${L_{y}}$ also commute with ${r^{2}}$ and ${p^{2}}$, so it follows that all components of ${\mathbf{L}}$ commute with ${H=p^{2}/2m+V}$ if ${V}$ depends only on ${r}$.

# Uncertainty principle in three dimensions

Required math: calculus

Required physics: Schrödinger equation

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 4.1.

In three dimensions, the position and momentum operators are generalizations of their one-dimensional form:

 $\displaystyle r_{x}$ $\displaystyle =$ $\displaystyle x\ \ \ \ \ (1)$ $\displaystyle r_{y}$ $\displaystyle =$ $\displaystyle y\ \ \ \ \ (2)$ $\displaystyle r_{z}$ $\displaystyle =$ $\displaystyle z\ \ \ \ \ (3)$ $\displaystyle p_{x}$ $\displaystyle =$ $\displaystyle -i\hbar\frac{\partial}{\partial x}\ \ \ \ \ (4)$ $\displaystyle p_{y}$ $\displaystyle =$ $\displaystyle -i\hbar\frac{\partial}{\partial y}\ \ \ \ \ (5)$ $\displaystyle p_{z}$ $\displaystyle =$ $\displaystyle -i\hbar\frac{\partial}{\partial z} \ \ \ \ \ (6)$

All the position operators commute with each other, since they are simply multipliers. Thus

$\displaystyle \left[r_{i},r_{j}\right]=0 \ \ \ \ \ (7)$

Similarly, all the components of momentum commute with each other, since each is a derivative with respect to a different independent variable:

$\displaystyle \left[p_{i},p_{j}\right]=0 \ \ \ \ \ (8)$

Mixtures of momentum and position will interact the same way as in one dimension if the two components are along the same axis. Mixtures of momentum and position that lie along different axes will commute, since the derivative in the momentum is with respect to a component not found in the position. Thus

$\displaystyle \left[r_{i},p_{j}\right]=i\hbar\delta_{ij} \ \ \ \ \ (9)$
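All nine canonical commutators can be verified at once with finite differences; as before, ${\hbar=1}$ and the test function and point are arbitrary choices:

```python
# Check [r_i, p_j] f = i*hbar*delta_ij f (hbar = 1) for all nine index
# pairs, using central finite differences; f and pt are arbitrary choices.
h = 1e-5

def d(g, pt, axis):
    lo, hi = list(pt), list(pt)
    lo[axis] -= h
    hi[axis] += h
    return (g(hi) - g(lo)) / (2 * h)

def p(j, g):
    """p_j = -i*hbar d/dx_j acting on g (hbar = 1)."""
    return lambda pt: -1j * d(g, pt, j)

f = lambda pt: pt[0]**2 + pt[0] * pt[1] + pt[1] * pt[2]**2  # test function
pt = (0.6, -0.8, 1.2)

ok = all(
    abs(pt[i] * p(j, f)(pt)                       # r_i p_j f
        - p(j, lambda q, i=i: q[i] * f(q))(pt)    # p_j r_i f
        - (1j * f(pt) if i == j else 0)) < 1e-6
    for i in range(3) for j in range(3))
assert ok
```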

Earlier, we derived an equation for the rate of change of an observable ${Q}$:

$\displaystyle \frac{d}{dt}\left\langle Q\right\rangle =\frac{i}{\hbar}\left\langle \left[H,Q\right]\right\rangle +\left\langle \frac{\partial Q}{\partial t}\right\rangle \ \ \ \ \ (10)$

where ${H}$ is the hamiltonian.

Examining the derivation of this equation shows that nothing depends on the calculation being done in one, two or three dimensions (the wave function and hamiltonian in the derivation could be in any number of dimensions), so it can be applied to each component of ${\mathbf{r}}$ and ${\mathbf{p}}$ separately. Because of this we can use the results we worked out earlier for the rates of change of position and momentum in one dimension, applied to each of the three axes, and get the corresponding result (Ehrenfest’s theorem) in three dimensions:

 $\displaystyle \frac{d\langle\mathbf{r}\rangle}{dt}$ $\displaystyle =$ $\displaystyle \frac{\langle\mathbf{p}\rangle}{m}\ \ \ \ \ (11)$ $\displaystyle \frac{d\langle\mathbf{p}\rangle}{dt}$ $\displaystyle =$ $\displaystyle \left\langle -\nabla V\right\rangle \ \ \ \ \ (12)$

To work out the uncertainty principle in three dimensions, we can use the relation derived earlier:

$\displaystyle \sigma_{A}^{2}\sigma_{B}^{2}\ge\left(\frac{1}{2i}\left\langle [\hat{A},\hat{B}]\right\rangle \right)^{2} \ \ \ \ \ (13)$

Again, the derivation of this equation does not depend on the number of dimensions. We can therefore use it to calculate the uncertainty principle for the three components of position and momentum. From the commutators above, we have

$\displaystyle \sigma_{r_{i}}\sigma_{p_{j}}\ge\frac{\hbar}{2}\delta_{ij} \ \ \ \ \ (14)$

The ${\delta_{ij}}$ indicates that a position component and a perpendicular momentum component can both be measured precisely at the same time, since these components commute, as we saw above. For position and momentum components along the same direction, the uncertainty principle is the same as it is in one dimension.
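For the same-axis case we can check numerically that a Gaussian wave function saturates the bound, ${\sigma_{x}\sigma_{p_{x}}=\hbar/2}$. A sketch with ${\hbar=\sigma=1}$, computing both spreads on a grid (the grid range and spacing are arbitrary choices):

```python
import math

# Verify numerically that a 1-d Gaussian saturates sigma_x * sigma_p = hbar/2
# (hbar = 1, width sigma = 1); grid range and spacing are arbitrary choices.
hbar, sigma, N = 1.0, 1.0, 4001
xs = [-10 + 20 * i / (N - 1) for i in range(N)]
dx = xs[1] - xs[0]
norm = (2 * math.pi * sigma**2) ** -0.25
psi = [norm * math.exp(-x * x / (4 * sigma**2)) for x in xs]

# <x> = 0 by symmetry, so sigma_x^2 = <x^2>
x2 = sum(x * x * p * p for x, p in zip(xs, psi)) * dx

# <p^2> = integral of psi * (-hbar^2 psi''), psi'' by central differences
p2 = sum(-hbar**2 * psi[i] * (psi[i+1] - 2*psi[i] + psi[i-1]) / dx**2
         for i in range(1, N - 1)) * dx

product = math.sqrt(x2) * math.sqrt(p2)
assert abs(product - hbar / 2) < 1e-3
```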

# Lie brackets (commutators)

Required math: algebra, calculus

Required physics: none

Reference: d’Inverno, Ray, Introducing Einstein’s Relativity (1992), Oxford Uni Press. – Section 5.9 and Problems 5.15, 5.16 (v).

When we began looking at quantum mechanics, we encountered the commutator of two operators, defined as

$\displaystyle \left[A,B\right]\equiv AB-BA \ \ \ \ \ (1)$

In quantum mechanics, some operators (the most famous being the position and momentum operators) do not commute, and in fact, the generalized uncertainty principle says that only operators that commute can be measured simultaneously with arbitrary precision.

In tensor analysis, we’ve seen that the tangent vector field to a manifold can be written as the operator

$\displaystyle X=X^{a}\partial_{a} \ \ \ \ \ (2)$

Since this operator involves derivatives, we might expect that the commutator of two such operators would be non-zero (since that’s what happened with the position and momentum operators in quantum mechanics). The commutator of two vector fields is also known as a Lie bracket (where ‘Lie’ is pronounced ‘lee’), but is defined in the same way as in quantum mechanics.

The commutator of two vector fields is again a vector field, as can be verified by direct calculation. As always with operators involving derivatives, we need a dummy function ${f}$ on which to operate, so we get

 $\displaystyle \left[X,Y\right]f$ $\displaystyle =$ $\displaystyle X^{a}\partial_{a}\left(Y^{b}\partial_{b}f\right)-Y^{a}\partial_{a}\left(X^{b}\partial_{b}f\right)\ \ \ \ \ (3)$ $\displaystyle$ $\displaystyle =$ $\displaystyle X^{a}\left(\partial_{a}Y^{b}\right)\left(\partial_{b}f\right)+X^{a}Y^{b}\partial_{ab}^{2}f-Y^{a}\left(\partial_{a}X^{b}\right)\left(\partial_{b}f\right)-Y^{a}X^{b}\partial_{ab}^{2}f\ \ \ \ \ (4)$ $\displaystyle$ $\displaystyle =$ $\displaystyle X^{a}\left(\partial_{a}Y^{b}\right)\left(\partial_{b}f\right)-Y^{a}\left(\partial_{a}X^{b}\right)\left(\partial_{b}f\right) \ \ \ \ \ (5)$

Removing the dummy function, we get

$\displaystyle \left[X,Y\right]=X^{a}\left(\partial_{a}Y^{b}\right)\partial_{b}-Y^{a}\left(\partial_{a}X^{b}\right)\partial_{b} \ \ \ \ \ (6)$

which is a vector field with components

$\displaystyle \left[X,Y\right]^{b}=X^{a}\partial_{a}Y^{b}-Y^{a}\partial_{a}X^{b} \ \ \ \ \ (7)$

It’s obvious from the definition that

 $\displaystyle \left[X,X\right]$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (8)$ $\displaystyle \left[X,Y\right]$ $\displaystyle =$ $\displaystyle -\left[Y,X\right] \ \ \ \ \ (9)$

There is a third identity known as Jacobi’s identity that is less obvious:

$\displaystyle \left[X,\left[Y,Z\right]\right]+\left[Z,\left[X,Y\right]\right]+\left[Y,\left[Z,X\right]\right]=0 \ \ \ \ \ (10)$

This is true for commutators in general, and not just for vector fields as defined here. It can be proved by writing out the terms.

 $\displaystyle \left[X,\left[Y,Z\right]\right]$ $\displaystyle =$ $\displaystyle XYZ-XZY-YZX+ZYX$ $\displaystyle \left[Z,\left[X,Y\right]\right]$ $\displaystyle =$ $\displaystyle ZXY-ZYX-XYZ+YXZ$ $\displaystyle \left[Y,\left[Z,X\right]\right]$ $\displaystyle =$ $\displaystyle YZX-YXZ-ZXY+XZY$

Adding up the right hand side, we see that the terms cancel in pairs.
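Since the proof uses only the associativity of operator products, Jacobi’s identity can be spot-checked with matrices; the ${2\times2}$ examples below are arbitrary:

```python
# Verify the Jacobi identity for 2x2 matrices (only associativity of the
# products is used, so any matrices will do; these entries are arbitrary).
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def comm(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

X = [[1, 2], [3, 4]]
Y = [[0, 1], [-1, 0]]
Z = [[2, 0], [1, -1]]

# [X,[Y,Z]] + [Z,[X,Y]] + [Y,[Z,X]] should be the zero matrix
jacobi = add(add(comm(X, comm(Y, Z)), comm(Z, comm(X, Y))), comm(Y, comm(Z, X)))
assert all(jacobi[i][j] == 0 for i in range(2) for j in range(2))
```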

As an example of Lie brackets, we can look at the operators ${X}$, ${Y}$ and ${Z}$ that we used in the post on tangent space. In rectangular coordinates, these operators are

 $\displaystyle X$ $\displaystyle =$ $\displaystyle \partial_{x}\ \ \ \ \ (11)$ $\displaystyle Y$ $\displaystyle =$ $\displaystyle \partial_{y}\ \ \ \ \ (12)$ $\displaystyle Z$ $\displaystyle =$ $\displaystyle -y\partial_{x}+x\partial_{y} \ \ \ \ \ (13)$

To work out the commutators, we can use equation 7 above. For that, we need the components of the vectors, which are ${X^{a}=\left(1,0\right)}$, ${Y^{a}=\left(0,1\right)}$ and ${Z^{a}=\left(-y,x\right)}$.

 $\displaystyle \left[X,Y\right]^{b}$ $\displaystyle =$ $\displaystyle X^{a}\partial_{a}Y^{b}-Y^{a}\partial_{a}X^{b}\ \ \ \ \ (14)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left(0,0\right)\ \ \ \ \ (15)$ $\displaystyle \left[X,Z\right]^{1}$ $\displaystyle =$ $\displaystyle X^{a}\partial_{a}Z^{1}-Z^{a}\partial_{a}X^{1}\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \partial_{x}\left(-y\right)+0-\left(0+0\right)\ \ \ \ \ (17)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (18)$ $\displaystyle \left[X,Z\right]^{2}$ $\displaystyle =$ $\displaystyle X^{a}\partial_{a}Z^{2}-Z^{a}\partial_{a}X^{2}\ \ \ \ \ (19)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \partial_{x}x+0-\left(0+0\right)\ \ \ \ \ (20)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 1\ \ \ \ \ (21)$ $\displaystyle \left[Y,Z\right]^{1}$ $\displaystyle =$ $\displaystyle Y^{a}\partial_{a}Z^{1}-Z^{a}\partial_{a}Y^{1}\ \ \ \ \ (22)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0+\partial_{y}\left(-y\right)-\left(0+0\right)\ \ \ \ \ (23)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -1\ \ \ \ \ (24)$ $\displaystyle \left[Y,Z\right]^{2}$ $\displaystyle =$ $\displaystyle Y^{a}\partial_{a}Z^{2}-Z^{a}\partial_{a}Y^{2}\ \ \ \ \ (25)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0+\partial_{y}x-\left(0+0\right)\ \ \ \ \ (26)$ $\displaystyle$ $\displaystyle =$ $\displaystyle 0 \ \ \ \ \ (27)$

Thus the commutator operators are

 $\displaystyle \left[X,Y\right]$ $\displaystyle =$ $\displaystyle 0\ \ \ \ \ (28)$ $\displaystyle \left[X,Z\right]$ $\displaystyle =$ $\displaystyle \partial_{y}=Y\ \ \ \ \ (29)$ $\displaystyle \left[Y,Z\right]$ $\displaystyle =$ $\displaystyle -\partial_{x}=-X \ \ \ \ \ (30)$
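Equation 7 also lends itself to a direct numerical check of these results, using finite-difference derivatives of the components at an arbitrary point:

```python
# Check [X,Y] = 0, [X,Z] = Y and [Y,Z] = -X from the component formula
# [X,Y]^b = X^a d_a Y^b - Y^a d_a X^b, with central finite differences.
h = 1e-6

def bracket(Xc, Yc, pt):
    """Components of the Lie bracket of vector fields Xc, Yc at pt."""
    out = []
    for b in range(2):
        s = 0.0
        for a in range(2):
            lo, hi = list(pt), list(pt)
            lo[a] -= h
            hi[a] += h
            dYb = (Yc(hi)[b] - Yc(lo)[b]) / (2 * h)
            dXb = (Xc(hi)[b] - Xc(lo)[b]) / (2 * h)
            s += Xc(pt)[a] * dYb - Yc(pt)[a] * dXb
        out.append(s)
    return out

Xc = lambda p: (1.0, 0.0)        # X = d/dx
Yc = lambda p: (0.0, 1.0)        # Y = d/dy
Zc = lambda p: (-p[1], p[0])     # Z = -y d/dx + x d/dy

pt = (0.3, -0.7)
assert max(abs(c) for c in bracket(Xc, Yc, pt)) < 1e-6                          # [X,Y] = 0
assert all(abs(c - e) < 1e-6 for c, e in zip(bracket(Xc, Zc, pt), (0.0, 1.0)))  # [X,Z] = Y
assert all(abs(c - e) < 1e-6 for c, e in zip(bracket(Yc, Zc, pt), (-1.0, 0.0))) # [Y,Z] = -X
```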

# Anti-hermitian operators

Required math: calculus, vectors

Required physics: none

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 3.26.

A hermitian operator is equal to its hermitian conjugate (which, remember, is the complex conjugate of the transpose of the matrix representing the operator). That is,

$\displaystyle \hat{Q}^{\dagger}=\hat{Q} \ \ \ \ \ (1)$

This has the consequence that for inner products

 $\displaystyle \langle f|\hat{Q}g\rangle$ $\displaystyle =$ $\displaystyle \langle\hat{Q}^{\dagger}f|g\rangle\ \ \ \ \ (2)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \langle\hat{Q}f|g\rangle \ \ \ \ \ (3)$

An anti-hermitian operator is equal to the negative of its hermitian conjugate, that is

$\displaystyle \hat{Q}^{\dagger}=-\hat{Q} \ \ \ \ \ (4)$

In inner products, this means

 $\displaystyle \langle f|\hat{Q}g\rangle$ $\displaystyle =$ $\displaystyle \langle\hat{Q}^{\dagger}f|g\rangle\ \ \ \ \ (5)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\langle\hat{Q}f|g\rangle \ \ \ \ \ (6)$

The expectation value of an anti-hermitian operator is:

 $\displaystyle \langle f|\hat{Q}f\rangle$ $\displaystyle =$ $\displaystyle \langle\hat{Q}^{\dagger}f|f\rangle\ \ \ \ \ (7)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\langle\hat{Q}f|f\rangle\ \ \ \ \ (8)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\langle Q\rangle^* \ \ \ \ \ (9)$

But ${\langle f|\hat{Q}f\rangle=\langle Q\rangle}$ so ${\langle Q\rangle=-\langle Q\rangle^*}$, which means the expectation value must be pure imaginary.

For two hermitian operators ${\hat{Q}}$ and ${\hat{R}}$ we have

 $\displaystyle \left[\hat{Q},\hat{R}\right]$ $\displaystyle =$ $\displaystyle \hat{Q}\hat{R}-\hat{R}\hat{Q}\ \ \ \ \ (10)$ $\displaystyle {}[\hat{Q},\hat{R}]^{\dagger}$ $\displaystyle =$ $\displaystyle \hat{R}^{\dagger}\hat{Q}^{\dagger}-\hat{Q}^{\dagger}\hat{R}^{\dagger}\ \ \ \ \ (11)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \hat{R}\hat{Q}-\hat{Q}\hat{R}\ \ \ \ \ (12)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \left[\hat{R},\hat{Q}\right]\ \ \ \ \ (13)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -[\hat{Q},\hat{R}] \ \ \ \ \ (14)$

where we have used the hermitian property ${\hat{Q}^{\dagger}=\hat{Q}}$ to get the third line. Thus the commutator of two hermitian operators is anti-hermitian.
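This is easy to confirm with explicit matrices; the two hermitian ${2\times2}$ matrices below are arbitrary examples:

```python
# Check that the commutator of two Hermitian 2x2 matrices is anti-Hermitian;
# the matrix entries are arbitrary Hermitian examples.
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def comm(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

Q = [[1, 2 - 1j], [2 + 1j, 3]]     # Hermitian: Q = dagger(Q)
R = [[0, 1j], [-1j, 5]]            # Hermitian: R = dagger(R)

C = comm(Q, R)
anti = all(dagger(C)[i][j] == -C[i][j] for i in range(2) for j in range(2))
assert anti
```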

If two operators ${\hat{S}}$ and ${\hat{T}}$ are anti-hermitian, a similar derivation shows that ${[\hat{S},\hat{T}]^{\dagger}=-[\hat{S},\hat{T}]}$ also:

 $\displaystyle \left[\hat{S},\hat{T}\right]$ $\displaystyle =$ $\displaystyle \hat{S}\hat{T}-\hat{T}\hat{S}\ \ \ \ \ (15)$ $\displaystyle {}[\hat{S},\hat{T}]^{\dagger}$ $\displaystyle =$ $\displaystyle \hat{T}^{\dagger}\hat{S}^{\dagger}-\hat{S}^{\dagger}\hat{T}^{\dagger}\ \ \ \ \ (16)$ $\displaystyle$ $\displaystyle =$ $\displaystyle (-\hat{T})(-\hat{S})-(-\hat{S})(-\hat{T})\ \ \ \ \ (17)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -[\hat{S},\hat{T}] \ \ \ \ \ (18)$

# Uncertainty principle: rates of change of operators

Required math: calculus

Required physics: Schrödinger equation

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 3.17.

The rate of change of the expectation value of an operator ${Q}$ is

$\displaystyle \frac{d}{dt}\left\langle Q\right\rangle =\frac{i}{\hbar}\left\langle \left[H,Q\right]\right\rangle +\left\langle \frac{\partial Q}{\partial t}\right\rangle \ \ \ \ \ (1)$

where ${H}$ is the hamiltonian of the system (assumed time-independent).

We’ll look at a few applications of this formula.

(a) With ${Q=1}$, since any operator commutes with a constant, and ${Q}$ has no explicit time dependence,

$\displaystyle \frac{d\langle Q\rangle}{dt}=0 \ \ \ \ \ (2)$

This is another way of stating that the integral of the square modulus of the wave function remains 1 for all time (that is, the normalization is preserved, so the square modulus remains a probability density), since this integral is just the expectation value of the identity operator.

(b) With ${Q=H}$, since any operator commutes with itself, we have

$\displaystyle \frac{d\langle H\rangle}{dt}=\left\langle \frac{\partial H}{\partial t}\right\rangle \ \ \ \ \ (3)$

If the energy has no explicit time dependence (that is, the potential function does not depend on time), then ${\partial H/\partial t=0}$ and this result expresses the conservation of energy.

(c) With ${Q=x}$, we can use the commutator worked out earlier:

$\displaystyle \left[\hat{H},\hat{x}\right]=-\frac{i\hbar}{m}\hat{p}=-\frac{\hbar^{2}}{m}\frac{\partial}{\partial x}$

Since ${x}$ and ${t}$ are independent variables, ${\partial x/\partial t=0}$, so we get:

$\displaystyle \frac{d\langle x\rangle}{dt}=\frac{i}{\hbar}\langle[\hat{H},\hat{x}]\rangle=\frac{\langle p\rangle}{m} \ \ \ \ \ (4)$

This is the quantum equivalent of the classical equation ${p=mv}$.
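We can spot-check this commutator numerically: the potential cancels in ${[\hat{H},\hat{x}]}$, so only the kinetic term contributes. A finite-difference sketch with ${\hbar=m=1}$ (the potential, test function and evaluation point are arbitrary choices):

```python
import math

# Spot-check [H, x] g = -(i*hbar/m) p g = -(hbar^2/m) g' numerically,
# with hbar = m = 1; V, g and the point x0 are arbitrary choices.
hbar, m, h = 1.0, 1.0, 1e-4

def d1(g, x):
    return (g(x + h) - g(x - h)) / (2 * h)

def d2(g, x):
    return (g(x + h) - 2 * g(x) + g(x - h)) / h**2

V = lambda x: 0.5 * x * x                     # any potential; it cancels
H = lambda g: (lambda x: -hbar**2 / (2 * m) * d2(g, x) + V(x) * g(x))
g = lambda x: math.sin(1.3 * x) + x**3        # smooth test function
x0 = 0.8

lhs = H(lambda x: x * g(x))(x0) - x0 * H(g)(x0)   # [H, x] g at x0
rhs = -(hbar**2 / m) * d1(g, x0)                  # -(hbar^2/m) g'(x0)
assert abs(lhs - rhs) < 1e-5
```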

(d) With ${Q=p}$, we need to work out the commutator ${[\hat{H},\hat{p}]}$. Using an auxiliary function ${g}$ on which this commutator can operate, we get:

 $\displaystyle \left[\hat{H},\hat{p}\right]g$ $\displaystyle =$ $\displaystyle \left[-\frac{\hbar^{2}}{2m}\frac{\partial^{2}}{\partial x^{2}}+V,\frac{\hbar}{i}\frac{\partial}{\partial x}\right]g\ \ \ \ \ (5)$ $\displaystyle$ $\displaystyle =$ $\displaystyle \frac{\hbar}{i}\left(V\frac{\partial g}{\partial x}-\frac{\partial}{\partial x}(Vg)\right)\ \ \ \ \ (6)$ $\displaystyle$ $\displaystyle =$ $\displaystyle -\frac{\hbar}{i}\frac{\partial V}{\partial x}g \ \ \ \ \ (7)$

Therefore, the commutator is ${[\hat{H},\hat{p}]=-\frac{\hbar}{i}\frac{\partial V}{\partial x}}$, and we get

$\displaystyle \frac{d\langle p\rangle}{dt}=-\left\langle \frac{\partial V}{\partial x}\right\rangle \ \ \ \ \ (8)$

This is Ehrenfest’s theorem.

# Hermitian operators: common eigenfunctions implies they commute

Required math: calculus

Required physics: Schrödinger equation

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 3.15.

The eigenfunctions of a hermitian operator form a complete set, in the sense that any function can be expressed as a linear combination of these eigenfunctions. A consequence of this is the theorem that, if two hermitian operators have the same set of eigenfunctions (though possibly with different eigenvalues), then these two operators must commute.

If the two operators ${\hat{P}}$ and ${\hat{Q}}$ have the same complete set of common eigenfunctions, then a function ${f}$ in Hilbert space can be written as a series in terms of these eigenfunctions

$\displaystyle f=\sum c_{n}f_{n} \ \ \ \ \ (1)$

where ${f_{n}}$ are the eigenfunctions, and ${c_{n}}$ are the coefficients.

Now for operator ${\hat{P}}$ we have ${\hat{P}f_{n}=p_{n}f_{n}}$ where ${p_{n}}$ is the eigenvalue when ${\hat{P}}$ operates on ${f_{n}}$. Similarly for ${\hat{Q}}$ we have ${\hat{Q}f_{n}=q_{n}f_{n}}$. Since the two operators share the same set of eigenfunctions, applying the operators in either order to ${f}$ will result in each term in the series being multiplied by the product of the two eigenvalues for each of the operators:

$\displaystyle \hat{P}\hat{Q}f=\hat{Q}\hat{P}f=\sum c_{n}p_{n}q_{n}f_{n} \ \ \ \ \ (2)$

Thus the commutator will be zero if two operators share the same set of eigenfunctions. QED.
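The matrix version of this theorem is easy to demonstrate: build two matrices from the same eigenvector matrix ${V}$ but with different eigenvalues, and check that they commute (all the numbers below are arbitrary examples):

```python
# Two matrices with the same eigenvectors commute: build P and Q as
# V D V^{-1} with the same V but different eigenvalue matrices D.
def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

V = [[1.0, 1.0], [1.0, -1.0]]           # columns are the shared eigenvectors
Vinv = [[0.5, 0.5], [0.5, -0.5]]        # inverse of V
D1 = [[2.0, 0.0], [0.0, 5.0]]           # eigenvalues of P
D2 = [[7.0, 0.0], [0.0, 3.0]]           # eigenvalues of Q

P = mul(mul(V, D1), Vinv)
Q = mul(mul(V, D2), Vinv)

PQ, QP = mul(P, Q), mul(Q, P)
assert all(abs(PQ[i][j] - QP[i][j]) < 1e-12 for i in range(2) for j in range(2))
```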