Welcome to Physics Pages

This blog consists of my notes and solutions to problems in various areas of mainstream physics. An index to the topics covered is contained in the links in the sidebar on the right, or in the menu at the top of the page.

This isn’t a “popular science” site, in that most posts use a fair bit of mathematics to explain their concepts. Thus this blog aims mainly to help those who are learning or reviewing physics in depth. More details on what the site contains and how to use it are on the welcome page.

Despite Stephen Hawking’s caution that every equation included in a book (or, I suppose, in a blog) would halve the readership, this blog has proved very popular since its inception in December 2010. Details of the number of visits and distinct visitors are given on the hit statistics page.

Many thanks to my loyal followers and best wishes to everyone who visits. I hope you find it useful. Constructive criticism (or even praise) is always welcome, so feel free to leave a comment in response to any of the posts.

I should point out that although I did study physics at the university level, this was back in the 1970s and by the time I started this blog in December 2010, I had forgotten pretty much everything I had learned back then. This blog represents my journey back to some level of literacy in physics. I am by no means a professional physicist or an authority on any aspect of the subject. I offer this blog as a record of my own notes and problem solutions as I worked through various books, in the hope that it will help, and possibly even inspire, others to explore this wonderful subject.

Before leaving a comment, you may find it useful to read the “Instructions for commenters”.

Canonical transformations: a few more examples

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.7; Exercises 2.7.6 – 2.7.7, 2.7.8(4).

Here are a few more examples of canonical variable transformations.

Example 1 First, we revisit the two-body problem, in which we simplified the problem by transforming from the coordinates {\mathbf{r}_{1}} and {\mathbf{r}_{2}} of the masses {m_{1}} and {m_{2}} to two new position vectors:

\displaystyle   \mathbf{r} \displaystyle  \equiv \displaystyle  \mathbf{r}_{1}-\mathbf{r}_{2}\ \ \ \ \ (1)
\displaystyle  \mathbf{r}_{CM} \displaystyle  \equiv \displaystyle  \frac{m_{1}\mathbf{r}_{1}+m_{2}\mathbf{r}_{2}}{M} \ \ \ \ \ (2)

Here {M\equiv m_{1}+m_{2}} is the total mass, {\mathbf{r}} is the relative position, and {\mathbf{r}_{CM}} is the position of the centre of mass. The conjugate momenta in the original system are

\displaystyle  \mathbf{p}_{i}=m_{i}\dot{\mathbf{r}}_{i} \ \ \ \ \ (3)

The conjugate momenta transform according to

\displaystyle   \mathbf{p}_{CM} \displaystyle  = \displaystyle  M\dot{\mathbf{r}}_{CM}=\mathbf{p}_{1}+\mathbf{p}_{2}\ \ \ \ \ (4)
\displaystyle  \mathbf{p} \displaystyle  = \displaystyle  \mu\dot{\mathbf{r}}\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  \frac{m_{2}\mathbf{p}_{1}-m_{1}\mathbf{p}_{2}}{M} \ \ \ \ \ (6)

where {\mu=m_{1}m_{2}/M} is the reduced mass.
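
As a quick sanity check on 4 to 6, here is a minimal Python sketch that computes the centre of mass and relative momenta both from the velocities and from the original momenta. The masses and velocities are arbitrary illustrative values, not anything taken from Shankar.

```python
# Illustrative masses and velocities (arbitrary values, not from the text).
m1, m2 = 2.0, 3.0
v1 = (0.3, -1.2, 0.7)
v2 = (-0.5, 0.4, 1.1)

M = m1 + m2
mu = m1 * m2 / M  # reduced mass

# Original momenta, p_i = m_i * (velocity of mass i).
p1 = [m1 * v for v in v1]
p2 = [m2 * v for v in v2]

# Centre-of-mass momentum: M * (CM velocity) versus p1 + p2.
v_cm = [(m1 * a + m2 * b) / M for a, b in zip(v1, v2)]
p_cm_from_velocity = [M * v for v in v_cm]
p_cm_from_momenta = [a + b for a, b in zip(p1, p2)]

# Relative momentum: mu * (relative velocity) versus (m2*p1 - m1*p2)/M.
p_rel_from_velocity = [mu * (a - b) for a, b in zip(v1, v2)]
p_rel_from_momenta = [(m2 * a - m1 * b) / M for a, b in zip(p1, p2)]

# Largest componentwise disagreement between the two routes.
max_diff = max(
    abs(a - b)
    for a, b in zip(p_cm_from_velocity + p_rel_from_velocity,
                    p_cm_from_momenta + p_rel_from_momenta)
)
print(max_diff)
```

The printed difference should be at the level of floating-point rounding.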

To check that this is a canonical transformation, we need to calculate the Poisson brackets. To make things easier, note that the new coordinates depend only on the old coordinates (and not on the momenta), and conversely, the new momenta depend only on the old momenta (and not on the coordinates). Since the Poisson brackets {\left\{ \overline{q}_{i},\overline{q}_{j}\right\} } and {\left\{ \overline{p}_{i},\overline{p}_{j}\right\} } all involve taking derivatives of coordinates with respect to momenta (in the first case) or momenta with respect to coordinates (in the second case), all these brackets are zero. We need, therefore, to check only the mixed brackets between coordinates and momenta.

Because we’re dealing with 3-d vector equations, there are 3 components to each vector and to be thorough, we need to calculate all possible brackets between all pairs of components. However, if we do the {x} component of each, it should be obvious that the {y} and {z} components behave in the same way.

First, consider

\displaystyle  \left\{ r_{x},p_{x}\right\} =\sum_{i}\left(\frac{\partial r_{x}}{\partial q_{i}}\frac{\partial p_{x}}{\partial p_{i}}-\frac{\partial r_{x}}{\partial p_{i}}\frac{\partial p_{x}}{\partial q_{i}}\right) \ \ \ \ \ (7)

In the RHS, the term {q_{i}} stands for all 6 components of the original position vectors, that is {q_{i}=\left\{ r_{1x},r_{1y},\ldots,r_{2z}\right\} } and the term {p_{i}} in the denominators refers to all 6 components of the original momentum vectors. The {p_{x}} in the numerators refers to the {x} component of {\mathbf{p}} in 6. Hopefully this won’t cause too much confusion.

The second term on the RHS is zero because it involves derivatives of coordinates with respect to momenta (and vice versa). In the first term, {r_{x}} depends only on the {x} components of {\mathbf{r}_{1}} and {\mathbf{r}_{2}}, and {p_{x}} depends only on the {x} components of {\mathbf{p}_{1}} and {\mathbf{p}_{2}}, so we have

\displaystyle   \left\{ r_{x},p_{x}\right\} \displaystyle  = \displaystyle  \frac{\partial r_{x}}{\partial r_{1x}}\frac{\partial p_{x}}{\partial p_{1x}}+\frac{\partial r_{x}}{\partial r_{2x}}\frac{\partial p_{x}}{\partial p_{2x}}\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  \left(1\right)\frac{m_{2}}{M}+\left(-1\right)\left(-\frac{m_{1}}{M}\right)\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \frac{m_{1}+m_{2}}{M}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  1 \ \ \ \ \ (11)

The same result is obtained for the {y} and {z} components. If we look at mixing two different components, we have, for example

\displaystyle  \left\{ r_{x},p_{y}\right\} =\frac{\partial r_{x}}{\partial r_{1x}}\frac{\partial p_{y}}{\partial p_{1x}}+\frac{\partial r_{x}}{\partial r_{2x}}\frac{\partial p_{y}}{\partial p_{2x}}+\frac{\partial r_{x}}{\partial r_{1y}}\frac{\partial p_{y}}{\partial p_{1y}}+\frac{\partial r_{x}}{\partial r_{2y}}\frac{\partial p_{y}}{\partial p_{2y}}=0 \ \ \ \ \ (12)

This is zero because each term in the sum contains a derivative of an {x} component with respect to a {y} component (or vice versa), all of which are zero.

For the centre of mass components, we have

\displaystyle   \left\{ r_{CMx},p_{CMx}\right\} \displaystyle  = \displaystyle  \frac{\partial r_{CMx}}{\partial r_{1x}}\frac{\partial p_{CMx}}{\partial p_{1x}}+\frac{\partial r_{CMx}}{\partial r_{2x}}\frac{\partial p_{CMx}}{\partial p_{2x}}\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  \frac{m_{1}}{M}\left(1\right)+\frac{m_{2}}{M}\left(1\right)\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  1\ \ \ \ \ (15)
\displaystyle  \left\{ r_{CMx},p_{CMy}\right\} \displaystyle  = \displaystyle  \frac{\partial r_{CMx}}{\partial r_{1x}}\frac{\partial p_{CMy}}{\partial p_{1x}}+\frac{\partial r_{CMx}}{\partial r_{2x}}\frac{\partial p_{CMy}}{\partial p_{2x}}+\frac{\partial r_{CMx}}{\partial r_{1y}}\frac{\partial p_{CMy}}{\partial p_{1y}}+\frac{\partial r_{CMx}}{\partial r_{2y}}\frac{\partial p_{CMy}}{\partial p_{2y}}\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  0 \ \ \ \ \ (17)

where the last bracket is zero for the same reason as {\left\{ r_{x},p_{y}\right\} }: we’re mixing {x} and {y} in the derivatives. Again, it should be obvious that the brackets for the other combinations of {x}, {y} and {z} components work out the same way.
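
For anyone who would rather not grind through every pair of components by hand, the following Python sketch estimates all the brackets numerically with central differences. It also covers the cross brackets between the relative and centre of mass variables (such as {\left\{ r_{x},p_{CMx}\right\} }), which aren't written out above. The helper `poisson_bracket`, the masses and the sample phase-space point are my own illustrative choices.

```python
def poisson_bracket(f, g, q, p, h=1e-6):
    """Central-difference estimate of {f, g} at the phase-space point (q, p)."""
    def partial(func, which, i):
        q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
        if which == "q":
            q_plus[i] += h
            q_minus[i] -= h
        else:
            p_plus[i] += h
            p_minus[i] -= h
        return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2 * h)

    return sum(
        partial(f, "q", i) * partial(g, "p", i)
        - partial(f, "p", i) * partial(g, "q", i)
        for i in range(len(q))
    )

m1, m2 = 2.0, 3.0  # illustrative masses
M = m1 + m2

# Old variables are ordered (r1x, r1y, r1z, r2x, r2y, r2z), likewise for p.
new_coords = (
    [lambda q, p, k=k: q[k] - q[3 + k] for k in range(3)]                    # r
    + [lambda q, p, k=k: (m1 * q[k] + m2 * q[3 + k]) / M for k in range(3)]  # r_CM
)
new_momenta = (
    [lambda q, p, k=k: (m2 * p[k] - m1 * p[3 + k]) / M for k in range(3)]    # p
    + [lambda q, p, k=k: p[k] + p[3 + k] for k in range(3)]                  # p_CM
)

# Arbitrary sample point in the 12-dimensional phase space.
q0 = [0.4, -1.1, 0.9, 1.3, 0.2, -0.6]
p0 = [0.7, 0.5, -0.8, -0.3, 1.4, 0.1]

# Largest deviation from {Q,Q} = {P,P} = 0 and {Q_a,P_b} = delta_ab.
worst = 0.0
for a in range(6):
    for b in range(6):
        qq = poisson_bracket(new_coords[a], new_coords[b], q0, p0)
        pp = poisson_bracket(new_momenta[a], new_momenta[b], q0, p0)
        qp = poisson_bracket(new_coords[a], new_momenta[b], q0, p0)
        expected = 1.0 if a == b else 0.0
        worst = max(worst, abs(qq), abs(pp), abs(qp - expected))
print(worst)
```

Because the transformation is linear, the finite differences are exact apart from rounding, so the printed deviation should be tiny.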

Example 2 A bizarre transformation of variables in one dimension is given by

\displaystyle   \overline{q} \displaystyle  = \displaystyle  \ln\frac{\sin p}{q}=\ln\sin p-\ln q\ \ \ \ \ (18)
\displaystyle  \overline{p} \displaystyle  = \displaystyle  q\cot p \ \ \ \ \ (19)

To show this is canonical, we need to calculate only {\left\{ \overline{q},\overline{p}\right\} } (since the Poisson bracket of a function with itself is always zero, we have {\left\{ \overline{q},\overline{q}\right\} =\left\{ \overline{p},\overline{p}\right\} =0}). We need one rather obscure derivative of a trig function.

\displaystyle   \frac{d}{dp}\cot p \displaystyle  = \displaystyle  \frac{d}{dp}\left(\frac{\cos p}{\sin p}\right)\ \ \ \ \ (20)
\displaystyle  \displaystyle  = \displaystyle  \frac{-\sin^{2}p-\cos^{2}p}{\sin^{2}p}\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  -1-\cot^{2}p \ \ \ \ \ (22)

We get

\displaystyle   \left\{ \overline{q},\overline{p}\right\} \displaystyle  = \displaystyle  \frac{\partial\overline{q}}{\partial q}\frac{\partial\overline{p}}{\partial p}-\frac{\partial\overline{q}}{\partial p}\frac{\partial\overline{p}}{\partial q}\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \left(-\frac{1}{q}\right)\left(q\left(-1-\cot^{2}p\right)\right)-\frac{\cos p}{\sin p}\cot p\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  1+\cot^{2}p-\cot^{2}p\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  1 \ \ \ \ \ (26)

Thus the transformation is canonical.
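
Since this transformation looks so odd, a numerical spot check is reassuring. The Python sketch below estimates {\left\{ \overline{q},\overline{p}\right\} } by central differences at a few sample points, chosen (arbitrarily) so that {q>0} and {\sin p>0} and the logarithm is defined. The function names are just illustrative.

```python
import math

def poisson_bracket_1d(f, g, q, p, h=1e-6):
    """Central-difference estimate of {f, g} for one degree of freedom."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

def q_bar(q, p):
    return math.log(math.sin(p) / q)

def p_bar(q, p):
    return q * math.cos(p) / math.sin(p)  # q cot p

# Sample points with q > 0 and 0 < p < pi (arbitrary choices).
sample_points = [(1.3, 0.8), (0.5, 2.1), (2.7, 1.4)]
brackets = [poisson_bracket_1d(q_bar, p_bar, q, p) for q, p in sample_points]
print(brackets)
```

Each entry should be 1 to within the accuracy of the finite differences.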

Example 3 Finally, we return to the point transformation, which is given in general by

\displaystyle   \overline{q}_{i} \displaystyle  = \displaystyle  \overline{q}_{i}\left(q_{1},\ldots,q_{n}\right)\ \ \ \ \ (27)
\displaystyle  \overline{p}_{i} \displaystyle  = \displaystyle  \sum_{j}\frac{\partial q_{j}}{\partial\overline{q}_{i}}p_{j} \ \ \ \ \ (28)

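In this case, the coordinate transformation to {\overline{q}} is completely arbitrary (provided it is invertible), but the momentum transformation must follow the formula given. The derivatives {\frac{\partial q_{j}}{\partial\overline{q}_{i}}} in the formula for {\overline{p}_{i}} are taken with the other {\overline{q}} held constant. As in the earlier examples, the new coordinates depend only on the old coordinates, so {\left\{ \overline{q}_{i},\overline{q}_{j}\right\} =0}. The new momenta, however, depend on the old coordinates as well as the old momenta, since the coefficients {\frac{\partial q_{j}}{\partial\overline{q}_{i}}} are in general functions of the coordinates, so {\left\{ \overline{p}_{i},\overline{p}_{j}\right\} } needs a short calculation. Acting on a function of the coordinates alone, the chain rule gives {\sum_{k}\frac{\partial q_{k}}{\partial\overline{q}_{j}}\frac{\partial}{\partial q_{k}}=\frac{\partial}{\partial\overline{q}_{j}}}, so

\displaystyle  \left\{ \overline{p}_{i},\overline{p}_{j}\right\} =\sum_{l}p_{l}\left(\frac{\partial^{2}q_{l}}{\partial\overline{q}_{j}\partial\overline{q}_{i}}-\frac{\partial^{2}q_{l}}{\partial\overline{q}_{i}\partial\overline{q}_{j}}\right)=0

because the mixed second derivatives are equal. The Poisson brackets therefore satisfy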

\displaystyle  \left\{ \overline{q}_{i},\overline{q}_{j}\right\} =\left\{ \overline{p}_{i},\overline{p}_{j}\right\} =0 \ \ \ \ \ (29)

For the mixed brackets, we have

\displaystyle   \left\{ \overline{q}_{i},\overline{p}_{j}\right\} \displaystyle  = \displaystyle  \sum_{k}\left(\frac{\partial\overline{q}_{i}}{\partial q_{k}}\frac{\partial\overline{p}_{j}}{\partial p_{k}}-\frac{\partial\overline{q}_{i}}{\partial p_{k}}\frac{\partial\overline{p}_{j}}{\partial q_{k}}\right)\ \ \ \ \ (30)
\displaystyle  \displaystyle  = \displaystyle  \sum_{k}\frac{\partial\overline{q}_{i}}{\partial q_{k}}\frac{\partial q_{k}}{\partial\overline{q}_{j}}\ \ \ \ \ (31)
\displaystyle  \displaystyle  = \displaystyle  \frac{\partial\overline{q}_{i}}{\partial\overline{q}_{j}}\ \ \ \ \ (32)
\displaystyle  \displaystyle  = \displaystyle  \delta_{ij} \ \ \ \ \ (33)

The second term in the first line is zero because {\overline{q}_{i}} doesn’t depend on the momenta. We used 28 to calculate the derivative {\frac{\partial\overline{p}_{j}}{\partial p_{k}}=\frac{\partial q_{k}}{\partial\overline{q}_{j}}} and get the second line, and then noticed that the sum is the chain-rule expansion of the derivative in the third line. Since {\overline{q}_{i}} and {\overline{q}_{j}} are independent variables, the result is that given in the last line. Thus a point transformation is a canonical transformation.
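
To see the general result at work on something concrete, here is a Python sketch for one made-up point transformation in two dimensions: {\overline{q}_{1}=q_{1}e^{q_{2}}}, {\overline{q}_{2}=q_{2}}. This example is my own, not one from Shankar. The new momenta are built from 28 using the inverse {q_{1}=\overline{q}_{1}e^{-\overline{q}_{2}}}, {q_{2}=\overline{q}_{2}}, and all the brackets are then estimated by central differences.

```python
import math

def poisson_bracket(f, g, q, p, h=1e-6):
    """Central-difference estimate of {f, g} at the phase-space point (q, p)."""
    def partial(func, which, i):
        q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
        if which == "q":
            q_plus[i] += h
            q_minus[i] -= h
        else:
            p_plus[i] += h
            p_minus[i] -= h
        return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2 * h)

    return sum(
        partial(f, "q", i) * partial(g, "p", i)
        - partial(f, "p", i) * partial(g, "q", i)
        for i in range(len(q))
    )

# Hypothetical point transformation: q1_bar = q1*exp(q2), q2_bar = q2.
new_coords = [
    lambda q, p: q[0] * math.exp(q[1]),
    lambda q, p: q[1],
]
# Momenta from p_bar_i = sum_j (dq_j/dq_bar_i) p_j, with
#   dq1/dq1_bar = exp(-q2),  dq2/dq1_bar = 0,
#   dq1/dq2_bar = -q1,       dq2/dq2_bar = 1.
new_momenta = [
    lambda q, p: math.exp(-q[1]) * p[0],
    lambda q, p: -q[0] * p[0] + p[1],
]

q0 = [0.8, -0.3]  # arbitrary sample point
p0 = [1.1, 0.6]

# Largest deviation from the canonical values of the brackets.
worst = 0.0
for a in range(2):
    for b in range(2):
        qq = poisson_bracket(new_coords[a], new_coords[b], q0, p0)
        pp = poisson_bracket(new_momenta[a], new_momenta[b], q0, p0)
        qp = poisson_bracket(new_coords[a], new_momenta[b], q0, p0)
        expected = 1.0 if a == b else 0.0
        worst = max(worst, abs(qq), abs(pp), abs(qp - expected))
print(worst)
```

Note that the new momenta here depend on the old coordinates as well as the old momenta, yet the momentum-momentum bracket still comes out as zero.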

Canonical transformations in 2-d: rotations and polar coordinates

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.7; Exercises 2.7.4 – 2.7.5.

Here are a couple of examples of canonical variable transformations.

Example 1 We rotate the 2-d rectangular coordinates through an angle {\theta}, giving the transformations

\displaystyle   \overline{x} \displaystyle  = \displaystyle  x\cos\theta-y\sin\theta\ \ \ \ \ (1)
\displaystyle  \overline{y} \displaystyle  = \displaystyle  x\sin\theta+y\cos\theta\ \ \ \ \ (2)
\displaystyle  \overline{p}_{x} \displaystyle  = \displaystyle  p_{x}\cos\theta-p_{y}\sin\theta\ \ \ \ \ (3)
\displaystyle  \overline{p}_{y} \displaystyle  = \displaystyle  p_{x}\sin\theta+p_{y}\cos\theta \ \ \ \ \ (4)

To show this is a canonical transformation, we must evaluate the Poisson brackets. Here, {q_{1}=x} and {q_{2}=y}. Remember that {\theta} is a constant in these derivatives.

\displaystyle   \left\{ \overline{x},\overline{y}\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\overline{x}}{\partial q_{i}}\frac{\partial\overline{y}}{\partial p_{i}}-\frac{\partial\overline{x}}{\partial p_{i}}\frac{\partial\overline{y}}{\partial q_{i}}\right)\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  0 \ \ \ \ \ (6)

since neither coordinate depends on any momentum. Similarly {\left\{ \overline{p}_{x},\overline{p}_{y}\right\} =0} since this Poisson bracket contains derivatives of {\overline{p}_{i}} with respect to {q_{i}} and these are all zero. The remaining Poisson brackets are of the form {\left\{ \overline{q}_{i},\overline{p}_{j}\right\} }. There are four of these, but we’ll work out only a couple. The other two have similar forms.

\displaystyle   \left\{ \overline{x},\overline{p}_{x}\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\overline{x}}{\partial q_{i}}\frac{\partial\overline{p}_{x}}{\partial p_{i}}-\frac{\partial\overline{x}}{\partial p_{i}}\frac{\partial\overline{p}_{x}}{\partial q_{i}}\right)\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  \frac{\partial\overline{x}}{\partial x}\frac{\partial\overline{p}_{x}}{\partial p_{x}}+\frac{\partial\overline{x}}{\partial y}\frac{\partial\overline{p}_{x}}{\partial p_{y}}\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  \cos^{2}\theta+\sin^{2}\theta\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  1\ \ \ \ \ (10)
\displaystyle  \left\{ \overline{x},\overline{p}_{y}\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\overline{x}}{\partial q_{i}}\frac{\partial\overline{p}_{y}}{\partial p_{i}}-\frac{\partial\overline{x}}{\partial p_{i}}\frac{\partial\overline{p}_{y}}{\partial q_{i}}\right)\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \frac{\partial\overline{x}}{\partial x}\frac{\partial\overline{p}_{y}}{\partial p_{x}}+\frac{\partial\overline{x}}{\partial y}\frac{\partial\overline{p}_{y}}{\partial p_{y}}\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \sin\theta\cos\theta-\sin\theta\cos\theta\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  0 \ \ \ \ \ (14)

Similarly

\displaystyle   \left\{ \overline{y},\overline{p}_{x}\right\} \displaystyle  = \displaystyle  0\ \ \ \ \ (15)
\displaystyle  \left\{ \overline{y},\overline{p}_{y}\right\} \displaystyle  = \displaystyle  1 \ \ \ \ \ (16)
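
The two brackets that weren't worked out explicitly can be checked numerically along with the rest. This is a small Python sketch with an arbitrary rotation angle and sample point; the helper `poisson_bracket` is my own and uses central differences.

```python
import math

def poisson_bracket(f, g, q, p, h=1e-6):
    """Central-difference estimate of {f, g} at the phase-space point (q, p)."""
    def partial(func, which, i):
        q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
        if which == "q":
            q_plus[i] += h
            q_minus[i] -= h
        else:
            p_plus[i] += h
            p_minus[i] -= h
        return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2 * h)

    return sum(
        partial(f, "q", i) * partial(g, "p", i)
        - partial(f, "p", i) * partial(g, "q", i)
        for i in range(len(q))
    )

theta = 0.7  # arbitrary rotation angle, held constant
c, s = math.cos(theta), math.sin(theta)

# q = [x, y], p = [p_x, p_y]; rotated variables as in equations 1 to 4.
new_coords = [
    lambda q, p: q[0] * c - q[1] * s,  # x_bar
    lambda q, p: q[0] * s + q[1] * c,  # y_bar
]
new_momenta = [
    lambda q, p: p[0] * c - p[1] * s,  # p_x_bar
    lambda q, p: p[0] * s + p[1] * c,  # p_y_bar
]

q0 = [1.2, -0.5]  # arbitrary sample point
p0 = [0.3, 0.9]

# Largest deviation from the canonical values of the brackets.
worst = 0.0
for a in range(2):
    for b in range(2):
        qq = poisson_bracket(new_coords[a], new_coords[b], q0, p0)
        pp = poisson_bracket(new_momenta[a], new_momenta[b], q0, p0)
        qp = poisson_bracket(new_coords[a], new_momenta[b], q0, p0)
        expected = 1.0 if a == b else 0.0
        worst = max(worst, abs(qq), abs(pp), abs(qp - expected))
print(worst)
```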

Example 2 The transformation from 2-d rectangular to polar coordinates is given by

\displaystyle   \rho \displaystyle  = \displaystyle  \sqrt{x^{2}+y^{2}}\ \ \ \ \ (17)
\displaystyle  \phi \displaystyle  = \displaystyle  \arctan\frac{y}{x}\ \ \ \ \ (18)
\displaystyle  p_{\rho} \displaystyle  = \displaystyle  \frac{xp_{x}+yp_{y}}{\sqrt{x^{2}+y^{2}}}\ \ \ \ \ (19)
\displaystyle  p_{\phi} \displaystyle  = \displaystyle  xp_{y}-yp_{x} \ \ \ \ \ (20)

For the Poisson brackets we have

\displaystyle   \left\{ \rho,\phi\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\rho}{\partial q_{i}}\frac{\partial\phi}{\partial p_{i}}-\frac{\partial\rho}{\partial p_{i}}\frac{\partial\phi}{\partial q_{i}}\right)\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  0 \ \ \ \ \ (22)

because, again, the coordinates don’t depend on the momenta.

In this case, however, the new momenta do depend on the old coordinates, so we need to actually do some calculation.

\displaystyle   \left\{ p_{\rho},p_{\phi}\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial p_{\rho}}{\partial q_{i}}\frac{\partial p_{\phi}}{\partial p_{i}}-\frac{\partial p_{\rho}}{\partial p_{i}}\frac{\partial p_{\phi}}{\partial q_{i}}\right)\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \left(-\frac{x\left(xp_{x}+yp_{y}\right)}{\left(x^{2}+y^{2}\right)^{3/2}}+\frac{p_{x}}{\sqrt{x^{2}+y^{2}}}\right)\left(-y\right)-\frac{x}{\sqrt{x^{2}+y^{2}}}p_{y}+\nonumber
\displaystyle  \displaystyle  \displaystyle  \left(-\frac{y\left(xp_{x}+yp_{y}\right)}{\left(x^{2}+y^{2}\right)^{3/2}}+\frac{p_{y}}{\sqrt{x^{2}+y^{2}}}\right)x-\frac{y}{\sqrt{x^{2}+y^{2}}}\left(-p_{x}\right)\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  -\frac{y^{2}\left(yp_{x}-xp_{y}\right)}{\left(x^{2}+y^{2}\right)^{3/2}}-\frac{x}{\sqrt{x^{2}+y^{2}}}p_{y}-\frac{x^{2}\left(yp_{x}-xp_{y}\right)}{\left(x^{2}+y^{2}\right)^{3/2}}+\frac{y}{\sqrt{x^{2}+y^{2}}}p_{x}\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  -\frac{y^{3}p_{x}+x^{3}p_{y}}{\left(x^{2}+y^{2}\right)^{3/2}}+\frac{y^{3}p_{x}+x^{3}p_{y}}{\left(x^{2}+y^{2}\right)^{3/2}}\ \ \ \ \ (26)
\displaystyle  \displaystyle  = \displaystyle  0 \ \ \ \ \ (27)

Finally, we need to work out the mixed brackets.

\displaystyle   \left\{ \rho,p_{\rho}\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\rho}{\partial q_{i}}\frac{\partial p_{\rho}}{\partial p_{i}}-\frac{\partial\rho}{\partial p_{i}}\frac{\partial p_{\rho}}{\partial q_{i}}\right)\ \ \ \ \ (28)
\displaystyle  \displaystyle  = \displaystyle  \frac{x^{2}}{x^{2}+y^{2}}-0+\frac{y^{2}}{x^{2}+y^{2}}-0\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  1\ \ \ \ \ (30)
\displaystyle  \left\{ \rho,p_{\phi}\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\rho}{\partial q_{i}}\frac{\partial p_{\phi}}{\partial p_{i}}-\frac{\partial\rho}{\partial p_{i}}\frac{\partial p_{\phi}}{\partial q_{i}}\right)\ \ \ \ \ (31)
\displaystyle  \displaystyle  = \displaystyle  -\frac{xy}{\sqrt{x^{2}+y^{2}}}-0+\frac{xy}{\sqrt{x^{2}+y^{2}}}-0\ \ \ \ \ (32)
\displaystyle  \displaystyle  = \displaystyle  0\ \ \ \ \ (33)
\displaystyle  \left\{ \phi,p_{\rho}\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\phi}{\partial q_{i}}\frac{\partial p_{\rho}}{\partial p_{i}}-\frac{\partial\phi}{\partial p_{i}}\frac{\partial p_{\rho}}{\partial q_{i}}\right)\ \ \ \ \ (34)
\displaystyle  \displaystyle  = \displaystyle  -\frac{y}{x\left(1+\frac{y^{2}}{x^{2}}\right)\sqrt{x^{2}+y^{2}}}-0+\frac{y}{x\left(1+\frac{y^{2}}{x^{2}}\right)\sqrt{x^{2}+y^{2}}}-0\ \ \ \ \ (35)
\displaystyle  \displaystyle  = \displaystyle  0\ \ \ \ \ (36)
\displaystyle  \left\{ \phi,p_{\phi}\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\phi}{\partial q_{i}}\frac{\partial p_{\phi}}{\partial p_{i}}-\frac{\partial\phi}{\partial p_{i}}\frac{\partial p_{\phi}}{\partial q_{i}}\right)\ \ \ \ \ (37)
\displaystyle  \displaystyle  = \displaystyle  \frac{y^{2}}{x^{2}\left(1+\frac{y^{2}}{x^{2}}\right)}-0+\frac{1}{1+\frac{y^{2}}{x^{2}}}-0\ \ \ \ \ (38)
\displaystyle  \displaystyle  = \displaystyle  \frac{y^{2}}{x^{2}+y^{2}}+\frac{x^{2}}{x^{2}+y^{2}}\ \ \ \ \ (39)
\displaystyle  \displaystyle  = \displaystyle  1 \ \ \ \ \ (40)

Thus all the Poisson brackets have the required values, so the transformation is canonical.
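
The algebra for the polar case is messy enough that a numerical cross-check is worthwhile. The Python sketch below evaluates every bracket by central differences at one sample point with {x>0} (so that {\arctan\frac{y}{x}} gives the usual polar angle). The sample values and the helper are illustrative choices of mine.

```python
import math

def poisson_bracket(f, g, q, p, h=1e-6):
    """Central-difference estimate of {f, g} at the phase-space point (q, p)."""
    def partial(func, which, i):
        q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
        if which == "q":
            q_plus[i] += h
            q_minus[i] -= h
        else:
            p_plus[i] += h
            p_minus[i] -= h
        return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2 * h)

    return sum(
        partial(f, "q", i) * partial(g, "p", i)
        - partial(f, "p", i) * partial(g, "q", i)
        for i in range(len(q))
    )

# q = [x, y], p = [p_x, p_y]; polar variables as in equations 17 to 20.
new_coords = [
    lambda q, p: math.hypot(q[0], q[1]),  # rho
    lambda q, p: math.atan(q[1] / q[0]),  # phi (valid for x > 0)
]
new_momenta = [
    lambda q, p: (q[0] * p[0] + q[1] * p[1]) / math.hypot(q[0], q[1]),  # p_rho
    lambda q, p: q[0] * p[1] - q[1] * p[0],                             # p_phi
]

q0 = [1.2, 0.7]   # arbitrary sample point with x > 0
p0 = [-0.4, 0.9]

# Largest deviation from the canonical values of the brackets.
worst = 0.0
for a in range(2):
    for b in range(2):
        qq = poisson_bracket(new_coords[a], new_coords[b], q0, p0)
        pp = poisson_bracket(new_momenta[a], new_momenta[b], q0, p0)
        qp = poisson_bracket(new_coords[a], new_momenta[b], q0, p0)
        expected = 1.0 if a == b else 0.0
        worst = max(worst, abs(qq), abs(pp), abs(qp - expected))
print(worst)
```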

Conditions for a transformation to be canonical

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.7; Exercise 2.7.3.

We’ve seen that the Euler-Lagrange equations are invariant under point transformations, in which the new coordinates depend only on the old coordinates, but in the Hamiltonian formalism where the system moves in a {2n}-dimensional phase space with {n} coordinates {q} and {n} momenta {p}, more general transformations are possible:

\displaystyle   \overline{q}_{i} \displaystyle  = \displaystyle  \overline{q}_{i}\left(q,p\right)\ \ \ \ \ (1)
\displaystyle  \overline{p}_{i} \displaystyle  = \displaystyle  \overline{p}_{i}\left(q,p\right) \ \ \ \ \ (2)

In order for such a transformation to be canonical, we require that the new variables {\overline{q}} and {\overline{p}} satisfy Hamilton’s equations, that is

\displaystyle   \frac{\partial H}{\partial\overline{p}_{i}} \displaystyle  = \displaystyle  \dot{\overline{q}}_{i}\ \ \ \ \ (3)
\displaystyle  -\frac{\partial H}{\partial\overline{q}_{i}} \displaystyle  = \displaystyle  \dot{\overline{p}}_{i} \ \ \ \ \ (4)

In principle, then, we could check the Hamiltonian in the new coordinates to see if these equations are valid, but it would seem that whether or not a set of coordinates and momenta is canonical should be determinable from the variables themselves, and not depend on the specific Hamiltonian. Here we derive a set of conditions on the {\overline{q}} and {\overline{p}} that determine whether or not the transformation is canonical.

The time derivative of any function {\omega\left(q,p\right)} with no explicit time dependence can be written as a Poisson bracket with the Hamiltonian:

\displaystyle  \dot{\omega}=\left\{ \omega,H\right\} \ \ \ \ \ (5)

For the transformed velocities, we have

\displaystyle   \dot{\overline{q}}_{j} \displaystyle  = \displaystyle  \left\{ \overline{q}_{j},H\right\} \ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\overline{q}_{j}}{\partial q_{i}}\frac{\partial H}{\partial p_{i}}-\frac{\partial\overline{q}_{j}}{\partial p_{i}}\frac{\partial H}{\partial q_{i}}\right) \ \ \ \ \ (7)

Here, {H} is written as a function {H\left(q,p\right)} of the original variables. If we write it as a function of the transformed variables, we can find the two derivatives of {H} in 7 by using the chain rule:

\displaystyle   \frac{\partial H\left(\overline{q},\overline{p}\right)}{\partial p_{i}} \displaystyle  = \displaystyle  \sum_{k}\left(\frac{\partial H}{\partial\overline{q}_{k}}\frac{\partial\overline{q}_{k}}{\partial p_{i}}+\frac{\partial H}{\partial\overline{p}_{k}}\frac{\partial\overline{p}_{k}}{\partial p_{i}}\right)\ \ \ \ \ (8)
\displaystyle  \frac{\partial H\left(\overline{q},\overline{p}\right)}{\partial q_{i}} \displaystyle  = \displaystyle  \sum_{k}\left(\frac{\partial H}{\partial\overline{q}_{k}}\frac{\partial\overline{q}_{k}}{\partial q_{i}}+\frac{\partial H}{\partial\overline{p}_{k}}\frac{\partial\overline{p}_{k}}{\partial q_{i}}\right) \ \ \ \ \ (9)

Inserting these into 7 we get

\displaystyle   \dot{\overline{q}}_{j} \displaystyle  = \displaystyle  \sum_{i}\sum_{k}\left[\frac{\partial\overline{q}_{j}}{\partial q_{i}}\left(\frac{\partial H}{\partial\overline{q}_{k}}\frac{\partial\overline{q}_{k}}{\partial p_{i}}+\frac{\partial H}{\partial\overline{p}_{k}}\frac{\partial\overline{p}_{k}}{\partial p_{i}}\right)-\frac{\partial\overline{q}_{j}}{\partial p_{i}}\left(\frac{\partial H}{\partial\overline{q}_{k}}\frac{\partial\overline{q}_{k}}{\partial q_{i}}+\frac{\partial H}{\partial\overline{p}_{k}}\frac{\partial\overline{p}_{k}}{\partial q_{i}}\right)\right]\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \sum_{k}\frac{\partial H}{\partial\overline{q}_{k}}\sum_{i}\left(\frac{\partial\overline{q}_{j}}{\partial q_{i}}\frac{\partial\overline{q}_{k}}{\partial p_{i}}-\frac{\partial\overline{q}_{j}}{\partial p_{i}}\frac{\partial\overline{q}_{k}}{\partial q_{i}}\right)+\sum_{k}\frac{\partial H}{\partial\overline{p}_{k}}\sum_{i}\left(\frac{\partial\overline{q}_{j}}{\partial q_{i}}\frac{\partial\overline{p}_{k}}{\partial p_{i}}-\frac{\partial\overline{q}_{j}}{\partial p_{i}}\frac{\partial\overline{p}_{k}}{\partial q_{i}}\right)\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \sum_{k}\frac{\partial H}{\partial\overline{q}_{k}}\left\{ \overline{q}_{j},\overline{q}_{k}\right\} +\sum_{k}\frac{\partial H}{\partial\overline{p}_{k}}\left\{ \overline{q}_{j},\overline{p}_{k}\right\} \ \ \ \ \ (12)

In order for this result to satisfy 3, we must have

\displaystyle   \left\{ \overline{q}_{j},\overline{q}_{k}\right\} \displaystyle  = \displaystyle  0\ \ \ \ \ (13)
\displaystyle  \left\{ \overline{q}_{j},\overline{p}_{k}\right\} \displaystyle  = \displaystyle  \delta_{jk} \ \ \ \ \ (14)

We can repeat the calculation for {\dot{\overline{p}}_{j}}:

\displaystyle   \dot{\overline{p}}_{j} \displaystyle  = \displaystyle  \left\{ \overline{p}_{j},H\right\} \ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\overline{p}_{j}}{\partial q_{i}}\frac{\partial H}{\partial p_{i}}-\frac{\partial\overline{p}_{j}}{\partial p_{i}}\frac{\partial H}{\partial q_{i}}\right)\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\sum_{k}\left[\frac{\partial\overline{p}_{j}}{\partial q_{i}}\left(\frac{\partial H}{\partial\overline{q}_{k}}\frac{\partial\overline{q}_{k}}{\partial p_{i}}+\frac{\partial H}{\partial\overline{p}_{k}}\frac{\partial\overline{p}_{k}}{\partial p_{i}}\right)-\frac{\partial\overline{p}_{j}}{\partial p_{i}}\left(\frac{\partial H}{\partial\overline{q}_{k}}\frac{\partial\overline{q}_{k}}{\partial q_{i}}+\frac{\partial H}{\partial\overline{p}_{k}}\frac{\partial\overline{p}_{k}}{\partial q_{i}}\right)\right]\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \sum_{k}\frac{\partial H}{\partial\overline{q}_{k}}\sum_{i}\left(\frac{\partial\overline{p}_{j}}{\partial q_{i}}\frac{\partial\overline{q}_{k}}{\partial p_{i}}-\frac{\partial\overline{p}_{j}}{\partial p_{i}}\frac{\partial\overline{q}_{k}}{\partial q_{i}}\right)+\sum_{k}\frac{\partial H}{\partial\overline{p}_{k}}\sum_{i}\left(\frac{\partial\overline{p}_{j}}{\partial q_{i}}\frac{\partial\overline{p}_{k}}{\partial p_{i}}-\frac{\partial\overline{p}_{j}}{\partial p_{i}}\frac{\partial\overline{p}_{k}}{\partial q_{i}}\right)\ \ \ \ \ (18)
\displaystyle  \displaystyle  = \displaystyle  \sum_{k}\frac{\partial H}{\partial\overline{q}_{k}}\left\{ \overline{p}_{j},\overline{q}_{k}\right\} +\sum_{k}\frac{\partial H}{\partial\overline{p}_{k}}\left\{ \overline{p}_{j},\overline{p}_{k}\right\} \ \ \ \ \ (19)

Requiring this to satisfy 4, we have

\displaystyle   \left\{ \overline{p}_{j},\overline{p}_{k}\right\} \displaystyle  = \displaystyle  0\ \ \ \ \ (20)
\displaystyle  \left\{ \overline{p}_{j},\overline{q}_{k}\right\} \displaystyle  = \displaystyle  -\delta_{jk} \ \ \ \ \ (21)

The last equation is equivalent to

\displaystyle  \left\{ \overline{q}_{j},\overline{p}_{k}\right\} =\delta_{jk} \ \ \ \ \ (22)

which agrees with 14. Thus in order for the transformation to be canonical, the conditions are

\displaystyle   \left\{ \overline{q}_{j},\overline{q}_{k}\right\} \displaystyle  = \displaystyle  \left\{ \overline{p}_{j},\overline{p}_{k}\right\} =0\ \ \ \ \ (23)
\displaystyle  \left\{ \overline{q}_{j},\overline{p}_{k}\right\} \displaystyle  = \displaystyle  \delta_{jk} \ \ \ \ \ (24)

Note that these Poisson brackets require calculating the derivatives of the new variables {\overline{q}} and {\overline{p}} with respect to the original ones {q} and {p}, but they don’t involve any particular Hamiltonian. Thus it’s possible to determine whether or not a transformation is canonical entirely from the transformation equations 1 and 2.
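
To illustrate the connection between the bracket conditions and Hamilton’s equations, here is a Python sketch comparing two transformations of a single degree of freedom. Both are examples I’ve made up: a rotation in the {\left(q,p\right)} plane, for which {\left\{ \overline{q},\overline{p}\right\} =1}, and a rescaling {\overline{q}=2q}, {\overline{p}=p}, for which {\left\{ \overline{q},\overline{p}\right\} =2}. The Hamiltonian {H=p^{2}/2+q^{4}/4} is also an arbitrary choice. The code checks numerically whether 3 and 4 hold in the new variables.

```python
import math

ANGLE = 0.6  # illustrative rotation angle in the (q, p) plane
C, S = math.cos(ANGLE), math.sin(ANGLE)

def hamiltonian(q, p):
    """Sample Hamiltonian H = p^2/2 + q^4/4 (an arbitrary choice)."""
    return 0.5 * p * p + 0.25 * q ** 4

def hamilton_residuals(forward, inverse, q, p, h=1e-6):
    """Return (|dH/dp_bar - q_bar_dot|, |-dH/dq_bar - p_bar_dot|) at (q, p)."""
    # Hamilton's equations in the old variables for the sample H.
    q_dot, p_dot = p, -q ** 3
    # Time derivatives of the new variables via a numerical Jacobian.
    fq_plus, fq_minus = forward(q + h, p), forward(q - h, p)
    fp_plus, fp_minus = forward(q, p + h), forward(q, p - h)
    new_dots = [
        (fq_plus[i] - fq_minus[i]) / (2 * h) * q_dot
        + (fp_plus[i] - fp_minus[i]) / (2 * h) * p_dot
        for i in range(2)
    ]

    # H rewritten as a function of the new variables.
    def h_bar(qb, pb):
        return hamiltonian(*inverse(qb, pb))

    qb, pb = forward(q, p)
    dh_dpb = (h_bar(qb, pb + h) - h_bar(qb, pb - h)) / (2 * h)
    dh_dqb = (h_bar(qb + h, pb) - h_bar(qb - h, pb)) / (2 * h)
    return abs(dh_dpb - new_dots[0]), abs(-dh_dqb - new_dots[1])

# Canonical example: rotation mixing q and p.
def rotate(q, p):
    return q * C + p * S, -q * S + p * C

def rotate_inverse(qb, pb):
    return qb * C - pb * S, qb * S + pb * C

# Non-canonical example: q_bar = 2q, p_bar = p.
def scale(q, p):
    return 2.0 * q, p

def scale_inverse(qb, pb):
    return qb / 2.0, pb

rotation_residuals = hamilton_residuals(rotate, rotate_inverse, 0.9, -0.4)
scaling_residuals = hamilton_residuals(scale, scale_inverse, 0.9, -0.4)
print(rotation_residuals, scaling_residuals)
```

For the rotation, both residuals should vanish to within numerical error. For the rescaling they don’t, which is what we’d expect given that its bracket isn’t 1.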

Invariance of Euler-Lagrange and Hamilton’s equations under canonical transformations

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.7; Exercise 2.7.8 (1-3).

Here we’ll investigate how the Euler-Lagrange equations and Hamilton’s canonical equations are affected by a change in coordinates of the form

\displaystyle  q_{i}\rightarrow\overline{q}_{i}\left(q_{1},\ldots,q_{n}\right) \ \ \ \ \ (1)

Note that the new coordinates {\overline{q}} depend only on the old coordinates and not on the velocities {\dot{q}_{i}}. We also assume that the transformation is invertible, so it’s possible to find the {q_{i}} as functions of the {\overline{q}_{i}}.

First, we need to show that the Euler-Lagrange equations are invariant under such a transformation. Starting with the inverse equations

\displaystyle  q_{i}=q_{i}\left(\overline{q}\right) \ \ \ \ \ (2)

(we’re using unsubscripted variables to refer to the entire set, so that {\overline{q}=\left(\overline{q}_{1},\ldots,\overline{q}_{n}\right)}), we have

\displaystyle  \dot{q}_{i}=\sum_{j}\frac{\partial q_{i}}{\partial\overline{q}_{j}}\dot{\overline{q}}_{j} \ \ \ \ \ (3)

Since the velocities {\dot{\overline{q}}_{j}} are independent variables, this implies that, if we hold the coordinates {\overline{q}} constant,

\displaystyle  \left(\frac{\partial\dot{q}_{i}}{\partial\dot{\overline{q}}_{j}}\right)_{\overline{q}}=\frac{\partial q_{i}}{\partial\overline{q}_{j}} \ \ \ \ \ (4)

since the derivative just picks out the one term containing {\dot{\overline{q}}_{j}} in the sum 3. Now consider the Euler-Lagrange equations in the new coordinates. To do this, we write the Lagrangian in terms of the new coordinates and velocities, so that

\displaystyle  L=L\left(\overline{q},\dot{\overline{q}}\right) \ \ \ \ \ (5)

Taking derivatives, we have

\displaystyle  \frac{\partial L}{\partial\overline{q}_{i}}=\sum_{j}\left[\frac{\partial L}{\partial q_{j}}\frac{\partial q_{j}}{\partial\overline{q}_{i}}+\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial\dot{q}_{j}}{\partial\overline{q}_{i}}\right] \ \ \ \ \ (6)

The velocities {\dot{q}_{j}} do depend on the new coordinates {\overline{q}}, through the coefficients {\frac{\partial q_{j}}{\partial\overline{q}_{k}}} in 3, so the second term on the RHS must be kept. Differentiating 3 with the velocities {\dot{\overline{q}}} held constant gives

\displaystyle  \frac{\partial\dot{q}_{j}}{\partial\overline{q}_{i}}=\sum_{k}\frac{\partial^{2}q_{j}}{\partial\overline{q}_{i}\partial\overline{q}_{k}}\dot{\overline{q}}_{k}=\frac{d}{dt}\left(\frac{\partial q_{j}}{\partial\overline{q}_{i}}\right) \ \ \ \ \ (7)

where the last equality follows because {\frac{\partial q_{j}}{\partial\overline{q}_{i}}} is a function of {\overline{q}} alone (we’re assuming that the transformation has no explicit time dependence).

Now for the other derivative

\displaystyle  \frac{\partial L}{\partial\dot{\overline{q}}_{i}}=\sum_{j}\left[\frac{\partial L}{\partial q_{j}}\frac{\partial q_{j}}{\partial\dot{\overline{q}}_{i}}+\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial\dot{q}_{j}}{\partial\dot{\overline{q}}_{i}}\right] \ \ \ \ \ (8)

The first term on the RHS is zero because the coordinates {q_{j}} don’t depend on the velocities {\dot{\overline{q}}}, and we can apply 4 to the second term to get

\displaystyle  \frac{\partial L}{\partial\dot{\overline{q}}_{i}}=\sum_{j}\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial q_{j}}{\partial\overline{q}_{i}} \ \ \ \ \ (9)

We can now take the derivative with respect to time using the product rule, then apply the Euler-Lagrange equation (which we know to be valid for the {q} coordinates) to the first term and 7 to the second. Thus

\displaystyle   \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\overline{q}}_{i}}\right) \displaystyle  = \displaystyle  \sum_{j}\left[\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_{j}}\right)\frac{\partial q_{j}}{\partial\overline{q}_{i}}+\frac{\partial L}{\partial\dot{q}_{j}}\frac{d}{dt}\left(\frac{\partial q_{j}}{\partial\overline{q}_{i}}\right)\right]\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j}\left[\frac{\partial L}{\partial q_{j}}\frac{\partial q_{j}}{\partial\overline{q}_{i}}+\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial\dot{q}_{j}}{\partial\overline{q}_{i}}\right] \ \ \ \ \ (11)

Comparing this with 6 we see that

\displaystyle  \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\overline{q}}_{i}}\right)=\frac{\partial L}{\partial\overline{q}_{i}} \ \ \ \ \ (12)

That is, the Euler-Lagrange equations are valid for the {\overline{q}} coordinates as well.
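
As a concrete illustration, consider a free particle of unit mass in two dimensions, with polar coordinates {\left(\rho,\phi\right)} playing the role of {\overline{q}}. This example and the numbers in it are my own choices. In rectangular coordinates the Euler-Lagrange equations give straight-line motion, while in polar coordinates {L=\frac{1}{2}\left(\dot{\rho}^{2}+\rho^{2}\dot{\phi}^{2}\right)}, so 12 reads {\ddot{\rho}=\rho\dot{\phi}^{2}} and {\frac{d}{dt}\left(\rho^{2}\dot{\phi}\right)=0}. The Python sketch below checks by finite differences that a straight-line trajectory satisfies these polar equations.

```python
import math

# Straight-line solution of the Cartesian equations (arbitrary constants).
def x(t):
    return 1.0 + 0.5 * t

def y(t):
    return -0.3 + 0.8 * t

# The same trajectory expressed in polar coordinates.
def rho(t):
    return math.hypot(x(t), y(t))

def phi(t):
    return math.atan2(y(t), x(t))

def ddt(f, t, h=1e-4):
    """Central-difference time derivative of f at time t."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Polar momenta for L = (rho_dot^2 + rho^2 phi_dot^2)/2 with unit mass.
def p_rho(t):
    return ddt(rho, t)

def p_phi(t):
    return rho(t) ** 2 * ddt(phi, t)

t0 = 0.4
# d/dt(dL/d rho_dot) - dL/d rho, with dL/d rho = rho * phi_dot^2.
radial_residual = ddt(p_rho, t0) - rho(t0) * ddt(phi, t0) ** 2
# d/dt(dL/d phi_dot) - dL/d phi, with dL/d phi = 0.
angular_residual = ddt(p_phi, t0)
print(radial_residual, angular_residual)
```

Both residuals should be zero to within the accuracy of the nested finite differences.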

We can use the Lagrangian to see how the momenta {p_{i}} transform under the coordinate change. The definition of the canonical momentum is

\displaystyle  p_{i}=\frac{\partial L}{\partial\dot{q}_{i}} \ \ \ \ \ (13)

If we write the Lagrangian in terms of the {\overline{q}} coordinates and velocities as in 5, then the momenta in the new coordinate system are

\displaystyle  \overline{p}_{i}=\frac{\partial L\left(\overline{q},\dot{\overline{q}}\right)}{\partial\dot{\overline{q}}_{i}} \ \ \ \ \ (14)

At this point, it’s worth noting that although {L\left(\overline{q},\dot{\overline{q}}\right)} and {L\left(q,\dot{q}\right)} are different functions, they have the same value at each point in the configuration space. That is, if we choose some point that has the coordinates {\left(q,\dot{q}\right)} in the {q} system and coordinates {\left(\overline{q},\dot{\overline{q}}\right)} in the {\overline{q}} system, then, numerically at that one point, we must have {L\left(\overline{q},\dot{\overline{q}}\right)=L\left(q,\dot{q}\right)}. Because of this, we can write

\displaystyle  \overline{p}_{i}=\left(\frac{\partial L\left(\overline{q},\dot{\overline{q}}\right)}{\partial\dot{\overline{q}}_{i}}\right)_{\overline{q}}=\left(\frac{\partial L\left(q,\dot{q}\right)}{\partial\dot{\overline{q}}_{i}}\right)_{\overline{q}} \ \ \ \ \ (15)

That is, if we’re keeping {\overline{q}} constant, the derivative of {L} with respect to {\dot{\overline{q}}_{i}} must be the same (numerically) no matter what coordinates we’re using to write {L}. Therefore, we can use the latter form and then use the chain rule to write out the derivative:

\displaystyle  \overline{p}_{i}=\left(\frac{\partial L\left(q,\dot{q}\right)}{\partial\dot{\overline{q}}_{i}}\right)_{\overline{q}}=\sum_{j}\left[\frac{\partial L}{\partial q_{j}}\frac{\partial q_{j}}{\partial\dot{\overline{q}}_{i}}+\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial\dot{q}_{j}}{\partial\dot{\overline{q}}_{i}}\right] \ \ \ \ \ (16)

Because the coordinates {q} don’t depend on the velocities {\dot{\overline{q}}}, the first term on the RHS is zero. We can use 4 in the second term, and we have

\displaystyle   \overline{p}_{i} \displaystyle  = \displaystyle  \sum_{j}\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial q_{j}}{\partial\overline{q}_{i}}\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j}\frac{\partial q_{j}}{\partial\overline{q}_{i}}p_{j} \ \ \ \ \ (18)

where we used the definition of {p_{j}=\partial L/\partial\dot{q}_{j}} in the last line.

If we review the derivation of Hamilton’s equations, we see that nowhere did we make any assumptions about the particular coordinate system that was being used in the Lagrangian. All that is required for Hamilton’s equations to be valid is that the momenta are defined as in 14, and that the Euler-Lagrange equations are satisfied. Therefore, in any such system, Hamilton’s equations are valid:

\displaystyle   \frac{\partial H}{\partial\overline{p}_{i}} \displaystyle  = \displaystyle  \dot{\overline{q}}_{i}\ \ \ \ \ (19)
\displaystyle  -\frac{\partial H}{\partial\overline{q}_{i}} \displaystyle  = \displaystyle  \dot{\overline{p}}_{i} \ \ \ \ \ (20)

A transformation of the form 1 and 18, that is, that obeys

\displaystyle   \overline{q}_{i} \displaystyle  = \displaystyle  \overline{q}_{i}\left(q_{1},\ldots,q_{n}\right)\ \ \ \ \ (21)
\displaystyle  \overline{p}_{i} \displaystyle  = \displaystyle  \sum_{j}\frac{\partial q_{j}}{\partial\overline{q}_{i}}p_{j} \ \ \ \ \ (22)

is called a point transformation.
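As a concrete check of 22, here is a minimal Python sketch for the change from rectangular coordinates {\left(x,y\right)} to polar coordinates {\left(r,\theta\right)}. It assumes a Lagrangian of the form {L=\frac{m}{2}\left(\dot{x}^{2}+\dot{y}^{2}\right)-V}, which is my choice for illustration and not something fixed by the text above; the function name and the numerical values are likewise hypothetical.

```python
import math

def polar_momenta(x, y, px, py):
    """Apply p_bar_i = sum_j (dq_j/dq_bar_i) p_j with q = (x, y) and q_bar = (r, theta)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    # Partial derivatives of x = r cos(theta), y = r sin(theta)
    dx_dr, dy_dr = math.cos(theta), math.sin(theta)
    dx_dtheta, dy_dtheta = -r * math.sin(theta), r * math.cos(theta)
    p_r = dx_dr * px + dy_dr * py
    p_theta = dx_dtheta * px + dy_dtheta * py
    return p_r, p_theta

# Illustrative numbers (assumptions, not from the text)
m = 2.0
r, theta = 1.5, 0.7
r_dot, theta_dot = 0.3, -0.4

x, y = r * math.cos(theta), r * math.sin(theta)
x_dot = r_dot * math.cos(theta) - r * theta_dot * math.sin(theta)
y_dot = r_dot * math.sin(theta) + r * theta_dot * math.cos(theta)

# Cartesian momenta for the assumed L = (m/2)(x_dot^2 + y_dot^2) - V
px, py = m * x_dot, m * y_dot

p_r, p_theta = polar_momenta(x, y, px, py)

# Momenta obtained by differentiating L = (m/2)(r_dot^2 + r^2 theta_dot^2) - V directly
p_r_direct = m * r_dot
p_theta_direct = m * r**2 * theta_dot

print(p_r, p_r_direct)
print(p_theta, p_theta_direct)
```

The two routes to {\overline{p}_{i}} should agree to rounding error, which is what the point transformation rule asserts.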

In the {2n}-dimensional phase space of the Hamiltonian formalism, where {q} and {p} are the variables rather than the {q} and {\dot{q}} used in the Lagrangian, we can envision a more general transformation in which

\displaystyle   \overline{q}_{i} \displaystyle  = \displaystyle  \overline{q}_{i}\left(q,p\right)\ \ \ \ \ (23)
\displaystyle  \overline{p}_{i} \displaystyle  = \displaystyle  \overline{p}_{i}\left(q,p\right) \ \ \ \ \ (24)

In such a general transformation, there’s no guarantee that 18 is satisfied, so such transformations need not be point transformations (though they could be). There’s also no guarantee that the momenta are related to the Lagrangian by 14, and thus Hamilton’s equations may not be satisfied.

However, a transformation to a set of coordinates {\left(\overline{q},\overline{p}\right)} that does satisfy Hamilton’s equations 19 and 20 is known as a canonical transformation.

Cyclic coordinates and Poisson brackets

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.7; Exercises 2.7.1 – 2.7.2.

Hamilton’s canonical equations are:

\displaystyle   \frac{\partial H}{\partial p_{i}} \displaystyle  = \displaystyle  \dot{q}_{i}\ \ \ \ \ (1)
\displaystyle  -\frac{\partial H}{\partial q_{i}} \displaystyle  = \displaystyle  \dot{p}_{i} \ \ \ \ \ (2)

If a coordinate {q_{i}} is missing in the Hamiltonian (that is, {H} is independent of {q_{i}}), then

\displaystyle  \dot{p}_{i}=-\frac{\partial H}{\partial q_{i}}=0 \ \ \ \ \ (3)

Thus the conjugate momentum {p_{i}} is conserved. Such a missing coordinate {q_{i}} is known as a cyclic coordinate. [I’m not sure of the origin of this term. Again Google doesn’t provide a definitive answer.]

There is a general method for calculating the rate of change of some function {\omega\left(p,q\right)} that depends on the momenta and coordinates, but not explicitly on the time ({\omega} is allowed to depend implicitly on time since {p} and/or {q} can depend on time). The time derivative can then be written using the chain rule:

\displaystyle   \frac{d\omega}{dt} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\dot{q}_{i}+\frac{\partial\omega}{\partial p_{i}}\dot{p}_{i}\right)\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial H}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial H}{\partial q_{i}}\right)\ \ \ \ \ (5)
\displaystyle  \displaystyle  \equiv \displaystyle  \left\{ \omega,H\right\}  \ \ \ \ \ (6)

where in the second line we used Hamilton’s equations 1 and 2. The last line defines the Poisson bracket of the function {\omega} with the Hamiltonian {H}. We can see that if {\left\{ \omega,H\right\} =0}, the function {\omega} is conserved.

Since {\left\{ H,H\right\} =0} automatically, the total energy (represented by the Hamiltonian) is conserved, provided there is no explicit time dependence. Such a time dependence can arise if the system is subject to some external force, for example.
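The relation {d\omega/dt=\left\{ \omega,H\right\} } can be tried out numerically. The following is a rough sketch that estimates the partial derivatives in 5 by central differences; the helper names and the test Hamiltonian {H=p^{2}/2m+kx^{2}/2} (with arbitrary {m} and {k}) are assumptions made for illustration only.

```python
def partial(func, q, p, i, wrt, h=1e-5):
    """Central-difference estimate of d(func)/dq_i (wrt='q') or d(func)/dp_i (wrt='p')."""
    q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
    if wrt == "q":
        q_plus[i] += h
        q_minus[i] -= h
    else:
        p_plus[i] += h
        p_minus[i] -= h
    return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2 * h)

def poisson_bracket(f, g, q, p):
    """Numerical {f, g} = sum_i (df/dq_i dg/dp_i - df/dp_i dg/dq_i) at the point (q, p)."""
    total = 0.0
    for i in range(len(q)):
        total += (partial(f, q, p, i, "q") * partial(g, q, p, i, "p")
                  - partial(f, q, p, i, "p") * partial(g, q, p, i, "q"))
    return total

# Assumed one-dimensional test system (not from the text)
m, k = 2.0, 3.0

def H(q, p):
    return p[0]**2 / (2 * m) + 0.5 * k * q[0]**2

def position(q, p):
    return q[0]

def momentum(q, p):
    return p[0]

q0, p0 = [0.4], [1.1]
x_dot = poisson_bracket(position, H, q0, p0)  # should be close to p/m
p_dot = poisson_bracket(momentum, H, q0, p0)  # should be close to -k x
H_dot = poisson_bracket(H, H, q0, p0)         # should vanish

print(x_dot, p_dot, H_dot)
```

Finite differences are used here only to keep the sketch free of external libraries; a symbolic package would give the brackets exactly.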

From the definition 5 we can derive a few fundamental properties of Poisson brackets. We’ll consider a general Poisson bracket between two arbitrary functions {\omega\left(p,q\right)} and {\lambda\left(p,q\right)}. Then

\displaystyle   \left\{ \omega,\lambda\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial\lambda}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial\lambda}{\partial q_{i}}\right)\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  -\sum_{i}\left(\frac{\partial\omega}{\partial p_{i}}\frac{\partial\lambda}{\partial q_{i}}-\frac{\partial\omega}{\partial q_{i}}\frac{\partial\lambda}{\partial p_{i}}\right)\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  -\sum_{i}\left(\frac{\partial\lambda}{\partial q_{i}}\frac{\partial\omega}{\partial p_{i}}-\frac{\partial\lambda}{\partial p_{i}}\frac{\partial\omega}{\partial q_{i}}\right)\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  -\left\{ \lambda,\omega\right\} \ \ \ \ \ (10)

A Poisson bracket is distributive, in the sense that

\displaystyle   \left\{ \omega,\lambda+\sigma\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial\left(\lambda+\sigma\right)}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial\left(\lambda+\sigma\right)}{\partial q_{i}}\right)\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\left[\frac{\partial\lambda}{\partial p_{i}}+\frac{\partial\sigma}{\partial p_{i}}\right]-\frac{\partial\omega}{\partial p_{i}}\left[\frac{\partial\lambda}{\partial q_{i}}+\frac{\partial\sigma}{\partial q_{i}}\right]\right)\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial\lambda}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial\lambda}{\partial q_{i}}\right)+\sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial\sigma}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial\sigma}{\partial q_{i}}\right)\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  \left\{ \omega,\lambda\right\} +\left\{ \omega,\sigma\right\} \ \ \ \ \ (14)

One more identity is useful, which we can derive using the product rule:

\displaystyle   \left\{ \omega,\lambda\sigma\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial\left(\lambda\sigma\right)}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial\left(\lambda\sigma\right)}{\partial q_{i}}\right)\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\sigma\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial\lambda}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial\lambda}{\partial q_{i}}\right)+\sum_{i}\lambda\left(\frac{\partial\omega}{\partial q_{i}}\frac{\partial\sigma}{\partial p_{i}}-\frac{\partial\omega}{\partial p_{i}}\frac{\partial\sigma}{\partial q_{i}}\right)\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  \left\{ \omega,\lambda\right\} \sigma+\left\{ \omega,\sigma\right\} \lambda \ \ \ \ \ (17)

The Poisson brackets involving the coordinates {q_{i}} and momenta {p_{i}} turn up frequently, so it’s worth deriving them in detail. We have

\displaystyle  \left\{ q_{i},q_{j}\right\} =\sum_{k}\left(\frac{\partial q_{i}}{\partial q_{k}}\frac{\partial q_{j}}{\partial p_{k}}-\frac{\partial q_{i}}{\partial p_{k}}\frac{\partial q_{j}}{\partial q_{k}}\right)=0 \ \ \ \ \ (18)

This follows because, in the Hamiltonian formalism, the {q_{i}}s and {p_{i}}s are independent variables, so {\frac{\partial q_{j}}{\partial p_{k}}=\frac{\partial p_{j}}{\partial q_{k}}=0} for all {j} and {k}. For the same reason, we have

\displaystyle  \left\{ p_{i},p_{j}\right\} =\sum_{k}\left(\frac{\partial p_{i}}{\partial q_{k}}\frac{\partial p_{j}}{\partial p_{k}}-\frac{\partial p_{i}}{\partial p_{k}}\frac{\partial p_{j}}{\partial q_{k}}\right)=0 \ \ \ \ \ (19)

The mixed Poisson bracket is a different story, however:

\displaystyle   \left\{ q_{i},p_{j}\right\} \displaystyle  = \displaystyle  \sum_{k}\left(\frac{\partial q_{i}}{\partial q_{k}}\frac{\partial p_{j}}{\partial p_{k}}-\frac{\partial q_{i}}{\partial p_{k}}\frac{\partial p_{j}}{\partial q_{k}}\right)\ \ \ \ \ (20)
\displaystyle  \displaystyle  = \displaystyle  \sum_{k}\delta_{ik}\delta_{jk}-0\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  \delta_{ij} \ \ \ \ \ (22)

Hamilton’s equations 1 and 2 can be written using Poisson brackets by setting {\omega} equal to {q_{i}} and {p_{i}} respectively in 6:

\displaystyle   \dot{q}_{i} \displaystyle  = \displaystyle  \left\{ q_{i},H\right\} \ \ \ \ \ (23)
\displaystyle  \dot{p}_{i} \displaystyle  = \displaystyle  \left\{ p_{i},H\right\} \ \ \ \ \ (24)

Example. In two dimensions, we have a Hamiltonian:

\displaystyle  H=p_{x}^{2}+p_{y}^{2}+ax^{2}+by^{2} \ \ \ \ \ (25)

If {a=b}, then in polar coordinates, the only coordinate appearing in {H} is the radial distance from the origin {r=\sqrt{x^{2}+y^{2}}}, which means that the polar angle {\theta} is a cyclic coordinate. This means that the conjugate momentum {p_{\theta}} must be conserved. That is,

\displaystyle  \dot{p}_{\theta}=\left\{ p_{\theta},H\right\} =0 \ \ \ \ \ (26)

However, {p_{\theta}} is the angular momentum {\ell_{z}}, so this just says that angular momentum is conserved.

To see this explicitly, it’s easier to convert to polar coordinates. From Hamilton’s equations

\displaystyle   \dot{x} \displaystyle  = \displaystyle  \frac{\partial H}{\partial p_{x}}=2p_{x}\ \ \ \ \ (27)
\displaystyle  \dot{y} \displaystyle  = \displaystyle  2p_{y}\ \ \ \ \ (28)
\displaystyle  p_{x}^{2}+p_{y}^{2} \displaystyle  = \displaystyle  \frac{1}{4}\left(\dot{x}^{2}+\dot{y}^{2}\right)\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  \frac{v^{2}}{4}\ \ \ \ \ (30)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{4}\left(\dot{r}^{2}+r^{2}\dot{\theta}^{2}\right) \ \ \ \ \ (31)

where in the fourth line, {v} is the linear velocity and in the fifth line we converted this to polar coordinates. Thus the Hamiltonian becomes, in the case where {a=b}:

\displaystyle  H=\frac{1}{4}\left(\dot{r}^{2}+r^{2}\dot{\theta}^{2}\right)+ar^{2} \ \ \ \ \ (32)

To find the conjugate momenta in polar coordinates, we can write out the Lagrangian. We use {p_{x}\dot{x}=\frac{\dot{x}^{2}}{2}} and {p_{y}\dot{y}=\frac{\dot{y}^{2}}{2}} and get

\displaystyle   L \displaystyle  = \displaystyle  \sum_{i}p_{i}\dot{q}_{i}-H\ \ \ \ \ (33)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\left(\dot{x}^{2}+\dot{y}^{2}\right)-\frac{1}{4}\left(\dot{r}^{2}+r^{2}\dot{\theta}^{2}\right)-ar^{2}\ \ \ \ \ (34)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{4}\left(\dot{r}^{2}+r^{2}\dot{\theta}^{2}\right)-ar^{2} \ \ \ \ \ (35)

The conjugate momenta are thus

\displaystyle   p_{\theta} \displaystyle  = \displaystyle  \frac{\partial L}{\partial\dot{\theta}}=\frac{1}{2}r^{2}\dot{\theta}\ \ \ \ \ (36)
\displaystyle  p_{r} \displaystyle  = \displaystyle  \frac{\partial L}{\partial\dot{r}}=\frac{\dot{r}}{2} \ \ \ \ \ (37)

From this we can see that {p_{\theta}} is indeed angular momentum as it’s proportional to the product of {r} and the tangential velocity {v_{\theta}=r\dot{\theta}}. (‘Real’ momentum and angular momentum must, of course, also contain a factor of a mass, but from the definition of the Hamiltonian above, we see that the mass has been incorporated into the momentum parameters.)

Solving 36 and 37 for the velocities gives {\dot{\theta}=2p_{\theta}/r^{2}} and {\dot{r}=2p_{r}}. Plugging these back into 32 we get

\displaystyle  H=p_{r}^{2}+\frac{p_{\theta}^{2}}{r^{2}}+ar^{2} \ \ \ \ \ (38)

We can now calculate the Poisson brackets easily:

\displaystyle   \left\{ p_{\theta},H\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial p_{\theta}}{\partial q_{i}}\frac{\partial H}{\partial p_{i}}-\frac{\partial p_{\theta}}{\partial p_{i}}\frac{\partial H}{\partial q_{i}}\right)\ \ \ \ \ (39)
\displaystyle  \displaystyle  = \displaystyle  0-\frac{\partial p_{\theta}}{\partial p_{\theta}}\frac{\partial H}{\partial\theta}=0\ \ \ \ \ (40)
\displaystyle  \left\{ p_{r},H\right\} \displaystyle  = \displaystyle  \sum_{i}\left(\frac{\partial p_{r}}{\partial q_{i}}\frac{\partial H}{\partial p_{i}}-\frac{\partial p_{r}}{\partial p_{i}}\frac{\partial H}{\partial q_{i}}\right)\ \ \ \ \ (41)
\displaystyle  \displaystyle  = \displaystyle  0-\frac{\partial p_{r}}{\partial p_{r}}\frac{\partial H}{\partial r}\ \ \ \ \ (42)
\displaystyle  \displaystyle  = \displaystyle  \frac{2p_{\theta}^{2}}{r^{3}}-2ar \ \ \ \ \ (43)

Thus {p_{\theta}} (the angular momentum) is conserved, since {H} does not depend on {\theta}. The radial momentum is not conserved: {\dot{p}_{r}} is the sum of an outward centrifugal term {2p_{\theta}^{2}/r^{3}} and (for {a>0}) an inward restoring force {-2ar}, so the object is pulled in towards the origin whenever the restoring term dominates.
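The conservation of {p_{\theta}} can also be seen numerically without leaving rectangular coordinates. Since {p_{\theta}=\frac{1}{2}r^{2}\dot{\theta}=\frac{1}{2}\left(x\dot{y}-y\dot{x}\right)} and {\dot{x}=2p_{x}}, {\dot{y}=2p_{y}}, we have {p_{\theta}=xp_{y}-yp_{x}}. The sketch below integrates Hamilton’s equations for 25 with a simple leapfrog scheme and tracks this quantity; the integrator, step size, initial conditions and function name are all my own assumptions.

```python
def max_angular_momentum_drift(a, b, steps=3000, dt=1e-4):
    """Integrate H = px^2 + py^2 + a x^2 + b y^2 with a leapfrog scheme and
    return the largest |l(t) - l(0)|, where l = x*py - y*px."""
    x, y, px, py = 1.0, 0.5, 0.0, 0.3  # assumed initial conditions
    l0 = x * py - y * px
    drift = 0.0
    for _ in range(steps):
        # Half kick: p_dot = -dH/dq
        px -= 0.5 * dt * 2 * a * x
        py -= 0.5 * dt * 2 * b * y
        # Drift: q_dot = dH/dp = 2p
        x += dt * 2 * px
        y += dt * 2 * py
        # Second half kick
        px -= 0.5 * dt * 2 * a * x
        py -= 0.5 * dt * 2 * b * y
        drift = max(drift, abs(x * py - y * px - l0))
    return drift

drift_isotropic = max_angular_momentum_drift(1.0, 1.0)    # a = b
drift_anisotropic = max_angular_momentum_drift(1.0, 2.0)  # a != b

print(drift_isotropic, drift_anisotropic)
```

With {a=b} the drift should stay at the level of rounding error, while with {a\ne b} the angle {\theta} is no longer cyclic and {xp_{y}-yp_{x}} changes noticeably over the same time interval.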

Hamiltonian for the electromagnetic force

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.6.

Here we derive the Hamiltonian for a charged particle subject to the electromagnetic force.

The Hamiltonian is given by

\displaystyle  H\left(q,p\right)=\sum_{i}p_{i}\dot{q}_{i}-L\left(q,\dot{q}\right) \ \ \ \ \ (1)

where the velocities {\dot{q}_{i}} are expressed in terms of the positions {q_{i}} and momenta {p_{i}}. The electromagnetic Lagrangian is

\displaystyle  L=\frac{1}{2}m\mathbf{v}\cdot\mathbf{v}-q\phi+\frac{q}{c}\mathbf{v}\cdot\mathbf{A} \ \ \ \ \ (2)

where {\phi} is the electric potential and {\mathbf{A}} is the magnetic vector potential, with {\mathbf{v}} the velocity of the charge {q} with mass {m}. To convert to the Hamiltonian, we need the momentum, defined as

\displaystyle  p_{i}=\frac{\partial L}{\partial\dot{q}_{i}}

In this case, the generalized velocity is given by

\displaystyle  \dot{q}_{i}=v_{i} \ \ \ \ \ (3)

so we have

\displaystyle  p_{i}=mv_{i}+\frac{q}{c}A_{i} \ \ \ \ \ (4)

or, in vector notation

\displaystyle   \mathbf{p} \displaystyle  = \displaystyle  m\mathbf{v}+\frac{q}{c}\mathbf{A}\ \ \ \ \ (5)
\displaystyle  \mathbf{v} \displaystyle  = \displaystyle  \frac{\mathbf{p}}{m}-\frac{q}{mc}\mathbf{A} \ \ \ \ \ (6)

The Lagrangian is therefore

\displaystyle  L=\frac{\left|\mathbf{p}-q\mathbf{A}/c\right|^{2}}{2m}-q\phi+\frac{q}{c}\left(\frac{\mathbf{p}}{m}-\frac{q}{mc}\mathbf{A}\right)\cdot\mathbf{A} \ \ \ \ \ (7)

The first sum in the Hamiltonian is

\displaystyle  \sum_{i}p_{i}\dot{q}_{i}=\mathbf{p}\cdot\mathbf{v}=\mathbf{p}\cdot\left(\frac{\mathbf{p}}{m}-\frac{q}{mc}\mathbf{A}\right) \ \ \ \ \ (8)

The Hamiltonian is then

\displaystyle   H \displaystyle  = \displaystyle  \mathbf{p}\cdot\left(\frac{\mathbf{p}}{m}-\frac{q}{mc}\mathbf{A}\right)-\frac{\left|\mathbf{p}-q\mathbf{A}/c\right|^{2}}{2m}+q\phi-\frac{q}{c}\left(\frac{\mathbf{p}}{m}-\frac{q}{mc}\mathbf{A}\right)\cdot\mathbf{A}\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \left(\frac{\mathbf{p}}{m}-\frac{q}{mc}\mathbf{A}\right)\cdot\left(\mathbf{p}-\frac{q}{c}\mathbf{A}\right)-\frac{\left|\mathbf{p}-q\mathbf{A}/c\right|^{2}}{2m}+q\phi\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \frac{\left|\mathbf{p}-q\mathbf{A}/c\right|^{2}}{2m}+q\phi \ \ \ \ \ (11)
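The algebra leading from 9 to 11 is easy to get wrong, so a quick numerical sanity check may be useful. This sketch evaluates {\mathbf{p}\cdot\mathbf{v}-L} directly and compares it with the closed form 11 at a single point; the function names and all numerical values (including treating {\phi} and {\mathbf{A}} as fixed numbers at that point) are assumptions for illustration.

```python
def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def hamiltonian_via_legendre(p, A, phi, q, m, c):
    """H = p.v - L, with v eliminated using v = p/m - q A/(m c)."""
    v = [(pi - q * Ai / c) / m for pi, Ai in zip(p, A)]
    lagrangian = 0.5 * m * dot(v, v) - q * phi + (q / c) * dot(v, A)
    return dot(p, v) - lagrangian

def hamiltonian_closed_form(p, A, phi, q, m, c):
    """H = |p - q A/c|^2 / (2m) + q phi."""
    kinetic = [pi - q * Ai / c for pi, Ai in zip(p, A)]
    return dot(kinetic, kinetic) / (2 * m) + q * phi

# Arbitrary test values (assumptions, not from the text)
p = [0.4, -1.2, 0.7]
A = [0.3, 0.5, -0.2]
phi, q, m, c = 1.1, -1.5, 2.0, 3.0

H_legendre = hamiltonian_via_legendre(p, A, phi, q, m, c)
H_closed = hamiltonian_closed_form(p, A, phi, q, m, c)

print(H_legendre, H_closed)
```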

Hamiltonian for the two-body problem

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.5; Exercise 2.5.4.

Here we derive the equations of motion of the two-body problem using the Hamiltonian formalism.

The Hamiltonian is given by

\displaystyle H\left(q,p\right)=\sum_{i}p_{i}\dot{q}_{i}-L\left(q,\dot{q}\right) \ \ \ \ \ (1)

where the velocities {\dot{q}_{i}} are expressed in terms of the positions {q_{i}} and momenta {p_{i}}. In this case, we start with the Lagrangian in terms of the centre of mass position {\mathbf{r}_{CM}} and the relative position {\mathbf{r}} of mass 2 to mass 1.

\displaystyle L \displaystyle = \displaystyle \frac{1}{2}\left(m_{1}+m_{2}\right)\left|\dot{\mathbf{r}}_{CM}\right|^{2}+\frac{1}{2}\frac{m_{1}m_{2}}{m_{1}+m_{2}}\left|\dot{\mathbf{r}}\right|^{2}-V\left(\mathbf{r}\right)\ \ \ \ \ (2)
\displaystyle \displaystyle = \displaystyle \frac{M}{2}\left|\dot{\mathbf{r}}_{CM}\right|^{2}+\frac{\mu}{2}\left|\dot{\mathbf{r}}\right|^{2}-V\left(\mathbf{r}\right) \ \ \ \ \ (3)

where {M=m_{1}+m_{2}} is the total mass and {\mu=\frac{m_{1}m_{2}}{m_{1}+m_{2}}} is the reduced mass.
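The kinetic part of 3 can be checked against the kinetic energies of the two individual masses. The sketch below assumes the usual definitions {\mathbf{r}_{CM}=\left(m_{1}\mathbf{r}_{1}+m_{2}\mathbf{r}_{2}\right)/M} and {\mathbf{r}=\mathbf{r}_{1}-\mathbf{r}_{2}} (the sign convention for {\mathbf{r}} doesn’t affect the kinetic energy); the masses and velocities are arbitrary test values.

```python
def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

# Arbitrary test values (assumptions, not from the text)
m1, m2 = 1.5, 4.0
v1 = [0.2, -1.0, 0.5]
v2 = [-0.7, 0.3, 1.1]

M = m1 + m2           # total mass
mu = m1 * m2 / M      # reduced mass

# Velocities of the centre of mass and of the relative position
v_cm = [(m1 * a + m2 * b) / M for a, b in zip(v1, v2)]
v_rel = [a - b for a, b in zip(v1, v2)]

T_particles = 0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)
T_split = 0.5 * M * dot(v_cm, v_cm) + 0.5 * mu * dot(v_rel, v_rel)

print(T_particles, T_split)
```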

There are potentially 6 velocity components and 6 coordinate components in the Lagrangian, but the 3 components of {\mathbf{r}_{CM}} do not appear, which simplifies things a bit. To convert to a Hamiltonian, we need the momenta

\displaystyle p_{i}=\frac{\partial L}{\partial\dot{q}_{i}} \ \ \ \ \ (4)

The {x} component of momentum of the centre of mass is

\displaystyle p_{CM,x}=\frac{\partial L}{\partial\dot{r}_{CM,x}}=M\dot{r}_{CM,x} \ \ \ \ \ (5)

The other two components of the centre of mass momentum, and the components of the relative momentum, have a similar form, and in general we can write

\displaystyle p_{CM,i} \displaystyle = \displaystyle M\dot{r}_{CM,i}\ \ \ \ \ (6)
\displaystyle p_{i} \displaystyle = \displaystyle \mu\dot{r}_{i} \ \ \ \ \ (7)

In vector notation, this becomes

\displaystyle \dot{\mathbf{r}}_{CM} \displaystyle = \displaystyle \frac{\mathbf{p}_{CM}}{M}\ \ \ \ \ (8)
\displaystyle \dot{\mathbf{r}} \displaystyle = \displaystyle \frac{\mathbf{p}}{\mu}\ \ \ \ \ (9)
\displaystyle \left|\dot{\mathbf{r}}_{CM}\right|^{2} \displaystyle = \displaystyle \frac{\left|\mathbf{p}_{CM}\right|^{2}}{M^{2}}\ \ \ \ \ (10)
\displaystyle \left|\dot{\mathbf{r}}\right|^{2} \displaystyle = \displaystyle \frac{\left|\mathbf{p}\right|^{2}}{\mu^{2}} \ \ \ \ \ (11)

The Lagrangian thus becomes

\displaystyle L=\frac{\left|\mathbf{p}_{CM}\right|^{2}}{2M}+\frac{\left|\mathbf{p}\right|^{2}}{2\mu}-V\left(\mathbf{r}\right) \ \ \ \ \ (12)

The Hamiltonian is

\displaystyle H \displaystyle = \displaystyle \mathbf{p}\cdot\dot{\mathbf{r}}+\mathbf{p}_{CM}\cdot\dot{\mathbf{r}}_{CM}-L\ \ \ \ \ (13)
\displaystyle \displaystyle = \displaystyle \frac{\left|\mathbf{p}\right|^{2}}{\mu}+\frac{\left|\mathbf{p}_{CM}\right|^{2}}{M}-\left[\frac{\left|\mathbf{p}_{CM}\right|^{2}}{2M}+\frac{\left|\mathbf{p}\right|^{2}}{2\mu}-V\left(\mathbf{r}\right)\right]\ \ \ \ \ (14)
\displaystyle \displaystyle = \displaystyle \frac{\left|\mathbf{p}_{CM}\right|^{2}}{2M}+\frac{\left|\mathbf{p}\right|^{2}}{2\mu}+V\left(\mathbf{r}\right) \ \ \ \ \ (15)

Once we’ve got the Hamiltonian, we can apply Hamilton’s canonical equations to get the equations of motion.

\displaystyle \frac{\partial H}{\partial p_{i}} \displaystyle = \displaystyle \dot{r}_{i}\ \ \ \ \ (16)
\displaystyle -\frac{\partial H}{\partial r_{i}} \displaystyle = \displaystyle \dot{p}_{i} \ \ \ \ \ (17)

Since {\mathbf{r}_{CM}} does not appear in the Hamiltonian, we have

\displaystyle \dot{\mathbf{p}}_{CM} \displaystyle = \displaystyle 0\ \ \ \ \ (18)
\displaystyle \mathbf{p}_{CM} \displaystyle = \displaystyle \mbox{constant} \ \ \ \ \ (19)

so the momentum of the centre of mass does not change, as expected.

For {\mathbf{r}}, we have

\displaystyle \frac{\partial H}{\partial p_{i}} \displaystyle = \displaystyle \frac{p_{i}}{\mu}=\dot{r}_{i}\ \ \ \ \ (20)
\displaystyle \frac{\partial H}{\partial r_{i}} \displaystyle = \displaystyle \frac{\partial V}{\partial r_{i}}=-\dot{p}_{i} \ \ \ \ \ (21)

The first equation tells us nothing new, while the second is just Newton’s second law for a force derived from the potential {V}: {\mathbf{\dot{p}}=-\nabla V}.

Hamiltonians for harmonic oscillators

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.5; Exercises 2.5.2 – 2.5.3.

Here are a couple of examples of equations of motion using the Hamiltonian formalism. First, we look at the simple harmonic oscillator, in which we have a mass {m} sliding on a frictionless horizontal surface. The mass is connected to a spring with constant {k}, with the other end of the spring connected to a fixed support.

The Hamiltonian is given by

\displaystyle H\left(q,p\right)=\sum_{i}p_{i}\dot{q}_{i}-L\left(q,\dot{q}\right) \ \ \ \ \ (1)

where the velocities {\dot{q}_{i}} are expressed in terms of the positions {q_{i}} and momenta {p_{i}}. In this case, we have, using the coordinate {x} as the displacement from equilibrium

\displaystyle L\left(x,\dot{x}\right) \displaystyle = \displaystyle \frac{1}{2}m\dot{x}^{2}-\frac{1}{2}kx^{2}\ \ \ \ \ (2)
\displaystyle p \displaystyle = \displaystyle \frac{\partial L}{\partial\dot{x}}=m\dot{x}\ \ \ \ \ (3)
\displaystyle \dot{x} \displaystyle = \displaystyle \frac{p}{m}\ \ \ \ \ (4)
\displaystyle L\left(x,\dot{x}\left(x,p\right)\right) \displaystyle = \displaystyle \frac{p^{2}}{2m}-\frac{1}{2}kx^{2}\ \ \ \ \ (5)
\displaystyle H \displaystyle = \displaystyle \frac{p^{2}}{m}-\left(\frac{p^{2}}{2m}-\frac{1}{2}kx^{2}\right)\ \ \ \ \ (6)
\displaystyle \displaystyle = \displaystyle \frac{p^{2}}{2m}+\frac{1}{2}kx^{2} \ \ \ \ \ (7)

We can now apply Hamilton’s canonical equations:

\displaystyle \frac{\partial H}{\partial p} \displaystyle = \displaystyle \dot{x}\ \ \ \ \ (8)
\displaystyle -\frac{\partial H}{\partial x} \displaystyle = \displaystyle \dot{p} \ \ \ \ \ (9)

We get

\displaystyle \frac{\partial H}{\partial p} \displaystyle = \displaystyle \frac{p}{m}=\dot{x}\ \ \ \ \ (10)
\displaystyle -\frac{\partial H}{\partial x} \displaystyle = \displaystyle -kx=\dot{p} \ \ \ \ \ (11)

We thus get a pair of first order ODEs which can be solved in the usual way, given {x\left(0\right)} and {p\left(0\right)}. The second order ODE that we got by using the Lagrangian method can be obtained by differentiating the first equation and plugging it into the second:

\displaystyle \ddot{x} \displaystyle = \displaystyle \frac{\dot{p}}{m}\ \ \ \ \ (12)
\displaystyle \displaystyle = \displaystyle -\frac{k}{m}x \ \ \ \ \ (13)

From 7 we see that, in the absence of external forces, the total energy {H=T+V=E} is a constant, so

\displaystyle \frac{p^{2}}{2m}+\frac{1}{2}kx^{2}=E=\mbox{constant} \ \ \ \ \ (14)

This can be written as the equation of an ellipse:

\displaystyle \frac{p^{2}}{b^{2}}+\frac{x^{2}}{a^{2}}=1 \ \ \ \ \ (15)

where

\displaystyle a^{2} \displaystyle = \displaystyle \frac{2E}{k}\ \ \ \ \ (16)
\displaystyle b^{2} \displaystyle = \displaystyle 2mE \ \ \ \ \ (17)
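To see the ellipse 15 emerge from the first order equations 10 and 11, we can integrate them numerically and test each point of the trajectory. This is only a sketch: the leapfrog integrator, the parameter values and the initial conditions are my assumptions, and the small residual deviation reflects the integrator’s error, not the physics.

```python
# Assumed parameters and initial conditions (not from the text)
m, k = 1.0, 4.0
x, p = 0.5, 0.8

E = p**2 / (2 * m) + 0.5 * k * x**2  # total energy at t = 0
a_sq = 2 * E / k                     # a^2 from equation 16
b_sq = 2 * m * E                     # b^2 from equation 17

dt = 1e-3
steps = 4000  # a bit more than one period, 2*pi*sqrt(m/k)
max_deviation = 0.0

for _ in range(steps):
    # Leapfrog update of x_dot = p/m, p_dot = -k x
    p -= 0.5 * dt * k * x
    x += dt * p / m
    p -= 0.5 * dt * k * x
    # How far is the point from the ellipse p^2/b^2 + x^2/a^2 = 1?
    max_deviation = max(max_deviation, abs(p**2 / b_sq + x**2 / a_sq - 1.0))

print(max_deviation)
```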

We can use the Hamiltonian formalism to get the equations of motion of the coupled harmonic oscillator. From our Lagrangian treatment, we had

\displaystyle L=\frac{1}{2}m\left(\dot{x}_{1}^{2}+\dot{x}_{2}^{2}\right)-k\left(x_{1}^{2}+x_{2}^{2}-x_{1}x_{2}\right) \ \ \ \ \ (18)

Converting to coordinates and momenta, we have

\displaystyle p_{i} \displaystyle = \displaystyle \frac{\partial L}{\partial\dot{x}_{i}}=m\dot{x}_{i}\ \ \ \ \ (19)
\displaystyle \dot{x}_{i} \displaystyle = \displaystyle \frac{p_{i}}{m}\ \ \ \ \ (20)
\displaystyle H \displaystyle = \displaystyle \sum_{i}p_{i}\dot{x}_{i}-L\left(x,\dot{x}\right)\ \ \ \ \ (21)
\displaystyle \displaystyle = \displaystyle \frac{1}{m}\left(p_{1}^{2}+p_{2}^{2}\right)-\left[\frac{1}{2m}\left(p_{1}^{2}+p_{2}^{2}\right)-k\left(x_{1}^{2}+x_{2}^{2}-x_{1}x_{2}\right)\right]\ \ \ \ \ (22)
\displaystyle \displaystyle = \displaystyle \frac{1}{2m}\left(p_{1}^{2}+p_{2}^{2}\right)+k\left(x_{1}^{2}+x_{2}^{2}-x_{1}x_{2}\right) \ \ \ \ \ (23)

Applying the canonical equations gives

\displaystyle \frac{\partial H}{\partial p_{i}} \displaystyle = \displaystyle \frac{p_{i}}{m}=\dot{x}_{i}\ \ \ \ \ (24)
\displaystyle -\frac{\partial H}{\partial x_{1}} \displaystyle = \displaystyle -2kx_{1}+kx_{2}=\dot{p}_{1}\ \ \ \ \ (25)
\displaystyle -\frac{\partial H}{\partial x_{2}} \displaystyle = \displaystyle -2kx_{2}+kx_{1}=\dot{p}_{2} \ \ \ \ \ (26)

Again, by taking the derivative of the first line and substituting into the last two lines, we get back the previous equations of motion:

\displaystyle \ddot{x}_{1} \displaystyle = \displaystyle -2\frac{k}{m}x_{1}+\frac{k}{m}x_{2}\ \ \ \ \ (27)
\displaystyle \ddot{x}_{2} \displaystyle = \displaystyle \frac{k}{m}x_{1}-2\frac{k}{m}x_{2} \ \ \ \ \ (28)
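The force terms in 25 and 26 can be verified by differentiating the Hamiltonian 23 numerically. This short sketch uses central differences at one arbitrarily chosen phase space point; the parameter values and function names are assumptions for illustration.

```python
# Assumed parameters (not from the text)
m, k = 1.0, 2.5

def hamiltonian(x1, x2, p1, p2):
    """Coupled-oscillator Hamiltonian, equation 23."""
    return (p1**2 + p2**2) / (2 * m) + k * (x1**2 + x2**2 - x1 * x2)

def p_dot_numeric(x1, x2, p1, p2, h=1e-5):
    """Return (p1_dot, p2_dot) = (-dH/dx1, -dH/dx2) by central differences."""
    dH_dx1 = (hamiltonian(x1 + h, x2, p1, p2) - hamiltonian(x1 - h, x2, p1, p2)) / (2 * h)
    dH_dx2 = (hamiltonian(x1, x2 + h, p1, p2) - hamiltonian(x1, x2 - h, p1, p2)) / (2 * h)
    return -dH_dx1, -dH_dx2

x1, x2, p1, p2 = 0.3, -0.8, 0.1, 0.6
p1_dot, p2_dot = p_dot_numeric(x1, x2, p1, p2)

# Analytic right-hand sides of equations 25 and 26
p1_dot_expected = -2 * k * x1 + k * x2
p2_dot_expected = -2 * k * x2 + k * x1

print(p1_dot, p1_dot_expected)
print(p2_dot, p2_dot_expected)
```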

Hamiltonian formalism and Legendre transformations

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.5; Exercise 2.5.1.

The Lagrangian formulation of classical mechanics is one of two principal formalisms used to obtain equations of motion for a system. The other method is the Hamiltonian formalism. The main difference between the two methods is that the Lagrangian treats the generalized coordinates {q_{i}} and their respective velocities {\dot{q}_{i}} as the independent variables, while in the Hamiltonian formalism, the coordinates and their associated momenta are the independent variables. The momentum {p_{i}} corresponding to a coordinate {q_{i}} is defined by

\displaystyle  p_{i}\equiv\frac{\partial L}{\partial\dot{q}_{i}} \ \ \ \ \ (1)

The Lagrangian is replaced by a function {H\left(q,p\right)} (where we’re using unsubscripted variables {q} and {p} to represent the sets of coordinates and momenta) with the property that

\displaystyle  \dot{q}_{i}=\frac{\partial H}{\partial p_{i}} \ \ \ \ \ (2)

The method for transforming from the Lagrangian picture to the Hamiltonian picture is known as a Legendre transformation and works as follows. Suppose we start with a function {f\left(x_{1},x_{2},\ldots,x_{n}\right)} (here, the {x_{i}} can be any independent variables; we’re not considering coordinates explicitly yet) and we want to replace a subset {\left\{ x_{i},i=1,\ldots,j\right\} } with different variables {u_{i}}, where

\displaystyle  u_{i}\equiv\frac{\partial f}{\partial x_{i}} \ \ \ \ \ (3)

We now construct the function

\displaystyle  g\left(u_{1},\ldots,u_{j},x_{j+1},\ldots,x_{n}\right)\equiv\sum_{i=1}^{j}u_{i}x_{i}-f\left(x_{1},\ldots,x_{n}\right) \ \ \ \ \ (4)

We’re assuming that all the {x_{i}} in the set {\left\{ x_{i},i=1,\ldots,j\right\} } can be written as functions of {\left\{ u_{1},\ldots,u_{j},x_{j+1},\ldots,x_{n}\right\} }. In other words, when written out in full, 4 contains only the variables {\left\{ u_{1},\ldots,u_{j},x_{j+1},\ldots,x_{n}\right\} }. We can now take the derivative:

\displaystyle   \frac{\partial g}{\partial u_{i}} \displaystyle  = \displaystyle  x_{i}+\sum_{k=1}^{j}\left[u_{k}\frac{\partial x_{k}}{\partial u_{i}}-\frac{\partial f}{\partial x_{k}}\frac{\partial x_{k}}{\partial u_{i}}\right]\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  x_{i}+\sum_{k=1}^{j}\left[u_{k}\frac{\partial x_{k}}{\partial u_{i}}-u_{k}\frac{\partial x_{k}}{\partial u_{i}}\right]\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  x_{i} \ \ \ \ \ (7)

where the second line follows from the definition 3, and we’ve used {k} as the summation index to keep it distinct from the free index {i}.
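A small worked case may make 7 more concrete. In the sketch below I take {n=2}, {j=1} and the test function {f\left(x_{1},x_{2}\right)=\frac{1}{2}ax_{1}^{2}+x_{1}x_{2}}, so that {u=ax_{1}+x_{2}} can be inverted by hand. The function {f}, the constant {a} and the names used are all assumptions chosen for illustration.

```python
a = 3.0  # assumed constant in the test function

def f(x1, x2):
    """Test function; only x1 is traded for u = df/dx1."""
    return 0.5 * a * x1**2 + x1 * x2

def x1_from_u(u, x2):
    """Invert u = df/dx1 = a*x1 + x2 for x1."""
    return (u - x2) / a

def g(u, x2):
    """Legendre transform g(u, x2) = u*x1 - f(x1, x2), with x1 expressed via u and x2."""
    x1 = x1_from_u(u, x2)
    return u * x1 - f(x1, x2)

# Pick a point, compute u there, then differentiate g numerically
x1, x2 = 0.7, -0.4
u = a * x1 + x2
h = 1e-5
dg_du = (g(u + h, x2) - g(u - h, x2)) / (2 * h)

print(dg_du, x1)  # dg/du should reproduce x1
```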

To move from the Lagrangian formalism to the Hamiltonian formalism, the Lagrangian plays the role of {f}, the generalized velocities {\dot{q}_{i}} are the variables {\left\{ x_{i},i=1,\ldots,j\right\} } to be replaced, and the Hamiltonian is the new function {g}. That is, we have

\displaystyle  H\left(q,p\right)=\sum_{i=1}^{n}p_{i}\dot{q}_{i}-L\left(q,\dot{q}\right) \ \ \ \ \ (8)

There are a total of {n} momenta {p_{i}} and {n} coordinates {q_{i}}, for a total of {2n} independent variables. In 8, it is assumed that we can express all the velocities {\dot{q}_{i}} as functions of {q_{i}} and {p_{i}}. With these definitions, we can see by following through the derivation of 7 that 2 is satisfied.

We can get another equation by considering the derivative

\displaystyle   \frac{\partial H}{\partial q_{i}} \displaystyle  = \displaystyle  \sum_{j=1}^{n}p_{j}\frac{\partial\dot{q}_{j}}{\partial q_{i}}-\frac{\partial L}{\partial q_{i}}-\sum_{j=1}^{n}\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial\dot{q}_{j}}{\partial q_{i}}\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j=1}^{n}\left[p_{j}\frac{\partial\dot{q}_{j}}{\partial q_{i}}-\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial\dot{q}_{j}}{\partial q_{i}}\right]-\frac{\partial L}{\partial q_{i}}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j=1}^{n}\left[p_{j}\frac{\partial\dot{q}_{j}}{\partial q_{i}}-p_{j}\frac{\partial\dot{q}_{j}}{\partial q_{i}}\right]-\frac{\partial L}{\partial q_{i}}\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  -\frac{\partial L}{\partial q_{i}}\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  -\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{i}}\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  -\dot{p}_{i} \ \ \ \ \ (14)

In the third line, we used 1, in the fifth line we used the Euler-Lagrange equation

\displaystyle  \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{i}}-\frac{\partial L}{\partial q_{i}}=0 \ \ \ \ \ (15)

and in the last line, we used 1 again. We thus get Hamilton’s canonical equations:

\displaystyle   \frac{\partial H}{\partial p_{i}} \displaystyle  = \displaystyle  \dot{q}_{i}\ \ \ \ \ (16)
\displaystyle  -\frac{\partial H}{\partial q_{i}} \displaystyle  = \displaystyle  \dot{p}_{i} \ \ \ \ \ (17)

[As an aside at this point, I was (and still am) unsure exactly what the term ‘canonical’ means in this, or in almost any other, context. Google is not very helpful in this respect, as it appears that nobody else really knows where the term came from. According to Wikipedia, the term ‘canonical’ is used to describe equations in several areas of mathematics, physics and even computer science, but ultimately the term appears to originate in religion, as in ‘canon law’, which is a system of laws created by the Catholic church. Presumably the term in physics is used to describe some equation or principle which is widely applicable and general. Any other thoughts are welcome in the comments.]

In cases where the potential energy doesn’t depend on velocity, the Lagrangian is {T-V}, where {T} is the kinetic energy. The Hamiltonian (as you’ve probably guessed) can be interpreted as the total energy of such a system, as we can see as follows.

In rectangular coordinates, each mass {m_{i}} has a kinetic energy {T_{i}=\frac{1}{2}m_{i}\dot{x}_{i}^{2}} (this is true in one dimension; to extend to 3 dimensions, we write {T_{i}=\frac{1}{2}m_{i}\left(\dot{x}_{i}^{2}+\dot{y}_{i}^{2}+\dot{z}_{i}^{2}\right)} and the same argument follows). Thus the momentum is

\displaystyle  p_{i}=\frac{\partial L}{\partial\dot{x}_{i}}=\frac{\partial T}{\partial\dot{x}_{i}}=m_{i}\dot{x}_{i} \ \ \ \ \ (18)

Thus the first term in 8 is

\displaystyle  \sum_{i=1}^{n}p_{i}\dot{q}_{i}=\sum_{i=1}^{n}m_{i}\dot{x}_{i}^{2}=2T \ \ \ \ \ (19)

and the Hamiltonian is

\displaystyle  H=2T-L=2T-T+V=T+V \ \ \ \ \ (20)

Now consider a more general kinetic energy defined as

\displaystyle  T=\sum_{i}\sum_{j}T_{ij}\left(q\right)\dot{q}_{i}\dot{q}_{j} \ \ \ \ \ (21)

That is, the coefficients {T_{ij}} form a matrix that depends on the positions of the various masses. We have

\displaystyle   p_{k} \displaystyle  = \displaystyle  \frac{\partial L}{\partial\dot{q}_{k}}=\frac{\partial T}{\partial\dot{q}_{k}}\ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\sum_{j}T_{ij}\left(q\right)\frac{\partial\dot{q}_{i}}{\partial\dot{q}_{k}}\dot{q}_{j}+\sum_{i}\sum_{j}T_{ij}\left(q\right)\dot{q}_{i}\frac{\partial\dot{q}_{j}}{\partial\dot{q}_{k}}\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\sum_{j}T_{ij}\left(q\right)\delta_{ik}\dot{q}_{j}+\sum_{i}\sum_{j}T_{ij}\left(q\right)\dot{q}_{i}\delta_{jk}\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j}T_{kj}\left(q\right)\dot{q}_{j}+\sum_{i}T_{ik}\left(q\right)\dot{q}_{i}\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j}\left(T_{kj}+T_{jk}\right)\dot{q}_{j} \ \ \ \ \ (26)

The first term in 8 now becomes

\displaystyle   \sum_{k}p_{k}\dot{q}_{k} \displaystyle  = \displaystyle  \sum_{k}\sum_{j}\left(T_{kj}+T_{jk}\right)\dot{q}_{j}\dot{q}_{k}\ \ \ \ \ (27)
\displaystyle  \displaystyle  = \displaystyle  2T \ \ \ \ \ (28)

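where the last line follows because the two terms on the RHS of the first line are equal (swap the dummy indices {j} and {k} in the second term), and each is equal to {T} as defined in 21. The Hamiltonian is therefore again {H=2T-L=T+V}.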
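
As a quick numerical sanity check of 26 and 28, here is a minimal Python sketch. The matrix `T` and the velocities `qdot` are random numbers chosen purely for illustration (they stand for {T_{ij}\left(q\right)} evaluated at some fixed {q}), and the helper names `kinetic_energy` and `momenta` are my own. The matrix is deliberately not symmetric, to show that the result doesn't rely on {T_{ij}=T_{ji}}.

```python
import random

def kinetic_energy(T, qdot):
    """T = sum_ij T_ij * qdot_i * qdot_j, as in equation 21."""
    n = len(qdot)
    return sum(T[i][j] * qdot[i] * qdot[j] for i in range(n) for j in range(n))

def momenta(T, qdot):
    """p_k = sum_j (T_kj + T_jk) * qdot_j, as in equation 26."""
    n = len(qdot)
    return [sum((T[k][j] + T[j][k]) * qdot[j] for j in range(n)) for k in range(n)]

random.seed(0)
n = 4
# Illustrative coefficients only: T_ij(q) at one fixed q, not symmetric in general.
T = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
qdot = [random.uniform(-1.0, 1.0) for _ in range(n)]

p = momenta(T, qdot)
p_dot_qdot = sum(pk * vk for pk, vk in zip(p, qdot))  # first term in the Hamiltonian
two_T = 2.0 * kinetic_energy(T, qdot)

# The two numbers should agree up to floating-point rounding.
print(p_dot_qdot, two_T)
```

The two printed values should agree to within rounding error, whatever seed is used.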

Lagrangian for the two-body problem

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.3; Exercise 2.3.1.

A fundamental problem in classical physics is the two-body problem, in which two masses interact via a potential {V\left(\mathbf{r}_{1}-\mathbf{r}_{2}\right)} that depends only on the relative positions of the two masses. In such a case, the Lagrangian can be decoupled so that the problem gets reduced to a one-body problem.

The Euler-Lagrange equations are

\displaystyle \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{i}}-\frac{\partial L}{\partial q_{i}}=0 \ \ \ \ \ (1)

 

where {q_{i}} and {\dot{q}_{i}} are the generalized coordinates and velocities, respectively. For systems where the potential energy {V\left(q_{i}\right)} is independent of the velocities {\dot{q}_{i}}, the Lagrangian can be written as

\displaystyle L=T-V \ \ \ \ \ (2)

where {T} is the kinetic energy. In terms of the absolute positions and velocities, we have

\displaystyle L=\frac{1}{2}m_{1}\left|\dot{\mathbf{r}}_{1}\right|^{2}+\frac{1}{2}m_{2}\left|\dot{\mathbf{r}}_{2}\right|^{2}-V\left(\mathbf{r}_{1}-\mathbf{r}_{2}\right) \ \ \ \ \ (3)

 

To decouple this equation, we define two new position vectors:

\displaystyle \mathbf{r} \displaystyle \equiv \displaystyle \mathbf{r}_{1}-\mathbf{r}_{2}\ \ \ \ \ (4)
\displaystyle \mathbf{r}_{CM} \displaystyle \equiv \displaystyle \frac{m_{1}\mathbf{r}_{1}+m_{2}\mathbf{r}_{2}}{m_{1}+m_{2}} \ \ \ \ \ (5)

Here {\mathbf{r}} is the relative position, and {\mathbf{r}_{CM}} is the position of the centre of mass.

We can invert these equations to get

\displaystyle \mathbf{r}_{1} \displaystyle = \displaystyle \mathbf{r}+\mathbf{r}_{2}\ \ \ \ \ (6)
\displaystyle \left(m_{1}+m_{2}\right)\mathbf{r}_{CM} \displaystyle = \displaystyle m_{1}\mathbf{r}+\left(m_{1}+m_{2}\right)\mathbf{r}_{2}\ \ \ \ \ (7)
\displaystyle \mathbf{r}_{2} \displaystyle = \displaystyle \mathbf{r}_{CM}-\frac{m_{1}}{m_{1}+m_{2}}\mathbf{r}\ \ \ \ \ (8)
\displaystyle \mathbf{r}_{1} \displaystyle = \displaystyle \mathbf{r}_{CM}+\frac{m_{2}}{m_{1}+m_{2}}\mathbf{r} \ \ \ \ \ (9)

To decouple the Lagrangian, we insert these last two equations into 3.

\displaystyle m_{1}\left|\dot{\mathbf{r}}_{1}\right|^{2} \displaystyle = \displaystyle m_{1}\left[\dot{\mathbf{r}}_{CM}+\frac{m_{2}}{m_{1}+m_{2}}\dot{\mathbf{r}}\right]\cdot\left[\dot{\mathbf{r}}_{CM}+\frac{m_{2}}{m_{1}+m_{2}}\dot{\mathbf{r}}\right]\ \ \ \ \ (10)
\displaystyle \displaystyle = \displaystyle m_{1}\left|\dot{\mathbf{r}}_{CM}\right|^{2}+2\frac{m_{1}m_{2}}{m_{1}+m_{2}}\dot{\mathbf{r}}_{CM}\cdot\dot{\mathbf{r}}+m_{1}\left(\frac{m_{2}}{m_{1}+m_{2}}\right)^{2}\left|\dot{\mathbf{r}}\right|^{2}\ \ \ \ \ (11)
\displaystyle m_{2}\left|\dot{\mathbf{r}}_{2}\right|^{2} \displaystyle = \displaystyle m_{2}\left[\dot{\mathbf{r}}_{CM}-\frac{m_{1}}{m_{1}+m_{2}}\dot{\mathbf{r}}\right]\cdot\left[\dot{\mathbf{r}}_{CM}-\frac{m_{1}}{m_{1}+m_{2}}\dot{\mathbf{r}}\right]\ \ \ \ \ (12)
\displaystyle \displaystyle = \displaystyle m_{2}\left|\dot{\mathbf{r}}_{CM}\right|^{2}-2\frac{m_{1}m_{2}}{m_{1}+m_{2}}\dot{\mathbf{r}}_{CM}\cdot\dot{\mathbf{r}}+m_{2}\left(\frac{m_{1}}{m_{1}+m_{2}}\right)^{2}\left|\dot{\mathbf{r}}\right|^{2}\ \ \ \ \ (13)
\displaystyle \frac{1}{2}m_{1}\left|\dot{\mathbf{r}}_{1}\right|^{2}+\frac{1}{2}m_{2}\left|\dot{\mathbf{r}}_{2}\right|^{2} \displaystyle = \displaystyle \frac{1}{2}\left(m_{1}+m_{2}\right)\left|\dot{\mathbf{r}}_{CM}\right|^{2}+\frac{1}{2}\frac{m_{1}m_{2}^{2}+m_{2}m_{1}^{2}}{\left(m_{1}+m_{2}\right)^{2}}\left|\dot{\mathbf{r}}\right|^{2}\ \ \ \ \ (14)
\displaystyle \displaystyle = \displaystyle \frac{1}{2}\left(m_{1}+m_{2}\right)\left|\dot{\mathbf{r}}_{CM}\right|^{2}+\frac{1}{2}\frac{m_{1}m_{2}}{m_{1}+m_{2}}\left|\dot{\mathbf{r}}\right|^{2} \ \ \ \ \ (15)

The Lagrangian 3 thus becomes

\displaystyle L \displaystyle = \displaystyle \frac{1}{2}\left(m_{1}+m_{2}\right)\left|\dot{\mathbf{r}}_{CM}\right|^{2}+\frac{1}{2}\frac{m_{1}m_{2}}{m_{1}+m_{2}}\left|\dot{\mathbf{r}}\right|^{2}-V\left(\mathbf{r}\right)\ \ \ \ \ (16)
\displaystyle \displaystyle \equiv \displaystyle L_{CM}+L_{r} \ \ \ \ \ (17)

with

\displaystyle L_{CM} \displaystyle \equiv \displaystyle \frac{1}{2}\left(m_{1}+m_{2}\right)\left|\dot{\mathbf{r}}_{CM}\right|^{2}\ \ \ \ \ (18)
\displaystyle L_{r} \displaystyle \equiv \displaystyle \frac{1}{2}\frac{m_{1}m_{2}}{m_{1}+m_{2}}\left|\dot{\mathbf{r}}\right|^{2}-V\left(\mathbf{r}\right) \ \ \ \ \ (19)
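
The kinetic energy split in 15 is easy to check numerically. The following is a minimal Python sketch using only the definitions 4 and 5; the masses and velocities are arbitrary illustrative values, and the helper names `to_cm_relative` and `dot` are my own. Since the transformation is linear and doesn't depend on time, the velocities transform in exactly the same way as the positions.

```python
def to_cm_relative(m1, m2, r1, r2):
    """Return (r, r_cm) from equations 4 and 5; vectors are tuples of floats."""
    M = m1 + m2
    r = tuple(a - b for a, b in zip(r1, r2))
    r_cm = tuple((m1 * a + m2 * b) / M for a, b in zip(r1, r2))
    return r, r_cm

def dot(a, b):
    """Ordinary dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative values only.
m1, m2 = 2.0, 3.0
v1 = (1.0, -0.5, 0.25)
v2 = (-2.0, 1.5, 0.75)

# Apply the same linear map to the velocities as to the positions.
v, v_cm = to_cm_relative(m1, m2, v1, v2)

M = m1 + m2
mu = m1 * m2 / M  # reduced mass

T_absolute = 0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)  # kinetic part of 3
T_split = 0.5 * M * dot(v_cm, v_cm) + 0.5 * mu * dot(v, v)    # kinetic part of 16

# The two values should agree up to floating-point rounding.
print(T_absolute, T_split)
```

The cross terms in {\dot{\mathbf{r}}_{CM}\cdot\dot{\mathbf{r}}} cancel when the two kinetic energies are added, which is why no such term appears in `T_split`.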

Thus {L} decouples into two Lagrangians, one of which depends only on {\dot{\mathbf{r}}_{CM}} and the other of which depends only on {\mathbf{r}} and {\dot{\mathbf{r}}}. The absence of {\mathbf{r}_{CM}} means that, from 1

\displaystyle \frac{d}{dt}\frac{\partial L}{\partial\dot{r}_{i,CM}} \displaystyle = \displaystyle \frac{d}{dt}\frac{\partial L_{CM}}{\partial\dot{r}_{i,CM}}=\left(m_{1}+m_{2}\right)\frac{d\dot{r}_{i,CM}}{dt}=0\ \ \ \ \ (20)
\displaystyle \dot{r}_{i,CM} \displaystyle = \displaystyle \mbox{constant} \ \ \ \ \ (21)

This holds separately for each component of {\dot{\mathbf{r}}_{CM}}, so the velocity of the centre of mass is a constant, as we’d expect for an isolated two-body system with no external force.

From the other Lagrangian, we get

\displaystyle \frac{m_{1}m_{2}}{m_{1}+m_{2}}\ddot{\mathbf{r}}=-\nabla V\left(\mathbf{r}\right) \ \ \ \ \ (22)

which is the equation of motion of a single particle of mass {\frac{m_{1}m_{2}}{m_{1}+m_{2}}}, called the reduced mass. Viewed from the centre of mass frame, where {\dot{\mathbf{r}}_{CM}=0}, {\mathbf{r}} becomes the absolute position of the reduced mass. We can transform the result back to the ‘absolute’ frame by using 4 and 5 to recover {\mathbf{r}_{1}} and {\mathbf{r}_{2}} from {\mathbf{r}} and {\mathbf{r}_{CM}}.
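
To see the reduction in action, here is a rough Python sketch for one specific case that I've chosen for illustration (it is not part of Shankar's exercise): two masses in one dimension joined by a spring, so that {V=\frac{1}{2}k\left(x_{1}-x_{2}\right)^{2}}. The full two-body problem is integrated with a velocity-Verlet scheme, and the separation {x_{1}-x_{2}} is compared with the exact solution of 22, which for this potential is simple harmonic motion with angular frequency {\sqrt{k/\mu}}, where {\mu} is the reduced mass. All parameter values and the function name `simulate_two_body` are assumptions made for the example.

```python
import math

def simulate_two_body(m1, m2, k, x1, x2, v1, v2, dt, steps):
    """Velocity-Verlet integration of two masses joined by a spring (1D)."""
    def forces(a, b):
        f = -k * (a - b)  # force on mass 1; mass 2 feels the opposite force
        return f, -f

    f1, f2 = forces(x1, x2)
    for _ in range(steps):
        v1 += 0.5 * dt * f1 / m1   # half kick
        v2 += 0.5 * dt * f2 / m2
        x1 += dt * v1              # drift
        x2 += dt * v2
        f1, f2 = forces(x1, x2)
        v1 += 0.5 * dt * f1 / m1   # second half kick
        v2 += 0.5 * dt * f2 / m2
    return x1, x2, v1, v2

# Illustrative parameters and initial conditions only.
m1, m2, k = 1.0, 3.0, 2.0
x1_0, x2_0, v1_0, v2_0 = 1.0, -0.5, 0.5, -0.1
dt, steps = 1e-4, 20000
t = dt * steps

x1, x2, v1, v2 = simulate_two_body(m1, m2, k, x1_0, x2_0, v1_0, v2_0, dt, steps)

M = m1 + m2
mu = m1 * m2 / M            # reduced mass
omega = math.sqrt(k / mu)   # frequency of the equivalent one-body problem

# Exact solution of mu * r'' = -k * r with the same initial separation and rate.
r0, rdot0 = x1_0 - x2_0, v1_0 - v2_0
r_exact = r0 * math.cos(omega * t) + (rdot0 / omega) * math.sin(omega * t)
r_numeric = x1 - x2

# Centre of mass velocity before and after, which should be unchanged.
v_cm_initial = (m1 * v1_0 + m2 * v2_0) / M
v_cm_final = (m1 * v1 + m2 * v2) / M

print(r_numeric, r_exact)
print(v_cm_initial, v_cm_final)
```

With a small enough time step, the numerical separation should track the one-body solution closely, and the centre of mass velocity should stay constant apart from rounding error.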