Hamiltonian formalism and Legendre transformations

References: Shankar, R. (1994), Principles of Quantum Mechanics, Plenum Press. Section 2.5; Exercise 2.5.1.

The Lagrangian formulation of classical mechanics is one of two principal formalisms used to obtain equations of motion for a system. The other method is the Hamiltonian formalism. The main difference between the two methods is that the Lagrangian treats the generalized coordinates {q_{i}} and their respective velocities {\dot{q}_{i}} as the independent variables, while in the Hamiltonian formalism, the coordinates and their associated momenta are the independent variables. The momentum {p_{i}} corresponding to a coordinate {q_{i}} is defined by

\displaystyle  p_{i}\equiv\frac{\partial L}{\partial\dot{q}_{i}} \ \ \ \ \ (1)

The Lagrangian is replaced by a function {H\left(q,p\right)} (where we’re using unsubscripted variables {q} and {p} to represent the sets of coordinates and momenta) with the property that

\displaystyle  \dot{q}_{i}=\frac{\partial H}{\partial p_{i}} \ \ \ \ \ (2)

The method for transforming from the Lagrangian picture to the Hamiltonian picture is known as a Legendre transformation and works as follows. Suppose we start with a function {f\left(x_{1},x_{2},\ldots,x_{n}\right)} (here, the {x_{i}} can be any independent variables; we’re not considering coordinates explicitly yet) and we want to replace a subset {\left\{ x_{i},i=1\ldots,j\right\} } with different variables {u_{i}}, where

\displaystyle  u_{i}\equiv\frac{\partial f}{\partial x_{i}} \ \ \ \ \ (3)

We now construct the function

\displaystyle  g\left(u_{1},\ldots,u_{j},x_{j+1},\ldots,x_{n}\right)\equiv\sum_{i=1}^{j}u_{i}x_{i}-f\left(x_{1},\ldots,x_{n}\right) \ \ \ \ \ (4)

We’re assuming that all the {x_{i}} in the set {\left\{ x_{i},i=1\ldots,j\right\} } can be written as functions of {\left\{ u_{1},\ldots,u_{j},x_{j+1},\ldots,x_{n}\right\} }. In other words, when written out in full, 4 contains only the variables {\left\{ u_{1},\ldots,u_{j},x_{j+1},\ldots,x_{n}\right\} }. We can now take the derivative:

\displaystyle   \frac{\partial g}{\partial u_{i}} \displaystyle  = \displaystyle  x_{i}+\sum_{k=1}^{j}\left[u_{k}\frac{\partial x_{k}}{\partial u_{i}}-\frac{\partial f}{\partial x_{k}}\frac{\partial x_{k}}{\partial u_{i}}\right]\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  x_{i}+\sum_{k=1}^{j}\left[u_{k}\frac{\partial x_{k}}{\partial u_{i}}-u_{k}\frac{\partial x_{k}}{\partial u_{i}}\right]\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  x_{i} \ \ \ \ \ (7)

where the second line follows from the definition 3.
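As a quick sanity check of 7, here is a minimal numerical sketch. The function {f}, the coefficients `A` and `B`, and the finite-difference helper are all illustrative assumptions, not anything taken from Shankar. We replace only {x_{1}} by {u=\partial f/\partial x_{1}}, leave {x_{2}} alone, and compare {\partial g/\partial u} with {x_{1}}.

```python
# Legendre transform of a two-variable function in its first variable only.
# f, A and B are hypothetical choices made purely for illustration.
A, B = 3.0, 0.5


def f(x1, x2):
    return 0.5 * A * x1 * x1 + B * x1 * x2


def u_from_x(x1, x2):
    # u = df/dx1, as in equation 3
    return A * x1 + B * x2


def x1_from_u(u, x2):
    # Invert u = A*x1 + B*x2 for x1; this needs A != 0
    return (u - B * x2) / A


def g(u, x2):
    # g(u, x2) = u*x1 - f(x1, x2) with x1 written in terms of (u, x2), equation 4
    x1 = x1_from_u(u, x2)
    return u * x1 - f(x1, x2)


def d_du(func, u, x2, h=1e-6):
    # Central finite difference in u, holding x2 fixed
    return (func(u + h, x2) - func(u - h, x2)) / (2.0 * h)


x1, x2 = 0.7, -1.2
u = u_from_x(x1, x2)
dg_du = d_du(g, u, x2)
print(f"x1 = {x1}, dg/du = {dg_du:.6f}")
```

The two printed numbers should agree to within the finite-difference error, which is what 7 predicts. The inversion step `x1_from_u` is where the assumption that the {x_{i}} can be written in terms of the {u_{i}} enters.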

To move from the Lagrangian formalism to the Hamiltonian formalism, the Lagrangian plays the role of {f}, the generalized velocities {\dot{q}_{i}} are the variables {\left\{ x_{i},i=1\ldots,j\right\} } to be replaced, and the Hamiltonian is the new function {g}. That is, we have

\displaystyle  H\left(q,p\right)=\sum_{i=1}^{n}p_{i}\dot{q}_{i}-L\left(q,\dot{q}\right) \ \ \ \ \ (8)

There are a total of {n} momenta {p_{i}} and {n} coordinates {q_{i}}, giving {2n} independent variables. In 8, it is assumed that we can express all the velocities {\dot{q}_{i}} as functions of the {q_{i}} and {p_{i}}. With these definitions, we can see by following through the derivation of 7 that 2 is satisfied.

We can get another equation by considering the derivative

\displaystyle   \frac{\partial H}{\partial q_{i}} \displaystyle  = \displaystyle  \sum_{j=1}^{n}p_{j}\frac{\partial\dot{q}_{j}}{\partial q_{i}}-\frac{\partial L}{\partial q_{i}}-\sum_{j=1}^{n}\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial\dot{q}_{j}}{\partial q_{i}}\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j=1}^{n}\left[p_{j}\frac{\partial\dot{q}_{j}}{\partial q_{i}}-\frac{\partial L}{\partial\dot{q}_{j}}\frac{\partial\dot{q}_{j}}{\partial q_{i}}\right]-\frac{\partial L}{\partial q_{i}}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j=1}^{n}\left[p_{j}\frac{\partial\dot{q}_{j}}{\partial q_{i}}-p_{j}\frac{\partial\dot{q}_{j}}{\partial q_{i}}\right]-\frac{\partial L}{\partial q_{i}}\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  -\frac{\partial L}{\partial q_{i}}\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  -\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{i}}\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  -\dot{p}_{i} \ \ \ \ \ (14)

In the third line we used 1; in the fifth line we used the Euler-Lagrange equation

\displaystyle  \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{i}}-\frac{\partial L}{\partial q_{i}}=0 \ \ \ \ \ (15)

and in the last line, we used 1 again. We thus get Hamilton’s canonical equations:

\displaystyle   \frac{\partial H}{\partial p_{i}} \displaystyle  = \displaystyle  \dot{q}_{i}\ \ \ \ \ (16)
\displaystyle  -\frac{\partial H}{\partial q_{i}} \displaystyle  = \displaystyle  \dot{p}_{i} \ \ \ \ \ (17)
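To see 16 and 17 in action, here is a small sketch that integrates them for a one-dimensional harmonic oscillator with {H=p^{2}/2m+kq^{2}/2}. The system, the values of `M` and `K`, and the leapfrog stepping scheme are my own assumptions chosen for illustration; the leapfrog scheme in particular is just a convenient numerical method, not part of the formalism.

```python
import math

# Hypothetical mass and spring constant for a 1-d harmonic oscillator
M, K = 1.0, 4.0


def hamiltonian(q, p):
    return p * p / (2.0 * M) + 0.5 * K * q * q


def dH_dp(q, p):
    # Right-hand side of equation 16: q_dot = dH/dp
    return p / M


def dH_dq(q, p):
    # Equation 17 says p_dot = -dH/dq
    return K * q


def leapfrog_step(q, p, dt):
    # Half step in p, full step in q, half step in p
    p_half = p - 0.5 * dt * dH_dq(q, p)
    q_new = q + dt * dH_dp(q, p_half)
    p_new = p_half - 0.5 * dt * dH_dq(q_new, p_half)
    return q_new, p_new


period = 2.0 * math.pi * math.sqrt(M / K)
n_steps = 10000
dt = period / n_steps

q, p = 1.0, 0.0
e0 = hamiltonian(q, p)
for _ in range(n_steps):
    q, p = leapfrog_step(q, p, dt)
e1 = hamiltonian(q, p)

print(f"after one period: q = {q:.6f}, p = {p:.6f}")
print(f"energy drift: {e1 - e0:.2e}")
```

After one full period the oscillator should be back very close to its starting point, and the value of {H} should have changed only by a small numerical error.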

[As an aside at this point, I was (and still am) unsure exactly what the term ‘canonical’ means in this, or in almost any other, context. Google is not very helpful in this respect, as it appears that nobody else really knows where the term came from. According to Wikipedia, the term ‘canonical’ is used to describe equations in several areas of mathematics, physics and even computer science, but ultimately the term appears to originate in religion, as in ‘canon law’, which is a system of laws created by the Catholic Church. Presumably the term in physics is used to describe some equation or principle which is widely applicable and general. Any other thoughts are welcome in the comments.]

In cases where the potential energy doesn’t depend on velocity, the Lagrangian is {T-V}, where {T} is the kinetic energy. The Hamiltonian (as you’ve probably guessed) can be interpreted as the total energy of such a system, as we can see as follows.

In rectangular coordinates, each mass {m_{i}} has a kinetic energy {T_{i}=\frac{1}{2}m_{i}\dot{x}_{i}^{2}} (this is true in one dimension; to extend to 3 dimensions, we write {T_{i}=\frac{1}{2}m_{i}\left(\dot{x}_{i}^{2}+\dot{y}_{i}^{2}+\dot{z}_{i}^{2}\right)} and the same argument follows). Since {V} doesn’t depend on the velocities, the momentum is

\displaystyle  p_{i}=\frac{\partial L}{\partial\dot{x}_{i}}=\frac{\partial T}{\partial\dot{x}_{i}}=m_{i}\dot{x}_{i} \ \ \ \ \ (18)

Thus the first term in 8 is

\displaystyle  \sum_{i=1}^{n}p_{i}\dot{q}_{i}=\sum_{i=1}^{n}m_{i}\dot{x}_{i}^{2}=2T \ \ \ \ \ (19)

and the Hamiltonian is

\displaystyle  H=2T-L=2T-T+V=T+V \ \ \ \ \ (20)
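The result 20 can be checked numerically with a short sketch. The two masses, the spring-like potential and all the numbers below are hypothetical values picked for illustration. We build {H} directly from 8, using {\dot{x}_{i}=p_{i}/m_{i}} to eliminate the velocities, and compare it with {T+V}.

```python
# Two masses on a line joined by a spring; all values are hypothetical.
MASSES = [1.0, 2.5]
K = 4.0


def kinetic(v):
    return sum(0.5 * m * vi * vi for m, vi in zip(MASSES, v))


def potential(x):
    # Depends on positions only, as the argument in the text requires
    return 0.5 * K * (x[0] - x[1]) ** 2


def lagrangian(x, v):
    return kinetic(v) - potential(x)


def hamiltonian(x, p):
    # Invert p_i = m_i * v_i (equation 18), then apply equation 8
    v = [pi / m for pi, m in zip(p, MASSES)]
    return sum(pi * vi for pi, vi in zip(p, v)) - lagrangian(x, v)


x = [0.3, -0.4]
v = [1.2, -0.5]
p = [m * vi for m, vi in zip(MASSES, v)]

H = hamiltonian(x, p)
E = kinetic(v) + potential(x)
print(f"H = {H:.6f}, T + V = {E:.6f}")
```

The two printed values should match up to rounding, as 20 says they must.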

Now consider a more general kinetic energy defined as

\displaystyle  T=\sum_{i}\sum_{j}T_{ij}\left(q\right)\dot{q}_{i}\dot{q}_{j} \ \ \ \ \ (21)

That is, {T_{ij}} is a matrix whose elements depend on the positions of the various masses. We have

\displaystyle   p_{k} \displaystyle  = \displaystyle  \frac{\partial L}{\partial\dot{q}_{k}}=\frac{\partial T}{\partial\dot{q}_{k}}\ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\sum_{j}T_{ij}\left(q\right)\frac{\partial\dot{q}_{i}}{\partial\dot{q}_{k}}\dot{q}_{j}+\sum_{i}\sum_{j}T_{ij}\left(q\right)\dot{q}_{i}\frac{\partial\dot{q}_{j}}{\partial\dot{q}_{k}}\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \sum_{i}\sum_{j}T_{ij}\left(q\right)\delta_{ik}\dot{q}_{j}+\sum_{i}\sum_{j}T_{ij}\left(q\right)\dot{q}_{i}\delta_{jk}\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j}T_{kj}\left(q\right)\dot{q}_{j}+\sum_{i}T_{ik}\left(q\right)\dot{q}_{i}\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  \sum_{j}\left(T_{kj}+T_{jk}\right)\dot{q}_{j} \ \ \ \ \ (26)

The first term in 8 now becomes

\displaystyle   \sum_{k}p_{k}\dot{q}_{k} \displaystyle  = \displaystyle  \sum_{k}\sum_{j}\left(T_{kj}+T_{jk}\right)\dot{q}_{j}\dot{q}_{k}\ \ \ \ \ (27)
\displaystyle  \displaystyle  = \displaystyle  2T \ \ \ \ \ (28)

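where the last line follows because {j} and {k} are both dummy summation indices: relabelling them shows that each of the two terms on the RHS of the first line is equal to {T} as defined in 21. Using this in 8 again gives {H=2T-L=T+V}, provided the potential doesn’t depend on velocity.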
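The identity in 28 doesn’t require {T_{ij}} to be symmetric, which is easy to confirm numerically. In the sketch below, the matrix entries and velocities are arbitrary made-up numbers (the matrix is deliberately non-symmetric), with {T_{ij}} evaluated at some fixed set of positions.

```python
# Hypothetical, deliberately non-symmetric T_ij evaluated at fixed positions q
T_MATRIX = [
    [2.0, 0.3, -1.0],
    [1.1, 1.5, 0.4],
    [0.0, -0.7, 3.0],
]
q_dot = [0.5, -1.2, 2.0]


def kinetic_energy(t, v):
    # Equation 21: T = sum_ij T_ij * v_i * v_j
    n = len(v)
    return sum(t[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


def momenta(t, v):
    # Equation 26: p_k = sum_j (T_kj + T_jk) * v_j
    n = len(v)
    return [sum((t[k][j] + t[j][k]) * v[j] for j in range(n)) for k in range(n)]


T = kinetic_energy(T_MATRIX, q_dot)
p = momenta(T_MATRIX, q_dot)
p_dot_v = sum(pk * vk for pk, vk in zip(p, q_dot))

print(f"sum_k p_k q_dot_k = {p_dot_v:.6f}, 2T = {2.0 * T:.6f}")
```

The two printed numbers should agree up to rounding, since only the symmetric part {T_{kj}+T_{jk}} of the matrix contributes to either quantity.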