References: edX online course MIT 8.05.1x Week 4.
Sheldon Axler (2015), Linear Algebra Done Right, 3rd edition, Springer. Chapters 3.F, 7.
A linear functional $\phi$ is a linear map from a vector space $V$ to the number field $\mathbb{F}$ which satisfies the two properties
- $\phi(u+v) = \phi(u) + \phi(v)$, with $u, v \in V$.
- $\phi(av) = a\,\phi(v)$ for all $a \in \mathbb{F}$ and $v \in V$.
That is, a linear functional acts on a vector and produces a number as output.
The set of linear functionals on $V$ is actually a vector space itself, since it satisfies all the required axioms. Many of these axioms are satisfied because $V$, on which $\phi$ acts, is a vector space. The only axiom that requires a bit of examination is the existence of an additive identity, that is, a zero functional $0$ satisfying $\phi + 0 = \phi$ for every functional $\phi$. Since functionals add pointwise, from property 1 above this means that $0(v) = 0$ for all $v \in V$, which is indeed a linear functional.
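As a quick numerical sanity check (my own sketch, not from the notes), the following verifies the two defining properties for a sample functional on $\mathbb{C}^3$. The specific vector `u` and the use of `np.vdot` (which conjugates its first argument, matching an inner product antilinear in the first slot) are illustrative choices, not anything from Zwiebach or Axler.

```python
import numpy as np

rng = np.random.default_rng(0)

# A sample linear functional on C^3: phi(v) = <u, v> for a fixed u.
# np.vdot conjugates its first argument, so this matches an inner
# product that is antilinear in its first slot.
u = np.array([1 + 2j, -1j, 0.5])
phi = lambda v: np.vdot(u, v)

v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)
a = 2 - 3j

# Property 1: phi(v + w) = phi(v) + phi(w)
assert np.isclose(phi(v + w), phi(v) + phi(w))
# Property 2: phi(a v) = a phi(v)
assert np.isclose(phi(a * v), a * phi(v))
# The zero functional is the additive identity: (phi + 0)(v) = phi(v)
zero = lambda v: 0.0
assert np.isclose(phi(v) + zero(v), phi(v))
```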
From the definition above, we can prove that any linear functional can be written as an inner product.
Theorem 1 For any linear functional $\phi$ on $V$ there is a unique vector $u \in V$ such that $\phi(v) = \langle u, v \rangle$ for all $v \in V$.
Proof: We can write any vector $v \in V$ in terms of an orthonormal basis $(e_1, \dots, e_n)$ as

$$v = \sum_i \langle e_i, v \rangle e_i$$

By applying the two properties of a linear functional above, we have

$$\begin{aligned}
\phi(v) &= \phi\left(\sum_i \langle e_i, v \rangle e_i\right) \\
&= \sum_i \langle e_i, v \rangle \,\phi(e_i) \\
&= \sum_i \left\langle \phi(e_i)^* e_i, v \right\rangle \\
&= \left\langle \sum_i \phi(e_i)^* e_i, v \right\rangle
\end{aligned}$$

so the required vector is $u = \sum_i \phi(e_i)^* e_i$. We were able to move $\phi(e_i)$ inside the inner product in the third line above since $\phi(e_i)$ is just a number; because the inner product is antilinear in its first argument, it enters the first slot as the complex conjugate $\phi(e_i)^*$.
To prove that $u$ is unique, as usual we suppose there is another vector $u'$ that gives the same result as $u$ for all $v$. This means that $\langle u, v \rangle = \langle u', v \rangle$ or $\langle u - u', v \rangle = 0$ for all $v$. We can then choose $v = u - u'$, giving $\langle u - u', u - u' \rangle = 0$, which implies that $u - u' = 0$ so $u = u'$.
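The construction in the proof can be tested numerically. In this sketch (my own, with the standard basis of $\mathbb{C}^3$ as the orthonormal basis), a functional is built from a "hidden" vector and Theorem 1's recipe $u = \sum_i \phi(e_i)^* e_i$ recovers it:

```python
import numpy as np

rng = np.random.default_rng(1)

# A linear functional on C^3, built here (for testing) from a
# hidden vector; any linear phi would work the same way.
hidden = rng.standard_normal(3) + 1j * rng.standard_normal(3)
phi = lambda v: np.vdot(hidden, v)

# Orthonormal basis: the standard basis of C^3.
basis = np.eye(3)

# Theorem 1's recipe: u = sum_i phi(e_i)^* e_i
u = sum(np.conj(phi(e)) * e for e in basis)

v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.isclose(phi(v), np.vdot(u, v))  # phi(v) = <u, v>
assert np.allclose(u, hidden)             # and u is the unique such vector
```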
Suppose we have some linear operator $T$ and some fixed vector $u$. We can then form the inner product

$$\phi(v) = \langle u, Tv \rangle$$
$\phi$ is a linear functional since it satisfies the two properties specified earlier. It is now stated in Zwiebach’s notes that because $\phi$ is a linear functional, we can write it in the form $\phi(v) = \langle w, v \rangle$ for some vector $w$. It’s not clear to me that this follows directly, since the original definition of a linear functional applied to the entire vector space $V$, whereas here we don’t know whether $T$ is a surjective operator, that is, whether the range of $T$ is the entire space $V$. The motivation behind this step is the definition of the adjoint operator, but in Axler’s book (chapter 7.A), an adjoint is just defined directly without any motivation from linear functionals.
Anyway, we’ll just go with Zwiebach’s argument, since the rest of the derivation is fairly easy to follow. We assume that for a suitable vector $w$ we have

$$\langle u, Tv \rangle = \langle w, v \rangle$$
The vector $w$ depends on both the operator $T$ and the vector $u$, so we can write it as a function of $u$, using the notation

$$w = T^\dagger u$$

This gives us the relation

$$\langle u, Tv \rangle = \langle T^\dagger u, v \rangle$$
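The defining relation $\langle u, Tv \rangle = \langle T^\dagger u, v \rangle$ can be checked numerically. This sketch (my own) uses numpy's conjugate transpose for $T^\dagger$, anticipating the matrix representation derived further below; the dimension and random vectors are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)

# A random operator on C^3. Using the conjugate transpose as its
# adjoint anticipates the matrix result derived later in the notes;
# here we just confirm it satisfies the defining relation.
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
T_dag = T.conj().T

u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# <u, T v> = <T† u, v>   (np.vdot conjugates its first argument)
assert np.isclose(np.vdot(u, T @ v), np.vdot(T_dag @ u, v))
```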
At this stage, we can’t be sure that $T^\dagger$ is a linear operator; it may be some non-linear map from one vector to another. However, we have
Theorem 2 The operator $T^\dagger$, called the adjoint of $T$, is a linear operator: $T^\dagger \in \mathcal{L}(V)$.

Proof: For addition of vectors, repeated use of the defining relation gives

$$\begin{aligned}
\left\langle T^\dagger (u_1 + u_2), v \right\rangle &= \langle u_1 + u_2, Tv \rangle \\
&= \langle u_1, Tv \rangle + \langle u_2, Tv \rangle \\
&= \langle T^\dagger u_1, v \rangle + \langle T^\dagger u_2, v \rangle \\
&= \langle T^\dagger u_1 + T^\dagger u_2, v \rangle
\end{aligned}$$

Comparing the first and last lines (which hold for all $v$) gives us

$$T^\dagger (u_1 + u_2) = T^\dagger u_1 + T^\dagger u_2$$
A similar argument can be used for multiplication by a number:

$$\begin{aligned}
\left\langle T^\dagger (a u), v \right\rangle &= \langle a u, Tv \rangle \\
&= a^* \langle u, Tv \rangle \\
&= a^* \langle T^\dagger u, v \rangle \\
&= \langle a\, T^\dagger u, v \rangle
\end{aligned}$$

Again, comparing the first and last lines we have

$$T^\dagger (a u) = a\, T^\dagger u$$
Thus $T^\dagger$ satisfies the two conditions required for linearity.
A couple of other results follow fairly easily (proofs are in Zwiebach’s notes, if you’re interested):

$$(ST)^\dagger = T^\dagger S^\dagger$$

$$\left(T^\dagger\right)^\dagger = T$$
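Both of these further properties, $(ST)^\dagger = T^\dagger S^\dagger$ and $(T^\dagger)^\dagger = T$, can be spot-checked with random matrices. This is my own sketch, using the conjugate transpose as the matrix adjoint in an orthonormal basis:

```python
import numpy as np

rng = np.random.default_rng(3)
dag = lambda M: M.conj().T  # adjoint of a matrix in an orthonormal basis

S = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

assert np.allclose(dag(S @ T), dag(T) @ dag(S))  # (ST)† = T† S†
assert np.allclose(dag(dag(T)), T)               # (T†)† = T
```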
A very important result is the representation of adjoint operators in matrix form.
If we have an orthonormal basis $(e_1, \dots, e_n)$ and an operator $T$, then $T$ transforms the basis according to (using the summation convention):

$$T e_k = T_{ik} e_i$$

Thus the matrix elements in this basis are found by taking the inner product with a basis vector $e_j$:

$$\langle e_j, T e_k \rangle = T_{ik} \langle e_j, e_i \rangle = T_{ik}\,\delta_{ji} = T_{jk}$$
The same expansion applies to the adjoint: $T^\dagger e_i = (T^\dagger)_{ji} e_j$. Taking the inner product on the right with $e_k$ in the defining relation $\langle T^\dagger e_i, e_k \rangle = \langle e_i, T e_k \rangle$, we get

$$\left\langle (T^\dagger)_{ji} e_j, e_k \right\rangle = \left\langle e_i, T_{jk} e_j \right\rangle$$

We can take the $(T^\dagger)_{ji}$ and $T_{jk}$ outside the inner product as they are just numbers (remembering that the first slot introduces a complex conjugate). Using this, we have

$$(T^\dagger)_{ji}^* \langle e_j, e_k \rangle = T_{jk} \langle e_i, e_j \rangle$$

$$(T^\dagger)_{ki}^* = T_{ik}$$

That is, in an orthonormal basis, the adjoint matrix is the complex conjugate transpose of the original matrix:

$$(T^\dagger)_{ki} = T_{ik}^* \qquad \text{or} \qquad T^\dagger = \left(T^T\right)^*$$
The superscript $T$ indicates ‘transpose’, not another operator!
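As a final check (my own sketch, in $\mathbb{C}^3$ with the standard basis), we can compute the matrix elements of $T^\dagger$ directly from the defining relation $\langle e_i, T^\dagger e_k \rangle = \langle T e_i, e_k \rangle$ and confirm that they form the complex conjugate transpose of $T$:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Matrix elements of T† from the defining relation:
# (T†)_{ik} = <e_i, T† e_k> = <T e_i, e_k>
basis = np.eye(n)
T_dag = np.array([[np.vdot(T @ basis[i], basis[k]) for k in range(n)]
                  for i in range(n)])

# which is exactly the complex conjugate transpose of T
assert np.allclose(T_dag, T.conj().T)
```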