Projection operators

References: edX online course MIT 8.05.1x Week 4.

Sheldon Axler (2015), Linear Algebra Done Right, 3rd edition, Springer. Chapter 6.

Continuing from our examination of orthonormal bases and the orthogonal complement in a vector space {V}, we can now look at the orthogonal projection, sometimes known in physics as a projection operator.

Suppose we have defined a subspace {U} of {V} and its orthogonal complement {U^{\perp}}, so that {V=U\oplus U^{\perp}}. We can define a linear operator {P_{U}} called the orthogonal projection operator. It has the property that, given any vector {v\in V}, it ‘projects’ out the component of {v} that lies in {U}. That is, if we write

\displaystyle v=u+w \ \ \ \ \ (1)

where {u\in U} and {w\in U^{\perp}}, then

\displaystyle P_{U}v=u \ \ \ \ \ (2)

An example of a projection operator is an operator in 3-d space that projects a vector onto the {xy} plane. Then the {xy} plane is the subspace {U} and the {z} axis is the orthogonal complement {U^{\perp}}.

From the definition of {P_{U}} we can list a few properties:

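  1. Unless {U=V}, {P_{U}} is not surjective: its range is {U}, which is then a proper subspace of {V}.
  2. Unless {U^{\perp}=\left\{ 0\right\} } (that is, {U=V}), {P_{U}} is not injective, since for a given {u\in U} it maps every vector {u+w} with {w\in U^{\perp}} to the same {u}. Thus it is a many-to-one mapping.
  3. {P_{U}} is not invertible, since it is not injective. (The exception is again {U=V}, in which case {P_{U}} is just the identity.)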
  4. Its null space is {\mbox{null }P_{U}=U^{\perp}}.
  5. Once {P_{U}} is applied to any vector {v}, all subsequent applications of {P_{U}} have no effect. That is, once you’ve projected out the component of {v} that lies in {U}, all further projections into {U} just give the same result. In other words {P_{U}^{n}=P_{U}} for all integers {n>0}.
  6. {\left|P_{U}v\right|\le\left|v\right|}. This follows from the Pythagorean theorem, since {u} and {w} are orthogonal, so {\left|v\right|^{2}=\left|u\right|^{2}+\left|w\right|^{2}\ge\left|u\right|^{2}=\left|P_{U}v\right|^{2}}. Geometrically, a projection operator cannot increase the ‘length’ (norm) of a vector. This property relies on the fact that the projection is an orthogonal projection. Other projections can increase the length of a vector (think of the shadow cast by a stick; if the surface onto which the shadow falls is nearly parallel to the direction of the incoming light, the shadow is much longer than the stick).
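Properties 5 and 6 can be checked numerically. The following is a minimal sketch in plain Python, assuming real vectors stored as lists; the helper names (`dot`, `norm`, `project_orthogonal`, `project_oblique`) and the sample vectors are illustrative choices, not anything from the course or from Axler. The oblique projection onto the {x} axis plays the role of the stick's shadow: it is still a projection, but not an orthogonal one, so it is free to lengthen a vector.

```python
import math

def dot(a, b):
    """Real inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def project_orthogonal(v, e):
    """Orthogonal projection of v onto the line spanned by the unit vector e."""
    c = dot(e, v)
    return [c * x for x in e]

def project_oblique(v, d):
    """Projection of a 2-d vector v onto the x axis along the direction d (d[1] != 0).

    Applying it twice changes nothing, so it is a projection, but it is not
    orthogonal unless d points along the y axis.
    """
    t = v[1] / d[1]
    return [v[0] - t * d[0], 0.0]

e = [1 / math.sqrt(2), 1 / math.sqrt(2)]    # unit vector spanning U
v = [3.0, 1.0]

u = project_orthogonal(v, e)                # approximately [2, 2]
u_again = project_orthogonal(u, e)          # property 5: same as u up to rounding
shrinks = norm(u) <= norm(v)                # property 6

stick = [0.0, 1.0]
shadow = project_oblique(stick, [10.0, 1.0])  # light arriving at a shallow angle
```

Here `shadow` has norm 10 while `stick` has norm 1, so the bound in property 6 fails for this non-orthogonal projection, even though projecting `shadow` a second time leaves it unchanged.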

An explicit form for {P_{U}v} can be obtained from the decomposition we had earlier, in which {\left(e_{1},\ldots,e_{n}\right)} is an orthonormal basis of {U}:

\displaystyle v=\underbrace{\sum_{i=1}^{n}\left\langle e_{i},v\right\rangle e_{i}}_{\in U}+\underbrace{v-\sum_{i=1}^{n}\left\langle e_{i},v\right\rangle e_{i}}_{\in U^{\perp}} \ \ \ \ \ (3)

From this,

\displaystyle P_{U}v=\sum_{i=1}^{n}\left\langle e_{i},v\right\rangle e_{i} \ \ \ \ \ (4)
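Formula (4) translates almost directly into code. This is a sketch under the assumption of real vectors (for complex vectors the inner product {\left\langle e_{i},v\right\rangle } would need a complex conjugate on {e_{i}}); the function name `project` and the sample vector are made up for illustration. It uses the {xy} plane example from above, with {e_{1}} and {e_{2}} the unit vectors along {x} and {y}.

```python
def dot(a, b):
    """Real inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def project(v, basis):
    """Apply formula (4): sum of <e_i, v> e_i over an orthonormal basis of U.

    `basis` is assumed to be orthonormal; this is not checked here.
    """
    result = [0.0] * len(v)
    for e in basis:
        c = dot(e, v)
        result = [r + c * x for r, x in zip(result, e)]
    return result

e1 = [1.0, 0.0, 0.0]
e2 = [0.0, 1.0, 0.0]
v = [2.0, -3.0, 5.0]

u = project(v, [e1, e2])             # component of v in the xy plane
w = [a - b for a, b in zip(v, u)]    # remainder, which lies along the z axis
```

The remainder `w` is orthogonal to both basis vectors, which is the statement in (3) that the second term lies in {U^{\perp}}.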

From the definition, it seems reasonable that a vector space {V} can be decomposed into a direct sum of {\mbox{range }P_{U}} and {\mbox{null }P_{U}}. We can in fact prove this.

Theorem 1 If {P} is an orthogonal projection within the vector space {V}, then

\displaystyle V=\mbox{null }P\oplus\mbox{range }P \ \ \ \ \ (5)

Proof: Let {P=P_{U}} be the orthogonal projection onto a subspace {U}. First, {\mbox{range }P=U}: by (2), {Pv\in U} for every {v\in V}, and {Pu=u} for every {u\in U}. From our earlier theorem, we know that {V=U\oplus U^{\perp}}, so we need to show that {U^{\perp}=\mbox{null }P}. Since {Pw=0} for any {w\in U^{\perp}}, we have {U^{\perp}\subset\mbox{null }P}, but are there vectors in {\mbox{null }P} that are not in {U^{\perp}}?

Suppose {x\in\mbox{null }P}, and decompose it as {x=u+w} with {u\in U} and {w\in U^{\perp}}. By (2), {Px=u}, and since {Px=0} we must have {u=0}. Therefore {x=w\in U^{\perp}}, so {\mbox{null }P\subset U^{\perp}}. Combining the two inclusions gives {\mbox{null }P=U^{\perp}}, and thus {V=U\oplus U^{\perp}=\mbox{range }P\oplus\mbox{null }P}. \Box

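From property 5 above, we must have {P_{U}^{2}=P_{U}}, which implies that the only possible eigenvalues of {P_{U}} are 0 and 1: if {P_{U}v=\lambda v} with {v\ne0}, then {\lambda v=P_{U}v=P_{U}^{2}v=\lambda^{2}v}, so {\lambda\left(\lambda-1\right)=0}. The eigenvectors belong to either the subspace {U} (for eigenvalue 1) or to the orthogonal complement {U^{\perp}} (for eigenvalue 0).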

An orthonormal basis of a vector space {V} can be chosen so that it divides into two separate lists of vectors, with one list {\left(e_{1},\ldots,e_{m}\right)} forming an orthonormal basis of the subspace {U} and the other list {\left(f_{1},\ldots,f_{k}\right)} forming an orthonormal basis of {U^{\perp}}. A matrix representation of {P_{U}} can be obtained by considering the action of {P_{U}} on each of the basis vectors from the two subspaces. We have

\displaystyle P_{U}e_{i}=e_{i} \ \ \ \ \ (6)
\displaystyle P_{U}f_{i}=0 \ \ \ \ \ (7)

In general, the matrix representation of an operator {T} is defined in terms of its action on the basis vectors {v_{i}}. Writing {v_{j}^{\prime}=Tv_{j}} for the image of {v_{j}}, the matrix elements {T_{ij}} are given by

\displaystyle v_{j}^{\prime}=\sum_{i=1}^{n}T_{ij}v_{i} \ \ \ \ \ (8)

For a projection operator, applying (8) to (6) and (7) gives the columns of the matrix. List the basis vectors in the order {\left(e_{1},\ldots,e_{m},f_{1},\ldots,f_{k}\right)}. For the first {m} columns (those belonging to the {e_{j}}) we must have {P_{ij}=\delta_{ij}} for all {i=1,\ldots,m+k} and {j=1,\ldots,m}, while for the last {k} columns (those belonging to the {f_{j}}) we must have {P_{ij}=0} for all {i}. Thus {P_{U}} is a {\left(m+k\right)\times\left(m+k\right)} diagonal matrix with the first {m} diagonal elements equal to 1, and all other elements equal to zero.

In this basis, we see that {\mbox{det }P_{U}=0} whenever {U\ne V} (because there is then at least one zero element on the diagonal) and {\mbox{tr }P_{U}=m}, which is the dimension of the subspace {U}. As the trace and determinant are invariant under a change of basis, these properties apply to any basis.
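The basis independence of the trace and determinant can be illustrated with a projection whose matrix is not diagonal. The sketch below, again in plain Python with real vectors, builds the matrix of {P_{U}} in the standard basis of 3-d space as the sum of outer products of an orthonormal basis of {U}, which is formula (4) written in matrix form. The particular subspace (spanned by {\left(1,1,0\right)/\sqrt{2}} and {\left(0,0,1\right)}) and all helper names are assumptions made for the example.

```python
import math

def outer(a, b):
    """Outer product a b^T as a list of rows."""
    return [[x * y for y in b] for x in a]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

s = 1 / math.sqrt(2)
e1 = [s, s, 0.0]        # orthonormal basis of a 2-d subspace U
e2 = [0.0, 0.0, 1.0]

P = mat_add(outer(e1, e1), outer(e2, e2))   # matrix of P_U in the standard basis
P2 = mat_mul(P, P)                          # should reproduce P up to rounding
```

In this basis `P` has off-diagonal entries of about 0.5, yet `trace(P)` is 2 (the dimension of {U}) and `det3(P)` is 0 up to rounding, matching the values read off from the diagonal form.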
