Welcome to Physics Pages

This blog consists of my notes and solutions to problems in various areas of mainstream physics. An index to the topics covered is contained in the links in the sidebar on the right, or in the menu at the top of the page.

This isn’t a “popular science” site, in that most posts use a fair bit of mathematics to explain their concepts. Thus this blog aims mainly to help those who are learning or reviewing physics in depth. More details on what the site contains and how to use it are in the Welcome menu above.

Despite Stephen Hawking’s caution that every equation included in a book (or, I suppose, in a blog) would halve the readership, this blog has proved very popular since its inception in December 2010. (The total number of hits is given in the sidebar at the right.)

Many thanks to my loyal followers and best wishes to everyone who visits. I hope you find it useful. Constructive criticism (or even praise) is always welcome, so feel free to leave a comment in response to any of the posts.

Shaula (Lambda Scorpii)

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 3, Problem 3.19.

As another example of calculating the colour indices of a star using the blackbody radiation rate, we’ll look at the star Shaula ({\lambda} Scorpii), which has a surface temperature of around 22000 K. We can use the formulas:

\displaystyle U-B \displaystyle = \displaystyle -2.5\log\frac{\lambda_{B}^{5}\left(e^{hc/\lambda_{B}k_{B}T}-1\right)\Delta\lambda_{U}}{\lambda_{U}^{5}\left(e^{hc/\lambda_{U}k_{B}T}-1\right)\Delta\lambda_{B}}+C_{U-B}\ \ \ \ \ (1)
\displaystyle B-V \displaystyle = \displaystyle -2.5\log\frac{\lambda_{V}^{5}\left(e^{hc/\lambda_{V}k_{B}T}-1\right)\Delta\lambda_{B}}{\lambda_{B}^{5}\left(e^{hc/\lambda_{B}k_{B}T}-1\right)\Delta\lambda_{V}}+C_{B-V} \ \ \ \ \ (2)


\displaystyle C_{U-B} \displaystyle = \displaystyle -0.87\ \ \ \ \ (3)
\displaystyle C_{B-V} \displaystyle = \displaystyle +0.65 \ \ \ \ \ (4)

Plugging in the numbers, we get

\displaystyle U-B \displaystyle = \displaystyle -1.076\ \ \ \ \ (5)
\displaystyle B-V \displaystyle = \displaystyle -0.227 \ \ \ \ \ (6)

The measured values given by Carroll & Ostlie are {U-B=-0.90} and {B-V=-0.23}, so the blackbody value of {B-V} is very close to the measured one, while {U-B} differs by about 0.18 magnitudes.

The apparent visual magnitude of Shaula is {V=1.62} and its parallax as measured by Hipparcos is {0.00464^{\prime\prime}}. The distance of Shaula from Earth is therefore

\displaystyle d=\frac{1}{p}=215.517\mbox{ pc} \ \ \ \ \ (7)

The absolute visual magnitude is therefore

\displaystyle M_{V} \displaystyle = \displaystyle V+5-5\log d\ \ \ \ \ (8)
\displaystyle \displaystyle = \displaystyle -5.047 \ \ \ \ \ (9)

If it were 10 parsecs from Earth, it would be the second brightest object (after the Moon) in the night sky, outshining even Venus.
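The arithmetic in 7 to 9 can be reproduced with a few lines of Python. This is only a minimal sketch; the function names are my own, and the parallax is assumed to be in arcseconds so that the distance comes out in parsecs.

```python
import math

def distance_pc(parallax_arcsec):
    """Distance in parsecs from a parallax angle in arcseconds (d = 1/p)."""
    return 1.0 / parallax_arcsec

def absolute_magnitude(apparent_mag, d_pc):
    """Distance modulus rearranged: M = m + 5 - 5 log10(d)."""
    return apparent_mag + 5.0 - 5.0 * math.log10(d_pc)

d_shaula = distance_pc(0.00464)
m_v_shaula = absolute_magnitude(1.62, d_shaula)
print(f"d = {d_shaula:.3f} pc, M_V = {m_v_shaula:.3f}")
```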

Colour indices in terms of blackbody radiation rate

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 3, Problem 3.18.

We can write the apparent magnitude {m_{\lambda}} of a star for some wavelength band in terms of the flux {F_{\lambda}} in that band received on Earth from the star as

\displaystyle m_{\lambda} \displaystyle = \displaystyle -2.5\log\int F_{\lambda}S_{\lambda}d\lambda+C_{\lambda} \ \ \ \ \ (1)

The constant {C_{\lambda}} depends on the wavelength interval over which the integral is done, and on the sensitivity {S_{\lambda}} of the detector, so it will be different for different filters. If we measure the apparent magnitudes in the three standard bands {U}, {B} and {V} and treat the star as a blackbody, we can work out the three constants {C_{U}}, {C_{B}} and {C_{V}}. Since colour indices are commonly used to classify stars, we can work out similar equations for these indices.

\displaystyle U-B \displaystyle = \displaystyle -2.5\log\int F_{U}S_{U}d\lambda+C_{U}+2.5\log\int F_{B}S_{B}d\lambda-C_{B}\ \ \ \ \ (2)
\displaystyle \displaystyle = \displaystyle -2.5\log\frac{\int F_{U}S_{U}d\lambda}{\int F_{B}S_{B}d\lambda}+C_{U-B} \ \ \ \ \ (3)

where {C_{U-B}\equiv C_{U}-C_{B}}. Since the argument of the log is now dimensionless, the constant {C_{U-B}} is independent of the units used to measure flux. In their example 3.6.2, Carroll & Ostlie use a star with surface temperature of {T=42000\mbox{ K}} and measured colour indices of {U-B=-1.19} and {B-V=-0.33}. The standard filters are

  • {U}: {365\pm34\mbox{ nm}};
  • {B}: {440\pm49\mbox{ nm}};
  • {V}: {550\pm44.5\mbox{ nm}.}

If we make the assumptions that {S=1} for each filter within these bands and {S=0} outside these bands, and that the flux doesn’t change much over the bandwidth in each filter, we can approximate the relation above by

\displaystyle U-B\approx-2.5\log\frac{F_{U}\Delta\lambda_{U}}{F_{B}\Delta\lambda_{B}}+C_{U-B} \ \ \ \ \ (4)

with a similar relation for {B-V}:

\displaystyle B-V\approx-2.5\log\frac{F_{B}\Delta\lambda_{B}}{F_{V}\Delta\lambda_{V}}+C_{B-V} \ \ \ \ \ (5)

As we’re dealing with ratios of flux for the same star, we can express these equations in terms of the blackbody radiation rate:

\displaystyle U-B \displaystyle = \displaystyle -2.5\log\frac{B_{U}\Delta\lambda_{U}}{B_{B}\Delta\lambda_{B}}+C_{U-B}\ \ \ \ \ (6)
\displaystyle \displaystyle = \displaystyle -2.5\log\frac{\lambda_{B}^{5}\left(e^{hc/\lambda_{B}k_{B}T}-1\right)\Delta\lambda_{U}}{\lambda_{U}^{5}\left(e^{hc/\lambda_{U}k_{B}T}-1\right)\Delta\lambda_{B}}+C_{U-B}\ \ \ \ \ (7)
\displaystyle B-V \displaystyle = \displaystyle -2.5\log\frac{\lambda_{V}^{5}\left(e^{hc/\lambda_{V}k_{B}T}-1\right)\Delta\lambda_{B}}{\lambda_{B}^{5}\left(e^{hc/\lambda_{B}k_{B}T}-1\right)\Delta\lambda_{V}}+C_{B-V} \ \ \ \ \ (8)

Plugging in the values for the star mentioned above, we get

\displaystyle C_{U-B} \displaystyle = \displaystyle -0.87\ \ \ \ \ (9)
\displaystyle C_{B-V} \displaystyle = \displaystyle +0.65 \ \ \ \ \ (10)

Given these values, we can now estimate the colour indices for any star if we know its temperature. For the Sun, {T=5777\mbox{ K}} and plugging this into 7 and 8, we get

\displaystyle U-B \displaystyle = \displaystyle -0.222\ \ \ \ \ (11)
\displaystyle B-V \displaystyle = \displaystyle +0.571 \ \ \ \ \ (12)

The measured values given by Carroll & Ostlie are {U-B=+0.195} and {B-V=+0.65} so the agreement isn’t great, especially for {U-B}.
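As a rough check on these numbers, here is a sketch of the calculation in Python. It calibrates the two constants on the {T=42000\mbox{ K}} star and then applies them to the Sun. The function names are illustrative, the physical constants are the standard SI values, and I have assumed that the full bandwidth of each filter is twice the {\pm} value quoted above.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C_LIGHT = 2.99792458e8  # speed of light, m/s
K_B = 1.380649e-23      # Boltzmann constant, J/K

# Filter centre and full bandwidth in metres (full width assumed to be twice the +/- value).
FILTERS = {"U": (365e-9, 68e-9), "B": (440e-9, 98e-9), "V": (550e-9, 89e-9)}

def band_signal(band, temperature):
    """B_lambda * delta_lambda at the filter centre, omitting the common factor 2hc^2."""
    lam, width = FILTERS[band]
    x = H * C_LIGHT / (lam * K_B * temperature)
    return width / (lam**5 * math.expm1(x))

def raw_index(band1, band2, temperature):
    """The -2.5 log term of the colour index band1 - band2, without the constant."""
    ratio = band_signal(band1, temperature) / band_signal(band2, temperature)
    return -2.5 * math.log10(ratio)

# Calibrate on the T = 42000 K star with measured U-B = -1.19 and B-V = -0.33.
c_ub = round(-1.19 - raw_index("U", "B", 42000.0), 2)
c_bv = round(-0.33 - raw_index("B", "V", 42000.0), 2)

# Apply the rounded constants to the Sun.
ub_sun = raw_index("U", "B", 5777.0) + c_ub
bv_sun = raw_index("B", "V", 5777.0) + c_bv
print(f"C_UB = {c_ub}, C_BV = {c_bv}")
print(f"Sun: U-B = {ub_sun:.3f}, B-V = {bv_sun:.3f}")
```

Changing the temperature passed to `raw_index` gives the blackbody estimate for any other star.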

Bolometric magnitude from flux

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 3, Problem 3.17.

The apparent magnitude of a star at a particular wavelength can be written in terms of the flux observed at that wavelength on Earth by

\displaystyle   m \displaystyle  = \displaystyle  -2.5\log\int F_{\lambda}S_{\lambda}d\lambda+C\ \ \ \ \ (1)
\displaystyle  C \displaystyle  \equiv \displaystyle  M_{Sun}+2.5\log\int F_{\lambda,10,Sun}S_{\lambda}d\lambda \ \ \ \ \ (2)

Here {S_{\lambda}} is the sensitivity function and indicates what fraction of the actual flux a particular telescope receives at a given wavelength. This formula is actually not well-formed since whenever we use a transcendental function such as the logarithm, its argument should be dimensionless. It is true that if we combine the two terms into a single logarithm, we get

\displaystyle  m=-2.5\log\frac{\int F_{\lambda}S_{\lambda}d\lambda}{\int F_{\lambda,10,Sun}S_{\lambda}d\lambda}+M_{Sun} \ \ \ \ \ (3)

giving a dimensionless argument for the log term. However, in the original form, the constant {C} depends on the units used for the flux.

It seems to be standard to use {\mbox{watts m}^{-2}} for total flux, so the units of {F_{\lambda}} are {\mbox{watts m}^{-2}\mbox{ nm}^{-1}} if the wavelength {\lambda} is given in nanometres.

For a bolometric magnitude, we set {S_{\lambda}=1} for all {\lambda}. The bolometric flux for the Sun at the distance of Earth is

\displaystyle  \int_{0}^{\infty}F_{\lambda}d\lambda=1365\mbox{ W m}^{-2} \ \ \ \ \ (4)

Taking the apparent bolometric magnitude of the Sun as {m_{Sun}=-26.83} we get

\displaystyle  C_{bol}=-18.992 \ \ \ \ \ (5)

As a consistency check, we can plug in the values for Dschubba (Delta Sco), whose flux at Earth is {F=6.44\times10^{-8}\mbox{ W m}^{-2}}:

\displaystyle  m=-2.5\log\left(6.44\times10^{-8}\right)-18.992=-1.01 \ \ \ \ \ (6)

which just gives us the bolometric magnitude we had before.
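The two numbers above follow directly from 1 with {S_{\lambda}=1}. A short sketch (the names are my own, and fluxes are assumed to be in {\mbox{W m}^{-2}}):

```python
import math

SOLAR_FLUX = 1365.0      # W m^-2, bolometric flux of the Sun at Earth
M_SUN_APPARENT = -26.83  # apparent bolometric magnitude of the Sun

# m = -2.5 log10(F) + C_bol, solved for C_bol using the Sun's values.
c_bol = M_SUN_APPARENT + 2.5 * math.log10(SOLAR_FLUX)

def bolometric_magnitude(flux_w_m2):
    """Apparent bolometric magnitude for a flux given in W m^-2."""
    return -2.5 * math.log10(flux_w_m2) + c_bol

m_dschubba = bolometric_magnitude(6.44e-8)
print(f"C_bol = {c_bol:.3f}, m(Dschubba) = {m_dschubba:.2f}")
```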

Vega as a blackbody

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 3, Problem 3.16.

The blackbody radiation rate in a wavelength interval {d\lambda} is

\displaystyle  B_{\lambda}\left(T\right)d\lambda=\frac{2hc^{2}}{\lambda^{5}\left(e^{hc/\lambda k_{B}T}-1\right)}d\lambda \ \ \ \ \ (1)

The luminosity {L_{\lambda}} within the same wavelength interval is the radiation rate integrated over all outward directions (which contributes a factor of {\pi}) times the surface area of the star:

\displaystyle  L_{\lambda}d\lambda=4\pi^{2}R^{2}B_{\lambda}d\lambda=\frac{8\pi^{2}hc^{2}R^{2}}{\lambda^{5}\left(e^{hc/\lambda k_{B}T}-1\right)}d\lambda \ \ \ \ \ (2)

The flux {F_{\lambda}} received by an observer on Earth at a distance {r} from the star is then

\displaystyle  F_{\lambda}d\lambda=\frac{L_{\lambda}}{4\pi r^{2}}d\lambda=\frac{2\pi hc^{2}}{\lambda^{5}\left(e^{hc/\lambda k_{B}T}-1\right)}\frac{R^{2}}{r^{2}}d\lambda \ \ \ \ \ (3)

The apparent magnitude {m} in some wavelength interval (for example, one of the {U}, {B} and {V} magnitudes) is

\displaystyle  m=M_{Sun}-2.5\log\frac{\int F_{\lambda}S_{\lambda}d\lambda}{\int F_{\lambda,10,Sun}S_{\lambda}d\lambda} \ \ \ \ \ (4)

where {M_{Sun}} is the absolute magnitude of the Sun over the same wavelength interval and {F_{\lambda,10,Sun}} is the flux {F_{\lambda}} for the Sun at a distance of 10 pc. The function {S_{\lambda}} is the sensitivity function and indicates what fraction of the star’s light is received at a given wavelength {\lambda}. Thus {0\le S_{\lambda}\le1} for all {\lambda}. Since the values for the Sun in 4 are the same for every star, we can write it as

\displaystyle   m \displaystyle  = \displaystyle  -2.5\log\int F_{\lambda}S_{\lambda}d\lambda+C\ \ \ \ \ (5)
\displaystyle  C \displaystyle  \equiv \displaystyle  M_{Sun}+2.5\log\int F_{\lambda,10,Sun}S_{\lambda}d\lambda \ \ \ \ \ (6)

{C} depends on the wavelength interval over which the integral is done, and on the sensitivity {S_{\lambda}} of the detector, so it will be different for different filters and telescopes. In practice, {C_{U}}, {C_{B}} and {C_{V}} are all chosen so that {U}, {B} and {V} are all zero for the star Vega. Unfortunately, this doesn’t mean that Vega is actually the same brightness when viewed in these three regions of the spectrum. We can find the wavelength band in which Vega actually appears brightest by calculating the colour indices using the above formulas.

\displaystyle   U-B \displaystyle  = \displaystyle  M_{Sun}-2.5\log\left[\frac{\int F_{\lambda}S_{\lambda}d\lambda}{\int F_{\lambda,10,Sun}S_{\lambda}d\lambda}\right]_{U}-M_{Sun}+2.5\log\left[\frac{\int F_{\lambda}S_{\lambda}d\lambda}{\int F_{\lambda,10,Sun}S_{\lambda}d\lambda}\right]_{B}\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  2.5\log\frac{\left[\int F_{\lambda}S_{\lambda}d\lambda\right]_{B}}{\left[\int F_{\lambda}S_{\lambda}d\lambda\right]_{U}}\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  2.5\log\frac{\left[\int B_{\lambda}S_{\lambda}d\lambda\right]_{B}}{\left[\int B_{\lambda}S_{\lambda}d\lambda\right]_{U}} \ \ \ \ \ (9)

where in each case the integrals in square brackets are evaluated over the wavelength range corresponding to the subscript {U} or {B}. If the wavelength filters are narrow enough and we take the sensitivity function {S_{\lambda}} to be 1 inside the filter’s range and 0 outside, we can approximate the integrals by calculating {B_{\lambda}} at the midpoint of the wavelength range and just multiplying by the filter’s bandwidth {\Delta\lambda}. That is, we get

\displaystyle   U-B \displaystyle  \approx \displaystyle  2.5\log\frac{B_{\lambda_{B}}\Delta\lambda_{B}}{B_{\lambda_{U}}\Delta\lambda_{U}}\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  2.5\log\frac{\lambda_{U}^{5}\left(e^{hc/\lambda_{U}k_{B}T}-1\right)\Delta\lambda_{B}}{\lambda_{B}^{5}\left(e^{hc/\lambda_{B}k_{B}T}-1\right)\Delta\lambda_{U}} \ \ \ \ \ (11)

There’s a similar relation for {B-V}:

\displaystyle  B-V\approx2.5\log\frac{\lambda_{B}^{5}\left(e^{hc/\lambda_{B}k_{B}T}-1\right)\Delta\lambda_{V}}{\lambda_{V}^{5}\left(e^{hc/\lambda_{V}k_{B}T}-1\right)\Delta\lambda_{B}} \ \ \ \ \ (12)

The standard filters are

  • {U}: {365\pm34\mbox{ nm}};
  • {B}: {440\pm49\mbox{ nm}};
  • {V}: {550\pm44.5\mbox{ nm}.}

Using a temperature of {T=9600\mbox{ K}} for Vega, we get

\displaystyle   U-B \displaystyle  = \displaystyle  +0.161\ \ \ \ \ (13)
\displaystyle  B-V \displaystyle  = \displaystyle  -0.539 \ \ \ \ \ (14)

Thus a blackbody with {T=9600\mbox{ K}} as an approximation to Vega would appear brightest in the blue region, since {U>B} and {B<V} so {B} is the smallest (brightest) magnitude.
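To see how much the narrow-filter approximation matters, the sketch below computes the two indices both ways: with {B_{\lambda}} evaluated at the filter midpoint, and with a simple trapezoidal integral across each band (still taking {S_{\lambda}=1} inside the band and 0 outside). The function names and the number of integration steps are my own choices, and the value of {hc/k_{B}} is the standard one.

```python
import math

HC_OVER_K = 1.438777e-2  # hc/k_B in m K

# Filter centre and half-width in metres.
FILTERS = {"U": (365e-9, 34e-9), "B": (440e-9, 49e-9), "V": (550e-9, 44.5e-9)}

def planck(lam, temperature):
    """B_lambda up to the constant factor 2hc^2, which cancels in the ratios."""
    return 1.0 / (lam**5 * math.expm1(HC_OVER_K / (lam * temperature)))

def midpoint_signal(band, temperature):
    """B_lambda at the band centre times the full bandwidth."""
    centre, half = FILTERS[band]
    return planck(centre, temperature) * 2.0 * half

def integrated_signal(band, temperature, steps=2000):
    """Trapezoidal integral of B_lambda across the band."""
    centre, half = FILTERS[band]
    lo, h = centre - half, 2.0 * half / steps
    total = 0.5 * (planck(lo, temperature) + planck(centre + half, temperature))
    total += sum(planck(lo + i * h, temperature) for i in range(1, steps))
    return total * h

def index(signal, band1, band2, temperature):
    """Uncalibrated colour index band1 - band2, as in equation 10."""
    return 2.5 * math.log10(signal(band2, temperature) / signal(band1, temperature))

T_VEGA = 9600.0
ub_mid = index(midpoint_signal, "U", "B", T_VEGA)
bv_mid = index(midpoint_signal, "B", "V", T_VEGA)
ub_int = index(integrated_signal, "U", "B", T_VEGA)
bv_int = index(integrated_signal, "B", "V", T_VEGA)
print(f"midpoint:   U-B = {ub_mid:.3f}, B-V = {bv_mid:.3f}")
print(f"integrated: U-B = {ub_int:.3f}, B-V = {bv_int:.3f}")
```

The midpoint values should match 13 and 14, and the integrated ones differ only slightly, so the conclusion that {B} is the brightest band is not sensitive to the approximation.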

Average of product of two waves

References: Griffiths, David J. (2007), Introduction to Electrodynamics, 3rd Edition; Pearson Education – Chapter 9, Problem 11.

A common calculation that is required when analyzing any system that varies sinusoidally in time is a time average over one cycle. For example, a monochromatic plane wave with amplitude {A}, wave vector {\mathbf{k}} (which gives the direction of propagation), frequency {\omega} and phase {\delta} can be written as

\displaystyle   f \displaystyle  = \displaystyle  A\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta\right)=\Re\tilde{A}e^{i\left(\mathbf{k}\cdot\mathbf{r}-\omega t\right)}\ \ \ \ \ (1)
\displaystyle  \tilde{A} \displaystyle  = \displaystyle  Ae^{i\delta} \ \ \ \ \ (2)

Now suppose we have two waves with the same direction and frequency, but different amplitudes and phases. Then

\displaystyle   f \displaystyle  = \displaystyle  A\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta_{a}\right)\ \ \ \ \ (3)
\displaystyle  g \displaystyle  = \displaystyle  B\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta_{b}\right) \ \ \ \ \ (4)

The average of the product of these waves over a single cycle is then

\displaystyle  \left\langle fg\right\rangle =\frac{\omega AB}{2\pi}\int_{0}^{2\pi/\omega}\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta_{a}\right)\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta_{b}\right)dt \ \ \ \ \ (5)

We can transform this integral by defining

\displaystyle   \theta \displaystyle  \equiv \displaystyle  \mathbf{k}\cdot\mathbf{r}-\omega t\ \ \ \ \ (6)
\displaystyle  d\theta \displaystyle  = \displaystyle  -\omega dt\ \ \ \ \ (7)
\displaystyle  \left\langle fg\right\rangle \displaystyle  = \displaystyle  \frac{AB}{2\pi}\int_{0}^{2\pi}\cos\left(\theta+\delta_{a}\right)\cos\left(\theta+\delta_{b}\right)d\theta \ \ \ \ \ (8)

We’ve used the limits of 0 and {2\pi} since any interval of {2\pi} covers one complete cycle of {\theta}.

The two cosines have the same period and differ only in their phase, so we will get the same result from the integral if we replace them by

\displaystyle   \cos\left(\theta+\delta_{a}\right)\cos\left(\theta+\delta_{b}\right) \displaystyle  \rightarrow \displaystyle  \cos\theta\cos\left(\theta+\delta_{a}-\delta_{b}\right)\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \cos^{2}\theta\cos\left(\delta_{a}-\delta_{b}\right)-\cos\theta\sin\theta\sin\left(\delta_{a}-\delta_{b}\right) \ \ \ \ \ (10)

We now have

\displaystyle   \left\langle fg\right\rangle \displaystyle  = \displaystyle  \frac{AB}{2\pi}\cos\left(\delta_{a}-\delta_{b}\right)\int_{0}^{2\pi}\cos^{2}\theta d\theta-\frac{AB}{2\pi}\sin\left(\delta_{a}-\delta_{b}\right)\int_{0}^{2\pi}\cos\theta\sin\theta d\theta\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}AB\cos\left(\delta_{a}-\delta_{b}\right)-0\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}AB\cos\left(\delta_{a}-\delta_{b}\right)\ \ \ \ \ (13)
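\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\Re\left(\tilde{f}\tilde{g}^*\right)=\frac{1}{2}\Re\left(\tilde{f}^*\tilde{g}\right) \ \ \ \ \ (14)

where {\tilde{f}=\tilde{A}e^{i\left(\mathbf{k}\cdot\mathbf{r}-\omega t\right)}} and {\tilde{g}=\tilde{B}e^{i\left(\mathbf{k}\cdot\mathbf{r}-\omega t\right)}} are the complex forms of the two waves, with {\tilde{A}=Ae^{i\delta_{a}}} and {\tilde{B}=Be^{i\delta_{b}}}. Thus we can get the answer using complex notation without doing any integrals.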
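A quick numerical check of this result: the sketch below averages the product of the two real waves over one period with the midpoint rule and compares it with half the real part of the product of one complex amplitude with the conjugate of the other. The amplitudes, phases, frequency and the value of {\mathbf{k}\cdot\mathbf{r}} are arbitrary test values of my own.

```python
import cmath
import math

def time_average_product(amp_a, delta_a, amp_b, delta_b, omega=2.0, kr=0.3, steps=10000):
    """Average f*g over one period 2*pi/omega using the midpoint rule."""
    period = 2.0 * math.pi / omega
    dt = period / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        f = amp_a * math.cos(kr - omega * t + delta_a)
        g = amp_b * math.cos(kr - omega * t + delta_b)
        total += f * g * dt
    return total / period

def complex_shortcut(amp_a, delta_a, amp_b, delta_b):
    """Half the real part of (complex f) times conj(complex g).

    The common factor exp(i(k.r - omega t)) cancels against its conjugate,
    so only the complex amplitudes are needed.
    """
    f_tilde = amp_a * cmath.exp(1j * delta_a)
    g_tilde = amp_b * cmath.exp(1j * delta_b)
    return 0.5 * (f_tilde * g_tilde.conjugate()).real

numeric = time_average_product(2.0, 0.4, 3.0, 1.1)
shortcut = complex_shortcut(2.0, 0.4, 3.0, 1.1)
print(numeric, shortcut)
```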

This applies to vector products as well, since the components of vector products are just products of scalar functions. For example, the time average of the Poynting vector becomes, when the electric and magnetic fields are written in complex notation:

\displaystyle  \left\langle \mathbf{S}\right\rangle =\frac{1}{2\mu_{0}}\Re\left(\tilde{\mathbf{E}}\times\tilde{\mathbf{B}}^*\right) \ \ \ \ \ (15)

The electromagnetic energy density in the fields has a time average of

\displaystyle   \left\langle u_{em}\right\rangle \displaystyle  = \displaystyle  \frac{1}{4}\Re\left(\epsilon_{0}\tilde{\mathbf{E}}\cdot\tilde{\mathbf{E}}^*+\frac{1}{\mu_{0}}\tilde{\mathbf{B}}\cdot\tilde{\mathbf{B}}^*\right)\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{4}\left(\epsilon_{0}\tilde{\mathbf{E}}\cdot\tilde{\mathbf{E}}^*+\frac{1}{\mu_{0}}\tilde{\mathbf{B}}\cdot\tilde{\mathbf{B}}^*\right)\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{4\mu_{0}}\left(\frac{1}{c^{2}}\tilde{\mathbf{E}}\cdot\tilde{\mathbf{E}}^*+\tilde{\mathbf{B}}\cdot\tilde{\mathbf{B}}^*\right) \ \ \ \ \ (18)

We dropped the {\Re} in line 2 since the quantity in parentheses is automatically real anyway, and in the last line we used

\displaystyle  \mu_{0}\epsilon_{0}=\frac{1}{c^{2}} \ \ \ \ \ (19)

Stark effect in hydrogen for n = 1 and n = 2

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 6.36.

The Zeeman effect occurs when an atom is placed in an external magnetic field, resulting in the interaction between field and the magnetic dipole moments of the atom causing splitting of the energy levels. The electrical analogue of the Zeeman effect, when an atom is placed in an external electric field, is called the Stark effect. We can use perturbation theory to analyze the effect on the energy levels of the electron.

The perturbation hamiltonian is, assuming the electric field points in the {z} direction:

\displaystyle  H_{S}^{\prime}=eE_{ext}z=eE_{ext}r\cos\theta \ \ \ \ \ (1)

To use perturbation theory, we’ll need the wave functions for unperturbed hydrogen, which are given in Griffiths as equation 4.89. For the ground state {n=1}, we have

\displaystyle  \left|100\right\rangle =\frac{2}{a^{3/2}}\frac{1}{\sqrt{4\pi}}e^{-r/a} \ \ \ \ \ (2)

Since the ground state is non-degenerate, we can use non-degenerate perturbation theory:

\displaystyle   E_{100,1} \displaystyle  = \displaystyle  \left\langle 100\right|H_{S}^{\prime}\left|100\right\rangle \ \ \ \ \ (3)

Rather than writing out the integral, we observe that {\left\langle 100\right|H_{S}^{\prime}\left|100\right\rangle } contains the integral of {\cos\theta\sin\theta=\frac{1}{2}\sin2\theta} over {\theta=0,..,\pi} which is zero, so {E_{100,1}=0}.

To analyze {n=2}, we need the four wave functions:

\displaystyle   \left|200\right\rangle \displaystyle  = \displaystyle  R_{20}\left(r\right)Y_{0}^{0}\left(\theta,\phi\right)\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{\sqrt{2}a^{3/2}}\left(1-\frac{r}{2a}\right)\frac{1}{\sqrt{4\pi}}e^{-r/2a}\ \ \ \ \ (5)
\displaystyle  \left|211\right\rangle \displaystyle  = \displaystyle  R_{21}\left(r\right)Y_{1}^{1}\left(\theta,\phi\right)\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  -\left(\frac{3}{8\pi}\right)^{1/2}\frac{1}{\sqrt{24}a^{5/2}}re^{-r/2a}\sin\theta e^{i\phi}\ \ \ \ \ (7)
\displaystyle  \left|210\right\rangle \displaystyle  = \displaystyle  R_{21}\left(r\right)Y_{1}^{0}\left(\theta,\phi\right)\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  \left(\frac{3}{4\pi}\right)^{1/2}\frac{1}{\sqrt{24}a^{5/2}}re^{-r/2a}\cos\theta\ \ \ \ \ (9)
\displaystyle  \left|21-1\right\rangle \displaystyle  = \displaystyle  R_{21}\left(r\right)Y_{1}^{-1}\left(\theta,\phi\right)\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \left(\frac{3}{8\pi}\right)^{1/2}\frac{1}{\sqrt{24}a^{5/2}}re^{-r/2a}\sin\theta e^{-i\phi} \ \ \ \ \ (11)

Since all four of these states have the same unperturbed energy, we need to use degenerate perturbation theory, so we’ll need to find the matrix {W} with elements

\displaystyle  W_{a,b}=\left\langle a\right|H_{S}^{\prime}\left|b\right\rangle \ \ \ \ \ (12)

where {a} and {b} represent one of the four states above.

First, we’ll look at the {\theta} integrals. All matrix elements involve integrals of the form (remember that {H_{S}^{\prime}} always contributes a {\cos\theta} and the spherical volume element always contributes a {\sin\theta}):

\displaystyle  I_{nm}=\int_{0}^{\pi}\sin^{n}\theta\cos^{m}\theta d\theta \ \ \ \ \ (13)

For the possible values of {n} and {m} in this problem, the only non-zero integrals of this form are

\displaystyle   I_{12} \displaystyle  = \displaystyle  \frac{2}{3}\ \ \ \ \ (14)
\displaystyle  I_{22} \displaystyle  = \displaystyle  \frac{\pi}{8} \ \ \ \ \ (15)

{I_{12}} arises in {\left\langle 200\right|H_{S}^{\prime}\left|210\right\rangle } (and its transpose) and {I_{22}} arises in {\left\langle 211\right|H_{S}^{\prime}\left|210\right\rangle } and {\left\langle 210\right|H_{S}^{\prime}\left|21-1\right\rangle } (and their transposes). Thus these are the only possible non-zero entries in {W}. However, {\left\langle 211\right|H_{S}^{\prime}\left|210\right\rangle } and {\left\langle 210\right|H_{S}^{\prime}\left|21-1\right\rangle } involve integrating {e^{\pm i\phi}} over {\phi=0..2\pi} which gives zero. Thus the only non-zero matrix elements are {\left\langle 200\right|H_{S}^{\prime}\left|210\right\rangle } (and its transpose). This gives

\displaystyle   \left\langle 200\right|H_{S}^{\prime}\left|210\right\rangle \displaystyle  = \displaystyle  \frac{1}{\sqrt{2}a^{3/2}}\left(\frac{3}{4\pi}\right)^{1/2}\frac{1}{\sqrt{24}a^{5/2}}\frac{1}{\sqrt{4\pi}}eE_{ext}\int_{0}^{\infty}\int_{0}^{\pi}\int_{0}^{2\pi}\left(1-\frac{r}{2a}\right)r^{4}e^{-r/a}\cos^{2}\theta\sin\theta d\phi d\theta dr\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  -3aeE_{ext} \ \ \ \ \ (17)

(The integral can be done with software, or by hand using integration by parts.) The matrix {W} is therefore
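For anyone who wants to check the value {-3aeE_{ext}} without software or integration by parts, here is a short sketch. It uses the standard result {\int_{0}^{\infty}r^{n}e^{-r/a}dr=n!a^{n+1}} for the radial part; the radial integrand is the product of the two radial functions, a factor {r} from {z=r\cos\theta} and {r^{2}} from the volume element, that is {\left(1-\frac{r}{2a}\right)r^{4}e^{-r/a}}. The function names and the choice of units ({eE_{ext}} passed in as a single number) are my own.

```python
import math

def radial_moment(n, a):
    """Integral of r^n exp(-r/a) from 0 to infinity, which equals n! a^(n+1)."""
    return math.factorial(n) * a ** (n + 1)

def stark_matrix_element(a=1.0, e_field_energy=1.0):
    """<200|H'|210>, where e_field_energy stands for the product e * E_ext."""
    # Normalisation constants of R20*Y00 and R21*Y10.
    prefactor = (1.0 / (math.sqrt(2.0) * a**1.5)) * (1.0 / math.sqrt(4.0 * math.pi))
    prefactor *= math.sqrt(3.0 / (4.0 * math.pi)) / (math.sqrt(24.0) * a**2.5)
    # Radial part: (1 - r/2a) r^4 exp(-r/a) integrated term by term.
    radial = radial_moment(4, a) - radial_moment(5, a) / (2.0 * a)
    theta = 2.0 / 3.0      # I_12, the integral of sin(theta) cos^2(theta)
    phi = 2.0 * math.pi    # the integrand does not depend on phi
    return e_field_energy * prefactor * radial * theta * phi

w_13 = stark_matrix_element()
print(w_13)  # in units of a * e * E_ext
```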

\displaystyle  W=\left[\begin{array}{cccc} 0 & 0 & -3aeE_{ext} & 0\\ 0 & 0 & 0 & 0\\ -3aeE_{ext} & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{array}\right] \ \ \ \ \ (18)

The eigenvalues are 0, 0, {\pm3aeE_{ext}} so the {n=2} level splits into 3 levels, one with energy {E_{2,0}} (degeneracy 2) and two with energies {E_{2,0}\pm3aeE_{ext}} (each with degeneracy 1). The eigenvectors are {\left[0,1,0,0\right]} and {\left[0,0,0,1\right]} for eigenvalue 0, {\left[-1,0,1,0\right]} for {3aeE_{ext}} and {\left[1,0,1,0\right]} for {-3aeE_{ext}}. Thus the ‘good’ states are

\displaystyle  \left|211\right\rangle ,\left|21-1\right\rangle ,\frac{1}{\sqrt{2}}\left(-\left|200\right\rangle +\left|210\right\rangle \right),\frac{1}{\sqrt{2}}\left(\left|200\right\rangle +\left|210\right\rangle \right) \ \ \ \ \ (19)
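The eigenvectors can be verified by direct multiplication. In this sketch the number `K` stands for {3aeE_{ext}} in arbitrary units, the basis order is assumed to be {\left|200\right\rangle ,\left|211\right\rangle ,\left|210\right\rangle ,\left|21-1\right\rangle } (the order in which the states were listed), and the helper names are my own.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

K = 3.0  # stands for 3 a e E_ext in arbitrary units

# Basis order: |200>, |211>, |210>, |21-1>.
W = [
    [0.0, 0.0, -K, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [-K, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# (eigenvector, eigenvalue) pairs quoted in the text.
candidates = [
    ([0.0, 1.0, 0.0, 0.0], 0.0),
    ([0.0, 0.0, 0.0, 1.0], 0.0),
    ([-1.0, 0.0, 1.0, 0.0], K),
    ([1.0, 0.0, 1.0, 0.0], -K),
]

def is_eigenpair(matrix, vector, eigenvalue, tol=1e-12):
    """True if matrix * vector equals eigenvalue * vector component by component."""
    image = mat_vec(matrix, vector)
    return all(abs(w - eigenvalue * v) < tol for w, v in zip(image, vector))

checks = [is_eigenpair(W, vec, val) for vec, val in candidates]
print(checks)
```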

The electric dipole moment of hydrogen is (treating the proton and electron as point charges):

\displaystyle   \mathbf{p} \displaystyle  = \displaystyle  -e\mathbf{r}\ \ \ \ \ (20)
\displaystyle  \displaystyle  = \displaystyle  -er\left(\sin\theta\cos\phi\hat{\mathbf{x}}+\sin\theta\sin\phi\hat{\mathbf{y}}+\cos\theta\hat{\mathbf{z}}\right) \ \ \ \ \ (21)

We can work out the expectation value of {\mathbf{p}} in each of the ‘good’ states by straightforward integration: {\left\langle \mathbf{p}\right\rangle =\left\langle a\right|\mathbf{p}\left|a\right\rangle } where {a} stands for one of the ‘good’ states. Note that if {a=\left|211\right\rangle } or {a=\left|21-1\right\rangle }, then {\left\langle a\right|\mathbf{p}\left|a\right\rangle } has only a {z} component that is non-zero, since the complex exponentials in {\phi} cancel out and the integral of {\sin\phi} or {\cos\phi} in the {x} or {y} components is zero. Similarly, if {a=\frac{1}{\sqrt{2}}\left(-\left|200\right\rangle +\left|210\right\rangle \right)} or {a=\frac{1}{\sqrt{2}}\left(\left|200\right\rangle +\left|210\right\rangle \right)}, the {x} and {y} components are again zero, since these wave functions are independent of {\phi} so the integral of {\sin\phi} or {\cos\phi} in the {x} or {y} components gives zero again. Therefore, {\left\langle \mathbf{p}\right\rangle } is always in the {z} direction, and can be calculated from

\displaystyle  \left\langle \mathbf{p}\right\rangle =-e\left\langle a\right|r\cos\theta\left|a\right\rangle \hat{\mathbf{z}} \ \ \ \ \ (22)

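Doing the integrals, where the only non-zero contribution is {\left\langle 200\right|r\cos\theta\left|210\right\rangle =-3a} from 17, results in (for the four states in the order given in 19)

\displaystyle  \left\langle \mathbf{p}\right\rangle =0,0,-3ae\hat{\mathbf{z}},+3ae\hat{\mathbf{z}} \ \ \ \ \ (23)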


Hall effect

Reference: Griffiths, David J. (2007) Introduction to Electrodynamics, 3rd Edition; Prentice Hall – Chapter 5, Problem 39.

The Hall effect occurs when a current-carrying substance is placed in a magnetic field that is perpendicular to the direction of the current. Suppose we have a wire with a rectangular cross-section that carries current in the {+y} direction. A magnetic field pointing in the {+x} direction is applied to the wire. From the Lorentz force law, a moving charge in the wire feels a magnetic force {q\mathbf{v}\times\mathbf{B}}, so it will be deflected in the {\pm z} direction, where the sign of the deflection depends on the sign of the charge and the direction of motion. If the charges are positive and flowing in the {+y} direction, they are deflected in the {-z} direction.

As a result, a charge imbalance is created inside the wire resulting in an electric field in the {z} direction. Equilibrium is established when the electric and magnetic forces balance, and this happens when {q\mathbf{E}=-q\mathbf{v}\times\mathbf{B}}. For positive charges, this means that {E=vB}, so if the wire has a thickness {t} in the {z} direction, the potential difference across the wire is {Et=vBt}.

If the charges are negative, then to produce the same current as above they would have to be moving in the {-y} direction. Since the direction of motion and the sign of the charges are both opposite to the first case, the negative charges will still be deflected downwards, so the direction of the induced electric field will be reversed. Thus by measuring the sign of the potential difference we can tell whether the charge carriers are positive or negative.
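The sign argument can be checked with an explicit cross product. The sketch below uses unit magnitudes for the charge, speed and field, which are arbitrary choices of mine; only the directions matter.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def magnetic_force(charge, velocity, b_field):
    """Magnetic part of the Lorentz force, q v x B."""
    return tuple(charge * c for c in cross(velocity, b_field))

B_FIELD = (1.0, 0.0, 0.0)  # field along +x

# Positive carriers moving in +y, and negative carriers moving in -y,
# both correspond to a current in the +y direction.
force_positive = magnetic_force(+1.0, (0.0, 1.0, 0.0), B_FIELD)
force_negative = magnetic_force(-1.0, (0.0, -1.0, 0.0), B_FIELD)
print(force_positive, force_negative)
```

Both forces point in the {-z} direction, so carriers of either sign pile up on the same face of the wire, and the sign of the resulting potential difference reveals the sign of the carriers.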

Metric tensor: spherical coordinates

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 5; Problem 5.6.

The non-rectangular coordinate systems (semi-log and sinusoidal) we’ve looked at so far have all been flat, so it’s time to look at one in curved space. We’ll use the surface of a sphere, but rather than the usual spherical coordinates we’ll use a slight variation. We keep the azimuthal angle {\phi} but use as the second coordinate the quantity {r} which is the distance along the surface of the sphere measured from the north pole. If the radius of the sphere is {R}, then in terms of normal spherical coordinates, {r=R\theta}.

Curves of constant {\phi} are the usual lines of longitude, while curves of constant {r} are lines of latitude. The tangents to the two curves at a given point are always perpendicular, so the metric {g_{ij}} will be diagonal. To find the diagonal components, consider an infinitesimal displacement {d\mathbf{s}}. We have

\displaystyle  d\mathbf{s}=dr\mathbf{e}_{r}+d\phi\mathbf{e}_{\phi} \ \ \ \ \ (1)

and our job is to find the two basis vectors.

The displacement along {\mathbf{e}_{r}} is just {dr=Rd\theta}, so {\mathbf{e}_{r}} is a unit vector. A displacement along {\mathbf{e}_{\phi}} depends on the radius of the constant {r} curve. In spherical coordinates, this is {R\sin\theta}, so in our new coordinate system we get the displacement as {R\sin\theta d\phi=R\sin\frac{r}{R}d\phi}. Therefore the magnitude of {\mathbf{e}_{\phi}} is {R\sin\frac{r}{R}}. The metric tensor is thus

\displaystyle  g_{ij}=\left[\begin{array}{cc} 1 & 0\\ 0 & \left(R\sin\frac{r}{R}\right)^{2} \end{array}\right] \ \ \ \ \ (2)
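As a rough numerical check of this metric, we can embed the sphere in ordinary three-dimensional space (an assumption made only for the check) and compare the straight-line distance between two nearby points with the length predicted by {ds^{2}=dr^{2}+\left(R\sin\frac{r}{R}\right)^{2}d\phi^{2}}. The function names and the sample values of {R}, {r}, {\phi} and the displacements are my own.

```python
import math

def embed(r, phi, big_r):
    """Cartesian position of the point (r, phi) on a sphere of radius big_r."""
    theta = r / big_r
    return (big_r * math.sin(theta) * math.cos(phi),
            big_r * math.sin(theta) * math.sin(phi),
            big_r * math.cos(theta))

def metric_length(r, dr, dphi, big_r):
    """Length of a small displacement (dr, dphi) according to the metric."""
    g_rr = 1.0
    g_pp = (big_r * math.sin(r / big_r)) ** 2
    return math.sqrt(g_rr * dr**2 + g_pp * dphi**2)

def chord_length(r, phi, dr, dphi, big_r):
    """Straight-line distance between the two embedded points."""
    return math.dist(embed(r, phi, big_r), embed(r + dr, phi + dphi, big_r))

R = 2.0
ds_metric = metric_length(1.2, 1e-5, 2e-5, R)
ds_chord = chord_length(1.2, 0.7, 1e-5, 2e-5, R)
print(ds_metric, ds_chord)
```

The two lengths agree to first order in the displacement, as they should for an infinitesimal {d\mathbf{s}}.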

Michelson-Morley experiment: length contraction?

Required math: algebra, vectors

Required physics: basics

The outcome of the Michelson-Morley experiment was that the speed of light appeared to be independent of the velocity of the apparatus relative to the postulated universal ether, which was the medium in which light was presumed to travel. If light really did travel in some substance, so that the wave nature of light was due to its propagation through it, then the speed of light should be fixed relative to this ether, in the same way that the speed of other waves such as sound or water waves is fixed relative to their propagation medium. If the apparatus is moving relative to the ether, the velocity of light relative to the apparatus should then vary.

Michelson’s explanation for the null result of his experiment was that the Earth dragged the ether along with it, so that the Earth remained at rest in the ether. This didn’t seem to convince many physicists of the time, since it would imply that every mass dragged its own little aura of ether along with it, which didn’t seem likely (of course, the final explanation – special relativity – didn’t seem very likely at first either).

One other explanation that was proposed at the time was a suggestion made independently by the Dutch physicist Hendrik Lorentz and the Irish physicist George Fitzgerald. This was that all objects contracted in the direction of their motion relative to the ether. As we saw when we analyzed the Michelson-Morley experiment, the round trip time for the light travelling parallel to the motion relative to the ether is

\displaystyle  t_{\parallel}=\frac{2\lambda}{1-v^{2}} \ \ \ \ \ (1)

while the time for the round trip perpendicular to the direction of motion is

\displaystyle  t_{\perp}=\frac{2\lambda}{\sqrt{1-v^{2}}} \ \ \ \ \ (2)

where {\lambda} is the distance from the source to the mirror in each case, and {v} is the velocity of the Earth relative to the ether. The result of the experiment was that {t_{\parallel}=t_{\perp}}.

If {\lambda} is actually different in the two cases, then the equality of the two times could be explained. In particular if we write

\displaystyle   t_{\parallel} \displaystyle  = \displaystyle  \frac{2\lambda_{\parallel}}{1-v^{2}}\ \ \ \ \ (3)
\displaystyle  t_{\perp} \displaystyle  = \displaystyle  \frac{2\lambda_{\perp}}{\sqrt{1-v^{2}}} \ \ \ \ \ (4)

then if

\displaystyle  \lambda_{\parallel}=\lambda_{\perp}\sqrt{1-v^{2}} \ \ \ \ \ (5)

the two times are equal. Thus lengths in the direction of motion are contracted by a factor of {\sqrt{1-v^{2}}}, where {0\le v\le1} (although at the time, the restriction of {v} to be less than 1 wasn’t formally imposed).

This is, of course, the same result as is obtained in special relativity, since there, objects do in fact appear contracted in the direction of motion of one observer with respect to another. There are two crucial differences, however. In relativity, the contraction is a result of the motion of one observer relative to any other observer, not with respect to some background ether. The second difference is the most fundamental; it is that the time as measured by the two observers is not the same.

The possibility that time wasn’t absolute did not occur to Lorentz or Fitzgerald, and as a result their proposal doesn’t work properly. To see this a bit more quantitatively, we can work out the transformations between two coordinate systems assuming only their contraction hypothesis. If our two observers are {G} (at rest relative to the ether, and using Greek letters) and {R} (moving at speed {v} in the {x} direction relative to the ether, and using Roman letters), then {R} will say that any distance measured by {G} as {\xi} is actually shorter, so that {x=\xi\sqrt{1-v^{2}}}, but that the two observers will agree on the times at which events occur, so that {t=\tau}. We can write this as a transformation matrix:

\displaystyle  F_{v}=\left(\begin{array}{cc} 1 & 0\\ v & \sqrt{1-v^{2}} \end{array}\right) \ \ \ \ \ (6)

The formal transformations are therefore

\displaystyle   t \displaystyle  = \displaystyle  \tau\ \ \ \ \ (7)
\displaystyle  x \displaystyle  = \displaystyle  v\tau+\xi\sqrt{1-v^{2}} \ \ \ \ \ (8)

The {v\tau} term, of course, just represents the fact that any point on the {\xi} axis in {G}’s frame is moving with speed {v} relative to {R}. In particular, at {t=\tau=0}, this transformation provides a uniform contraction of all distances in the {x} direction.

This is fine as far as it goes and seems to explain the null result of the experiment, but there are a couple of problems. First, we would expect that the inverse transformation (from {R} to {G}) should be obtained by plugging in {-v} in place of {v}, but if we try that, we get

\displaystyle  F_{-v}=\left(\begin{array}{cc} 1 & 0\\ -v & \sqrt{1-v^{2}} \end{array}\right) \ \ \ \ \ (9)

and it can be seen by direct multiplication that {F_{-v}\ne F_{v}^{-1}}. For reference, the inverse matrix turns out to be

\displaystyle  F_{v}^{-1}=\left(\begin{array}{cc} 1 & 0\\ -\frac{v}{\sqrt{1-v^{2}}} & \frac{1}{\sqrt{1-v^{2}}} \end{array}\right) \ \ \ \ \ (10)

which doesn’t have any obvious meaning as a transformation matrix.
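The failure of {F_{-v}} to invert {F_{v}} is easy to see numerically. This sketch uses {v=0.6} as an arbitrary test value (in units where the speed of light is 1); the helper names are my own.

```python
import math

def contraction_matrix(v):
    """F_v acting on the column (tau, xi) to give (t, x), in units where c = 1."""
    return [[1.0, 0.0], [v, math.sqrt(1.0 - v * v)]]

def mat_mul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

v = 0.6
s = math.sqrt(1.0 - v * v)

# If F_{-v} were the inverse, this product would be the identity matrix.
product = mat_mul(contraction_matrix(-v), contraction_matrix(v))

# The actual inverse, from equation 10.
true_inverse = [[1.0, 0.0], [-v / s, 1.0 / s]]
identity_check = mat_mul(true_inverse, contraction_matrix(v))

print(product)
print(identity_check)
```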

Another problem appears when we consider how this transformation affects light. If we fire two photons in opposite directions along the {\xi} axis, clearly they travel with velocity 1 (in the positive {\xi} direction) and {-1} (in the opposite direction). Under transformation, since {R} also sees the two photons take equal travel times, the two photons should transform the same way. In {G}’s frame, the equations of the two world lines of the photons are

\displaystyle  \xi_{\pm}=\pm\tau \ \ \ \ \ (11)

Under the transformation {F_{v}} we get

\displaystyle   x_{+} \displaystyle  = \displaystyle  (v+\sqrt{1-v^{2}})\tau\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  (v+\sqrt{1-v^{2}})t\ \ \ \ \ (13)
\displaystyle  x_{-} \displaystyle  = \displaystyle  (v-\sqrt{1-v^{2}})\tau\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  (v-\sqrt{1-v^{2}})t \ \ \ \ \ (15)

since {t=\tau}. At the two extremes, we get, first for {v=0}

\displaystyle   x_{+}(0) \displaystyle  = \displaystyle  t\ \ \ \ \ (16)
\displaystyle  x_{-}(0) \displaystyle  = \displaystyle  -t \ \ \ \ \ (17)

which is as it should be, since if {v=0}, {G} and {R} are in the same frame.

For {v=1}, though, we get

\displaystyle   x_{+}(1) \displaystyle  = \displaystyle  t\ \ \ \ \ (18)
\displaystyle  x_{-}(1) \displaystyle  = \displaystyle  t \ \ \ \ \ (19)

For values of {v} in between 0 and 1, the slope of {x_{+}} versus {t} is always greater than 1 (it has a maximum value of {\sqrt{2}} when {v=1/\sqrt{2}}), and the slope of {x_{-}} versus {t} is always greater than {-1} (the slope increases monotonically from {-1} at {v=0} to {+1} at {v=1}). Thus the photons have different speeds in {R}’s frame, and will take different travel times to travel the same length, so this transformation still contradicts the results of the experiment.
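The behaviour of the two slopes {v\pm\sqrt{1-v^{2}}} can be tabulated directly. The sketch below scans {v} from 0 to 1 on a grid of my own choosing and picks out the maximum of the {x_{+}} slope.

```python
import math

def photon_slopes(v):
    """Slopes dx/dt of the two photon world lines in R's frame (c = 1)."""
    s = math.sqrt(1.0 - v * v)
    return v + s, v - s

speeds = [i / 1000.0 for i in range(1001)]
plus = [photon_slopes(v)[0] for v in speeds]
minus = [photon_slopes(v)[1] for v in speeds]

max_plus = max(plus)
v_at_max = speeds[plus.index(max_plus)]
minus_is_increasing = all(b > a for a, b in zip(minus, minus[1:]))

print(f"max slope of x+ = {max_plus:.4f} at v = {v_at_max:.3f}")
print(f"slope of x- increases monotonically: {minus_is_increasing}")
```

The two slopes have equal magnitude only at {v=0}, which is the point of the argument: for any non-zero {v} the two photons travel at different speeds in {R}’s frame.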

The solution to the problem requires special relativity, in which the ether is abolished, the speed of light is independent of the observer’s frame, and the universality of time is abolished. On reflection, this solution is probably much more radical and non-intuitive than the simple contraction proposed by Lorentz and Fitzgerald, but it has passed many experimental tests since.