Welcome to Physics Pages

This blog consists of my notes and solutions to problems in various areas of mainstream physics. An index to the topics covered is contained in the links in the sidebar on the right.

This isn’t a “popular science” site, in that most posts use a fair bit of mathematics to explain their concepts. Thus this blog aims mainly to help those who are learning or reviewing physics in depth. More details on what the site contains and how to use it are here.

Despite Stephen Hawking’s caution that every equation included in a book (or, I suppose, in a blog) would halve the readership, this blog has proved very popular since its inception in December 2010. The total number of hits was approaching 3 million before I switched hosts. (Sadly I can’t carry over the statistics to the new site, so the hit counter shows only those visits since the new site started up.)

Many thanks to my loyal followers and best wishes to everyone who visits. I hope you find it useful. Constructive criticism (or even praise) is always welcome, so feel free to leave a comment in response to any of the posts.

Average of product of two waves

References: Griffiths, David J. (2007), Introduction to Electrodynamics, 3rd Edition; Pearson Education – Chapter 9, Post 11.

A common calculation when analyzing any system that varies sinusoidally in time is a time average over one cycle. For example, a monochromatic plane wave with amplitude {A}, wave vector {\mathbf{k}}, angular frequency {\omega} and phase {\delta} can be written as

\displaystyle   f \displaystyle  = \displaystyle  A\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta\right)=\Re\tilde{A}e^{i\left(\mathbf{k}\cdot\mathbf{r}-\omega t\right)}\ \ \ \ \ (1)
\displaystyle  \tilde{A} \displaystyle  = \displaystyle  Ae^{i\delta} \ \ \ \ \ (2)

Now suppose we have two waves with the same direction and frequency, but different amplitudes and phases. Then

\displaystyle   f \displaystyle  = \displaystyle  A\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta_{a}\right)\ \ \ \ \ (3)
\displaystyle  g \displaystyle  = \displaystyle  B\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta_{b}\right) \ \ \ \ \ (4)

The average of the product of these waves over a single cycle is then

\displaystyle  \left\langle fg\right\rangle =\frac{\omega AB}{2\pi}\int_{0}^{2\pi/\omega}\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta_{a}\right)\cos\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\delta_{b}\right)dt \ \ \ \ \ (5)

We can transform this integral by defining

\displaystyle   \theta \displaystyle  \equiv \displaystyle  \mathbf{k}\cdot\mathbf{r}-\omega t\ \ \ \ \ (6)
\displaystyle  d\theta \displaystyle  = \displaystyle  -\omega dt\ \ \ \ \ (7)
\displaystyle  \left\langle fg\right\rangle \displaystyle  = \displaystyle  \frac{AB}{2\pi}\int_{0}^{2\pi}\cos\left(\theta+\delta_{a}\right)\cos\left(\theta+\delta_{b}\right)d\theta \ \ \ \ \ (8)

We’ve used the limits of 0 and {2\pi} since any interval of {2\pi} covers one complete cycle of {\theta}.

The two cosines have the same period and differ only in their phase, so we will get the same result from the integral if we replace them by

\displaystyle   \cos\left(\theta+\delta_{a}\right)\cos\left(\theta+\delta_{b}\right) \displaystyle  \rightarrow \displaystyle  \cos\theta\cos\left(\theta+\delta_{a}-\delta_{b}\right)\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \cos^{2}\theta\cos\left(\delta_{a}-\delta_{b}\right)-\cos\theta\sin\theta\sin\left(\delta_{a}-\delta_{b}\right) \ \ \ \ \ (10)

We now have

\displaystyle   \left\langle fg\right\rangle \displaystyle  = \displaystyle  \frac{AB}{2\pi}\cos\left(\delta_{a}-\delta_{b}\right)\int_{0}^{2\pi}\cos^{2}\theta d\theta-\frac{AB}{2\pi}\sin\left(\delta_{a}-\delta_{b}\right)\int_{0}^{2\pi}\cos\theta\sin\theta d\theta\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}AB\cos\left(\delta_{a}-\delta_{b}\right)-0\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}AB\cos\left(\delta_{a}-\delta_{b}\right)\ \ \ \ \ (13)
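\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\Re\left(\tilde{f}\tilde{g}^*\right)=\frac{1}{2}\Re\left(\tilde{f}^*\tilde{g}\right) \ \ \ \ \ (14)

where {\tilde{f}=\tilde{A}e^{i\left(\mathbf{k}\cdot\mathbf{r}-\omega t\right)}} and {\tilde{g}=\tilde{B}e^{i\left(\mathbf{k}\cdot\mathbf{r}-\omega t\right)}} are the complex forms of the two waves, with {\tilde{A}=Ae^{i\delta_{a}}} and {\tilde{B}=Be^{i\delta_{b}}}. Thus we can get the answer using complex notation without doing any integrals.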
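As a quick sanity check, the cycle average can be compared numerically with the closed form {\frac{1}{2}AB\cos\left(\delta_{a}-\delta_{b}\right)} and with the complex-amplitude shortcut. The sketch below is only an illustration: the amplitudes, phases, {\omega} and the value used for {\mathbf{k}\cdot\mathbf{r}} are arbitrary choices, not values from the text.

```python
import cmath
import math

def time_average_product(A, delta_a, B, delta_b, omega=2.0, kr=0.3, steps=20000):
    """Average f*g over one period 2*pi/omega using the midpoint rule."""
    period = 2.0 * math.pi / omega
    dt = period / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        f = A * math.cos(kr - omega * t + delta_a)
        g = B * math.cos(kr - omega * t + delta_b)
        total += f * g * dt
    return total / period

# Arbitrary illustrative amplitudes and phases (assumptions, not from the text).
A, delta_a = 1.5, 0.4
B, delta_b = 0.8, -1.1

numeric = time_average_product(A, delta_a, B, delta_b)
closed_form = 0.5 * A * B * math.cos(delta_a - delta_b)

# Complex amplitudes; the common factor exp(i(k.r - omega t)) cancels in f~ g~*.
A_tilde = A * cmath.exp(1j * delta_a)
B_tilde = B * cmath.exp(1j * delta_b)
complex_form = 0.5 * (A_tilde * B_tilde.conjugate()).real

print(numeric, closed_form, complex_form)
```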

This applies to vector products as well, since the components of vector products are just products of scalar functions. For example, the time average of the Poynting vector becomes, when the electric and magnetic fields are written in complex notation:

\displaystyle  \left\langle \mathbf{S}\right\rangle =\frac{1}{2\mu_{0}}\Re\left(\tilde{\mathbf{E}}\times\tilde{\mathbf{B}}^*\right) \ \ \ \ \ (15)

The electromagnetic energy density in the fields has a time average of

\displaystyle   \left\langle u_{em}\right\rangle \displaystyle  = \displaystyle  \frac{1}{4}\Re\left(\epsilon_{0}\tilde{\mathbf{E}}\cdot\tilde{\mathbf{E}}^*+\frac{1}{\mu_{0}}\tilde{\mathbf{B}}\cdot\tilde{\mathbf{B}}^*\right)\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{4}\left(\epsilon_{0}\tilde{\mathbf{E}}\cdot\tilde{\mathbf{E}}^*+\frac{1}{\mu_{0}}\tilde{\mathbf{B}}\cdot\tilde{\mathbf{B}}^*\right)\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{4\mu_{0}}\left(\frac{1}{c^{2}}\tilde{\mathbf{E}}\cdot\tilde{\mathbf{E}}^*+\tilde{\mathbf{B}}\cdot\tilde{\mathbf{B}}^*\right) \ \ \ \ \ (18)

We dropped the {\Re} in the second line (17) since the quantity in parentheses is automatically real, and in the last line we used

\displaystyle  \mu_{0}\epsilon_{0}=\frac{1}{c^{2}} \ \ \ \ \ (19)

Stark effect in hydrogen for n = 1 and n = 2

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 6.36.

The Zeeman effect occurs when an atom is placed in an external magnetic field: the interaction between the field and the magnetic dipole moments of the atom splits the energy levels. The electrical analogue of the Zeeman effect, when an atom is placed in an external electric field, is called the Stark effect. We can use perturbation theory to analyze the effect on the energy levels of the electron.

The perturbation Hamiltonian is, assuming the electric field points in the {z} direction:

\displaystyle  H_{S}^{\prime}=eE_{ext}z=eE_{ext}r\cos\theta \ \ \ \ \ (1)

To use perturbation theory, we’ll need the wave functions for unperturbed hydrogen, which are given in Griffiths as equation 4.89. For the ground state {n=1}, we have

\displaystyle  \left|100\right\rangle =\frac{2}{a^{3/2}}\frac{1}{\sqrt{4\pi}}e^{-r/a} \ \ \ \ \ (2)

Since the ground state is non-degenerate, we can use non-degenerate perturbation theory:

\displaystyle   E_{100,1} \displaystyle  = \displaystyle  \left\langle 100\right|H_{S}^{\prime}\left|100\right\rangle \ \ \ \ \ (3)

Rather than writing out the integral, we observe that {\left\langle 100\right|H_{S}^{\prime}\left|100\right\rangle } contains the integral of {\cos\theta\sin\theta=\frac{1}{2}\sin2\theta} over {0\le\theta\le\pi}, which is zero, so {E_{100,1}=0}.

To analyze {n=2}, we need the four wave functions:

\displaystyle   \left|200\right\rangle \displaystyle  = \displaystyle  R_{20}\left(r\right)Y_{0}^{0}\left(\theta,\phi\right)\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{\sqrt{2}a^{3/2}}\left(1-\frac{r}{2a}\right)\frac{1}{\sqrt{4\pi}}e^{-r/2a}\ \ \ \ \ (5)
\displaystyle  \left|211\right\rangle \displaystyle  = \displaystyle  R_{21}\left(r\right)Y_{1}^{1}\left(\theta,\phi\right)\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  -\left(\frac{3}{8\pi}\right)^{1/2}\frac{1}{\sqrt{24}a^{5/2}}re^{-r/2a}\sin\theta e^{i\phi}\ \ \ \ \ (7)
\displaystyle  \left|210\right\rangle \displaystyle  = \displaystyle  R_{21}\left(r\right)Y_{1}^{0}\left(\theta,\phi\right)\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  \left(\frac{3}{4\pi}\right)^{1/2}\frac{1}{\sqrt{24}a^{5/2}}re^{-r/2a}\cos\theta\ \ \ \ \ (9)
\displaystyle  \left|21-1\right\rangle \displaystyle  = \displaystyle  R_{21}\left(r\right)Y_{1}^{-1}\left(\theta,\phi\right)\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  \left(\frac{3}{8\pi}\right)^{1/2}\frac{1}{\sqrt{24}a^{5/2}}re^{-r/2a}\sin\theta e^{-i\phi} \ \ \ \ \ (11)

Since all four of these states have the same unperturbed energy, we need to use degenerate perturbation theory, so we’ll need to find the matrix {W} with elements

\displaystyle  W_{a,b}=\left\langle a\right|H_{S}^{\prime}\left|b\right\rangle \ \ \ \ \ (12)

where {a} and {b} each represent one of the four states above.

First, we’ll look at the {\theta} integrals. All matrix elements involve integrals of the form (remember that {H_{S}^{\prime}} always contributes a {\cos\theta} and the spherical volume element always contributes a {\sin\theta}):

\displaystyle  I_{nm}=\int_{0}^{\pi}\sin^{n}\theta\cos^{m}\theta d\theta \ \ \ \ \ (13)

For the possible values of {n} and {m} in this problem, the only non-zero integrals of this form are

\displaystyle   I_{12} \displaystyle  = \displaystyle  \frac{2}{3}\ \ \ \ \ (14)
\displaystyle  I_{22} \displaystyle  = \displaystyle  \frac{\pi}{8} \ \ \ \ \ (15)

{I_{12}} arises in {\left\langle 200\right|H_{S}^{\prime}\left|210\right\rangle } (and its transpose) and {I_{22}} arises in {\left\langle 211\right|H_{S}^{\prime}\left|210\right\rangle } and {\left\langle 210\right|H_{S}^{\prime}\left|21-1\right\rangle } (and their transposes). Thus these are the only possible non-zero entries in {W}. However, {\left\langle 211\right|H_{S}^{\prime}\left|210\right\rangle } and {\left\langle 210\right|H_{S}^{\prime}\left|21-1\right\rangle } involve integrating {e^{\pm i\phi}} over {0\le\phi\le2\pi}, which gives zero. Thus the only non-zero matrix elements are {\left\langle 200\right|H_{S}^{\prime}\left|210\right\rangle } and its transpose. This gives

\displaystyle   \left\langle 200\right|H_{S}^{\prime}\left|210\right\rangle \displaystyle  = \displaystyle  \frac{1}{\sqrt{2}a^{3/2}}\left(\frac{3}{4\pi}\right)^{1/2}\frac{1}{\sqrt{24}a^{5/2}}\frac{1}{\sqrt{4\pi}}eE_{ext}\int_{0}^{\infty}\int_{0}^{\pi}\int_{0}^{2\pi}\left(1-\frac{r}{2a}\right)r^{4}e^{-r/a}\cos^{2}\theta\sin\theta d\phi d\theta dr\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  -3aeE_{ext} \ \ \ \ \ (17)
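The value {-3aeE_{ext}} can be checked with a few lines of code. The sketch below assumes the radial integrand is {\left(1-\frac{r}{2a}\right)r^{4}e^{-r/a}} (one power of {r} from {\left|210\right\rangle }, one from {H_{S}^{\prime}} and two from the volume element) and uses the standard result {\int_{0}^{\infty}r^{n}e^{-r/a}dr=n!\,a^{n+1}}. The function names are made up for this illustration, and the result is in units where {eE_{ext}=1}.

```python
import math

def radial_moment(n, a):
    """Integral of r**n * exp(-r/a) from 0 to infinity, equal to n! * a**(n+1)."""
    return math.factorial(n) * a ** (n + 1)

def stark_matrix_element(a=1.0):
    """<200| r cos(theta) |210>, i.e. the matrix element in units where e * E_ext = 1."""
    # Normalisation constants of |200> and |210> as written in the text.
    norm_200 = 1.0 / (math.sqrt(2.0) * a ** 1.5) / math.sqrt(4.0 * math.pi)
    norm_210 = math.sqrt(3.0 / (4.0 * math.pi)) / (math.sqrt(24.0) * a ** 2.5)
    # Radial part: integral of (1 - r/2a) * r**4 * exp(-r/a).
    radial = radial_moment(4, a) - radial_moment(5, a) / (2.0 * a)
    theta_integral = 2.0 / 3.0       # I_12: integral of sin(theta) cos^2(theta)
    phi_integral = 2.0 * math.pi     # integrand has no phi dependence
    return norm_200 * norm_210 * radial * theta_integral * phi_integral

# Should come out very close to -3 * a for any positive a.
print(stark_matrix_element(1.0), stark_matrix_element(2.0))
```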

(The integral can be done with software, or by hand using integration by parts.) The matrix {W} is therefore

\displaystyle  W=\left[\begin{array}{cccc} 0 & 0 & -3aeE_{ext} & 0\\ 0 & 0 & 0 & 0\\ -3aeE_{ext} & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{array}\right] \ \ \ \ \ (18)

The eigenvalues are 0, 0, {\pm3aeE_{ext}} so the {n=2} state splits into 3 states, one with energy {E_{2,0}} (degeneracy 2) and two with energies {E_{2,0}\pm3aeE_{ext}} (each with degeneracy 1). The eigenvectors are {\left[0,1,0,0\right]} and {\left[0,0,0,1\right]} for eigenvalue 0, {\left[-1,0,1,0\right]} for {3aeE_{ext}} and {\left[1,0,1,0\right]} for {-3aeE_{ext}}. Thus the ‘good’ states are

\displaystyle  \left|211\right\rangle ,\left|21-1\right\rangle ,\frac{1}{\sqrt{2}}\left(-\left|200\right\rangle +\left|210\right\rangle \right),\frac{1}{\sqrt{2}}\left(\left|200\right\rangle +\left|210\right\rangle \right) \ \ \ \ \ (19)
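The eigenvalue and eigenvector pairs quoted above are easy to verify by direct multiplication. The following sketch works in units of {aeE_{ext}} and assumes the basis order {\left|200\right\rangle ,\left|211\right\rangle ,\left|210\right\rangle ,\left|21-1\right\rangle } used for the matrix {W}; the helper names are illustrative only.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

w = -3.0  # <200|H'|210> in units of a * e * E_ext
W = [
    [0.0, 0.0, w, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [w, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# (eigenvalue, eigenvector) pairs quoted in the text.
pairs = [
    (0.0, [0, 1, 0, 0]),
    (0.0, [0, 0, 0, 1]),
    (3.0, [-1, 0, 1, 0]),
    (-3.0, [1, 0, 1, 0]),
]

def is_eigenpair(matrix, eigenvalue, vector, tol=1e-12):
    """True if matrix @ vector equals eigenvalue * vector component by component."""
    image = mat_vec(matrix, vector)
    return all(abs(x - eigenvalue * v) < tol for x, v in zip(image, vector))

checks = [is_eigenpair(W, eigenvalue, vector) for eigenvalue, vector in pairs]
print(checks)
```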

The electric dipole moment of hydrogen is (treating the proton and electron as point charges):

\displaystyle   \mathbf{p} \displaystyle  = \displaystyle  -e\mathbf{r}\ \ \ \ \ (20)
\displaystyle  \displaystyle  = \displaystyle  -er\left(\sin\theta\cos\phi\hat{\mathbf{x}}+\sin\theta\sin\phi\hat{\mathbf{y}}+\cos\theta\hat{\mathbf{z}}\right) \ \ \ \ \ (21)

We can work out the expectation value of {\mathbf{p}} in each of the ‘good’ states by straightforward integration: {\left\langle \mathbf{p}\right\rangle =\left\langle a\right|\mathbf{p}\left|a\right\rangle } where {a} stands for one of the ‘good’ states. Note that if {a=\left|211\right\rangle } or {a=\left|21-1\right\rangle }, then the {x} and {y} components of {\left\langle a\right|\mathbf{p}\left|a\right\rangle } vanish, since the complex exponentials in {\phi} cancel out and the integral of {\sin\phi} or {\cos\phi} over a full period is zero. Similarly, if {a=\frac{1}{\sqrt{2}}\left(-\left|200\right\rangle +\left|210\right\rangle \right)} or {a=\frac{1}{\sqrt{2}}\left(\left|200\right\rangle +\left|210\right\rangle \right)}, the {x} and {y} components are again zero, since these wave functions are independent of {\phi}. Therefore, {\left\langle \mathbf{p}\right\rangle } can only point in the {z} direction, and can be calculated from

\displaystyle  \left\langle \mathbf{p}\right\rangle =-e\left\langle a\right|r\cos\theta\left|a\right\rangle \hat{\mathbf{z}} \ \ \ \ \ (22)

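Doing the integrals, and using {\left\langle 200\right|r\cos\theta\left|210\right\rangle =-3a} from (17), results in the following values for the ‘good’ states in the order listed in (19). The state whose energy is raised by {3aeE_{ext}} has its dipole moment antiparallel to the field, as expected from an energy of {-\mathbf{p}\cdot\mathbf{E}}.

\displaystyle  \left\langle \mathbf{p}\right\rangle =0,\;0,\;-3ae\hat{\mathbf{z}},\;+3ae\hat{\mathbf{z}} \ \ \ \ \ (23)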
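The dipole expectation values follow from the single non-zero matrix element of {r\cos\theta} in the {n=2} subspace. The sketch below assumes {\left\langle 200\right|r\cos\theta\left|210\right\rangle =-3a} (the Stark matrix element divided by {eE_{ext}}) and the basis order {\left|200\right\rangle ,\left|211\right\rangle ,\left|210\right\rangle ,\left|21-1\right\rangle }; results are in units of {ae}, and the names are made up for the illustration.

```python
def expectation(matrix, vector):
    """<v|M|v> / <v|v> for a real symmetric matrix and a real vector."""
    norm = sum(v * v for v in vector)
    image = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
    return sum(v * x for v, x in zip(vector, image)) / norm

# Matrix of z = r cos(theta) in units of a; only <200|z|210> and its transpose are non-zero.
Z = [
    [0.0, 0.0, -3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [-3.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# The 'good' states in the order |211>, |21-1>, (-|200>+|210>)/sqrt2, (|200>+|210>)/sqrt2.
good_states = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [-1, 0, 1, 0],
    [1, 0, 1, 0],
]

# p_z = -e <z>, so in units of a*e the dipole is minus the expectation of Z.
dipole_z = [-expectation(Z, state) for state in good_states]
print(dipole_z)
```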


Hall effect

Reference: Griffiths, David J. (2007) Introduction to Electrodynamics, 3rd Edition; Prentice Hall – Chapter 5, Post 39.

The Hall effect occurs when a current-carrying substance is placed in a magnetic field that is perpendicular to the direction of the current. Suppose we have a wire with a rectangular cross-section that carries current in the {+y} direction. A magnetic field pointing in the {+x} direction is applied to the wire. From the Lorentz force law, a moving charge in the wire feels a magnetic force {q\mathbf{v}\times\mathbf{B}}, so it will be deflected in the {\pm z} direction, where the sign of the deflection depends on the sign of the charge and the direction of motion. If the charges are positive and flowing in the {+y} direction, they are deflected in the {-z} direction.

As a result, a charge imbalance is created inside the wire resulting in an electric field in the {z} direction. Equilibrium is established when the electric and magnetic forces balance, and this happens when {q\mathbf{E}=-q\mathbf{v}\times\mathbf{B}}. For positive charges, this means that {E=vB}, so if the wire has a thickness {t} in the {z} direction, the potential difference across the wire is {Et=vBt}.

If the charges are negative, then to produce the same current as above they would have to be moving in the {-y} direction. Since the direction of motion and the sign of the charges are both opposite to the first case, the negative charges will still be deflected downwards (in the {-z} direction). The charge that accumulates there now has the opposite sign, so the direction of the induced electric field will be reversed. Thus by measuring the sign of the potential difference we can tell whether the charge carriers are positive or negative.
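The sign argument can be made concrete with a small cross-product calculation. This is only a sketch with unit magnitudes for the charge, speed and field (arbitrary choices), using the same geometry as above: current along {+y} and {\mathbf{B}} along {+x}.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def magnetic_force(q, velocity, field):
    """Magnetic part of the Lorentz force, q v x B."""
    return tuple(q * component for component in cross(velocity, field))

def hall_field_z(velocity, field):
    """z component of the equilibrium field, from q E = -q v x B (q cancels)."""
    return -cross(velocity, field)[2]

B = (1.0, 0.0, 0.0)          # field along +x
v_positive = (0.0, 1.0, 0.0)  # positive carriers move along +y
v_negative = (0.0, -1.0, 0.0) # negative carriers move along -y for the same current

force_positive = magnetic_force(+1.0, v_positive, B)
force_negative = magnetic_force(-1.0, v_negative, B)

# Both carrier types are pushed towards -z, but the Hall field has opposite signs.
print(force_positive[2], force_negative[2])
print(hall_field_z(v_positive, B), hall_field_z(v_negative, B))
```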

Metric tensor: spherical coordinates

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 5; Problem 5.6.

The non-rectangular coordinate systems (semi-log and sinusoidal) we’ve looked at so far have all been flat, so it’s time to look at one in curved space. We’ll use the surface of a sphere, but rather than the usual spherical coordinates we’ll use a slight variation. We keep the azimuthal angle {\phi} but use as the second coordinate the quantity {r} which is the distance along the surface of the sphere measured from the north pole. If the radius of the sphere is {R}, then in terms of normal spherical coordinates, {r=R\theta}.

Curves of constant {\phi} are the usual lines of longitude, while curves of constant {r} are lines of latitude. The tangents to the two curves at a given point are always perpendicular, so the metric {g_{ij}} will be diagonal. To find the diagonal components, consider an infinitesimal displacement {d\mathbf{s}}. We have

\displaystyle  d\mathbf{s}=dr\mathbf{e}_{r}+d\phi\mathbf{e}_{\phi} \ \ \ \ \ (1)

and our job is to find the two basis vectors.

The displacement along {\mathbf{e}_{r}} is just {dr=Rd\theta}, so {\mathbf{e}_{r}} is a unit vector. A displacement along {\mathbf{e}_{\phi}} depends on the radius of the constant {r} curve. In spherical coordinates, this is {R\sin\theta}, so in our new coordinate system we get the displacement as {R\sin\theta d\phi=R\sin\frac{r}{R}d\phi}. Therefore the magnitude of {\mathbf{e}_{\phi}} is {R\sin\frac{r}{R}}. The metric tensor is thus

\displaystyle  g_{ij}=\left[\begin{array}{cc} 1 & 0\\ 0 & \left(R\sin\frac{r}{R}\right)^{2} \end{array}\right] \ \ \ \ \ (2)
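One way to check this metric is to embed the sphere in ordinary 3-d space, estimate the basis vectors {\mathbf{e}_{r}} and {\mathbf{e}_{\phi}} by finite differences, and form their dot products. The sketch below does this at one arbitrarily chosen point; the embedding function, the step size and the sample values of {R}, {r} and {\phi} are assumptions made for the illustration.

```python
import math

def embed(r, phi, R):
    """Cartesian position of the point with coordinates (r, phi) on a sphere of radius R."""
    theta = r / R
    return (
        R * math.sin(theta) * math.cos(phi),
        R * math.sin(theta) * math.sin(phi),
        R * math.cos(theta),
    )

def basis_vectors(r, phi, R, h=1e-6):
    """Central-difference estimates of e_r and e_phi (partial derivatives of the embedding)."""
    e_r = tuple((p - m) / (2 * h)
                for p, m in zip(embed(r + h, phi, R), embed(r - h, phi, R)))
    e_phi = tuple((p - m) / (2 * h)
                  for p, m in zip(embed(r, phi + h, R), embed(r, phi - h, R)))
    return e_r, e_phi

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def metric(r, phi, R):
    """Metric components g_ij = e_i . e_j as a 2x2 nested list."""
    e_r, e_phi = basis_vectors(r, phi, R)
    return [[dot(e_r, e_r), dot(e_r, e_phi)],
            [dot(e_phi, e_r), dot(e_phi, e_phi)]]

R, r, phi = 2.0, 1.3, 0.7   # arbitrary sample point
g = metric(r, phi, R)
expected_g_phiphi = (R * math.sin(r / R)) ** 2
print(g, expected_g_phiphi)
```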

Michelson-Morley experiment: length contraction?

Required math: algebra, vectors

Required physics: basics

The outcome of the Michelson-Morley experiment was that the speed of light appeared to be independent of the velocity of the apparatus relative to the postulated universal ether, which was the medium in which light was presumed to travel. If light really did travel in some substance, so that the wave nature of light was due to its propagation through it, then the speed of light should be fixed relative to this ether, in the same way that the speed of other waves such as sound or water waves is fixed relative to their propagation medium. If the apparatus is moving relative to the ether, the velocity of light relative to the apparatus should then vary.

Michelson’s explanation for the null result of his experiment was that the Earth dragged the ether along with it, so that the Earth remained at rest in the ether. This didn’t seem to convince many physicists of the time, since it would imply that every mass dragged its own little aura of ether along with it, which didn’t seem likely (of course, the final explanation – special relativity – didn’t seem very likely at first either).

One other explanation that was proposed at the time was a suggestion made independently by the Dutch physicist Hendrik Lorentz and the Irish physicist George Fitzgerald. This was that all objects contracted in the direction of their motion relative to the ether. As we saw when we analyzed the Michelson-Morley experiment, the round trip time for the light travelling parallel to the motion relative to the ether is

\displaystyle  t_{\parallel}=\frac{2\lambda}{1-v^{2}} \ \ \ \ \ (1)

while the time for the round trip perpendicular to the direction of motion is

\displaystyle  t_{\perp}=\frac{2\lambda}{\sqrt{1-v^{2}}} \ \ \ \ \ (2)

where {\lambda} is the distance from the source to the mirror in each case, and {v} is the velocity of the Earth relative to the ether. The result of the experiment was that {t_{\parallel}=t_{\perp}}.

If {\lambda} is actually different in the two cases, then the equality of the two times could be explained. In particular if we write

\displaystyle   t_{\parallel} \displaystyle  = \displaystyle  \frac{2\lambda_{\parallel}}{1-v^{2}}\ \ \ \ \ (3)
\displaystyle  t_{\perp} \displaystyle  = \displaystyle  \frac{2\lambda_{\perp}}{\sqrt{1-v^{2}}} \ \ \ \ \ (4)

then if

\displaystyle  \lambda_{\parallel}=\lambda_{\perp}\sqrt{1-v^{2}} \ \ \ \ \ (5)

the two times are equal. Thus lengths in the direction of motion are contracted by a factor of {\sqrt{1-v^{2}}}, where {0\le v\le1} (although at the time, the restriction of {v} to be less than 1 wasn’t formally imposed).

This is, of course, the same result as is obtained in special relativity, since there, objects do in fact appear contracted in the direction of motion of one observer with respect to another. There are two crucial differences, however. In relativity, the contraction is a result of the motion of one observer relative to any other observer, not with respect to some background ether. The second difference is the most fundamental; it is that the time as measured by the two observers is not the same.

The possibility that time wasn’t absolute did not occur to Lorentz or Fitzgerald, and as a result their proposal doesn’t work properly. To see this a bit more quantitatively, we can work out the transformations between two coordinate systems assuming only their contraction hypothesis. If our two observers are {G} (at rest relative to the ether, and using Greek letters) and {R} (moving at speed {v} in the {x} direction relative to the ether, and using Roman letters), then {R} will say that any distance measured by {G} as {\xi} is actually shorter, so that {x=\xi\sqrt{1-v^{2}}}, but that the two observers will agree on the times at which events occur, so that {t=\tau}. We can write this as a transformation matrix:

\displaystyle  F_{v}=\left(\begin{array}{cc} 1 & 0\\ v & \sqrt{1-v^{2}} \end{array}\right) \ \ \ \ \ (6)

The formal transformations are therefore

\displaystyle   t \displaystyle  = \displaystyle  \tau\ \ \ \ \ (7)
\displaystyle  x \displaystyle  = \displaystyle  v\tau+\xi\sqrt{1-v^{2}} \ \ \ \ \ (8)

The {v\tau} term, of course, just represents the fact that any point on the {\xi} axis in {G}’s frame is moving with speed {v} relative to {R}. In particular, at {t=\tau=0}, this transformation provides a uniform contraction of all distances in the {x} direction.

This is fine as far as it goes and seems to explain the null result of the experiment, but there are a couple of problems. First, we would expect that the inverse transformation (from {R} to {G}) should be obtained by plugging in {-v} in place of {v}, but if we try that, we get

\displaystyle  F_{-v}=\left(\begin{array}{cc} 1 & 0\\ -v & \sqrt{1-v^{2}} \end{array}\right) \ \ \ \ \ (9)

and it can be seen by direct multiplication that {F_{-v}\ne F_{v}^{-1}}. For reference, the inverse matrix turns out to be

\displaystyle  F_{v}^{-1}=\left(\begin{array}{cc} 1 & 0\\ -\frac{v}{\sqrt{1-v^{2}}} & \frac{1}{\sqrt{1-v^{2}}} \end{array}\right) \ \ \ \ \ (10)

which doesn’t have any obvious meaning as a transformation matrix.
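The failure of {F_{-v}} to invert {F_{v}} can be seen with a concrete number. The sketch below picks {v=0.6} (an arbitrary illustrative value, for which {\sqrt{1-v^{2}}=0.8}) and multiplies the matrices out; the helper functions are written just for this example.

```python
import math

def contraction_matrix(v):
    """F_v acting on the column (tau, xi), in units where c = 1."""
    s = math.sqrt(1.0 - v * v)
    return [[1.0, 0.0], [v, s]]

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

v = 0.6  # illustrative speed
F_v = contraction_matrix(v)
F_minus_v = contraction_matrix(-v)

naive_product = mat_mul(F_minus_v, F_v)        # identity only if F_{-v} were the inverse
true_product = mat_mul(inverse_2x2(F_v), F_v)  # identity by construction

print(naive_product)
print(inverse_2x2(F_v), true_product)
```

For this choice of {v} the first product has a lower row of roughly {(-0.12,\,0.64)} rather than {(0,\,1)}, while the explicit inverse has lower row {(-0.75,\,1.25)}, matching {-v/\sqrt{1-v^{2}}} and {1/\sqrt{1-v^{2}}}.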

Another problem appears when we consider how this transformation affects light. If we fire two photons in opposite directions along the {\xi} axis, clearly they travel with velocity 1 (in the positive {\xi} direction) and {-1} (in the opposite direction). Under transformation, since {R} also sees the two photons take equal travel times, the two photons should transform the same way. In {G}’s frame, the equations of the two world lines of the photons are

\displaystyle  \xi_{\pm}=\pm\tau \ \ \ \ \ (11)

Under the transformation {F_{v}} we get

\displaystyle   x_{+} \displaystyle  = \displaystyle  (v+\sqrt{1-v^{2}})\tau\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  (v+\sqrt{1-v^{2}})t\ \ \ \ \ (13)
\displaystyle  x_{-} \displaystyle  = \displaystyle  (v-\sqrt{1-v^{2}})\tau\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  (v-\sqrt{1-v^{2}})t \ \ \ \ \ (15)

since {t=\tau}. At the two extremes, we get, first for {v=0}

\displaystyle   x_{+}(0) \displaystyle  = \displaystyle  t\ \ \ \ \ (16)
\displaystyle  x_{-}(0) \displaystyle  = \displaystyle  -t \ \ \ \ \ (17)

which is as it should be, since if {v=0}, {G} and {R} are in the same frame.

For {v=1}, though, we get

\displaystyle   x_{+}(1) \displaystyle  = \displaystyle  t\ \ \ \ \ (18)
\displaystyle  x_{-}(1) \displaystyle  = \displaystyle  t \ \ \ \ \ (19)

For values of {v} in between 0 and 1, the slope of {x_{+}} versus {t} is always greater than 1 (it has a maximum value of {\sqrt{2}} when {v=1/\sqrt{2}}), and the slope of {x_{-}} versus {t} is always greater than {-1} (the slope increases monotonically from {-1} at {v=0} to {+1} at {v=1}). Thus the photons have different speeds in {R}’s frame, and will take different travel times to travel the same length, so this transformation still contradicts the results of the experiment.
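The behaviour of the two slopes {v\pm\sqrt{1-v^{2}}} can be checked by scanning {v} over a grid. This is a rough numerical sketch (the grid spacing of 0.001 is an arbitrary choice), not a proof; setting the derivative of {v+\sqrt{1-v^{2}}} to zero gives the maximum at {v=1/\sqrt{2}} exactly, where the slope is {\sqrt{2}}.

```python
import math

def slope_plus(v):
    """Speed of the forward photon in R's frame under F_v."""
    return v + math.sqrt(1.0 - v * v)

def slope_minus(v):
    """Velocity of the backward photon in R's frame under F_v."""
    return v - math.sqrt(1.0 - v * v)

grid = [i / 1000 for i in range(1001)]   # v from 0 to 1 inclusive

v_best = max(grid, key=slope_plus)
max_slope = slope_plus(v_best)

# Check that the backward-photon slope only ever increases along the grid.
minus_is_monotonic = all(
    slope_minus(grid[i]) < slope_minus(grid[i + 1]) for i in range(len(grid) - 1)
)

print(v_best, max_slope, math.sqrt(2.0), minus_is_monotonic)
```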

The solution to the problem requires special relativity, in which the ether is abolished, the speed of light is independent of the observer’s frame, and the universality of time is abolished. On reflection, this solution is probably much more radical and non-intuitive than the simple contraction proposed by Lorentz and Fitzgerald, but it has passed many experimental tests since.

Colour-colour diagrams

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 3, Problem 3.15.

We can use the colour indices of a star to determine its magnitudes in the three colour regions of ultraviolet, blue and visual. From Appendix G in Carroll & Ostlie, we have, for the Sun:

\displaystyle   M_{bol} \displaystyle  = \displaystyle  +4.74\ \ \ \ \ (1)
\displaystyle  BC \displaystyle  = \displaystyle  -0.08\ \ \ \ \ (2)
\displaystyle  M_{V} \displaystyle  = \displaystyle  +4.82\ \ \ \ \ (3)
\displaystyle  U-B \displaystyle  = \displaystyle  +0.195\ \ \ \ \ (4)
\displaystyle  B-V \displaystyle  = \displaystyle  +0.650 \ \ \ \ \ (5)

We can get the absolute magnitudes from

\displaystyle   M_{B} \displaystyle  = \displaystyle  \left(B-V\right)+M_{V}=+5.47\ \ \ \ \ (6)
\displaystyle  M_{U} \displaystyle  = \displaystyle  \left(U-B\right)+M_{B}=+5.665 \ \ \ \ \ (7)

We can get the apparent magnitude from the relation

\displaystyle  m=M+5\log d-5 \ \ \ \ \ (8)

where {d} is the distance in parsecs. The Sun has

\displaystyle  d=1\mbox{ AU}=4.848137\times10^{-6}\mbox{ pc} \ \ \ \ \ (9)

so its apparent visual magnitude is

\displaystyle   V \displaystyle  = \displaystyle  M_{V}+5\left(\log d-1\right)\ \ \ \ \ (10)
\displaystyle  \displaystyle  = \displaystyle  +4.82+5\left(\log\left(4.848137\times10^{-6}\right)-1\right)\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  -26.752 \ \ \ \ \ (12)

The other two apparent magnitudes are

\displaystyle   B \displaystyle  = \displaystyle  \left(B-V\right)+V=-26.102\ \ \ \ \ (13)
\displaystyle  U \displaystyle  = \displaystyle  \left(U-B\right)+B=-25.907 \ \ \ \ \ (14)
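The arithmetic for the Sun's magnitudes is short enough to reproduce in a few lines. The sketch below uses only the numbers quoted above; the function and variable names are chosen for this illustration.

```python
import math

AU_IN_PARSEC = 4.848137e-6   # 1 AU expressed in parsecs, as quoted in the text

def apparent_from_absolute(M, d_pc):
    """Distance modulus relation m = M + 5 log10(d) - 5, with d in parsecs."""
    return M + 5.0 * math.log10(d_pc) - 5.0

# Solar values quoted in the text.
M_V = 4.82
B_minus_V = 0.650
U_minus_B = 0.195

# Absolute magnitudes in B and U from the colour indices.
M_B = B_minus_V + M_V
M_U = U_minus_B + M_B

# Apparent magnitudes at 1 AU.
V = apparent_from_absolute(M_V, AU_IN_PARSEC)
B = B_minus_V + V
U = U_minus_B + B

print(round(M_B, 3), round(M_U, 3))
print(round(V, 3), round(B, 3), round(U, 3))
```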

For Sirius, the colour indices are measured as (from Example 3.6.1 in Carroll & Ostlie):

\displaystyle   U-B \displaystyle  = \displaystyle  -0.04\ \ \ \ \ (15)
\displaystyle  B-V \displaystyle  = \displaystyle  +0.01 \ \ \ \ \ (16)

The two colour indices can be plotted on a diagram called a colour-colour diagram in which the vertical axis is {U-B} (increasing downwards) and the horizontal axis is {B-V} (increasing to the right). A typical colour-colour diagram is shown here (this diagram is by Brews ohare, from the Wikipedia page on the colour-colour diagram):

Main sequence stars comprise the majority of ‘normal’ stars in the galaxy; supergiant stars are, as the name implies, very large stars. The black line shows the ideal curve for blackbodies. The hottest stars are on the left, ranging through intermediate temperatures to the coolest stars in the lower right. In general, most stars lie below the blackbody curve, indicating that their {U-B} values are larger than that of a blackbody at the same temperature. That is, most stars tend to have an excess of blue light over ultraviolet light as compared to a similar blackbody.

If we plot the Sun and Sirius on a colour-colour diagram, we get:

Again, the straight line is the blackbody curve. The red cross is the Sun and the blue cross (near the origin) is Sirius. Sirius lies to the left of the Sun, so it is hotter.

Stefan-Boltzmann constant: luminosity of a star as a blackbody

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 3, Problem 3.14.

The blackbody radiation rate in terms of wavelength is

\displaystyle  B_{\lambda}\left(T\right)=\frac{2hc^{2}}{\lambda^{5}\left(e^{hc/\lambda k_{B}T}-1\right)} \ \ \ \ \ (1)

{B_{\lambda}} is the rate at which a blackbody at temperature {T} radiates in watts per unit area, per unit wavelength band, per steradian. The amount of radiation emitted in a wavelength span {\left[\lambda,\lambda+d\lambda\right]} is therefore {B_{\lambda}d\lambda}. As we saw earlier, the total rate of emission of energy per unit area, integrated over all wavelengths and over solid angle is

\displaystyle  j=\frac{2\pi^{5}k_{B}^{4}}{15h^{3}c^{2}}T^{4} \ \ \ \ \ (2)

The constant multiplying {T^{4}} is the Stefan-Boltzmann constant, defined as

\displaystyle  \sigma\equiv\frac{2\pi^{5}k_{B}^{4}}{15h^{3}c^{2}} \ \ \ \ \ (3)

The value of {\sigma} is

\displaystyle   \sigma \displaystyle  = \displaystyle  \frac{2\pi^{5}\left(1.3806488\times10^{-23}\right)^{4}}{15\left(6.62606957\times10^{-34}\right)^{3}\left(2.99792458\times10^{8}\right)^{2}}\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  5.670373\times10^{-8}\mbox{ W m}^{-2}\mbox{K}^{-4} \ \ \ \ \ (5)

This value agrees with the CODATA 2010 recommended value of the constant, the same data set from which the values of {k_{B}} and {h} used above are taken.

If the star radiates uniformly over its entire surface, then the luminosity of the star is {j} times the surface area, so

\displaystyle  L=4\pi R^{2}j=\frac{8\pi^{6}R^{2}k_{B}^{4}}{15h^{3}c^{2}}T^{4}=4\pi R^{2}\sigma T^{4} \ \ \ \ \ (6)
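Both the constant and the luminosity formula are simple to evaluate. In the sketch below the physical constants are the ones quoted above; the solar radius of {6.955\times10^{8}\mbox{ m}} and the temperature of 5777 K are assumed illustrative inputs that do not appear in this post.

```python
import math

K_B = 1.3806488e-23    # Boltzmann constant, J/K (value quoted in the text)
H = 6.62606957e-34     # Planck constant, J s (value quoted in the text)
C = 2.99792458e8       # speed of light, m/s (value quoted in the text)

def stefan_boltzmann_constant():
    """sigma = 2 pi^5 k_B^4 / (15 h^3 c^2), in W m^-2 K^-4."""
    return 2.0 * math.pi ** 5 * K_B ** 4 / (15.0 * H ** 3 * C ** 2)

def blackbody_luminosity(radius_m, temperature_k):
    """L = 4 pi R^2 sigma T^4 for a uniformly radiating spherical blackbody, in watts."""
    sigma = stefan_boltzmann_constant()
    return 4.0 * math.pi * radius_m ** 2 * sigma * temperature_k ** 4

sigma = stefan_boltzmann_constant()

# Assumed illustrative solar parameters (not taken from this post).
R_SUN = 6.955e8   # metres
T_SUN = 5777.0    # kelvin
L_sun = blackbody_luminosity(R_SUN, T_SUN)

print(sigma, L_sun)
```

With these assumed inputs the luminosity comes out at a little under {4\times10^{26}\mbox{ W}}, the right order of magnitude for the Sun.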

Blackbody radiation in the frequency domain

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 3, Problem 3.13.

The blackbody radiation rate in terms of wavelength is

\displaystyle  B_{\lambda}\left(T\right)=\frac{2hc^{2}}{\lambda^{5}\left(e^{hc/\lambda k_{B}T}-1\right)} \ \ \ \ \ (1)

{B_{\lambda}} is the rate at which a blackbody at temperature {T} radiates in watts per unit area, per unit wavelength band, per steradian. The amount of radiation emitted in a wavelength span {\left[\lambda,\lambda+d\lambda\right]} is therefore {B_{\lambda}d\lambda}. To convert this to a rate per unit frequency interval, we need to convert {B_{\lambda}d\lambda} to the equivalent form {B_{\nu}d\nu}. We have

\displaystyle   \nu \displaystyle  = \displaystyle  \frac{c}{\lambda}\ \ \ \ \ (2)
\displaystyle  d\nu \displaystyle  = \displaystyle  -\frac{c}{\lambda^{2}}d\lambda\ \ \ \ \ (3)
\displaystyle  \left|B_{\lambda}d\lambda\right| \displaystyle  = \displaystyle  \left|\frac{\lambda^{2}}{c}B_{\lambda}d\nu\right|\ \ \ \ \ (4)
\displaystyle  B_{\nu} \displaystyle  = \displaystyle  \frac{\lambda^{2}}{c}B_{\lambda}\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  \frac{2hc}{\lambda^{3}\left(e^{hc/\lambda k_{B}T}-1\right)}\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  \frac{2h\nu^{3}}{c^{2}\left(e^{h\nu/k_{B}T}-1\right)} \ \ \ \ \ (7)

The frequency at which {B_{\nu}} is a maximum is found as usual by solving {dB_{\nu}/d\nu=0}. As in the case of finding {\lambda_{max}}, this gives rise to an equation that must be solved numerically. We get

\displaystyle   \frac{dB_{\nu}}{d\nu} \displaystyle  = \displaystyle  \frac{6h\nu^{2}}{c^{2}\left(e^{h\nu/k_{B}T}-1\right)}-\frac{2h^{2}\nu^{3}e^{h\nu/k_{B}T}}{c^{2}k_{B}T\left(e^{h\nu/k_{B}T}-1\right)^{2}}=0\ \ \ \ \ (8)
\displaystyle  3 \displaystyle  = \displaystyle  e^{h\nu/k_{B}T}\left(3-\frac{h\nu}{k_{B}T}\right)\ \ \ \ \ (9)
\displaystyle  3 \displaystyle  = \displaystyle  e^{x}\left(3-x\right) \ \ \ \ \ (10)

where {x\equiv h\nu/k_{B}T}. Solving this using Maple we get {x=2.821439372} (or {x=0}, but that isn’t very interesting) and then plugging in {h=6.62606957\times10^{-34}} and {k_{B}=1.3806488\times10^{-23}} (in SI units) we get

\displaystyle  \nu_{max}=5.879\times10^{10}T\mbox{ s}^{-1} \ \ \ \ \ (11)

For the Sun, {T=5777\mbox{ K}} so {\nu_{max}=3.396\times10^{14}\mbox{ s}^{-1}} which corresponds to a wavelength of

\displaystyle  \lambda=\frac{c}{\nu_{max}}=883\mbox{ nm} \ \ \ \ \ (12)

This is in the infrared region. It differs from {\lambda_{max}=\left(2.898\times10^{-3}\mbox{ m K}\right)/T=502\mbox{ nm}}, as calculated from Wien’s displacement law, because {B_{\lambda}} and {B_{\nu}} measure different things: {B_{\lambda}} is the radiation per unit wavelength interval and {B_{\nu}} is the radiation per unit frequency interval and, as we’ve seen from equation (3), these two intervals are not linearly related. The size of the frequency interval corresponding to a given wavelength interval depends on the wavelength (and vice versa).
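The transcendental equation {3=e^{x}\left(3-x\right)} does not need Maple; a simple bisection is enough. The sketch below brackets the non-trivial root between 2 and 3 (the left-hand side minus the right-hand side changes sign there) and then evaluates {\nu_{max}} and the corresponding wavelength for {T=5777\mbox{ K}}. The bisection routine and its tolerance are choices made for this illustration.

```python
import math

H = 6.62606957e-34     # Planck constant, J s (value quoted in the text)
K_B = 1.3806488e-23    # Boltzmann constant, J/K (value quoted in the text)
C = 2.99792458e8       # speed of light, m/s

def peak_condition(x):
    """Vanishes at the maximum of B_nu, from 3 = e^x (3 - x)."""
    return math.exp(x) * (3.0 - x) - 3.0

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

x_peak = bisect(peak_condition, 2.0, 3.0)
nu_per_kelvin = x_peak * K_B / H          # nu_max / T in Hz per kelvin

T_SUN = 5777.0
nu_max = nu_per_kelvin * T_SUN
wavelength_nm = C / nu_max * 1e9

print(x_peak, nu_per_kelvin, nu_max, wavelength_nm)
```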