Required math: algebra, calculus
Required physics: none
Reference: d’Inverno, Ray, Introducing Einstein’s Relativity (1992), Oxford University Press. Section 6.4; Problem 6.7.
From basic linear algebra, we’re familiar with the idea of moving a vector parallel to itself. To add two vectors A and B in 3-d Euclidean space, for example, we’re usually told to move the vector B parallel to itself so that its tail coincides with the head of A and then draw the sum as the vector from the tail of A to the head of B.
Moving a vector in this way always works in ‘flat’ space such as 3-d space spanned by the three rectangular basis vectors because this space is uniform everywhere. To see that this isn’t always the case for different types of space, consider the example of a sphere.
Here, the notion of a vector has to be defined a bit more carefully. In 3-d flat space, a vector belongs to the same space as the space used to define the points between which we can move. Said another way, the tangent space to a point in 3-d flat space is just that very 3-d flat space itself. On a sphere, though, the tangent space to a point on the sphere is a plane which intersects the sphere at that single point. The tangent space (the plane) is no longer the same space as the surface of the sphere. The vector tangent to the sphere at a point exists in a different space from the sphere itself.
To make matters worse, every tangent vector on the sphere exists in a different tangent space, since the plane tangent to the sphere varies as we move around the sphere. Thus if we try to move the tangent vector and, in some sense, keep it parallel to itself, we are faced with the problem of comparing vectors from different tangent spaces.
A common but illustrative example is to start with a vector tangent to the Earth (assuming the Earth is a sphere, which isn’t quite right, but it will do) at the intersection of the Greenwich meridian (0° longitude) and the equator (0° latitude), with the vector pointing north. If we now walk due north along this meridian until we reach the north pole and keep the vector parallel to itself (in the sense of keeping it pointing due north and tangent to the sphere at each point), then when we reach the north pole, we’ll have a vector that is horizontal and which points down the 180° meridian.
However, if we return to our starting point on the equator and now walk due east until we reach a longitude of 90° E, still keeping our vector pointing due north, then walk north along this meridian until we again reach the north pole, we will now end up with a vector that is horizontal and pointing down the 90° W meridian. The angle between the two north pole vectors is thus 90°, even though we took great care to keep the vector ‘constant’ as we moved it through the two paths.
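The two walks can be checked numerically with a short sketch. It assumes a unit sphere embedded in 3-d space, with the x axis through the point at 0° latitude and 0° longitude, the y axis through 90° E on the equator, and the z axis through the north pole. It also relies on the fact that the equator and the meridians are great circles, and that carrying a tangent vector along a great circle without turning it is the same as rigidly rotating the sphere about the axis perpendicular to that circle's plane. The helper `rotate` and the variable names are illustrative only.

```python
import math

def rotate(v, axis, angle):
    """Rotate the 3-vector v about a coordinate axis ('x', 'y' or 'z')
    by `angle` radians, using the right-hand rule."""
    x, y, z = v
    c, s = math.cos(angle), math.sin(angle)
    if axis == 'x':
        return (x, c * y - s * z, s * y + c * z)
    if axis == 'y':
        return (c * x + s * z, y, -s * x + c * z)
    if axis == 'z':
        return (c * x - s * y, s * x + c * y, z)
    raise ValueError("axis must be 'x', 'y' or 'z'")

quarter = math.pi / 2

# Start on the equator at 0 longitude with the vector pointing north.
p0 = (1.0, 0.0, 0.0)
v0 = (0.0, 0.0, 1.0)

# Path 1: straight up the Greenwich meridian (a rotation about the y axis).
p1 = rotate(p0, 'y', -quarter)
v1 = rotate(v0, 'y', -quarter)

# Path 2: east along the equator to 90 E (rotation about z),
# then north along that meridian (rotation about x).
p2 = rotate(rotate(p0, 'z', quarter), 'x', quarter)
v2 = rotate(rotate(v0, 'z', quarter), 'x', quarter)

# Angle between the two transported vectors at the pole.
dot = sum(a * b for a, b in zip(v1, v2))
angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, dot))))

print("path 1 ends at", p1, "with vector", v1)
print("path 2 ends at", p2, "with vector", v2)
print("angle between them (degrees):", angle_deg)
```

Both paths end at the pole (0, 0, 1). The first vector comes out along the negative x axis, which is the direction of the 180° meridian, and the second along the negative y axis, which is the direction of the 90° W meridian.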
This example serves to show that if we are in a curved space, the notion of parallel transport of a vector (and by extension, a tensor) depends on the path we take. This is in fact a general problem and there isn’t a clever way of defining some new term that eliminates this effect.
How can we describe this situation mathematically? Returning to our 3-d flat-space situation for a moment, suppose we define some curve by means of a parameter $u$, so that
$$x^a = x^a(u)$$
Now suppose we have a vector v and we want to transport this vector along this curve in such a way that the vector doesn’t change. Mathematically, what we’re saying is that
$$\frac{dv^a}{du} = 0$$
for each vector component $v^a$. Introducing the coordinates, we can expand this using the chain rule to get
$$\frac{dv^a}{du} = \frac{\partial v^a}{\partial x^b}\frac{dx^b}{du} = 0$$
Since the tangent vector to the curve has components $dx^b/du$, this equation is taking the contraction of the tangent vector with the derivatives of $v^a$.
Now we’ve seen that in general, the partial derivative of a vector is not a tensor, and that we got round this problem by introducing the covariant derivative, which is a tensor. Since the tangent vector is a tensor we don’t need to modify it, but to make the derivative of $v^a$ along the curve a tensor, we need to generalize the ordinary derivative above to a covariant derivative. That is, we define the condition for parallel transport of a vector in a general curved coordinate system to be
$$\frac{dx^b}{du}\nabla_b v^a = 0$$
Expanding the covariant derivative, we get
$$\frac{dx^b}{du}\left(\partial_b v^a + \Gamma^a_{cb}v^c\right) = 0$$
We see that if the connections $\Gamma^a_{cb}$ are all zero, this reduces to the flat-space case. Further, the condition for parallel transport becomes
$$\frac{dv^a}{du} + \Gamma^a_{cb}v^c\frac{dx^b}{du} = 0$$
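To see the parallel transport condition at work, here is a minimal sketch that integrates it numerically. It assumes a unit sphere with coordinates $(\theta, \phi)$ (colatitude and longitude), for which the non-zero connection coefficients are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$. The curve is a circle of constant colatitude $\theta_0$, parameterized by $u = \phi$ running from 0 to $2\pi$. The function name and the choice of a fourth-order Runge-Kutta integrator are illustrative assumptions, not part of the original discussion.

```python
import math

def transport_around_latitude(theta0, v0, steps=2000):
    """Parallel transport the components (v_theta, v_phi) once around the
    circle theta = theta0 on the unit sphere, by integrating
        dv^a/du + Gamma^a_{cb} v^c dx^b/du = 0
    with u = phi, so that dx^theta/du = 0 and dx^phi/du = 1."""
    sin_t, cos_t = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        v_theta, v_phi = v
        # dv^theta/du = -Gamma^theta_{phi phi} v^phi = sin(theta) cos(theta) v^phi
        # dv^phi/du   = -Gamma^phi_{theta phi} v^theta = -cot(theta) v^theta
        return (sin_t * cos_t * v_phi, -(cos_t / sin_t) * v_theta)

    h = 2.0 * math.pi / steps
    v = tuple(v0)
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return v

theta0 = math.pi / 3  # colatitude 60 degrees, i.e. latitude 30 degrees N
v_final = transport_around_latitude(theta0, (1.0, 0.0))

# Length of the vector measured with the sphere's metric.
norm_sq = v_final[0] ** 2 + (math.sin(theta0) * v_final[1]) ** 2
print("components after one loop:", v_final)
print("squared length:", norm_sq)
```

For this latitude the equations can be solved by hand: the vector rotates through an angle $2\pi\cos\theta_0 = \pi$ relative to the coordinate directions, so a vector that starts with components $(1, 0)$ returns as approximately $(-1, 0)$ with its length unchanged. The vector fails to return to itself even though the curve is closed, which is the path dependence described earlier.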
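For a general tensor, the covariant derivative is
$$\nabla_c T^{a\ldots}_{b\ldots} = \partial_c T^{a\ldots}_{b\ldots} + \Gamma^a_{dc}T^{d\ldots}_{b\ldots} + \ldots - \Gamma^d_{bc}T^{a\ldots}_{d\ldots} - \ldots$$
so we’d need to plug this into the equation above to get
$$\frac{dx^c}{du}\nabla_c T^{a\ldots}_{b\ldots} = 0$$
The tangent vector can be written as
$$X^a \equiv \frac{dx^a}{du}$$
and we get what is called the absolute derivative of a tensor:
$$\frac{D}{Du}T^{a\ldots}_{b\ldots} \equiv X^c\nabla_c T^{a\ldots}_{b\ldots}$$
We can generalize this idea to get the contraction of any vector $X$ with the covariant derivative of a tensor $T^{a\ldots}_{b\ldots}$. The notation used for this is:
$$\nabla_X T^{a\ldots}_{b\ldots} \equiv X^c\nabla_c T^{a\ldots}_{b\ldots}$$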
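As a worked illustration of the general formula, here are the two simplest cases beyond a contravariant vector, written out explicitly. The covector $w_a$ and the mixed tensor $T^a{}_b$ are hypothetical examples introduced only for this purpose, and $u$ denotes the curve parameter. Each upper index picks up a connection term with a plus sign and each lower index a term with a minus sign:

$$\frac{Dw_a}{Du} = \frac{dw_a}{du} - \Gamma^c_{ab}\,w_c\frac{dx^b}{du}$$

$$\frac{DT^a{}_b}{Du} = \frac{dT^a{}_b}{du} + \Gamma^a_{dc}\,T^d{}_b\frac{dx^c}{du} - \Gamma^d_{bc}\,T^a{}_d\frac{dx^c}{du}$$

Setting either expression to zero gives the parallel transport condition for that object along the curve.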
This quantity satisfies some standard properties. If we restrict ourselves to vector fields, then if we have three vector fields $X$, $Y$ and $Z$, two scalar functions $f$ and $g$, and two constants $\lambda$ and $\mu$, then we can derive some of these properties. First:
$$\nabla_X\left(fY\right) = \left(X\cdot\nabla f\right)Y + f\nabla_X Y$$
where $\nabla f$ is the ordinary gradient of $f$. (I believe the answer given in d’Inverno’s problem 6.7 is wrong here.)
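The product rule for a scalar function multiplying a vector field can be sketched in components. The derivation assumes only that the covariant derivative obeys the Leibniz rule and reduces to the partial derivative when acting on a scalar. The names $X$, $Y$ and $f$ stand for the vector fields and scalar function of this problem:

$$\left[\nabla_X\left(fY\right)\right]^b = X^a\nabla_a\left(fY^b\right) = X^a\left(\partial_a f\right)Y^b + fX^a\nabla_a Y^b = \left(X^a\partial_a f\right)Y^b + f\left[\nabla_X Y\right]^b$$

The factor $X^a\partial_a f$ is the contraction of $X$ with the ordinary gradient of $f$.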