Required math: vectors, calculus

Required physics: none

Given a scalar field defined over three-dimensional space (for example, the temperature in a room or the density of some substance over its volume), we would like to be able to determine the rate of change of the field at a given point as we move along a given direction. For example, if we know the temperature in a room we might like to know how fast the temperature changes as we move straight up towards the ceiling.

To make things precise, we’ll consider a unit vector that points in the direction in which we wish to find the rate of change of the scalar field. In rectangular coordinates we can write this unit vector as

$\displaystyle \hat{\mathbf{s}}=s_{x}\hat{\mathbf{i}}+s_{y}\hat{\mathbf{j}}+s_{z}\hat{\mathbf{k}} \ \ \ \ \ (1)$

$\displaystyle s_{x}^{2}+s_{y}^{2}+s_{z}^{2}=1 \ \ \ \ \ (2)$

From linear algebra, we know that a line ${\mathbf{l}=(x,y,z)}$ parallel to this vector can be written in vector form as

$\displaystyle \mathbf{l}=\mathbf{r}_{0}+t\hat{\mathbf{s}} \ \ \ \ \ (3)$

where ${t}$ is a parameter that ranges from ${-\infty}$ to ${+\infty}$ and ${\mathbf{r}_{0}=(x_{0},y_{0},z_{0})}$ is a point on the line. In components, we have

$\displaystyle x=x_{0}+ts_{x} \ \ \ \ \ (4)$

$\displaystyle y=y_{0}+ts_{y} \ \ \ \ \ (5)$

$\displaystyle z=z_{0}+ts_{z} \ \ \ \ \ (6)$
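The parametrization in equations 4 to 6 can be sketched in a few lines of Python. The point, the direction and the function names below are arbitrary choices for illustration, not part of the derivation. Because ${\hat{\mathbf{s}}}$ is a unit vector, the distance from ${\mathbf{r}_{0}}$ to the point with parameter ${t}$ should come out as ${|t|}$.

```python
import math

def unit(v):
    # Normalize a vector so that s_x^2 + s_y^2 + s_z^2 = 1 (equation 2).
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def line_point(r0, s, t):
    # Equations 4 to 6: each coordinate is r0 plus t times the direction component.
    return tuple(r + t * c for r, c in zip(r0, s))

r0 = (1.0, 2.0, 3.0)        # arbitrary illustrative point (assumption)
s = unit((1.0, 2.0, 2.0))   # arbitrary illustrative direction (assumption)

for t in (-2.0, 0.0, 0.5, 3.0):
    p = line_point(r0, s, t)
    # The last column is the distance from r0, which should equal |t|.
    print(t, p, math.dist(p, r0))
```

Since ${t}$ measures distance along the line, a derivative with respect to ${t}$ is a rate of change per unit distance in the direction ${\hat{\mathbf{s}}}$.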

Now if we have some scalar field given by ${a=a(x,y,z)}$, we can find its directional derivative in a given direction by finding the total derivative of ${a}$ with respect to the parameter ${t}$. Since each of the components of ${\mathbf{l}}$ depends on ${t}$, the chain rule for functions of several variables gives

$\displaystyle \frac{da}{dt}=\frac{\partial a}{\partial x}\frac{dx}{dt}+\frac{\partial a}{\partial y}\frac{dy}{dt}+\frac{\partial a}{\partial z}\frac{dz}{dt} \ \ \ \ \ (7)$

From equations 4 to 6, we can work out the derivatives with respect to ${t}$ (for example, ${dx/dt=s_{x}}$) and get

$\displaystyle \frac{da}{dt}=\frac{\partial a}{\partial x}s_{x}+\frac{\partial a}{\partial y}s_{y}+\frac{\partial a}{\partial z}s_{z} \ \ \ \ \ (8)$

If we define a vector called the gradient of ${a}$ by

$\displaystyle \nabla a\equiv\frac{\partial a}{\partial x}\hat{\mathbf{i}}+\frac{\partial a}{\partial y}\hat{\mathbf{j}}+\frac{\partial a}{\partial z}\hat{\mathbf{k}} \ \ \ \ \ (9)$

we can write the directional derivative as

$\displaystyle \frac{da}{dt}=\nabla a\cdot\hat{\mathbf{s}} \ \ \ \ \ (10)$
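As a numerical check of this result, the sketch below compares a finite-difference estimate of ${da/dt}$ along the line with the dot product ${\nabla a\cdot\hat{\mathbf{s}}}$. The field ${a=x^{2}y+z}$, the point, the direction and all function names are assumptions made for illustration.

```python
import math

def a(x, y, z):
    # Illustrative scalar field (an assumption for this sketch).
    return x**2 * y + z

def grad_a(x, y, z):
    # Analytic gradient of the field above: (2xy, x^2, 1).
    return (2 * x * y, x**2, 1.0)

def unit(v):
    # Normalize a vector to unit length.
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def directional_derivative_fd(field, r0, s, h=1e-6):
    # Central difference in the line parameter t about t = 0.
    fwd = field(*(r + h * c for r, c in zip(r0, s)))
    bwd = field(*(r - h * c for r, c in zip(r0, s)))
    return (fwd - bwd) / (2 * h)

def directional_derivative_grad(grad, r0, s):
    # Dot product of the gradient at r0 with the unit vector s.
    return sum(g * c for g, c in zip(grad(*r0), s))

r0 = (1.0, 2.0, 3.0)
s = unit((1.0, 2.0, 2.0))

print(directional_derivative_fd(a, r0, s))
print(directional_derivative_grad(grad_a, r0, s))
```

For these particular choices the gradient at ${\mathbf{r}_{0}}$ is ${(4,1,1)}$ and ${\hat{\mathbf{s}}=(1/3,2/3,2/3)}$, so both numbers should be close to ${8/3}$, with the finite-difference value differing only by a small numerical error.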

Since ${\hat{\mathbf{s}}}$ is a unit vector, we can write this as

$\displaystyle \frac{da}{dt}=|\nabla a|\cos\theta \ \ \ \ \ (11)$

where ${\theta}$ is the angle between the gradient and the unit vector ${\hat{\mathbf{s}}}$. From this we see that ${da/dt}$ has its maximum value, ${|\nabla a|}$, when ${\theta=0}$, that is, when the gradient is parallel to the direction of the derivative. In other words, the gradient of a scalar field points in the direction of the greatest rate of increase of the field at a given point, and its magnitude is that rate of increase.
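This can be illustrated numerically by scanning over many unit vectors and picking the one with the largest value of ${\nabla a\cdot\hat{\mathbf{s}}}$. The sketch below is a brute-force check under assumed values: the gradient vector ${(4,1,1)}$ and the grid resolution are hypothetical choices, and the helper names are not from the text.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def sample_unit_vectors(n_polar=90, n_azimuth=180):
    # Unit vectors on a latitude-longitude grid over the sphere.
    for i in range(n_polar + 1):
        polar = math.pi * i / n_polar
        for j in range(n_azimuth):
            az = 2 * math.pi * j / n_azimuth
            yield (math.sin(polar) * math.cos(az),
                   math.sin(polar) * math.sin(az),
                   math.cos(polar))

grad = (4.0, 1.0, 1.0)  # hypothetical gradient at some point (assumption)

# Direction on the grid with the largest directional derivative.
best = max(sample_unit_vectors(), key=lambda s: dot(grad, s))

# Cosine of the angle between the best direction and the gradient.
cos_angle = dot(grad, best) / norm(grad)

print(dot(grad, best), norm(grad), cos_angle)
```

The best sampled direction should lie within the grid spacing of the gradient direction, so the cosine is close to 1 and the largest directional derivative is just under ${|\nabla a|}$.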

In general orthogonal curvilinear coordinates, we have three coordinates ${u}$, ${v}$ and ${w}$ whose unit vectors are mutually perpendicular. We can define the line ${\mathbf{l}}$ parametrically in terms of these coordinates in the same way as with rectangular coordinates above, so that the rate of change of the scalar field ${a}$ is

$\displaystyle \frac{da}{dt}=\frac{\partial a}{\partial u}\frac{du}{dt}+\frac{\partial a}{\partial v}\frac{dv}{dt}+\frac{\partial a}{\partial w}\frac{dw}{dt} \ \ \ \ \ (12)$

In these general coordinates, however, a line element has the form

$\displaystyle d\mathbf{l}=f\; du\hat{\mathbf{u}}+g\; dv\hat{\mathbf{v}}+h\; dw\hat{\mathbf{w}} \ \ \ \ \ (13)$

where ${f}$, ${g}$ and ${h}$ are functions of the three coordinates. Dividing through by ${dt}$ and taking the limit to get a derivative, we have

$\displaystyle \frac{d\mathbf{l}}{dt}=f\;\frac{du}{dt}\hat{\mathbf{u}}+g\;\frac{dv}{dt}\hat{\mathbf{v}}+h\;\frac{dw}{dt}\hat{\mathbf{w}} \ \ \ \ \ (14)$

From this, we can write equation 12 as a dot product if we define the gradient in curvilinear coordinates as

$\displaystyle \nabla a=\frac{1}{f}\frac{\partial a}{\partial u}\hat{\mathbf{u}}+\frac{1}{g}\frac{\partial a}{\partial v}\hat{\mathbf{v}}+\frac{1}{h}\frac{\partial a}{\partial w}\hat{\mathbf{w}} \ \ \ \ \ (15)$

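Then, because the unit vectors are mutually perpendicular, the factors ${f}$, ${g}$ and ${h}$ cancel when we take the dot product with equation 14, and we get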

$\displaystyle \frac{da}{dt}=\nabla a\cdot\frac{d\mathbf{l}}{dt} \ \ \ \ \ (16)$
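As a concrete instance of the curvilinear gradient, consider spherical coordinates ${(r,\theta,\phi)}$, where ${\theta}$ here denotes the polar angle measured from the ${z}$ axis (not the angle in equation 11). This choice of coordinate system is an illustration rather than part of the derivation above. The line element is

$\displaystyle d\mathbf{l}=dr\,\hat{\mathbf{r}}+r\,d\theta\,\hat{\boldsymbol{\theta}}+r\sin\theta\,d\phi\,\hat{\boldsymbol{\phi}}$

so that ${f=1}$, ${g=r}$ and ${h=r\sin\theta}$, and equation 15 becomes

$\displaystyle \nabla a=\frac{\partial a}{\partial r}\hat{\mathbf{r}}+\frac{1}{r}\frac{\partial a}{\partial\theta}\hat{\boldsymbol{\theta}}+\frac{1}{r\sin\theta}\frac{\partial a}{\partial\phi}\hat{\boldsymbol{\phi}}$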

We can still write the equation for the line ${\mathbf{l}}$ in the form of equation 3, except now the components of the unit vector ${\hat{\mathbf{s}}}$ would be written in terms of the three basis vectors for whatever coordinate system we are using. Along such a line ${d\mathbf{l}/dt=\hat{\mathbf{s}}}$, so equation 16 has the same form as equation 10. Thus the analysis above is still valid for general coordinate systems, and the gradient still points in the direction of maximum increase of the scalar field.
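One way to check that the curvilinear gradient of equation 15 describes the same vector as the rectangular gradient of equation 9 is to evaluate both numerically. The sketch below does this for spherical coordinates ${(r,\theta,\phi)}$ with ${\theta}$ the polar angle, for which ${f=1}$, ${g=r}$ and ${h=r\sin\theta}$. The field ${a=xz+y^{2}}$, the sample point and all function names are assumptions chosen for illustration.

```python
import math

def a_cart(x, y, z):
    # Illustrative field (an assumption for this sketch).
    return x * z + y**2

def grad_cart(x, y, z):
    # Analytic rectangular gradient of the field above (equation 9).
    return (z, 2 * y, x)

def to_cart(r, th, ph):
    # Spherical to rectangular coordinates, th is the polar angle.
    return (r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th))

def a_sph(r, th, ph):
    # The same field expressed in spherical coordinates.
    return a_cart(*to_cart(r, th, ph))

def partial(func, point, i, step=1e-6):
    # Central-difference partial derivative with respect to coordinate i.
    up = list(point)
    dn = list(point)
    up[i] += step
    dn[i] -= step
    return (func(*up) - func(*dn)) / (2 * step)

def grad_sph_components(r, th, ph):
    # Equation 15 with f = 1, g = r, h = r sin(theta).
    p = (r, th, ph)
    return (partial(a_sph, p, 0),
            partial(a_sph, p, 1) / r,
            partial(a_sph, p, 2) / (r * math.sin(th)))

def sph_unit_vectors(th, ph):
    # Rectangular components of r-hat, theta-hat and phi-hat.
    r_hat = (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))
    th_hat = (math.cos(th) * math.cos(ph), math.cos(th) * math.sin(ph), -math.sin(th))
    ph_hat = (-math.sin(ph), math.cos(ph), 0.0)
    return r_hat, th_hat, ph_hat

def grad_sph_as_cart(r, th, ph):
    # Expand the spherical components on the spherical unit vectors.
    comps = grad_sph_components(r, th, ph)
    basis = sph_unit_vectors(th, ph)
    return tuple(sum(c * e[k] for c, e in zip(comps, basis)) for k in range(3))

point = (2.0, 0.7, 1.1)  # arbitrary (r, theta, phi), away from the z axis
from_sph = grad_sph_as_cart(*point)
direct = grad_cart(*to_cart(*point))

print(from_sph)
print(direct)
```

The two printed vectors should agree to within the finite-difference error. The sample point is kept away from ${r=0}$ and ${\sin\theta=0}$, where the factors ${1/g}$ and ${1/h}$ are undefined.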