Yesterday, I spent some time thinking about how, if you have a function $f:\mathbb{R}^2\to\mathbb{R}$ and some point $x\in\mathbb{R}^2$, the value of the directional derivative at $x$ could change as a function of the angle. I.e., what does the function $\phi:[0,2\pi]\to\mathbb{R}$ look like? I thought that any relationship was probably possible as long as it has the property that $\phi(\alpha+\pi)=-\phi(\alpha)$, with angles taken mod $2\pi$. (The values of the derivative in two opposite directions need to be negatives of each other.)
Anyone reading this is hopefully better at analysis than I am and has realized that there is, in fact, no freedom at all, because each directional derivative is entirely determined by the gradient through the equation $\nabla_v f(x)=\langle\nabla f(x),v_N\rangle$ (where $v_N=v/\|v\|$). This means that $\phi$ has to be the cosine function, scaled by $\|\nabla f(x)\|$ and shifted so that its peak sits at the direction of the gradient; it cannot be anything else.
I clearly failed to internalize what this equation means when I first heard it because I found it super surprising that the gradient determines the value of every directional derivative. Like, really? It’s impossible to have more than exactly two directions with equally large derivatives unless the gradient is zero? It’s impossible to turn 90 degrees from the direction of the gradient and have anything but derivative 0 in that direction? I’m not asking that $\phi$ be discontinuous, only that it not be precisely $\|\nabla f(x)\|\cos(\alpha)$, with $\alpha$ measured from the direction of the gradient. But alas.
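A quick numerical sanity check of this, as a rough sketch: the test function `f`, the point `(x0, y0)`, and the finite-difference step are arbitrary choices of mine, not anything special. It estimates the directional derivative at every whole degree and compares it with $\|\nabla f(x)\|\cos(\alpha-\alpha_0)$, where $\alpha_0$ is the angle of the gradient.

```python
import math

def f(x, y):
    # Hypothetical smooth test function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def directional_derivative(func, x, y, alpha, h=1e-6):
    """Central finite-difference estimate of the derivative of func at (x, y)
    in the unit direction (cos(alpha), sin(alpha))."""
    dx, dy = math.cos(alpha), math.sin(alpha)
    return (func(x + h * dx, y + h * dy) - func(x - h * dx, y - h * dy)) / (2 * h)

def gradient(func, x, y, h=1e-6):
    """Central finite-difference estimate of the gradient of func at (x, y)."""
    gx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    gy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return gx, gy

x0, y0 = 1.0, 0.5                      # arbitrary base point
gx, gy = gradient(f, x0, y0)
grad_norm = math.hypot(gx, gy)         # ||grad f(x)||
grad_angle = math.atan2(gy, gx)        # direction of the gradient

def phi(alpha):
    """Measured directional derivative at angle alpha."""
    return directional_derivative(f, x0, y0, alpha)

def phi_predicted(alpha):
    """What the gradient formula says phi must be."""
    return grad_norm * math.cos(alpha - grad_angle)

# Largest disagreement over 360 evenly spaced directions.
angles = [2 * math.pi * k / 360 for k in range(360)]
max_gap = max(abs(phi(a) - phi_predicted(a)) for a in angles)
print("max |phi - predicted|:", max_gap)

# Turning 90 degrees away from the gradient should give (numerically) zero.
print("derivative at 90 degrees from gradient:", phi(grad_angle + math.pi / 2))
```

Up to finite-difference error, both printed numbers should come out as essentially zero for any function that is differentiable at the chosen point.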
This also made me realize that cos, viewed as a function on the circle, is just the dot product with the first standard basis vector, i.e.,
$$\cos:S^1\to[-1,+1],\qquad \cos:x\mapsto\langle x,(1,0)\rangle$$
or even just $\cos(x,y)=x$. Similarly, $\sin(x,y)=y$.
I know what you’re thinking; you need sin and cos to map $[0,2\pi]$ to $S^1$ in the first place. But the circle seems like a good deal more fundamental than those two functions. Wouldn’t it make more sense to introduce trigonometry in terms of ‘how do we wrap $\mathbb{R}$ around $S^1$?’ The function that does this is $\gamma(x)=(\cos(x),\sin(x))$, and then you can study the properties that this function needs to have and eventually call the coordinates cos and sin. This feels like a way better motivation than putting a right triangle onto the unit circle for some reason, which is how I always see the topic introduced (and how I’ve introduced it myself).
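To make the "wrap the line around the circle first, name the coordinates afterwards" idea concrete, here is a minimal sketch that never calls cos or sin when building the wrapping map. It assumes the only properties we ask of $\gamma$ are that it starts at $(1,0)$ and moves at unit speed with velocity perpendicular to position, i.e. $\gamma'=(-y,x)$. The function name `wrap` and the choice of an RK4 integrator are mine.

```python
import math

def wrap(t, steps=10_000):
    """Walk a signed distance t along the unit circle, starting at (1, 0).

    No cos or sin is used. Each step follows the tangent direction (-y, x),
    which is the current position rotated by 90 degrees, integrated with
    the classical fourth-order Runge-Kutta method.
    """
    x, y = 1.0, 0.0
    h = t / steps
    for _ in range(steps):
        # Right-hand side of the ODE (x, y)' = (-y, x), at four sample points.
        k1x, k1y = -y, x
        k2x, k2y = -(y + 0.5 * h * k1y), x + 0.5 * h * k1x
        k3x, k3y = -(y + 0.5 * h * k2y), x + 0.5 * h * k2x
        k4x, k4y = -(y + h * k3y), x + h * k3x
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y

# Only now compare the coordinates with the library's cos and sin.
for t in (0.5, 2.0, -3.0, 2 * math.pi):
    x, y = wrap(t)
    print(t, abs(x - math.cos(t)), abs(y - math.sin(t)))
```

The printed differences should be tiny, which is the sense in which cos and sin are "just" the two coordinates of the wrapping map.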
Looking further at the analogy with the gradient, this also suggests that there is a natural extension of cos to $S^n$ for all $n\in\mathbb{N}$. I.e., if we look at some point $x\in\mathbb{R}^n$, we can again ask about the function $\phi$ that maps each direction to the value of the directional derivative at $x$ in that direction, and if we associate these directions with points of $S^{n-1}$, then this yields the function $\phi:S^{n-1}\to\mathbb{R}$, which, in coordinates whose first axis points along the gradient, is again just the dot product with $(1,0,\dots,0)$, i.e., the projection onto the first coordinate (scaled by $\|\nabla f(x)\|$). This can then be considered a higher-dimensional cos function.
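Spelled out, the step from "dot product with the gradient" to "projection onto the first coordinate" goes roughly like this. It assumes $f$ is differentiable at $x$ with $\nabla f(x)\neq 0$, and the symbols $u$, $e_1$, and $\theta$ are names I am introducing for the direction, the normalized gradient, and the angle between them.

```latex
\begin{aligned}
\phi(u) &= \nabla_u f(x) = \langle \nabla f(x),\, u \rangle,
  \qquad u \in S^{n-1}, \\
e_1 &:= \frac{\nabla f(x)}{\|\nabla f(x)\|}, \\
\phi(u) &= \|\nabla f(x)\| \,\langle e_1,\, u \rangle
         = \|\nabla f(x)\| \, u_1
         = \|\nabla f(x)\| \cos\theta .
\end{aligned}
```

Here $u_1$ is the first coordinate of $u$ in any orthonormal basis that starts with $e_1$, and $\theta$ is the angle between $u$ and the gradient. For $n=2$ this is the scaled cosine from before.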
There’s also the 0-d case where $S^0=\{1,-1\}$. This describes how the direction changes the derivative for a function $f:\mathbb{R}\to\mathbb{R}$.
I found it super surprising that the gradient determines the value of every directional derivative. Like, really?
When reading this comment, I was surprised for a moment, too, but now that you mention it, it’s because if the function is smooth at the point where you’re taking the directional derivative, then it has to locally resemble a plane, just like how a differentiable function of a single variable is said to be “locally linear”. If the directional derivative varied in any other way, then the surface would have to have a “crinkle” at that point and it wouldn’t be differentiable. Right?
That’s probably right.
I have since learned that there are functions that have every directional derivative at a point but are not smooth (not even differentiable) there. Wikipedia’s example is $f(x,y)=\frac{y^3}{x^2+y^2}$ with $f(0,0)=0$. In this case, there is still a continuous function $\phi:S^1\to\mathbb{R}$ that maps each point of the circle to the value of the directional derivative at the origin in that direction, but it’s $\phi(x,y)=y^3$, so different from the regular case.
So you can probably have all kinds of relationships between direction and {value of derivative in that direction}, but the class of smooth functions has a fixed relationship. It still feels surprising that ‘most’ functions we work with just happen to be smooth.
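Wikipedia’s example can be checked numerically with a small sketch like the following. The helper name `phi_at_origin` and the step size are my own choices. It compares the measured directional derivative at the origin with $\sin^3\theta$ and with what the gradient formula would predict if $f$ were differentiable there.

```python
import math

def f(x, y):
    """f(x, y) = y^3 / (x^2 + y^2), extended by f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return y**3 / (x**2 + y**2)

def phi_at_origin(theta, h=1e-6):
    """Difference quotient (f(h*u) - f(0)) / h in direction u = (cos theta, sin theta)."""
    ux, uy = math.cos(theta), math.sin(theta)
    return (f(h * ux, h * uy) - f(0.0, 0.0)) / h

# The partial derivatives at the origin are the directions theta = 0 and pi/2.
fx = phi_at_origin(0.0)
fy = phi_at_origin(math.pi / 2)

for deg in (0, 30, 45, 60, 90):
    theta = math.radians(deg)
    measured = phi_at_origin(theta)
    cubic = math.sin(theta) ** 3                            # phi(x, y) = y^3 on the circle
    linear = fx * math.cos(theta) + fy * math.sin(theta)    # what <grad f, u> would give
    print(deg, measured, cubic, linear)
```

The measured values should track the `cubic` column, and they should differ from the `linear` column everywhere except along the coordinate axes, which is the “crinkle” at the origin.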