We have seen how differentials and the rules for computing them can be useful in applications involving slopes, velocities, and accelerations. The fact that they are infinitely small and difficult to rigorously define does not diminish their value as a problem-solving tool. Eventually, we will return to these foundational issues in Chapter 13.
As subtle as the concept of infinitely small changes such as \(\dx{x}\text{,}\) \(\dx{y}\text{,}\) or \(\dx{t}\) can be, we have seen how their ratios \(\dfdx{y}{x}\text{,}\) \(\dfdx{x}{t}\text{,}\) or \(\dfdxn{y}{t}{2} = \frac{\dx\left( \dfdx{y}{t} \right)}{\dx{t}}\) can represent actual finite physical quantities. For later mathematicians, these ratios became the focal point for the understanding of Calculus.
Because of concerns regarding the validity of differentials, mathematicians in the \(18\)th and \(19\)th centuries had a strong motivation to skip over the differential concept and jump immediately to the more useful, and finite, differential ratio.
In his 1797 work \emph{Théorie des Fonctions Analytiques} (\emph{The Theory of Analytic Functions}), Joseph Louis Lagrange attempted to make Calculus more rigorous. He even coined a new term for the differential ratio. He called it the \emph{fonction dérivée} (meaning a function derived from another function). He also replaced the differential ratio \(\dfdx{y}{x}\) with the more modern function notation \(y^\prime(x)\) (read this aloud as “\(y\) prime of \(x\)”).
Lagrange’s attempt to make Calculus rigorous was very clever, but ultimately unsuccessful. Full rigor had to wait for another hundred years, so we will not say much about Lagrange’s efforts here. But we will adopt his terminology and his notation.
Lagrange called the differential ratio \(\dfdx{y}{x}\) a derived function. The “derived” part seems clear enough. After all, if \(y=x^3\) then \(\dfdx{y}{x}\) is obtained (derived) from \(y\) as follows:
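\begin{equation*}
\dx{y} = \dx\left(x^3\right) = 3x^2\,\dx{x}
\qquad\text{so that}\qquad
\dfdx{y}{x} = \frac{3x^2\,\dx{x}}{\dx{x}} = 3x^2\text{.}
\end{equation*}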
In some contexts Lagrange’s prime notation has several advantages over the differential notation we’ve been using. Over time it has become the most common notation for the derivative in mathematics. But the fact that it took over \(100\) years to develop suggests that something more than mere notation is in play here.
Using multiple equivalent notations can be very confusing for beginners. Since our current task is simply to master the differentiation rules, we will stick to Leibniz’s differential notation as much as possible. But there will come a time when Lagrange’s prime notation will be much more convenient. At that point we will casually use the two expressions \(\dfdx{y}{x}\) and \(y^\prime(x)\) interchangeably, and we will think of both as a function derived from the function \(y(x)\text{.}\)
When we do this, the differential notation we’re currently emphasizing will take on two distinct “personalities.” On the one hand, \(\dfdx{y}{x}\) represents a ratio of the differentials \(\dx{y}\) and \(\dx{x}\text{,}\) which are distinct infinitesimal quantities. On the other hand, \(\dfdx{y}{x}\) is the name of a function -- it is all one symbol. When we are thinking of \(\dfdx{y}{x}\) as the derivative function we cannot detach the pieces of \(\dfdx{y}{x}\) any more than we can delete the letter “n” from \(\sin(x)\text{,}\) because \(\text{si}(x)\) has no meaning.
Eventually the differentials we’ve been using so casually will become a guilty secret. Given \(y=y(x)\) we’ll use them as a helpful aid while we compute. But as soon as we have \(\dfdx{y}{x}\) in hand we will view it as a single, complete symbol representing the (finite) derivative function. Often we will simply replace it with \(y^\prime(x)\) as if we are ashamed of having used differentials at all.
Recall that in Descartes’ Method of Normals, we had to find a double root of a polynomial. To deal with this problem, Johann van Waveren Hudde (1628–1704) developed an algebraic tool for determining such double roots. Calculus allows a development of Hudde’s Rule that avoids the complex algebraic reasoning Hudde used and is much easier to follow.
The bottom line is that we will adopt the name derivative to indicate the result of dividing one differential by another. So the expression \(\dfdx{y}{x}\) is “the derivative of \(y\) with respect to \(x\text{.}\)” Although \(y^\prime(x)\) is simpler to write and to read, the sheer simplicity of the notation can be a problem for beginners, so we will only use it when the prime notation actually clarifies our meaning. The first place this occurs is in Section 7.2.
When computing a derivative you will eventually become proficient enough to jump directly to the derivative. But for now you should go through the two-step process of differentiating to obtain a differential and then dividing by another differential to obtain a derivative, because the computational rules you’ve learned are differentiation rules, not derivative rules. If you do this, you will avoid some difficulties created by trying to compute too much too soon. This is illustrated in the following example, where we purposely use prime notation to highlight the difficulties involved in the computation.
The left side of equation (5.6) indicates that the variable is \(x\text{,}\) but there is no \(x\) on the right side, only \(z\text{.}\) So this can’t be right. But what went wrong? We can avoid problems like this by using differentials:
The glaring question here is this: Why is \(\dx(\dx{x})\) equal to zero in equation (5.7), but \(\dx(\dx{y})\) not equal to zero in equation (5.8)? Or, at a more fundamental level, what do we mean by “the infinitely small change of an infinitely small change?” As we will see in Chapter 13, the early critics of Calculus cited this question specifically to argue that Calculus was invalid.
We will address these issues beginning in Chapter 13. For now we will make the following compromise: We will only differentiate finite quantities, be they functions or derivatives. Since our ultimate goal is to compute some derivative, this will suit our needs without getting caught up in the very problematic question of the nature of higher-order differentials. So for this example we have
You’ve probably been taught all of your life to “simplify” complex-looking expressions like \((-1)(-2)(-3)(-4)\text{,}\) and you probably do it without thinking. So you may be wondering why we left the coefficients above in the form we did.
The reason is simple: We were looking for patterns, not numbers. Writing the above formulas as \(\dfdxn{y}{x}{2} = 2x^{-3}\text{,}\) \(\dfdxn{y}{x}{3} = -6x^{-4}\text{,}\) and \(\dfdxn{y}{x}{4} = 24x^{-5}\) obscures the pattern. Keep this in mind as you proceed. Algebraic or arithmetical “simplifications” often get in the way of recognizing patterns. Don’t do them until there is a compelling reason to.
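For instance, leaving the coefficients unexpanded displays the pattern at a glance. Assuming, as the derivatives above suggest, that we started from \(y = x^{-1}\text{,}\) the unsimplified forms are
\begin{equation*}
\dfdxn{y}{x}{2} = (-1)(-2)x^{-3}\text{,}\qquad
\dfdxn{y}{x}{3} = (-1)(-2)(-3)x^{-4}\text{,}\qquad
\dfdxn{y}{x}{4} = (-1)(-2)(-3)(-4)x^{-5}\text{,}
\end{equation*}
which suggest the general formula \(\dfdxn{y}{x}{n} = (-1)^n n!\, x^{-(n+1)}\text{.}\)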
Consider the circle \(x^2+y^2=1\text{.}\) Differentiating, we have \(2x\dx{x}+2y\dx{y}=0\text{,}\) or \(\dfdx{y}{x}=-\frac{x}{y}\text{.}\) Differentiating again, we have
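\begin{equation*}
\dfdxn{y}{x}{2}
= \frac{\dx\left(-\frac{x}{y}\right)}{\dx{x}}
= -\frac{y\,\dx{x} - x\,\dx{y}}{y^2\,\dx{x}}
= -\frac{y - x\dfdx{y}{x}}{y^2}
= -\frac{y + \frac{x^2}{y}}{y^2}
= -\frac{x^2+y^2}{y^3}
= -\frac{1}{y^3}\text{,}
\end{equation*}
where the last step uses the equation of the circle, \(x^2+y^2=1\text{.}\)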
We know that it is not generally true that \(a^b=a\cdot b\text{,}\) even though there are certain exceptions, like \(a=b=1\text{,}\) \(a=4\) and \(b=1/2\text{,}\) or \(a=b=2\text{.}\) In the same way, even though the Product Rule makes it very clear that the equation
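\begin{equation}
\dfdx{(y\cdot z)}{x}= \dfdx{y}{x}\cdot\dfdx{z}{x}\tag{5.9}
\end{equation}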
cannot be true in general, there are certain pairs of functions which are exceptions, for which equation (5.9) is true. For example, show that for each of the following pairs it is true that \(\dfdx{(y\cdot z)}{x}= \dfdx{y}{x}\cdot\dfdx{z}{x}\text{.}\)
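One such pair, offered here only as an illustration (assuming the exponential function and the rule \(\dx\left(e^{kx}\right)=ke^{kx}\dx{x}\) are available), is \(y=z=e^{2x}\text{:}\)
\begin{equation*}
\dfdx{(y\cdot z)}{x} = \dfdx{\left(e^{4x}\right)}{x} = 4e^{4x} = \left(2e^{2x}\right)\cdot\left(2e^{2x}\right) = \dfdx{y}{x}\cdot\dfdx{z}{x}\text{.}
\end{equation*}
More trivially, any pair of constant functions also works, since both sides of equation (5.9) are then zero.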
Adopting Lagrange’s terminology, but not his notation, we see that if the position of a point moving in a straight line (like the \(x\)-axis) is given by \(x=x(t)\text{,}\) then its first derivative, \(\dfdx{x}{t}\text{,}\) gives its velocity, and its second derivative, \(\dfdxn{x}{t}{2}\text{,}\) gives its acceleration.
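For example (using a hypothetical position function of ours, not one taken from the exercises below): if \(x(t) = t^2 - 4t\text{,}\) then the velocity is \(\dfdx{x}{t} = 2t-4\) and the acceleration is \(\dfdxn{x}{t}{2} = 2\text{.}\) At \(t=1\) the velocity is negative while the acceleration is positive, so the point is slowing down; at \(t=3\) both are positive, so the point is speeding up. In general, a point speeds up when its velocity and acceleration have the same sign and slows down when they have opposite signs.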
For each of the following, \(x(t)\) represents the position of a point moving along the \(x\)-axis. Use the information given to determine whether the point is slowing down or speeding up at the instant \(t_0\text{.}\)