Inverse Functions

Section 14.7 Inverse Functions

Although we have worked with the inverses of some specific functions we have not formally defined what we mean by an inverse. We will remedy that now. We have seen that not all functions can be inverted (see for example, DIGRESSION: The Tangent Function Has No Inverse so the first step is to define which functions are invertible.

Informally a function that never takes the same value twice is called a one–to–one function. Formally we have the following.

Definition 14.7.1. One-To-One Functions.

A function, \(f(x)\text{,}\) defined on a domain, \(D\text{,}\) is said to be one-to-one if, whenever \(x_1\) and \(x_2\) are in \(D\) and \(x_1\neq x_2\) then, \(f(x_1)\neq f(x_2)\text{.}\)

Recall that when we tried to invert \(\tan(x)\) (which is not one-to-one) in Section 6.5 we got the multifunction \(\arctan(x)\text{.}\) We had to restrict the domain of the tangent function to \(\frac{-\pi}{2}\lt x \lt \frac{\pi}{2}\text{,}\) in order to find an inverse. That restriction gave us a one-to-one function which we could invert because one-to-one functions are the only functions with inverses.

Definition 14.7.2. Inverse Functions.

Suppose \(f(x)\text{,}\) with domain \(D\) and range \(R\) is a one-to-one function. Then the inverse of \(f(x)\) is the function \(\inverse f(x)\) with domain \(R\) and range \(D\) which satisfies the following properties:

\(f\left(\inverse f(x)\right)=x\) for every value of \(x\) in \(R\text{.}\)
\(\inverse f\left( f(x)\right)=x\) for every value of \(x\) in \(D\text{.}\)

Loosely speaking, Definition 14.7.2 says that two functions are mutually inverse if they “undo” each other.

Our next task is to show that the derivatives of the in inverse trigonometric functions are what we expect them to be. Given that we have now obtained the derivatives of all of the trigonometric functions it appears that we could proceed just as we did in Section 6.6 and Section 6.7.

But that would require that we explicitly assume that each of the inverse trigonometric functions is differentiable, similar to the way we found the derivative of a quotient. This is a valid approach of course, but proceeding in that manner would mask some issues that will be of interest to us later. So we will approach the derivatives of inverse functions abstractly by (rigorously) finding a formula for the derivative of the inverse of a generic, invertible function. After that we’ll only need to apply the formula to each of the inverse trigonmetric formulas.

DIGRESSION: Inverse and Derivative Notation.

As we saw in Inverse Function Notation there are some difficulties with the notation we use to indicate inverse functions. These problems only get worse when we mix the standard derivative notations with the inverse function notation. Lagrange’s prime notation is especially problematic.

For example if \(f(x)\) is an invertible function the derivative of \(\inverse{f}(x)\) could be denoted either as:

\begin{equation*} \dfdx{(\inverse{f})}{x} \end{equation*}

or \({\inverse{f}}^\prime(x)\text{.}\)

But both of these are somewhat awkward. Mathematicians also sometimes use the operator notation:

\begin{equation*} \DD(f(x)) = f^\prime(x)=\dfdx{f}{x} \end{equation*}

and in this situation it minimizes the awkwardness a bit.

As we’ve seen there can also be some vagueness involving the distinction between functions and variables. For example suppose we want to sketch a graph of this relation between \(x\) and \(y\text{:}\)

\begin{equation*} y-x^3=0. \end{equation*}

The simplest thing to do is to choose a value for either \(x\) or \(y\) and then figure out what the corresponding \(y\) or \(x\) is. This is simpler to do that if we rearrange the relation so that we have one variable strictly in terms of (“as a function of”) the other. For this particular relation it is easiest to choose a value for \(x\) and compute the corresponding \(y\) value so we would normally rearrange it as

\begin{equation} y(x)=x^3.\tag{14.10} \end{equation}

equation (14.10) defines \(y\) as a function of \(x\text{.}\)

But we only solved for \(y\) because we could see it was a little easier to do. Otherwise our choice was completely arbitrary. We could also have solved for \(x\) giving,

\begin{equation} x(y)=\sqrt[3]{y}.\tag{14.11} \end{equation}

In this case we have \(x\) as a function of \(y\text{.}\)

The two functions, \(y(x)\text{,}\) (“cube”) and \(x(y)\) (“cube root”), clearly contain the same information as the original relation \(y-x^3=0\text{.}\) But they are different, related, functions. They are in fact mutually inverse.

For example suppose we choose \(x=2\) and use equation (14.10) to find \(y=8\text{.}\) If we then take \(y=8\) and use equation (14.11) we find that \(x=2\text{.}\) That is, \(x(y)\) has “undone” \(y(x)\) for the single pair \((2,8)\text{.}\) Problem 14.7.3 asks you to show that it is true for every pair \((x, y(x))\text{.}\) This “undoing” makes \(y(x)\) and \(x(y)\) a pair of mutually inverse functions.

But in function notation the variable (frequently \(x\) or \(t\)) is a placeholder. For example, each of \(y(x)=x^3,\) \(y(t)=t^3\text{,}\) \(f(\alpha)=\alpha^3\text{,}\) or even \(f(\circ)=\circ^3\) defines exactly the same function: The function which cubes its input. It doesn’t matter what we call the variable. It just holds a place in the formula that tells us what the input is and what to do with it. Since it doesn’t matter what we call the variable we usually call it \(x\) unless there is some compelling reason to use something else.

To avoid confusing variable names with function names we usually denote \(y(x)\) as \(f(x)\text{.}\) It’s inverse, \(x(y)\) should probably be denoted as \(\inverse{f}(y)\text{.}\) But sadly, recognizing that the variable is just a placeholder in function notation we use the same variable name in both the function and it’s inverse. So we denote the inverse of \(f(x)\) as \(\inverse{f}(x)\text{,}\) even though it would probably make it easier for beginners to use \(\inverse{f}(y)\text{,}\) as a reminder that both functions come from the same original relation.

Problem 14.7.3.

Prove that \(f(x)=x^3\) and \(\inverse{f}(x)=\sqrt[3]{x}\) are mutually inverse by showing that they satisfy the conditions stated in Definition 14.7.2

The notation for inverse functions is not great. It can be very confusing, especially for beginners. Be careful with it.

END OF DIGRESSION

Our next task is to show that if \(f(x)\) is invertible and differentiable, then \(\inverse{f}\) is also differentiable. (\(\inverse{f}\) is obviously invertible (why?). We do this by showing that the limit

\begin{equation} \DD\left(\inverse{f}(x)\right)=\limit{h}{0}{\frac{\inverse{f}(x+h)-\inverse{f}(x)}{h}}\tag{14.12} \end{equation}

exists.

In general this is true but there is one exception that has to be addressed. When \(f\) is differentiable at \(a\) and \(f^\prime(a) = 0\) then the limit in equation (14.12) does not exist. Hence \(\inverse f\) is not differentiable at \(f(a)\text{.}\) More formally, we have the following lemma.

Lemma 14.7.4.

If \(f\) is an invertible function, \(f(a)=b\text{,}\) \(f\) is differentiable at \(x=a\text{,}\) and \(f^\prime(a)=0\text{,}\) then \(\inverse f\) is not differentiable at \(x=b\text{.}\) That is \(\DD\left(\inverse f(b)\right)\) does not exist.

The following proof of this lemma is very challenging to read and understand for several reasons.

First, it is quite abstract. We don’t have a particular function to think about so we can’t simply write down formulas for the function and its inverse. Instead we have only the generic function, \(f\) and its inverse \(\inverse{f}\text{,}\) and we’ll need to remember what these symbols represent.

Second, we need to think about the functions \(f\) and \(\inverse{f}\) as well as their derivatives.

Third, instead of using the differential notation, \(\dfdx{f}{x}\) that we’ve grown very comfortable with we’ll be using the less familiar Lagrange prime notation and the operator notation we just introduced.

Finally, the nature of the problem forces us to mix these last two notations, using one here and the other there. This can make for difficult reading.

Read slowly. Remember that each symbol has meaning. Take time to understand that meaning and what each formula as a whole is telling you.

We include this proof in its full abstraction for two reasons:

To be as precise and as rigorous and we can.
We want to give you practice with higher level abstract reasoning in this (fairly) simple case.

The strategy behind the following proof follows the same general scheme as the Sherlock Holmes Maxim that we referred to in Problem 9.2.18. We will eliminate the impossible so that “whatever remains, however improbable, must be the truth.”

There are two possibilities: Either the derivative of \(\inverse{f}(b)\) exists or it does not exist. There are two steps:

Assume that the derivative of \(\inverse{f}\) does exist at \(x=b\) and calculate what \(\DD\left(\inverse{f}(b)\right)\) must be.
Show that our computed value is impossible. Then á la Holmes’ Maxim the only possibility left will be that the derivative of \(\inverse{f}\) does not exist at \(x=b\text{.}\)

Proof.

Assume that \(\inverse{f}\) is differentiable at \(x=b\text{.}\)

Because \(f\) and \(\inverse{f}\) are mutually inverse we know that

\begin{equation*} f\left(\inverse{f}(x)\right)=x. \end{equation*}

Therefore

\begin{equation*} \DD\left(f\left(\inverse{f}(x)\right)\right)=\DD\left(x\right). \end{equation*}

On the right we have

\begin{equation*} \DD(x)=1. \end{equation*}

On the left apply the Chain Rule:

\begin{equation} f^\prime\left(\inverse{f}(x)\right)\cdot\DD\left(\inverse{f}(x)\right)=1.\tag{14.13} \end{equation}

But when \(x=b\) we find that \(f^\prime\left(\inverse{f}(b)\right)=f^\prime\left(a\right)=0\text{,}\) so that

\begin{equation*} 0=\underbrace{f^\prime\left(\inverse{f}(b)\right)}_{=0}\cdot\DD\left(\inverse{f}(b)\right)=1 \end{equation*}

\begin{equation*} 0=1 \end{equation*}

which is ridiculous or in Holmes’ word, impossible. Therefore our assumption cannot true so \(\inverse{f}\) is not differentiable at \(x=b\text{.}\)

While valid and correct, this proof is not very enlightening. A well chosen sketch would be much more convincing, if less rigorous.

Drill 14.7.5.

Choose a function whose derivative is equal to zero at some point and sketch the graph of your function and its inverse on the same set of axes. Be sure to include the point where the derivative is zero. Use your graph to explain why the derivative of the inverse of your function does not exist.

We now understand what conditions are necessary for an arbitrary function, \(f(x)\text{,}\) to have a differentiable inverse.

Also, from (14.13) we know what the derivative of the inverse will be if it exists:

\begin{equation*} \DD\left(\inverse{f}(x)\right)=\frac{1}{f^\prime\left(\inverse{f}(x)\right)}. \end{equation*}

Problem 14.7.6.

Let \(y=\inverse{f}(x)\) and explain how the formula above is equivalent to

\begin{equation} \dfdx{y}{x}=\frac{1}{\dfdx{x}{y}}\tag{14.14} \end{equation}

The only thing left is to show that under the conditions on \(f\) in Lemma 14.7.4 the derivative (that is, the limit which defines the derivative) of the inverse does in fact exist.

Theorem 14.7.7. The Derivative of Inverse Functions.

Suppose that

\(f\) is differentiable at \(x=a\text{,}\)
\(f(a)=b\text{,}\)
\(f^\prime(a)\neq0\text{,}\)
\(\inverse{f}\) is continuous at \(x=b\text{.}\)

Then the inverse of \(f\) is differentiable at \(x=b\) and

\begin{equation*} \DD\left(\inverse{f}(b)\right)=\frac{1}{f^\prime\left(\inverse{f}(b)\right)}. \end{equation*}

Reading and understanding the notation in Theorem 14.7.7 presents the same difficulties we saw in the proof of Lemma 14.7.4. Read it carefully. Be patient with yourself and do not rush.

Proof of Theorem 14.7.7.

We want to show that the limit

\begin{equation*} \DD\left(\inverse{f}(b)\right)=\limit{h}{0}{\frac{\inverse{f}(b+h)-\inverse{f}(b)}{h}}=\frac{1}{f^\prime\left(\inverse{f}(b)\right)}. \end{equation*}

Since \(f(a)=b\) we know that \(\textcolor{red}{\inverse{f}(b)=a}\) so that

\begin{align*} \limit{h}{0}{\frac{\inverse{f}(b+h)-\textcolor{red}{\inverse{f}(b)}}{h}} \amp = \limit{h}{0}{\frac{\inverse{f}(b+h)-\textcolor{red}{a}}{h}}\\ \end{align*}

Observe that if \(b+h\) is in the domain of \(\inverse{f}\) then it is in the range of \(f\text{.}\) Thus there is some number, \(a+k\text{,}\) in the domain of \(f\) such that \(\textcolor{blue}{b+h}=\textcolor{blue}{f(a+k)}\text{.}\) (Note that \(k\ne 0\) since \(h\ne 0\) and \(f\) is one-to-one.) Thus

\begin{align*} \limit{h}{0}{\frac{\inverse{f}(\textcolor{blue}{b+h})-\inverse{f}(b)}{h}} \amp = \limit{h}{0} {\frac{\inverse{f}(\textcolor{blue}{f(a+k)})-a}{h}}.\\ \end{align*}

Again since \(f\) and \(\inverse{f}\) are mutually inverse they “undo” each other so \(\inverse{f}(f(a+k))=a+k\text{.}\) Thus

\begin{align*} \limit{h}{0}{\frac{\inverse{f}(b+h)-\inverse{f}(b)}{h}} \amp = \limit{h}{0}{\frac{k}{h}}.\\ \end{align*}

Solving \({b+h=f(a+k)}\) for \(h\) gives \(\textcolor{blue}{h}=\textcolor{blue}{f(a+k)-b}\) so

\begin{align*} \limit{h}{0}{\frac{k}{h}}\amp = \limit{h}{0}{\frac{k}{\textcolor{blue}{f(a+k)-b}}}\\ \end{align*}

and since \(b=f(a)\) we have

\begin{align*} \amp = \limit{h}{0}{\frac{k}{f(a+k)-f(a)}}\\ \amp = \limit{h}{0} {\frac{1} {\frac{f(a+k)-f(a)}{k}}}.\\ \amp = \frac{1} {\limit{h}{0}{\frac{f(a+k)-f(a)}{k}}}. \end{align*}

The expression \(\limit{h}{0}{\frac{f(a+k)-f(a)}{k}}\) would be \(f^\prime(a)\) if only we had \(k\rightarrow0\) instead of \(h\rightarrow0\text{.}\) What we need to show now is that if \(h\rightarrow0\) then \(k\rightarrow0\text{.}\) Then we could write

\begin{equation} \DD\left(\inverse{f}(b)\right) = \frac{1} {\limit{k}{0}{\frac{f(a+k)-f(a)}{k}}}=\frac{1}{f^\prime(a)}\tag{14.15} \end{equation}

and our proof would be complete. Written a little more carefully, what we need to show is that \(\tlimit{h}{0}{k}=0\text{.}\) Recall that \(a=\inverse{f}(b)\text{,}\) and that \(a+k=\inverse{f}(b+h)\) so we need to show that

\begin{equation*} \tlimit{h}{0}{k}=\limit{h}{0}{\left[(a+k)-a\right]}= \tlimit{h}{0}{\left[\inverse{f}(b+h)-\inverse{f}(b)\right]} = 0 \end{equation*}

or But we assumed that \(\inverse{f}\) is continuous at \(x=b\) which means that

\begin{equation*} \tlimit{h}{0}{\left[\inverse{f}(b+h)-\inverse{f}(b)\right]}=0, \end{equation*}

and the proof is complete. One last point: On the left side of (14.15) the variable is \(b\) and on the right it is \(a\text{.}\) While this is not strictly wrong it is a more useful theorem if we state it in terms of \(b\) alone. Since \(f(a)=b\) we see that \(\inverse{f}(b)=a\) so

\begin{equation*} \DD\left(\inverse{f}(b)\right) =\frac{1}{f^\prime(\inverse{f}(b))} \end{equation*}

and the proof is complete.

Using Theorem 14.7.7 we can now show that the derivatives of the inverse trigonometric functions and the natural logarithm are exactly what we expect them to be. The difference is that now there is no uncertainty or vagueness in our foundations. No modern Bishop Berkeley can step in and sew doubt.

Example 14.7.8. The Derivative of the Inverse Sine.

Suppose \(f(x)=\sin(x).\) Then \(\inverse{f}(x)=\inverse{\sin}(x)\) so

\begin{align*} \DD\left(\inverse{f}(x)\right) = \DD\left(\inverse\sin(x)\right)\\ \amp = \frac{1} {\textcolor{red}{f^\prime}(\textcolor{blue}{\inverse{f}}(x))}\\ \amp = \frac{1}{\textcolor{red}{\cos}(\textcolor{blue}{\inverse\sin}(x))}\\ \DD\left(\inverse{f}(x)\right) \amp = \frac{1}{\sqrt{1-x^2}}. \end{align*}

Problem 14.7.9.

Use Theorem 14.7.7 to show that each of the following differentiation rules is correct:

(a)

\(\DD\left(\inverse{\cos}(x)\right) = \frac{-1}{\sqrt{1-x^2}}\)

(b)

\(\DD\left(\inverse{\tan}(x)\right) = \frac{1}{1+x^2}\)

(c)

\(\DD\left(\inverse{\cot}(x)\right) = \frac{-1}{1+x^2}\)

(d)

\(\DD\left(\inverse{\sec}(x)\right) = \frac{1}{\abs{x}\sqrt{x^2-1}}\)

(e)

\(\DD\left(\inverse{\csc}(x)\right) = \frac{-1}{\abs{x}\sqrt{x^2-1}}\)

(f)

\(\DD\left(\inverse{\ln}(x)\right) = e^x\)

Wait a minute! Did we forget one? What about the natural exponential function? Don’t we also have to show that \(\DD\left(e^x\right)=e^x\text{?}\)

Drill 14.7.10.

Look back at Definition 8.2.2 and explain why it is not necessary to use limits to show that \(\DD\left(e^x\right)=e^x\text{.}\)

Prev Top Next