Calculus: Derivatives I

Constructing the Derivative

Motivation: How fast is a function changing? If the function is very steep, one would imagine the rate would be fairly high, and if the function is gradual or even flat, one would imagine the rate would be fairly low. How would one go about translating this concept into mathematics?

Secant Lines: A fine idea is to pick two points on the graph and draw a line through them. The slope of the line gives the average rate at which the function changes from one point to the next, as seen in Figure 1. A line through two points on a graph is called a secant line.

Graph showing f(x)=sin(x) and a secant line approximating the slope.
Figure 1

But there's a problem - it's not exact. The function is clearly steeper than the line in some regions and less steep in others. Furthermore, it's not clear which part of the function the line is trying to estimate: The first point? The second? The middle? Something else? In fact, one could easily pick two points that give a line that any reasonable person would agree is not very helpful. For example, Figure 2 shows that the line will be flat if the two points selected are $(0,0)$ and $(\pi,0)$.

Graph showing f(x)=sin(x) and a very poor secant approximation of the slope
Figure 2

How to improve? Well, why not make the points closer together? The closer the points are, the more accurate the estimate should be. Figure 3 is definitely more accurate than the previous two figures.

Graph showing f(x)=sin(x) and a more accurate secant approximating the slope.
Figure 3
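The effect of moving the points together can be checked numerically. The following is a minimal sketch rather than anything taken from the figures: the helper name `secant_slope` and the base point $x_0 = 1$ are illustrative choices. It also reproduces the flat secant line of Figure 2.

```python
import math


def secant_slope(f, x0, x1):
    """Slope of the secant line through (x0, f(x0)) and (x1, f(x1))."""
    return (f(x1) - f(x0)) / (x1 - x0)


# The poor choice from Figure 2: both points sit at height 0,
# so the slope is 0 up to floating-point rounding.
flat = secant_slope(math.sin, 0.0, math.pi)
print(f"points 0 and pi: slope = {flat:.2e}")

# Keep one point fixed at x0 = 1 (an arbitrary choice) and move the
# second point closer and closer to it.
x0 = 1.0
gaps = (1.0, 0.1, 0.01, 0.001)
slopes = [secant_slope(math.sin, x0, x0 + gap) for gap in gaps]
for gap, slope in zip(gaps, slopes):
    print(f"gap = {gap:<6} slope = {slope:.6f}")
```

The printed slopes change less and less as the gap shrinks, which suggests they are settling toward a single value.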

But this is math, you cry out, math is precise! Math is elegant! Is there nothing beyond this wasteland of approximations? Am I supposed to be satisfied with "just use very small differences" as the cruel failure of an answer to my thirst for precision?

In the name of the figs, no! The solution is to take the limit as the difference between the two points goes to $0$. That, young Newtonwalker, is the precise construction of a derivative.

Taking the limit: The slope between any two points $(x_0,y(x_0))$ and $(x_1,y(x_1))$ is given by the familiar formula of $\dfrac{\text{rise}}{\text{run}}$. A perk of doing harder math is that you get sexier notation, so instead we're going to denote rise, the change in $y$, as $\Delta y$, and run, the change in $x$, as $\Delta x$. The triangle symbol is the Greek letter capital delta, and means change. The slope between two points is then given by:

$$\dfrac{\Delta y}{\Delta x} = \dfrac{y(x_1) - y(x_0)}{x_1 - x_0}$$

Because we're interested in the difference between the two points, rather than the two points themselves, we can make a new variable $h$ to denote this difference:

$$h = x_1 - x_0$$

Fixing up our original equation gives us the following:

$$\dfrac{\Delta y}{\Delta x} = \dfrac{y(x_0 + h) - y(x_0)}{h}$$

And now, like the Eagles, we take it to the limit (although we're going to be doing this a lot more than one more time). Further, you now get even sexier notation: Capital delta usually refers to discrete changes, so to reflect the infinitesimal change, we use lowercase $d$ instead. Voilà, we have the definition of the derivative:

$$\dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0}\dfrac{y(x_0 + h) - y(x_0)}{h}$$
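To see the limit at work, the difference quotient can be evaluated for a sequence of shrinking $h$. This is only a sketch under stated assumptions: the function name `difference_quotient`, the choice $y = \sin(x)$, and the point $x_0 = 0$ are illustrative, and a computer can only sample small values of $h$, never take the limit itself.

```python
import math


def difference_quotient(y, x0, h):
    """The quotient (y(x0 + h) - y(x0)) / h from the definition above."""
    return (y(x0 + h) - y(x0)) / h


x0 = 0.0
quotients = {}
for k in range(1, 7):
    h = 10.0 ** -k  # h = 0.1, 0.01, ..., 0.000001
    quotients[h] = difference_quotient(math.sin, x0, h)
    print(f"h = {h:.0e}   quotient = {quotients[h]:.10f}")
```

The quotients creep toward $1$ as $h$ shrinks. In floating-point arithmetic, making $h$ extremely small eventually makes the result worse rather than better, because the subtraction in the numerator loses precision. That is one more reason the exact limit is worth having.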

Figure 4 below shows the derivative as the slope of the tangent line at a particular point:

Graph showing sin(x) and the derivative as the tangent line at a point
Figure 4

The derivative is itself a function. Plugging a value into the derivative, say $x=0$, produces a value, say $\dfrac{dy}{dx}(0)=1$, that is the slope of the tangent line to the original function when $x=0$.
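The idea that the derivative is a function in its own right can be mimicked in code. In this hedged sketch, `approximate_derivative` is a hypothetical helper that takes a function and returns a new function. It uses the difference quotient with a small fixed $h$ (an assumption, here $10^{-6}$) in place of the true limit, so its outputs are approximations.

```python
import math


def approximate_derivative(f, h=1e-6):
    """Return a NEW function that approximates the derivative of f.

    Uses the difference quotient with a small fixed h instead of a true
    limit, so the results are approximations only.
    """
    def df(x):
        return (f(x + h) - f(x)) / h
    return df


dsin = approximate_derivative(math.sin)  # dsin is itself a function

for x in (0.0, math.pi / 2, math.pi):
    print(f"x = {x:.4f}   sin(x) = {math.sin(x):+.4f}   slope = {dsin(x):+.4f}")
```

Calling `dsin(0.0)` gives a value very close to $1$, the slope of the tangent line to $\sin(x)$ at $x=0$.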

What the derivative is not: As defined, the derivative of a function gives the slope of the tangent line at a given point. This bears repeating - the derivative of a function gives the slope of the tangent line at a given point. It is itself a new function that is distinct from the function of which it is the derivative. The number of ways new calculus students can get confused on this point is truly impressive. It is worth listing a few of the things a derivative is not:

  • The original function
  • Necessarily a line
  • Necessarily positive if the original function is positive
  • Necessarily negative if the original function is negative
  • Somehow the equation for a line that is magically tangent to the function everywhere.
  • Something other than the slope of the tangent line at a given point.
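A concrete case makes several of these points at once. It takes as given the standard result, not derived in this section, that the derivative of $\sin(x)$ is $\cos(x)$:

$$y = \sin(x), \qquad \dfrac{dy}{dx} = \cos(x)$$

$$y(2) = \sin(2) \approx 0.909 > 0, \qquad \dfrac{dy}{dx}(2) = \cos(2) \approx -0.416 < 0$$

Here the derivative is a different function from the original, it is not a line, and it is negative at a point where the original function is positive.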

Note on notation: The notation $\dfrac{dy}{dx}$ is used when differentiating a function written as $y = \ldots$, and is referred to as Leibniz notation, after Gottfried Wilhelm Leibniz, Newton's contemporary who independently developed calculus but for some reason gets less credit. The notation $f'(x)$ is also used to show the derivative of $f(x) = \ldots$, where the tick mark (called a prime) denotes that it's a derivative.

Solving limits for general classes of functions: You may have acquired a creeping sense of unease upon learning about how to take derivatives. This process looks tedious! Must we do all this limit work every time we take a derivative? Thankfully, no! The trick is to solve for general classes of functions, and then use the resulting general forms of the derivatives to solve specific derivatives. The problems below are based on deriving these general rules, and the other topics in this section are devoted to practicing the use of these rules.


  1. Linearity of Differentiation: In layman's terms, an operator is like a function that maps functions to other functions, rather than mapping numbers to other numbers. That is to say, functions eat numbers and poop out numbers, and operators eat functions and poop out functions. The differentiation operator $D$ can be written as follows:

    $$D(f) = \lim\limits_{h \rightarrow 0}\dfrac{f(x+h) - f(x)}{h}$$

    A linear operator is an operator $L$ that has the following properties:

    1. Constant scaling: For a constant $c$, $L(cf) = cL(f)$
    2. Additivity: $L(f + g) = L(f) + L(g)$

    Prove that differentiation is a linear operator.

    Hint: Recall the properties of limits

    Answer:

    1. $D(cf) = \lim\limits_{h \rightarrow 0} \dfrac{cf(x + h) - cf(x)}{h} \\ D(cf) = \lim\limits_{h \rightarrow 0} c\dfrac{f(x + h) - f(x)}{h} \\ D(cf) = c\lim\limits_{h \rightarrow 0} \dfrac{f(x + h) - f(x)}{h} \\ D(cf) = cD(f) \\ $
    2. $D(f) + D(g) = \left( \lim\limits_{h \rightarrow 0} \dfrac{f(x + h) - f(x)}{h} \right) + \left( \lim\limits_{h \rightarrow 0} \dfrac{g(x + h) - g(x)}{h} \right) \\ D(f) + D(g) = \lim\limits_{h \rightarrow 0} \left[ \dfrac{f(x + h) - f(x)}{h} + \dfrac{g(x + h) - g(x)}{h} \right] \\ D(f) + D(g) = \lim\limits_{h \rightarrow 0} \dfrac{f(x + h) + g(x + h) - f(x) - g(x)}{h} \\ D(f) + D(g) = D(f+g) \\$
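Both properties already hold for the difference quotient at any fixed $h$, before the limit is taken, which is why the limit laws finish the proof. The following is a quick numerical sanity check, not a proof. The helper `quotient`, the example functions, and the values of $c$, $x$, and $h$ are all arbitrary assumptions.

```python
import math


def quotient(f, x, h):
    """Difference quotient of f at x with step h."""
    return (f(x + h) - f(x)) / h


f, g = math.sin, math.exp   # arbitrary example functions
c, x, h = 3.0, 0.7, 1e-3    # arbitrary example values


def scaled(t):      # the function c*f
    return c * f(t)


def summed(t):      # the function f + g
    return f(t) + g(t)


scaling_gap = abs(quotient(scaled, x, h) - c * quotient(f, x, h))
additivity_gap = abs(
    quotient(summed, x, h) - (quotient(f, x, h) + quotient(g, x, h))
)
print(scaling_gap, additivity_gap)  # both are zero up to rounding error
```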
  2. Polynomials: Compute the derivative for each of the following:

    1. $y = x$
    2. $y = x^2$
    3. $y = x^3$

    Show for the general case that for $y = x^a$, where $a$ is a positive integer, $\dfrac{dy}{dx} = ax^{a-1}$.

    Hint: Remember the binomial theorem.

    Answer:

    1. $\dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{(x+h) - x}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{h}{h} \\ \dfrac{dy}{dx} = 1 \\ $
    2. $\dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{(x+h)^2 - x^2}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{x^2 + 2xh + h^2 - x^2}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{2xh + h^2}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \left( 2x + h \right) \\ \dfrac{dy}{dx} = 2x \\ $
    3. $\dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{(x+h)^3 - x^3}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{x^3 + 3x^2h + 3xh^2 + h^3 - x^3}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{3x^2h + 3xh^2 + h^3}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \left( 3x^2 + 3xh + h^2 \right) \\ \dfrac{dy}{dx} = 3x^2 \\ $
    4. $\dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{(x+h)^a - x^a}{h}$

      The trick here is to know that, for a positive integer $a$, $(x+h)^a = \displaystyle\sum\limits_{i=0}^{a} \binom{a}{i}x^{a-i}h^i$:

      $\dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{\left( \sum\limits_{i=0}^{a} \binom{a}{i}x^{a-i}h^i \right) - x^a}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{\left( x^a + \sum\limits_{i=1}^{a} \binom{a}{i}x^{a-i}h^i \right) - x^a}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \dfrac{\sum\limits_{i=1}^{a} \binom{a}{i}x^{a-i}h^i}{h} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \displaystyle\sum\limits_{i=1}^{a} \binom{a}{i}x^{a-i}h^{i-1} \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \left[ \binom{a}{1}x^{a-1}h^{1-1} + \displaystyle\sum\limits_{i=2}^{a} \binom{a}{i}x^{a-i}h^{i-1}\right] \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \left[ ax^{a-1}h^0 + \displaystyle\sum\limits_{i=2}^{a} \binom{a}{i}x^{a-i}h^{i-1}\right] \\ \dfrac{dy}{dx} = \lim\limits_{h \rightarrow 0} \left[ ax^{a-1} + \displaystyle\sum\limits_{i=2}^{a} \binom{a}{i}x^{a-i}h^{i-1} \right]\\ \dfrac{dy}{dx} = ax^{a-1} + \displaystyle\sum\limits_{i=2}^{a} \binom{a}{i}x^{a-i}(0)^{i-1} \\ \dfrac{dy}{dx} = ax^{a-1} + \displaystyle\sum\limits_{i=2}^{a} 0 \\ \dfrac{dy}{dx} = ax^{a-1} \\ $
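The key step of the general case, rewriting the difference quotient of $x^a$ as the sum over $i = 1, \ldots, a$, can be checked with exact rational arithmetic. This is a sketch assuming Python 3.8 or later (for `math.comb`). The helper names `quotient` and `expanded`, the exponent $a = 5$, and the point $x = 3$ are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import comb


def quotient(a, x, h):
    """Difference quotient of y = x^a, computed directly."""
    return ((x + h) ** a - x ** a) / h


def expanded(a, x, h):
    """The same quotient after the binomial expansion: terms i = 1..a."""
    return sum(comb(a, i) * x ** (a - i) * h ** (i - 1) for i in range(1, a + 1))


a, x = 5, Fraction(3)            # arbitrary example exponent and point
power_rule = a * x ** (a - 1)    # a * x^(a-1) = 405

for h in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 100000)):
    assert quotient(a, x, h) == expanded(a, x, h)  # exact with fractions
    gap = float(quotient(a, x, h) - power_rule)
    print(f"h = {float(h):.0e}   quotient - a*x^(a-1) = {gap:.6f}")
```

Because every term with $i \geq 2$ carries at least one factor of $h$, the printed gap shrinks toward $0$ along with $h$, leaving only $ax^{a-1}$.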