Contents Supplements and Links
Objectives: To solve the problem of finding the generalization of a few simple results, using a new construction. To gain experience in computing interpolating polynomials. To develop an appreciation of mathematical induction and "infinite" problems. Necessary Background: It is assumed that students are familiar with the method of mathematical induction.
Summary: In Section I, students discover explicit formulas for the sum for p = 0, 1, 2, 3, and 4. In Section II, they are led to discover the general formula.


I. The Formula Skeleton

We would like to investigate sums of consecutive like powers p. Our goal is to arrive at an expression for the function

(1)

that captures any patterns that might be present in these sums. In particular, we would like to be able to reduce these sums computationally. For example, suppose we wished to compute the sum of the first billion square numbers. Is there any faster way of doing this than by simply adding up 1 + 4 + 9 + 16 + ... ? The answer is "yes", but first let's start with more simple cases.

Consider the power p = 0. What is f0(n) = 10 + 20 + 30 + ... + n0, the sum of the first n natural numbers, each raised to the 0th power?


For p = 1, the sum (1) becomes the sum

of the first n natural numbers. Find f1(n) for n = 1, 2, 3, ..., 10.


This sequence is called the sequence of triangular numbers because its members are the numbers of dots in the following isosceles right "triangles":

There is a story in the mathematical folklore about the great mathematician Gauss (1777–1855), who, while in elementary school, reportedly found the sum of the first hundred natural numbers in a matter of minutes. Gauss did this by pairing up the numbers into pairs that summed to 101:

Since there are fifty such pairs, the total sum is then 50 · 101 = 5050.

Use Gauss's method to find a general expression for the sum of the first n natural numbers in terms of n.


Now we move on to the case p = 2. Compute the sum of the first n squares for n = 1, 2, 3, 4, 5, and 6. Is there an obvious trick for computing these sums that works like Gauss's method for p = 1?


To find a general formula for , we will use an "interpolating polynomial". Interpolation is the process of fitting a curve to a set of data points. For example, there is a unique line that passes through any two points in the plane. If the coordinates of the plane are x and y and our two points are (x1, y1) and (x2, y2), then the line passing through them is given by the equation

We say that this "curve" (in this case, a line) fits the data points (x1, y1) and (x2, y2). (Here we assume that x1x2 so that y is actually a function of x. We need not worry about this in the case of sums of consecutive powers since for every p, fp(n) is a function of n by definition.) For example, let's fit a line to the points (x1, y1) = (1, f0(1)) = (1, 1) and (x2, y2) = (2, f0(2)) = (2, 2). Check that the above formula gives the same polynomial for f0(n) that you determined earlier.


Any three points in the plane (with distinct x values) can be fit to a polynomial of degree two; any four points can be fit by a polynomial of degree three; etc. One formula for the interpolating polynomial of p + 2 points (xi, yi) is the so-called Lagrange interpolating polynomial,

(The product is over 1 ≤ ji – 1 and i + 1 ≤ jn.) First note that y(xk) = yk for each k since the summand is 0 for every ik and yi for i = k. Secondly, note that the polynomial has degree at most p + 1. One computes the polynomial passing through p + 2 points merely by plugging the x- and y-coordinates of the points into the Lagrange interpolating polynomial. We might be lucky in that f2(n) is a polynomial in n of degree 3. Assuming that it is, find this polynomial using the Lagrange interpolating polynomial. Check whether this polynomial works for the six data points you found above.


Mathematical induction is a method of proof that is often useful for proving identities such as this one. Prove that this polynomial for f2(n) holds for all natural numbers n.


Possibly by implementing the Lagrange interpolating polynomial on a calculator or computer, use the same method to guess polynomial expressions for f3(n) and f4(n). Prove that these expressions hold for all natural numbers n.


Now that we have found fp(n) to be a polynomial for several special cases, we may start to look for patterns in the hope of discovering a general polynomial formula. What patterns can be observed in the formulas found for p = 0, 1, 2, 3, and 4?


At this point the reader is invited to depart from this outline of the project and to work independently toward a general formula for fp(n). It is possible to find this formula empirically; however, you will likely need outside information. What follows is an approach to the problem that is guided by foreknowledge of the answer. Certainly this is not how research is conducted in mathematics, but in this case the answer is made simpler by considering additional constructions that are not especially natural unless one knows that they will work. But that is not to say that they cannot be discovered without foreknowledge; the reader who chooses not to follow the rest of this outline should start by finding the polynomials fp(n) for many additional p and looking for patterns in this sequence of polynomials.


II. Bernoulli Polynomials

For each natural number p, let Bp(n) be a polynomial in n of degree p satisfying

(2)

This defines Bp(n) up to its constant term. We can see this by writing (for a fixed p) . Then

so bp = 1. The coefficient bp – 1 is determined by bp; bp – 2 is determined by bp and bp – 1; ...; and b1 = –(bp + bp – 1 + ... + b2). We call Bp(n) the pth Bernoulli polynomial, after Jakob Bernoulli (1654–1705), who discovered them in his attempt to find the general formula for sums of consecutive powers.

The constant term of Bp(n) is defined in mathematics, but we need not concern ourselves with it here. We write Bp = Bp(0) for the constant term of the pth Bernoulli polynomial. (These numbers are called Bernoulli numbers and appear throughout mathematics.) Let p > 1. By substituting n = 0 into the definition of Bernoulli polynomials, find another expression for the Bernoulli number Bp.


The first Bernoulli polynomial is B1(n) = n + B1. From the definition, find Bp(n) – Bp for p = 2, 3, and 4.


Now we notice a similarity between fp(n) and Bp(n). Using the definition of fp(n), find a simple expression for the difference fp – 1(n) – fp – 1(n – 1).


This statement is very reminiscent of the definition of Bernoulli polynomials. Multiply both sides of the equality by p. Since the pth Bernoulli polynomial is the unique polynomial (up to a constant) satisfying (2), find a relationship of the function p fp – 1(n) to a Bernoulli polynomial (assuming that fp – 1(n) is a polynomial in n of degree p).


At this point we have a conjectural formula for fp(n) as a polynomial of degree p + 1, assuming that it is a polynomial of this degree. We would like to show that it in fact is this polynomial. For each p we will do this by induction on n. For p > 0 we may redefine fp(n) as

since summing from m = 0 to n gives the same result as summing from m = 1 to n. Use this to find the constant term of fp(n).


For each p, prove the conjectural formula for fp(n) by induction on n.


One might object to this "solution" of finding a formula for sums of consecutive powers under the premise that Bernoulli polynomials aren't sufficiently explicit. After all, we have only changed the problem of computing sums of like powers into one of finding the coefficients of Bernoulli polynomials, and we don't have a closed-form expression for these coefficients. There are two points to make in response to such an objection.

The first is that not every problem in mathematics has its solution in terms of fundamental constructs such as the integers and π. While this accounts for much of the difficulty of mathematics, it is responsible for everything interesting and beautiful in mathematics as well. There is a hierarchy of mathematical objects that appear in nature, and the Bernoulli polynomials (and especially the Bernoulli numbers) happen to lie outside of any "direct" construction from the integers. This emphasizes their value.

The second is that, in a real sense, we have reduced the problem computationally, as it is much easier (from a computational standpoint) to find the p + 1 coefficients of Bp(n) and then to evaluate at n = 1000000 than it is to start adding up powers of natural numbers. Doing both by hand gets the point across, but it is even more appreciated when both methods are implemented on a computer.


References

Relevant links: