Saturday, September 12, 2009

Integration for Programmers

Calculus can be very useful for improving the performance of a function. Here, I’ll demonstrate how integration works using Visual Basic code instead of the Greek used in math class. I chose VB since many people can open Excel, Word or others and try it out for themselves. I have instructions for that as well.

You can watch the real work in a companion video on YouTube.
http://www.youtube.com/watch?v=xXxXxXxXxXx

Background
In calculus 101, you learn integration and differentiation for solving things like acceleration, area and volume. If programmers don’t know the formula for something like that, they know how to get a pretty good answer by writing a loop. If that’s not accurate enough, just loop a lot more. But at some point, such looping becomes exceedingly inefficient. Now just imagine what that was like for Newton when he was trying to predict some future location of a planet. He needed a better method. He and Leibniz separately formalized the rules of Calculus in the 1670s. And it’s the notations of Leibniz that are in use today.

Example
Let’s use a really simple example of the area under a line where the line is defined as the function y = x. This is a diagonal line. The area under it looks like a right triangle that is half of a square. So the formula is:

Area of half a square = x2/2


Now let’s say that we want the area under x between 1 and 3. You don’t need a computer or calculus to figure that out. It’s 4. The formula for that is:

Area = 32/2 – 12/2 = 4

We need a more general formula that can handle any interval like a and b.

Area = b2/2 – a2/2

Let’s pretend we don’t know all that but we do know the formula for summing integers up to N.

1+2+3+...+N = (N2+N)/2

This works great for integers but we don’t know how to use this for areas. So instead we write a loop that cuts the area up into 20 thin slices. Each slice is 1/10 wide by x high. We can calculate the area of each slice and add them together.

.1+.11+.12+...+.28+.29+.3 = 4.2

That’s a little off. So we could try carving it up into 200 slices.

.01+.011+.012+...+.028+.029+.03 = 4.02

Replacement
That’s better but still not exact. Y = X is an overly simple example. But for harder formulas, that’s just how they did it before calculus. Today, mathematicians call this a Riemann Sum. The idea is to keep carving it up into thinner and thinner slices. The problem is you can’t quite go all the way to infinitely thin. Instead the idea is to get the problem down to some fundamental loop that you can replace. In our case, we want to replace:

For i = 1 to N
A = A + i
Next

with

A = (N2+N)/2


Getting Rid of N
You can see just how that’s done in the video. After that, you’re left with all these Ns in the formula. You’ll see that there are ways of getting rid of those too. Hopefully, cancellation gets rid of most of them. For the rest, it helps to imagine those thin slices again. Lots of thin slices means N gets really big. If N is only on the bottom of a fraction, as N approaches infinity, that fraction gets really small. At that point, it’s safe to remove that fraction from the formula and you’re done.

Coding
Here’s the code you’ll start with. To try it, open Microsoft Word or Excel and press Alt + F11. That opens the Visual Basic editor. In the editor, double-click “ThisDocument” or “ThisWorkbook”. Paste in this code and click Run. A box pops up with the answer: 4.0002. That answer’s pretty close since the code loops 10,000 times. This code also works in a VBS file if you’re any good with that.

Sub FunctionVersion01()
a = 1 ' Start of interval
b = 3 ' End of interval
N = 10000 ' Number of slices
DeltaX = (b - a) / N ' Slice width
For i = 1 To N
x = a + i * DeltaX
y = x ' <---------------- Formula! (Slice height)
SliceArea = y * DeltaX
Area = Area + SliceArea
Next
MsgBox Area
End Sub


More Integration
If you think you’ve got it, try the same thing with the formula y = x2. Depending on your loop, you may need some other summation formula with which to replace it. And in most cases, you won’t have to go through this at all. You can just replace the whole thing with the integral.


Indefinite Integrals
Sequential Integer Sums
∫ x dx = x2/2
n
∑ i = (n2+n)/2
i=1
∫ x2 dx = x3/3
n
∑ i2 = (2n3+3n2+n)/6
i=1
∫ x3 dx = x4/4
n
∑ i3 = ((n2+n)/2)2
i=1
∫ 1/x dx = ln(x)

∫ cos(x) dx = sin(x)

∫ ex dx = ex


References
http://en.wikipedia.org/wiki/Summation
http://en.wikipedia.org/wiki/Riemann_sum
http://en.wikipedia.org/wiki/Riemann_integral
http://en.wikipedia.org/wiki/Integral
http://www.youtube.com/watch?v=gFpHHTxsDkI