Printer Friendly Version of this page
The Area Under A Curve
How To Calculate It Using A Spreadsheet
As with many things that I do with a spreadsheet, the following discussion was sparked off by an e-mail message flying in from the ether. The question that was asked follows and my solution then unfolds!
I sometimes come across situations at work where I need to calculate the area under a plot of a data series but I can't think of a way to do it in Excel (I still use Excel 97). Anybody any ideas?
I will use the Trapezoid Rule in this page and have to confess that my previous version of this page contained an error that Sameer Bhide spotted for me: thanks Sameer. The mathematics involved in this method is not especially advanced although at first sight there are more calculations than we might like! Then Rick Andrews wrote to me in December 2008 to point out an error with the integral calculus section of the page: excellent work Rick, thank you.
For matherically minded visitors, I have updated this page to include a simple treatment of the Integral Calculus way of working through such calculations. This additional section is at the end of the page.
Imagine we are presented with data relating to the cooling of a dish of water from boiling point for 10 minutes. The table below records the times (in minutes) and the temperature (C = degrees Celsius).
Plotted on a graph, we find:
Our task, then, is to estimate the area under the curve extending from 0 minutes and 100 degrees to 10 minutes and 3.1 degrees.
What we are going to do is really to draw a series of boxes, trapezoids, over the curve on the graph and then add together the area of the first trapezoid to the area of the second trapezoid to the area of the third and subsequent trapezoids. The method is explained below: do follow it through step by step.
We find the results of using the method in the following table:
I have used the Trapezoid rule here although it's not obvious yet.
In fact the area in the '0' row in fact shows the area in the range x = 0 to 1; similarly, the area in the '1' row shows the area in the range x = 1 to 2 and so on. Figure 3 is a simpler, although different, version of figure 2 and it demonstrates the trapezoid points.
When we take the area of the range covered by x = 0 to 1, the shape we would draw would be a trapezoid: the green shape on the bottom left of figure 3. To calculate the area of a trapezoid, we have to split it into two: we make a rectangle out of it and then a triangle. This is explained on the right hand side of figure 3. Leaving us with a formula of
Either the area of a trapezoid is ba + b(c-a)/2
For the range x = 0 to 1 and then for the range x = 1 to 2 we have
Do this for each range, increasing in steps of 1 (0 - 1, 1 - 2, 2 - 3 etc) and we arrive at table 2.
There is a variation on this method that improves the result: it involves using logarithms. The following table shows that when we use Natural Logarithms we get a slightly refined result: for this example anyway. For other examples, the logarithm method could improve the situation even more.
The logarithm method is really only applicable when we are dealing with non linear data sets and it works this way.
We need to add a column headed ln(y) meaning that it shows the value of the Natural Logarithm of the 'y' values. The area values in this case are exactly the same as before EXCEPT for the result for the range x= 8 - 10.
To evaluate the area for the range x = 8 - 10, we need to regress x on ln(y) to find the beta value of X coefficient as Excel calls it. We use Excel's in built Tools/Data Analysis function to carry out this calculation. The value of beta is -0.34831 and remember it is still in the form of a logarithm. To use this value, we need to find the antilog which is found by Excel to be =EXP(-0.34831) = 0.705881. We then enter the following formula in the cell to give us an estimate of the area for the range x = 8 - 10 = y/0.705881 = 6.2/0.705881 = 8.7834.
This method is simplified and it gives an approximate solution. If we split our data into more and more ranges, we need more and more trapezoids and that will make our results more accurate. If we take it a stage further we can use calculus and obtain exact answers.
NOTE: I used Microsoft Excel 2000 when developing this sheet but the method will work (with whatever minor amendments necessary) on virtually any spreadsheet I have ever come across.
BONUS: Using Integral Calculus to Find the Area Under a Curve
I have to confess that as I wrote and then rewrote this page I felt myself more and more drawn to including Integral Calculus here. Not that I'm a calculus freak or anything like it. Just that it gives us a much more direct answer that is highly accurate!
Let's take a really simple example of using integral calculus to find the area under a very simple curve: the graph of y= x over the range 0 to 5, or as mathematicians say:
This is what the graph of that looks like:
and we want to calculate the area of the blue section from x = 0 to 5 and y = 0 to 5
Since we're dealing here with a triangle, we learned at school that
the area of a triangle = ½ * base * height
Let's do that: ½ * 5 * 5 = 12.5 square units
Integrative calculus notation shows this as:
Translating this into a more user friendly style we get
Hang on ... where did the
come from? Well, that's what we need to do to integrate.
To evaluate this expression, we then find the value of y when x = 5 and subtract the value of y when x = 0 from it. In this case, we have:
NEW QUESTION: find the area under the curve for
... this gives
Back to the example that we started with now: you should be able to set out the function for the curve in figure 3: ... the answer is 165 square units whereas the answer shown in figure 3 says the area is 167.5. Well, that's not bad considering the Trapezoid Method is only an approximation.
FOR INTEGRAL CALCULUS PURISTS: please note I have omitted the constant 'k' that you are all familiar with. Forgive me but it doesn't affect the answers for these kinds of calculations, does it?
I have to be perfectly honest and say that whilst I have worked through this discussion of integral calculus and developed all of the graphics and explanations, I have had some help. The links on the left hand side of this page take you to some experts! I am indebted to them.
Printer Friendly Version of this page
© Duncan Williamson
© Webmaster Duncan Williamson 2001, 2004 & 2005