Non Accounting Home Page

Geometric Mean page

Means, Medians and Modes of Grouped Data

Arithmetic Mean page

Measures of Central Tendency with some MS Excel Tips

The Median and the Mode

Introduction

Following on directly from the arithmetic and weighted means page, here we are now with a discussion of the median and the mode. Two alternative ways of measuring the average of a data set.

The Median

The median is defined as the mid point of the values in a data set after they have been ordered from the smallest to the largest.

With the median, there are often no calculations to make: hurrah! All we need to do is to put the data set in order and then find the middle value. For example: find the median salary from the following monthly salary values for nine employees in my department:

Salaries
1200
2300
1950
2750
1900
3500
1650
2750
2100

All we have to do is to put these salaries in order starting with the smallest (1,200) and ending with the highest (3,500) and then find the middle (5th) salary value:

1st 1200
2nd 1650
3rd 1900
4th 1950
5th 2100
6th 2300
7th 2750
8th 2750
9th 3500

A neat trick to help you is to sort the data from smallest to largest, as we have done here; and THEN to sort the same data from largest to smallest; and THEN to put the two series next to each other and automatically the median is revealed! Take a look at what I mean:

small to
large
large to
small
1200 3500
1650 2750
1900 2750
1950 2300
2100 2100
2300 1950
2750 1900
2750 1650
3500 1200

You wouldn't need to use this trick with a small data set; but if you have many data points to consider this trick could be useful … especially if you are using a spreadsheet!

Microsoft Excel's median function is =MEDIAN(range of data) and it doesn't matter whether you enter your data in order or randomly, Excel will sort it out.

For example, assuming the data in the previous example were in the range A1:A9, enter =MEDIAN(A1:A9) in any cell outside the range A1:A9, press enter and hey presto all is revealed!

The Median and Outliers

The median can help with outliers. Remember the problem with outliers and the mean values is that they take all values into account. So, when there is an outlier, or very large or very small value, the mean is distorted. The median tends to overcome that problem. Take a look at this example:

House prices
small to large
House prices
large to small
60000 275000
65000 80000
70000 70000
80000 65000
275000 60000

In this case we have four values that are within 20,000 of each other and then another value that is 195,000 higher than the next highest value: an outlier.

Nevertheless, the median has returned a value of 70,000 which is representative of the majority of the house prices under review. It does this automatically, of course: by working in the way that it does, the median will effectively ignore the outliers.

A Problem with the Median

Did you spot anything problematic with the median? Wasn't it convenient that we had an odd number (nine) salary values and an odd number (five) of house prices!

Suppose that we had ten salary values, how would we have calculated the median then? Not sure what I'm on about? Well, try finding the median (the middle value) for the following rates of return that are already sorted for us:

Rates
17.2%
21.0%
22.6%
25.4%
28.5%
28.6%

Not so easy is it? There isn't a middle value because we have an even number of rates of return. The median, in fact, is somewhere between the third and the fourth values: between 22.6% and 25.4% and we find it by taking the arithmetic mean of these two values: the median is 24.0%.

How does Microsoft Excel cope with an even number of data points? Have a go! Enter the table of rates of return in your Excel spreadsheet, enter the MEDIAN function and see what happens!

Another point worth making about Excel's MEDIAN function is that we can use it to find the median of all of the data set or parts of it: that might be useful. Excel's Help file has a simple example of how to do this.

The Mode

The mode is the value that appears most frequently. That's it! So, in a medium to large sized data set, a number could easily appear several times and the number that appears most frequently is the mode, or modal value.

To find the mode, take your data and put it into ascending order then check off which number appears the most. Here is a list of 15 numbers from 5 to 15 that have been drawn at random. Let's find the modal value.

Raw
data
Ordered
data
Number of
numbers!
6 6 3
13 6
7 6
11 7 1
11 11 3
6 11
13 11
14 12 1
12 13 5
6 13
11 13
14 13
13 13
13 14 2
13 14

So, the modal value is 13 because it appears 5 times in the list whereas the others appear only once, or twice or three times …

Microsoft Excel can help us to take a look at the median very simply: a graphic that's not a graphic, by using the REPT function. Here is what it can look like:

6 |||
7 |
11 |||
12 |
13 |||||
14 ||

The left hand column in the table above shows the frequency of a set of numbers in a data set. I set up the REPT function in the right hand column of the table as follows:

=REPT("|";C1)

where REPT means repeat or redraw a number, letter or symbol
"|" is the symbol to repeat … I could have used "!" or "H" or "&" or anything
C1 is where I have put the frequency of occurrence of number 6 and number 6 is itself in cell A1.

Then copy the function down as usual and hey presto, you have a graphical/histogram type view of your data.

Please note, because we are dealing with small data sets, the examples here shouldn't be difficult to deal with. However, if we were to deal with, say, finding the mode of 100 or 1000 numbers, we would be wise to use Excel's Tools/Data Analysis/Histogram function to prepare the table we need to use the REPT function. See Excel's Help files for help on the Histogram function.

Microsoft Excel also has a MODE function that is as simple to use as the MEDIAN function. The MODE function is =MODE(range of data).

Use Excel's MODE function to find the modal value of the following data set. If you feel confident, why not use the REPT function too and prepare your own graphical view of the data ... you should use the Tools/Data Analysis/Histogram function to help you here.

As in the examples of =AVERAGE(range of data), use the MODE function in the same way.

For you to do: work out the mode from the data in the table that follows and then create a histogram using the REPT function.

5 7 9 10 13
5 9 12 12 8
10 8 5 15 12
11 6 13 12 8
13 9 10 7 9
5 10 11 9 9
11 9 10 15 6
8 14 5 5 11
15 9 13 14 12
5 14 14 12 9

All being well, you will have found the following:

Number Frequency REPT
5 7 |||||||
6 2 ||
7 2 ||
8 4 ||||
9 9 |||||||||
10 5 |||||
11 4 ||||
12 6 ||||||
13 4 ||||
14 4 ||||
15 3 |||
MODE Number 9  

Now you have a choice:

Go to the Geometric Mean page

Work through the Means, Medians and Modes of Grouped Data

Go to the Arithmetic Mean page

Duncan Williamson
24 February 2002

Write to me at any time

 
© Webmaster Duncan Williamson 2002