# How to calculate simple statistics

## Introduction

In statistics, some of the most commonly used calculations are the mean, mode, median, variance, and standard deviation (std dev).

In this hub, I will cover the following:

- sample size
- population
- mean
- mode
- median
- variance
- standard deviation

## Sample Size and Population

Statistics begins with a set of numbers, which is called the *sample*.

The full set of values that the sample is drawn from is called the *population*.

Let's say that we ask 5 friends to rate a popular movie on a scale from 1 to 10.

Then, the *sample size* is 5 and the population is the set of all people who have seen or will see the movie.

## Calculating the mean

So, we ask our 5 friends to rate the movie and here's what we get:

Fred: 6

Sally: 9

Michael: 8

Raul: 9

Elena: 2

To calculate the *mean*, you sum up all the numbers in the sample and then divide by the sample size. The sum is 6 + 9 + 8 + 9 + 2 = 34. Since the sample size is 5, the mean is 34/5 = 6.8.

The mean is also known as the average of the sample.
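As a minimal sketch, the same calculation can be written in Python. The variable names (`ratings`, `total`, `sample_size`) are just illustrative choices; `statistics.mean` is the standard library function that does the sum and the division in one call.

```python
import statistics

# Ratings from the five friends in the example above.
ratings = {"Fred": 6, "Sally": 9, "Michael": 8, "Raul": 9, "Elena": 2}

values = list(ratings.values())
total = sum(values)            # sum of all numbers in the sample
sample_size = len(values)      # how many numbers are in the sample
mean = total / sample_size     # sum divided by sample size

print(total, sample_size, mean)      # 34 5 6.8
print(statistics.mean(values))       # same result from the standard library
```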

## Calculating the mode

The *mode* is the number that appears the most often in the sample.

To calculate the mode, we count the number of times each rating is made. So we have one 6, two 9's, one 8, and one 2. Since we have two 9's and one of everything else, 9 is the mode.

But what would happen if we had the following sequence: 2, 2, 8, 9, 9?

In this case, 2 and 9 both appear twice, so we would say that there is no unique mode. A mode is unique if and only if one number is more frequent than all others.
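Here is a small Python sketch of the counting approach described above. The helper name `modes` is made up for this example; it returns every value tied for the highest count, so a result with more than one entry means there is no unique mode.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, in sorted order."""
    counts = Counter(values)            # how many times each rating appears
    highest = max(counts.values())      # the largest count
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([6, 9, 8, 9, 2]))   # [9]     -> 9 is the unique mode
print(modes([2, 2, 8, 9, 9]))   # [2, 9]  -> no unique mode
```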

## Calculating the median

The *median* is the value we get when we order all of our numbers and then find the one in the middle.

If we order the numbers from smallest to largest, we get: 2, 6, 8, 9, 9

Since we have a sample size of 5, the number in the middle is 8.

But what happens if the sample size is even? In this case, there is no single middle number, so we add the two middle numbers and divide by 2.

So, if our numbers are 2, 6, 8, 9, then the median is (6 + 8)/2 = 7.
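Both cases (odd and even sample size) can be sketched in Python as follows. The function name `median` is our own here; the standard library offers `statistics.median`, which follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    if not values:
        raise ValueError("median needs at least one value")
    ordered = sorted(values)            # order from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                      # odd size: one number in the middle
        return ordered[middle]
    # even size: add the two middle numbers and divide by 2
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([6, 9, 8, 9, 2]))   # 8
print(median([2, 6, 8, 9]))      # 7.0
```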

## Calculating Variance

The *variance* is a measure of the variation of the sample data. The larger the variance, the more spread out the answers are around the mean. Many people find standard deviation to be a more useful measure of variability.

The method for calculating the variance depends on whether we are calculating the variance of a population (everyone) or of a sample (some but not all). For a population, the sum of squares is divided by the population size; for a sample, it is divided by the sample size minus 1. Since our 5 friends are a sample, we use the sample method here.

Here are the steps:

(1) Figure out the mean. This is the sum of the numbers given divided by the sample size (i.e. the average).

(6 + 9 + 8 + 9 + 2)/5 = 34/5 = 6.8

(2) Figure out the difference between each number and the mean so that we have:

(6 - 6.8), (9 - 6.8), (8 - 6.8), (9 - 6.8), (2 - 6.8) = -0.8, 2.2, 1.2, 2.2, -4.8

(3) Get the square of each difference in step #2 so that we have:

(-0.8)*(-0.8), (2.2)*(2.2), (1.2)*(1.2), (2.2)*(2.2), (-4.8)*(-4.8) = 0.64, 4.84, 1.44, 4.84, 23.04

(4) Get the sum of all the squares in step #3 so that we have:

sum of squares = 0.64 + 4.84 + 1.44 + 4.84 + 23.04 = 34.8

(5) Now, for the sample variance, we divide the sum in step #4 by the sample size minus 1:

Variance = 34.8/(5-1) = 34.8/4 = 8.7
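The five steps above can be sketched in Python like this. The variable names are illustrative, and the population variance line is included only to show where the two methods differ (dividing by n instead of n - 1).

```python
import statistics

ratings = [6, 9, 8, 9, 2]
n = len(ratings)

mean = sum(ratings) / n                          # step 1: the mean
differences = [x - mean for x in ratings]        # step 2: each number minus the mean
squares = [d * d for d in differences]           # step 3: square each difference
sum_of_squares = sum(squares)                    # step 4: add up the squares
sample_variance = sum_of_squares / (n - 1)       # step 5: divide by sample size minus 1

# For comparison only: a population variance divides by n instead.
population_variance = sum_of_squares / n

print(round(sum_of_squares, 2))        # 34.8
print(round(sample_variance, 2))       # 8.7
print(round(population_variance, 2))   # 6.96

# The standard library agrees with the step-by-step result.
print(round(statistics.variance(ratings), 2))    # 8.7
```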

## Calculating the Standard Deviation

The *standard deviation*, like variance, is a measure of the variation of the sample data. The larger the standard deviation, the more spread out the answers are around the mean. Standard deviation is more popular as a measure than variance, in part because it is in the same units as the data.

The method for calculating the standard deviation depends on whether we are calculating the standard deviation of a population (everyone) or of a sample (some but not all). The method is the same as for variance with one additional step.

Here are the steps:

(1) Figure out the mean. This is the sum of the numbers given divided by the sample size (i.e. the average).

(6 + 9 + 8 + 9 + 2)/5 = 34/5 = 6.8

(2) Figure out the difference between each number and the mean so that we have:

(6 - 6.8), (9 - 6.8), (8 - 6.8), (9 - 6.8), (2 - 6.8) = -0.8, 2.2, 1.2, 2.2, -4.8

(3) Get the square of each difference in step #2 so that we have:

(-0.8)*(-0.8), (2.2)*(2.2), (1.2)*(1.2), (2.2)*(2.2), (-4.8)*(-4.8) = 0.64, 4.84, 1.44, 4.84, 23.04

(4) Get the sum of all the squares in step #3 so that we have:

sum of squares = 0.64 + 4.84 + 1.44 + 4.84 + 23.04 = 34.8

(5) We divide the sum in step #4 by the sample size minus 1 to get the sample variance:

34.8/(5-1) = 34.8/4 = 8.7

(6) Last, we take the square root of the value in step #5.

Standard Deviation = sqrt(8.7) = roughly 2.95
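As a short sketch, the extra square-root step looks like this in Python. `statistics.variance`, `statistics.stdev`, and `statistics.pstdev` are standard library functions; the first two use the sample method (divide by n - 1), while `pstdev` treats the data as a whole population.

```python
import math
import statistics

ratings = [6, 9, 8, 9, 2]

sample_variance = statistics.variance(ratings)   # steps 1-5: sum of squares / (n - 1)
sample_std = math.sqrt(sample_variance)          # step 6: take the square root

print(round(sample_std, 2))                      # 2.95
print(round(statistics.stdev(ratings), 2))       # 2.95, same thing in one call

# For comparison only: the population standard deviation divides by n.
print(round(statistics.pstdev(ratings), 2))      # 2.64
```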

## Interpreting Standard Deviation

A smaller standard deviation means that there is more agreement between the numbers (less variation), and a larger standard deviation means that there is less agreement (more variation).

If the observations are random and follow a bell curve (a normal distribution), then we can use the standard deviation to make the following observations:

- about 68% of the numbers lie within one standard deviation of the mean
- about 95% of the numbers lie within two standard deviations of the mean

Now, movie ratings are, in theory, not random since they are based on the quality of a movie. Additionally, we know that 100% of them are between 1 and 10 and that they are most likely whole numbers.

But what would it say about another movie if the mean were 5, the standard deviation were 1, and we assumed that the ratings form a bell curve?

With this information, we can expect:

- about 68% of all people will rate the movie between 4 and 6, since 4 = 5 - 1 and 6 = 5 + 1
- about 95% of all people will rate the movie between 3 and 7, since 3 = 5 - 2*1 and 7 = 5 + 2*1
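For readers wondering where 68% and 95% come from, here is a rough Python sketch. It assumes the ratings follow an exact bell curve (normal distribution), which is only an approximation for whole-number ratings from 1 to 10. `NormalDist` comes from the standard library (Python 3.8 or later); the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()      # bell curve with mean 0 and std dev 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

print(round(share_within(1), 4))   # 0.6827, i.e. about 68%
print(round(share_within(2), 4))   # 0.9545, i.e. about 95%

# Ranges for the example movie: mean 5, standard deviation 1.
mean, std_dev = 5, 1
for k in (1, 2):
    low, high = mean - k * std_dev, mean + k * std_dev
    print(f"about {share_within(k):.0%} of ratings between {low} and {high}")
```

The percentages do not depend on the particular mean or standard deviation, which is why the same two figures can be reused for any movie whose ratings are roughly bell-shaped.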

## Comments

**Guntur** on January 14, 2015:

Why are you using the sum function to find the mean.. When there is just a function specifically for finding the mean or the average? Waste of time.. I just want to know how to do standard..

**Cristobal** on January 13, 2015:

Hi Arizona, First, many thanks for checking out my video; I hope you found it useful. The Mean Deviation is not a commonly used statistic (at least I don't teach it in class). However, it is easy to do in Excel: use the AVEDEV function (Formulas -> More Functions -> Statistical). All you need to do is select the data range and Excel will calculate the Mean Dev for you. Try it out with these data: 92, 97, 95, 90, 98. The Mean Dev should work out as 2.72. Hope this helps, Dr E.

**Marylada** on January 11, 2015:

Grade A stuff. I'm unquestionably in your debt.

**ou** on June 03, 2014:

good

**Thien** on April 23, 2013:

Thanks for the last part.

**VIJAYAKUMAR** on December 19, 2012:

Rest of "Interpreting Standard Deviation step, i have clear everything. Please let me have bit explanation of last point. How we get the 68% & 95%.

**ann** on November 16, 2012:

my is not a question

**Debbie** on April 15, 2012:

It helped me well, i nw undrstnd it well, thnk u a lot

**zohaib noor** on March 30, 2012:

thank you very much in order to share that hub page

**Elijah ibrahim** on March 21, 2012:

U people ar doing great pls kep it up

**girlly** on February 28, 2012:

OMG, that was awesome. It saved me a lot of time. I couldn't understand the way my prof did it but I understand yours and i got the same result as her. Thank you sooooo very much

**your face** on November 19, 2011:

how do you graph a statistic

**Veronica Clark** on June 21, 2011:

Hi there!

Can you please explain Z-scores and provide an example on how to calculate them? Thank you!

**Horlah** from Oyo, Oyo, Nigeria on May 23, 2011:

Good one there. How do you explain the aspect of correlation and regression? Please help.

**frogpaul77** on May 10, 2011:

need help with variance and stanard diviation for the sample of numbers 12,4,16,14,10?

**Ruguru** on March 03, 2011:

Biostatistics jug my head that was a better way i have understood it now my lecturer made my life a hell. I knew can count on you nice job

**stuck** on November 23, 2010:

big up. Keep up the good work

**Stan** on November 11, 2010:

Excellent! Up.

http://transmissiondesignhub.blogspot.com/2010/11/...

**Smurf** on August 31, 2010:

Excellent way of explaining how to understand the "end results" of stdev!!! Awsome!!!

**radhhhhhhh** on August 19, 2010:

really good to see. its very useful for people who are learning stats fundamentals. keep it up guys

**Manna in the wild** from Australia on May 12, 2010:

Math hubs will never go out of date !

**pinkhawk** from Pearl of the Orient on April 28, 2010:

...We are using softwares now but I still need to go back with the basics, it is a great help.... thank you for sharing this hub! ^.^

**myClone** from The Land of Confusion on April 24, 2010:

Nicely explained! I am wondering though--what is a simple way of calculating the Total Sum of Squares (SST)? Also, it would be awesome to have a detailed tutorial on how to analyze an ANOVA table!

**Gen R** on March 18, 2010:

Thank you!!!

**Helna** on February 22, 2010:

nice work

**beta** on November 13, 2009:

Great work, but I wish I get it, what is the shortest way?

**Manna in the wild** from Australia on November 01, 2009:

You explained this well.

**anothermathgeek (author)** from East Bay, California on October 02, 2009:

Hi D.S.,

1) Figure out the mean

2) Figure out the standard deviation

3) Count the number of observations that lie between (mean - std dev) and (mean + std dev)

**D.S.** on October 02, 2009:

How do you calculate the number of observations that lie within one standard deviation of the mean, given a list of 44 observations/numbers?

**Moon Daisy** from London on September 07, 2009:

Nice lesson, and very clearly explained! (Btw I'm a bit of a maths geek too!)

**Anonymous1** on September 06, 2009:

Nice mate, thanks for this

**Christenstock** from Mililani, HI & Rye, NY on July 09, 2009:

Supoerb Hub. Thanks MathGeek!