Standard deviation formula

The standard deviation formula is used to compute the standard deviation of a given set of data. Standard deviation is a measure of the average amount that the values in a set deviate from the mean of the set. The higher the standard deviation, the more spread out the values are from the mean, while a lower standard deviation indicates that the values tend to be closer to the mean.

There are two commonly used forms of the standard deviation formula: one for a population and one for a sample. The type of data available determines which formula to use.

Sample vs. population

In the context of statistics, a population is an entire group of objects or observations. A statistical population does not have to be some group of people; it can consist of heights, weights, test scores, temperatures, and so on.

While a population represents an entire group of objects or observations, a sample is any smaller collection of said objects or observations taken from a population. Sampling is often used in statistical experiments because in many cases, it may not be practical or even possible to collect data for an entire population. For example, it may not be practical to collect weight data for all the students attending a large university. However, data can be collected from a sample of the students, and statistical measures (including standard deviation) can be used to make inferences about the rest of the population based on the sample.

Population standard deviation

The formula for computing population standard deviation is

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

where $x_i$ is the $i$th element in the set, $\mu$ is the population mean, and $N$ is the size of the population.

Example

The number of peaches that each of the 10 peach trees in an orchard yields in a given season is shown below. Find the standard deviation.

850, 800, 790, 750, 750, 700, 695, 600, 525, 500

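Because the data covers every tree in the orchard, it is treated as a population. The population mean is:

$$\mu = \frac{850 + 800 + 790 + 750 + 750 + 700 + 695 + 600 + 525 + 500}{10} = \frac{6960}{10} = 696$$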

The sum of squares is:

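$$\begin{aligned}
SS &= \sum_{i=1}^{N} (x_i - \mu)^2 \\
&= (850 - 696)^2 + (800 - 696)^2 + (790 - 696)^2 \\
&\quad + (750 - 696)^2 + (750 - 696)^2 + (700 - 696)^2 \\
&\quad + (695 - 696)^2 + (600 - 696)^2 + (525 - 696)^2 \\
&\quad + (500 - 696)^2 \\
&= 126{,}090
\end{aligned}$$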

The standard deviation is:

$$\sigma = \sqrt{\frac{SS}{N}} = \sqrt{\frac{126{,}090}{10}} = \sqrt{12{,}609} \approx 112.29$$

Thus, the standard deviation is around 112 peaches.
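
The arithmetic in this example can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the variable names (`yields`, `mu`, `ss`, `sigma`) are illustrative choices and not part of the example itself.

```python
import math
import statistics

# Peach yields for the 10 trees, treated as the full population.
yields = [850, 800, 790, 750, 750, 700, 695, 600, 525, 500]

N = len(yields)
mu = sum(yields) / N                      # population mean
ss = sum((x - mu) ** 2 for x in yields)   # sum of squared deviations
sigma = math.sqrt(ss / N)                 # population standard deviation

print(mu, ss, round(sigma, 2))  # 696.0 126090.0 112.29

# Cross-check against the standard library's population formula.
assert math.isclose(sigma, statistics.pstdev(yields))
```

The final assertion compares the hand-computed value with `statistics.pstdev`, which also divides by N.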

Sample standard deviation

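The formula for computing sample standard deviation is

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

where $x_i$ is the $i$th element of the sample, $\bar{x}$ is the sample mean, and $n$ is the sample size. The sample formula closely resembles the population formula, with two changes: the sample mean $\bar{x}$ replaces the population mean $\mu$, and the sum of squares is divided by $n - 1$ rather than $N$. Dividing by $n - 1$ (Bessel's correction) compensates for the fact that deviations measured from the sample mean tend to understate the spread of the population. Otherwise, the process for computing the sample standard deviation is the same as it is for a population.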

In practice, the sample standard deviation is used more often than the population standard deviation, mainly because data for an entire population is often difficult to obtain.
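
To see how the choice of divisor affects the result, the sketch below computes both versions for the peach data from the earlier example. Treating those 10 trees as a sample is an assumption made only for illustration (the example itself treats them as a population), and the variable names are hypothetical.

```python
import math
import statistics

# Same 10 yields as the orchard example, now treated as a sample
# drawn from a larger population (an assumption for illustration).
yields = [850, 800, 790, 750, 750, 700, 695, 600, 525, 500]

n = len(yields)
x_bar = sum(yields) / n
ss = sum((x - x_bar) ** 2 for x in yields)

s = math.sqrt(ss / (n - 1))   # sample standard deviation: divide by n - 1
sigma = math.sqrt(ss / n)     # population standard deviation: divide by n

print(round(s, 2), round(sigma, 2))  # 118.36 112.29

# statistics.stdev uses the n - 1 divisor.
assert math.isclose(s, statistics.stdev(yields))
```

The sample value is slightly larger because the same sum of squares is divided by 9 instead of 10. The gap between the two shrinks as the sample size grows.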

The empirical rule

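Standard deviation has many applications in statistics. The empirical rule is one way in which standard deviation is used to make predictions about a given set of data. The empirical rule (also referred to as the 68-95-99.7 rule) states that for data that follows a normal distribution, almost all observed data will fall within 3 standard deviations (σ) of the mean (μ). More specifically:

  • About 68% of the data falls within 1 standard deviation of the mean (μ ± σ).
  • About 95% of the data falls within 2 standard deviations of the mean (μ ± 2σ).
  • About 99.7% of the data falls within 3 standard deviations of the mean (μ ± 3σ).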
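
The three percentages in the rule's name (68, 95, and 99.7) can be checked numerically. The sketch below is one way to do so, using `statistics.NormalDist` from the Python standard library to evaluate the normal cumulative distribution function; the names `standard_normal` and `shares` are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution lying within k standard deviations of the mean.
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

The same proportions hold for any normal distribution, since rescaling by μ and σ does not change the share within k standard deviations.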

Example

Referencing the example above, where the mean (μ) number of peaches yielded was 696 with a standard deviation (σ) of 112 peaches, predict the distribution of peach tree yield using the empirical rule.

Add and subtract 1, 2, and 3 standard deviations from the mean to determine the ranges of peach yield.

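68% of values lie within 1 standard deviation:

$$\mu \pm \sigma = 696 \pm 112 \quad\Rightarrow\quad 584 \text{ to } 808$$

95% of values lie within 2 standard deviations:

$$\mu \pm 2\sigma = 696 \pm 224 \quad\Rightarrow\quad 472 \text{ to } 920$$

99.7% of values lie within 3 standard deviations:

$$\mu \pm 3\sigma = 696 \pm 336 \quad\Rightarrow\quad 360 \text{ to } 1{,}032$$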

Thus, assuming the yields are approximately normally distributed:

  • About 68% of the peach trees are expected to yield between 584 and 808 peaches in a given season.
  • About 95% of the peach trees are expected to yield between 472 and 920 peaches in a given season.
  • About 99.7% of the peach trees are expected to yield between 360 and 1,032 peaches in a given season.
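
The ranges above can also be generated programmatically. The following sketch uses the rounded values from the example (mean 696, standard deviation 112) and, as an extra check that is not part of the example, counts how many of the 10 observed yields fall inside each range. The names `ranges` and `counts` are illustrative.

```python
mu = 696      # mean yield from the example
sigma = 112   # standard deviation, rounded as in the example

# Observed yields from the orchard example.
yields = [850, 800, 790, 750, 750, 700, 695, 600, 525, 500]

ranges = []
counts = []
for k, pct in ((1, 68), (2, 95), (3, 99.7)):
    low, high = mu - k * sigma, mu + k * sigma
    inside = sum(low <= x <= high for x in yields)  # trees within the range
    ranges.append((low, high))
    counts.append(inside)
    print(f"{pct}%: {low} to {high} ({inside} of {len(yields)} trees observed)")
# 68%: 584 to 808 (7 of 10 trees observed)
# 95%: 472 to 920 (10 of 10 trees observed)
# 99.7%: 360 to 1032 (10 of 10 trees observed)
```

With only 10 trees, the observed shares can only roughly match the rule's percentages, and the rule itself applies only to the extent that the yields follow a normal distribution.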