Variance formula

The variance formula is used to compute the variance of a given set of data. Variance is a measure of variability that indicates how far the values in a set are spread out from the mean of the set. A high variance indicates that the values tend to lie farther from the mean, while a low variance indicates that they do not deviate much from the mean.

There are two commonly used forms of the variance formula: one for a population and one for a sample. The type of data available determines which formula to use.

Sample vs. population

In the context of statistics, a population is an entire group of objects or observations. A statistical population does not have to be some group of people; it can consist of heights, weights, test scores, temperatures, and so on.

While a population represents an entire group of objects or observations, a sample is any smaller collection of said objects or observations taken from a population. Sampling is often used in statistical experiments because in many cases, it may not be practical or even possible to collect data for an entire population. For example, it may not be practical to collect weight data for all the students attending a large university. However, data can be collected from a sample of the students, and statistical measures (including variance) can be used to make inferences about the rest of the population based on the sample.
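The relationship between a population and a sample can be sketched in Python. This is only an illustration under stated assumptions: the population below is made-up data (20,000 simulated student weights in kilograms), and the names `population` and `sample` are hypothetical. It shows that a statistic computed from a small sample tends to land near the corresponding population value.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated weights (kg) of 20,000 students.
population = [random.gauss(70, 12) for _ in range(20_000)]

# A sample is a smaller collection drawn from the population.
sample = random.sample(population, 50)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"population mean: {population_mean:.1f} kg")
print(f"sample mean:     {sample_mean:.1f} kg")
```

In practice only the sample would be available; the population is generated here solely so the two values can be compared.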

Population variance

The formula for calculating population variance is

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

where $x_i$ is the $i$th element in the set, $\mu$ is the population mean, and $N$ is the population size.
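As a minimal sketch of the definition above, the following Python function computes population variance directly: find the mean, sum the squared deviations from it, and divide by the number of values. The function name `population_variance` and the data values are illustrative assumptions; the result is checked against `statistics.pvariance` from the Python standard library.

```python
import math
import statistics


def population_variance(data):
    """Population variance: mean of the squared deviations from the mean."""
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one value")
    mu = sum(data) / n                       # population mean
    ss = sum((x - mu) ** 2 for x in data)    # sum of squared deviations
    return ss / n                            # divide by the population size N


values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data; the mean is 5
result = population_variance(values)
print(result)  # 4.0
assert math.isclose(result, statistics.pvariance(values))
```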

Variance is often computed with a calculator or software, but when it must be computed by hand, the following form of the population variance formula simplifies the calculation:

$$\sigma^2 = \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2$$

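The above formula is obtained by expanding the standard population variance formula and simplifying it using algebra:

$$\begin{aligned}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 \\
&= \frac{1}{N}\sum_{i=1}^{N} \left(x_i^2 - 2\mu x_i + \mu^2\right) \\
&= \frac{1}{N}\sum_{i=1}^{N} x_i^2 - 2\mu \cdot \frac{1}{N}\sum_{i=1}^{N} x_i + \frac{1}{N} \cdot N\mu^2 \\
&= \frac{1}{N}\sum_{i=1}^{N} x_i^2 - 2\mu^2 + \mu^2 \\
&= \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2
\end{aligned}$$

The fourth line uses the fact that $\frac{1}{N}\sum_{i=1}^{N} x_i = \mu$.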
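The simplified population formula, the mean of the squared values minus the squared mean, can be checked numerically against the standard definition. The sketch below is illustrative only; both function names and the data are assumptions made for the example.

```python
import math


def population_variance_standard(data):
    """Standard form: sum of (x_i - mu)^2, divided by N."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


def population_variance_simplified(data):
    """Simplified form: (sum of x_i^2) / N, minus mu^2."""
    n = len(data)
    mu = sum(data) / n
    return sum(x * x for x in data) / n - mu ** 2


values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data; the mean is 5
print(population_variance_standard(values))    # 4.0
print(population_variance_simplified(values))  # 4.0
assert math.isclose(population_variance_standard(values),
                    population_variance_simplified(values))
```

The simplified form is convenient by hand, but in floating-point arithmetic it subtracts two numbers that can be nearly equal, so it may lose precision when the mean is large relative to the spread of the data.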

Sample variance

The formula for calculating sample variance is

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

where $x_i$ is the $i$th element in the set, $\bar{x}$ is the sample mean, and $n$ is the sample size.
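The sample formula differs from the population formula only in its divisor, $n - 1$ instead of $N$. A minimal Python sketch follows; the function name `sample_variance` and the data are assumptions for illustration, and the result is compared with `statistics.variance` from the standard library, which also divides by $n - 1$.

```python
import math
import statistics


def sample_variance(data):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(data) / n                       # sample mean
    ss = sum((x - x_bar) ** 2 for x in data)    # sum of squared deviations
    return ss / (n - 1)                         # divide by n - 1, not n


sample = [4, 7, 9, 10, 15]  # made-up sample; the mean is 9
result = sample_variance(sample)
print(result)  # 16.5
assert math.isclose(result, statistics.variance(sample))
```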

Like the population variance formula, the sample variance formula can be simplified to make computations by hand more manageable. The simplified formula is:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}$$

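The formula is obtained by expanding the standard sample variance formula, then simplifying it using algebra:

$$\begin{aligned}
s^2 &= \frac{1}{n - 1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \\
&= \frac{1}{n - 1}\sum_{i=1}^{n} \left(x_i^2 - 2\bar{x} x_i + \bar{x}^2\right) \\
&= \frac{1}{n - 1}\left(\sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2\right) \\
&= \frac{1}{n - 1}\left(\sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2\right) \\
&= \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}
\end{aligned}$$

The fourth line uses the fact that $\sum_{i=1}^{n} x_i = n\bar{x}$.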
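As a quick numerical check of the simplified sample formula (sum of squared values, minus $n$ times the squared sample mean, all divided by $n - 1$), the sketch below compares it with `statistics.variance` from the Python standard library. The function name `sample_variance_simplified` and the data are assumptions for illustration.

```python
import math
import statistics


def sample_variance_simplified(data):
    """Simplified form: (sum of x_i^2 - n * x_bar^2) / (n - 1)."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(data) / n
    return (sum(x * x for x in data) - n * x_bar ** 2) / (n - 1)


sample = [4, 7, 9, 10, 15]  # made-up sample; the mean is 9
result = sample_variance_simplified(sample)
print(result)  # 16.5
assert math.isclose(result, statistics.variance(sample))
```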

Example

John has 5 jars of marbles. Find the population variance given that the jars contain the following number of marbles:

275, 252, 246, 230, 222

Use both the standard and simplified formulas for population variance.

Find the population mean:

$$\mu = \frac{275 + 252 + 246 + 230 + 222}{5} = \frac{1225}{5} = 245$$

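Find the sum of squares (SS):

$$\begin{aligned}
SS &= \sum_{i=1}^{N} (x_i - \mu)^2 \\
&= (275 - 245)^2 + (252 - 245)^2 + (246 - 245)^2 + (230 - 245)^2 + (222 - 245)^2 \\
&= 900 + 49 + 1 + 225 + 529 \\
&= 1704
\end{aligned}$$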

Compute the variance:

$$\sigma^2 = \frac{SS}{N} = \frac{1704}{5} = 340.8$$

Thus, the population variance is 340.8, or approximately 341, marbles².

Using the simplified formula, first calculate the summation:

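$$\sum_{i=1}^{N} x_i^2 = 275^2 + 252^2 + 246^2 + 230^2 + 222^2 = 75625 + 63504 + 60516 + 52900 + 49284 = 301829$$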

The mean, as calculated above, is 245. Plug these into the simplified formula:

$$\sigma^2 = \frac{301829}{5} - 245^2 = 60365.8 - 60025 = 340.8$$

Thus, the population variance is 340.8, or approximately 341, marbles². This is the same result as above, but it requires significantly less computation, since the mean does not need to be subtracted from each value before squaring.
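The marble example can be reproduced in a few lines of Python as a cross-check of the hand calculation. This is a sketch, with variable names chosen only for illustration; it follows the same steps as the worked example, first with the standard formula and then with the simplified one.

```python
import math

marbles = [275, 252, 246, 230, 222]
n = len(marbles)

# Standard formula: mean, then sum of squared deviations, then divide by N.
mu = sum(marbles) / n
ss = sum((x - mu) ** 2 for x in marbles)
variance_standard = ss / n
print(mu)                 # 245.0
print(ss)                 # 1704.0
print(variance_standard)  # 340.8

# Simplified formula: sum of squares divided by N, minus the squared mean.
sum_of_squares = sum(x * x for x in marbles)
variance_simplified = sum_of_squares / n - mu ** 2
print(sum_of_squares)                 # 301829
print(round(variance_simplified, 1))  # 340.8

assert math.isclose(variance_standard, variance_simplified)
```

The simplified result is rounded before printing because floating-point subtraction can leave a tiny rounding error in the last digits.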