In statistics, a quartile is an element in a data set, ordered from smallest to largest, that divides the set into four parts, or quarters. The four parts are more or less equal in size, but can vary slightly depending on the number of elements in the set. In the set below, 9, 28, and 47 are quartiles:

{3, 5, 9, 10, 21, 28, 32, 38, 47, 58, 66}

In the above set, the quartiles divide the set into four equal parts, so each of the four parts contains 25% of the data. In cases where the data cannot be exactly divided into four equal parts, some parts will have more elements than others, but each part will still roughly comprise 25% of the data. The second quartile, referred to as Q2, is the median of the set. In this case Q2, or 28, is the median.
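As a quick check of the example above, the following minimal sketch splits the set at the three quartile elements and confirms that Q2 matches the median. The variable names are chosen here for illustration; the quartile values are simply the ones named in the text.

```python
import statistics

data = [3, 5, 9, 10, 21, 28, 32, 38, 47, 58, 66]
quartiles = [9, 28, 47]  # Q1, Q2, Q3 as named in the text

# Collect the elements that fall strictly between consecutive quartiles.
bounds = [float("-inf")] + quartiles + [float("inf")]
parts = [[x for x in data if lo < x < hi] for lo, hi in zip(bounds, bounds[1:])]

print(parts)                    # four groups of equal size
print(statistics.median(data))  # the median, which is also Q2
```

With this set, each of the four groups holds two elements, and the three quartile elements sit between them.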

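Each quartile can be referred to in several ways. Q1 is the first or lower quartile (the 25th percentile), Q2 is the second quartile or median (the 50th percentile), and Q3 is the third or upper quartile (the 75th percentile).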

Quartiles can be depicted using a number of different graphical methods, such as a frequency distribution or a box plot.

Frequency distribution

Box plot

How to select quartile values

There is no single, universally agreed-upon method for selecting quartile values, but there are some general guidelines. Before starting the process, the set of data must be arranged from smallest to largest:

{1, 4, 5, 6, 7, 10, 11, 13, 16, 18, 25}

The following formulas can be used to determine the position of each quartile. They are particularly useful when working with larger data sets. Given that the set is ordered from smallest to largest,

Q1 = 0.25(n + 1)

Q2 = 0.50(n + 1)

Q3 = 0.75(n + 1)

where n is the number of elements in the set. Note that the formulas output the position of the quartile, not the actual value of the element. For example, using the data set above:

Q1 = 0.25(11 + 1) = 3

Thus, Q1 is the 3rd element (5) in the set. If a formula yields a position that is not a whole number (as can happen, for example, when the set has an even number of elements), the quartile lies between the two elements whose positions are on either side of that number. For example, if the position of Q1 were 2.3, Q1 would be determined by computing the average of elements 2 and 3 in the set. Using the above set, Q1 would then be (4 + 5)/2 = 4.5.
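The position formulas and the averaging rule described above can be sketched in a few lines of Python. This is only an illustration of the convention used in this section, not a standard definition; the function names `quartile_position` and `quartile_value` are hypothetical.

```python
def quartile_position(k, n):
    """1-based position of quartile k (1, 2 or 3) in an ordered set of n elements."""
    return 0.25 * k * (n + 1)


def quartile_value(sorted_data, k):
    """Value of quartile k, averaging the two neighbors when the position is fractional."""
    n = len(sorted_data)
    if n < 3:
        raise ValueError("this sketch assumes at least 3 elements")
    pos = quartile_position(k, n)
    lower = int(pos)  # 1-based position of the element at or just below pos
    if pos == lower:
        # Whole-number position: the quartile is that element itself.
        return sorted_data[lower - 1]
    # Fractional position: average the elements on either side.
    return (sorted_data[lower - 1] + sorted_data[lower]) / 2


data = [1, 4, 5, 6, 7, 10, 11, 13, 16, 18, 25]
print([quartile_value(data, k) for k in (1, 2, 3)])
```

For the 11-element set above, the positions are 3, 6, and 9, so the sketch returns the 3rd, 6th, and 9th elements. Dropping the last element leaves 10 values, Q1 then falls at position 2.75, and the sketch averages elements 2 and 3.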

Using quartiles to find the interquartile range

The interquartile range (IQR) is the range of values between the first and third quartiles, and is a measure of how spread out the middle 50% of the data is. Roughly 50% of the data is contained within this range.

The interquartile range is found by subtracting Q1 from Q3:

IQR = Q3 - Q1
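As a worked example, the IQR of the ordered set from the previous section can be computed with Python's standard library. This sketch relies on `statistics.quantiles` (available from Python 3.8), whose default "exclusive" method places quartiles at fractions of n + 1, as the position formulas above do. Note that it interpolates linearly between neighbors instead of averaging them, so its results may differ from the averaging rule when a position is not a whole number.

```python
import statistics

data = [1, 4, 5, 6, 7, 10, 11, 13, 16, 18, 25]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(q1, q3, iqr)
```

Here Q1 = 5 and Q3 = 16, so IQR = 16 - 5 = 11.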

Using quartiles to find outliers

Quartiles can be used to find outliers (values in the set that lie significantly outside the expected range of values). Values that lie more than 1.5 times the IQR below Q1 or above Q3 are considered outliers, as shown in the figure below:

Thus, the expected range of values is:

[Q1 - 1.5(IQR), Q3 + 1.5(IQR)]

Anything outside the above range of values is an outlier.
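The rule above can be sketched as follows. The function names `outlier_fences` and `find_outliers` are hypothetical, and the quartiles are passed in directly so that any method of selecting them can be used. The value 40 is an extra point added purely for illustration, with Q1 and Q3 held at the values found for the earlier set.

```python
def outlier_fences(q1, q3):
    """Return the expected range [Q1 - 1.5(IQR), Q3 + 1.5(IQR)] as a tuple."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values that lie outside the expected range."""
    lower, upper = outlier_fences(q1, q3)
    return [x for x in values if x < lower or x > upper]


data = [1, 4, 5, 6, 7, 10, 11, 13, 16, 18, 25]
q1, q3 = 5, 16  # quartiles of the set, from the position formulas

print(outlier_fences(q1, q3))
print(find_outliers(data, q1, q3))         # nothing in the set is an outlier
print(find_outliers(data + [40], q1, q3))  # the added value 40 is flagged
```

With Q1 = 5 and Q3 = 16, the IQR is 11 and the expected range is [-11.5, 32.5], so every element of the original set falls inside it.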