Sample

A sample is a collection of individuals or objects selected from a statistical population, ideally using procedures that ensure the selection is random and that the sample is valid for the intended analysis. These procedures are commonly referred to as sampling methods.

Sampling is important for statistical studies because it is generally difficult or prohibitively expensive to collect data for an entire population, so smaller samples that are intended to represent the population are used instead.

Importance of sampling methods

Inferential statistics is largely based on obtaining samples that are representative of the population being studied. If the sample is properly collected using rigorous sampling methods, it can be used to draw conclusions about the population it represents. On the other hand, if the collected samples are not representative of the population, any inference or generalization made about the population may not be accurate. This is why it is important to use methods that decrease the likelihood of the sample misrepresenting the population, such as random sampling.

Random sampling is a procedure that, by design, seeks to ensure that each member of a given population has an equal chance of being selected. This reduces the chance of bias in the sample that could lead to false conclusions about the population. One example of a random sampling procedure is a lottery in which numbers are drawn at random, and whoever holds the numbers drawn wins a prize.

For instance, a lottery may ask players to choose a set of 6 numbers from 1 to 99. Six numbers are then drawn at random, without replacement, from a container holding all the numbers from 1 to 99. The 6 drawn numbers qualify as a random sample because every number from 1 to 99 has an equal chance of being selected.
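
The lottery draw described above can be sketched in Python. This is a minimal illustration, not a model of any real lottery: the function name draw_lottery, the fixed seed, and the 20,000 repeated draws are assumptions added for the example. It relies on the standard library's random.sample, which selects without replacement.

```python
import random
from collections import Counter


def draw_lottery(rng, pool_max=99, picks=6):
    """Draw `picks` distinct numbers from 1..pool_max, each equally likely."""
    return sorted(rng.sample(range(1, pool_max + 1), picks))


# A fixed seed is used only so that the run is repeatable.
rng = random.Random(42)

# One draw: a simple random sample of 6 numbers from 1-99.
draw = draw_lottery(rng)
print("single draw:", draw)

# Repeat the draw many times and count how often each number appears.
# If every number has an equal chance, the counts should all be close
# to trials * 6 / 99.
trials = 20_000
counts = Counter()
for _ in range(trials):
    counts.update(draw_lottery(rng))

expected = trials * 6 / 99
print(f"expected count per number: {expected:.0f}")
print(f"smallest observed count:   {min(counts.values())}")
print(f"largest observed count:    {max(counts.values())}")
```

The observed counts will not be identical, but they should cluster around the expected value, which is what "equal chance of being selected" means in practice.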

Sample size is another important aspect of sampling. Generally, the larger the sample size, the more closely the sample will tend to reflect the population it is drawn from, although the gains diminish as the sample grows. However, regardless of sample size, there will almost always be some degree of sampling error, since a sample rarely matches the population exactly. Sampling error is the difference between a true population parameter and the corresponding statistic measured in the sample.
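
The relationship between sample size and sampling error can be seen in a small simulation. The sketch below is illustrative only: the population is hypothetical (100,000 values generated from a normal distribution with mean 50 and standard deviation 10), and the helper name mean_abs_sampling_error is an assumption made for this example. It measures sampling error as the absolute difference between the sample mean and the population mean, averaged over repeated random samples.

```python
import random
import statistics

# A fixed seed is used only so that the run is repeatable.
rng = random.Random(0)

# Hypothetical population, generated purely for illustration.
population = [rng.gauss(50, 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)


def mean_abs_sampling_error(n, repeats=200):
    """Average |sample mean - population mean| over repeated random samples of size n."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)  # simple random sample, no replacement
        errors.append(abs(statistics.fmean(sample) - population_mean))
    return statistics.fmean(errors)


errors_by_n = {n: mean_abs_sampling_error(n) for n in (10, 100, 1000)}
for n, err in errors_by_n.items():
    print(f"n = {n:>4}: average sampling error of the mean = {err:.3f}")
```

In a run like this, the average error shrinks as n grows but does not reach zero, and each tenfold increase in sample size buys a smaller absolute improvement than the last.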

There is no simple rule for determining the optimal sample size, as it depends largely on the study. However, important factors include the estimated variability among the observations and the amount of error that is acceptable given the goal of the study.
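
To show how variability and acceptable error enter the calculation, the sketch below uses one common textbook approximation, n = (z * sigma / E)^2 rounded up. It is not a general rule. It assumes that the goal is to estimate a population mean, that the population standard deviation sigma is roughly known, that E is the desired margin of error at a chosen confidence level, and that the sample is a simple random sample from a large population. The function name required_sample_size and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Approximate n needed to estimate a mean to within +/- margin.

    Uses n = (z * sigma / margin)^2, rounded up, where z is the two-sided
    standard normal critical value for the given confidence level.
    """
    if sigma <= 0 or margin <= 0:
        raise ValueError("sigma and margin must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical inputs: estimated standard deviation of 10 units.
print(required_sample_size(sigma=10, margin=2))  # 97
print(required_sample_size(sigma=10, margin=1))  # 385
```

Under these assumptions, halving the acceptable margin of error roughly quadruples the required sample size, and greater variability among the observations raises it further.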