Sampling
Topic Eight

### Types of Samples

Although there are different methods that might be used to create a sample, they generally can be grouped into one of two categories: probability samples or non-probability samples.

### Probability Samples

The idea behind this type is random selection. More specifically, each element of the population of interest has a known probability of selection under a given sampling scheme. There are four categories of probability samples described below.

1. Simple Random Sampling

The most widely known type of random sample is the simple random sample. It is characterized by the fact that the probability of selection is the same for every case in the population. Simple random sampling is a method of selecting n units from a population of size N such that every possible sample of size n has an equal chance of being drawn. Imagine you want to carry out a survey of 100 voters in a small town with a population of 1,000 eligible voters. We could write the name of each voter on a separate piece of paper, put all the pieces of paper into a box and draw 100 of them at random. These 100 form our sample, and every name in the box had the same probability of being chosen.
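The box-of-names procedure can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the voter labels, the function name `simple_random_sample` and the fixed seed are assumptions made for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # seed is only for reproducibility
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical list standing in for the 1,000 eligible voters.
voters = [f"voter_{i:04d}" for i in range(1, 1001)]

sample = simple_random_sample(voters, 100, seed=42)

# Each voter's chance of ending up in the sample is n / N.
inclusion_probability = len(sample) / len(voters)
print(len(sample), inclusion_probability)
```

Drawing without replacement mirrors the paper slips: once a name leaves the box it cannot be drawn again.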

2. Stratified Random Sampling

In this form of sampling, the population is first divided into two or more mutually exclusive segments based on categories of a variable of interest in the research. It is designed to organize the population into homogeneous subsets before sampling, and then to draw a random sample within each subset. With stratified random sampling, the population of N units is divided into L subpopulations of N_1, N_2, ..., N_L units respectively. These subpopulations, called strata, are non-overlapping and together they comprise the whole of the population.
When the strata have been determined, a sample is drawn from each, with a separate draw for each stratum. The primary benefit of this method is to ensure that cases from smaller strata of the population are included in sufficient numbers to allow comparison. For example, you may be interested in how job satisfaction varies by ethnicity among a group of employees at a firm. To explore this issue, we need to create a sample of the employees of the firm. However, the employee population at this particular firm is predominantly Trinidadian, as the following chart illustrates:

If we were to take a simple random sample of employees, we might end up with very small numbers of Jamaicans, Barbadians and Puerto Ricans. That would not be good for research, since we might end up with too few cases for comparison in one or more of the smaller groups. Instead of taking a simple random sample from the population, a stratified sampling method can be used to ensure that the number of elements drawn from each ethnic group is proportional to that group's share of the population as a whole. For example, if we want a sample of 1,000 employees, we would stratify the employees by ethnicity (Trinidadian, Jamaican, Barbadian and Puerto Rican), then randomly draw 750 employees from the Trinidadian group, 100 from the Jamaican group, 90 from the Barbadian group and 60 from the Puerto Rican group. This yields a sample that is proportionately representative of the firm as a whole.
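The proportional draw can be sketched as follows. The text does not give the firm's headcount, so the example assumes a hypothetical firm of 10,000 employees split 75% / 10% / 9% / 6%, which reproduces the 750 / 100 / 90 / 60 allocation; the function names and employee labels are likewise illustrative.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split n sample slots across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # Integer quotient and remainder of n * N_h / N for each stratum h.
    parts = {s: divmod(n * size, total) for s, size in stratum_sizes.items()}
    alloc = {s: q for s, (q, _) in parts.items()}
    # Hand out any slots lost to rounding, largest remainder first.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(parts, key=lambda s: parts[s][1], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

def stratified_sample(strata, n, seed=None):
    """Draw a separate simple random sample within each stratum."""
    rng = random.Random(seed)
    sizes = {s: len(units) for s, units in strata.items()}
    alloc = proportional_allocation(sizes, n)
    return {s: rng.sample(units, alloc[s]) for s, units in strata.items()}

# Assumed headcounts (not from the text): 10,000 employees in total.
headcounts = {"Trinidadian": 7500, "Jamaican": 1000,
              "Barbadian": 900, "Puerto Rican": 600}
firm = {group: [f"{group}_{i}" for i in range(size)]
        for group, size in headcounts.items()}

sample = stratified_sample(firm, 1000, seed=1)
sizes = {group: len(units) for group, units in sample.items()}
print(sizes)
```

The key design point is the separate draw per stratum: the group sizes in the sample are fixed in advance rather than left to chance.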

3. Systematic Sampling

This method of sampling is at first glance very different from simple random sampling. In practice, it is a variant of simple random sampling that works from a list of elements: every k-th element of the list (for example, every tenth) is drawn for inclusion in the sample. Say you have a list of 10,000 people and you want a sample of 1,000.

Creating such a sample includes three steps:

1. Divide the number of cases in the population by the desired sample size. In this example, dividing 10,000 by 1,000 gives a value of 10.
2. Select a random number between one and the value obtained in Step 1. In this example, we choose a number between 1 and 10; say we pick 7.
3. Starting with the case number chosen in Step 2, take every tenth record (7, 17, 27, etc.).

More generally, suppose that the N units in the population are numbered 1 to N in some order (e.g., alphabetical). To select a sample of n units, we take a unit at random from the first k units, where k = N/n, and take every k-th unit thereafter.
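The three steps translate directly into code. The sketch below assumes, for simplicity, that N is an exact multiple of n; the function name `systematic_sample` and its optional `start` argument are illustrative choices, not part of the method as described.

```python
import random

def systematic_sample(units, n, start=None, seed=None):
    """Take every k-th unit after a random start among the first k units."""
    N = len(units)
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = N // n                                # Step 1: sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # Step 2: random start in 1..k
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    # Step 3: positions start, start + k, start + 2k, ... (1-based)
    return [units[i - 1] for i in range(start, N + 1, k)]

people = list(range(1, 10001))                # cases numbered 1..10,000

fixed = systematic_sample(people, 1000, start=7)
print(fixed[:3])                              # [7, 17, 27]

randomised = systematic_sample(people, 1000, seed=3)
print(len(randomised))
```

Only one random number is needed for the whole sample, which is why the method is easy to carry out in the field.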

The advantages of systematic sampling method over simple random sampling include:

1. It is easier to draw a sample and often easier to execute without mistakes. This is a particular advantage when the drawing is done in the field.
2. Intuitively, systematic sampling seems likely to be more precise than simple random sampling. In effect it stratifies the population into n strata, consisting of the first k units, the second k units, and so on. Thus, we might expect the systematic sample to be about as precise as a stratified random sample with one unit per stratum. The difference is that with the systematic sample the units occur at the same relative position in each stratum, whereas with the stratified random sample the position in the stratum is determined separately by randomization within each stratum.

4. Cluster Sampling

In some instances the sampling unit consists of a group or cluster of smaller units that we call elements or subunits (these are the units of analysis for your study). There are two main reasons for the widespread application of cluster sampling. First, although the initial intention may be to use the elements as sampling units, it is found in many surveys that no reliable list of elements in the population is available and that it would be prohibitively expensive to construct such a list. In many countries there are no complete and up-to-date lists of the people, the houses or the farms in any large geographical region.

Second, even when a list of individual houses is available, economic considerations may point to the choice of a larger cluster unit. For a given size of sample, a small unit usually gives more precise results than a large unit. For example, a simple random sample (SRS) of 600 houses covers a town more evenly than 20 city blocks containing an average of 30 houses apiece. But greater field costs are incurred in locating 600 houses and in traveling between them than in covering 20 city blocks. When cost is balanced against precision, the larger unit may prove superior.
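The city-block example can be sketched as a one-stage cluster sample: blocks are drawn at random and every house in a chosen block is surveyed. The town below is hypothetical (200 blocks of exactly 30 houses each, whereas the text only says 30 on average), and the function name is an assumption made for illustration.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Pick m clusters at random and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)      # the cluster is the sampling unit
    elements = [unit for c in chosen for unit in clusters[c]]
    return chosen, elements

# Hypothetical town: block id -> list of (block, house) pairs.
town = {block: [(block, house) for house in range(30)]
        for block in range(200)}

blocks, houses = cluster_sample(town, 20, seed=5)

# 600 houses are surveyed, but they sit in only 20 locations.
print(len(blocks), len(houses))
```

Note that only a list of blocks is needed to draw the sample; houses have to be listed only within the 20 selected blocks.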

### Non-probability Sampling

Social research is often conducted in situations where a researcher cannot select the kinds of probability samples used in large-scale social surveys. For example, say you wanted to study homelessness: there is no list of homeless individuals, nor are you likely to be able to create one. However, you need to get some kind of sample of respondents in order to conduct your research. To gather such a sample, you would likely use some form of non-probability sampling.

To reiterate, the primary difference between probability methods of sampling and non-probability methods is that in the latter you do not know the likelihood that any element of a population will be selected for study.

Two primary types of non-probability sampling methods are described below:

1. Availability Sampling

Availability sampling is a method of choosing subjects who are available or easy to find. The primary advantage of the method is that it is very easy to conduct, relative to other methods. A researcher can merely stand on a favorite street corner and distribute surveys. One place this method is popular is in university courses. For example, all students taking an introductory sociology course might be given a survey and required to fill it out.

The primary problem with availability sampling is that you can never be certain what population the participants in the study represent. The population is unknown, the method for selecting cases is haphazard, and the cases studied probably do not represent any population you could come up with.

2. Quota Sampling

Quota sampling is designed to overcome the most obvious flaw of availability sampling. Rather than taking just anyone, you set quotas to ensure that the sample you obtain represents certain characteristics in proportion to their prevalence in the population. Note that for this method, you have to know something about the characteristics of the population ahead of time. Say you want to make sure you have a sample proportional to the population in terms of gender: you have to know what percentage of the population is male and what percentage is female, then keep collecting cases until your sample matches those percentages.

The primary problem with this form of sampling is that even when we know that a quota sample is representative of the particular characteristics for which quotas have been set, we have no way of knowing whether the sample is representative in terms of any other characteristics. If we set quotas for gender and age, we are likely to attain a sample with good representativeness on age and gender, but one that may not be very representative in terms of income, education or other factors.
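The quota procedure can be sketched as a filter over whoever happens to be available. Everything concrete below is assumed for illustration: the 52% / 48% gender split, the sample size of 100, and a stream of passers-by that is deliberately skewed two-to-one toward men.

```python
def quota_sample(respondents, quotas, key):
    """Accept available respondents until every category quota is filled."""
    counts = {category: 0 for category in quotas}
    sample = []
    for person in respondents:
        category = key(person)
        # Take the person only if their category still has room.
        if category in counts and counts[category] < quotas[category]:
            sample.append(person)
            counts[category] += 1
        if counts == quotas:          # all quotas met: stop collecting
            break
    return sample, counts

# Assumed population shares: 52% female, 48% male, for a sample of 100.
quotas = {"female": 52, "male": 48}

# Hypothetical availability stream: every third passer-by is female.
stream = [{"id": i, "gender": "female" if i % 3 == 0 else "male"}
          for i in range(1000)]

sample, counts = quota_sample(stream, quotas, key=lambda p: p["gender"])
print(counts)
```

The sample matches the gender quotas even though the stream does not, but nothing in the procedure is random: who gets in still depends on who walked past first, which is why other characteristics may be unrepresentative.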