GLOBAL HEALTH RESEARCH CERTIFICATE
Module 2: Study Design and Sampling
Study Design
Cross-sectional studies are simple in design and aim to find out the prevalence of a phenomenon, problem, attitude or issue by taking a snapshot or cross-section of the population. This gives an overall picture as it stands at the time of the study. For example, a cross-sectional design would be used to assess demographic characteristics or community attitudes. These studies usually involve one contact with the study population and are relatively cheap to undertake.
Pre-test/post-test studies measure the change in a situation, phenomenon, problem or attitude. Such studies are often used to measure the efficacy of a program. These studies can be seen as a variation of the cross-sectional design as they involve two sets of cross-sectional data collection on the same population to determine if a change has occurred.
Retrospective studies investigate a phenomenon or issue that has occurred in the past. Such studies most often involve secondary data collection, based upon data available from previous studies or databases. For example, a retrospective study would be needed to examine the relationship between levels of unemployment and street crime in NYC over the past 100 years.
Prospective studies seek to estimate the likelihood of an event or problem in the future. Thus, these studies attempt to predict what the outcome of an event will be. General science experiments are often classified as prospective studies because the experimenter must wait until the experiment runs its course in order to examine the effects. Randomized controlled trials are always prospective studies and often involve following a “cohort” of individuals to determine the relationship between various variables.
Longitudinal studies follow study subjects over a long period of time with repeated data collection throughout. Some longitudinal studies last several months, while others can last decades. Most are observational studies that seek to identify a correlation among various factors. Thus, longitudinal studies do not manipulate variables and are not often able to detect causal relationships.
Sample
Once the researcher has chosen a hypothesis to test in a study, the next step is to select a pool of participants for that study. However, any research project must be able to extend the implications of its findings beyond the individuals who actually took part. For obvious reasons, it is nearly impossible for a researcher to study every person in the population of interest. In the example that we have been using thus far, the population of interest is “the developing world.” The researcher must therefore make a decision to limit the research to a subset of that population, and this has important implications for the applicability of study results. The researcher must put some careful forethought into exactly how and why a certain group of individuals will be studied.(1)
Sampling Methods
Probability Sampling refers to sampling in which the chance of any given individual being selected is known and individuals are sampled independently of each other. This is also known as random sampling. A researcher can simply use a random number generator to choose participants (known as simple random sampling), or can include every nth individual (known as systematic sampling). Researchers may also break their target population into strata and then apply these techniques within each stratum, to ensure that they get enough participants from each stratum to be able to draw conclusions. For example, if there are several ethnic communities in one geographical area that a researcher wishes to study, that researcher might aim to have 30 participants from each group, selected randomly from within the groups, in order to have a good representation of all the relevant groups.
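The three techniques described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 300 numbered individuals, the three communities "A", "B" and "C", and the function names are all hypothetical and do not come from any particular study.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k individuals so that each one has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, step, seed=None):
    """Pick a random starting point, then take every step-th individual."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k_per_stratum, seed=None):
    """Draw a simple random sample of fixed size within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

# Hypothetical sampling frame: 300 people identified by number,
# divided into three communities of unequal size.
frame = list(range(300))
communities = {"A": frame[:150], "B": frame[150:250], "C": frame[250:]}

srs = simple_random_sample(frame, 30, seed=1)
systematic = systematic_sample(frame, 10, seed=1)
stratified = stratified_sample(communities, 30, seed=1)
```

Note that the stratified draw yields 30 participants from every community, including the smallest one, which a simple random sample of the whole frame would not guarantee.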
Non-Probability Sampling refers to sampling in which the chance of any given individual being selected is not known. Its most familiar form, convenience sampling, is when researchers take whatever individuals happen to be easiest to access as participants in a study. This is only done when the processes the researchers are testing are assumed to be so basic and universal that they can be generalized beyond such a narrow sample.(2) Another example is snowball sampling, an approach for locating information-rich key informants.(3) Using this approach, a few potential respondents are contacted and asked whether they know of anybody with the characteristics that the researcher is looking for. Snowball sampling is not a stand-alone tool; it is a way of selecting participants, who are then studied using other tools such as interviews or surveys.
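As a rough illustration, snowball sampling can be pictured as following a chain of referrals outward from a few initial contacts. The sketch below assumes a hypothetical referral network stored as a dictionary; the names and the `snowball_sample` function are invented for the example.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Recruit the seeds first, then the people they refer, wave by wave."""
    recruited = []
    seen = set()
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        if person in seen:
            continue  # already recruited through another referral
        seen.add(person)
        recruited.append(person)
        queue.extend(referrals.get(person, []))
    return recruited

# Hypothetical network: each person lists the people they would refer.
referrals = {
    "Amina": ["Bao", "Carlos"],
    "Bao": ["Dede"],
    "Carlos": ["Dede", "Esi"],
    "Dede": [],
    "Esi": ["Amina"],
    "Farid": [],  # nobody refers Farid
}

sample = snowball_sample(referrals, seeds=["Amina"], max_size=4)
everyone_reachable = snowball_sample(referrals, seeds=["Amina"], max_size=10)
```

Because nobody in the network refers "Farid", he can never enter the sample however large it grows, which is one reason the selection chances in this kind of sampling are unknown.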
Sampling Challenges
Because researchers can seldom study the entire population, they must choose a subset of the population, which can result in several types of error. Sometimes, there are discrepancies between the sample and the population on a certain parameter that are due to random differences. This is known as sampling error and can occur through no fault of the researcher.
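A small simulation can make sampling error concrete. The sketch below assumes a hypothetical population of 10,000 values (imagine daily incomes in arbitrary units) and draws five simple random samples of 100; all of the numbers are invented for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 values.
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Five independent simple random samples of 100 people each.
sample_means = [statistics.mean(rng.sample(population, 100)) for _ in range(5)]

# Sampling error: the gap between each sample mean and the population mean.
sampling_errors = [m - population_mean for m in sample_means]

for m, e in zip(sample_means, sampling_errors):
    print(f"sample mean = {m:.2f}, sampling error = {e:+.2f}")
```

Every sample is drawn correctly, yet each sample mean differs slightly from the population mean and from the other samples, purely by chance.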
Far more problematic is systematic error, which refers to a difference between the sample and the population that is due to a systematic difference between the two rather than random chance alone. One source is the response rate problem: the sample can become self-selecting, and there may be something about people who choose to participate in the study that affects one of the variables of interest. For example, in our eye care case, we may experience this kind of error if we simply sample those who choose to come to an eye clinic for a free eye exam as our experimental group and those who have poor eyesight but do not seek eye care as our control group. It is very possible in this situation that the people who actively seek help are more proactive than those who do not. Because these two groups vary systematically on an attribute that is neither the independent variable (whether they received corrective lenses) nor the dependent variable (economic productivity), it is very possible that this difference in personality, and not the corrective lenses, produces any effects that the researcher observes on the dependent variable. This would be considered a failure of internal validity.
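The eye care scenario can be sketched as a simulation in which corrective lenses have no effect at all, yet the self-selected groups still differ. Everything here is an assumption made for illustration: the "proactivity" score, the way it drives productivity, and the rule for who attends the clinic.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

clinic_group, non_seekers = [], []
for _ in range(2000):
    proactivity = rng.random()  # hypothetical trait, between 0 and 1
    # Productivity depends on proactivity only; lenses have NO effect here.
    productivity = 10 + 5 * proactivity + rng.gauss(0, 1)
    # Self-selection: more proactive people are more likely to attend.
    if rng.random() < proactivity:
        clinic_group.append(productivity)
    else:
        non_seekers.append(productivity)

# The difference a researcher would wrongly attribute to the lenses.
apparent_effect = statistics.mean(clinic_group) - statistics.mean(non_seekers)
```

The clinic group comes out more productive even though the treatment does nothing in this simulation, because the groups were formed by self-selection rather than by random assignment.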
Another type of systematic error is coverage error, which arises when researchers mistakenly restrict their sampling frame to a subset of the population of interest. This means that the sample they are studying varies systematically from the population to which they wish to generalize their results. For example, a researcher may seek to generalize the results to the “population of developing countries,” yet may introduce a coverage error by sampling only heavily urban areas. This leaves out all of the more rural populations in developing countries, which have very different characteristics from the urban populations on several parameters. Thus, the researcher could not appropriately generalize the results to the broader population and would therefore have to restrict the conclusions to populations in urban areas of developing countries.(4)
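The urban-only example can be worked through with simple arithmetic. The figures below are hypothetical (a population of 1,000 people and the number in each setting who live near an eye clinic) and serve only to show how a restricted sampling frame shifts the estimate.

```python
# Hypothetical population: (setting, number of people, number living near a clinic)
population = [
    ("urban", 400, 320),
    ("rural", 600, 180),
]

def share_with_access(groups):
    """Overall share of people living near a clinic across the given groups."""
    total_people = sum(size for _, size, _ in groups)
    total_with_access = sum(with_access for _, _, with_access in groups)
    return total_with_access / total_people

true_share = share_with_access(population)
urban_only_share = share_with_access([g for g in population if g[0] == "urban"])
coverage_gap = urban_only_share - true_share
```

With these made-up figures, a frame restricted to urban areas reports 80% access when the figure for the whole population is 50%. No amount of careful random sampling within the urban frame would close that gap.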
First and foremost, a researcher must think very carefully about the population that will be included in the study and how to sample that population. Errors in sampling can often be avoided by good planning and careful consideration. To reduce sampling error, a researcher can seek more participants: the more participants a study has, the smaller its sampling error is likely to be. A larger sample does not, however, correct systematic error. In the case of the response rate problem, the researcher can actively work on increasing the response rate, or can try to determine whether there is in fact a difference between those who take part in the study and those who do not. The most important thing for a researcher to remember is to eliminate, as far as possible, any variables that the researcher cannot control. While this is nearly impossible in field research, the closer a researcher comes to isolating the variable of interest, the better the results.(5)
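One common way to quantify the link between sample size and sampling error is the standard error of a sample mean. The formula below is a sketch that assumes simple random sampling with independent observations, an assumption the text does not state; here σ denotes the standard deviation of the variable in the population and n the number of participants.

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}},
\qquad
\frac{\mathrm{SE}_{4n}}{\mathrm{SE}_{n}}
  = \frac{\sigma / \sqrt{4n}}{\sigma / \sqrt{n}}
  = \frac{1}{2}
```

Under these assumptions, quadrupling the number of participants only halves the expected sampling error, so the gains from adding participants shrink as the sample grows.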
Footnotes
(1) Pelham, B. W.; Blanton, H. Conducting Research in Psychology: Measuring the Weight of Smoke, 3rd Edition. Wadsworth Publishing (February 27, 2006).
(2) Trochim, W. M. K. “Probability Sampling.” Research Methods Knowledge Base, 2nd Edition.
(3) Patton, M. Qualitative Evaluation and Research Methods. Sage Publications, Newbury Park, California (1990).
(4) Pelham, B. W.; Blanton, H. Conducting Research in Psychology: Measuring the Weight of Smoke, 3rd Edition. Wadsworth Publishing (February 27, 2006).
(5) Ibid.