Internal validity of an experiment refers to the extent to which the changes in the dependent variable are due to the independent variable.
There can be many scenarios in which the hypothesis of an experiment is supported by the results, but those results were not actually caused by the manipulated independent variable (we sometimes call this a false cause).
Threats to internal validity include extraneous variables that taint the results, problems with the sample of research participants, experimenter bias, confounding variables, and many more!
Internal validity also applies to other research methodologies, not just the experiment. For instance, longitudinal studies that take place over decades face many challenges that can threaten the internal validity of the study.
Examples of Threats to Internal Validity
- Sample attrition: Studies can lose internal validity when research participants pull out part-way through.
- Confounding variables: An unexpected variable that changes along with the independent variable can make the results unclear.
- Experimenter bias: If an experimenter has a desire for a certain result, they can cause that result to occur (or wrongly perceive that outcome to have occurred).
- History effects: Sometimes, events that occur during the course of the study can confound the study’s results.
- Testing effects: Research participants change their behaviors during the test because the test increases their awareness of their behavior.
- Social Desirability: Research participants change their behavior in an attempt to be seen in a positive light by the researchers.
- Selection Bias: If the sample group is not representative of a broader population, then the study may not hold up (see also: external validity).
- Maturation: If the research participants age during the study, the results may change because of natural maturation rather than the effects of an intervention.
- Instrumentation: If the researcher’s instruments change during the study, then the internal validity may come into question. We can also include in this category changes in research assistants.
- Demand Effect: The demand effect happens when research participants try to guess what the study is about and then change their behavior as a result. Often, they’re wrong, but nevertheless, their behavior changes!
- Placebo Effect: Some research participants will report effects because they think they have been given an intervention. Here, the independent variable hasn’t caused the change; rather, the mere perception of it has.
Threats to Internal Validity Explained
1. Attrition of the Sample
In longitudinal research, participants are sometimes studied for months, years, or even decades. For example, medical researchers may investigate the role of lifestyle factors such as nutrition and exercise over a period of 30, 40, or even 50 years.
It can be difficult to keep in contact with a large group of people for such a long period of time.
Sometimes people move or change their phone number and fail to contact the research team; people move out of the country and cannot continue their participation; and sometimes people just lose interest in participating.
All of those reasons mean the sample size gets smaller and smaller. If sample attrition becomes too severe, it can jeopardize the internal validity of the study, especially when the participants who drop out differ systematically from those who remain.
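One simple check is to compare how many participants each group has lost. The sketch below is only an illustration: the function name `attrition_rates`, the group labels, and all of the counts are hypothetical and do not come from any real study.

```python
def attrition_rates(enrolled, completed):
    """Return the share of participants lost in each group.

    Both arguments map a group name to a head count (hypothetical data).
    """
    rates = {}
    for group, n_start in enrolled.items():
        n_end = completed[group]
        rates[group] = (n_start - n_end) / n_start
    return rates


# Hypothetical head counts for a long-running lifestyle study.
enrolled = {"treatment": 200, "control": 200}
completed = {"treatment": 120, "control": 180}

rates = attrition_rates(enrolled, completed)

# Gap between the two groups' dropout rates.
gap = abs(rates["treatment"] - rates["control"])
print(rates, round(gap, 2))
```

A large gap between the groups suggests that the people who stayed in one group may differ from those who stayed in the other, which makes the final comparison harder to interpret.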
2. Confounding Variables
A confound occurs when another variable changes along with the manipulated independent variable (IV).
Although the researcher did not intend for this other variable to change, it does, and this means that the observed changes in the dependent variable (DV) may not have been caused by the IV.
For example, if interested in the effects of caffeine on memory, a researcher may ask half of the research participants to come to the lab in the morning and drink a cup of coffee.
For the sake of convenience, the other half of the participants are asked to come to the lab and drink a cup of water in the afternoon.
One hour after each participant drinks either coffee or water, they are given a memory test.
The results indicate that the participants who drank coffee performed much better on the memory test than those who drank water.
Unfortunately, we don’t know whether the difference in memory scores was due to the coffee or to the time of day at which participants were tested.
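One common way to avoid this kind of confound is to cross the drink with the session time, so that each drink is tested equally often in the morning and in the afternoon. The following is a minimal sketch under stated assumptions: the function name `assign_balanced`, the participant IDs, and the sample size of 40 are all made up for illustration.

```python
import random


def assign_balanced(participant_ids, seed=0):
    """Randomly assign participants to the four drink x session cells.

    Crossing drink with session time means each drink is tested equally
    often in the morning and in the afternoon.
    """
    cells = [(drink, session)
             for drink in ("coffee", "water")
             for session in ("morning", "afternoon")]
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled participants into the four cells in rotation.
    return {pid: cells[i % len(cells)] for i, pid in enumerate(shuffled)}


# Hypothetical study with 40 participants numbered 1 to 40.
assignment = assign_balanced(range(1, 41))
```

With this design, any morning-versus-afternoon difference affects the coffee and water groups equally, so it can no longer masquerade as an effect of caffeine.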
3. Experimenter Bias
The experimenter is the person responsible for interacting with the research participants and guiding them through the experimental procedures.
If the experimenter is aware of the hypothesis of the study, they may inadvertently act differently toward the participants in the treatment group.
When the results are analyzed, it will be impossible to know if the observed changes in the DV are because of the treatment or the actions of the experimenter.
For this reason, the experimenter is usually kept blind to the hypothesis of the study, or to which participants are assigned to the treatment and control groups.
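Blinding to group assignment can be sketched in a few lines of code. In this hypothetical example, the experimenter only ever sees the neutral codes "A" and "B", while the key that says which code means "treatment" is held by someone who never meets the participants. The function name `blind_labels` and the 20 participant IDs are assumptions made for illustration.

```python
import random


def blind_labels(participant_ids, seed=0):
    """Split participants into two groups known only by neutral codes.

    Returns (schedule, key). The schedule is what the experimenter sees;
    the key is kept by a colleague who has no contact with participants.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    codes = ["A", "B"]
    rng.shuffle(codes)  # which code means "treatment" is itself random
    key = {codes[0]: "treatment", codes[1]: "control"}
    schedule = {pid: codes[0] for pid in ids[:half]}
    schedule.update({pid: codes[1] for pid in ids[half:]})
    return schedule, key


# Hypothetical study with 20 participants numbered 1 to 20.
schedule, key = blind_labels(range(1, 21))
```

Because the schedule contains only the codes, the experimenter cannot treat the two groups differently based on knowing who received the treatment.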
4. History Effects
Sometimes events outside the study, such as a widely covered news story, influence participants while the study is under way and alter the results.
For example, suppose a team of researchers is examining the effects of mindfulness on anxiety. The experiment involves asking the treatment group participants to use a mindfulness app for 20 minutes every day for 6 weeks. The control group participants are asked to just sit quietly for 20 minutes each day for 6 weeks.
However, during week 3 of the study, a local news show airs a special 3-part series on the effectiveness of mindfulness apps. The series includes various professionals who claim the apps are not very effective and that it is always better to receive such training in person from a trained professional.
At the end of the 6-week study, the results show no difference in anxiety between the treatment and control groups. It is later discovered that a high percentage of participants in both groups watched the series. As a result, the researchers cannot tell whether the app was genuinely ineffective or whether the broadcast changed how participants used or regarded it.
5. Testing Effects
Testing effects occur when the experimental procedure requires that the participants take a test, attitudinal survey, or personality inventory at the beginning of the study. That experience then changes the behavior of the participants in some manner.
“Testing becomes a threat to internal validity if the test itself can affect participants’ responses when they are tested again” (Flannelly et al., 2018, p. 5).
For example, if researchers are interested in how screentime affects interpersonal dynamics, they may ask participants to fill out a questionnaire regarding personality characteristics, such as introversion/extraversion, and then ask them to sit in a waiting room that happens to have other people present.
Asking all of those questions about preferring solitary activities versus preferring the company of others activates those orientations and makes them stronger. This alters participants’ behavior.
So, instead of seeing how people react in an interpersonal situation, the participants’ actions are affected by their experience responding to the personality inventory.
6. Social Desirability
This threat to internal validity happens when research participants change their behavior to try to create a favorable impression.
When people know their behavior is being observed or studied, they often adjust their actions so that they come across well.
In a more specific example, maybe the experimenter possesses physical attributes that are considered attractive in that society.
When participants are going through the experimental procedures, they may become overly concerned with looking desirable to the experimenter. So, they act more politely than they would normally, try harder than usual to be friendly, or try to appear very tolerant of cultural differences.
7. Selection Bias
In most cases, researchers want to be able to generalize the results of their study to the overall population. However, since they cannot include every single person in their study, they have to rely on a sample of individuals.
This is where it gets tricky. If the people in the sample are not chosen according to some basic rules, the sample will not be representative of the overall population. One basic rule is that people in the sample should be selected from the population randomly. This is called random selection.
However, in many psychology studies, the participants are university students. Since the characteristics of university students differ from those of the general population (e.g., in age and SES profile), the sample suffers from selection bias.
This threatens both the internal and external validity of a study.
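Random selection, as described above, can be illustrated with Python's standard `random` module. This is a minimal sketch: the population list of 1,000 made-up names and the sample size of 50 are assumptions chosen for illustration.

```python
import random

# Hypothetical population: 1,000 people identified by made-up labels.
population = [f"person_{i}" for i in range(1, 1001)]

rng = random.Random(42)  # fixed seed so the example is repeatable

# Draw 50 people without replacement; every person in the list
# has the same chance of being chosen.
sample = rng.sample(population, k=50)
```

Note that this only helps if the list being sampled from actually covers the population of interest. Randomly sampling from a list of university students still produces a sample of university students.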
8. Maturation
People change over time. This is called maturation. It’s just a normal part of life. With young children, maturation can happen very rapidly. Unfortunately, maturation can pose serious threats to the internal validity of a study.
For example, a school board may be interested in the role of a healthy breakfast in the academic performance of young children. So, they conduct a study that involves providing a nutritious breakfast to a group of first graders at the beginning of the year.
Six months later, the students are given a reading and spelling test. The results show that the overall performance of the students improved dramatically.
Unfortunately, the internal validity of the study is threatened because the reading and spelling skills of first graders naturally improve during that period as a function of maturation.
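A standard remedy is to include a comparison group that does not receive the breakfast, so that natural improvement can be subtracted out. The arithmetic can be sketched as follows. All scores are invented for illustration, the name `gain` is hypothetical, and the calculation assumes that both groups would have matured at the same rate.

```python
def gain(scores):
    """Improvement from the first test to the second."""
    return scores["post"] - scores["pre"]


# Hypothetical average test scores (out of 100) at the start of the
# year and six months later.
breakfast = {"pre": 40, "post": 70}
comparison = {"pre": 41, "post": 65}

# The comparison group's gain estimates improvement due to maturation alone.
maturation_estimate = gain(comparison)

# What remains of the breakfast group's gain after removing maturation.
program_estimate = gain(breakfast) - maturation_estimate

print(maturation_estimate, program_estimate)
```

In this made-up example, most of the 30-point improvement would be attributed to maturation, and only the remaining 6 points to the breakfast program.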
9. Instrumentation
When the instruments that a researcher uses change over time, it can threaten the internal validity of the study.
For example, when battery-powered devices are used to collect data in field research, the sensitivity or precision of those devices may weaken with extended use as battery power declines.
Similarly, instrumentation effects are “…not limited to electronic or mechanical instruments, and applied to any means of measuring the dependent variable, including human researchers or research assistants who observe, judge, rate, and/or otherwise measure a dependent variable” (Flannelly et al., 2018, p. 6).
For example, when observers rate behavior, no matter how well-trained, they often improve over time. They become more accurate with practice, which is perfectly understandable. This means that the instrument used to collect data (i.e., trained observers) has changed from the beginning to the end of the study.
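One way to detect this kind of drift is to have an observer code the same recorded session at the start and at the end of the study and compare both attempts with a reference coding. The sketch below assumes such a reference exists. The function name `agreement` and all of the codes are hypothetical.

```python
def agreement(observer, reference):
    """Share of items on which the observer matches the reference codes."""
    matches = sum(o == r for o, r in zip(observer, reference))
    return matches / len(reference)


# Hypothetical codes for ten moments in one recorded session
# ("on" = on-task behavior, "off" = off-task behavior).
reference = ["on", "off", "on", "on", "off", "on", "off", "off", "on", "on"]
early = ["on", "on", "on", "off", "off", "on", "off", "on", "on", "off"]
late = ["on", "off", "on", "on", "off", "on", "off", "off", "on", "off"]

early_agreement = agreement(early, reference)
late_agreement = agreement(late, reference)
print(early_agreement, late_agreement)
```

If agreement with the reference rises noticeably between the two attempts, then ratings collected early in the study are not directly comparable with ratings collected late in the study.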
10. Demand Effects
When people participate in a study, they are often curious about the study’s purpose. In psychological studies, participants are rarely informed about the study’s purpose or the hypothesis being tested.
Unfortunately, this doesn’t stop people from trying to figure it out. When participants try to guess what the study’s hypothesis is, and then act in a way to support that hypothesis, it is a major threat to internal validity.
Although it is nice of participants to try to help out in this way, it makes the study invalid. Plus, they might guess wrong and actually behave in a way that disconfirms the hypothesis.
Conclusion
There are many problems that can arise when conducting an experiment or any type of scientific study. Each problem can pose a threat to the internal validity of the research and make the conclusions invalid.
Common threats include confounding variables, where other variables change at the same time as the IV, and experimenter bias, where the person interacting with the participants treats the treatment and control groups differently.
In other cases, instruments used to collect data may lose accuracy over time, or participants may try to guess the purpose of the study and behave accordingly.
Social science is difficult to conduct and it can often take many years to answer a single question about human behavior.
References
Campbell, D. T., & Stanley, J. C. (1966). Experimental and Quasi-experimental Designs for Research. Chicago, IL: Rand McNally & Company.
Cook, T. D., & Campbell, D. T. (1979). Quasi-Experimentation: Design and Analysis Issues for Field Settings. Boston, MA: Houghton Mifflin.
Flannelly, K. J., Flannelly, L. T., & Jankowski, K. (2018). Threats to the internal validity of experimental and quasi-experimental research in healthcare. Journal of Health Care Chaplaincy, 24, 1-24. https://doi.org/10.1080/08854726.2017.1421019
Kenny, D. A. (2019). Enhancing validity in psychological research. The American Psychologist, 74(9), 1018–1028. https://doi.org/10.1037/amp0000531