
7.3 Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix quasi means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook & Campbell, 1979). Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here.

Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design, then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.

Pretest-Posttest Design

In a pretest-posttest design, the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of an antidrug education program on elementary school students’ attitudes toward illegal drugs. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the antidrug program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of history. Other things might have happened between the pretest and the posttest. Perhaps an antidrug program aired on television and many of the students watched it, or perhaps a celebrity died of a drug overdose and many of the students heard about it. Another category of alternative explanations goes under the name of maturation. Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become less impulsive or better reasoners and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean. This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect. A closely related concept—and an extremely important one in psychological research—is spontaneous remission. This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001). Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
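To see concretely how regression to the mean can masquerade as a treatment effect, here is a minimal simulation sketch (not from the original text; all numbers are invented). Each student’s observed score is a stable ability plus random luck, the lowest scorers are selected for a remediation program, and they are retested with no intervention at all:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
ability = rng.normal(50, 10, n)            # stable true skill
test1 = ability + rng.normal(0, 10, n)     # observed score = skill + luck
test2 = ability + rng.normal(0, 10, n)     # retest; no training took place

selected = test1 < np.percentile(test1, 10)   # the extreme low scorers
print(round(test1[selected].mean(), 1))       # very low, roughly 25
print(round(test2[selected].mean(), 1))       # noticeably higher, roughly 38
```

The selected group improves by more than 10 points even though nothing happened between the two tests, because extreme scores contain extreme luck that does not repeat.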

Does Psychotherapy Work?

Early studies on the effectiveness of psychotherapy tended to use pretest-posttest designs. In a classic 1952 article, researcher Hans Eysenck summarized the results of 24 such studies showing that about two thirds of patients improved between the pretest and the posttest (Eysenck, 1952). But Eysenck also compared these results with archival data from state hospital and insurance company records showing that similar patients recovered at about the same rate without receiving psychotherapy. This suggested to Eysenck that the improvement that patients showed in the pretest-posttest studies might be no more than spontaneous remission. Note that Eysenck did not conclude that psychotherapy was ineffective. He merely concluded that there was no evidence that it was effective, and he wrote of “the necessity of properly planned and executed experimental studies into this important field” (p. 323). You can read the entire article here:

http://psychclassics.yorku.ca/Eysenck/psychotherapy.htm

Fortunately, many other researchers took up Eysenck’s challenge, and by 1980 hundreds of experiments had been conducted in which participants were randomly assigned to treatment and control conditions, and the results were summarized in a classic book by Mary Lee Smith, Gene Glass, and Thomas Miller (Smith, Glass, & Miller, 1980). They found that overall psychotherapy was quite effective, with about 80% of treatment participants improving more than the average control participant. Subsequent research has focused more on the conditions under which different types of psychotherapy are more or less effective.

Hans Eysenck

In a classic 1952 article, researcher Hans Eysenck pointed out the shortcomings of the simple pretest-posttest design for evaluating the effectiveness of psychotherapy.

Wikimedia Commons – CC BY-SA 3.0.

Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this is “interrupted” by a treatment. In one classic example, the treatment was the reduction of the work shifts in a factory from 10 hours to 8 hours (Cook & Campbell, 1979). Because productivity increased rather quickly after the shortening of the work shifts, and because it remained elevated for many months afterward, the researcher concluded that the shortening of the shifts caused the increase in productivity. Notice that the interrupted time-series design is like a pretest-posttest design in that it includes measurements of the dependent variable both before and after the treatment. It is unlike the pretest-posttest design, however, in that it includes multiple pretest and posttest measurements.

Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.

Figure 7.5 A Hypothetical Interrupted Time-Series Design

The top panel shows data that suggest that the treatment caused a reduction in absences. The bottom panel shows data that suggest that it did not.
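Because the figure is hypothetical anyway, the bottom-panel scenario is easy to simulate. The sketch below (with invented Poisson-distributed weekly absences and no real treatment effect) shows why the single Week 7 to Week 8 comparison can mislead while the pre- and post-treatment means tell the truer story:

```python
import numpy as np

rng = np.random.default_rng(1)
weeks = np.arange(1, 15)
absences = rng.poisson(5, weeks.size)   # the "treatment" has no effect here

pre, post = absences[weeks <= 7], absences[weeks >= 8]   # treatment after week 7
print("Week 7 -> Week 8:", absences[6], "->", absences[7])
print("Pre mean: %.1f   Post mean: %.1f" % (pre.mean(), post.mean()))
```

Depending on the random draw, the single-week comparison can show an apparent drop or rise, while the two means stay close; the multiple measurements are what expose such a change as ordinary week-to-week variation.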

Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an antidrug program, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to an antidrug program, and finally are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.
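The arithmetic behind this comparison is often called difference-in-differences. A minimal sketch with made-up mean attitude scores (higher = more negative toward drugs) shows how subtracting the control school’s change removes whatever history and maturation contributed to both schools:

```python
treat_pre, treat_post = 3.1, 4.0   # school that received the antidrug program
ctrl_pre, ctrl_post = 3.0, 3.4     # comparison school, no program

treat_change = treat_post - treat_pre   # treatment + history + maturation
ctrl_change = ctrl_post - ctrl_pre      # history + maturation only
effect = treat_change - ctrl_change     # difference-in-differences estimate
print(f"Estimated treatment effect: {effect:.1f} points")   # 0.9 - 0.4 = 0.5
```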

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi experiment. In fact, it is the kind of experiment that Eysenck called for—and that has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two college professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

Discussion: Imagine that a group of obese children is recruited for a study in which their weight is measured, then they participate for 3 months in a program that encourages them to be more active, and finally their weight is measured again. Explain how each of the following might affect the results:

  • regression to the mean
  • spontaneous remission

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.

Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319–324.

Posternak, M. A., & Miller, I. (2001). Untreated short-term course of major depression: A meta-analysis of outcomes from studies using wait-list control groups. Journal of Affective Disorders, 66, 139–146.

Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.

Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Statistics By Jim

Making statistics intuitive

Quasi Experimental Design Overview & Examples

By Jim Frost

What is a Quasi Experimental Design?

A quasi experimental design is a method for identifying causal relationships that does not randomly assign participants to the experimental groups. Instead, researchers use a non-random process. For example, they might use an eligibility cutoff score or preexisting groups to determine who receives the treatment.


Quasi-experimental research is a design that closely resembles experimental research but is different. The term “quasi” means “resembling,” so you can think of it as a cousin to actual experiments. In these studies, researchers can manipulate an independent variable — that is, they change one factor to see what effect it has. However, unlike true experimental research, participants are not randomly assigned to different groups.

Learn more about Experimental Designs: Definition & Types.

When to Use Quasi-Experimental Design

Researchers typically use a quasi-experimental design because they can’t randomize due to practical or ethical concerns. For example:

  • Practical Constraints : A school interested in testing a new teaching method can only implement it in preexisting classes and cannot randomly assign students.
  • Ethical Concerns : A medical study might not be able to randomly assign participants to a treatment group for an experimental medication when they are already taking a proven drug.

Quasi-experimental designs also come in handy when researchers want to study the effects of naturally occurring events, like policy changes or environmental shifts, where they can’t control who is exposed to the treatment.

Quasi-experimental designs occupy a unique position in the spectrum of research methodologies, sitting between observational studies and true experiments. This middle ground offers a blend of both worlds, addressing some limitations of purely observational studies while navigating the constraints often accompanying true experiments.

A significant advantage of quasi-experimental research over purely observational studies and correlational research is that it addresses the issue of directionality, determining which variable is the cause and which is the effect. In quasi-experiments, an intervention typically occurs during the investigation, and the researchers record outcomes before and after it, increasing the confidence that it causes the observed changes.

However, it’s crucial to recognize its limitations as well. Controlling confounding variables is a larger concern for a quasi-experimental design than a true experiment because it lacks random assignment.

In sum, quasi-experimental designs offer a valuable research approach when random assignment is not feasible, providing a more structured and controlled framework than observational studies while acknowledging and attempting to address potential confounders.

Types of Quasi-Experimental Designs and Examples

Quasi-experimental studies use various methods, depending on the scenario.

Natural Experiments

This design uses naturally occurring events or changes to create the treatment and control groups. Researchers compare outcomes between those whom the event affected and those it did not affect. Analysts use statistical controls to account for confounders that the researchers must also measure.

Natural experiments are related to observational studies, but they allow for a clearer causality inference because the external event or policy change provides both a form of quasi-random group assignment and a definite start date for the intervention.

For example, in a natural experiment utilizing a quasi-experimental design, researchers study the impact of a significant economic policy change on small business growth. The policy is implemented in one state but not in neighboring states. This scenario creates an unplanned experimental setup, where the state with the new policy serves as the treatment group, and the neighboring states act as the control group.

Researchers are primarily interested in small business growth rates but need to record various confounders that can impact growth rates. Hence, they record state economic indicators, investment levels, and employment figures. By recording these metrics across the states, they can include them in the model as covariates and control them statistically. This method allows researchers to estimate differences in small business growth due to the policy itself, separate from the various confounders.
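As a sketch of what “including them in the model as covariates” can look like in practice, the following simulates hypothetical business-level data (the variable names growth, policy, econ_index, and employment are invented for illustration) and fits an ordinary least squares model with statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 200                                     # hypothetical business records
policy = rng.integers(0, 2, n)              # 1 = located in the policy state
econ_index = rng.normal(100, 10, n)         # measured confounder
employment = rng.normal(60, 5, n)           # measured confounder
growth = (0.5 * policy + 0.03 * econ_index  # true policy effect set to 0.5
          + 0.02 * employment + rng.normal(0, 1, n))

df = pd.DataFrame({"growth": growth, "policy": policy,
                   "econ_index": econ_index, "employment": employment})
fit = smf.ols("growth ~ policy + econ_index + employment", data=df).fit()
print(fit.params["policy"])                 # recovers roughly 0.5
```

The coefficient on policy estimates the policy’s effect with the measured confounders held constant; unmeasured confounders, of course, remain uncontrolled.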

Nonequivalent Groups Design

This method involves matching existing groups that are similar but not identical. Researchers attempt to find groups that are as equivalent as possible, particularly for factors likely to affect the outcome.

For instance, researchers use a nonequivalent groups quasi-experimental design to evaluate the effectiveness of a new teaching method in improving students’ mathematics performance. A school district considering the teaching method is planning the study. Students are already divided into schools, preventing random assignment.

The researchers match two schools with similar demographics, baseline academic performance, and resources. The school using the traditional methodology is the control, while the other uses the new approach. Researchers are evaluating differences in educational outcomes between the two methods.

They perform a pretest to identify differences between the schools that might affect the outcome and include them as covariates to control for confounding. They also record outcomes before and after the intervention to have a larger context for the changes they observe.

Regression Discontinuity

This process assigns subjects to a treatment or control group based on a predetermined cutoff point (e.g., a test score). The analysis primarily focuses on participants near the cutoff point, as they are likely similar except for the treatment received. By comparing participants just above and below the cutoff, the design controls for confounders that vary smoothly around the cutoff.

For example, in a regression discontinuity quasi-experimental design focusing on a new medical treatment for depression, researchers use depression scores as the cutoff point. Individuals with depression scores just above a certain threshold are assigned to receive the latest treatment, while those just below the threshold do not receive it. This method creates two closely matched groups: one that barely qualifies for treatment and one that barely misses out.

By comparing the mental health outcomes of these two groups over time, researchers can assess the effectiveness of the new treatment. The assumption is that the only significant difference between the groups is whether they received the treatment, thereby isolating its impact on depression outcomes.
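Here is a minimal sketch of the regression discontinuity logic, with invented numbers: depression scores run from 0 to 40, the cutoff is 20, and the simulated treatment improves outcomes by 5 points. Fitting a line on each side of the centered cutoff within a narrow bandwidth estimates the jump:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 2_000
score = rng.uniform(0, 40, n)                  # depression score at intake
treated = (score >= 20).astype(int)            # cutoff rule, not randomization
outcome = 30 - 0.4 * score + 5 * treated + rng.normal(0, 3, n)

df = pd.DataFrame({"score": score, "treated": treated, "outcome": outcome})
near = df[(df.score >= 15) & (df.score <= 25)].copy()  # bandwidth around cutoff
near["c"] = near["score"] - 20                         # center running variable
fit = smf.ols("outcome ~ treated + c + treated:c", data=near).fit()
print(fit.params["treated"])                           # jump at the cutoff, ~5
```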

Controlling Confounders in a Quasi-Experimental Design

Accounting for confounding variables is a challenging but essential task for a quasi-experimental design.

In a true experiment, the random assignment process equalizes confounders across the groups to nullify their overall effect. It’s the gold standard because it works on all confounders, known and unknown.

Unfortunately, the lack of random assignment can allow differences between the groups to exist before the intervention. These confounding factors might ultimately explain the results rather than the intervention.

Consequently, researchers must use other methods to make the groups roughly equivalent, such as matching or cutoff values, or they must statistically adjust for preexisting differences they can measure to reduce the impact of confounders.

A key strength of quasi-experiments is their frequent use of “pre-post testing.” This approach involves conducting initial tests before the intervention to check for preexisting differences between groups that could impact the study’s outcome. By identifying these variables early on and including them as covariates, researchers can more effectively control potential confounders in their statistical analysis.

Additionally, researchers frequently track outcomes before and after the intervention to better understand the context for changes they observe.

Statisticians consider these methods to be less effective than randomization. Hence, quasi-experiments fall somewhere in the middle when it comes to internal validity, or how well the study can identify causal relationships versus mere correlation. They’re more conclusive than correlational studies but not as solid as true experiments.

In conclusion, quasi-experimental designs offer researchers a versatile and practical approach when random assignment is not feasible. This methodology bridges the gap between controlled experiments and observational studies, providing a valuable tool for investigating cause-and-effect relationships in real-world settings. Researchers can address ethical and logistical constraints by understanding and leveraging the different types of quasi-experimental designs while still obtaining insightful and meaningful results.

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.


A Modern Guide to Understanding and Conducting Research in Psychology

Chapter 7 Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix quasi means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook et al., 1979). Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here, focusing first on nonequivalent groups, pretest-posttest, interrupted time series, and combination designs before turning to single-subject designs (including reversal and multiple-baseline designs).

7.1 Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design, then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.

7.2 Pretest-Posttest Design

In a pretest-posttest design, the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of a STEM education program on elementary school students’ attitudes toward science, technology, engineering and math. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the STEM program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of history. Other things might have happened between the pretest and the posttest. Perhaps a science program aired on television and many of the students watched it, or perhaps a major scientific discovery occurred and many of the students heard about it. Another category of alternative explanations goes under the name of maturation. Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become more exposed to STEM subjects in class or better reasoners and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean. This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect. A closely related concept—and an extremely important one in psychological research—is spontaneous remission. This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001). Thus one must generally be very cautious about inferring causality from pretest-posttest designs.

Finally, it is possible that the act of taking a pretest can sensitize participants to the measurement process or heighten their awareness of the variable under investigation. This heightened sensitivity, called a testing effect, can subsequently lead to changes in their posttest responses, even in the absence of any external intervention effect.

7.3 Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this is “interrupted” by a treatment. In a recent COVID-19 study, the intervention involved the implementation of state-issued mask mandates and restrictions on on-premises restaurant dining. The researchers examined the impact of these measures on COVID-19 cases and deaths (Guy Jr et al., 2021). Since there was a rapid reduction in daily case and death growth rates following the implementation of mask mandates, and this effect persisted for an extended period, the researchers concluded that the implementation of mask mandates was the cause of the decrease in COVID-19 transmission. This study employed an interrupted time series design, similar to a pretest-posttest design, as it involved measuring the outcomes before and after the intervention. However, unlike the pretest-posttest design, it incorporated multiple measurements before and after the intervention, providing a more comprehensive analysis of the policy impacts.

Figure 7.1 shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of Figure 7.1 shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of Figure 7.1 shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.


Figure 7.1: Hypothetical interrupted time-series design. The top panel shows data that suggest that the treatment caused a reduction in absences. The bottom panel shows data that suggest that it did not.
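A common way to analyze data like these (or the COVID-19 policy series above) is segmented regression, which estimates a level change and a slope change at the interruption. Here is a sketch with simulated weekly absences, assuming the treatment begins at week 8; none of these numbers come from an actual study:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
week = np.arange(1, 15)
post = (week >= 8).astype(int)              # attendance-taking starts week 8
absences = 6 - 4 * post + rng.normal(0, 0.7, week.size)  # sustained drop

df = pd.DataFrame({"week": week, "post": post,
                   "since": np.maximum(0, week - 7), "absences": absences})
fit = smf.ols("absences ~ week + post + since", data=df).fit()
print(fit.params[["post", "since"]])        # level change ~ -4, slope change ~ 0
```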

7.4 Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their current level of engagement in pro-environmental behaviors (i.e., recycling, eating less red meat, abstaining from single-use plastics, etc.), then are exposed to a pro-environmental program in which they learn about the effects of human-caused climate change on the planet, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to a pro-environmental program, and finally are given a posttest. Again, if students in the treatment condition become more involved in pro-environmental behaviors, this could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should come to engage in more pro-environmental behaviors than students in the control condition. But if it is a matter of history (e.g., news of a forest fire or drought) or maturation (e.g., improved reasoning or sense of responsibility), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a local heat wave with record high temperatures), so students at the first school would be affected by it while students at the other school would not.

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi experiment. In fact, this kind of design has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

KEY TAKEAWAYS

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two college professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

Discussion: Imagine that a group of obese children is recruited for a study in which their weight is measured, then they participate for 3 months in a program that encourages them to be more active, and finally their weight is measured again. Explain how each of the following might affect the results:

  • regression to the mean
  • spontaneous remission

7.5 Single-Subject Research

Learning Objectives

  • Explain what single-subject research is, including how it differs from other types of psychological research and who uses single-subject research and why.
  • Design simple single-subject studies using reversal and multiple-baseline designs.
  • Explain how single-subject research designs address the issue of internal validity.
  • Interpret the results of simple single-subject studies based on the visual inspection of graphed data.
  • Explain some of the points of disagreement between advocates of single-subject research and advocates of group research.

Researcher Vance Hall and his colleagues were faced with the challenge of increasing the extent to which six disruptive elementary school students stayed focused on their schoolwork (Hall et al., 1968). For each of several days, the researchers carefully recorded whether or not each student was doing schoolwork every 10 seconds during a 30-minute period. Once they had established this baseline, they introduced a treatment. The treatment was that when the student was doing schoolwork, the teacher gave him or her positive attention in the form of a comment like “good work” or a pat on the shoulder. The result was that all of the students dramatically increased their time spent on schoolwork and decreased their disruptive behavior during this treatment phase. For example, a student named Robbie originally spent 25% of his time on schoolwork and the other 75% “snapping rubber bands, playing with toys from his pocket, and talking and laughing with peers” (p. 3). During the treatment phase, however, he spent 71% of his time on schoolwork and only 29% on other activities. Finally, when the researchers had the teacher stop giving positive attention, the students all decreased their studying and increased their disruptive behavior. This was consistent with the claim that it was, in fact, the positive attention that was responsible for the increase in studying. This was one of the first studies to show that attending to positive behavior—and ignoring negative behavior—could be a quick and effective way to deal with problem behavior in an applied setting.


Figure 7.2: Single-subject research has shown that positive attention from a teacher for studying can increase studying and decrease disruptive behavior. Photo by Jerry Wang on Unsplash.

Most of this book is about what can be called group research, which typically involves studying a large number of participants and combining their data to draw general conclusions about human behavior. The study by Hall and his colleagues, in contrast, is an example of single-subject research, which typically involves studying a small number of participants and focusing closely on each individual. In this section, we consider this alternative approach. We begin with an overview of single-subject research, including some assumptions on which it is based, who conducts it, and why they do. We then look at some basic single-subject research designs and how the data from those designs are analyzed. Finally, we consider some of the strengths and weaknesses of single-subject research as compared with group research and see how these two approaches can complement each other.

Overview of Single-Subject Research

What Is Single-Subject Research?

Single-subject research is a type of quantitative, quasi-experimental research that involves studying in detail the behavior of each of a small number of participants. Note that the term single-subject does not mean that only one participant is studied; it is more typical for there to be somewhere between two and 10 participants. (This is why single-subject research designs are sometimes called small-n designs, where n is the statistical symbol for the sample size.) Single-subject research can be contrasted with group research, which typically involves studying large numbers of participants and examining their behavior primarily in terms of group means, standard deviations, and so on. The majority of this book is devoted to understanding group research, which is the most common approach in psychology. But single-subject research is an important alternative, and it is the primary approach in some areas of psychology.

Before continuing, it is important to distinguish single-subject research from two other approaches, both of which involve studying in detail a small number of participants. One is qualitative research, which focuses on understanding people’s subjective experience by collecting relatively unstructured data (e.g., detailed interviews) and analyzing those data using narrative rather than quantitative techniques. Single-subject research, in contrast, focuses on understanding objective behavior through experimental manipulation and control, collecting highly structured data, and analyzing those data quantitatively.

It is also important to distinguish single-subject research from case studies. A case study is a detailed description of an individual, which can include both qualitative and quantitative analyses. (Case studies that include only qualitative analyses can be considered a type of qualitative research.) The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see box “The Case of ‘Anna O.’”) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920), who learned to fear a white rat—along with other furry objects—when the researchers made a loud noise while he was playing with the rat. Case studies can be useful for suggesting new research questions and for illustrating general principles. They can also help researchers understand rare phenomena, such as the effects of damage to a specific part of the human brain. As a general rule, however, case studies cannot substitute for carefully designed group or single-subject research studies. One reason is that case studies usually do not allow researchers to determine whether specific events are causally related, or even related at all. For example, if a patient is described in a case study as having been sexually abused as a child and then as having developed an eating disorder as a teenager, there is no way to determine whether these two events had anything to do with each other. A second reason is that an individual case can always be unusual in some way and therefore be unrepresentative of people more generally. Thus case studies have serious problems with both internal and external validity.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis (Freud, 1957). (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst (p. 9).

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return.

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

"Anna O." was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. Source: Wikimedia Commons

Figure 7.3: “Anna O.” was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. Source: Wikimedia Commons

Assumptions of Single-Subject Research

Again, single-subject research involves studying a small number of participants and focusing intensively on the behavior of each one. But why take this approach instead of the group approach? There are two important assumptions underlying single-subject research, and it will help to consider them now.

First and foremost is the assumption that it is important to focus intensively on the behavior of individual participants. One reason for this is that group research can hide individual differences and generate results that do not represent the behavior of any individual. For example, a treatment that has a positive effect for half the people exposed to it but a negative effect for the other half would, on average, appear to have no effect at all. Single-subject research, however, would likely reveal these individual differences. A second reason to focus intensively on individuals is that sometimes it is the behavior of a particular individual that is primarily of interest. A school psychologist, for example, might be interested in changing the behavior of a particular disruptive student. Although previous published research (both single-subject and group research) is likely to provide some guidance on how to do this, conducting a study on this student would be more direct and probably more effective.

Another assumption of single-subject research is that it is important to study strong and consistent effects that have biological or social importance. Applied researchers, in particular, are interested in treatments that have substantial effects on important behaviors and that can be implemented reliably in the real-world contexts in which they occur. This is sometimes referred to as social validity (Wolf, 1978). The study by Hall and his colleagues, for example, had good social validity because it showed strong and consistent effects of positive teacher attention on a behavior that is of obvious importance to teachers, parents, and students. Furthermore, the teachers found the treatment easy to implement, even in their often chaotic elementary school classrooms.

Who Uses Single-Subject Research?

Single-subject research has been around as long as the field of psychology itself. In the late 1800s, one of psychology’s founders, Wilhelm Wundt, studied sensation and consciousness by focusing intensively on each of a small number of research participants. Herman Ebbinghaus’s research on memory and Ivan Pavlov’s research on classical conditioning are other early examples, both of which are still described in almost every introductory psychology textbook.

In the middle of the 20th century, B. F. Skinner clarified many of the assumptions underlying single-subject research and refined many of its techniques (Skinner, 1938). He and other researchers then used it to describe how rewards, punishments, and other external factors affect behavior over time. This work was carried out primarily using nonhuman subjects—mostly rats and pigeons. This approach, which Skinner called the experimental analysis of behavior, remains an important subfield of psychology and continues to rely almost exclusively on single-subject research. For examples of this work, look at any issue of the Journal of the Experimental Analysis of Behavior. By the 1960s, many researchers were interested in using this approach to conduct applied research primarily with humans—a subfield now called applied behavior analysis (Baer et al., 1968). Applied behavior analysis plays a significant role in contemporary research on developmental disabilities, education, organizational behavior, and health, among many other areas. Examples of this work (including the study by Hall and his colleagues) can be found in the Journal of Applied Behavior Analysis. The single-subject approach can also be used by clinicians who take any theoretical perspective—behavioral, cognitive, psychodynamic, or humanistic—to study processes of therapeutic change with individual clients and to document their clients’ improvement (Kazdin, 2019).

Single-Subject Research Designs

General Features of Single-Subject Designs

Before looking at any specific single-subject research designs, it will be helpful to consider some features that are common to most of them. Many of these features are illustrated in Figure 7.4, which shows the results of a generic single-subject study. First, the dependent variable (represented on the y-axis of the graph) is measured repeatedly over time (represented by the x-axis) at regular intervals. Second, the study is divided into distinct phases, and the participant is tested under one condition per phase. The conditions are often designated by capital letters: A, B, C, and so on. Thus Figure 7.4 represents a design in which the participant was tested first in one condition (A), then tested in another condition (B), and finally retested in the original condition (A). (This is called a reversal design and will be discussed in more detail shortly.)


Figure 7.4: Results of a generic single-subject study illustrating several principles of single-subject research.

Another important aspect of single-subject research is that the change from one condition to the next does not usually occur after a fixed amount of time or number of observations. Instead, it depends on the participant’s behavior. Specifically, the researcher waits until the participant’s behavior in one condition becomes fairly consistent from observation to observation before changing conditions. This is sometimes referred to as the steady state strategy (Sidman, 1960). The idea is that when the dependent variable has reached a steady state, then any change across conditions will be relatively easy to detect. Recall that we encountered this same principle when discussing experimental research more generally. The effect of an independent variable is easier to detect when the “noise” in the data is minimized.

Reversal Designs

The most basic single-subject research design is the reversal design, also called the ABA design. During the first phase, A, a baseline is established for the dependent variable. This is the level of responding before any treatment is introduced, and therefore the baseline phase is a kind of control condition. When steady state responding is reached, phase B begins as the researcher introduces the treatment. Again, the researcher waits until the dependent variable reaches a steady state so that it is clear whether and how much it has changed. Finally, the researcher removes the treatment and again waits until the dependent variable reaches a steady state. This basic reversal design can also be extended with the reintroduction of the treatment (ABAB), another return to baseline (ABABA), and so on. The study by Hall and his colleagues was an ABAB reversal design (Figure 7.5).

Figure 7.5: An approximation of the results for Hall and colleagues’ participant Robbie in their ABAB reversal design. The percentage of time he spent studying (the dependent variable) was low during the first baseline phase, increased during the first treatment phase until it leveled off, decreased during the second baseline phase, and again increased during the second treatment phase.

Why is the reversal—the removal of the treatment—considered to be necessary in this type of design? If the dependent variable changes after the treatment is introduced, it is not always clear that the treatment was responsible for the change. It is possible that something else changed at around the same time and that this extraneous variable is responsible for the change in the dependent variable. But if the dependent variable changes with the introduction of the treatment and then changes back with the removal of the treatment, it is much clearer that the treatment (and removal of the treatment) is the cause. In other words, the reversal greatly increases the internal validity of the study.

Multiple-Baseline Designs

There are two potential problems with the reversal design—both of which have to do with the removal of the treatment. One is that if a treatment is working, it may be unethical to remove it. For example, if a treatment seemed to reduce the incidence of self-injury in a developmentally disabled child, it would be unethical to remove that treatment just to show that the incidence of self-injury increases. The second problem is that the dependent variable may not return to baseline when the treatment is removed. For example, when positive attention for studying is removed, a student might continue to study at an increased rate. This could mean that the positive attention had a lasting effect on the student’s studying, which of course would be good, but it could also mean that the positive attention was not really the cause of the increased studying in the first place.

One solution to these problems is to use a multiple-baseline design , which is represented in Figure 7.6 . In one version of the design, a baseline is established for each of several participants, and the treatment is then introduced for each one. In essence, each participant is tested in an AB design. The key to this design is that the treatment is introduced at a different time for each participant. The idea is that if the dependent variable changes when the treatment is introduced for one participant, it might be a coincidence. But if the dependent variable changes when the treatment is introduced for multiple participants—especially when the treatment is introduced at different times for the different participants—then it is less likely to be a coincidence.

Figure 7.6: Results of a generic multiple-baseline study. The multiple baselines can be for different participants, dependent variables, or settings. The treatment is introduced at a different time on each baseline.

As an example, consider a study by Scott Ross and Robert Horner (Ross & Horner, 2009). They were interested in how a school-wide bullying prevention program affected the bullying behavior of particular problem students. At each of three different schools, the researchers studied two students who had regularly engaged in bullying. During the baseline phase, they observed the students for 10-minute periods each day during lunch recess and counted the number of aggressive behaviors they exhibited toward their peers. (The researchers used handheld computers to help record the data.) After 2 weeks, they implemented the program at one school. After 2 more weeks, they implemented it at the second school. And after 2 more weeks, they implemented it at the third school. They found that the number of aggressive behaviors exhibited by each student dropped shortly after the program was implemented at his or her school. Notice that if the researchers had only studied one school or if they had introduced the treatment at the same time at all three schools, then it would be unclear whether the reduction in aggressive behaviors was due to the bullying program or something else that happened at about the same time it was introduced (e.g., a holiday, a television program, a change in the weather). But with their multiple-baseline design, this kind of coincidence would have to happen three separate times—an unlikely occurrence—to explain their results.

Data Analysis in Single-Subject Research

In addition to its focus on individual participants, single-subject research differs from group research in the way the data are typically analyzed. As we have seen throughout the book, group research involves combining data across participants. Inferential statistics are used to help decide whether the result for the sample is likely to generalize to the population. Single-subject research, by contrast, relies heavily on a very different approach called visual inspection. This means plotting individual participants’ data as shown throughout this chapter, looking carefully at those data, and making judgments about whether and to what extent the independent variable had an effect on the dependent variable. Inferential statistics are typically not used.

In visually inspecting their data, single-subject researchers take several factors into account. One of them is changes in the level of the dependent variable from condition to condition. If the dependent variable is much higher or much lower in one condition than another, this suggests that the treatment had an effect. A second factor is trend, which refers to gradual increases or decreases in the dependent variable across observations. If the dependent variable begins increasing or decreasing with a change in conditions, then again this suggests that the treatment had an effect. It can be especially telling when a trend changes directions—for example, when an unwanted behavior is increasing during baseline but then begins to decrease with the introduction of the treatment. A third factor is latency, which is the time it takes for the dependent variable to begin changing after a change in conditions. In general, if a change in the dependent variable begins shortly after a change in conditions, this suggests that the treatment was responsible.
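To make level and trend concrete, here is a minimal Python sketch (all data and phase labels are invented for illustration) that computes the mean level and least-squares slope of each phase in a hypothetical ABA series; latency is usually judged from the plot itself rather than computed:

```python
import itertools
import numpy as np

def phase_summary(labels, observations):
    """Print the level (mean) and trend (least-squares slope) of each
    contiguous phase in a single-subject data series."""
    pairs = zip(labels, observations)
    for phase, run in itertools.groupby(pairs, key=lambda pair: pair[0]):
        y = np.array([value for _, value in run], dtype=float)
        x = np.arange(len(y))
        slope = np.polyfit(x, y, 1)[0] if len(y) > 1 else 0.0
        print(f"Phase {phase}: level = {y.mean():.1f}, trend = {slope:+.2f} per observation")

# Hypothetical percentage-of-time-studying data for an ABA reversal design.
labels = ["A"] * 4 + ["B"] * 4 + ["A"] * 4
observations = [22, 25, 24, 23, 45, 52, 60, 63, 38, 31, 27, 25]
phase_summary(labels, observations)
```

A large jump in level between phases, a change in the sign of the trend, or both would be taken as evidence of a treatment effect in a real analysis.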

In the top panel of Figure 7.7, there are fairly obvious changes in the level and trend of the dependent variable from condition to condition. Furthermore, the latencies of these changes are short; the change happens immediately. This pattern of results strongly suggests that the treatment was responsible for the changes in the dependent variable. In the bottom panel of Figure 7.7, however, the changes in level are fairly small. And although there appears to be an increasing trend in the treatment condition, it looks as though it might be a continuation of a trend that had already begun during baseline. This pattern of results strongly suggests that the treatment was not responsible for any changes in the dependent variable—at least not to the extent that single-subject researchers typically hope to see.

Figure 7.7: Visual inspection of the data suggests an effective treatment in the top panel but an ineffective treatment in the bottom panel.

The results of single-subject research can also be analyzed using statistical procedures—and this is becoming more common. There are many different approaches, and single-subject researchers continue to debate which are the most useful. One approach parallels what is typically done in group research. The mean and standard deviation of each participant’s responses under each condition are computed and compared, and inferential statistical tests such as the t test or analysis of variance are applied (Fisch, 2001). (Note that averaging across participants is less common.) Another approach is to compute the percentage of nonoverlapping data (PND) for each participant (Scruggs & Mastropieri, 2021). This is the percentage of responses in the treatment condition that are more extreme than the most extreme response in a relevant control condition. In the study of Hall and his colleagues, for example, all measures of Robbie’s study time in the first treatment condition were greater than the highest measure in the first baseline, for a PND of 100%. The greater the percentage of nonoverlapping data, the stronger the treatment effect. Still, formal statistical approaches to data analysis in single-subject research are generally considered a supplement to visual inspection, not a replacement for it.
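Because the PND calculation is spelled out completely in the text, it translates directly into code. The following Python sketch uses invented numbers chosen to reproduce the 100% result described for Robbie:

```python
def percentage_of_nonoverlapping_data(baseline, treatment, higher_is_better=True):
    """Percentage of treatment observations more extreme than the most
    extreme baseline observation."""
    if higher_is_better:
        extreme = max(baseline)
        nonoverlapping = [y for y in treatment if y > extreme]
    else:
        extreme = min(baseline)
        nonoverlapping = [y for y in treatment if y < extreme]
    return 100.0 * len(nonoverlapping) / len(treatment)

# Hypothetical study-time percentages: every treatment observation exceeds
# the highest baseline observation, so PND = 100%.
print(percentage_of_nonoverlapping_data([22, 25, 24, 23], [45, 52, 60, 63]))  # 100.0
```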

The Single-Subject Versus Group “Debate”

Single-subject research is similar to group research—especially experimental group research—in many ways. They are both quantitative approaches that try to establish causal relationships by manipulating an independent variable, measuring a dependent variable, and controlling extraneous variables. But there are also important differences between the two approaches, and these differences sometimes lead to disagreements between single-subject and group researchers. As we will see, single-subject research and group research are probably best conceptualized as complementary approaches.

Data Analysis

One set of disagreements revolves around the issue of data analysis. Some advocates of group research worry that visual inspection is inadequate for deciding whether and to what extent a treatment has affected a dependent variable. One specific concern is that visual inspection is not sensitive enough to detect weak effects. A second is that visual inspection can be unreliable, with different researchers reaching different conclusions about the same set of data (Danov & Symons, 2008). A third is that the results of visual inspection—an overall judgment of whether or not a treatment was effective—cannot be clearly and efficiently summarized or compared across studies (unlike the measures of relationship strength typically used in group research).

In general, single-subject researchers share these concerns. However, they also argue that their use of the steady state strategy, combined with their focus on strong and consistent effects, minimizes most of them. If the effect of a treatment is difficult to detect by visual inspection because the effect is weak or the data are noisy, then single-subject researchers look for ways to increase the strength of the effect or reduce the noise in the data by controlling extraneous variables (e.g., by administering the treatment more consistently). If the effect is still difficult to detect, then they are likely to consider it neither strong enough nor consistent enough to be of further interest. Many single-subject researchers also point out that statistical analysis is becoming increasingly common and that many of them are using it as a supplement to visual inspection—especially for the purpose of comparing results across studies (Scruggs & Mastropieri, 2021).

Turning the tables, some advocates of single-subject research worry about the way that group researchers analyze their data. Specifically, they point out that focusing on group means can be highly misleading. Again, imagine that a treatment has a strong positive effect on half the people exposed to it and an equally strong negative effect on the other half. In a traditional between-subjects experiment, the positive effect on half the participants in the treatment condition would be statistically cancelled out by the negative effect on the other half. The mean for the treatment group would then be the same as the mean for the control group, making it seem as though the treatment had no effect when in fact it had a strong effect on every single participant!

But again, group researchers share this concern. Although they do focus on group statistics, they also emphasize the importance of examining distributions of individual scores. For example, if some participants were positively affected by a treatment and others negatively affected by it, this would produce a bimodal distribution of scores and could be detected by looking at a histogram of the data. The use of within-subjects designs is another strategy that allows group researchers to observe effects at the individual level and even to specify what percentage of individuals exhibit strong, medium, weak, and even negative effects.

External Validity

The second issue about which single-subject and group researchers sometimes disagree has to do with external validity—the ability to generalize the results of a study beyond the people and situation actually studied. In particular, advocates of group research point out the difficulty in knowing whether results for just a few participants are likely to generalize to others in the population. Imagine, for example, that in a single-subject study, a treatment has been shown to reduce self-injury for each of two developmentally disabled children. Even if the effect is strong for these two children, how can one know whether this treatment is likely to work for other developmentally disabled children?

Again, single-subject researchers share this concern. In response, they note that the strong and consistent effects they are typically interested in—even when observed in small samples—are likely to generalize to others in the population. Single-subject researchers also note that they place a strong emphasis on replicating their research results. When they observe an effect with a small sample of participants, they typically try to replicate it with another small sample—perhaps with a slightly different type of participant or under slightly different conditions. Each time they observe similar results, they rightfully become more confident in the generality of those results. Single-subject researchers can also point to the fact that the principles of classical and operant conditioning—most of which were discovered using the single-subject approach—have been successfully generalized across an incredibly wide range of species and situations.

And again turning the tables, single-subject researchers have concerns of their own about the external validity of group research. One extremely important point they make is that studying large groups of participants does not entirely solve the problem of generalizing to other individuals. Imagine, for example, a treatment that has been shown to have a small positive effect on average in a large group study. It is likely that although many participants exhibited a small positive effect, others exhibited a large positive effect, and still others exhibited a small negative effect. When it comes to applying this treatment to another large group, we can be fairly sure that it will have a small effect on average. But when it comes to applying this treatment to another individual, we cannot be sure whether it will have a small, a large, or even a negative effect. Another point that single-subject researchers make is that group researchers also face a similar problem when they study a single situation and then generalize their results to other situations. For example, researchers who conduct a study on the effect of cell phone use on drivers on a closed oval track probably want to apply their results to drivers in many other real-world driving situations. But notice that this requires generalizing from a single situation to a population of situations. Thus the ability to generalize is based on much more than just the sheer number of participants one has studied. It requires a careful consideration of the similarity of the participants and situations studied to the population of participants and situations that one wants to generalize to (Shadish et al., 2002).

Single-Subject and Group Research as Complementary Methods

As with quantitative and qualitative research, it is probably best to conceptualize single-subject research and group research as complementary methods that have different strengths and weaknesses and that are appropriate for answering different kinds of research questions (Kazdin, 2019). Single-subject research is particularly good for testing the effectiveness of treatments on individuals when the focus is on strong, consistent, and biologically or socially important effects. It is especially useful when the behavior of particular individuals is of interest. Clinicians who work with only one individual at a time may find that it is their only option for doing systematic quantitative research.

Group research, on the other hand, is good for testing the effectiveness of treatments at the group level. Among the advantages of this approach is that it allows researchers to detect weak effects, which can be of interest for many reasons. For example, finding a weak treatment effect might lead to refinements of the treatment that eventually produce a larger and more meaningful effect. Group research is also good for studying interactions between treatments and participant characteristics. For example, if a treatment is effective for those who are high in motivation to change and ineffective for those who are low in motivation to change, then a group design can detect this much more efficiently than a single-subject design. Group research is also necessary to answer questions that cannot be addressed using the single-subject approach, including questions about independent variables that cannot be manipulated (e.g., number of siblings, extroversion, culture).

Key Takeaways

  • Single-subject research—which involves testing a small number of participants and focusing intensively on the behavior of each individual—is an important alternative to group research in psychology.
  • Single-subject studies must be distinguished from case studies, in which an individual case is described in detail. Case studies can be useful for generating new research questions, for studying rare phenomena, and for illustrating general principles. However, they cannot substitute for carefully controlled experimental or correlational studies because they are low in internal and external validity.
  • Single-subject research designs typically involve measuring the dependent variable repeatedly over time and changing conditions (e.g., from baseline to treatment) when the dependent variable has reached a steady state. This approach allows the researcher to see whether changes in the independent variable are causing changes in the dependent variable.
  • Single-subject researchers typically analyze their data by graphing them and making judgments about whether the independent variable is affecting the dependent variable based on level, trend, and latency.
  • Differences between single-subject research and group research sometimes lead to disagreements between single-subject and group researchers. These disagreements center on the issues of data analysis and external validity (especially generalization to other people). Single-subject research and group research are probably best seen as complementary methods, with different strengths and weaknesses, that are appropriate for answering different kinds of research questions.
Exercises

  • Practice: Design a single-subject study to answer one of the following questions: Does positive attention from a parent increase a child’s toothbrushing behavior? Does self-testing while studying improve a student’s performance on weekly spelling tests? Does regular exercise help relieve depression?
  • Practice: Create a graph that displays the hypothetical results for the study you designed in Exercise 1. Write a paragraph in which you describe what the results show. Be sure to comment on level, trend, and latency.
  • Discussion: Imagine you have conducted a single-subject study showing a positive effect of a treatment on the behavior of a man with social anxiety disorder. Your research has been criticized on the grounds that it cannot be generalized to others. How could you respond to this criticism?
  • Discussion: Imagine you have conducted a group study showing a positive effect of a treatment on the behavior of a group of people with social anxiety disorder, but your research has been criticized on the grounds that “average” effects cannot be generalized to individuals. How could you respond to this criticism?

7.6 Glossary

ABA design

The simplest reversal design, in which there is a baseline condition (A), followed by a treatment condition (B), followed by a return to baseline (A).

applied behavior analysis

A subfield of psychology that uses single-subject research and applies the principles of behavior analysis to real-world problems in areas that include education, developmental disabilities, organizational behavior, and health behavior.

baseline

A condition in a single-subject research design in which the dependent variable is measured repeatedly in the absence of any treatment. Most designs begin with a baseline condition, and many return to the baseline condition at least once.

case study

A detailed description of an individual case.

experimental analysis of behavior

A subfield of psychology founded by B. F. Skinner that uses single-subject research—often with nonhuman animals—to study relationships primarily between environmental conditions and objectively observable behaviors.

group research

A type of quantitative research that involves studying a large number of participants and examining their behavior in terms of means, standard deviations, and other group-level statistics.

interrupted time-series design

A research design in which a series of measurements of the dependent variable are taken both before and after a treatment.

item-order effect

The effect of responding to one survey item on responses to a later survey item.

maturation

Refers collectively to extraneous developmental changes in participants that can occur between a pretest and posttest or between the first and last measurements in a time series. It can provide an alternative explanation for an observed change in the dependent variable.

multiple-baseline design

A single-subject research design in which multiple baselines are established for different participants, different dependent variables, or different contexts and the treatment is introduced at a different time for each baseline.

naturalistic observation

An approach to data collection in which the behavior of interest is observed in the environment in which it typically occurs.

nonequivalent groups design

A between-subjects research design in which participants are not randomly assigned to conditions, usually because participants are in preexisting groups (e.g., students at different schools).

nonexperimental research

Research that lacks the manipulation of an independent variable or the random assignment of participants to conditions or orders of conditions.

open-ended item

A questionnaire item that asks a question and allows respondents to respond in whatever way they want.

percentage of nonoverlapping data

A statistic sometimes used in single-subject research. The percentage of observations in a treatment condition that are more extreme than the most extreme observation in a relevant baseline condition.

pretest-posttest design

A research design in which the dependent variable is measured (the pretest), a treatment is given, and the dependent variable is measured again (the posttest) to see if there is a change in the dependent variable from pretest to posttest.

quasi-experimental research

Research that involves the manipulation of an independent variable but lacks the random assignment of participants to conditions or orders of conditions. It is generally used in field settings to test the effectiveness of a treatment.

rating scale

An ordered set of response options to a closed-ended questionnaire item.

regression to the mean

The statistical fact that an individual who scores extremely on one occasion will tend to score less extremely on the next occasion.

respondent

A term often used to refer to a participant in survey research.

reversal design

A single-subject research design that begins with a baseline condition with no treatment, followed by the introduction of a treatment, and after that a return to the baseline condition. It can include additional treatment conditions and returns to baseline.

single-subject research

A type of quantitative research that involves examining in detail the behavior of each of a small number of participants.

single-variable research

Research that focuses on a single variable rather than on a statistical relationship between variables.

social validity

The extent to which a single-subject study focuses on an intervention that has a substantial effect on an important behavior and can be implemented reliably in the real-world contexts (e.g., by teachers in a classroom) in which that behavior occurs.

spontaneous remission

Improvement in a psychological or medical problem over time without any treatment.

steady state strategy

In single-subject research, allowing behavior to become fairly consistent from one observation to the next before changing conditions. This makes any effect of the treatment easier to detect.

survey research

A quantitative research approach that uses self-report measures and large, carefully selected samples.

testing effect

A bias in participants’ responses in which scores on the posttest are influenced by simple exposure to the pretest.

visual inspection

The primary approach to data analysis in single-subject research, which involves graphing the data and making a judgment as to whether and to what extent the independent variable affected the dependent variable.

Quasi-Experimental Research Design – Types, Methods

Quasi-Experimental Design

Quasi-experimental design is a research method that seeks to evaluate the causal relationships between variables, but without the full control over the independent variable(s) that is available in a true experimental design.

In a quasi-experimental design, the researcher works with existing groups of participants that are not randomly assigned to the experimental and control conditions. Instead, the groups are selected on the basis of pre-existing characteristics or circumstances, such as age, gender, or the presence of a certain medical condition.

Types of Quasi-Experimental Design

There are several types of quasi-experimental designs that researchers use to study causal relationships between variables. Here are some of the most common types:

Non-Equivalent Control Group Design

This design involves selecting two groups of participants that are as similar as possible on everything except the independent variable(s) the researcher is testing. One group receives the treatment or intervention being studied, while the other group does not. The two groups are then compared to see if there are any significant differences in the outcomes.

Interrupted Time-Series Design

This design involves collecting data on the dependent variable(s) over a period of time, both before and after an intervention or event. The researcher can then determine whether there was a significant change in the dependent variable(s) following the intervention or event.

Pretest-Posttest Design

This design involves measuring the dependent variable(s) before and after an intervention or event, but without a control group. This design can be useful for determining whether the intervention or event had an effect, but it does not allow for control over other factors that may have influenced the outcomes.

Regression Discontinuity Design

This design involves selecting participants based on a specific cutoff point on a continuous variable, such as a test score. Participants on either side of the cutoff point are then compared to determine whether the intervention or event had an effect.

Natural Experiments

This design involves studying the effects of an intervention or event that occurs naturally, without the researcher’s intervention. For example, a researcher might study the effects of a new law or policy that affects certain groups of people. This design is useful when true experiments are not feasible or ethical.

Data Analysis Methods

Here are some data analysis methods that are commonly used in quasi-experimental designs:

Descriptive Statistics

This method involves summarizing the data collected during a study using measures such as mean, median, mode, range, and standard deviation. Descriptive statistics can help researchers identify trends or patterns in the data, and can also be useful for identifying outliers or anomalies.
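As a minimal illustration, all of the measures named above can be computed with Python's standard library (the scores are invented for the example):

```python
import statistics

scores = [68, 74, 71, 80, 65, 90, 73, 77, 74, 69]  # hypothetical posttest scores

print("mean:  ", statistics.mean(scores))
print("median:", statistics.median(scores))
print("mode:  ", statistics.mode(scores))   # most common value (74 here)
print("range: ", max(scores) - min(scores))
print("stdev: ", statistics.stdev(scores))  # sample standard deviation
```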

Inferential Statistics

This method involves using statistical tests to determine whether the results of a study are statistically significant. Inferential statistics can help researchers make generalizations about a population based on the sample data collected during the study. Common statistical tests used in quasi-experimental designs include t-tests, ANOVA, and regression analysis.
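For instance, a two-sample t test comparing a treatment group with a nonequivalent comparison group takes only a few lines with SciPy. The scores below are invented; Welch's version of the test is used here because nonequivalent groups often have unequal variances:

```python
from scipy import stats

treatment  = [78, 85, 90, 74, 88, 81, 79, 92]  # hypothetical outcome scores
comparison = [70, 72, 81, 68, 75, 77, 66, 74]

# Welch's t test does not assume the two groups have equal variances.
result = stats.ttest_ind(treatment, comparison, equal_var=False)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```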

Propensity Score Matching

This method is used to reduce bias in quasi-experimental designs by matching participants in the intervention group with participants in the control group who have similar characteristics. This can help to reduce the impact of confounding variables that may affect the study’s results.
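One common way to implement this idea, sketched below in Python with entirely hypothetical data, is to estimate each participant's probability of receiving the treatment (the propensity score) with a logistic regression and then greedily pair each treated participant with the untreated participant whose score is closest:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "baseline_score": rng.normal(50, 8, n),
})
# Hypothetical selection effect: older participants are more likely treated.
df["treated"] = (df["age"] + rng.normal(0, 10, n) > 42).astype(int)

# Step 1: estimate propensity scores from the observed covariates.
model = LogisticRegression().fit(df[["age", "baseline_score"]], df["treated"])
df["pscore"] = model.predict_proba(df[["age", "baseline_score"]])[:, 1]

# Step 2: greedy 1:1 nearest-neighbor matching without replacement.
treated = df[df["treated"] == 1]
pool = df[df["treated"] == 0].copy()
pairs = []
for idx, row in treated.iterrows():
    if pool.empty:
        break  # no untreated participants left to match
    j = (pool["pscore"] - row["pscore"]).abs().idxmin()
    pairs.append((idx, j))
    pool = pool.drop(j)

print(f"{len(pairs)} matched pairs formed")
```

After matching, outcomes can be compared across the matched pairs as if the groups had been constructed to be equivalent on the measured covariates.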

Difference-in-Differences Analysis

This method is used to compare the difference in outcomes between two groups over time. Researchers can use this method to determine whether a particular intervention has had an impact on the target population over time.
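In regression form, the difference-in-differences estimate is simply the coefficient on the interaction between a treatment-group indicator and a post-period indicator. A minimal statsmodels sketch with invented numbers:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical outcomes measured before (post=0) and after (post=1) an
# intervention, in a treated group and an untreated comparison group.
df = pd.DataFrame({
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "post":    [0, 0, 1, 1, 0, 0, 1, 1],
    "outcome": [10, 11, 18, 19, 10, 10, 12, 13],
})

# 'treated * post' expands to treated + post + treated:post; the
# interaction coefficient is the difference-in-differences estimate.
fit = smf.ols("outcome ~ treated * post", data=df).fit()
print(fit.params["treated:post"])  # 5.5 with these numbers
```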

Interrupted Time Series Analysis

This method is used to examine the impact of an intervention or treatment over time by comparing data collected before and after the intervention or treatment. This method can help researchers determine whether an intervention had a significant impact on the target population.
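A standard way to operationalize this comparison is segmented regression: model the pre-intervention level and trend, plus a change in level and a change in trend at the intervention point. The sketch below uses invented monthly counts with an intervention at month 12:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical monthly accident counts; an intervention begins at month 12.
y = np.array([30, 31, 29, 32, 30, 33, 31, 32, 30, 31, 33, 32,
              24, 23, 22, 23, 21, 22, 20, 21, 22, 20, 19, 20], dtype=float)
t = np.arange(len(y), dtype=float)
after = (t >= 12).astype(float)         # 1 once the intervention starts
since = np.where(t >= 12, t - 12, 0.0)  # months since the intervention

# Columns: intercept, pre-intervention trend, level change, trend change.
X = sm.add_constant(np.column_stack([t, after, since]))
fit = sm.OLS(y, X).fit()
print(dict(zip(["intercept", "trend", "level_change", "trend_change"],
               fit.params.round(2))))
```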

Regression Discontinuity Analysis

This method is used to compare the outcomes of participants who fall on either side of a predetermined cutoff point. This method can help researchers determine whether an intervention had a significant impact on the target population.
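A common implementation is a local linear regression within a bandwidth around the cutoff, allowing different slopes on each side; the coefficient on the treatment indicator then estimates the effect at the threshold. The cutoff, bandwidth, and simulated 5-point jump below are all invented for illustration:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
score = rng.uniform(0, 100, 500)           # running variable, e.g. a test score
cutoff = 60.0
treated = (score >= cutoff).astype(float)  # assignment determined by the cutoff
outcome = 20 + 0.3 * score + 5.0 * treated + rng.normal(0, 3, 500)

bandwidth = 15.0
keep = np.abs(score - cutoff) <= bandwidth
centered = score[keep] - cutoff            # zero at the cutoff

# Columns: intercept, slope, treatment jump, slope change above the cutoff.
X = sm.add_constant(np.column_stack(
    [centered, treated[keep], centered * treated[keep]]))
fit = sm.OLS(outcome[keep], X).fit()
print(f"estimated effect at the cutoff: {fit.params[2]:.2f}")  # close to 5
```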

Steps in Quasi-Experimental Design

Here are the general steps involved in conducting a quasi-experimental design:

  • Identify the research question: Determine the research question and the variables that will be investigated.
  • Choose the design: Choose the appropriate quasi-experimental design to address the research question. Examples include the pretest-posttest design, non-equivalent control group design, regression discontinuity design, and interrupted time series design.
  • Select the participants: Select the participants who will be included in the study. Participants should be selected based on specific criteria relevant to the research question.
  • Measure the variables: Measure the variables that are relevant to the research question. This may involve using surveys, questionnaires, tests, or other measures.
  • Implement the intervention or treatment: Implement the intervention or treatment to the participants in the intervention group. This may involve training, education, counseling, or other interventions.
  • Collect data: Collect data on the dependent variable(s) before and after the intervention. Data collection may also include collecting data on other variables that may impact the dependent variable(s).
  • Analyze the data: Analyze the data collected to determine whether the intervention had a significant impact on the dependent variable(s).
  • Draw conclusions: Draw conclusions about the relationship between the independent and dependent variables. If the results suggest a causal relationship, then appropriate recommendations may be made based on the findings.

Quasi-Experimental Design Examples

Here are some examples of real-time quasi-experimental designs:

  • Evaluating the impact of a new teaching method: In this study, a group of students are taught using a new teaching method, while another group is taught using the traditional method. The test scores of both groups are compared before and after the intervention to determine whether the new teaching method had a significant impact on student performance.
  • Assessing the effectiveness of a public health campaign: In this study, a public health campaign is launched to promote healthy eating habits among a targeted population. The behavior of the population is compared before and after the campaign to determine whether the intervention had a significant impact on the target behavior.
  • Examining the impact of a new medication: In this study, a group of patients is given a new medication, while another group is given a placebo. The outcomes of both groups are compared to determine whether the new medication had a significant impact on the targeted health condition.
  • Evaluating the effectiveness of a job training program: In this study, a group of unemployed individuals is enrolled in a job training program, while another group is not enrolled in any program. The employment rates of both groups are compared before and after the intervention to determine whether the training program had a significant impact on the employment rates of the participants.
  • Assessing the impact of a new policy: In this study, a new policy is implemented in a particular area, while another area does not have the new policy. The outcomes of both areas are compared before and after the intervention to determine whether the new policy had a significant impact on the targeted behavior or outcome.

Applications of Quasi-Experimental Design

Here are some applications of quasi-experimental design:

  • Educational research: Quasi-experimental designs are used to evaluate the effectiveness of educational interventions, such as new teaching methods, technology-based learning, or educational policies.
  • Health research: Quasi-experimental designs are used to evaluate the effectiveness of health interventions, such as new medications, public health campaigns, or health policies.
  • Social science research: Quasi-experimental designs are used to investigate the impact of social interventions, such as job training programs, welfare policies, or criminal justice programs.
  • Business research: Quasi-experimental designs are used to evaluate the impact of business interventions, such as marketing campaigns, new products, or pricing strategies.
  • Environmental research: Quasi-experimental designs are used to evaluate the impact of environmental interventions, such as conservation programs, pollution control policies, or renewable energy initiatives.

When to use Quasi-Experimental Design

Here are some situations where quasi-experimental designs may be appropriate:

  • When the research question involves investigating the effectiveness of an intervention, policy, or program: In situations where it is not feasible or ethical to randomly assign participants to intervention and control groups, quasi-experimental designs can be used to evaluate the impact of the intervention on the targeted outcome.
  • When the sample size is small: In situations where the sample size is small, it may be difficult to randomly assign participants to intervention and control groups. Quasi-experimental designs can be used to investigate the impact of an intervention without requiring a large sample size.
  • When the research question involves investigating a naturally occurring event: In some situations, researchers may be interested in investigating the impact of a naturally occurring event, such as a natural disaster or a major policy change. Quasi-experimental designs can be used to evaluate the impact of the event on the targeted outcome.
  • When the research question involves investigating a long-term intervention: In situations where the intervention or program is long-term, it may be difficult to randomly assign participants to intervention and control groups for the entire duration of the intervention. Quasi-experimental designs can be used to evaluate the impact of the intervention over time.
  • When the research question involves investigating the impact of a variable that cannot be manipulated: In some situations, it may not be possible or ethical to manipulate a variable of interest. Quasi-experimental designs can be used to investigate the relationship between the variable and the targeted outcome.

Purpose of Quasi-Experimental Design

The purpose of quasi-experimental design is to investigate the causal relationship between two or more variables when it is not feasible or ethical to conduct a randomized controlled trial (RCT). Quasi-experimental designs attempt to approximate a randomized controlled trial by constructing intervention and comparison groups that are as similar as possible.

The key purpose of quasi-experimental design is to evaluate the impact of an intervention, policy, or program on a targeted outcome while controlling for potential confounding factors that may affect the outcome. Quasi-experimental designs aim to answer questions such as: Did the intervention cause the change in the outcome? Would the outcome have changed without the intervention? And was the intervention effective in achieving its intended goals?

Quasi-experimental designs are useful in situations where randomized controlled trials are not feasible or ethical. They provide researchers with an alternative method to evaluate the effectiveness of interventions, policies, and programs in real-life settings. Quasi-experimental designs can also help inform policy and practice by providing valuable insights into the causal relationships between variables.

Overall, the purpose of quasi-experimental design is to provide a rigorous method for evaluating the impact of interventions, policies, and programs while controlling for potential confounding factors that may affect the outcome.

Advantages of Quasi-Experimental Design

Quasi-experimental designs have several advantages over other research designs, such as:

  • Greater external validity: Quasi-experimental designs are more likely to have greater external validity than laboratory experiments because they are conducted in naturalistic settings. This means that the results are more likely to generalize to real-world situations.
  • Ethical considerations: Quasi-experimental designs often involve naturally occurring events, such as natural disasters or policy changes. This means that researchers do not need to manipulate variables, which can raise ethical concerns.
  • More practical: Quasi-experimental designs are often more practical than experimental designs because they are less expensive and easier to conduct. They can also be used to evaluate programs or policies that have already been implemented, which can save time and resources.
  • No random assignment: Quasi-experimental designs do not require random assignment, which can be difficult or impossible in some cases, such as when studying the effects of a natural disaster. Researchers can still draw cautious causal inferences, provided they use statistical techniques to control for potential confounding variables.
  • Greater generalizability: Quasi-experimental designs are often more generalizable than experimental designs because they include a wider range of participants and conditions. This can make the results more applicable to different populations and settings.

Limitations of Quasi-Experimental Design

There are several limitations associated with quasi-experimental designs, which include:

  • Lack of Randomization: Quasi-experimental designs do not involve randomization of participants into groups, which means that the groups being studied may differ in important ways that could affect the outcome of the study. This can lead to problems with internal validity and limit the ability to make causal inferences.
  • Selection Bias: Quasi-experimental designs may suffer from selection bias because participants are not randomly assigned to groups. Participants may self-select into groups or be assigned based on pre-existing characteristics, which may introduce bias into the study.
  • History and Maturation: Quasi-experimental designs are susceptible to history and maturation effects, where the passage of time or other events may influence the outcome of the study.
  • Lack of Control: Quasi-experimental designs may lack control over extraneous variables that could influence the outcome of the study. This can limit the ability to draw causal inferences from the study.
  • Limited Generalizability: Quasi-experimental designs may have limited generalizability because the results may only apply to the specific population and context being studied.


Quasi-Experimental Design: Definition, Types, Examples

Appinio Research · 19.12.2023

Ever wondered how researchers uncover cause-and-effect relationships in the real world, where controlled experiments are often elusive? Quasi-experimental design holds the key. In this guide, we'll unravel the intricacies of quasi-experimental design, shedding light on its definition, purpose, and applications across various domains. Whether you're a student, a professional, or simply curious about the methods behind meaningful research, join us as we delve into the world of quasi-experimental design, making complex concepts sound simple and embarking on a journey of knowledge and discovery.

What is Quasi-Experimental Design?

Quasi-experimental design is a research methodology used to study the effects of independent variables on dependent variables when full experimental control is not possible or ethical. It falls between controlled experiments, where variables are tightly controlled, and purely observational studies, where researchers have little control over variables. Quasi-experimental design mimics some aspects of experimental research but lacks randomization.

The primary purpose of quasi-experimental design is to investigate cause-and-effect relationships between variables in real-world settings. Researchers use this approach to answer research questions, test hypotheses, and explore the impact of interventions or treatments when they cannot employ traditional experimental methods. Quasi-experimental studies aim to maximize internal validity and make meaningful inferences while acknowledging practical constraints and ethical considerations.

Quasi-Experimental vs. Experimental Design

It's essential to understand the distinctions between Quasi-Experimental and Experimental Design to appreciate the unique characteristics of each approach:

  • Randomization:  In Experimental Design, random assignment of participants to groups is a defining feature. Quasi-experimental design, on the other hand, lacks randomization due to practical constraints or ethical considerations.
  • Control Groups:  Experimental Design typically includes control groups that receive no treatment or a placebo. Quasi-experimental design may include comparison groups but lacks the same level of control.
  • Manipulation of IV:  Experimental Design involves the intentional manipulation of the independent variable. Quasi-experimental design often deals with naturally occurring independent variables.
  • Causal Inference:  Experimental Design allows for stronger causal inferences due to randomization and control. Quasi-experimental design permits causal inferences but with some limitations.

When to Use Quasi-Experimental Design?

A quasi-experimental design is particularly valuable in several situations:

  • Ethical Constraints:  When manipulating the independent variable is ethically unacceptable or impractical, quasi-experimental design offers an alternative to studying naturally occurring variables.
  • Real-World Settings:  When researchers want to study phenomena in real-world contexts, quasi-experimental design allows them to do so without artificial laboratory settings.
  • Limited Resources:  In cases where resources are limited and conducting a controlled experiment is cost-prohibitive, quasi-experimental design can provide valuable insights.
  • Policy and Program Evaluation:  Quasi-experimental design is commonly used in evaluating the effectiveness of policies, interventions, or programs that cannot be randomly assigned to participants.

Importance of Quasi-Experimental Design in Research

Quasi-experimental design plays a vital role in research for several reasons:

  • Addressing Real-World Complexities:  It allows researchers to tackle complex real-world issues where controlled experiments are not feasible. This bridges the gap between controlled experiments and purely observational studies.
  • Ethical Research:  It provides a defensible approach when manipulating variables or assigning treatments could harm participants or violate ethical standards.
  • Policy and Practice Implications:  Quasi-experimental studies generate findings with direct applications in policy-making and practical solutions in fields such as education, healthcare, and social sciences.
  • Enhanced External Validity:  Findings from Quasi-Experimental research often have high external validity, making them more applicable to broader populations and contexts.

By embracing the challenges and opportunities of quasi-experimental design, researchers can contribute valuable insights to their respective fields and drive positive changes in the real world.

Key Concepts in Quasi-Experimental Design

In quasi-experimental design, it's essential to grasp the fundamental concepts underpinning this research methodology. Let's explore these key concepts in detail.

Independent Variable

The independent variable (IV) is the factor you aim to study or manipulate in your research. Unlike controlled experiments, where you can directly manipulate the IV, quasi-experimental design often deals with naturally occurring variables. For example, if you're investigating the impact of a new teaching method on student performance, the teaching method is your independent variable.

Dependent Variable

The dependent variable (DV) is the outcome or response you measure to assess the effects of changes in the independent variable. Continuing with the teaching method example, the dependent variable would be the students' academic performance, typically measured using test scores, grades, or other relevant metrics.

Control Groups vs. Comparison Groups

While quasi-experimental design lacks the luxury of randomly assigning participants to control and experimental groups, you can still establish comparison groups to make meaningful inferences. Control groups consist of individuals who do not receive the treatment, while comparison groups are exposed to different levels or variations of the treatment. These groups help researchers gauge the effect of the independent variable.

Pre-Test and Post-Test Measures

In quasi-experimental design, it's common practice to collect data both before and after implementing the independent variable. The initial data (pre-test) serves as a baseline, allowing you to measure changes over time (post-test). This approach helps assess the impact of the independent variable more accurately. For instance, if you're studying the effectiveness of a new drug, you'd measure patients' health before administering the drug (pre-test) and afterward (post-test).
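When the same participants provide both the pre-test and post-test measures, a paired analysis is appropriate, because each post-test score is linked to a pre-test score from the same person. A minimal SciPy sketch with invented stress scores:

```python
from scipy import stats

# Hypothetical stress scores for eight participants, before and after a
# stress management program (lower is better).
pre  = [62, 70, 65, 71, 59, 66, 73, 68]
post = [55, 63, 60, 66, 57, 58, 64, 62]

result = stats.ttest_rel(pre, post)  # paired-samples t test
print(f"paired t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```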

Threats to Internal Validity

Internal validity is crucial for establishing a cause-and-effect relationship between the independent and dependent variables. However, in a quasi-experimental design, several threats can compromise internal validity. These threats include:

  • Selection Bias:  When non-randomized groups differ systematically in ways that affect the study’s outcome.
  • History Effects:  External events or changes over time that influence the results.
  • Maturation Effects:  Natural changes or developments that occur within participants during the study.
  • Regression to the Mean:  The tendency for extreme scores on a variable to move closer to the mean upon retesting.
  • Attrition and Mortality:  The loss of participants over time, potentially skewing the results.
  • Testing Effects:  The mere act of testing or assessing participants can impact their subsequent performance.

Understanding these threats is essential for designing and conducting Quasi-Experimental studies that yield valid and reliable results.

Randomization and Non-Randomization

In traditional experimental designs, randomization is a powerful tool for ensuring that groups are equivalent at the outset of a study. However, quasi-experimental design often involves non-randomization due to the nature of the research. This means that participants are not randomly assigned to treatment and control groups. Instead, researchers must employ various techniques to minimize biases and ensure that the groups are as similar as possible.

For example, if you are conducting a study on the effects of a new teaching method in a real classroom setting, you cannot randomly assign students to the treatment and control groups. Instead, you might use statistical methods to match students based on relevant characteristics such as prior academic performance or socioeconomic status. This matching process helps control for potential confounding variables, increasing the validity of your study.

Types of Quasi-Experimental Designs

In quasi-experimental design, researchers employ various approaches to investigate causal relationships and study the effects of independent variables when complete experimental control is challenging. Let's explore these types of quasi-experimental designs.

One-Group Posttest-Only Design

The One-Group Posttest-Only Design is one of the simplest forms of quasi-experimental design. In this design, a single group is exposed to the independent variable, and data is collected only after the intervention has taken place. Unlike controlled experiments, there is no comparison group. This design is useful when researchers cannot administer a pre-test or when it is logistically difficult to do so.

Example: Suppose you want to assess the effectiveness of a new time management seminar. You offer the seminar to a group of employees and measure their productivity levels immediately afterward to determine if there's an observable impact.

One-Group Pretest-Posttest Design

Similar to the One-Group Posttest-Only Design, this approach includes a pre-test measure in addition to the post-test. Researchers collect data both before and after the intervention. By comparing the pre-test and post-test results within the same group, you can gain a better understanding of the changes that occur due to the independent variable.

Example: If you're studying the impact of a stress management program on participants' stress levels, you would measure their stress levels before the program (pre-test) and after completing the program (post-test) to assess any changes.

Non-Equivalent Groups Design

The Non-Equivalent Groups Design involves multiple groups, but they are not randomly assigned. Instead, researchers must carefully match or control for relevant variables to minimize biases. This design is particularly useful when random assignment is not possible or ethical.

Example: Imagine you're examining the effectiveness of two teaching methods in two different schools. You can't randomly assign students to the schools, but you can carefully match them based on factors like age, prior academic performance, and socioeconomic status to create equivalent groups.

Time Series Design

Time Series Design is an approach where data is collected at multiple time points before and after the intervention. This design allows researchers to analyze trends and patterns over time, providing valuable insights into the sustained effects of the independent variable.

Example: If you're studying the impact of a new marketing campaign on product sales, you would collect sales data at regular intervals (e.g., monthly) before and after the campaign's launch to observe any long-term trends.

Regression Discontinuity Design

Regression Discontinuity Design is employed when participants are assigned to different groups based on a specific cutoff score or threshold. This design is often used in educational and policy research to assess the effects of interventions near a cutoff point.

Example: Suppose you're evaluating the impact of a scholarship program on students' academic performance. Students at or above a certain GPA threshold receive the scholarship, while those just below it do not. Comparing students on either side of the threshold helps assess the program's effectiveness at the cutoff point.

Propensity Score Matching

Propensity Score Matching is a technique used to create comparable treatment and control groups in non-randomized studies. Researchers calculate propensity scores based on participants' characteristics and match individuals in the treatment group to those in the control group with similar scores.

Example: If you're studying the effects of a new medication on patient outcomes, you would use propensity scores to match patients who received the medication with those who did not but have similar health profiles.

Interrupted Time Series Design

The Interrupted Time Series Design involves collecting data at multiple time points before and after the introduction of an intervention. However, in this design, the intervention occurs at a specific point in time, allowing researchers to assess its immediate impact.

Example: Let's say you're analyzing the effects of a new traffic management system on traffic accidents. You collect accident data before and after the system's implementation to observe any abrupt changes right after its introduction.

Each of these quasi-experimental designs offers unique advantages and is best suited to specific research questions and scenarios. Choosing the right design is crucial for conducting robust and informative studies.

Advantages and Disadvantages of Quasi-Experimental Design

Quasi-experimental design offers a valuable research approach, but like any methodology, it comes with its own set of advantages and disadvantages. Let's explore these in detail.

Quasi-Experimental Design Advantages

Quasi-experimental design presents several advantages that make it a valuable tool in research:

  • Real-World Applicability:  Quasi-experimental studies often take place in real-world settings, making the findings more applicable to practical situations. Researchers can examine the effects of interventions or variables in the context where they naturally occur.
  • Ethical Considerations:  In situations where manipulating the independent variable in a controlled experiment would be unethical, quasi-experimental design provides an ethical alternative. For example, it would be unethical to assign participants to smoke for a study on the health effects of smoking, but you can study naturally occurring groups of smokers and non-smokers.
  • Cost-Efficiency:  Conducting Quasi-Experimental research is often more cost-effective than conducting controlled experiments. The absence of controlled environments and extensive manipulations can save both time and resources.

These advantages make quasi-experimental design an attractive choice for researchers facing practical or ethical constraints in their studies.

Quasi-Experimental Design Disadvantages

However, quasi-experimental design also comes with its share of challenges and disadvantages:

  • Limited Control:  Unlike controlled experiments, where researchers have full control over variables, quasi-experimental design lacks the same level of control. This limited control can result in confounding variables that make it difficult to establish causality.
  • Threats to Internal Validity:  Various threats to internal validity, such as selection bias, history effects, and maturation effects, can compromise the accuracy of causal inferences. Researchers must carefully address these threats to ensure the validity of their findings.
  • Causality Inference Challenges:  Establishing causality can be challenging in quasi-experimental design due to the absence of randomization and control. While you can make strong arguments for causality, it may not be as conclusive as in controlled experiments.
  • Potential Confounding Variables:  In a quasi-experimental design, it's often challenging to control for all possible confounding variables that may affect the dependent variable. This can lead to uncertainty in attributing changes solely to the independent variable.

Despite these disadvantages, quasi-experimental design remains a valuable research tool when used judiciously and with a keen awareness of its limitations. Researchers should carefully consider their research questions and the practical constraints they face before choosing this approach.

How to Conduct a Quasi-Experimental Study?

Conducting a Quasi-Experimental study requires careful planning and execution to ensure the validity of your research. Let's dive into the essential steps you need to follow when conducting such a study.

1. Define Research Questions and Objectives

The first step in any research endeavor is clearly defining your research questions and objectives. This involves identifying the independent variable (IV) and the dependent variable (DV) you want to study. What is the specific relationship you want to explore, and what do you aim to achieve with your research?

  • Specify Your Research Questions :  Start by formulating precise research questions that your study aims to answer. These questions should be clear, focused, and relevant to your field of study.
  • Identify the Independent Variable:  Define the variable you intend to manipulate or study in your research. Understand its significance in your study's context.
  • Determine the Dependent Variable:  Identify the outcome or response variable that will be affected by changes in the independent variable.
  • Establish Hypotheses (If Applicable):  If you have specific hypotheses about the relationship between the IV and DV, state them clearly. Hypotheses provide a framework for testing your research questions.

2. Select the Appropriate Quasi-Experimental Design

Choosing the right quasi-experimental design is crucial for achieving your research objectives. Select a design that aligns with your research questions and the available data. Consider factors such as the feasibility of implementing the design and the ethical considerations involved.

  • Evaluate Your Research Goals:  Assess your research questions and objectives to determine which type of quasi-experimental design is most suitable. Each design has its strengths and limitations, so choose one that aligns with your goals.
  • Consider Ethical Constraints:  Take into account any ethical concerns related to your research. Depending on your study's context, some designs may be more ethically sound than others.
  • Assess Data Availability:  Ensure you have access to the necessary data for your chosen design. Some designs may require extensive historical data, while others may rely on data collected during the study.

3. Identify and Recruit Participants

Selecting the right participants is a critical aspect of Quasi-Experimental research. The participants should represent the population you want to make inferences about, and you must address ethical considerations, including informed consent.

  • Define Your Target Population:  Determine the population that your study aims to generalize to. Your sample should be representative of this population.
  • Recruitment Process:  Develop a plan for recruiting participants. Depending on your design, you may need to reach out to specific groups or institutions.
  • Informed Consent:  Ensure that you obtain informed consent from participants. Clearly explain the nature of the study, potential risks, and their rights as participants.

4. Collect Data

Data collection is a crucial step in Quasi-Experimental research. You must adhere to a consistent and systematic process to gather relevant information before and after the intervention or treatment.

  • Pre-Test Measures:  If applicable, collect data before introducing the independent variable. Ensure that the pre-test measures are standardized and reliable.
  • Post-Test Measures:  After the intervention, collect post-test data using the same measures as the pre-test. This allows you to assess changes over time.
  • Maintain Data Consistency:  Ensure that data collection procedures are consistent across all participants and time points to minimize biases (a tidy data layout, sketched after this list, helps here).
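
As a small illustration of consistent data handling, the sketch below shows one tidy layout for pre/post measurements (participants, groups, and scores are hypothetical); the wide form it produces feeds directly into change-score or covariate-adjusted analyses.

```python
import pandas as pd

# A tidy layout for quasi-experimental measurements: one row per
# participant per time point keeps pre/post analysis straightforward.
records = [
    {"participant": 1, "group": "treatment", "time": "pre",  "score": 61},
    {"participant": 1, "group": "treatment", "time": "post", "score": 70},
    {"participant": 2, "group": "control",   "time": "pre",  "score": 63},
    {"participant": 2, "group": "control",   "time": "post", "score": 64},
]
df = pd.DataFrame(records)

# Pivot to a consistent wide form for change-score or ANCOVA-style analysis.
wide = df.pivot(index=["participant", "group"],
                columns="time", values="score").reset_index()
wide["change"] = wide["post"] - wide["pre"]
print(wide)
```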

5. Analyze Data

Once you've collected your data, it's time to analyze it using appropriate statistical techniques. The choice of analysis depends on your research questions and the type of data you've gathered.

  • Statistical Analysis:  Use statistical software to analyze your data. Common techniques include t-tests, analysis of variance (ANOVA), regression analysis, and more, depending on the design and variables (a short worked example follows this list).
  • Control for Confounding Variables:  Be aware of potential confounding variables and include them in your analysis as covariates to ensure accurate results.

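
To make this step concrete, here is a minimal sketch on synthetic data: a naive independent-samples t-test on post-test scores, followed by a regression that adjusts for the pretest covariate, which is usually the more defensible comparison for nonequivalent groups. All names and numbers are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical post-test scores for nonequivalent treatment/control groups,
# with a pretest score available as a covariate.
n = 80
group = np.repeat([1, 0], n // 2)
pretest = rng.normal(60, 8, n) + 2 * group       # groups differ at baseline
posttest = pretest + 5 * group + rng.normal(0, 5, n)
df = pd.DataFrame({"group": group, "pretest": pretest, "posttest": posttest})

# Naive comparison: independent-samples t-test on post-test scores alone
# (this conflates the baseline difference with the treatment effect).
t, p = stats.ttest_ind(df.loc[df.group == 1, "posttest"],
                       df.loc[df.group == 0, "posttest"])
print(f"t = {t:.2f}, p = {p:.4f}")

# Better: adjust for the pretest covariate to account for baseline
# differences between the nonequivalent groups (an ANCOVA-style model).
model = smf.ols("posttest ~ group + pretest", data=df).fit()
print(model.params["group"])  # adjusted group effect, roughly 5 here
```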

6. Interpret Results

With the analysis complete, you can interpret the results to draw meaningful conclusions about the relationship between the independent and dependent variables.

  • Examine Effect Sizes:  Assess the magnitude of the observed effects to determine their practical significance (see the Cohen's d sketch after this list).
  • Consider Significance Levels:  Determine whether the observed results are statistically significant . Understand the p-values and their implications.
  • Compare Findings to Hypotheses:  Evaluate whether your findings support or reject your hypotheses and research questions.
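
For instance, a standardized mean difference (Cohen's d) can be computed directly from group scores; the sketch below uses hypothetical post-test data.

```python
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * np.var(a, ddof=1)
                  + (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var)

rng = np.random.default_rng(3)
treatment = rng.normal(75, 10, 60)  # hypothetical post-test scores
control = rng.normal(70, 10, 60)
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")  # ~0.5, a medium effect
```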

7. Draw Conclusions

Based on your analysis and interpretation of the results, draw conclusions about the research questions and objectives you set out to address.

  • Causal Inferences:  Discuss the extent to which your study allows for causal inferences. Be transparent about the limitations and potential alternative explanations for your findings.
  • Implications and Applications:  Consider the practical implications of your research. How do your findings contribute to existing knowledge, and how can they be applied in real-world contexts?
  • Future Research:  Identify areas for future research and potential improvements in study design. Highlight any limitations or constraints that may have affected your study's outcomes.

By following these steps meticulously, you can conduct a rigorous and informative Quasi-Experimental study that advances knowledge in your field of research.

Quasi-Experimental Design Examples

Quasi-experimental design finds applications in a wide range of research domains, including business-related and market research scenarios. Below, we delve into some detailed examples of how this research methodology is employed in practice:

Example 1: Assessing the Impact of a New Marketing Strategy

Suppose a company wants to evaluate the effectiveness of a new marketing strategy aimed at boosting sales. Conducting a controlled experiment may not be feasible due to the company's existing customer base and the challenge of randomly assigning customers to different marketing approaches. In this scenario, a quasi-experimental design can be employed.

  • Independent Variable:  The new marketing strategy.
  • Dependent Variable:  Sales revenue.
  • Design:  The company could implement the new strategy for one group of customers while maintaining the existing strategy for another group. Both groups are selected based on similar demographics and purchase history , reducing selection bias. Pre-implementation data (sales records) can serve as the baseline, and post-implementation data can be collected to assess the strategy's impact.

Example 2: Evaluating the Effectiveness of Employee Training Programs

In the context of human resources and employee development, organizations often seek to evaluate the impact of training programs. A randomized controlled trial (RCT) may not be practical or ethical, as some employees may need specific training more than others. Instead, a quasi-experimental design can be employed.

  • Independent Variable:  Employee training programs.
  • Dependent Variable:  Employee performance metrics, such as productivity or quality of work.
  • Design:  The organization can offer training programs to employees who express interest or demonstrate specific needs, creating a self-selected treatment group. A comparable control group can consist of employees with similar job roles and qualifications who did not receive the training. Pre-training performance metrics can serve as the baseline, and post-training data can be collected to assess the impact of the training programs.

Example 3: Analyzing the Effects of a Tax Policy Change

In economics and public policy, researchers often examine the effects of tax policy changes on economic behavior. Conducting a controlled experiment in such cases is practically impossible. Therefore, a quasi-experimental design is commonly employed.

  • Independent Variable:  Tax policy changes (e.g., tax rate adjustments).
  • Dependent Variable:  Economic indicators, such as consumer spending or business investments.
  • Design:  Researchers can analyze data from different regions or jurisdictions where tax policy changes have been implemented. One region could represent the treatment group (with tax policy changes), while a similar region with no tax policy changes serves as the control group. By comparing economic data before and after the policy change in both groups, researchers can assess the impact of the tax policy changes.

These examples illustrate how quasi-experimental design can be applied in various research contexts, providing valuable insights into the effects of independent variables in real-world scenarios where controlled experiments are not feasible or ethical. By carefully selecting comparison groups and controlling for potential biases, researchers can draw meaningful conclusions and inform decision-making processes.

How to Publish Quasi-Experimental Research?

Publishing your Quasi-Experimental research findings is a crucial step in contributing to the academic community's knowledge. We'll explore the essential aspects of reporting and publishing your Quasi-Experimental research effectively.

Structuring Your Research Paper

When preparing your research paper, it's essential to adhere to a well-structured format to ensure clarity and comprehensibility. Here are key elements to include:

Title and Abstract

  • Title:  Craft a concise and informative title that reflects the essence of your study. It should capture the main research question or hypothesis.
  • Abstract:  Summarize your research in a structured abstract, including the purpose, methods, results, and conclusions. Ensure it provides a clear overview of your study.

Introduction

  • Background and Rationale:  Provide context for your study by discussing the research gap or problem your study addresses. Explain why your research is relevant and essential.
  • Research Questions or Hypotheses:  Clearly state your research questions or hypotheses and their significance.

Literature Review

  • Review of Related Work:  Discuss relevant literature that supports your research. Highlight studies with similar methodologies or findings and explain how your research fits within this context.

Methods

  • Participants:  Describe your study's participants, including their characteristics and how you recruited them.
  • Quasi-Experimental Design:  Explain your chosen design in detail, including the independent and dependent variables, procedures, and any control measures taken.
  • Data Collection:  Detail the data collection methods, instruments used, and any pre-test or post-test measures.
  • Data Analysis:  Describe the statistical techniques employed, including any control for confounding variables.

Results

  • Presentation of Findings:  Present your results clearly, using tables, graphs, and descriptive statistics where appropriate. Include p-values and effect sizes, if applicable.
  • Interpretation of Results:  Discuss the implications of your findings and how they relate to your research questions or hypotheses.

Discussion

  • Interpretation and Implications:  Analyze your results in the context of existing literature and theories. Discuss the practical implications of your findings.
  • Limitations:  Address the limitations of your study, including potential biases or threats to internal validity.
  • Future Research:  Suggest areas for future research and how your study contributes to the field.

Ethical Considerations in Reporting

Ethical reporting is paramount in Quasi-Experimental research. Ensure that you adhere to ethical standards, including:

  • Informed Consent:  Clearly state that informed consent was obtained from all participants, and describe the informed consent process.
  • Protection of Participants:  Explain how you protected the rights and well-being of your participants throughout the study.
  • Confidentiality:  Detail how you maintained privacy and anonymity, especially when presenting individual data.
  • Disclosure of Conflicts of Interest:  Declare any potential conflicts of interest that could influence the interpretation of your findings.

Common Pitfalls to Avoid

When reporting your Quasi-Experimental research, watch out for common pitfalls that can diminish the quality and impact of your work:

  • Overgeneralization:  Be cautious not to overgeneralize your findings. Clearly state the limits of your study and the populations to which your results can be applied.
  • Misinterpretation of Causality:  Clearly articulate the limitations in inferring causality in Quasi-Experimental research. Avoid making strong causal claims unless supported by solid evidence.
  • Ignoring Ethical Concerns:  Ethical considerations are paramount. Failing to report on informed consent, ethical oversight, and participant protection can undermine the credibility of your study.

Guidelines for Transparent Reporting

To enhance the transparency and reproducibility of your Quasi-Experimental research, consider adhering to established reporting guidelines, such as:

  • CONSORT Statement:  If your study involves interventions or treatments, follow the CONSORT guidelines for transparent reporting of randomized controlled trials.
  • STROBE Statement:  For observational studies, the STROBE statement provides guidance on reporting essential elements.
  • PRISMA Statement:  If your research involves systematic reviews or meta-analyses, adhere to the PRISMA guidelines.
  • Transparent Reporting of Evaluations with Non-Randomized Designs (TREND):  TREND guidelines offer specific recommendations for transparently reporting non-randomized designs, including Quasi-Experimental research.

By following these reporting guidelines and maintaining the highest ethical standards, you can contribute to the advancement of knowledge in your field and ensure the credibility and impact of your Quasi-Experimental research findings.

Quasi-Experimental Design Challenges

Conducting a Quasi-Experimental study can be fraught with challenges that may impact the validity and reliability of your findings. We'll take a look at some common challenges and provide strategies on how you can address them effectively.

Selection Bias

Challenge:  Selection bias occurs when non-randomized groups differ systematically in ways that affect the study's outcome. This bias can undermine the validity of your research, as it implies that the groups are not equivalent at the outset of the study.

Addressing Selection Bias:

  • Matching:  Employ matching techniques to create comparable treatment and control groups. Match participants based on relevant characteristics, such as age, gender, or prior performance, to balance the groups.
  • Statistical Controls:  Use statistical controls to account for differences between groups. Include covariates in your analysis to adjust for potential biases (the sketch after this list shows how the treatment estimate can shift as covariates are added).
  • Sensitivity Analysis:  Conduct sensitivity analyses to assess how vulnerable your results are to selection bias. Explore different scenarios to understand the impact of potential bias on your conclusions.
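
A simple robustness check along these lines is to watch how the treatment estimate shifts as statistical controls are added; a large shift signals selection on the added covariate. The sketch below simulates this with a hypothetical "motivation" confounder that drives both treatment uptake and the outcome.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)

# Hypothetical nonequivalent groups where motivation drives both
# selection into treatment and the outcome (classic selection bias).
n = 400
motivation = rng.normal(0, 1, n)
age = rng.normal(30, 6, n)
treated = rng.binomial(1, 1 / (1 + np.exp(-1.5 * motivation)))
outcome = 2 * treated + 3 * motivation + 0.05 * age + rng.normal(0, 1, n)
df = pd.DataFrame({"treated": treated, "motivation": motivation,
                   "age": age, "outcome": outcome})

# Watch how the treatment estimate changes as covariates are added;
# the unadjusted estimate is inflated by the confounder.
for formula in ("outcome ~ treated",
                "outcome ~ treated + age",
                "outcome ~ treated + age + motivation"):
    est = smf.ols(formula, data=df).fit().params["treated"]
    print(f"{formula:40s} -> treated = {est:.2f}")
```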

History Effects

Challenge:  History effects refer to external events or changes over time that influence the study's results. These external factors can confound your research by introducing variables you did not account for.

Addressing History Effects:

  • Collect Historical Data:  Gather extensive historical data to understand trends and patterns that might affect your study. By having a comprehensive historical context, you can better identify and account for historical effects.
  • Control Groups:  Include control groups whenever possible. By comparing the treatment group's results to those of a control group, you can account for external influences that affect both groups equally.
  • Time Series Analysis :  If applicable, use time series analysis to detect and account for temporal trends. This method helps differentiate between the effects of the independent variable and external events.

Maturation Effects

Challenge:  Maturation effects occur when participants naturally change or develop throughout the study, independent of the intervention. These changes can confound your results, making it challenging to attribute observed effects solely to the independent variable.

Addressing Maturation Effects:

  • Randomization:  If possible, use randomization to distribute maturation effects evenly across treatment and control groups. Random assignment minimizes the impact of maturation as a confounding variable.
  • Matched Pairs:  If randomization is not feasible, employ matched pairs or statistical controls to ensure that both groups experience similar maturation effects.
  • Shorter Time Frames:  Limit the duration of your study to reduce the likelihood of significant maturation effects. Shorter studies are less susceptible to long-term maturation.

Regression to the Mean

Challenge:  Regression to the mean is the tendency for extreme scores on a variable to move closer to the mean upon retesting. This can create the illusion of an intervention's effectiveness when, in reality, it's a natural statistical phenomenon.

Addressing Regression to the Mean:

  • Use Control Groups:  Include control groups in your study to provide a baseline for comparison. This helps differentiate genuine intervention effects from regression to the mean.
  • Multiple Data Points:  Collect numerous data points to identify patterns and trends. If extreme scores drift back toward the mean on retesting even in the absence of treatment, regression to the mean, rather than a true intervention effect, is the likely explanation (the simulation after this list demonstrates the pattern).
  • Statistical Analysis:  Employ statistical techniques that account for regression to the mean when analyzing your data. Techniques like analysis of covariance (ANCOVA) can help control for baseline differences.
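
The phenomenon is easy to demonstrate by simulation: the sketch below selects the lowest scorers on one noisy test and shows their average moving back toward the mean on a second test, with no intervention at all (all numbers are hypothetical).

```python
import numpy as np

rng = np.random.default_rng(9)

# Two noisy measurements of the same stable trait, with NO intervention
# in between; scores correlate imperfectly across occasions.
true_ability = rng.normal(100, 10, 10_000)
test1 = true_ability + rng.normal(0, 8, 10_000)
test2 = true_ability + rng.normal(0, 8, 10_000)

# Select the "worst performers" on test 1, as a remedial program might.
worst = test1 < np.percentile(test1, 10)
print(f"Selected group, test 1 mean: {test1[worst].mean():.1f}")
print(f"Selected group, test 2 mean: {test2[worst].mean():.1f}")  # closer to 100

# The apparent "improvement" is pure regression to the mean --
# exactly what a control group guards against.
```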

Attrition and Mortality

Challenge:  Attrition refers to the loss of participants over the course of your study; when participants are lost permanently, this is often called experimental mortality. High attrition rates can introduce biases and affect the representativeness of your sample.

Addressing Attrition and Mortality:

  • Careful Participant Selection:  Select participants who are likely to remain engaged throughout the study. Consider factors that may lead to attrition, such as participant motivation and commitment.
  • Incentives:  Provide incentives or compensation to participants to encourage their continued participation.
  • Follow-Up Strategies:  Implement effective follow-up strategies to reduce attrition. Regular communication and reminders can help keep participants engaged.
  • Sensitivity Analysis:  Conduct sensitivity analyses to assess the impact of attrition and mortality on your results. Compare the characteristics of participants who dropped out with those who completed the study.

Testing Effects

Challenge:  Testing effects occur when the mere act of testing or assessing participants affects their subsequent performance. This phenomenon can lead to changes in the dependent variable that are unrelated to the independent variable.

Addressing Testing Effects:

  • Counterbalance Testing:  If possible, counterbalance the order of tests or assessments between treatment and control groups. This helps distribute the testing effects evenly across groups (a minimal example follows this list).
  • Control Groups:  Include control groups subjected to the same testing or assessment procedures as the treatment group. By comparing the two groups, you can determine whether testing effects have influenced the results.
  • Minimize Testing Frequency:  Limit the frequency of testing or assessments to reduce the likelihood of testing effects. Conducting fewer assessments can mitigate the impact of repeated testing on participants.
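
As a small illustration, counterbalancing can be as simple as alternating assessment orders across participants; the sketch below does this for two hypothetical assessments.

```python
import itertools

# Alternate assessment order across participants so testing effects are
# spread evenly over conditions (task names are hypothetical).
tasks = ["assessment_A", "assessment_B"]
orders = list(itertools.permutations(tasks))  # [(A, B), (B, A)]

participants = [f"P{i:02d}" for i in range(1, 9)]
assignment = {p: orders[i % len(orders)] for i, p in enumerate(participants)}
for p, order in assignment.items():
    print(p, "->", order)
```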

By proactively addressing these common challenges, you can enhance the validity and reliability of your Quasi-Experimental study, making your findings more robust and trustworthy.

Quasi-experimental design is a powerful tool that helps researchers investigate cause-and-effect relationships in real-world situations where strict control is not always possible. By understanding the key concepts, types of designs, and how to address challenges, you can conduct robust research and contribute valuable insights to your field. Remember, quasi-experimental design bridges the gap between controlled experiments and purely observational studies, making it an essential approach in various fields, from business and market research to public policy and beyond. So, whether you're a researcher, student, or decision-maker, the knowledge of quasi-experimental design empowers you to make informed choices and drive positive changes in the world.




Quasi-Experiment: Understand What It Is, Types & Examples

Discover the concept of quasi-experiment, its various types, real-world examples, and how QuestionPro aids in conducting these studies.


Quasi-experimental research designs have gained significant recognition in the scientific community due to their unique ability to study cause-and-effect relationships in real-world settings. Unlike true experiments, quasi-experiments lack random assignment of participants to groups, making them more practical and ethical in certain situations. In this article, we will delve into the concept, applications, and advantages of quasi-experiments, shedding light on their relevance and significance in the scientific realm.

What Is A Quasi-Experiment Research Design?

Quasi-experimental research designs are research methodologies that resemble true experiments but lack the randomized assignment of participants to groups. In a true experiment, researchers randomly assign participants to either an experimental group or a control group, allowing for a comparison of the effects of an independent variable on the dependent variable. However, in quasi-experiments, this random assignment is often not possible or ethically permissible, leading to the adoption of alternative strategies.

Types Of Quasi-Experimental Designs

There are several types of quasi-experiment designs to study causal relationships in specific contexts. Some common types include:

Non-Equivalent Groups Design

This design involves selecting pre-existing groups that differ in some key characteristics and comparing their responses to the independent variable. Although the researcher does not randomly assign the groups, they can still examine the effects of the independent variable.

Regression Discontinuity

This design utilizes a cutoff point or threshold to determine which participants receive the treatment or intervention. It assumes that participants on either side of the cutoff are similar in all other aspects, except for their exposure to the independent variable.

Interrupted Time Series Design

This design involves measuring the dependent variable multiple times before and after the introduction of an intervention or treatment. By comparing the trends in the dependent variable, researchers can infer the impact of the intervention.

Natural Experiments

Natural experiments take advantage of naturally occurring events or circumstances that mimic the random assignment found in true experiments. Researchers identify situations in which participants are exposed to different conditions without any manipulation on the researchers' part.

Application of the Quasi-Experiment Design


Quasi-experimental research designs find applications in various fields, ranging from education to public health and beyond. One significant advantage of quasi-experiments is their feasibility in real-world settings where randomization is not always possible or ethical.

Ethical Reasons

Ethical concerns often arise in research when randomizing participants to different groups could potentially deny individuals access to beneficial treatments or interventions. In such cases, quasi-experimental designs provide an ethical alternative, allowing researchers to study the impact of interventions without depriving anyone of potential benefits.

Examples Of Quasi-Experimental Design

Let’s explore a few examples of quasi-experimental designs to understand their application in different contexts.

Design Of Non-Equivalent Groups

Determining the effectiveness of math apps in supplementing math classes.

Imagine a study aiming to determine the effectiveness of math apps in supplementing traditional math classes in a school. Randomly assigning students to different groups might be impractical or disrupt the existing classroom structure. Instead, researchers can select two comparable classes, one receiving the math app intervention and the other continuing with traditional teaching methods. By comparing the performance of the two groups, researchers can draw conclusions about the app’s effectiveness.

To conduct a quasi-experiment study like the one mentioned above, researchers can utilize QuestionPro , an advanced research platform that offers comprehensive survey and data analysis tools. With QuestionPro, researchers can design surveys to collect data, analyze results, and gain valuable insights for their quasi-experimental research.

How QuestionPro Helps In Quasi-Experimental Research?

QuestionPro’s powerful features, such as random assignment of participants, survey branching, and data visualization, enable researchers to efficiently conduct and analyze quasi-experimental studies. The platform provides a user-friendly interface and robust reporting capabilities, empowering researchers to gather data, explore relationships, and draw meaningful conclusions.

In some cases, researchers can leverage natural experiments to examine causal relationships. 

Determining The Effectiveness Of Teaching Modern Leadership Techniques In Start-Up Businesses

Consider a study evaluating the effectiveness of teaching modern leadership techniques in start-up businesses. Instead of artificially assigning businesses to different groups, researchers can observe those that naturally adopt modern leadership techniques and compare their outcomes to those of businesses that have not implemented such practices.

Advantages and Disadvantages Of The Quasi-Experimental Design

Quasi-experimental designs offer several advantages over true experiments, making them valuable tools in research:

  • Scope of the research : Quasi-experiments allow researchers to study cause-and-effect relationships in real-world settings, providing valuable insights into complex phenomena that may be challenging to replicate in a controlled laboratory environment.
  • Regression Discontinuity : Researchers can utilize regression discontinuity to evaluate the effects of interventions or treatments when random assignment is not feasible. This design leverages existing data and naturally occurring thresholds to draw causal inferences.

Disadvantage

Lack of random assignment : Quasi-experimental designs lack the random assignment of participants, which introduces the possibility of confounding variables affecting the results. Researchers must carefully consider potential alternative explanations for observed effects.

What Are The Different Quasi-Experimental Study Designs?

Quasi-experimental designs encompass various approaches, including nonequivalent group designs, interrupted time series designs, and natural experiments. Each design offers unique advantages and limitations, providing researchers with versatile tools to explore causal relationships in different contexts.

Example Of The Natural Experiment Approach

Researchers interested in studying the impact of a public health campaign aimed at reducing smoking rates may take advantage of a natural experiment. By comparing smoking rates in a region that has implemented the campaign to a similar region that has not, researchers can examine the effectiveness of the intervention.

Differences Between Quasi-Experiments And True Experiments

Quasi-experiments and true experiments differ primarily in their ability to randomly assign participants to groups. While true experiments provide a higher level of control, quasi-experiments offer practical and ethical alternatives in situations where randomization is not feasible or desirable.

Example Comparing A True Experiment And Quasi-Experiment

In a true experiment investigating the effects of a new medication on a specific condition, researchers would randomly assign participants to either the experimental group, which receives the medication, or the control group, which receives a placebo. In a quasi-experiment, researchers might instead compare patients who voluntarily choose to take the medication to those who do not, examining the differences in outcomes between the two groups.

Quasi-Experiment: A Quick Wrap-Up

Quasi-experimental research designs play a vital role in scientific inquiry by allowing researchers to investigate cause-and-effect relationships in real-world settings. These designs offer practical and ethical alternatives to true experiments, making them valuable tools in various fields of study. With their versatility and applicability, quasi-experimental designs continue to contribute to our understanding of complex phenomena.




Unraveling the Quasi-Experimental Design: A Comprehensive Guide

Ravi Gandhi

Explore nuanced aspects of Quasi-Experimental Design, offering in-depth understanding and practical insights


Introduction to Quasi-Experimental Design

The world of research is vast, encompassing numerous methods and designs. One of the pivotal designs, often dubbed the middle ground between experimental and observational studies, is the Quasi-Experimental Design. Originating from a necessity to address real-world scenarios where randomized control isn’t always feasible, this design has carved a niche for itself in contemporary research.

Understanding Quasi-Experimental Design

Quasi-experimental design, at its core, is a research method where the researcher doesn’t randomly assign participants to treatment or control groups. It’s a step away from the rigidity of true experimental designs but offers more structure than observational studies.

Contrary to true experimental designs, where variables are controlled meticulously, quasi-experimental designs often deal with pre-existing groups. This gives it a unique flavor, enabling researchers to study effects in a more natural setting.

Types of Quasi-Experimental Designs

Venturing into the world of quasi-experimental designs introduces you to various sub-types:

  • Time-Series Design: A classic method where the same group is observed multiple times before and after a treatment.
  • Nonequivalent Control Group Design: Involves two distinct groups – one receiving the treatment and another not, but without random assignment.
  • Interrupted Time-Series Design: Observations made at multiple time points with an “interruption” or treatment in between.

Advantages of Quasi-Experimental Design

Like a breath of fresh air, quasi-experimental design brings along several advantages:

  • Practicality and Real-world Application: It’s grounded in reality, making it applicable in real-world scenarios where random assignment is impossible.
  • Ability to Handle Ethical Concerns: In situations where it’s unethical to withhold treatment, this design shines.
  • Enhanced Ecological Validity: The results often reflect real-world conditions, making them more generalizable.

Challenges and Criticisms

However, it’s not all sunshine and roses. The quasi-experimental design faces its share of criticism:

  • Potential for Confounding Variables: Without random assignment, there’s always the risk of unseen factors affecting the outcome.
  • Limited Internal Validity: It’s hard to establish cause and effect conclusively.
  • Dependence on External Factors: The design can be influenced by external events, skewing results.

Implementing Quasi-Experimental Design

Implementing this design requires a meticulous approach:

  • Key Steps in Execution: From identifying the research question to collecting and analyzing data, each step must be executed with precision.
  • Ensuring Reliability and Validity: Rigorous checks and balances are essential to ensure results are consistent and reflect the true nature of the phenomenon studied.
  • Practical Tips for Researchers: Always be aware of potential confounders and be ready to adapt as real-world scenarios evolve.

Applications in Various Fields

The versatility of quasi-experimental design is evident in its wide-ranging applications:

  • Health and Medicine: From studying the effects of a new drug to understanding behavioral changes, it’s a staple in medical research.
  • Social Sciences: Understanding societal changes, behaviors, and patterns often leans on this design.
  • Business and Economics: Whether it’s market research or understanding consumer behavior, quasi-experimental designs have found their footing.

Quasi-Experimental Design in Digital Age

The dawn of the digital era has reshaped quasi-experimental design:

  • Role of Technology and Software: Modern tools assist in data collection, analysis, and interpretation, streamlining the research process.
  • Data Collection and Analysis Methods: Digital platforms offer a treasure trove of data, making research richer and more comprehensive.

Comparing with Other Research Methods

When juxtaposed with other methods:

  • Qualitative vs. Quantitative: Quasi-experimental design can be tailored for both, offering flexibility.
  • Experimental vs. Non-experimental: It beautifully bridges the gap, providing a balanced approach.
Frequently Asked Questions

  • What sets quasi-experimental design apart from true experimental design? 

True experimental design involves random assignment, while quasi-experimental design does not.

  • Is quasi-experimental design qualitative or quantitative? 

It can be both. The nature of the research question dictates the approach.

  • Are results from quasi-experimental designs reliable? 

Yes, provided the study is designed and executed meticulously.

  • Why choose quasi-experimental design over observational studies? 

It offers a structured approach, allowing for better control over variables.

  • Can technology skew results in quasi-experimental designs? 

If not accounted for, technology can introduce confounding variables.

  • What’s the future of quasi-experimental design in research? 

With evolving tools and methods, it’s poised to become more refined and precise.

Conclusion: The Future of Quasi-Experimental Design

The realm of quasi-experimental design, with its adaptability and relevance, promises a bright future. As tools evolve and research methodologies become more sophisticated, the quasi-experimental design will continue to play a pivotal role, bridging the gap between strict experimental methods and free-form observational studies.




Quasi-experimental Research: What It Is, Types & Examples

Quasi-experimental research is research that appears to be experimental but is not.

Much like an actual experiment, quasi-experimental research tries to demonstrate a cause-and-effect link between a dependent and an independent variable. Unlike a true experiment, however, a quasi-experiment does not depend on random assignment: subjects are sorted into groups based on non-random criteria.

What is Quasi-Experimental Research?

“Resemblance” is the definition of “quasi.” Individuals are not randomly allocated to conditions or orders of conditions, even though the independent variable is manipulated. As a result, quasi-experimental research is research that appears to be experimental but is not.

Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research avoids the directionality problem. However, because individuals are not assigned to conditions at random, there are likely to be other systematic differences between conditions.

As a result, in terms of internal validity, quasi-experiments fall somewhere between correlational research and true experiments.

The key component of a true experiment is randomly assigned groups: each person has an equal chance of ending up in the experimental group or the control group, that is, of being manipulated or not.

Simply put, a quasi-experiment is not a true experiment, because it lacks randomly assigned groups. Why is random assignment so crucial, given that it is the only distinction between quasi-experimental and true experimental research?

Let’s use an example to illustrate our point. Assume we want to discover how a new psychological therapy affects depressed patients. In a true experiment, you’d randomly split the psych ward into two halves, with one half receiving the new psychotherapy and the other half receiving the standard depression treatment.

The physicians would then compare the outcomes of the new treatment against those of the standard treatment to see whether it is more effective. Doctors, however, are unlikely to agree to such an experiment, since they may consider it unethical to give one group a promising new therapy while denying it to the other.

A quasi-experimental study will be useful in this case. Instead of allocating patients at random, you identify pre-existing groups within the hospital: some therapists will be eager to take up the new treatment, while others prefer to stick to the standard approach.

These pre-existing groups can be used to compare the symptom development of individuals who received the novel therapy with those who received the normal course of treatment, even though the groups weren’t chosen at random.

If substantial pre-existing differences between the groups can be ruled out or accounted for, you can be reasonably confident that any differences in outcomes are attributable to the treatment rather than to extraneous variables.

As we mentioned before, quasi-experimental research entails manipulating an independent variable without randomly assigning people to conditions or sequences of conditions. Non-equivalent group designs, pretest-posttest designs, and regression discontinuity designs are a few of the essential types.

What are quasi-experimental research designs?

Quasi-experimental research designs are a type of research design that is similar to experimental designs but doesn’t give full control over the independent variable(s) like true experimental designs do.

In a quasi-experimental design, the researcher changes or watches an independent variable, but the participants are not put into groups at random. Instead, people are put into groups based on things they already have in common, like their age, gender, or how many times they have seen a certain stimulus.

Because the assignments are not random, it is harder to draw conclusions about cause and effect than in a real experiment. However, quasi-experimental designs are still useful when randomization is not possible or ethical.

The true experimental design may be impossible to accomplish or just too expensive, especially for researchers with few resources. Quasi-experimental designs enable you to investigate an issue by utilizing data that has already been paid for or gathered by others (often the government). 

Because quasi-experiments take place in real-world settings rather than fully controlled laboratories, they tend to have higher external validity than most true experiments; and because the independent variable is still manipulated, they offer higher internal validity than other non-experimental research (though less than true experiments).

Is quasi-experimental research quantitative or qualitative?

Quasi-experimental research is a quantitative research method. It involves numerical data collection and statistical analysis. Quasi-experimental research compares groups with different circumstances or treatments to find cause-and-effect links. 

It draws statistical conclusions from quantitative data. Qualitative data can enhance quasi-experimental research by revealing participants’ experiences and opinions, but quantitative data is the method’s foundation.

Quasi-experimental research types

There are many different sorts of quasi-experimental designs; common varieties include non-equivalent groups designs, regression discontinuity, and natural experiments, illustrated below.

Natural Experiments

Example: Consider a government program with more eligible applicants than available slots. Because administrators couldn’t afford to enroll everyone who qualified, they used a random lottery to distribute slots.

Researchers were able to investigate the program’s impact by treating enrolled people as a treatment group and those who were qualified but did not win the lottery as a comparison group.

How QuestionPro helps in quasi-experimental research?

QuestionPro can be a useful tool in quasi-experimental research because it includes features that can assist you in designing and analyzing your research study. Here are some ways in which QuestionPro can help in quasi-experimental research:

  • Design surveys
  • Randomize participants
  • Collect data over time
  • Analyze data
  • Collaborate with your team



Frequently asked questions

What is a quasi-experiment?

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Frequently asked questions: Methodology

What is attrition?

Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group . As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased .

Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.

Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.

Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.

A cycle of inquiry is another name for action research . It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”

To make quantitative observations , you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Criterion validity and construct validity are both types of measurement validity. In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity. You need to have face validity, content validity, and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity . Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity.

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.

Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analyzing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.

Snowball sampling is a non-probability sampling method. Unlike probability sampling (which involves some form of random selection), the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method, where there is not an equal chance for every member of the population to be included in the sample.

This means that you cannot use inferential statistics and make generalizations —often the goal of quantitative research . As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research .

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias .

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extramarital affairs)

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalyzing the existing data in the same manner.
  • Replicating (or repeating) the research entails reconducting the entire analysis, including the collection of new data.
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup ( probability sampling ). In quota sampling you select a predetermined number or proportion of units, in a non-random manner ( non-probability sampling ).

Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.

A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not distinguish characteristics among the participants. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.

The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.

Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.

On the other hand, convenience sampling involves selecting whichever people happen to be available, which means that not everyone has an equal chance of being selected; selection depends on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.

A sampling frame is a list of every member in the entire population . It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous , so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous , as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population .

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment .

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment , an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups .

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods , the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity .
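
To make this concrete, here is a minimal Python sketch (all scores and measure names are hypothetical) that correlates a new measure with an established test of the same construct and with a test of an unrelated construct:

    import numpy as np

    # Hypothetical scores for eight participants on three measures.
    new_anxiety_scale   = np.array([12, 18, 9, 22, 15, 11, 20, 14])
    established_anxiety = np.array([11, 19, 8, 21, 16, 10, 22, 13])   # same construct
    vocabulary_test     = np.array([27, 29, 26, 28, 24, 30, 25, 31])  # unrelated construct

    # Convergent validity: the new measure should correlate strongly
    # with an established measure of the same construct.
    r_convergent = np.corrcoef(new_anxiety_scale, established_anxiety)[0, 1]

    # Discriminant validity: it should correlate weakly (near zero)
    # with a measure of an unrelated construct.
    r_discriminant = np.corrcoef(new_anxiety_scale, vocabulary_test)[0, 1]

    print(f"convergent r = {r_convergent:.2f}, discriminant r = {r_discriminant:.2f}")

In this toy example, a strong convergent correlation together with a near-zero discriminant correlation would support construct validity.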

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity ,  because it covers all of the other types. You need to have face validity , content validity , and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four types of measurement validity; the others are face validity, content validity, and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity : The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity : The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Naturalistic observation is a valuable tool because of its flexibility, external validity , and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control , ethical considerations , and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as “people watching” with a purpose.

A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it “depends” on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation).

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions , which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with “yes” or “no” (questions that start with “why” or “how” are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. Structured interviews are often quantitative in nature. They are best used when:

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself; either way, you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyze your data quickly and efficiently.
  • Your research question depends on strong parity between participants, with environmental conditions held constant.

More flexible interview options include semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys , but is most common in semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews , but it can be mitigated by writing really high-quality interview questions.

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions.
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualize your initial thoughts and hypotheses.
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts.

The four most common types of interviews are:

  • Structured interviews : The questions are predetermined in both topic and order. 
  • Semi-structured interviews : A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews : None of the questions are predetermined.
  • Focus group interviews : The questions are presented to a group instead of one individual.

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research .

In research, you might have come across something called the hypothetico-deductive method . It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning , where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalization : You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalization: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

In inductive research , you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Triangulation can help:

  • Reduce research bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labor-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation :

  • Data triangulation : Using data from different times, spaces, and people
  • Investigator triangulation : Involving multiple researchers in collecting or analyzing data
  • Theory triangulation : Using varying theoretical perspectives in your research
  • Methodological triangulation : Using different methodologies to approach the same topic

Many academic fields use peer review , largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure. 

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered a highly credible source due to the stringent process they go through before publication.

In general, the peer review process follows these steps:

  • First, the author submits the manuscript to the editor.
  • The editor then screens the manuscript and decides whether to reject it and send it back to the author, or send it onward to the selected peer reviewer(s).
  • Next, the peer review itself occurs. The reviewer provides feedback, addressing any major or minor issues with the manuscript, and gives their advice regarding what edits should be made.
  • Lastly, the edited manuscript is sent back to the author. They input the edits and resubmit it to the editor for publication.

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research is a methodology approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

Explanatory research is used to investigate how or why a phenomenon occurs. Exploratory research, by contrast, is often one of the first stages in the research process, serving as a jumping-off point for future research.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design , inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.

Every dataset requires different techniques to clean dirty data , but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.

These data might be missing values, outliers, duplicate values, incorrectly formatted, or irrelevant. You’ll start with screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
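
As an illustration only, the following minimal Python sketch (column names, values, and the plausible-age range are all hypothetical) screens a small dataset for duplicates, missing values, and an implausible outlier using pandas:

    import pandas as pd

    # Hypothetical raw data with a duplicate row, a missing value, and an outlier.
    raw = pd.DataFrame({
        "participant": [1, 2, 2, 3, 4],
        "age": [34.0, 29.0, 29.0, None, 410.0],  # 410 is a likely data-entry error
    })

    clean = (
        raw.drop_duplicates(subset="participant")  # resolve duplicate records
           .dropna(subset=["age"])                 # handle missing values
    )

    # Screen for implausible values; in practice, diagnose before removing.
    clean = clean[clean["age"].between(18, 100)]

    print(clean)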

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors , but cleaning your data helps you minimize or resolve these.

Without data cleaning, you could end up with a Type I or II error in your conclusion. These types of erroneous conclusions can be practically significant with important consequences, because they lead to misplaced investments or missed opportunities.

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations .

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others .

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

In multistage sampling , you can use probability or non-probability sampling methods .

For a probability sample, you have to conduct probability sampling at every stage.

You can mix it up by using simple random sampling , systematic sampling , or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples .

These are four of the most common mixed methods designs :

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analyzed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analyzed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualize your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analyzed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research , but it’s also commonly applied in quantitative research . Mixed methods research always uses triangulation.

In multistage sampling , or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis .

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.

These are the assumptions your data must meet if you want to use Pearson’s r :

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data are from a random or representative sample
  • You expect a linear relationship between the two variables
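
To see the difference between the correlation coefficient and the slope in practice, here is a minimal Python sketch with made-up data: np.corrcoef computes Pearson’s r, while fitting a regression line with np.polyfit yields the slope.

    import numpy as np

    # Made-up paired measurements.
    hours_studied = np.array([1, 2, 3, 4, 5, 6])
    exam_score = np.array([55, 60, 64, 71, 74, 80])

    # Pearson's r: how closely the data fit a line (strength and direction).
    r = np.corrcoef(hours_studied, exam_score)[0, 1]

    # The slope comes from a regression fit, not from r itself.
    slope, intercept = np.polyfit(hours_studied, exam_score, 1)

    print(f"Pearson's r = {r:.2f}, slope = {slope:.2f} points per extra hour")

Two datasets can share the same r yet produce very different slopes, which is exactly why the regression step is needed.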

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships .

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study , ethnography , and grounded theory designs.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions, utilizing credible sources . This allows you to draw valid , trustworthy conclusions.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative )
  • The type of design you’re using (e.g., a survey , experiment , or case study )
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods (e.g., questionnaires , observations)
  • Your data collection procedures (e.g., operationalization , timing and data management)
  • Your data analysis methods (e.g., statistical tests  or thematic analysis )

A research design is a strategy for answering your research question. It defines your overall approach and determines how you will collect and analyze data.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias. Randomization can minimize bias from order effects.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

The third variable and directionality problems are two main reasons why correlation isn’t causation .

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.

Correlation describes an association between variables : when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.

Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between the variables). The two variables are correlated with each other, and there’s also a causal link between them.

While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship where A relates to B—but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to the false cause fallacy.

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design , you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design , you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity .

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research .

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

Random error  is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables .

You can avoid systematic error through careful design of your sampling , data collection , and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment ; and apply masking (blinding) where possible.

Systematic error is generally a bigger problem in research.

With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample , the errors in different directions will cancel each other out.

Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions ( Type I and II errors ) about the relationship between the variables you’re studying.

Random and systematic error are two types of measurement error.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).

Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If you have quantitative variables, use a scatterplot or a line graph.
  • If your explanatory or response variable is categorical, use a bar graph.

The term “ explanatory variable ” is sometimes preferred over “ independent variable ” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

In a controlled experiment , all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables .

There are 4 main types of extraneous variables :

  • Demand characteristics : environmental cues that encourage participants to conform to researchers’ expectations.
  • Experimenter effects : unintentional actions by researchers that influence study outcomes.
  • Situational variables : environmental variables that alter participants’ behaviors.
  • Participant variables : any characteristic or aspect of a participant’s background that could affect study results.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.
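
As a minimal sketch (the variables and levels are hypothetical), crossing the levels of two independent variables in Python enumerates every condition of a factorial design:

    from itertools import product

    # Hypothetical independent variables, each with its levels.
    caffeine = ["none", "200 mg"]
    noise = ["quiet", "moderate", "loud"]

    # Each combination of levels is one condition of the 2 x 3 design.
    conditions = list(product(caffeine, noise))
    for condition in conditions:
        print(condition)
    print(len(conditions), "conditions in total")  # 2 x 3 = 6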

Within-subjects designs have many potential threats to internal validity , but they are also very statistically powerful .

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

While a between-subjects design has fewer threats to internal validity , it also requires more participants for high statistical power than a within-subjects design .

Advantages:

  • Prevents carryover effects of learning and fatigue.
  • Shorter study duration.

Disadvantages:

  • Needs larger samples for high power.
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results.

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
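
A minimal Python sketch of this lottery procedure (the sample size is hypothetical) might look like the following:

    import random

    # Hypothetical sample: every participant gets a unique number.
    participants = list(range(1, 21))

    # The lottery step: shuffle, then split into two equal groups.
    random.shuffle(participants)
    control = participants[:10]
    experimental = participants[10:]

    print("Control group:     ", sorted(control))
    print("Experimental group:", sorted(experimental))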

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs . That way, you can isolate the control variable’s effects from the relationship between the variables of interest.

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity .

If you don’t control relevant extraneous variables , they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable .

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

If something is a mediating variable :

  • It’s caused by the independent variable.
  • It influences the dependent variable.
  • When it’s statistically taken into account, the relationship between the independent and dependent variables weakens, because the mediator explains part or all of that relationship.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

There are three key steps in systematic sampling :

  • Define and list your population , ensuring that it is not ordered in a cyclical or periodic order.
  • Decide on your sample size and calculate your interval, k , by dividing the population size by your target sample size.
  • Choose every k th member of the population as your sample.
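
A minimal Python sketch of the three steps above (the population and sample size are hypothetical):

    import random

    # Hypothetical population list of 1,000 members (not in a cyclical order).
    population = [f"person_{i}" for i in range(1, 1001)]

    sample_size = 100
    k = len(population) // sample_size  # interval: 1000 / 100 = 10

    # Start at a random point within the first interval, then take every kth member.
    start = random.randrange(k)
    sample = population[start::k]

    print("interval k =", k, "| sample size =", len(sample))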

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling .

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the number of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.
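
A short Python sketch (using the hypothetical subgroups above) makes the arithmetic concrete:

    from itertools import product

    location = ["urban", "rural", "suburban"]
    marital_status = ["single", "divorced", "widowed", "married", "partnered"]

    # Crossing the characteristics yields every mutually exclusive subgroup.
    strata = list(product(location, marital_status))

    print(len(strata), "subgroups")  # 3 x 5 = 15
    print(strata[:2])  # e.g., ('urban', 'single'), ('urban', 'divorced')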

You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance ) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling , researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method.

Cluster sampling is more time- and cost-efficient than other probability sampling methods , particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling , because it is difficult to ensure that your clusters properly represent the population as a whole.

There are three types of cluster sampling : single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling , you collect data from every unit within the selected clusters.
  • In double-stage sampling , you select a random sample of units from within the clusters.
  • In multi-stage sampling , you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.
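
As a minimal illustration of single-stage cluster sampling in Python (the schools and students are hypothetical):

    import random

    # Hypothetical clusters: each school is a cluster of students.
    schools = {
        "school_A": ["a1", "a2", "a3"],
        "school_B": ["b1", "b2", "b3"],
        "school_C": ["c1", "c2", "c3"],
        "school_D": ["d1", "d2", "d3"],
    }

    # Randomly select whole clusters...
    selected = random.sample(list(schools), k=2)

    # ...then include every unit within each selected cluster (single-stage).
    sample = [student for school in selected for student in schools[school]]

    print(selected, "->", sample)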

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity. However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey is an example of simple random sampling. In order to collect detailed data on the population of the US, Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population . Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
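
In code, simple random sampling reduces to drawing a random subset from the full sampling frame. A minimal Python sketch with a hypothetical frame:

    import random

    # Hypothetical sampling frame listing all 500 members of the population.
    sampling_frame = [f"member_{i}" for i in range(1, 501)]

    # Every member has an equal chance of being selected.
    sample = random.sample(sampling_frame, k=50)

    print(len(sample), "members selected, e.g.,", sample[:3])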

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity  as they can use real-world interventions instead of artificial laboratory settings.

Blinding is important to reduce research bias (e.g., observer bias , demand characteristics ) and ensure a study’s internal validity .

If participants know whether they are in a control or treatment group , they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

  • In a single-blind study , only the participants are blinded.
  • In a double-blind study , both participants and experimenters are blinded.
  • In a triple-blind study , the assignment is hidden not only from participants and experimenters, but also from the researchers analyzing the data.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment .

A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Individual Likert-type questions are generally considered ordinal data , because the items have clear rank order, but don’t have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyze your data.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey , you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.
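
As a minimal sketch (the items, scale length, and responses are hypothetical), combining Likert item responses into a single scale score, with one reverse-worded item recoded first, might look like this in Python:

    # One participant's responses to a four-item, 5-point Likert scale.
    responses = {"item_1": 4, "item_2": 5, "item_3_reversed": 2, "item_4": 4}

    def reverse_code(value, scale_points=5):
        # On a 5-point scale, 1 becomes 5, 2 becomes 4, and so on.
        return scale_points + 1 - value

    total_score = (
        responses["item_1"]
        + responses["item_2"]
        + reverse_code(responses["item_3_reversed"])
        + responses["item_4"]
    )

    print(total_score)  # 4 + 5 + 4 + 4 = 17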

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalization .

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
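
For instance, here is a minimal Python sketch of a hypothesis test (the group scores are made up) using an independent-samples t test from SciPy:

    from scipy import stats

    # Made-up outcome scores for a treatment group and a control group.
    treatment = [24, 27, 31, 29, 26, 30, 28]
    control = [22, 20, 25, 23, 21, 24, 22]

    # Null hypothesis: the two group means are equal.
    t_stat, p_value = stats.ttest_ind(treatment, control)

    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    # A small p-value suggests the observed difference is unlikely by chance alone.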

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods )

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction, you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching, you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable.

In statistical control, you include potential confounders as variables in your regression.

In randomization, you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.
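
As a minimal sketch of statistical control (all data are simulated and the effect sizes are arbitrary), a regression in Python that includes the potential confounder as a covariate might look like this:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(seed=0)
    n = 200

    # Simulated data: the confounder influences both treatment and outcome.
    confounder = rng.normal(size=n)
    treatment = 0.5 * confounder + rng.normal(size=n)
    outcome = 2.0 * treatment + 1.5 * confounder + rng.normal(size=n)

    df = pd.DataFrame({"outcome": outcome, "treatment": treatment,
                       "confounder": confounder})

    # Including the confounder as a covariate isolates the treatment effect
    # (its coefficient should land near the true value of 2.0).
    model = smf.ols("outcome ~ treatment + confounder", data=df).fit()
    print(model.params)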

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause , while the dependent variable is the supposed effect . A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables , or even find a causal relationship where none exists.

A study can include more than one independent or dependent variable, but including more than one of either type requires multiple research questions.

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .

To ensure the internal validity of an experiment , you should only change one independent variable at a time.

A variable cannot be both independent and dependent at the same time. The value of a dependent variable depends on an independent variable, so a variable must be either the cause or the effect, not both.

You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment .

  • The type of soda – diet or regular – is the independent variable .
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of soda.

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

In non-probability sampling , the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling , voluntary response sampling, purposive sampling , snowball sampling, and quota sampling .

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling , systematic sampling , stratified sampling , and cluster sampling .
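As a hypothetical sketch (not from the original text), the snippet below illustrates two of these methods, simple random sampling and stratified sampling, on an invented sampling frame.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Invented sampling frame: 10,000 people across three strata.
population = pd.DataFrame({
    "person_id": np.arange(10_000),
    "stratum": rng.choice(["urban", "suburban", "rural"],
                          size=10_000, p=[0.5, 0.3, 0.2]),
})

# Simple random sampling: every member has an equal, known chance.
srs = population.sample(n=500, random_state=1)

# Stratified sampling: draw the same fraction within each stratum,
# so every stratum is represented proportionally.
stratified = population.groupby("stratum").sample(frac=0.05, random_state=1)

print(srs.shape[0], stratified["stratum"].value_counts().to_dict())
```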

Using careful research design and sampling procedures can help you avoid sampling bias . Oversampling can be used to correct undercoverage bias .

Some common types of sampling bias include self-selection bias , nonresponse bias , undercoverage bias , survivorship bias , pre-screening or advertising bias, and healthy user bias.

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

A sampling error is the difference between a population parameter and a sample statistic .

A statistic refers to measures about the sample , while a parameter refers to measures about the population .
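A small numeric illustration (with invented numbers): the mean of the whole population is the parameter, the mean of any one sample is a statistic, and the gap between them is the sampling error.

```python
import numpy as np

rng = np.random.default_rng(7)

# Invented population of 100,000 values; its mean is the parameter.
population = rng.normal(loc=50, scale=12, size=100_000)
parameter = population.mean()

# Draw one sample of 200; its mean is the statistic.
sample = rng.choice(population, size=200, replace=False)
statistic = sample.mean()

print(f"parameter (population mean) = {parameter:.2f}")
print(f"statistic (sample mean)     = {statistic:.2f}")
print(f"sampling error              = {statistic - parameter:.2f}")
```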

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Samples are used to make inferences about populations . Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

There are seven threats to external validity : selection bias , history, the experimenter effect, the Hawthorne effect , the testing effect, aptitude-treatment interaction, and the situation effect.

The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).

The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study .

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

The 1970 British Cohort Study , which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study .

Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Longitudinal studies and cross-sectional studies are two different types of research design . In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

Longitudinal study:

  • Repeated observations over time
  • Observes the same group multiple times
  • Follows changes in participants over time

Cross-sectional study:

  • Observations at a single point in time
  • Observes different groups (a “cross-section”) of the population
  • Provides a snapshot of society at a given point

There are eight threats to internal validity : history, maturation, instrumentation, testing, selection bias , regression to the mean, social interaction and attrition .

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts and meanings, use qualitative methods .
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause , while a dependent variable is the effect .

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The  independent variable  is the amount of nutrients added to the crop field.
  • The  dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables .

External validity is the extent to which your results can be generalized to other contexts.

The validity of your experiment depends on your experimental design .

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the  consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity   refers to the  accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population . Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.



Pretest-Posttest Design

In a  pretest-posttest design , the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of an antidrug education program on elementary school students’ attitudes toward illegal drugs. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the antidrug program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of  history . Other things might have happened between the pretest and the posttest. Perhaps an antidrug program aired on television and many of the students watched it, or perhaps a celebrity died of a drug overdose and many of the students heard about it. Another category of alternative explanations goes under the name of  maturation . Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become less impulsive or better reasoners and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is  regression to the mean . This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study  because  of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect.

A closely related concept—and an extremely important one in psychological research—is  spontaneous remission . This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001) [2] . Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
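The selection effect described above can be demonstrated with a short simulation (a hypothetical sketch, not from the original text). Each student’s “true ability” is held constant across two test occasions; only chance varies. Students selected for extremely low first scores improve on the retest even though nothing was done to them.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000

# Each observed score = stable true ability + independent luck that day.
ability = rng.normal(100, 10, n)
test1 = ability + rng.normal(0, 10, n)
test2 = ability + rng.normal(0, 10, n)  # no treatment between tests

# Select the bottom 10% on test 1, as if assigning them to a program.
selected = test1 < np.percentile(test1, 10)

print(f"selected group, test 1 mean: {test1[selected].mean():.1f}")
print(f"selected group, test 2 mean: {test2[selected].mean():.1f}")
# The second mean is reliably higher, purely from regression to the mean.
```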

Does Psychotherapy Work?

Early studies on the effectiveness of psychotherapy tended to use pretest-posttest designs. In a classic 1952 article, researcher Hans Eysenck summarized the results of 24 such studies showing that about two thirds of patients improved between the pretest and the posttest (Eysenck, 1952) [3] . But Eysenck also compared these results with archival data from state hospital and insurance company records showing that similar patients recovered at about the same rate  without  receiving psychotherapy. This parallel suggested to Eysenck that the improvement that patients showed in the pretest-posttest studies might be no more than spontaneous remission. Note that Eysenck did not conclude that psychotherapy was ineffective. He merely concluded that there was no evidence that it was, and he wrote of “the necessity of properly planned and executed experimental studies into this important field” (p. 323). You can read the entire article here: Classics in the History of Psychology .

Fortunately, many other researchers took up Eysenck’s challenge, and by 1980 hundreds of experiments had been conducted in which participants were randomly assigned to treatment and control conditions, and the results were summarized in a classic book by Mary Lee Smith, Gene Glass, and Thomas Miller (Smith, Glass, & Miller, 1980) [4] . They found that overall psychotherapy was quite effective, with about 80% of treatment participants improving more than the average control participant. Subsequent research has focused more on the conditions under which different types of psychotherapy are more or less effective.

Interrupted Time Series Design

A variant of the pretest-posttest design is the  interrupted time-series design . A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this one is “interrupted” by a treatment. In one classic example, the treatment was the reduction of the work shifts in a factory from 10 hours to 8 hours (Cook & Campbell, 1979) [5] . Because productivity increased rather quickly after the shortening of the work shifts, and because it remained elevated for many months afterward, the researcher concluded that the shortening of the shifts caused the increase in productivity. Notice that the interrupted time-series design is like a pretest-posttest design in that it includes measurements of the dependent variable both before and after the treatment. It is unlike the pretest-posttest design, however, in that it includes multiple pretest and posttest measurements.

Figure 7.3 shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of  Figure 7.3 shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of  Figure 7.3 shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.

Figure 7.3 (image not reproduced; see image description below)

Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does  not  receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve  more  than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an antidrug program, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to an antidrug program, and finally are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this change in attitude could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi-experiment. In fact, it is the kind of experiment that Eysenck called for—and that has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

Image Descriptions

Figure 7.3 image description: Two line graphs charting the number of absences per week over 14 weeks. The first 7 weeks are without treatment and the last 7 weeks are with treatment. In the first line graph, there are between 4 to 8 absences each week. After the treatment, the absences drop to 0 to 3 each week, which suggests the treatment worked. In the second line graph, there is no noticeable change in the number of absences per week after the treatment, which suggests the treatment did not work.

  • Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.
  • Posternak, M. A., & Miller, I. (2001). Untreated short-term course of major depression: A meta-analysis of outcomes from studies using wait-list control groups. Journal of Affective Disorders, 66, 139–146.
  • Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319–324.
  • Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.

Glossary

  • Nonequivalent groups design: A between-subjects design in which participants have not been randomly assigned to conditions.
  • Pretest-posttest design: A design in which the dependent variable is measured once before the treatment is implemented and once after it is implemented.
  • History: A category of alternative explanations for differences between scores, such as events that happened between the pretest and posttest and are unrelated to the study.
  • Maturation: An alternative explanation referring to ways the participants might have changed between the pretest and posttest that would have occurred anyway because they are growing and learning.
  • Regression to the mean: The statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion.
  • Spontaneous remission: The tendency for many medical and psychological problems to improve over time without any form of treatment.
  • Interrupted time-series design: A design based on a set of measurements taken at intervals over a period of time that is “interrupted” by a treatment.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


  • Open access
  • Published: 03 November 2022

A comparison of four quasi-experimental methods: an analysis of the introduction of activity-based funding in Ireland

  • Gintare Valentelyte   ORCID: orcid.org/0000-0001-9188-3854 1 , 3 ,
  • Conor Keegan   ORCID: orcid.org/0000-0003-1367-1156 2 &
  • Jan Sorensen   ORCID: orcid.org/0000-0003-0857-9267 3  

BMC Health Services Research, volume 22, Article number: 1311 (2022)


Health services research often relies on quasi-experimental study designs in the estimation of treatment effects of a policy change or an intervention. The aim of this study is to compare some of the commonly used non-experimental methods in estimating intervention effects, and to highlight their relative strengths and weaknesses. We estimate the effects of Activity-Based Funding, a hospital financing reform of Irish public hospitals, introduced in 2016.

We estimate and compare four analytical methods: Interrupted time series analysis, Difference-in-Differences, Propensity Score Matching Difference-in-Differences and the Synthetic Control method. Specifically, we focus on the comparison between the control-treatment methods and the non-control-treatment approach, interrupted time series analysis. Our empirical example evaluated the length of stay impact post hip replacement surgery, following the introduction of Activity-Based Funding in Ireland. We also contribute to the very limited research reporting the impacts of Activity-Based-Funding within the Irish context.

Interrupted time-series analysis produced statistically significant results that differed in interpretation from those of the Difference-in-Differences, Propensity Score Matching Difference-in-Differences and Synthetic Control methods, which incorporate control groups and suggested no statistically significant intervention effect on patient length of stay.

Our analysis confirms that different analytical methods for estimating intervention effects provide different assessments of the intervention effects. It is crucial that researchers employ appropriate designs which incorporate a counterfactual framework. Such methods tend to be more robust and provide a stronger basis for evidence-based policy-making.


Introduction

In health services research, quasi-experimental methods continue to be the main approaches used in the identification of impacts of policy interventions. These methods provide alternatives to randomised experiments, e.g. Randomised Controlled Trials (RCTs), which are less prevalent in health policy research, particularly for larger scale interventions. Examples of previously conducted experiments include the RAND Health Insurance Experiment [ 1 ] and the Oregon Health Insurance Experiment [ 2 ], which have since led to the restructuring of health insurance plan policies across the United States. Although such large-scale experiments can generate robust evidence for informing health policy decisions, they are often too complex, expensive, unethical or infeasible to implement for larger scale policies and interventions [ 3 , 4 ]. Quasi-experimental methods provide an alternative means of policy evaluation, using non-experimental data sources, where randomisation is infeasible or unethical, or when the intervention has already occurred and its evaluation takes place retrospectively [ 3 ].

The evaluation of policy impacts, regardless of analytical approach, is aimed at identifying causal effects of a policy change. A concise guide highlights the approaches which are appropriate for evaluating the impact of health policies [ 3 ]. A recent review identified a number of methods appropriate for estimating intervention effects [ 5 ]. Additionally, several control-treatment approaches have recently been compared in terms of their relative performance [ 6 , 7 ].

However, there is limited empirical evidence in the health services research field comparing control-treatment analytical approaches to non-control-treatment approaches, used for estimating health intervention or policy effects. We use an empirical example of Activity-Based Funding (ABF), a hospital financing intervention, to estimate the policy impact using four non-experimental methods: Interrupted Time-Series (ITS), Difference-in-Differences (DiD), Propensity Score Matching Difference-in-Differences (PSM DiD), and Synthetic Control (SC). A review of the application of these methods in the literature examining ABF impacts has recently been undertaken [ 5 ]. Out of 19 identified studies, six studies employed ITS, seven employed DiD and one study employed the SC approach [ 5 ]. The identified effects, as assessed by reporting on a set of hospital outcomes, varied based on the analytical method that was used. The studies which employed ITS all reported statistically significant effects post-ABF which have led to increased levels of hospital activity [ 8 , 9 ], and reductions in patient length of stay (LOS) [ 10 , 11 , 12 , 13 ]. In contrast, the evidence is more mixed, among the remaining studies which employed control-treatment methods. For example, significant increases in hospital activity were reported in three studies which used the DiD approach [ 14 , 15 , 16 ], while another study found no significant impacts in terms of activity [ 17 ]. Similarly, contrasting evidence in terms of changes in LOS [ 16 , 18 , 19 ] and mortality [ 18 , 20 ] were also reported. Therefore, the overall evidence on the impacts of ABF on hospital outcomes can be considered mixed, and as highlighted by Palmer et al. (2014) [ 21 ] ‘Inferences regarding the impact of ABF are limited both by inevitable study design constraints (randomized trials of ABF are unlikely to be feasible) and by avoidable weaknesses in methodology of many studies’ [ 21 ].

The aim of this study is to compare these analytical methods in their estimation of intervention effects, using an empirical case of ABF introduction in Ireland. Specifically, we focus on the comparison of control-treatment analytical approaches (DiD, PSM DiD, SC), to ITS, a commonly used non-control-treatment approach for evaluating policies and interventions. Additionally, we contribute to the very limited research evidence assessing the impacts of ABF within the Irish context.

ABF and the Irish health system

Activity-based funding (ABF) is a financing model that incentivises hospitals to deliver care more efficiently [ 22 ]. Under ABF, hospitals receive prospectively set payments based on the number and type of patients treated [ 22 ]. Payments reflect an efficient price for the services provided, with adjustments incorporated for the different patient populations served. Prices are determined prospectively, e.g. in terms of Diagnosis Related Groups (DRGs), and reflect differences in hospital activity based on the types of diagnoses and procedures provided to patients [ 23 ]. DRGs provide transparent price differences, directly linking hospital service provision to hospital payments. In theory, this incentivises hospitals to deliver more efficient healthcare (e.g. shorter LOS) and to be more transparent in their allocation of resources and finances [ 22 , 24 ].

The Irish healthcare system is predominantly a public health system, with the majority of health expenditure raised through general taxation (72%), and remainder through out-of-pocket payments (13%) and voluntary private health insurance (15%) [ 25 ]. In Ireland, most hospital care is delivered in public hospitals and this care is mostly government-financed, with approximately one-fifth of care delivered in public hospitals privately financed [ 25 , 26 ]. Patients who receive private day or inpatient treatment in public hospitals are required to pay private accommodation and consultant charges. The majority of private patient activity in public hospitals is funded through private health insurance with the remainder through out-of-pocket payments. Public or private patient status relates to whether the hospital patient saw their consultant on a public or private basis [ 27 ]. For non-consultant hospital staff, the same publicly funded staff are employed in delivering care to both publicly and privately financed patients [ 27 ].

Traditionally, all Irish public hospitals were funded on a budgetary block grant basis based on historical performance, making it difficult to measure and monitor activity and funding of public hospital care [ 28 ]. On the 1st January 2016, a major financing reform was introduced, and funding of public patients in most public hospitals moved to ABF [ 29 ]. ABF was introduced centrally by the Health Services Executive (HSE), responsible for delivery of public health services in Ireland. All public inpatient activity is funded under ABF, while all outpatient and Emergency Department (ED) activity continues to be funded using block budgets [ 30 ]. The ABF funding model is based on prospectively set average DRG prices, and additionally financially penalises hospitals for long patient LOS [ 30 ]. Additionally, the amount of activity that a hospital can carry out as well as the maximum funding it can receive, is capped, to preserve the overall health budget provided to a particular hospital [ 30 ]. Public patient reimbursement is based on the average price of DRGs, in contrast to private patients who are reimbursed at a per-diem basis [ 30 ].

Thus, this key difference in reimbursement between public and private patients treated in the same hospitals lends itself to a naturally occurring control group for our analysis using the control-treatment approaches.

Estimation models

Interrupted time-series analysis

Interrupted Time Series (ITS) analysis identifies intervention effects by comparing the level and trend of outcomes pre and post intervention [ 31 ]. Often, ITS compares outcome changes for a single population and does not specify a control group against which intervention effects can be compared [ 32 ]. This can bias the estimated intervention effects, as a defined control group often eliminates any unmeasured group or time-invariant confounders from the intervention itself [ 33 ]. Therefore, ITS can overestimate the effects of an intervention producing misleading estimation results [ 4 ].

The ITS analysis model can be presented as [ 34 , 35 ]:

$$Y_t = \beta_0 + \beta_1 T + \beta_2 X_t + \beta_3 (T \times X_t) + \epsilon_t$$

where \(Y_t\) is the outcome measured at time t, \(T\) is the time since the start of the study, \(X_t\) is a dummy variable representing the intervention (0 = pre-intervention period, 1 = post-intervention period), and \(T \times X_t\) is an interaction term; \(\beta_0\) represents the intercept of the outcome (baseline level at T = 0), \(\beta_1\) is the change in outcome until the introduction of the intervention (pre-intervention trend), \(\beta_2\) is the change in the outcome following the intervention (the level change), \(\beta_3\) represents the difference between pre-intervention and post-intervention slopes of the outcome (treatment effect over time), and \(\epsilon_t\) is the error term.
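For illustration only (a sketch on invented data, not the authors’ code or the HIPE data), the segmented regression above can be estimated by ordinary least squares in Python, mirroring the study’s layout of 12 pre-intervention and 15 post-intervention quarters:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)

# 27 quarters: 12 pre-intervention, 15 post-intervention.
T = np.arange(1, 28)                  # time since start of study
X = (T > 12).astype(int)              # 0 = pre-intervention, 1 = post
TX = T * X                            # interaction term
y = 6.0 - 0.02 * T - 0.5 * X - 0.01 * TX + rng.normal(0, 0.1, 27)

df = pd.DataFrame({"y": y, "T": T, "X": X, "TX": TX})
fit = smf.ols("y ~ T + X + TX", data=df).fit()

# Intercept = beta_0 (baseline level), T = beta_1 (pre-trend),
# X = beta_2 (level change), TX = beta_3 (slope change after intervention).
print(fit.params)
```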

Potential outcomes framework

Alternatively, analytical approaches such as Difference-in-Differences (DiD), Propensity Score Matching Difference-in-Differences (PSM DiD) and Synthetic Control (SC) overcome some of the shortcomings of ITS. These approaches are based on the counterfactual framework and the idea of potential outcomes which quantify the estimation of causal effects of a policy or an intervention Footnote 1 . The potential outcomes framework defines a causal effect for an individual as the difference in outcomes that would have been observed for that individual with and without being exposed to an intervention [ 36 , 37 ]. Since we can never observe both potential outcomes for any one individual (we cannot go back in time to expose them to the intervention), we cannot compute the individual treatment effect [ 36 ]. Researchers therefore focus on average causal effects across populations guided by this potential outcomes framework [ 3 , 36 , 37 ]. Therefore in practice, estimation is always related to the counterfactual outcome, which is represented by the control group [ 36 , 38 ] Footnote 2 . Consequently, it is for this reason all of these analytical approaches use a clearly defined control group in estimation, against which the outcomes for a group affected by the intervention are compared. The inclusion of a control group improves the robustness of the estimated intervention effects, by approximating experimental designs such as a RCT, the gold standard [ 38 ].

Difference-in-differences analysis

The DiD approach estimates causal effects by comparing the observed outcome changes pre intervention with the counterfactual outcomes post intervention, between a naturally occurring control group and a treatment group exposed to the intervention change [ 33 ]. The key advantage of the DiD approach is its use of the intervention itself as a naturally occurring experiment, allowing to eliminate any exogenous effects from events occurring simultaneously to the intervention [ 33 , 38 ].

The DiD approach estimates the average treatment effect on the treated (ATT) across individual units at a particular time point, represented by the general DiD model as [ 3 , 6 , 33 , 38 ]:

$$Y_{it} = \beta_0 + \beta_1 D_i + \beta_2 X_t + \beta_3 (D_i \times X_t) + h_i + \lambda_t + \epsilon_{it}$$

where \(Y_{it}\) is the value of the outcome observed for unit i at time t, \(D_i\) is an indicator of unit i being in the treatment group (vs. the control group), \(X_t\) is a dummy variable representing the intervention period (0 = pre-intervention period, 1 = post-intervention period), and \(D_i \times X_t\) is the interaction term between the two; \(\beta_1\) represents the estimated average difference in Y between the treatment and control groups, \(\beta_2\) is the expected average change in Y from before to after the onset of the intervention, \(\beta_3\) is the DiD estimator, which captures the difference in outcomes before and after the intervention between the treatment and control groups, i.e. the estimated average treatment effect on the treated (ATT); \(h_i\) is a vector of hospital fixed effects Footnote 3 which capture unobserved time-invariant differences amongst hospitals (e.g. management), \(\lambda_t\) captures time fixed effects for each quarter t, and \(\epsilon_{it}\) represents exogenous, unobserved idiosyncratic shocks.

However, DiD relies on the parallel trends assumption, which states that, in the absence of treatment, the average outcomes for the treated and control groups would have followed parallel trends over time [ 33 ]. This parallel trends assumption can be represented as [ 33 , 38 ]:

$$E\left[Y^0(1) - Y^0(0) \mid D = 1\right] = E\left[Y^0(1) - Y^0(0) \mid D = 0\right]$$

where \(Y^0(0)\) is the outcome pre-intervention, observed for all units in both the treatment (D = 1) and control (D = 0) groups, and \(Y^0(1)\) is the untreated outcome post-intervention, observed only for the control group; for units in the treatment group (D = 1) it represents the unobserved counterfactual. This assumption cannot be statistically tested, as it applies to the unobserved counterfactual post-intervention [ 33 , 38 ]. However, it is possible to examine the pre-treatment trends between both groups, by re-running the DiD model with an interaction between time and the treatment dummy in the pre-intervention period [ 39 ].
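A hedged sketch of this estimator on simulated panel data (invented values, not the study’s data): unit and time fixed effects absorb \(h_i\) and \(\lambda_t\), so only the interaction term, the DiD estimate of the ATT, needs to enter directly.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)

# Simulated panel: 20 units x 27 quarters; intervention after quarter 12.
units, quarters = 20, 27
df = pd.DataFrame({
    "unit": np.repeat(np.arange(units), quarters),
    "t": np.tile(np.arange(1, quarters + 1), units),
})
df["D"] = (df["unit"] < 10).astype(int)   # first 10 units are treated
df["X"] = (df["t"] > 12).astype(int)      # post-intervention indicator
df["y"] = (5.0 - 0.02 * df["t"]
           - 0.3 * df["D"] * df["X"]      # true ATT = -0.3
           + rng.normal(0, 0.2, len(df)))

# Two-way fixed effects: C(unit) and C(t) play the roles of h_i and lambda_t.
fit = smf.ols("y ~ D:X + C(unit) + C(t)", data=df).fit()
print(fit.params["D:X"])  # DiD estimate of the ATT
```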

Propensity score matching difference-in-differences

PSM DiD is an extension to the standard DiD approach. Using this approach, outcomes between treatment and control groups are compared after matching them on similar observable factors, followed by estimation by DiD [ 40 , 41 , 42 ]. Combining the PSM approach with DiD allows further elimination of any time-invariant differences between the treatment and control groups, and allows selection on observables and on unobservables which are constant over time [ 40 , 43 ]. Additionally, matching on the propensity score accounts for imbalances in the distribution of the covariates between the treatment and control groups [ 40 ] Footnote 4 . We present this model as follows [ 40 ]:

$$ATT = E\left[Y_{1i} - Y_{0i} \mid D_i = 1, P(x_{0i})\right] - E\left[Y_{1i} - Y_{0i} \mid D_i = 0, P(x_{0i})\right]$$

where \(Y_{1i}\) and \(Y_{0i}\) are the outcomes in the post-intervention and pre-intervention periods for individual patient episode i, respectively; \(D_i = 1\) indicates that individual patient episode i is in the treatment group, \(D_i = 0\) indicates that it is in the control group, and \(P(x_{0i})\) represents the probability of treatment assignment conditional on observed characteristics in the pre-intervention period.

In our final PSM DiD estimation model we estimate the average treatment effect on the treated (ATT) using nearest-neighbour matching on propensity scores, by selecting the one comparison unit, i.e. the patient episode, whose propensity score is nearest to the treated unit in question. We present our estimation model as follows:

$$ATT = \frac{1}{N_1} \sum_{i \in D_1 \cap S} \left[ (Y_{1i} - Y_{0i}) - \sum_{j \in D_0 \cap S} w_{ij} (Y_{1j} - Y_{0j}) \right]$$

where \(D_1\) and \(D_0\) represent the treatment and control groups respectively, \(w_{ij}\) are the nearest-neighbour matching weights, \(N_1\) is the number of treated episodes on common support, and S is the area of common covariate support Footnote 5 .

Additionally, PSM makes the parallel trends assumption more plausible, as the control groups are based on similar propensity scores in the PSM DiD approach. PSM forms statistical twin pairs before conducting DiD estimation, thus increasing the credibility of the identification of the treatment effect [ 40 ]. PSM relies instead on the conditional independence assumption (CIA). This assumption states that, in the absence of the intervention, the expected outcomes for the treated and control groups would have been the same, conditional on their past outcomes and observed characteristics pre-intervention [ 40 , 44 ]. However, it is also important to note that even if covariate balance is achieved in PSM DiD, this does not necessarily mean that there will be balance across variables that were not used to build the propensity score [ 40 , 44 ]. It is for this reason that the CIA is still required.

Furthermore, recent developments of the DiD approach have highlighted that additional assumptions are necessary to ensure the estimated treatment effects are unbiased [ 45 ]. It is proposed that estimates will remain consistent after conditioning on a vector of pre-treatment covariates [ 45 ]. This was our motivation for employing the PSM DiD approach, as it accounts for pre-intervention characteristics, which allow to further minimise estimation bias. PSM DiD achieves this by properly applied propensity scores, based on matched pre-intervention characteristics, thus eliminating observations that are not similar between treatment and control groups [ 41 ]. Further developments have been made to account for multiple treatment groups, which receive treatment at various time periods i.e. differential timing DiD [ 46 ]. However, this does not affect our analysis, as the introduction of ABF in our empirical example took place at one time.

Synthetic control

The Synthetic Control (SC) method estimates the ATT by constructing a counterfactual treatment-free outcome for the treated unit using a weighted average of available control units pre-intervention [ 44 , 47 , 48 ]. The weights are chosen so that the outcomes and covariates for the treated unit and the synthetic control are similar in the pre-treatment period [ 44 , 48 ]. Because the parallel trends assumption may not hold in reality, particularly when estimating policy impacts, alternative analytical approaches such as SC, which avoid that assumption, have been considered.

The SC approach becomes particularly useful in cases when a naturally occurring control group cannot be established, or in cases where the parallel trends assumption does not hold, and it can often complement other analytical approaches [ 48 ]. Similarly to PSM, the SC method also relies on the CIA, and controls for pre-treatment outcomes and covariates by re-weighting treated observations, using a semiparametric approach [ 44 ]. For a single treated unit the synthetic control is formed by finding the vector of weights W that minimises [ 44 ]:

$$\left\| X_1 - X_0 W \right\|_V = \sqrt{(X_1 - X_0 W)^{\prime} \, V \, (X_1 - X_0 W)}$$

where W represents the vector of weights that are positive and sum to 1, \(X_1\) contains the pre-treatment outcomes and covariates for the treated unit, \(X_0\) contains the pre-treatment outcomes and covariates for the control units, and V is a positive semidefinite matrix capturing the relative importance of the chosen variables as predictors of the outcome.

The choice of V is important, as W* depends on the choice of V. The synthetic control W*(V) is meant to reproduce the behaviour of the outcome variable for the treated unit in the absence of the treatment. Often a V that minimises the mean squared prediction error over the pre-intervention period is chosen [ 44 , 48 ]:

$$\sum_{t=1}^{T_0} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j^{*}(V) \, Y_{jt} \right)^2$$

where \(T_0\) is the pre-intervention period, \(Y_{1t}\) is the outcome for the treated unit at time t, \(Y_{jt}\) is the outcome for control unit j at time t, \(w_j^{*}(V)\) is the synthetic control weight for control unit j, and W* is the vector of optimally chosen weights.

Similarly, we limit biases in our estimated treatment effects [ 45 ] using the SC approach, which restricts the synthetic control weights to be positive and sum to one and such that the chosen weights minimise the mean squared prediction error with respect to the outcome [ 49 ].
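As a simplified sketch of the weight-finding step (invented matrices; V is fixed to the identity here for brevity, whereas the paper also optimises V), the constrained minimisation can be written with scipy.optimize:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(21)

# Invented pre-treatment data: k outcomes/covariates, J control units.
k, J = 8, 9
X1 = rng.normal(size=k)        # treated unit's pre-treatment values
X0 = rng.normal(size=(k, J))   # the same quantities for each control unit

def loss(w):
    # ||X1 - X0 w||^2 with V = identity; the full method also chooses V.
    return float(np.sum((X1 - X0 @ w) ** 2))

# Synthetic-control weights must be non-negative and sum to one.
constraints = {"type": "eq", "fun": lambda w: np.sum(w) - 1.0}
bounds = [(0.0, 1.0)] * J
w0 = np.full(J, 1.0 / J)

res = minimize(loss, w0, bounds=bounds, constraints=constraints,
               method="SLSQP")
print(np.round(res.x, 3))  # W*: weights defining the synthetic control
```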

Data and methods

In our empirical example analysis, we used national Hospital In-Patient Enquiry (HIPE) administrative activity data from 2013 to 2019 for 19 public acute hospitals providing orthopaedic services in Ireland. HIPE data used in our analysis record and classify all activity (public and private) in Irish public hospitals [ 27 ]. We divided our data into quarterly time periods (n = 27) based on admission date. Data were available for 12 quarters pre-ABF introduction, and 15 quarters post-ABF introduction. We assessed the impact of ABF on patient average LOS, following elective hip replacement surgery, for a total of 19,565 hospital patient episodes.

For each analysis, we included hospital fixed effects and controlled for the same covariates: Age categories (reference category 60–69 years), average number of diagnoses, average number of additional procedures (additional to hip replacement), Diagnosis-Related Group (DRG) complexity (split by minor and major complexity) and interaction variables: Age categories by average number of diagnoses, age categories by average number of additional procedures, age categories by DRG complexity.

We estimated the ITS model using ordinary least squares and included public patient episodes only. Following guidance from previous studies [ 32 , 50 ], we accounted for seasonality by including indicator variables for elapsed time since ABF introduction. Additionally, we checked for presence of autocorrelation by plotting the residuals and the partial autocorrelation function [ 32 , 50 ].

For the remaining models, we used treatment and control groups consisting of public and private patient episodes, respectively, and estimated the average treatment effects on the treated (ATT). We used the key differences in reimbursement between public (DRG payments) and private (per-diem payments) patient episodes, to differentiate our treatment group from the control group. The identification strategy exploits the fact that per-diem funding of private patient care remained unchanged over the study period. Any change in outcome between public and private patients before and after the introduction of ABF should be due to the policy introduction.

In our DiD analysis, we controlled for common aggregate shock changes by including dummy variables for each time period (time fixed effects). We additionally examined the parallel trends assumption by interacting the time and treatment indicators in the pre-ABF period (see Supplementary Tables  4 , Additional File 6 ).

We estimated PSM DiD in a number of steps Footnote 6 : First we estimated propensity scores to treatment based on our list of covariates, using a probit regression. Second, we matched the observations in the treatment group (public patient episodes) with observations in the control group (private patient episodes) as per estimated propensity scores with the common support condition imposed. Finally, we compared the changes in the average LOS of the treated and matched controls by DiD estimation.
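A hypothetical end-to-end sketch of those three steps on simulated episode-level data (illustrative only, not the authors’ code; a single invented covariate stands in for the full covariate list):

```python
import numpy as np
import statsmodels.api as sm
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(13)
n = 2_000

# Simulated episodes: one covariate, treatment flag, pre-to-post LOS change.
age = rng.uniform(40, 90, n)
treated = (rng.random(n) < 1 / (1 + np.exp(-(age - 65) / 10))).astype(int)
delta_los = -0.3 * treated + 0.01 * age + rng.normal(0, 1, n)

# Step 1: probit propensity scores from pre-intervention covariates.
exog = sm.add_constant(age)
ps = sm.Probit(treated, exog).fit(disp=0).predict(exog)

# Step 2: match each treated episode to its nearest control on the score.
treats = np.where(treated == 1)[0]
controls = np.where(treated == 0)[0]
nn = NearestNeighbors(n_neighbors=1).fit(ps[controls].reshape(-1, 1))
_, idx = nn.kneighbors(ps[treats].reshape(-1, 1))
matched = controls[idx.ravel()]

# Step 3: DiD on matched pairs -- difference of pre-to-post changes.
att = delta_los[treats].mean() - delta_los[matched].mean()
print(f"ATT estimate: {att:.2f}")
```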

The SC estimation Footnote 7 was conducted at the hospital level. It has been reported that the SC approach used in our analysis works best with aggregate-level data [ 44 , 48 , 52 ]. We incorporated the nested option in our estimation, a fully nested optimization procedure that searches among all (diagonal) positive semidefinite matrices and sets of weights for the best fitting convex combination of the control units [ 44 , 52 ]. The synthetic control group composition consisted of private patient episodes based on characteristics from 9 different public hospitals from the sample of 19 hospitals used in our analysis [see Supplementary Tables  1 , Additional File 2 ].

To examine whether the estimated effects from all analyses still hold, we conducted sensitivity analysis and re-estimated each analytical model using trimmed LOS at 7 days (at the 90th percentile of the LOS distribution). As illustrated by the distribution of LOS in Supplementary Fig.  1 , Additional File 1 , this allowed for the exclusion of outlier LOS values. Additionally, to test the robustness of the estimated treatment effects, we tested the empirical strength of each model by inclusion and exclusion of certain covariates. We also examined the trends in the pre-ABF period across all DiD models, to check whether the trends were similar across the treatment and control groups.

Table 1 summarises the key descriptive statistics of the data analysed. Over the study period, the overall average LOS for this sample of patient episodes was 5.2 days (5.3 and 5.0 days for public and private patients, respectively). The largest age group (31.7% of patients) was 60–69 years (30.9% of public and 33.8% of private patients, respectively). The average number of additional diagnoses was 2.5 for public and 2.1 for private patients (overall average of 2.4), and the average number of additional procedures was 3.3 for public and 2.8 for private patients. The DRG complexity indicates that most patients (95.7%) had undergone minor-complexity hip replacement surgery.

We illustrate the estimated intervention effects for each of the models in Fig. 1. We observe a clear reduction in the average LOS from the ITS estimates (Fig. 1a). However, the DiD and PSM DiD estimates are very similar to each other, and we do not observe a clear effect on the average LOS, with most coefficients distributed closely around zero (Fig. 1b and c). Similarly, the SC approach could not identify a clear effect (Fig. 1d). Additionally, both the SC (Fig. 1d and Supplementary Table 1, Additional File 2) and PSM DiD (Supplementary Fig. 2, Additional File 3) approaches achieved good balance between the treated (public patient episodes) and control (private patient episodes) groups. Our examination of the pre-ABF trends did not identify any significant differences between treatment and control groups (see Supplementary Table 4, Additional File 6).

Figure 1. Model estimates.

Table 2 summarises the estimated treatment effects for each estimation model Footnote 8 . The ITS analysis suggested ABF had the largest and statistically significant impact on the average LOS for public patients: a reduction of 0.7 days (p < 0.01). This effect was not observed with the control-treatment approaches; the DiD, PSM DiD and SC estimates were also negative but smaller, and not statistically significant for any of these models. As illustrated in Fig. 2 below, we observe a generally declining trend in the average LOS for both the public and private patients in our data. This explains the statistically significant effect of ITS relative to the control-treatment methods, which difference out the decline in average LOS common to public and private patient episodes.

The results from our sensitivity analysis (Supplementary Tables  2 , Additional File 4 ) revealed no material change for the ITS estimates, which remained statistically significant (p < 0.001). The estimated treatment effects from the control-treatment approaches remained small, and not statistically significant. Similarly, additional robustness testing of the estimated treatment effects by each model (and pre-ABF trend examination) remained consistent with the main results (Supplementary Tables  3 , Additional File 5 ).

Figure 2. Average LOS by quarter 2013–2019 for treatment and control groups.

In this study we compared the key analytical methods that have been used in the evaluation of policy interventions, using the introduction of Activity-Based Funding (ABF) in Irish public hospitals as an illustrative policy case. Specifically, we compared several control-treatment methods (DiD, PSM DiD, SC) to a non-control-treatment approach, ITS. We contribute to the limited empirical evidence in the health services research field comparing control-treatment analytical approaches to non-control-treatment approaches, building on recent evidence highlighting the common use of these methods in the estimation of health intervention or policy effects [5]. Additionally, we contribute to the very limited research evidence on the evaluation of the ABF policy within the Irish context. We were able to utilise an important dimension of the funding changes by exploiting the fact that both publicly and privately financed patients are treated in public hospitals in Ireland and that, over the period of analysis, private patients were not subject to a change in their funding.

From our comparative methods analysis, ITS produced statistically significant estimates indicating a reduction in LOS post ABF introduction, whereas the control-treatment approaches did not indicate any significant effects. This is in line with the results from other studies that have estimated ABF effects using ITS and reported significant reductions in LOS [10, 11, 12, 13]. Caution should be taken when considering ITS, as the estimates may not truly capture the effects of the intervention of interest. This could lead to incorrect inferences, and potentially to misguided assessment of the impacts of policy changes across the hospital sector. For instance, the estimated reduction in LOS for Irish public patients may incorrectly indicate that the ABF reform has been successful. From a policy perspective, the perceived importance of the ABF effects would be informed by the size of the ITS estimates, providing potentially misleading evidence on the funding reform.

Further, caution should be taken because ITS analysis, unlike the other methods we considered, does not include a control group. Therefore, the conclusions drawn from the ITS analysis will differ from those drawn from the control-treatment approaches. Additionally, our findings from the ITS analysis align with a recent study which tested the empirical strength of the ITS approach by comparing estimated ITS results to the results from an RCT [4]. Relative to the RCT, ITS produced misleading results, driven primarily by the lack of a control group and by the ITS model assumptions [4]. This suggests that a comparison of the slope of outcomes before and after an intervention may lead to biased estimates of causal effects on outcomes that evolve over time, due to the influence of simultaneous and other unobservable factors at the time of the intervention.

However, over the study period, the average LOS for both public (treatment) and private (control) patient cases shows a declining trend (Fig. 2). By limiting the analysis to public patients only, the ITS approach ignores this system-level effect for all patients (public and private) treated across public hospitals, and picks up a statistically significant negative effect. In contrast, the control-treatment approaches account for the simultaneous downward trend in private (control) patient activity, thus approximating a natural experiment (e.g. an RCT) more closely and producing more robust estimates than ITS.

It is important to note that often no comparison group is available, limiting the analysis to the ITS approach. This may be driven by various data limitations; for example, the data available over a period may only partially cover a specific intervention. Conventional regression modelling may then be the only feasible approach to account for pre-intervention differences, even though there is evidence that these methods may provide biased results, most notably in the presence of time-dependent confounders [4]. Additionally, certain intervention and policy evaluations may not be feasible under a control-treatment design, making the ITS approach more suitable. This applies to studies that focus on a specific patient [53] or hospital group [10], or on policies at a more aggregate or population level [54], for which it is difficult to identify a naturally occurring control group. In these instances the inclusion of a control group would not be appropriate, and a before-after comparison of the level and trend of outcomes using ITS analysis is the more suitable approach. ITS models may also be more effective in the evaluation of policy and intervention effects when the control-treatment-specific assumptions of parallel trends and conditional independence do not hold [55].

Additionally, ITS has been highlighted as an effective approach for studying short-term policy and intervention effects, as estimation of long-term effects can be biased by the presence of simultaneous shocks to the outcome of interest [56]. In contrast, control-treatment approaches such as DiD and SC have been recognised as more appropriate and robust for the estimation of long-term intervention effects [57], as they allow intervention effects to change over time [38, 49]. Despite recent improvements and developments of the ITS approach [34, 35], the benefits of adopting control-treatment approaches for health intervention and policy evaluation have been highlighted previously [33].

It should be noted that all of the methods applied in this study are limited to the evaluation of a single policy. Any smaller-scale policies implemented simultaneously during the period of analysis are therefore difficult to differentiate in many instances. However, by incorporating a control group, the control-treatment methods account for unmeasured group or time-invariant confounders beyond the main intervention itself [33]. For example, the introduction of ABF in our empirical example may have been accompanied by a hospital-wide discharge policy aimed at reducing LOS. In this instance, ITS would attribute the reduction in LOS entirely to ABF, even though part of it reflects the discharge policy. In contrast, the inclusion of a control group (e.g. patients targeted by the LOS policy but not subject to ABF) would difference out the discharge policy effect and capture effects specific to the ABF introduction. In this case, ITS may overestimate the impacts of ABF relative to the other approaches and may contribute to a different evidence base for policy decisions.

This study has several limitations. First, we limited our ITS analysis to a single group (public patient episodes), despite recent developments extending ITS to multiple-group comparisons [34]. This choice was informed by a recent review, which identified that ITS is typically employed to estimate intervention effects for a single group [5]. Second, for each of the control-treatment methods, we assumed that any shocks following ABF introduction had the same expected effect on the average LOS for the treatment and control groups. Third, we assumed that all of the models were correctly specified in terms of their respective identification and functional form assumptions. If either set of assumptions is violated, the estimates can be biased, as highlighted in the recent literature on DiD approaches [45]. Fourth, we limited our focus to two key assumptions applicable to the quasi-experimental approaches, i.e. parallel trends and conditional independence, and did not examine other assumptions, e.g. the common shock assumption. Fifth, recent research has addressed issues related to intervention 'spillover effects', i.e. the unintended consequences of health-related interventions beyond those initially intended [58]. It is possible that the differing estimated effects, depending on the analytical method used, may reflect or could lead to spillover effects. However, given the nature of the data used in our analysis and our focus on a single procedure, it is difficult to identify any potential spillover effects linked to ABF; more exploration of such effects may be necessary in future research. Finally, caution should be taken in generalising the reported ABF effects given that our empirical example focused on one procedural group in one particular country.

In health services research it is not always feasible to conduct experimental analysis, and we therefore often rely on observational analysis to identify the impact of policy interventions. We demonstrated that ITS analysis produces results that differ in interpretation from control-treatment approaches such as DiD, PSM DiD and SC. Our comparative method analysis therefore suggests that the choice of analytical method should be carefully considered and that researchers should strive to employ designs incorporating control and treatment groups where possible. These methods are more robust and provide a stronger basis for evidence-based policy-making and for informing future financing reform and policy.

Data Availability

The data that support the findings of this study were made available under a strict user agreement with the Healthcare Pricing Office. Access to the data may only be sought directly from the Healthcare Pricing Office.

The treatment effect in terms of potential outcomes: $Y_0(i,t)$ is the outcome that individual $i$ would attain at time $t$ in the absence of treatment, and $Y_1(i,t)$ is the outcome that individual $i$ would attain at time $t$ if exposed to treatment. The treatment effect on the outcome for individual $i$ at time $t$ is $Y_1(i,t) - Y_0(i,t)$. The fundamental identification problem is that for any individual $i$ and time $t$, the two potential outcomes $Y_0(i,t)$ and $Y_1(i,t)$ are never observed together, so we cannot compute the individual treatment effect. We only observe the outcome $Y(i,t)$, expressed as $Y(i,t) = Y_0(i,t)(1 - D(i,t)) + Y_1(i,t)D(i,t)$, where $D(i,t) = 0$ denotes control and $D(i,t) = 1$ treatment. Since treatment occurs after period $t = 0$, we can denote $D(i) = D(i,1)$; then $Y(i,0) = Y_0(i,0)$ and $Y(i,1) = Y_0(i,1)(1 - D(i)) + Y_1(i,1)D(i)$ (Rubin, 1974).

The change in outcomes from pre to post-intervention in the control group is a proxy for the counterfactual change in untreated potential outcomes in the treatment group.
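Combining the two footnotes above, and assuming the control-group change is a valid proxy for the counterfactual (parallel trends), the ATT is identified by the familiar double difference:

```latex
\begin{align*}
\mathrm{ATT} &= E\big[Y_1(i,1) - Y_0(i,1) \mid D(i) = 1\big] \\
  &= \big\{ E[\,Y(i,1) \mid D(i)=1\,] - E[\,Y(i,0) \mid D(i)=1\,] \big\} \\
  &\quad - \big\{ E[\,Y(i,1) \mid D(i)=0\,] - E[\,Y(i,0) \mid D(i)=0\,] \big\}.
\end{align*}
```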

The unit of analysis is the discharge, but by definition we have only one observation per discharge; therefore we cannot apply discharge fixed effects and instead include hospital fixed effects.

Matching on the propensity score works because it imposes the same distribution of the covariates for both the control and treatment groups (Rosenbaum and Rubin (1983)).

The common support condition guarantees that only units with suitable control cases are considered by dropping treatment observations whose propensity score is higher than the maximum or less than the minimum propensity score of the controls.

Using the psmatch2 Stata command with nearest neighbour matching, which showed the best balancing properties after comparing several algorithms [51].

Using the synth Stata command [ 44 , 52 ].

Reported p-values for ITS and DiD are for the hypothesis that ATT = 0. For PSM DiD, reported p-values are conditional on the matched data. For SC, reported p-values were calculated using placebo tests in a procedure akin to permutation tests (Abadie et al. 2010). This involved iteratively resampling from the control pool and, in each iteration, re-assigning each control unit as a 'placebo treated unit' with a probability according to the proportion of treated units in the original sample. The synthetic control method was then applied to these 'placebo data' and the ATT calculated for the placebo treated versus control units. The p-value for the ATT was calculated as the proportion of replicates in which the absolute value of the placebo ATT exceeded the estimated ATT. It should be noted that the p-value based on placebo tests relates to falsification tests, while the p-values reported for the other methods relate to sampling uncertainty. Hence the p-values from the different models are not directly comparable.
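The placebo loop can be sketched as follows; `estimate_att` is a hypothetical helper (e.g. wrapping the weight optimisation sketched earlier), not part of any named library.

```python
# Placebo-test p-value for a synthetic-control ATT estimate.
import numpy as np

def placebo_p_value(treated, donors, estimate_att):
    """Share of placebo runs whose |ATT| exceeds the estimated |ATT|.

    treated: outcome path of the treated unit, shape (T,)
    donors: outcome paths of the control pool, shape (T, J)
    estimate_att: callable (treated_path, donor_matrix) -> scalar ATT
    """
    att = estimate_att(treated, donors)
    placebo = []
    for j in range(donors.shape[1]):
        # Re-assign donor j as a 'placebo treated unit'; the remaining
        # donors form its control pool.
        pool = np.delete(donors, j, axis=1)
        placebo.append(estimate_att(donors[:, j], pool))
    return np.mean(np.abs(placebo) >= abs(att))
```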

Abbreviations

RCT: Randomised Controlled Trial

ITS: Interrupted Time Series

DiD: Difference-in-Differences

PSM: Propensity Score Matching

PSM DiD: Propensity Score Matching Difference-in-Differences

SC: Synthetic Control

CIA: Conditional Independence Assumption

ABF: Activity-Based Funding

ATT: Average Treatment effect on the Treated

HSE: Health Service Executive

HIPE: Hospital In-Patient Enquiry

LOS: Length of Stay

DRG: Diagnosis-Related Group

Brook RH, Keeler EB, Lohr KN, Newhouse JP, Ware JE, Rogers WH, et al. The Health Insurance Experiment: A Classic RAND Study Speaks to the Current Health Care Reform Debate. Santa Monica: RAND Corporation; 2006.


Finkelstein A, Taubman S, Wright B, Bernstein M, Gruber J, Newhouse JP, et al. The Oregon Health Insurance Experiment: Evidence from the first year. Q J Econ. 2012;127(3):1057–106.


Jones AM, Rice N. Econometric evaluation of health policies. Oxford: Oxford University Press; 2011.

Baicker K, Svoronos T. Testing the Validity of the Single Interrupted Time Series Design. CID Working Paper 364, Center for International Development at Harvard University; 2019.

Valentelyte G, Keegan C, Sorensen J. Analytical methods to assess the impacts of activity-based funding (ABF): a scoping review. Health Econ Rev. 2021;11(1):17.

O’Neill S, Kreif N, Grieve R, Sutton M, Sekhon JS. Estimating causal effects: considering three alternatives to difference-in-differences estimation. Health Serv Outcomes Res Methodol. 2016;16:1–21.

O’Neill S, Kreif N, Sutton M, Grieve R. A comparison of methods for health policy evaluation with controlled pre-post designs. Health Serv Res. 2020;55(2):328–38.

Sutherland JM, Liu G, Crump RT, Law M. Paying for volume: British Columbia’s experiment with funding hospitals based on activity. Health Policy. 2016;120(11):1322–8.


Januleviciute J, Askildsen JE, Kaarboe O, Siciliani L, Sutton M. How do Hospitals Respond to Price Changes? Evidence from Norway. Health Econ. 2016;25(5):620–36.


Shmueli A, Intrator O, Israeli A. The effects of introducing prospective payments to general hospitals on length of stay, quality of care, and hospitals’ income: the early experience of Israel. Soc Sci Med. 2002;55(6):981–9.

Perelman J, Closon MC. Hospital response to prospective financing of in-patient days: The Belgian case. Health Policy. 2007;84(2–3):200–9.

Martinussen PE, Hagen TP. Reimbursement systems, organisational forms and patient selection: Evidence from day surgery in Norway. Health Econ Policy Law. 2009;4(2):139–58.

Theurl E, Winner H. The impact of hospital financing on the length of stay: Evidence from Austria. Health Policy. 2007;82(3):375–89.

Gaughan J, Gutacker N, Grašič K, Kreif N, Siciliani L, Street A. Paying for efficiency: Incentivising same-day discharges in the English NHS. J Health Econ. 2019;68:102226.

Allen T, Fichera E, Sutton M. Can Payers Use Prices to Improve Quality? Evidence from English Hospitals. Health Econ. 2016;25(1):56–70.

Verzulli R, Fiorentini G, Lippi Bruni M, Ugolini C. Price Changes in Regulated Healthcare Markets: Do Public Hospitals Respond and How? Health Econ. 2017;26(11):1429–46.

Krabbe-Alkemade YJFM, Groot TLCM, Lindeboom M. Competition in the Dutch hospital sector: an analysis of health care volume and cost. Eur J Health Econ. 2017;18(2):139–53.


Hamada H, Sekimoto M, Imanaka Y. Effects of the per diem prospective payment system with DRG-like grouping system (DPC/PDPS) on resource usage and healthcare quality in Japan. Health Policy. 2012;107(2):194–201.

Farrar S, Yi D, Sutton M, Chalkley M, Sussex J, Scott A. Has payment by results affected the way that English hospitals provide care? Difference-in-differences analysis. BMJ. 2009;339(7720):554–6.

Cooper Z, Gibbons S, Jones S, McGuire A. Does Hospital Competition Save Lives? Evidence From The English NHS Patient Choice Reforms. Econ J. 2011;121(554):F228–F60.

Palmer KS, Agoritsas T, Martin D, Scott T, Mulla SM, Miller AP, et al. Activity-based funding of hospitals and its impact on mortality, readmission, discharge destination, severity of illness, and volume of care: a systematic review and meta-analysis. PLoS ONE. 2014;9(10):e109975.

Street A, Vitikainen K, Bjorvatn A, Hvenegaard A. Introducing activity-based financing: a review of experience in Australia, Denmark, Norway and Sweden. Working Papers 030cherp, Centre for Health Economics, University of York. 2007.

Street A, Maynard A. Activity based financing in England: the need for continual refinement of payment by results. Health Econ Policy Law. 2007;2(4):419–27.

Shleifer A. A Theory of Yardstick Competition. RAND J Econ. 1985;16(3):319–27.

Brick A, Nolan A, O’Reilly J, Smith S. Resource Allocation, Financing and Sustainability in Health Care. Evidence for the Expert Group on Resource Allocation and Financing in the Health Sector. Dublin: The Economic and Social Research Institute (ESRI); 2010.

Keegan C, Connolly S, Wren MA. Measuring healthcare expenditure: different methods, different results. Ir J Med Sci. 2018;187(1):13–23.


Healthcare Pricing Office. Activity in Acute Public Hospitals in Ireland. 2021.

Department of Health. Future Health. A Strategic Framework for Reform of the Health Service 2012–2015. Dublin; 2012.

Health Service Executive (HSE). Activity-Based Funding Programme Implementation Plan 2015–2017. Dublin; 2015.

Healthcare Pricing Office. Introduction to the Price Setting Process for Admitted Patients V1.0 26May2015. 2015.

Kontopantelis E, Doran T, Springate DA, Buchan I, Reeves D. Regression based quasi-experimental approach when randomisation is not an option: interrupted time series analysis. BMJ (Clinical research ed). 2015;350:h2750.

Bernal JL, Cummins S, Gasparrini A. Interrupted time series regression for the evaluation of public health interventions: a tutorial. Int J Epidemiol. 2017;46(1):348–55.


Blundell R, Costa Dias M. Evaluation Methods for Non-Experimental Data. Fisc Stud. 2000;21(4):427–68.

Linden A. Conducting Interrupted Time-series Analysis for Single- and Multiple-group Comparisons. Stata J. 2015;15(2):480–500.

Linden A, Adams JL. Applying a propensity score-based weighting model to interrupted time series data: improving causal inference in programme evaluation. J Eval Clin Pract. 2011;17(6):1231–8.

Rubin DB. Causal Inference Using Potential Outcomes: Design, Modeling, Decisions. J Am Stat Assoc. 2005;100(469):322–31.

Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974;66(5):688–701.

Angrist JD, Pischke J-S. Parallel Worlds: Fixed Effects, Differences-in-Differences, and Panel Data. In: Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton: Princeton University Press; 2009.

Basu S, Meghani A, Siddiqi A. Evaluating the Health Impact of Large-Scale Public Policy Changes: Classical and Novel Approaches. Annu Rev Public Health. 2017;38:351–70.

Heckman JJ, Ichimura H, Todd PE. Matching As An Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme. Rev Econ Stud. 1997;64(4):605–54.

Heckman J, Ichimura H, Smith J, Todd PE. Characterizing Selection Bias Using Experimental Data. Econometrica. 1998;66(5):1017–98.

Song Y, Sun W. Health Consequences of Rural-to-Urban Migration: Evidence from Panel Data in China. Health Econ. 2016;25(10):1252–67.

Glazerman S, Levy DM, Myers D. Nonexperimental Replications of Social Experiments: A Systematic Review. 2003.

Abadie A, Diamond A, Hainmueller J. Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program. J Am Stat Assoc. 2010;105(490):493–505.

Sant’Anna PHC, Zhao J. Doubly robust difference-in-differences estimators. J Econ. 2020;219(1):101–22.

Callaway B, Sant’Anna PHC. Difference-in-Differences with multiple time periods. J Econ. 2020.

Kreif N, Grieve R, Hangartner D, Turner AJ, Nikolova S, Sutton M. Examination of the Synthetic Control Method for Evaluating Health Policies with Multiple Treated Units. Health Econ. 2016;25(12):1514–28.

Bouttell J, Craig P, Lewsey J, Robinson M, Popham F. Synthetic control methodology as a tool for evaluating population-level health interventions. J Epidemiol Commun Health. 2018;72(8):673.

Abadie A. Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects. J Econ Lit. 2021;59(2):391–425.

Cruz M, Bender M, Ombao H. A robust interrupted time series model for analyzing complex health care intervention data. Stat Med. 2017;36(29):4660–76.

Leuven E, Sianesi B. PSMATCH2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. Boston College Department of Economics; 2003.

Abadie A, Diamond A, Hainmueller J. Comparative Politics and the Synthetic Control Method. Am J Pol Sci. 2015;59(2):495–510.

Epstein RA, Feix J, Arbogast PG, Beckjord SH, Bobo WV. Changes to the financial responsibility for juvenile court ordered psychiatric evaluations reduce inpatient services utilization: an interrupted time series study. BMC Health Serv Res. 2012;12(1):136.

Pincus D, Widdifield J, Palmer KS, Paterson JM, Li A, Huang A, et al. Effects of hospital funding reform on wait times for hip fracture surgery: a population-based interrupted time-series analysis. BMC Health Serv Res. 2021;21(1):576.

Hudson J, Fielding S, Ramsay CR. Methodology and reporting characteristics of studies using interrupted time series design in healthcare. BMC Med Res Methodol. 2019;19(1):137.

Ewusie JE, Soobiah C, Blondal E, Beyene J, Thabane L, Hamid JS. Methods, Applications and Challenges in the Analysis of Interrupted Time Series Data: A Scoping Review. J Multidiscip Healthc. 2020;13:411–23.

Aragón MJ, Chalkley M, Kreif N. The long-run effects of diagnosis related group payment on hospital lengths of stay in a publicly funded health care system: Evidence from 15 years of micro data. Health Econ. 2022.

Francetic I, Meacock R, Elliott J, Kristensen SR, Britteon P, Lugo-Palacios DG, et al. Framework for identification and measurement of spillover effects in policy implementation: intended non-intended targeted non-targeted spillovers (INTENTS). Implement Sci Commun. 2022;3(1):30.


Acknowledgements

The authors wish to thank the Data Analytics team at the Healthcare Pricing Office (HPO) for granting access to the data used in this study. This study was conducted as part of the Health Research Board (HRB) SPHeRE Programme (Grant No. SPHeRE-2018-1). The Health Research Board (HRB) supports excellent research that improves people’s health, patient care and health service delivery. An earlier version of this work has been previously presented at the virtual International Health Economics Association (IHEA) Congress 2021.

This research was funded by the Health Research Board SPHeRE-2018-1.

Author information

Authors and affiliations.

Structured Population and Health services Research Education (SPHeRE) Programme, School of Population Health, RCSI University of Medicine and Health Sciences, Mercer Street Lower, Dublin, Ireland

Gintare Valentelyte

Economic and Social Research Institute (ESRI), Whitaker Square, Dublin, Ireland

Conor Keegan

Healthcare Outcome Research Centre (HORC), School of Population Health, RCSI University of Medicine and Health Sciences, Dublin, Ireland

Gintare Valentelyte & Jan Sorensen


Contributions

JS and GV conceived the study. GV drafted and edited the manuscript and performed statistical analysis. CK and JS critically revised the manuscript. All authors approved the final draft.

Corresponding author

Correspondence to Gintare Valentelyte .

Ethics declarations

Ethics approval and consent to participate.

Ethical approval for this study was granted by the Research Ethics Committee of the Royal College of Surgeons of Ireland (REC201910019). We confirm that all methods in this study were carried out in accordance with their specifications and other relevant guidelines and regulations. The ethics committee recognized that explicit consent to participate in the study was not required, as the data used in this study were retrospective, routinely collected, and anonymised. The data controller, the Healthcare Pricing Office, responsible for holding and managing the national Hospital In-Patient Enquiry (HIPE) database, granted access and permission to use the data in this study. The Healthcare Pricing Office ensured strict data user agreements were followed, and the data were anonymized by limiting certain combinations of data that could lead to patient identification. This was in line with the Healthcare Pricing Office’s adherence to the Data Protection Acts 1998 to 2018 and Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016, also known as the General Data Protection Regulation or GDPR (HPO Data Protection Statement Version 1.2, May 2020, Healthcare Pricing Office; available at: https://hpo.ie/data_protection/HPO_Data_Protection_Statement_Version_1.2_May2020_Covid_1.pdf ).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.



Cite this article

Valentelyte, G., Keegan, C. & Sorensen, J. A comparison of four quasi-experimental methods: an analysis of the introduction of activity-based funding in Ireland. BMC Health Serv Res 22 , 1311 (2022). https://doi.org/10.1186/s12913-022-08657-0


Received : 13 December 2021

Accepted : 16 September 2022

Published : 03 November 2022

DOI : https://doi.org/10.1186/s12913-022-08657-0


Keywords:

  • Interrupted time-series
  • Difference-in-differences
  • Propensity score matching
  • Activity-based funding
  • Policy evaluation


Open Access | Published: 02 September 2024

The carbon emission reduction effect of green fiscal policy: a quasi-natural experiment

  • Shuguang Wang 1 ,
  • Zequn Zhang 1 ,
  • Zhicheng Zhou 2 &
  • Shen Zhong 2  

Scientific Reports volume 14, Article number: 20317 (2024)


  • Climate-change impacts
  • Climate-change mitigation
  • Environmental impact

Carbon emission reduction is crucial for mitigating global climate change, and green fiscal policies, by providing economic incentives and reallocating resources, are key means to achieve carbon reduction targets. This paper uses data covering 248 cities from 2003 to 2019 and applies a multi-period difference-in-differences (DID) model to assess the impact of energy conservation and emission reduction ( ECER ) fiscal policies on carbon emission ( CE 1 ) reduction and carbon efficiency ( CE 2 ). It further analyzes the mediating role of green innovation ( GI ), exploring how it strengthens the impact of ECER policies. We find that: (1) ECER policies significantly promote carbon reduction and the improvement of CE 2 , a conclusion that remains robust after excluding the influence of concurrent policies, sample selection bias, outliers, and other random factors. (2) ECER policies enhance CE 1 reduction and CE 2 in pilot cities by promoting green innovation, a conclusion confirmed by Sobel Z tests. (3) The effects of ECER policies on CE 1 reduction and the improvement of CE 2 are more pronounced in higher-level cities, the eastern regions, and non-resource cities. This research provides policymakers with suggestions, highlighting that incentivizing green innovation through green fiscal policies is an effective path to achieving carbon reduction goals.

Introduction

Efforts to mitigate global climate change through the reduction of CE 1 have emerged as a shared objective among nations globally 1 . From the initiation of the United Nations Framework Convention on Climate Change to the enactment of the Kyoto Protocol and the adoption of the Paris Agreement, these pacts reflect the unified resolve of nations to tackle global climate change 2 , 3 . With the acceleration of global industrialization and the continuous increase in energy demand, there has been a significant rise in the emissions of greenhouse gases, especially carbon dioxide, posing an unprecedented challenge to the Earth’s climate system 4 . These issues encompass the escalation of average global temperatures, a surge in severe weather occurrences, accelerated glacier melt, and a persistent increase in sea levels 5 , 6 , 7 , which threaten the balance of natural ecosystems and have profound impacts on the economic development and well-being of human societies. Therefore, adopting effective carbon reduction strategies to slow these climate change trends has become an urgent task faced globally.

In the current field of CE 1 reduction research, the focus is mainly on implementing policies such as carbon emission trading 8 , smart city pilot policies 9 , and low-carbon city pilot policies 10 . Among these policies, green fiscal policy, as a core strategy to mitigate the impact of climate change, is increasingly recognized by the academic community and policymakers for its importance in promoting CE 1 reduction 11 , 12 . This policy directly impacts CE 1 in economic activities through adjustments in the tax system, provision of fiscal subsidies, and increased investments in renewable energy and low-carbon technologies 13 . Green fiscal policies differ from traditional environmental protection measures by employing a mechanism that combines incentives and constraints, aiming to encourage enterprises to adopt emission reduction measures. In the implementation process of green fiscal policies, governments encourage enterprises to reduce CE 1 by adjusting tax policies 14 . Specifically, the ECER policy impacts the carbon emissions of demonstration cities through a combination of financial incentives and target constraints. The demonstration period lasts for three years, during which the central government provides reward funds for demonstration projects. The amount of these rewards is determined by the category of the city: 600 million RMB annually for municipalities and city clusters, 500 million RMB annually for sub-provincial cities and provincial capitals, and 400 million RMB annually for other cities. Local governments have the discretion to decide how to utilize these funds, while the central government is responsible solely for project record management. Additionally, the central government conducts annual and overall target assessments of the demonstration cities. The results of the annual assessment influence the reward funds for the following year: cities that perform excellently will receive an additional 20% of reward funds, while those that fail to meet the standards will have 20% of their funds withdrawn. The overall assessment results are linked to the demonstration qualification and reward funds; cities that fail to meet the overall targets or have serious issues will lose their demonstration status and have all reward funds withdrawn. This financial incentive mechanism ensures that local governments have sufficient financial support when implementing green technologies and projects, promoting increased energy efficiency and the widespread adoption of clean energy. Simultaneously, through the target constraint mechanism, the central government strictly supervises and incentivizes local governments’ efforts to reduce emissions, ensuring effective policy implementation. Under the dual pressure of financial incentives and performance assessments, local governments actively adopt various measures to promote energy conservation and emission reduction, including investing in green infrastructure, promoting energy-saving technologies, and optimizing energy structures, thereby achieving significant reductions in carbon emissions.

Furthermore, innovation and technological breakthroughs significantly enhance the effectiveness of green fiscal policies in reducing carbon emissions. Specifically, technological advancements improve energy efficiency, reducing the energy consumption per unit of output; they lower the production costs of clean energy, promoting its widespread adoption; and they advance carbon capture and storage technologies, directly reducing industrial carbon dioxide emissions. These technological improvements bolster the impact of green fiscal policies, making them more effective in achieving carbon reduction targets. However, the implementation of green fiscal policies also faces some challenges. Firstly, balancing the relationship between economic development and environmental protection to avoid potential negative impacts such as job losses and industrial relocation during policy execution is an issue that policymakers need to consider. Secondly, the effective implementation of green fiscal policies requires strong policy support and regulatory mechanisms to ensure that policy measures are effectively executed and can adapt to constantly changing economic and environmental conditions. Therefore, evaluating the carbon reduction effect of such policies is of significant importance for achieving long-term environmental sustainability and promoting the green economic transformation.

This paper analyzes the impact of green fiscal policies on carbon emissions and carbon efficiency. Relevant research mainly falls into three areas: studies on the factors influencing carbon emissions, research on environmental regulation, and research on energy conservation and emission reduction fiscal policies.

Firstly, a substantial body of literature focuses on the factors influencing carbon emissions, with some studies specifically examining the impact of government intervention and environmental regulation on CO2 emissions. These studies are closely related to the theme of this paper. From an economic perspective, numerous studies have demonstrated that economic growth significantly impacts carbon emissions 15 , 16 , 17 . Generally, increased economic activity is associated with higher energy consumption, leading to higher carbon emissions. However, as economies reach a certain level of development, the Environmental Kuznets Curve (EKC) phenomenon may occur, where carbon emissions begin to decrease after reaching a certain economic threshold 18 , 19 . Research has also confirmed that economic growth increases the ecological footprint, leading to environmental degradation 20 . For example, economic growth, income inequality, and energy poverty have increased environmental pressure in BRICS countries 21 . In Pakistan, institutional quality has led to higher CO 2 emissions, but economic development can help reduce these emissions 22 . From a social perspective, the acceleration of urbanization is typically accompanied by increased energy consumption, thereby raising carbon emissions. There is a long-term and short-term U-shaped relationship between urbanization and the environment 23 . Upgrading existing infrastructure can enable various sectors to produce minimal waste that impacts emissions 24 . Changes in consumption levels and population structure also significantly affect carbon emissions 25 . From a policy perspective, government-enacted environmental regulations and policies, such as carbon taxes, carbon trading markets, emission standards, and renewable energy subsidies, play a crucial role in reducing carbon emissions. Innovations and environmental policies contribute to emission reductions both in the long and short term. Additionally, carbon pricing can reduce emissions in specific regions, although its impact is often more targeted at specific countries 26 . Carbon taxes and mitigation technologies are helping to achieve sustainable development goals for carbon mitigation 27 . Green energy investments are significantly associated with greenhouse gas emissions and support environmental quality 28 . However, these studies often overlook the impact of energy conservation and emission reduction fiscal policies on carbon emissions.

Secondly, there is a body of literature focusing on environmental regulation, which can be divided into two main areas: the impact of environmental regulation on the environment and its impact on the economy. On the one hand, extensive research has explored the environmental impact of regulation. Studies generally agree that stringent environmental regulations help reduce pollutant emissions and improve environmental quality. Environmental regulations significantly enhance the synergy between carbon reduction and air pollution control 29 . Target-based pollutant reduction policies effectively constrain the sulfur dioxide emissions of regulated enterprises, lowering their sulfur dioxide emission intensity, thereby demonstrating that stringent environmental regulations facilitate green transitions for businesses 30 . However, in some developing countries or regions with weak enforcement, the effectiveness of environmental regulations may be compromised. Despite strict regulatory policies being in place, inadequate enforcement or a lack of regulatory capacity may result in actual pollutant reduction falling short of expectations. On the other hand, part of the literature examines the economic impact of environmental regulation. Some studies suggest that environmental regulation can drive technological innovation and industrial upgrading, thereby promoting economic growth 31 . Strict environmental standards force companies to improve production processes and develop new environmental technologies, which can create new economic opportunities and growth points 32 . Environmental regulations significantly enhance green technological innovation 33 , and they have notably promoted green innovation across European countries 34 . Conversely, environmental regulations may increase operational costs for businesses, particularly in the short term due to compliance costs, which could inhibit economic growth. This is especially true for regions or countries that rely heavily on high-pollution, high-energy-consumption industries, where environmental regulation might lead to a slowdown in economic growth. Given that energy conservation and emission reduction fiscal policies are a form of environmental regulation, it is necessary to evaluate their effectiveness.

Thirdly, some literature evaluates the governance effectiveness of energy conservation and emission reduction fiscal policies. From an environmental perspective, these policies can reduce pollutants and enhance efficiency. On average, such policies have reduced industrial SO2 (sulfur dioxide) emissions by 23.8% and industrial wastewater discharge by 17.5% 35 . Additionally, energy conservation and emission reduction fiscal policies can effectively improve green total factor carbon efficiency 36 . From an economic perspective, these policies can promote investment and economic growth 37 . They have significantly improved green credit for enterprises and can facilitate sustainable urban development 38 .

In summary, there are two significant gaps in the existing literature. Firstly, although numerous studies have extensively explored the factors influencing carbon emissions from economic, social, and policy perspectives, relatively few have examined the relationship between ECER policies and carbon emissions. Most of the existing literature focuses on the impact of macroeconomic policies, industrial structure adjustments, and technological innovation on carbon emissions; there is a lack of systematic empirical analysis of how specific fiscal incentives directly affect carbon emissions, limiting our understanding of the actual emission-reduction effects of fiscal policies. Secondly, most existing studies investigate carbon dioxide emissions from a single perspective, such as total carbon emissions, carbon intensity, or carbon efficiency, and lack a multi-faceted exploration of the relationship between a single policy and carbon emissions. Typically, research adopts one specific metric to measure policy effects, but this approach overlooks how different metrics might reveal different aspects of policy impact, and so fails to capture the multi-dimensional effects of policies on carbon emissions across different scenarios and time periods. This paper aims to evaluate the impact of the ECER policy, jointly introduced by the Ministry of Finance and the National Development and Reform Commission in 2011, on CE 1 and CE 2 . Given that the ECER policy was implemented in three batches of pilot cities, this study employs a multi-period difference-in-differences (DID) model for the analysis. The advantage of this model lies in its ability to compare the effects of the policy before and after its implementation across multiple time points, thereby capturing the dynamic impacts of the policy. Furthermore, this article explores the mediating role of green innovation in the impact of the ECER policy and, through heterogeneity analysis, reveals the policy's varying effects on CE 1 and CE 2 across different regions. The marginal contributions of this article are as follows. Firstly, this paper evaluates the relationship between ECER policies and carbon emissions, addressing a significant gap in the existing research: although numerous studies have explored various factors influencing carbon emissions, there is a lack of systematic research on the actual effects of specific ECER fiscal policies, particularly their direct impact on carbon emissions. Through empirical analysis and data validation, this study investigates the specific mechanisms and effects of ECER policies on carbon emissions in practice, thus filling this gap. Secondly, this paper systematically assesses the relationship between ECER policies and carbon emissions from two key perspectives: total carbon emissions and carbon efficiency. By considering these two indicators, the study not only examines the impact of ECER fiscal policies on overall carbon emissions but also analyzes their role in improving carbon efficiency, providing a more comprehensive, multi-dimensional evaluation of the effectiveness and mechanisms of ECER policies.

The remainder of the article is organized as follows: the second part discusses the policy background and theoretical analysis; the third part details the model settings and variable explanations; the fourth part presents the empirical analysis; the fifth part analyzes regional heterogeneity; and the final part presents conclusions and policy recommendations.

Policy background and theoretical analysis

Policy background.

In 2011, the Ministry of Finance and the National Development and Reform Commission issued the “Notice on Conducting Comprehensive Demonstration Work of Fiscal Policies for Energy Conservation and Emission Reduction,” deciding to carry out comprehensive demonstrations of fiscal policies for ECER in some cities during the “Twelfth Five-Year” period. Beijing, Shenzhen, Chongqing, Hangzhou, Changsha, Guiyang, Jilin, and Xinyu were selected as the first batch of demonstration cities. In the subsequent years of 2013 and 2014, 10 and 12 cities were respectively chosen as pilot cities for the fiscal policies on ECER . Specifically, this policy uses cities as platforms and integrates fiscal policies as a means to comprehensively carry out urban ECER demonstrations in various aspects, including industrial decarbonization, transportation clean-up, building greening, service intensification, major pollutant reduction, and large-scale utilization of renewable energy. Its main goal in terms of CE 1 reduction is to establish a concept of green, circular, and low-carbon development in the demonstration cities, achieve widespread promotion of low-carbon technologies in industries, construction, transportation, and other fields, lead the pilot cities in ECER efforts across society, and significantly enhance their capacity for sustainable development. Figure  1 presents the spatial distribution of ECER policy pilot cities in the years 2011, 2013, and 2014 (This figure was created using ArcMap software).

Figure 1. Distribution of ECER Policy Pilot Areas (Plan Approval Number GS(2019)1822).

Theoretical analysis

Carbon emission reduction effect of green fiscal policy.

Green fiscal policy, as a significant environmental governance tool, promotes the transformation of the economic and social system towards low-carbon, sustainable development through fiscal measures 39 . Its CE 1 reduction effects can be described from the following aspects. Firstly, green fiscal policy encourages the research and application of green technologies through economic incentives (such as tax reductions and fiscal subsidies) 40 . These technologies include energy efficiency improvement technologies, clean energy technologies, and carbon capture and storage technologies, which directly reduce energy consumption and CE 1 in economic activities. Secondly, green fiscal policy influences the behavior of consumers and producers by affecting the price mechanism. The imposition of a carbon tax raises the cost of CE 1 , reflecting the external cost of CE 1 on the environment, encouraging enterprises to take emission reduction measures, and prompting consumers to prefer low-carbon products and services 41 . The change in price signals promotes the transformation of the entire society’s energy consumption structure towards more efficient and low-carbon directions. Furthermore, green fiscal policy can support CE 1 reduction-related infrastructure construction and public service improvements through the guidance and redistribution of funds. This includes the construction and optimization of public transportation systems, urban greening, and forest conservation projects, which not only directly or indirectly reduce CE 1 but also enhance the carbon absorption capacity of cities and regions. Lastly, green fiscal policies, by raising public environmental awareness and participation, create a conducive atmosphere for all sectors of society to join in carbon reduction efforts 42 . Governments can increase public awareness of climate change and inspire a low-carbon lifestyle through the promotion and education of fiscal policies, providing broader social support for carbon reduction 43 .

Green fiscal policies not only drive a reduction in CE 1 but also stimulate sustainable economic growth. By taxing high-carbon activities, offering financial subsidies and incentives for green projects, these policies channel capital towards low-carbon and green industries. This not only mitigates negative environmental impacts but also fosters the development of emerging green technologies and sectors. As the green industry expands and low-carbon technologies become more widespread, economic growth increasingly relies on clean and efficient energy use 44 , thereby enhancing the CE 2 . Thus, the implementation of green fiscal policies demonstrates a commitment to transitioning towards a low-carbon economy, playing a crucial role in the global response to climate change, achieving a win–win for environmental protection and economic growth.

Based on this, the article proposes hypothesis 1: Green fiscal policies can promote CE 1 reduction effects and enhance CE 2 .

Mechanism analysis

Green innovation is a key factor in driving sustainable development, and it plays a particularly significant role in CE 1 reduction and efficiency enhancement. By introducing and adopting new environmentally friendly technologies and processes, green innovation not only significantly reduces greenhouse gas emissions but also enhances the efficiency of energy use and resource management, thus promoting a harmonious coexistence between economic activity and environmental protection. Through the development and adoption of renewable energy technologies such as solar, wind, and biomass energy, green innovation directly reduces reliance on fossil fuels and the corresponding CE 1 . The application of these technologies not only reduces the carbon footprint but also promotes the diversification of energy supply and enhances energy security 45 . Green innovation also plays an essential role in improving energy efficiency: by adopting more efficient production processes and energy-using equipment, businesses and households can accomplish the same tasks or meet the same living needs with lower energy consumption, thus reducing CE 1 46 . Additionally, green innovation encompasses the concepts and practices of the circular economy, which encourages the reuse, recycling, and recovery of materials, reducing the extraction and processing of new materials and further lowering CE 1 in the production process 47 . Green innovation includes the development of Carbon Capture, Utilization, and Storage (CCUS) technologies, which can capture carbon dioxide directly from industrial emissions and either convert it into useful products or store it safely, thereby reducing the carbon content of the atmosphere 48 . At the policy and management level, green innovation also involves establishing and refining mechanisms such as carbon pricing, green taxes, and carbon trading, which promote the adoption of low-carbon and environmentally friendly technologies and behaviors among businesses and individuals through economic incentives 49 . Based on this, the article proposes hypothesis H2: Green fiscal policies can promote CE 1 reduction and CE 2 improvement by fostering green innovation.

In conclusion, the theoretical framework is shown in Fig. 2.

Figure 2. Theoretical framework.

Model setting and variable description

To address the limitations faced by traditional regression models in evaluating policy implementation effects, this study utilizes a DID model for analysis. Given the variation in policy implementation years, the traditional two-period DID model cannot be used 50 . Accordingly, this paper draws on the approach of Beck et al. 51 , employing a DID with multiple time periods to assess the policy effects, with the model set up as follows:

$$Y_{it} = \beta_0 + \beta_1\,(Treated_i \times Post_{it}) + \gamma\,Controls_{it} + \nu_i + \tau_t + \varepsilon_{it}$$

Y in the model is the explained variable, denoting CE 1 or CE 2 of city i in year t . Treated i is the group variable, taking the value 1 if city i belongs to the treatment group and 0 if it belongs to the control group; Post it is the post-treatment dummy variable, taking the value 1 for city i in year t if the ECER policy has been officially implemented, and 0 otherwise. This study investigates the impact of energy conservation and emission reduction fiscal policies on urban CE 1 and CE 2 through the interaction term Treated i × Post it ; the coefficient β 1 measures the impact of the policy on the dependent variable. Controls represents the control variables, specifically urbanization rate ( lnur ), foreign direct investment level ( lnfdi ), industrial structure ( lnis ), level of scientific and technological expenditure ( lnsst ), and fiscal revenue and expenditure level ( lnfre ). \(\nu\) , \(\tau\) and \(\varepsilon\) represent city fixed effects, time fixed effects, and the random error term, respectively.

Considering the three-year implementation period of green fiscal policies, it is necessary to establish an exit mechanism for the treatment group. Drawing on the existing literature [12], this paper constructs the treatment group as follows: the first batch of pilot cities is set to 1 from 2011 to 2014; the second batch from 2013 to 2016; and the third batch from 2014 to 2017, with all other city-years set to 0. The pilot cities are shown in Fig. 3.
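The exit mechanism can be expressed directly in code. The sketch below builds the treatment dummy from the batch windows described above; the `batch` mapping and the city identifiers are hypothetical.

```python
# Pilot windows as described in the text: batch -> (first year, last year).
windows = {1: (2011, 2014), 2: (2013, 2016), 3: (2014, 2017)}

def did_dummy(city: int, year: int, batch: dict[int, int]) -> int:
    """Return 1 only while the city's pilot window is active, else 0."""
    if city not in batch:          # never-treated control city
        return 0
    start, end = windows[batch[city]]
    return int(start <= year <= end)

# Usage with made-up city ids assigned to the three batches:
batch = {101: 1, 202: 2, 303: 3}
print([did_dummy(101, y, batch) for y in range(2009, 2019)])
# -> 0,0,1,1,1,1,0,0,0,0  (treated 2011-2014, then exits)
```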

Figure 3. ECER policy implementation period.

Variables and data sources

Explained variables.

Carbon Emissions: Drawing on the existing literature, this article uses published city-level emissions data to measure CE1 [52, 53]. The accounts follow the IPCC guidelines on greenhouse gas emission allocation, covering carbon dioxide emissions within the administrative boundaries of each city. Territorial emissions refer to emissions occurring within the managed territory and maritime areas under a region's jurisdiction [54], including emissions from socio-economic sectors and direct residential activities within regional boundaries [55].

Carbon Efficiency: Following the existing literature, this paper measures CE2 as the ratio of GDP to CE1, that is, the economic output generated per unit of carbon emitted [56].

In examining the relationship between CE1 and economic efficiency, Fig. 4a provides an overview of the evolution of CE1 from 2003 to 2019, while Fig. 4b portrays the progress in CE2 over the same period. Figure 4a reveals a steady increase in total CE1 beginning in 2003, with a notable acceleration after 2009 and a peak in 2017. Despite some fluctuations and a slight dip in 2018, the 2019 figure remained just below the peak, indicating an overall upward trajectory. In contrast, Fig. 4b demonstrates a year-on-year improvement in CE2, measured in tens of thousands of yuan of output per ton of carbon emitted, starting in 2003. The pace of growth accelerated significantly after 2011, reaching its highest level in 2019. This signifies a substantial rise in economic output per unit of carbon emitted, revealing a reduction in the carbon dependency of economic activity. Taken together, the two figures indicate that, alongside economic growth, there has been notable progress in improving CE2.

Figure 4. Trends in CE1 (a) and CE2 (b), 2003–2019.

Control variables

To eliminate the interference of omitted variables, this article selects the following control variables [57, 58]: urbanization rate (lnur), the ratio of urban population to total population; level of foreign direct investment (lnfdi), the ratio of actually utilized foreign investment to GDP; industrial structure (lnis), the share of the secondary industry in GDP; level of science and technology expenditure (lnsst), the ratio of science and technology expenditure to GDP; and fiscal revenue and expenditure level (lnfre), the sum of local fiscal budget revenue and expenditure relative to GDP. To reduce heteroscedasticity, all control variables enter the models in logarithms. Table 1 reports the definitions of the main variables.
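As a brief illustration of the variable construction, the snippet below computes the five ratios and takes logs; the raw column names and values are placeholders, not the statistical yearbook's actual fields.

```python
import numpy as np
import pandas as pd

# Hypothetical raw city-year inputs (names and values are illustrative only).
raw = pd.DataFrame({
    "urban_pop": [3.2], "total_pop": [5.0], "fdi": [1.1], "gdp": [120.0],
    "secondary_va": [55.0], "st_expend": [0.9], "fiscal_rev_exp": [30.0],
})
# Each control is a ratio, logged to reduce heteroscedasticity.
controls = pd.DataFrame({
    "lnur":  np.log(raw["urban_pop"] / raw["total_pop"]),
    "lnfdi": np.log(raw["fdi"] / raw["gdp"]),
    "lnis":  np.log(raw["secondary_va"] / raw["gdp"]),
    "lnsst": np.log(raw["st_expend"] / raw["gdp"]),
    "lnfre": np.log(raw["fiscal_rev_exp"] / raw["gdp"]),
})
print(controls)
```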

Sample selection and data source

We select prefecture-level cities in China from 2003 to 2019 as the research sample. Because missing data can affect the results, samples with missing data are excluded, yielding 3134 city-year observations. The CE1 data come from the China Emission Accounts and Datasets (CEADs), which provides CE1 data from 1997 to 2019, so the sample period ends in 2019. The control variable data are sourced from the China City Statistical Yearbook covering the years 2004 to 2020. Table 2 provides descriptive statistics for the main variables.

Eliminating interference

In a quasi-natural experiment, various factors may influence the relationship between the implementation of green fiscal policies and the reduction of carbon emissions. To address this, we employed multiple methods to control for these potential confounding variables. Firstly, we introduced control variables to eliminate or reduce the interference of external factors on the main research relationship, ensuring the accurate estimation of the effects of green fiscal policies. Secondly, we adopted a two-way fixed effects model to control for time-invariant city characteristics and potential common time trends. Thirdly, we conducted parallel trend tests to verify whether the trends of the treatment and control groups were consistent before the policy implementation, ensuring the validity of the Difference-in-Differences (DID) estimates. Additionally, we performed multiple robustness checks, including propensity score matching and excluding the effects of other concurrent policies, to test the robustness of the results. Finally, we confirmed the reliability of the results through placebo tests. These methods collectively help to effectively reduce the interference of external variables, ensuring the accuracy and reliability of the research findings.

Empirical results

Benchmark regression analysis.

We employ a two-way fixed effects model for the empirical analysis of the CE1 reduction effects of ECER policies, with the estimation results presented in Table 3. Columns (1) to (3) of Table 3 report the estimated effects of green fiscal policies on CE1. Without control variables, the implementation of green fiscal policies has an estimated coefficient of −0.070 for CE1, significant at the 1% level, indicating that CE1 in pilot cities is 7.0% lower than in non-pilot cities. After adding control variables, the results do not change significantly. Columns (4) to (6) report the estimated effects of green fiscal policies on CE2. Without control variables, the estimated coefficient is 0.099 for CE2, significant at the 1% level, suggesting that CE2 in pilot cities is 9.9% higher than in non-pilot cities. After including control variables, the results remain largely unchanged. This provides evidence for Hypothesis 1: ECER policies have a significant CE1 reduction effect and also significantly promote CE2.

To further illustrate the step-by-step changes in the coefficients, this paper presents Fig.  5 . The horizontal axis of Fig.  5 represents the number of control variables, while the vertical axis indicates the coefficients, with the grey area denoting the error bars. As evident from Fig.  5 , the coefficients and error bars exhibit minimal variation with the increase in control variables, indicating a negligible impact of the number of control variables on the coefficients and highlighting their stability. This finding suggests that the primary regression coefficients remain consistent even when more control variables are included in the analysis, underscoring the model’s robustness.

Figure 5. Plot of coefficient variation based on the step-by-step method.

Parallel trend test

The prerequisite for using a DID model to evaluate policies is the parallel trends assumption: before the policy intervention, the treatment and control groups should exhibit similar trends without systematic differences, while after the intervention their trends should diverge significantly. Following the existing literature [50, 59, 60], this paper employs an event study approach to analyze the effects before and after policy implementation:

$$Y_{it} = \alpha_0 + \sum_{k \ne -1} \alpha_k \left( Treated_i \times Post_{it}^{k} \right) + \gamma\, Controls_{it} + \nu_i + \tau_t + \varepsilon_{it} \qquad (2)$$

In Eq. (2), Treated still denotes cities approved to establish ECER policy pilots, and Post^k_it indicates that year t is the k-th year relative to the policy's implementation in city i. To avoid perfect multicollinearity, the year before policy implementation serves as the baseline group, meaning that k = −1 is excluded from the regression; the other parts of the model are consistent with the baseline model. If the coefficients for k < 0 are not significant, the estimates satisfy the parallel trends assumption. Figure 6 shows that none of the pre-implementation coefficients is significant, and the coefficients become significant from the fifth year after policy implementation onward. This indicates that the implementation of ECER policies has a significant promotional effect on CE1 reduction and CE2 in the pilot areas, but that this effect has some lag.
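A minimal event-study sketch of Eq. (2) follows, assuming a hypothetical panel in which 'policy_year' records each pilot city's implementation year (missing for never-treated cities); relative-time dummies are built with k = −1 omitted as the baseline.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel: 30 cities, cities 0-9 treated in 2011 (illustrative only).
rng = np.random.default_rng(1)
df = pd.DataFrame([(c, y) for c in range(30) for y in range(2003, 2020)],
                  columns=["city", "year"])
df["policy_year"] = np.where(df["city"] < 10, 2011, np.nan)
df["lnce"] = rng.normal(size=len(df))

df["k"] = df["year"] - df["policy_year"]  # event time; NaN for controls
terms = []
for k in range(-5, 6):
    if k == -1:                           # baseline period, dropped
        continue
    col = f"lead_lag_{k + 5}"             # valid identifier for the formula
    df[col] = (df["k"] == k).astype(int)
    terms.append(col)
# Note: event times beyond +/-5 are left unbinned here for brevity.

fit = smf.ols(f"lnce ~ {' + '.join(terms)} + C(city) + C(year)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["city"]}
)
# Insignificant pre-period coefficients (k < 0) support parallel trends;
# post-period coefficients trace the dynamic effect plotted in Fig. 6.
print(fit.params[terms])
```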

Figure 6. Parallel trend test for CE1 (a) and CE2 (b).

Robustness test

Exclusion of contemporaneous policies.

The smart city construction policy began with the "Notice on Carrying out the National Smart City Pilot Work" issued by the Ministry of Housing and Urban–Rural Development in 2012, with smart city pilots established in 2012, 2013, and 2014 [61]. This paper excludes all smart city pilot cities and re-runs the regression, with results shown in columns (1) and (2) of Table 4. The results indicate that contemporaneous policies caused some interference with the estimated coefficients, but to a very limited extent; the implementation of ECER policies still has statistically and economically significant effects on promoting CE1 reduction and CE2 in pilot cities.

Propensity score matching

We employ the Propensity Score Matching (PSM) method to reduce data bias and the impact of confounding factors [62, 63]. The PSM-DID analysis shows that, after matching, the absolute bias (|bias|) of all variables decreases by more than 70%, and the p-values are not statistically significant. This comparison demonstrates the effectiveness of PSM in reducing the initial bias between the treatment and control groups: the matching process achieves balance between the two groups across key indicators, making the assessment of the treatment effect more accurate and reliable.
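The sketch below illustrates the PSM step on synthetic data: a logistic model estimates propensity scores from the controls, each treated unit is matched to its nearest-neighbour control on the score, and standardized mean differences are compared before and after matching. It is a schematic illustration of the method, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Synthetic covariates standing in for lnur, lnfdi, lnis, lnsst, lnfre.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 5))
treated = (X[:, 0] + rng.normal(size=500) > 0.8).astype(int)

# Step 1: propensity scores from a logistic regression of treatment on X.
ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]

# Step 2: 1-nearest-neighbour matching of treated units to controls on ps.
nn = NearestNeighbors(n_neighbors=1).fit(ps[treated == 0].reshape(-1, 1))
_, idx = nn.kneighbors(ps[treated == 1].reshape(-1, 1))
matched_controls = np.flatnonzero(treated == 0)[idx.ravel()]

# Step 3: balance check via standardized mean differences (the %bias idea).
def smd(a, b):
    return (a.mean() - b.mean()) / np.sqrt(0.5 * (a.var() + b.var()))

for j in range(X.shape[1]):
    before = smd(X[treated == 1, j], X[treated == 0, j])
    after = smd(X[treated == 1, j], X[matched_controls, j])
    print(f"covariate {j}: bias before {before:+.3f}, after {after:+.3f}")
```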

Table 4 reports the PSM results. They show a substantial decrease in |bias| across variables, indicating enhanced balance between the treated and control groups after matching. For instance, the absolute bias for lnur dropped from 86.0% to just 3.3%, a 96.2% reduction, underscoring the effectiveness of the matching process. Other variables such as lnfdi, lnis, and lnsst also experienced large reductions in bias. The p > |t| values, mostly above 0.05 after matching, indicate that the between-group differences are not statistically significant, confirming that matching minimized discrepancies and improved comparability.

Figure  7 displays the matching results of PSM. The results indicate that after the matching process, the percentage bias (%bias) for the control variables all remain below 10%. This finding fully confirms the effectiveness of the PSM method in balancing key characteristics between the experimental and control groups, thereby ensuring the accuracy and reliability of subsequent analyses.

Figure 7. Balance test.

This paper then conducts the empirical analysis using the matched data, with results shown in columns (3) and (4) of Table 5. The ECER policy still has a significant CE1 reduction effect and significantly promotes CE2, suggesting that self-selection bias does not materially affect the regression results.

Winsorization

To reduce the impact of outliers on the regression analysis, this paper winsorizes the data [39, 64], replacing observations below the 1st percentile with the 1st-percentile value and those above the 99th percentile with the 99th-percentile value before re-running the regression. Columns (5) and (6) of Table 5 display the results after this treatment, showing that outliers do not significantly affect the regression results.
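Winsorizing at the 1st and 99th percentiles is a one-step transformation; the sketch below shows one way to do it with pandas, using a hypothetical column name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"lnce": np.random.default_rng(3).normal(size=1000)})

def winsorize(s: pd.Series, lower: float = 0.01, upper: float = 0.99) -> pd.Series:
    """Clip a series at its lower/upper percentiles (1st and 99th by default)."""
    lo, hi = s.quantile(lower), s.quantile(upper)
    return s.clip(lower=lo, upper=hi)  # extreme values replaced by the percentile values

df["lnce_w"] = winsorize(df["lnce"])
print(df[["lnce", "lnce_w"]].describe())
```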

Replacing the sample period

Considering the potential unique impact of the COVID-19 pandemic on CE1 and CE2 in 2019, this paper excludes the 2019 data to ensure the robustness of the results and to avoid interference from pandemic-related outliers. The empirical analysis is then repeated on the updated dataset, with results presented in columns (7) and (8) of Table 5. After excluding this special period, the CE1 reduction effect of the green fiscal policy remains significant, as does its promotional effect on CE2.

Placebo test

The DID model rests on the common trends assumption: in the absence of the intervention, the trends of the treatment and control groups would have been similar [65]. A placebo test can probe the validity of this assumption. If significant 'intervention effects' appear when the test is run before the intervention or at irrelevant time points, the effects estimated by DID are actually driven by other unobserved factors rather than the intervention itself [66]. Referencing placebo practices in the existing literature [59], this paper tests for the impact of unobservable factors on the estimation results. The study randomizes the assignment of the ECER policy across cities, selecting treatment groups at random from the 248 cities, with the remaining cities serving as controls. This randomization is repeated 500 times to generate a distribution of regression coefficients, with the dashed line in the graph representing the actual regression coefficient, as shown in Fig. 8. Figure 8a presents the placebo test for CE1, and Fig. 8b for CE2. The mean of the randomized coefficients is close to 0, and the actual estimates lie far outside this placebo distribution. This indicates that, after excluding the interference of random factors, the green fiscal policy has a significant CE1 reduction effect and significantly promotes CE2.
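The placebo procedure can be sketched as a simple permutation loop: pilot status is reassigned at random, the DID coefficient is re-estimated, and the true estimate is compared against the resulting distribution. The sketch below uses synthetic data and an arbitrary pilot-city count; it only illustrates the logic behind Fig. 8 and may take a few minutes to run.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel of 248 cities, 2003-2019, with a noise-only outcome.
rng = np.random.default_rng(4)
cities = np.arange(248)
df = pd.DataFrame([(c, y) for c in cities for y in range(2003, 2020)],
                  columns=["city", "year"])
df["lnce"] = rng.normal(size=len(df))

placebo_betas = []
for _ in range(500):  # 500 random reassignments, as in the text
    fake_treated = rng.choice(cities, size=60, replace=False)  # arbitrary pilot count
    df["fake_did"] = (df["city"].isin(fake_treated) & (df["year"] >= 2011)).astype(int)
    beta = smf.ols("lnce ~ fake_did + C(city) + C(year)",
                   data=df).fit().params["fake_did"]
    placebo_betas.append(beta)

# The placebo distribution should be centred near 0; a true estimate far in
# its tail indicates the measured effect is not driven by random factors.
print(np.mean(placebo_betas), np.percentile(placebo_betas, [2.5, 97.5]))
```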

Figure 8. Placebo test for CE1 (a) and CE2 (b).

Mechanism test

The preceding analysis indicates that the ECER policy has significantly promoted CE1 reduction and the improvement of CE2 in pilot cities. Accordingly, this study further explores the policy's mechanism of action with the following model, which regresses the mediator on the policy variable:

$$GI_{it} = \varphi_0 + \varphi_1 \left( Treated_i \times Post_{it} \right) + \varphi_2\, Controls_{it} + \nu_i + \tau_t + \varepsilon_{it} \qquad (3)$$

GI refers to green innovation. Following the existing literature, this study uses the number of green invention patent grants (lngi_invention) and the total number of green patents per 10,000 people (lnpgi_total) as proxy variables for green innovation [67, 68]. Because of the well-documented causal inference flaws of the three-stage mediation test [69], this study follows the mediation effect test model of Niu et al. [70] and employs the Sobel test to further evaluate the regression results, enhancing the completeness and credibility of the mechanism test [71]. The regression results are shown in Table 6. Columns (1) and (4) report the impact of the ECER policy on green innovation, with significant results. This supports hypothesis H2: green fiscal policies promote CE1 reduction and CE2 by fostering green innovation. Moreover, the Sobel Z statistics exceed 2.58, indicating that the mediating variable has sufficiently strong explanatory power for the total effect.
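The Sobel statistic itself is a simple function of the two mediation-path coefficients and their standard errors. The sketch below computes it with placeholder values; in practice the inputs a, se_a, b, se_b would come from the policy-to-innovation and innovation-to-outcome regressions.

```python
import numpy as np

def sobel_z(a: float, se_a: float, b: float, se_b: float) -> float:
    """Sobel test statistic for an indirect effect a*b.

    a, se_a: coefficient and SE of the policy on the mediator (green innovation)
    b, se_b: coefficient and SE of the mediator on the outcome
    """
    return (a * b) / np.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

# Placeholder values, purely for illustration:
z = sobel_z(a=0.12, se_a=0.03, b=-0.25, se_b=0.06)
print(z, abs(z) > 2.58)  # |z| > 2.58 corresponds to significance at the 1% level
```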

Heterogeneity analysis

By city grade.

In the process of urbanization and industrialization, a city's tier often reflects its level of economic development, capacity for technological innovation, infrastructure completeness, and the comprehensiveness of its public services. This paper categorizes the sample cities by tier into higher-tier cities (provincial capitals, sub-provincial cities, and municipalities directly under the Central Government) and general cities, and runs the regressions separately. The results in columns (1), (2), (6), and (7) of Table 7 show that in higher-tier pilot cities the coefficients of the ECER policy on CE1 and CE2 are −0.098 and 0.118, respectively, both significant at the 1% level, whereas in general cities the coefficients are smaller in absolute value and not significant. The ECER policy's effects on CE1 reduction and CE2 enhancement are therefore more pronounced in higher-tier cities. With their advanced economic structures, abundant fiscal resources, high levels of technological innovation, and strong policy enforcement capabilities, higher-tier cities provide conditions under which the green fiscal policy is more effective. First, economically developed higher-tier cities have more ample fiscal funds and investment capacity, which can support large-scale green infrastructure construction and green technology R&D, directly reducing urban CE1 and improving energy use efficiency. Second, technological innovation is a key factor in improving CE2: as centers of innovation and information exchange, higher-tier cities are more likely to attract and gather high-tech companies and research institutions, promoting the development and application of green technologies and effectively reducing CE1. Additionally, higher-tier cities usually have more comprehensive laws, regulations, and policy enforcement mechanisms, ensuring the effective implementation and regulation of green fiscal policies, and their residents often have stronger environmental awareness and a preference for green consumption, which helps to create a favorable social atmosphere for the policy. Finally, owing to their strong regional influence and exemplary role, higher-tier cities can promote green transformation and low-carbon development in surrounding areas and even nationwide through policy guidance and market incentives, further amplifying the CE1 reduction effect and the impact on CE2 of green fiscal policies.

By geographic location

Given the significant differences in economic development levels, resource endowments, and institutional environments across China's regions, the implementation effects of the ECER policy may be heterogeneous. This paper therefore divides the sample into eastern, central, and western regions and runs the regressions separately. The results are presented in Table 7: columns (3) to (5) and (8) to (9) report the results for CE1 and CE2, respectively, with columns (3) and (8) representing the eastern region. In the eastern region, the ECER policy significantly promotes carbon reduction and CE2. The policy's effects in the central region are smaller than in the east but still positive, whereas in the western region the policy's effects on carbon reduction and CE2 are not significant.

This analysis reveals that, within the regional development pattern of China, the eastern regions exhibit more significant outcomes in terms of the CE1 reduction effect and the enhancement of CE2 under green fiscal policies compared to the central and western regions. Firstly, as the most economically developed area in China, the eastern region, with its leading total economic output, industrialization, and urbanization levels, provides solid fiscal support and a technological foundation for the implementation of green fiscal policies. This economic advantage enables the eastern region to allocate more resources to the research, development, and application of green technologies, as well as related infrastructure construction, thereby effectively promoting CE1 reduction and energy efficiency improvement. Secondly, environmental policies and regulations in the eastern region are generally stricter and more advanced. Coupled with higher public awareness of environmental protection, this creates a favorable social environment and policy atmosphere for the implementation of green fiscal policies and carbon reduction. Additionally, the industrial structure in the eastern region is more optimized and high-end than in the central and western regions, with a larger share of the service industry and high-tech industries, which typically have lower energy consumption intensity and CE1, facilitating the improvement of overall CE2. Furthermore, as an important gateway for international trade and investment, the eastern region is more open to adopting and introducing advanced green technologies and management practices from abroad, accelerating the pace of green transformation. Lastly, the dense urban network and well-developed transportation and logistics systems in the eastern region provide convenient conditions for the effective implementation of green fiscal policies. Therefore, owing to comprehensive advantages in economic development level, industrial structure, policy environment, technological innovation capability, and infrastructure, the eastern region demonstrates more significant performance in the CE1 reduction effect and the promotion of CE2 under green fiscal policies.

Figure 9 reports the main regression coefficients and error bars from the heterogeneity analysis, clearly illustrating the distribution of the coefficients.

Figure 9. Results of the heterogeneity analysis.

By resource-based city status

Resource-based cities center on industries that extract and process local natural resources, including minerals and forests [72, 73, 74]. Because of their distinctive characteristics, these cities may respond differently to the ECER policy. Accordingly, following the State Council's "National Plan for Sustainable Development of Resource-based Cities (2013–2020)," this paper divides the sample into resource-based and non-resource-based cities and runs separate regressions, with results presented in Table 8. Columns (1) and (2) report the outcomes for CE1, and columns (3) and (4) for CE2. The findings reveal that, compared with resource-based cities, the ECER policy's effect on carbon reduction is more pronounced in non-resource-based cities, as is its effect on the promotion of CE2.

A closer analysis of the disparity between non-resource-based and resource-based cities yields a clear finding: non-resource-based cities, with their diversified industrial structures and lower reliance on highly polluting, energy-intensive heavy industries and mineral resource extraction, demonstrate a stronger capacity to adopt and promote new energy, clean energy, and energy-efficient technologies. This industrial structure not only facilitates effective carbon reduction but also propels a shift in the growth model towards services, high-tech industries, and innovation-driven sectors, which are associated with lower energy consumption and carbon intensity. The potential for the ECER policy to enhance CE2 and reduce CE1 is therefore greater in these cities. In contrast, resource-based cities, owing to their long-standing dependence on resource extraction, exhibit significant inertia in their economic structure, technological level, and employment patterns, which complicates their transition and industrial restructuring and raises its costs. Against this backdrop, non-resource-based cities are more likely to achieve notable successes in implementing the ECER policy than their resource-based counterparts.

Conclusions and policy recommendations

Conclusions.

Based on a city-level dataset covering 2003 to 2019, this paper employs a multi-period difference-in-differences model to explore the impact of the ECER policy on CE1 reduction and CE2, reaching the following conclusions:

First, the ECER policy plays a significant role in reducing CE1 and enhancing CE2. This conclusion remains robust after controlling for factors that might affect the accuracy of the assessment, such as contemporaneous policy interference, sample selection bias, extreme values, and other random factors. The ECER policy thus has important practical implications for mitigating climate change, and its effects are not significantly influenced by these potential interferences.

Second, the ECER policy effectively promotes CE1 reduction and CE2 improvement by incentivizing the research and application of green technologies. This finding underscores the mediating role of green innovation in environmental policies, highlighting that fiscal incentives such as tax breaks and subsidies are crucial for promoting technological innovation and application, and thereby achieving environmental benefits.

Third, the CE1 reduction effect and CE2 enhancement of the ECER policy are more pronounced in economically developed, higher-tier cities and in the eastern regions. These areas have better infrastructure, stronger technological innovation capabilities, more abundant fiscal resources, and greater public environmental awareness, all of which support effective policy implementation. This variation also suggests that policymakers need to consider regional characteristics when implementing such policies to maximize their effectiveness.

Existing literature has explored the role of energy conservation and emission reduction fiscal policies in environmental protection, such as green credit [37], ESG performance [75], green total factor carbon efficiency [36], and sustainable urban development [38]. These studies report the positive impact of such policies on the environment. However, they do not directly examine the impact of these policies on pollutants. Our study extends the existing literature by investigating the relationship between these policies and carbon emissions. Green fiscal policies significantly promote the reduction of carbon emissions (CE1) and the improvement of carbon efficiency (CE2) through economic incentives, price mechanisms, infrastructure support, and increasing public environmental awareness. Specifically, these policies encourage the research and application of green technologies, change consumer and producer behavior, optimize energy consumption structures, support related infrastructure construction, and increase public participation in low-carbon living. Additionally, green fiscal policies promote sustainable economic growth by directing funds towards low-carbon and green industries, fostering the development of green technologies and industries. Overall, green fiscal policies have not only achieved significant environmental protection results but also played a crucial role in realizing the dual goals of economic growth and environmental protection.

Despite the significant findings, our study has some limitations. Firstly, the data is limited to 248 cities from 2003 to 2019, which may not fully capture the long-term impact of ECER policies. Secondly, reliance on existing data may introduce biases, as not all relevant factors could be considered. Future research could address these limitations by expanding the dataset, including more diverse regions, and employing alternative methods to validate these findings.

Policy recommendations

Based on the above analysis, the policy recommendations of this paper are as follows:

Continue to increase fiscal support. The government should continue to enhance fiscal support for the ECER policy, including expanding the scope of tax reductions and increasing the level of fiscal subsidies, especially for those projects and technologies that can significantly improve energy efficiency and reduce CE1. This will further stimulate the innovation motivation of enterprises and research institutions, accelerating the research and development (R&D) and application of low-carbon technologies.

Optimize policy design and implementation mechanisms. Considering the robustness of the ECER policy effects, the government should further refine the policy design to ensure that measures precisely target sectors and aspects with high CE1. Concurrently, it is crucial to establish and enhance the supervision mechanism for policy execution, ensuring effective implementation of policy measures. This approach also necessitates timely adjustments and optimizations of the policy to tackle new challenges effectively.

Establish a dedicated Green Technology Innovation Fund. This fund aims to provide financial support specifically for R&D and promotion of green technologies with high CE2. By offering startup capital, R&D subsidies, and rewards for the successful commercialization of green technologies, the fund can not only stimulate the innovation drive of enterprises and research institutions but also accelerate the transformation of green technologies from theory to practice. Consequently, this will promote CE1 reduction and CE2 enhancement on a broader scale. This initiative directly responds to the importance of fiscal incentive measures for promoting technological innovation and application emphasized in the research, ensuring the ECER policy maximizes its benefits in promoting green development.

Differentiated policy design. Given the variations in the effects of the ECER policy across different regions, policymakers should design and implement differentiated energy-saving and emission reduction policies based on regional factors such as economic development level, industrial structure, and resource endowment. For economically more developed areas with a stronger technological foundation, CE1 reduction can be promoted by introducing higher standards for environmental protection and mechanisms for rewarding technological innovation. For regions that are relatively less economically developed, the focus should be on providing technical support and financial assistance to enhance their capacity for CE1 reduction.

Green fiscal policies play a crucial role in reducing carbon emissions and promoting sustainable economic growth, but their impact on social and income inequality needs careful consideration. Firstly, while policies like carbon taxes are effective in reducing emissions, they may place a significant burden on low-income households, as a larger proportion of their income goes towards energy and basic necessities. To mitigate this inequality, governments can implement redistributive measures, such as using carbon tax revenues for direct subsidies or tax reductions for low-income families, ensuring social equity while achieving emission reductions. Secondly, green fiscal policies encourage investment in green technologies and the implementation of green projects. However, these incentives often favor businesses and wealthy families capable of making such investments, potentially widening income disparities. Therefore, policy design should consider inclusive growth by providing green job training and encouraging small and medium-sized enterprises to participate in green projects, ensuring that various social strata benefit from the green economy. Furthermore, in terms of public investment, governments should prioritize low-income and marginalized communities, ensuring they also benefit from the construction of green infrastructure. This includes prioritizing the development of public transportation and renewable energy projects in these areas, thereby reducing living costs and improving the quality of life for these communities. By adopting these redistributive measures and inclusive policy designs, green fiscal policies can achieve the goals of environmental protection and economic growth while effectively mitigating their negative impacts on social and income inequality, promoting sustainable and inclusive development.

When evaluating various policy tools for achieving carbon reduction goals, it is evident that carbon taxes, renewable energy subsidies, ECER policies, emissions trading systems, and energy efficiency standards each have their unique advantages (see Table 9 ). Carbon taxes leverage price mechanisms to encourage emissions reduction and provide redistribution opportunities, while renewable energy subsidies promote technological advancement and market development. ECER policies offer direct incentives and support for infrastructure, resulting in long-term environmental benefits. Emissions trading systems combine cap-and-trade controls with market flexibility, and energy efficiency standards provide direct pathways to emissions reduction. In practical applications, the integrated use of multiple policy tools, fully utilizing their respective advantages, can more effectively achieve carbon reduction goals and drive the transition to a low-carbon economy. Policymakers must consider equity, economic impact, and public acceptance when designing these policies to balance environmental protection with economic growth. Through careful integration and balanced implementation, green fiscal policies can significantly reduce carbon emissions while promoting sustainable and inclusive economic development.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Martin, G. & Saikawa, E. Effectiveness of state climate and energy policies in reducing power-sector CO2 emissions. Nat. Clim. Change 7 , 912–919 (2017).


Soergel, B. et al. A sustainable development pathway for climate action within the UN 2030 Agenda. Nat. Clim. Change 11 , 656–664 (2021).

Terhaar, J., Frölicher, T. L., Aschwanden, M. T., Friedlingstein, P. & Joos, F. Adaptive emission reduction approach to reach any global warming target. Nat. Clim. Change 12 , 1136–1142 (2022).

Gidden, M. J. et al. Aligning climate scenarios to emissions inventories shifts global benchmarks. Nature 624 , 102–108 (2023).


Brown, P. T. et al. Climate warming increases extreme daily wildfire growth risk in California. Nature 621 , 760–766 (2023).


Liu, Y., Cai, W., Lin, X. & Li, Z. Increased extreme swings of Atlantic intertropical convergence zone in a warming climate. Nat. Clim. Change 12 , 828–833 (2022).

Tebaldi, C. et al. Extreme sea levels at different global warming levels. Nat. Clim. Change 11 , 746–751 (2021).

Chaobo, Z. & Qi, S. Can carbon emission trading policy break China’s urban carbon lock-in?. J. Environ. Manag. 353 , 120129 (2024).


Wu, S. Smart cities and urban household carbon emissions: A perspective on smart city development policy in China. J. Clean Prod. 373 , 133877 (2022).

Zhang, H., Feng, C. & Zhou, X. Going carbon-neutral in China: Does the low-carbon city pilot policy improve carbon emission efficiency. Sustain. Prod. Consump. 33 , 312–329 (2022).

Shan, Y. et al. Impacts of COVID-19 and fiscal stimuli on global emissions and the Paris Agreement. Nat. Clim. Chang. 11 , 200–206 (2021).

Sun, L. & Feng, N. Research on fiscal policies supporting green and low-carbon transition to promote energy conservation and emission reduction in cities: Empirical evidence from China. J. Clean. Prod. 430 , 139688 (2023).

Zhu, X. & Lu, Y. Open economy, fiscal expenditure of environmental protection and pollution governance: Evidences from China’s provincial and industrial panel data. China Popul. Resour. Environ. 27 , 10–18 (2017).


Fan, H. & Liang, C. The pollutant and carbon emissions reduction synergistic effect of green fiscal policy: Evidence from China. Financ. Res. Lett. 58 , 104446 (2023).

Waheed, R., Sarwar, S. & Wei, C. The survey of economic growth, energy consumption and carbon emission. Energy Rep. 5 , 1103–1115 (2019).

Esso, L. J. & Keho, Y. Energy consumption, economic growth and carbon emissions: Cointegration and causality evidence from selected African countries. Energy 114 , 492–497 (2016).

Li, J. & Li, S. Energy investment, economic growth and carbon emissions in China-Empirical analysis based on spatial Durbin model. Energy Policy 140 , 111425 (2020).

Yin, J., Zheng, M. & Chen, J. The effects of environmental regulation and technical progress on CO 2 Kuznets curve: An evidence from China. Energy Policy 77 , 97–108 (2015).

Al-Mulali, U., Saboori, B. & Ozturk, I. Investigating the environmental Kuznets curve hypothesis in Vietnam. Energy Policy 76 , 123–131 (2015).

Danish, Hassan, S. T., Baloch, M. A., Mahmood, N. & Zhang, J. Linking economic growth and ecological footprint through human capital and biocapacity. Sustain. Cities Soc. 47 , 101516 (2019).

Hassan, S. T., Batool, B., Zhu, B. & Khan, I. Environmental complexity of globalization, education, and income inequalities: New insights of energy poverty. J. Clean. Prod. 340 , 130735 (2022).

Hassan, S. T., Danish, Khan, S.U.-D., Xia, E. & Fatima, H. Role of institutions in correcting environmental pollution: An empirical investigation. Sustain. Cities Soc. 53 , 101901 (2020).

Awan, A., Sadiq, M., Hassan, S. T., Khan, I. & Khan, N. H. Combined nonlinear effects of urbanization and economic growth on CO2 emissions in Malaysia. An application of QARDL and KRLS. Urban Clim. 46 , 101342 (2022).

Khan, K. & Khurshid, A. Are technology innovation and circular economy remedy for emissions? Evidence from the Netherlands. Environ. Dev. Sustain. 26 , 1435–1449 (2024).

Zhu, Q. & Peng, X. The impacts of population change on carbon emissions in China during 1978–2008. Environ. Impact Assess. Rev. 36 , 1–8 (2012).

Khurshid, A., Rauf, A., Qayyum, S., Calin, A. C. & Duan, W. Green innovation and carbon emissions: the role of carbon pricing and environmental policies in attaining sustainable development targets of carbon mitigation—Evidence from Central-Eastern Europe. Environ. Dev. Sustain. 25 , 8777–8798 (2023).

Li, Z. et al. Climate change and the UN-2030 agenda: Do mitigation technologies represent a driving factor? New evidence from OECD economies. Clean Technol. Environ. Policy 25 , 195–209 (2023).

Hassan, S. T., Batool, B., Sadiq, M. & Zhu, B. How do green energy investment, economic policy uncertainty, and natural resources affect greenhouse gas emissions? A Markov-switching equilibrium approach. Environ. Impact Assess. Rev. 97 , 106887 (2022).

Liao, N., Luo, X. & He, Y. Could environmental regulation effectively boost the synergy level of carbon emission reduction and air pollutants control? Evidence from industrial sector in China. Atmos. Pollut. Res. 15 , 102173 (2024).

Meng, X., Zhang, M. & Zhao, Y. Environmental regulation and green transition: Quasi-natural experiment from China’s efforts in sulfur dioxide emissions control. J. Clean Prod. 434 , 139741 (2024).

Chen, J. & Hu, L. Does environmental regulation drive economic growth through technological innovation: Application of nonlinear and spatial spillover effect. Sustainability 14 , 16455 (2022).

Zhang, Y. & Li, X. Environmental regulation and high-quality economic growth: Quasi-natural experimental evidence from China. Environ. Sci. Pollut. Res. 29 , 85389–85401 (2022).

Wang, X., Chai, Y., Wu, W. & Khurshid, A. The empirical analysis of environmental regulation’s spatial spillover effects on green technology innovation in China. Int. J. Environ. Res. Public Health 20 , 1069 (2023).


Khurshid, A., Huang, Y., Cifuentes-Faura, J. & Khan, K. Beyond borders: Assessing the transboundary effects of environmental regulation on technological development in Europe. Technol. Forecast. Soc. Change 200 , 123212 (2024).

Zhu, Y., Han, S., Zhang, Y. & Huang, Q. Evaluating the effect of government emission reduction policy: Evidence from demonstration cities in China. Int. J. Environ. Res. Public Health 18 , 4649 (2021).

Li, G. & Wang, X. Can green fiscal policy improve green total factor carbon efficiency? Evidence from China. J. Environ. Plan. Manag. https://doi.org/10.1080/09640568.2024.2352554 (2024).

Cheng, Y. & Xu, Z. Fiscal policy promotes corporate green credit: Experience from the construction of energy conservation and emission reduction demonstration cities in China. Green Financ. 6 , 1–23 (2024).

Lin, B. & Zhu, J. Impact of energy saving and emission reduction policy on urban sustainable development: Empirical evidence from China. Appl. Energy 239 , 12–22 (2019).

Liu, S., Xu, P. & Chen, X. Green fiscal policy and enterprise green innovation: evidence from quasi-natural experiment of China. Environ. Sci. Pollut. Res. 30 , 94576–94593 (2023).

Nordhaus, W. The Climate Casino: Risk, Uncertainty, and Economics for a Warming World (Yale University Press, 2013). https://doi.org/10.2307/j.ctt5vkrpp .


Green, J. F. Does carbon pricing reduce emissions? A review of ex-post analyses. Environ. Res. Lett. 16 , 043004 (2021).

Ma, B., Sharif, A., Bashir, M. & Bashir, M. F. The dynamic influence of energy consumption, fiscal policy and green innovation on environmental degradation in BRICST economies. Energy Policy 183 , 113823 (2023).

Hu, Y., Ding, Y., Liu, J., Zhang, Q. & Pan, Z. Does carbon mitigation depend on green fiscal policy or green investment?. Environ. Res. Lett. 18 , 045005 (2023).

Nie, C., Li, R., Feng, Y. & Chen, Z. The impact of China’s energy saving and emission reduction demonstration city policy on urban green technology innovation. Sci. Rep. 13 , 15168 (2023).

Fischedick, M. et al. Industry. In: Climate Change 2014: Mitigation of Climate Change. Contribution of Working Group III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Technical Report. (2014).

Sorrell, S. Reducing energy demand: A review of issues, challenges and approaches. Renew. Sustain. Energy Rev. 47 , 74–82 (2015).

Ghisellini, P., Cialani, C. & Ulgiati, S. A review on circular economy: The expected transition to a balanced interplay of environmental and economic systems. J. Clean. Prod. 114 , 11–32 (2016).

Leung, D. Y. C., Caramanna, G. & Maroto-Valer, M. M. An overview of current status of carbon dioxide capture and storage technologies. Renew. Sustain. Energy Rev. 39 , 426–443 (2014).

Stiglitz, J. E. et al. Report of the High-Level Commission on Carbon Prices. 1–61 (2017) https://doi.org/10.7916/d8-w2nc-4103 .

Callaway, B. & Sant’Anna, P. H. C. Difference-in-differences with multiple time periods. J. Econom. 225 , 200–230 (2021).


Beck, T., Levine, R. & Levkov, A. Big bad banks? The winners and losers from bank deregulation in the United States. J. Financ. 65 , 1637–1667 (2010).

Shan, Y. et al. City-level emission peak and drivers in China. Sci. Bull. 67 , 1910–1920 (2022).

Shan, Y. et al. City-level climate change mitigation in China. Sci. Adv. 4 , eaaq0390 (2018).

Eggleston, H. S., Buendia, L., Miwa, K., Ngara, T. & Tanabe, K. 2006 IPCC Guidelines for National Greenhouse Gas Inventories. (2006).

Shan, Y. et al. Methodology and applications of city level CO2 emission accounts in China. J. Clean. Prod. 161 , 1215–1225 (2017).

Xu, Q., Zhong, M. & Cao, M. Does digital investment affect carbon efficiency? Spatial effect and mechanism discussion. Sci. Total Environ. 827 , 154321 (2022).


Sun, W. & Huang, C. How does urbanization affect carbon emission efficiency? Evidence from China. J. Clean. Prod. 272 , 122828 (2020).

Opoku, E. E. O. & Boachie, M. K. The environmental impact of industrialization and foreign direct investment. Energy Policy 137 , 111178 (2020).

Li, P., Lu, Y. & Wang, J. Does flattening government improve economic performance? Evidence from China. J. Dev. Econ. 123 , 18–37 (2016).

Guo, Q. & Zhong, J. The effect of urban innovation performance of smart city construction policies: Evaluate by using a multiple period difference-in-differences model. Technol. Forecast. Soc. Chang. 184 , 122003 (2022).

Guo, Q., Wang, Y. & Dong, X. Effects of smart city construction on energy saving and CO 2 emission reduction: Evidence from China. Appl. Energy 313 , 118879 (2022).

Abadie, A. & Imbens, G. W. Matching on the estimated propensity score. Econometrica 84 , 781–807 (2016).

Lyu, C., Xie, Z. & Li, Z. Market supervision, innovation offsets and energy efficiency: Evidence from environmental pollution liability insurance in China. Energy Policy 171 , 113267 (2022).

Park, B. U., Simar, L. & Zelenyuk, V. Local likelihood estimation of truncated regression and its partial derivatives: Theory and application. J. Econ. 146 , 185–198 (2008).

Baker, A. C., Larcker, D. F. & Wang, C. C. Y. How much should we trust staggered difference-in-differences estimates?. J. Financ. Econ. 144 , 370–395 (2022).

Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W. & Wager, S. Synthetic difference-in-differences. Am. Econ. Rev. 111 , 4088–4118 (2021).

Cho, J. H. & Sohn, S. Y. A novel decomposition analysis of green patent applications for the evaluation of R&D efforts to reduce CO2 emissions from fossil fuel energy consumption. J. Clean. Prod. 193 , 290–299 (2018).

Zhao, P., Lu, Z., Kou, J. & Du, J. Regional differences and convergence of green innovation efficiency in China. J. Environ. Manag. 325 , 116618 (2023).

Jiang, T. Mediating and moderating effects in empirical studies of causal inference. China Ind. Econ. https://doi.org/10.19581/j.cnki.ciejournal.2022.05.005 (2022).

Niu, Z., Xu, C. & Wu, Y. Business environment optimization, human capital effect and firm labor productivity. J. Manag. World 39 , 83–100 (2023).

Aguinis, H., Edwards, J. R. & Bradley, K. J. Improving our understanding of moderation and mediation in strategic management research. Org. Res. Methods 20 , 665–685 (2017).

Chen, W. et al. Exploring the industrial land use efficiency of China’s resource-based cities. Cities 93 , 215–223 (2019).

Li, B., Han, Y., Wang, C. & Sun, W. Did civilized city policy improve energy efficiency of resource-based cities? Prefecture-level evidence from China. Energy Policy 167 , 113081 (2022).

Wang, Y. et al. Has the sustainable development planning policy promoted the green transformation in China’s resource-based cities. Resour. Conserv. Recycl. 180 , 106181 (2022).

Miao, S., Tuo, Y., Zhang, X. & Hou, X. Green fiscal policy and ESG performance: Evidence from the energy-saving and emission-reduction policy in China. Energies 16 , 3667 (2023).


Funding

This study is supported by the National Social Science Fund Major Project, "Research on the Policy System and Implementation Path to Accelerate the Formation of New Productive Forces" (Project Number: 23&ZD069).

Author information

Authors and affiliations.

School of Public Finance and Administration, Harbin University of Commerce, Harbin, 150028, China

Shuguang Wang & Zequn Zhang

School of Finance, Harbin University of Commerce, Harbin, 150028, China

Zhicheng Zhou & Shen Zhong


Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Z.C.Z. and S.Z. The first draft of the manuscript was written by Z.Q.Z., and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Shen Zhong .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Wang, S., Zhang, Z., Zhou, Z. et al. The carbon emission reduction effect of green fiscal policy: a quasi-natural experiment. Sci Rep 14 , 20317 (2024). https://doi.org/10.1038/s41598-024-71728-1


Received: 07 March 2024

Accepted: 30 August 2024

Published: 02 September 2024

DOI: https://doi.org/10.1038/s41598-024-71728-1


Keywords: Carbon emission reduction; Fiscal policies for energy conservation and emission reduction; Multi-period difference-in-differences method; Quasi-natural experiment




The Limitations of Quasi-Experimental Studies, and Methods for Data Analysis When a Quasi-Experimental Research Design Is Unavoidable

Chittaranjan Andrade

Dept. of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bengaluru, Karnataka, India.

A quasi-experimental (QE) study is one that compares outcomes between intervention groups where, for reasons related to ethics or feasibility, participants are not randomized to their respective interventions; an example is the historical comparison of pregnancy outcomes in women who did versus did not receive antidepressant medication during pregnancy. QE designs are sometimes used in noninterventional research, as well; an example is the comparison of neuropsychological test performance between first degree relatives of schizophrenia patients and healthy controls. In QE studies, groups may differ systematically in several ways at baseline, itself; when these differences influence the outcome of interest, comparing outcomes between groups using univariable methods can generate misleading results. Multivariable regression is therefore suggested as a better approach to data analysis; because the effects of confounding variables can be adjusted for in multivariable regression, the unique effect of the grouping variable can be better understood. However, although multivariable regression is better than univariable analyses, there are inevitably inadequately measured, unmeasured, and unknown confounds that may limit the validity of the conclusions drawn. Investigators should therefore employ QE designs sparingly, and only if no other option is available to answer an important research question.

If we wish to study how antidepressant drug treatment affects outcomes in pregnancy, we should ideally randomize depressed pregnant women to receive an antidepressant drug or placebo; this is a randomized controlled trial (RCT) research design. However, because ethics committees are unlikely to approve such RCTs, researchers can only examine pregnancy outcomes (prospectively or retrospectively) in women who did versus did not receive antidepressant drugs; this is a quasi-experimental (QE) research design. A QE study is one that compares outcomes between intervention groups where, for reasons related to ethics or feasibility, participants are not randomized to their respective interventions.

QE studies are problematic because, when participants are not randomized to intervention versus control groups, systematic biases may influence group membership. For example, women who are prescribed and who accept antidepressant medications during pregnancy are likely to be more severely ill than those who are not prescribed or those who do not accept antidepressant medications during pregnancy. So, if adverse pregnancy outcomes are commoner in the antidepressant group, they may be consequences of genetic, physiological, and/or behavioral features that characterize severe depression rather than the antidepressant treatment, itself.

A statistical approach to dealing with such confounds is to perform a regression analysis where pregnancy outcome is the dependent variable and antidepressant treatment, age, sex, socioeconomic status, medical history, family history, smoking history, drinking history, history of use of other substances, nutrition, history of infection during pregnancy, and dozens of other important variables that can influence pregnancy outcomes are independent variables. In such a regression, antidepressant treatment is the independent variable of interest, and the remaining independent variables are confounders that are adjusted for in the regression so that the unique effect of antidepressant treatment on pregnancy outcomes can be better identified. Propensity score matching refines the approach to analysis [1].
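As an illustration of this adjustment strategy, the sketch below fits a logistic regression of a binary pregnancy outcome on the exposure plus a handful of confounders, using simulated data and hypothetical variable names; the exponentiated exposure coefficient is the adjusted odds ratio.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data with hypothetical variable names (a tiny subset of the
# many confounders listed above).
rng = np.random.default_rng(5)
n = 1000
df = pd.DataFrame({
    "antidepressant": rng.integers(0, 2, n),
    "age": rng.normal(30, 5, n),
    "ses": rng.integers(1, 4, n),      # socioeconomic status, 3 categories
    "smoking": rng.integers(0, 2, n),
})
logit = 0.3 * df["antidepressant"] + 0.05 * (df["age"] - 30) - 1.0
df["adverse_outcome"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Logistic regression: exposure of interest plus confounders as covariates.
fit = smf.logit(
    "adverse_outcome ~ antidepressant + age + C(ses) + smoking", data=df
).fit(disp=0)
print(np.exp(fit.params["antidepressant"]))  # confounder-adjusted odds ratio
```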

Many investigators use QE designs to answer their research questions, though not necessarily as an "experiment" with an intervention. For example, Thomas et al. [2] compared psychosocial dysfunction and family burden between outpatients diagnosed with schizophrenia and those diagnosed with obsessive-compulsive disorder (OCD). Obviously, it is not feasible to randomize patients to have schizophrenia or OCD. So, in their analysis, Thomas et al. [2] first examined whether the two groups were comparable on important sociodemographic and clinical variables. They found that the groups did not differ on, for example, age, family income, and duration of illness (but here, and in other QE studies, as well, these baseline comparisons would almost certainly have been underpowered); however, the schizophrenia group was overrepresented for males and for a history of substance abuse. In further analysis, Thomas et al. [2] used t tests to compare dysfunction and burden between the two groups; they found that both dysfunction and burden were greater in schizophrenia than in OCD.

Now, because patients had not been randomized to their respective diagnoses, it is obvious that the groups could have differed in many ways and not in diagnosis, alone. So, separate regressions should have been conducted with dysfunction and with burden as the dependent variable, and with diagnosis, age, sex, socioeconomic status, duration of illness, history of substance abuse, and others as the independent variables. Such an analysis would allow the investigators to understand not only the unique impact of the diagnosis but also the impact of the other sociodemographic and clinical variables on dysfunction and burden.

Note that inadequately measured, unmeasured, and unknown confounds would still have plagued the results. For example, in this study [2], severity of illness was an unmeasured confound. What if the authors had, by chance, sampled more severely ill schizophrenia patients and less severely ill OCD patients? Then, illness severity rather than clinical diagnosis would have explained the greater dysfunction and burden observed in the schizophrenia group. Had they obtained a global rating of illness, they could have included it as an additional, important independent variable in the regression.

In another study with a QE design, Harave et al., 3 like Thomas et al., 2 used univariate tests to compare neurocognitive functioning between unaffected first-degree relatives of schizophrenia patients and healthy controls. More appropriately, because there are likely to be systematic differences between schizophrenia relatives and healthy controls, they should have performed multivariable regressions with the neurocognitive measures as dependent variables, and with group membership and confounders as independent variables. Confounders that could have been considered include age, sex, education, family income, a measure of stress, and histories of smoking, drinking, and other substance use, all of which can directly or indirectly influence neurocognitive performance.

This multivariable regression approach to data analysis in QE designs requires the a priori identification and measurement of all important confounding variables. In such analyses, the sample size for a continuous dependent variable should ideally be at least 10–15 times the number of independent variables. 4 Given that the number of confounding variables to be included is likely to be large, a very large sample may become necessary. Additionally, because studies are never perfect, it is impossible to adjust for inadequately measured, unmeasured, and unknown confounds (though adjusting for whatever is known and measured is better than making no adjustments at all). All said and done, the QE research design is best avoided: it is inherently flawed, and even the best statistical approaches to data analysis remain imperfect. The QE design should be considered only when no other options are available. Readers are referred to Harris et al. 5 for a further discussion of QE studies.
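As a back-of-envelope illustration of the sample size rule of thumb cited above (the covariate count here is an assumed example, not a recommendation):

```python
# The 10-15 subjects-per-predictor guideline applied to a model with one
# exposure of interest and, say, a dozen measured confounders.
n_predictors = 1 + 12
low, high = 10 * n_predictors, 15 * n_predictors
print(f"suggested minimum sample: {low}-{high} participants")  # 130-195
```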

Declaration of Conflicting Interests: The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author received no financial support for the research, authorship, and/or publication of this article.

IMAGES

  1. PPT

    quasi experiment duration

  2. Advantages Of Quasi Experimental Research

    quasi experiment duration

  3. PPT

    quasi experiment duration

  4. 5 Quasi-Experimental Design Examples (2024)

    quasi experiment duration

  5. Types of Quasi Experimental Design

    quasi experiment duration

  6. PPT

    quasi experiment duration

VIDEO

  1. Quasi-experiment and Difference-in-differences(DID)

  2. Quasi Experiment Designs

  3. Quasi-Experiment Design That Use Control Group

  4. QUASI

  5. Artificial gravity

  6. Boiling water freezes before it hits Siberian soil




