
Statistics By Jim

Making statistics intuitive

Quasi Experimental Design Overview & Examples

By Jim Frost

What is a Quasi Experimental Design?

A quasi experimental design is a method for identifying causal relationships that does not randomly assign participants to the experimental groups. Instead, researchers use a non-random process. For example, they might use an eligibility cutoff score or preexisting groups to determine who receives the treatment.


Quasi-experimental research is a design that closely resembles experimental research but is different. The term “quasi” means “resembling,” so you can think of it as a cousin to actual experiments. In these studies, researchers can manipulate an independent variable — that is, they change one factor to see what effect it has. However, unlike true experimental research, participants are not randomly assigned to different groups.

Learn more about Experimental Designs: Definition & Types.

When to Use Quasi-Experimental Design

Researchers typically use a quasi-experimental design because they can’t randomize due to practical or ethical concerns. For example:

  • Practical Constraints : A school interested in testing a new teaching method can only implement it in preexisting classes and cannot randomly assign students.
  • Ethical Concerns : A medical study might not be able to randomly assign participants to a treatment group for an experimental medication when they are already taking a proven drug.

Quasi-experimental designs also come in handy when researchers want to study the effects of naturally occurring events, like policy changes or environmental shifts, where they can’t control who is exposed to the treatment.

Quasi-experimental designs occupy a unique position in the spectrum of research methodologies, sitting between observational studies and true experiments. This middle ground offers a blend of both worlds, addressing some limitations of purely observational studies while navigating the constraints often accompanying true experiments.

A significant advantage of quasi-experimental research over purely observational and correlational studies is that it addresses the issue of directionality: determining which variable is the cause and which is the effect. In quasi-experiments, an intervention typically occurs during the investigation, and the researchers record outcomes before and after it, increasing confidence that the intervention caused the observed changes.

However, it’s crucial to recognize its limitations as well. Controlling confounding variables is a larger concern for a quasi-experimental design than a true experiment because it lacks random assignment.

In sum, quasi-experimental designs offer a valuable research approach when random assignment is not feasible, providing a more structured and controlled framework than observational studies while acknowledging and attempting to address potential confounders.

Types of Quasi-Experimental Designs and Examples

Quasi-experimental studies use various methods, depending on the scenario.

Natural Experiments

This design uses naturally occurring events or changes to create the treatment and control groups. Researchers compare outcomes between those whom the event affected and those it did not affect. Because group membership is not randomized, researchers must also measure potential confounders so that analysts can account for them with statistical controls.

Natural experiments are related to observational studies, but they allow for a clearer causality inference because the external event or policy change provides both a form of quasi-random group assignment and a definite start date for the intervention.

For example, in a natural experiment utilizing a quasi-experimental design, researchers study the impact of a significant economic policy change on small business growth. The policy is implemented in one state but not in neighboring states. This scenario creates an unplanned experimental setup, where the state with the new policy serves as the treatment group, and the neighboring states act as the control group.

Researchers are primarily interested in small business growth rates but need to record various confounders that can impact growth rates. Hence, they record state economic indicators, investment levels, and employment figures. By recording these metrics across the states, they can include them in the model as covariates and control them statistically. This method allows researchers to estimate differences in small business growth due to the policy itself, separate from the various confounders.
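The statistical control described above can be sketched as a regression with covariates. Everything below is simulated for illustration: the data, the confounders, and the 2-point policy effect are assumptions built into the simulation, not real estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical state-quarter observations

# Hypothetical data: 1 = state with the new policy, 0 = neighboring states
policy = rng.integers(0, 2, n).astype(float)
investment = rng.normal(50, 10, n)   # confounder: investment levels
employment = rng.normal(95, 3, n)    # confounder: employment figures

# Simulated outcome: growth depends on the confounders plus a 2-point policy effect
growth = 0.1 * investment + 0.5 * employment + 2.0 * policy + rng.normal(0, 1, n)

# Regression with covariates: growth ~ policy + investment + employment
X = np.column_stack([np.ones(n), policy, investment, employment])
beta, *_ = np.linalg.lstsq(X, growth, rcond=None)

print(f"Estimated policy effect: {beta[1]:.2f}")  # close to the simulated 2.0
```

Including the measured confounders as columns of the design matrix is what "controlling them statistically" means in practice: the policy coefficient is estimated net of their contributions.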

Nonequivalent Groups Design

This method involves matching existing groups that are similar but not identical. Researchers attempt to find groups that are as equivalent as possible, particularly for factors likely to affect the outcome.

For instance, researchers use a nonequivalent groups quasi-experimental design to evaluate the effectiveness of a new teaching method in improving students’ mathematics performance. A school district considering the teaching method is planning the study. Students are already divided into schools, preventing random assignment.

The researchers matched two schools with similar demographics, baseline academic performance, and resources. The school using the traditional methodology is the control, while the other uses the new approach. Researchers are evaluating differences in educational outcomes between the two methods.

They perform a pretest to identify differences between the schools that might affect the outcome and include them as covariates to control for confounding. They also record outcomes before and after the intervention to have a larger context for the changes they observe.
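The pretest-as-covariate strategy is essentially an ANCOVA-style adjustment. The sketch below uses hypothetical student data with a deliberate preexisting gap between the schools and a simulated 4-point effect of the new method; both numbers are assumptions of the simulation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100  # hypothetical students per school

# Hypothetical pretest scores: the nonequivalent schools start out slightly different
pre_control = rng.normal(70, 8, n)
pre_treat = rng.normal(73, 8, n)   # preexisting difference, since there is no randomization

# Simulated posttest: depends on the pretest plus a 4-point effect of the new method
post_control = 0.8 * pre_control + rng.normal(10, 3, n)
post_treat = 0.8 * pre_treat + 4 + rng.normal(10, 3, n)

# ANCOVA-style adjustment: posttest ~ method + pretest
pre = np.concatenate([pre_control, pre_treat])
post = np.concatenate([post_control, post_treat])
method = np.concatenate([np.zeros(n), np.ones(n)])

X = np.column_stack([np.ones(2 * n), method, pre])
beta, *_ = np.linalg.lstsq(X, post, rcond=None)

naive = post_treat.mean() - post_control.mean()  # ignores the preexisting gap
print(f"Adjusted effect of new method: {beta[1]:.2f}")  # near the simulated 4
print(f"Unadjusted difference: {naive:.2f}")
```

The unadjusted difference mixes the method's effect with the schools' preexisting gap; adding the pretest as a covariate removes most of that contamination.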

Regression Discontinuity

This process assigns subjects to a treatment or control group based on a predetermined cutoff point (e.g., a test score). The analysis primarily focuses on participants near the cutoff point, as they are likely similar except for the treatment received. By comparing participants just above and below the cutoff, the design controls for confounders that vary smoothly around the cutoff.

For example, in a regression discontinuity quasi-experimental design focusing on a new medical treatment for depression, researchers use depression scores as the cutoff point. Individuals with depression scores just above a certain threshold are assigned to receive the latest treatment, while those just below the threshold do not receive it. This method creates two closely matched groups: one that barely qualifies for treatment and one that barely misses out.

By comparing the mental health outcomes of these two groups over time, researchers can assess the effectiveness of the new treatment. The assumption is that the only significant difference between the groups is whether they received the treatment, thereby isolating its impact on depression outcomes.
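A common way to implement this comparison is a local linear regression on either side of the cutoff. The sketch below is entirely simulated: the cutoff of 20, the bandwidth of 5, and the 5-point treatment benefit are assumptions, not features of any real study.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000  # hypothetical patients screened

# Hypothetical running variable: depression score; scoring at or above 20 assigns treatment
score = rng.uniform(0, 40, n)
treated = (score >= 20).astype(float)

# Simulated outcome: worsens smoothly with score; treatment improves it by 5 points
outcome = 0.5 * score - 5 * treated + rng.normal(0, 2, n)

# Local linear RD: fit outcome ~ treated + centered score within a narrow bandwidth
bandwidth = 5
mask = np.abs(score - 20) <= bandwidth
X = np.column_stack([
    np.ones(mask.sum()),
    treated[mask],
    score[mask] - 20,                    # running variable, centered at the cutoff
    treated[mask] * (score[mask] - 20),  # allow different slopes on each side
])
beta, *_ = np.linalg.lstsq(X, outcome[mask], rcond=None)

print(f"Estimated treatment effect at the cutoff: {beta[1]:.2f}")  # near the simulated -5
```

Restricting the fit to scores near the cutoff is the design's key idea: patients just above and just below the threshold should be nearly interchangeable except for treatment status.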

Controlling Confounders in a Quasi-Experimental Design

Accounting for confounding variables is a challenging but essential task for a quasi-experimental design.

In a true experiment, the random assignment process equalizes confounders across the groups to nullify their overall effect. It’s the gold standard because it works on all confounders, known and unknown.

Unfortunately, the lack of random assignment can allow differences between the groups to exist before the intervention. These confounding factors might ultimately explain the results rather than the intervention.

Consequently, researchers must use other methods to equalize the groups roughly using matching and cutoff values or statistically adjust for preexisting differences they measure to reduce the impact of confounders.

A key strength of quasi-experiments is their frequent use of “pre-post testing.” This approach involves testing participants before the intervention to check for preexisting differences between groups that could impact the study’s outcome. By identifying these variables early on and including them as covariates, researchers can more effectively control potential confounders in their statistical analysis.

Additionally, researchers frequently track outcomes before and after the intervention to better understand the context for changes they observe.

Statisticians consider these methods to be less effective than randomization. Hence, quasi-experiments fall somewhere in the middle when it comes to internal validity, or how well the study can identify causal relationships versus mere correlation. They’re more conclusive than correlational studies but not as solid as true experiments.

In conclusion, quasi-experimental designs offer researchers a versatile and practical approach when random assignment is not feasible. This methodology bridges the gap between controlled experiments and observational studies, providing a valuable tool for investigating cause-and-effect relationships in real-world settings. Researchers can address ethical and logistical constraints by understanding and leveraging the different types of quasi-experimental designs while still obtaining insightful and meaningful results.



Experimental vs Quasi-Experimental Design: Which to Choose?

Here’s a table that summarizes the similarities and differences between an experimental and a quasi-experimental study design:

|  | Experimental Study (a.k.a. Randomized Controlled Trial) | Quasi-Experimental Study |
| --- | --- | --- |
| Objective | Evaluate the effect of an intervention or a treatment | Evaluate the effect of an intervention or a treatment |
| How do participants get assigned to groups? | Random assignment | Non-random assignment (participants get assigned according to their choosing or that of the researcher) |
| Is there a control group? | Yes | Not always (although, if present, a control group will provide better evidence for the study results) |
| Is there any room for confounding? | No (although post-randomization confounding can occur in randomized controlled trials) | Yes (however, statistical techniques can be used to study causal relationships in quasi-experiments) |
| Level of evidence | A randomized trial is at the highest level in the hierarchy of evidence | A quasi-experiment is one level below the experimental study in the hierarchy of evidence |
| Advantages | Minimizes bias and confounding | Can be used where an experiment is not ethically or practically feasible; can work with smaller sample sizes than randomized trials |
| Limitations | High cost (generally requires a large sample size); ethical limitations; generalizability issues; sometimes practically infeasible | Lower ranking in the hierarchy of evidence, as losing the power of randomization makes the study more susceptible to bias and confounding |

What is a quasi-experimental design?

A quasi-experimental design is a non-randomized study design used to evaluate the effect of an intervention. The intervention can be a training program, a policy change or a medical treatment.

Unlike a true experiment, in a quasi-experimental study the choice of who gets the intervention and who doesn’t is not randomized. Instead, the intervention can be assigned to participants according to their choosing or that of the researcher, or by using any method other than randomness.

Having a control group is not required, but if present, it provides a higher level of evidence for the relationship between the intervention and the outcome.

(For more information, I recommend my other article: Understand Quasi-Experimental Design Through an Example.)

Examples of quasi-experimental designs include:

  • One-Group Posttest Only Design
  • Static-Group Comparison Design
  • One-Group Pretest-Posttest Design
  • Separate-Sample Pretest-Posttest Design

What is an experimental design?

An experimental design is a randomized study design used to evaluate the effect of an intervention. In its simplest form, the participants will be randomly divided into 2 groups:

  • A treatment group: where participants receive the new intervention whose effect we want to study.
  • A control or comparison group: where participants do not receive any intervention at all (or receive some standard intervention).

Randomization ensures that each participant has the same chance of receiving the intervention. Its objective is to equalize the 2 groups, so that any observed difference in the study outcome afterwards can be attributed solely to the intervention – i.e. it removes confounding.

(For more information, I recommend my other article: Purpose and Limitations of Random Assignment.)
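Randomization's equalizing effect is easy to demonstrate with a short simulation. The participants and their ages below are hypothetical; the point is only that shuffling and splitting balances a confounder without anyone measuring it.

```python
import random
import statistics

random.seed(42)

# Hypothetical participants with a confounder (age) that varies widely
participants = [{"id": i, "age": random.randint(20, 70)} for i in range(500)]

# Random assignment: shuffle the pool, then split it in half
random.shuffle(participants)
treatment = participants[:250]
control = participants[250:]

# With enough participants, the average age is nearly identical across groups,
# even though age was never used in the assignment
mean_t = statistics.mean(p["age"] for p in treatment)
mean_c = statistics.mean(p["age"] for p in control)
print(f"Treatment mean age: {mean_t:.1f}, control mean age: {mean_c:.1f}")
```

The same balancing happens simultaneously for every other characteristic, known or unknown, which is why randomization is the gold standard for removing confounding.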

Examples of experimental designs include:

  • Posttest-Only Control Group Design
  • Pretest-Posttest Control Group Design
  • Solomon Four-Group Design
  • Matched Pairs Design
  • Randomized Block Design

When to choose an experimental design over a quasi-experimental design?

Although many statistical techniques can be used to deal with confounding in a quasi-experimental study, in practice, randomization is still the best tool we have to study causal relationships.

Another problem with quasi-experiments is the natural progression of the disease or condition under study. When studying the effect of an intervention over time, one should consider natural changes because these can be mistaken for changes in outcome that are caused by the intervention. Having a well-chosen control group helps deal with this issue.

So, if losing the element of randomness seems like an unwise step down in the hierarchy of evidence, why would we ever want to do it?

This is what we’re going to discuss next.

When to choose a quasi-experimental design over a true experiment?

The issue with randomization is that it is not always achievable.

So here are some cases where using a quasi-experimental design makes more sense than using an experimental one:

  • If being in one group is believed to be harmful for the participants , either because the intervention is harmful (ex. randomizing people to smoking), or the intervention has questionable efficacy, or, on the contrary, it is believed to be so beneficial that it would be unethical to withhold it from the control group (ex. randomizing people to receiving an operation).
  • In cases where interventions act on a group of people in a given location , it becomes difficult to adequately randomize subjects (ex. an intervention that reduces pollution in a given area).
  • When working with small sample sizes , as randomized controlled trials require a large sample size to account for heterogeneity among subjects (i.e. to evenly distribute confounding variables between the intervention and control groups).

Further reading

  • Statistical Software Popularity in 40,582 Research Papers
  • Checking the Popularity of 125 Statistical Tests and Models
  • Objectives of Epidemiology (With Examples)
  • 12 Famous Epidemiologists and Why

Child Care and Early Education Research Connections

Experiments and Quasi-Experiments

This page includes an explanation of the types, key components, validity, ethics, and advantages and disadvantages of experimental design.

An experiment is a study in which the researcher manipulates the level of some independent variable and then measures the outcome. Experiments are powerful techniques for evaluating cause-and-effect relationships. Many researchers consider experiments the "gold standard" against which all other research designs should be judged. Experiments are conducted both in the laboratory and in real life situations.

Types of Experimental Design

There are two basic types of research design:

  • True experiments
  • Quasi-experiments

The purpose of both is to examine the cause of certain phenomena.

True experiments, in which all the important factors that might affect the phenomena of interest are completely controlled, are the preferred design. Often, however, it is not possible or practical to control all the key factors, so it becomes necessary to implement a quasi-experimental research design.

Similarities between true and quasi-experiments:

  • Study participants are subjected to some type of treatment or condition
  • Some outcome of interest is measured
  • The researchers test whether differences in this outcome are related to the treatment

Differences between true experiments and quasi-experiments:

  • In a true experiment, participants are randomly assigned to either the treatment or the control group, whereas they are not assigned randomly in a quasi-experiment
  • In a quasi-experiment, the control and treatment groups differ not only in terms of the experimental treatment they receive, but also in other, often unknown or unknowable, ways. Thus, the researcher must try to statistically control for as many of these differences as possible
  • Because control is lacking in quasi-experiments, there may be several "rival hypotheses" competing with the experimental manipulation as explanations for observed results

Key Components of Experimental Research Design

The manipulation of predictor variables.

In an experiment, the researcher manipulates the factor that is hypothesized to affect the outcome of interest. The factor that is being manipulated is typically referred to as the treatment or intervention. The researcher may manipulate whether research subjects receive a treatment (e.g., antidepressant medicine: yes or no) and the level of treatment (e.g., 50 mg, 75 mg, 100 mg, and 125 mg).

Suppose, for example, a group of researchers was interested in the causes of maternal employment. They might hypothesize that the provision of government-subsidized child care would promote such employment. They could then design an experiment in which some subjects would be provided the option of government-funded child care subsidies and others would not. The researchers might also manipulate the value of the child care subsidies in order to determine if higher subsidy values might result in different levels of maternal employment.

Random Assignment

  • Study participants are randomly assigned to different treatment groups
  • All participants have the same chance of being in a given condition
  • Participants are assigned to either the group that receives the treatment, known as the "experimental group" or "treatment group," or to the group which does not receive the treatment, referred to as the "control group"
  • Random assignment neutralizes factors other than the independent and dependent variables, making it possible to directly infer cause and effect

Random Sampling

Traditionally, experimental researchers have used convenience sampling to select study participants. However, as research methods have become more rigorous, and the problems with generalizing from a convenience sample to the larger population have become more apparent, experimental researchers are increasingly turning to random sampling. In experimental policy research studies, participants are often randomly selected from program administrative databases and randomly assigned to the control or treatment groups.

Validity of Results

The two types of validity of experiments are internal and external. It is often difficult to achieve both in social science research experiments.

Internal Validity

  • When an experiment is internally valid, we are certain that the independent variable (e.g., child care subsidies) caused the outcome of the study (e.g., maternal employment)
  • When subjects are randomly assigned to treatment or control groups, we can assume that the independent variable caused the observed outcomes because the two groups should not have differed from one another at the start of the experiment
  • For example, take the child care subsidy example above. Since research subjects were randomly assigned to the treatment (child care subsidies available) and control (no child care subsidies available) groups, the two groups should not have differed at the outset of the study. If, after the intervention, mothers in the treatment group were more likely to be working, we can assume that the availability of child care subsidies promoted maternal employment

One potential threat to internal validity in experiments occurs when participants either drop out of the study or refuse to participate in the study. If particular types of individuals drop out or refuse to participate more often than individuals with other characteristics, this is called differential attrition. For example, suppose an experiment was conducted to assess the effects of a new reading curriculum. If the new curriculum was so tough that many of the slowest readers dropped out of school, the school with the new curriculum would experience an increase in the average reading scores. The reason they experienced an increase in reading scores, however, is because the worst readers left the school, not because the new curriculum improved students' reading skills.
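The attrition effect described above can be illustrated with a toy calculation. The reading scores are hypothetical, and the curriculum has no effect at all in this example; the average rises purely because the weakest readers leave.

```python
import statistics

# Hypothetical reading scores for a class of 30 students (40, 42, ..., 98)
scores = list(range(40, 100, 2))

before = statistics.mean(scores)

# Differential attrition: the five weakest readers (scores below 50) drop out
remaining = [s for s in scores if s >= 50]
after = statistics.mean(remaining)

# The class average rises even though nobody's reading actually improved
print(f"Mean before attrition: {before:.1f}")
print(f"Mean after attrition:  {after:.1f}")
```

This is why researchers compare who drops out of each group, not just the scores of those who remain.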

External Validity

  • External validity is also of particular concern in social science experiments
  • It can be very difficult to generalize experimental results to groups that were not included in the study
  • Studies that randomly select participants from the most diverse and representative populations are more likely to have external validity
  • The use of random sampling techniques makes it easier to generalize the results of studies to other groups

For example, a research study shows that a new curriculum improved reading comprehension of third-grade children in Iowa. To assess the study's external validity, you would ask whether this new curriculum would also be effective with third graders in New York or with children in other elementary grades.

Glossary terms related to validity:

  • internal validity
  • external validity
  • differential attrition

It is particularly important in experimental research to follow ethical guidelines. Protecting the health and safety of research subjects is imperative. In order to assure subject safety, all researchers should have their project reviewed by the Institutional Review Boards (IRBS). The  National Institutes of Health  supplies strict guidelines for project approval. Many of these guidelines are based on the  Belmont Report  (pdf).

The basic ethical principles:

  • Respect for persons  -- requires that research subjects are not coerced into participating in a study and requires the protection of research subjects who have diminished autonomy
  • Beneficence  -- requires that experiments do not harm research subjects, and that researchers minimize the risks for subjects while maximizing the benefits for them
  • Justice  -- requires that all forms of differential treatment among research subjects be justified

Advantages and Disadvantages of Experimental Design

The environment in which the research takes place can often be carefully controlled. Consequently, it is easier to estimate the true effect of the variable of interest on the outcome of interest.

Disadvantages

It is often difficult to assure the external validity of the experiment, due to the frequently nonrandom selection processes and the artificial nature of the experimental context.

Experimental and Quasi-Experimental Research

Guide Title: Experimental and Quasi-Experimental Research Guide ID: 64

You approach a stainless-steel wall, separated vertically along its middle where two halves meet. After looking to the left, you see two buttons on the wall to the right. You press the top button and it lights up. A soft tone sounds and the two halves of the wall slide apart to reveal a small room. You step into the room. Looking to the left, then to the right, you see a panel of more buttons. You know that you seek a room marked with the numbers 1-0-1-2, so you press the button marked "10." The halves slide shut and enclose you within the cubicle, which jolts upward. Soon, the soft tone sounds again. The door opens again. On the far wall, a sign silently proclaims, "10th floor."

You have engaged in a series of experiments. A ride in an elevator may not seem like an experiment, but it, and each step taken towards its ultimate outcome, are common examples of a search for a causal relationship, which is what experimentation is all about.

You started with the hypothesis that this is in fact an elevator. You proved that you were correct. You then hypothesized that the button to summon the elevator was on the left, which was incorrect, so then you hypothesized it was on the right, and you were correct. You hypothesized that pressing the button marked with the up arrow would not only bring an elevator to you, but that it would be an elevator heading in the up direction. You were right.

As this guide explains, the deliberate process of testing hypotheses and reaching conclusions is an extension of commonplace testing of cause and effect relationships.

Basic Concepts of Experimental and Quasi-Experimental Research

Discovering causal relationships is the key to experimental research. In abstract terms, this means the relationship between a certain action, X, which alone creates the effect Y. For example, turning the volume knob on your stereo clockwise causes the sound to get louder. In addition, you could observe that turning the knob clockwise alone, and nothing else, caused the sound level to increase. You could further conclude that a causal relationship exists between turning the knob clockwise and an increase in volume; not simply because one caused the other, but because you are certain that nothing else caused the effect.

Independent and Dependent Variables

Beyond discovering causal relationships, experimental research further seeks out how much cause will produce how much effect; in technical terms, how the independent variable will affect the dependent variable. You know that turning the knob clockwise will produce a louder noise, but by varying how much you turn it, you see how much sound is produced. On the other hand, you might find that although you turn the knob a great deal, sound doesn't increase dramatically. Or, you might find that turning the knob just a little adds more sound than expected. The amount that you turned the knob is the independent variable, the variable that the researcher controls, and the amount of sound that resulted from turning it is the dependent variable, the change that is caused by the independent variable.

Experimental research also looks into the effects of removing something. For example, if you remove a loud noise from the room, will the person next to you be able to hear you? Or how much noise needs to be removed before that person can hear you?

Treatment and Hypothesis

The term treatment refers to either removing or adding a stimulus in order to measure an effect (such as turning the knob a little or a lot, or reducing the noise level a little or a lot). Experimental researchers want to know how varying levels of treatment will affect what they are studying. As such, researchers often have an idea, or hypothesis, about what effect will occur when they cause something. Few experiments are performed where there is no idea of what will happen. From past experiences in life or from the knowledge we possess in our specific field of study, we know how some actions cause other reactions. Experiments confirm or reconfirm this fact.

Experimentation becomes more complex when the causal relationships researchers seek aren't as clear as in the stereo knob-turning examples. Questions like "Will olestra cause cancer?" or "Will this new fertilizer help this plant grow better?" present more to consider. For example, any number of things could affect the growth rate of a plant: the temperature, how much water or sun it receives, or how much carbon dioxide is in the air. These variables can affect an experiment's results. An experimenter who wants to show that adding a certain fertilizer will help a plant grow better must ensure that it is the fertilizer, and nothing else, affecting the growth patterns of the plant. To do this, as many of these variables as possible must be controlled.

Matching and Randomization

In the example used in this guide (you'll find the example below), we discuss an experiment that focuses on three groups of plants -- one that is treated with a fertilizer named MegaGro, another group treated with a fertilizer named Plant!, and yet another that is not treated with fertilizer (this latter group serves as a "control" group). In this example, even though the designers of the experiment have tried to remove all extraneous variables, results may appear merely coincidental. Since the goal of the experiment is to prove a causal relationship in which a single variable is responsible for the effect produced, the experiment would produce stronger proof if the results were replicated in larger treatment and control groups.

Selecting groups entails assigning subjects in the groups of an experiment in such a way that treatment and control groups are comparable in all respects except the application of the treatment. Groups can be created in two ways: matching and randomization. In the MegaGro experiment discussed below, the plants might be matched according to characteristics such as age, weight and whether they are blooming. This involves distributing these plants so that each plant in one group exactly matches characteristics of plants in the other groups. Matching may be problematic, though, because it "can promote a false sense of security by leading [the experimenter] to believe that [the] experimental and control groups were really equated at the outset, when in fact they were not equated on a host of variables" (Jones, 291). In other words, you may have flowers for your MegaGro experiment that you matched and distributed among groups, but other variables are unaccounted for. It would be difficult to have equal groupings.

Randomization, then, is preferred to matching. This method is based on the statistical principle of normal distribution. Theoretically, any arbitrarily selected group of adequate size will reflect normal distribution. Differences between groups will average out and become more comparable. The principle of normal distribution states that in a population most individuals will fall within the middle range of values for a given characteristic, with increasingly fewer toward either extreme (graphically represented as the ubiquitous "bell curve").

Differences between Quasi-Experimental and Experimental Research

Thus far, we have explained that for experimental research we need:

  • a hypothesis for a causal relationship;
  • a control group and a treatment group;
  • to eliminate confounding variables that might mess up the experiment and prevent displaying the causal relationship; and
  • to have larger groups with a carefully sorted constituency; preferably randomized, in order to keep accidental differences from fouling things up.

But what if we don't have all of those? Do we still have an experiment? Not a true experiment in the strictest scientific sense of the term, but we can have a quasi-experiment, an attempt to uncover a causal relationship, even though the researcher cannot control all the factors that might affect the outcome.

A quasi-experimenter treats a given situation as an experiment even though it is not wholly by design. The independent variable may not be manipulated by the researcher, treatment and control groups may not be randomized or matched, or there may be no control group. The researcher is limited in what he or she can say conclusively.

The significant element of both experiments and quasi-experiments is the measure of the dependent variable, which allows for comparison. Some data is quite straightforward, but other measures, such as level of self-confidence in writing ability, increase in creativity, or increase in reading comprehension, are inescapably subjective. In such cases, quasi-experimentation often involves a number of strategies to measure subjective data, such as rating scales, testing, surveying, and content analysis.

Rating essentially means developing a rating scale with which to evaluate data. In testing, experimenters and quasi-experimenters use ANOVA (analysis of variance) and ANCOVA (analysis of covariance) tests to measure differences between control and experimental groups, as well as correlations between groups.
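
To make the ANOVA idea concrete, here is a minimal pure-Python sketch of the one-way ANOVA F statistic, the ratio of between-group to within-group variability. The essay ratings are invented for illustration; in practice you would use a statistics package.

```python
import statistics

def one_way_anova_f(*groups):
    """One-way ANOVA F statistic: between-group vs. within-group variability."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [statistics.mean(g) for g in groups]

    # Between-group sum of squares: spread of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of the scores inside each group
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))

    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Invented essay ratings for a control group and a treatment group
control = [72, 68, 75, 70, 71, 69]
treatment = [78, 80, 74, 82, 79, 77]
print(f"F = {one_way_anova_f(control, treatment):.2f}")
```

A large F means the groups differ far more between themselves than the scores vary within each group, which is evidence against the difference being chance.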

Since we're on the subject of statistics, note that experimental and quasi-experimental research cannot state beyond a shadow of a doubt that a single cause will always produce a given effect. It can do no more than show a probability that one thing causes another. The probability that a result is due to random chance is an important measure in statistical analysis and in experimental research.
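
That probability can be illustrated with a simple permutation test: shuffle the group labels many times and count how often chance alone produces a difference as large as the one observed. The scores below are invented for illustration.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate how often pure chance produces a mean difference at least
    as large as the one actually observed between the two groups."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                      # reassign labels at random
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(statistics.mean(a) - statistics.mean(b)) >= observed:
            extreme += 1
    return extreme / n_permutations

# Invented scores; a small p-value means the gap is unlikely to be chance alone
control = [72, 68, 75, 70, 71, 69]
treatment = [78, 80, 74, 82, 79, 77]
print(f"p = {permutation_p_value(control, treatment):.4f}")
```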

Example: Causality

Let's say you want to determine whether your new fertilizer, MegaGro, will increase the growth rate of plants. You begin with a plant that will receive the fertilizer. Since the experiment is concerned with showing that MegaGro works, you need another plant, which receives no fertilizer at all, against which to compare how much change your fertilized plant displays. This is what is known as a control group.

Set up with a control group, which will receive no treatment, and an experimental group, which will get MegaGro, you must then address those variables that could invalidate your experiment. This can be an extensive and exhaustive process. You must ensure that you use the same kind of plant; that both groups are put in the same kind of soil; that they receive equal amounts of water and sun; that they receive the same amount of exposure to carbon-dioxide-exhaling researchers; and so on. In short, any other variable that might affect the growth of those plants, other than the fertilizer, must be the same for both plants. Otherwise, you can't prove absolutely that MegaGro is the only explanation for the increased growth of one of those plants.

Such an experiment can be done on more than two groups. You may not only want to show that MegaGro is an effective fertilizer, but that it is better than its competitor brand of fertilizer, Plant! All you need to do, then, is have one experimental group receiving MegaGro, one receiving Plant! and the other (the control group) receiving no fertilizer. Those are the only variables that can be different between the three groups; all other variables must be the same for the experiment to be valid.

Controlling variables allows the researcher to identify conditions that may affect the experiment's outcome. This may lead to alternative explanations that the researcher is willing to entertain in order to isolate only the variables judged significant. In the MegaGro experiment, you may be concerned with how fertile the soil is, but not with the plants' relative position in the window, as you don't think that the amount of shade they get will affect their growth rate. But what if it did? You would have to go about eliminating variables in order to determine which is the key factor. What if one receives more shade than the other and the MegaGro plant, which received more shade, died? This might prompt you to formulate a plausible alternative explanation, which is a way of accounting for a result that differs from what you expected. You would then want to redo the study with equal amounts of sunlight.

Methods: Five Steps

Experimental research can be roughly divided into five phases:

Identifying a research problem

The process starts by clearly identifying the problem you want to study and considering what possible methods might effect a solution. Then you choose the method you want to test, and formulate a hypothesis to predict the outcome of the test.

For example, you may want to improve student essays, but you don't believe that teacher feedback alone is enough. You hypothesize that possible methods for writing improvement include peer workshopping or reading more example essays. Favoring the former, your experiment would try to determine whether peer workshopping improves writing in high school seniors. You state your hypothesis: peer workshopping prior to turning in a final draft will improve the quality of the student's essay.

Planning an experimental research study

The next step is to devise an experiment to test your hypothesis. In doing so, you must consider several factors. For example, how generalizable do you want your end results to be? Do you want to generalize about the entire population of high school seniors everywhere, or just the particular population of seniors at your specific school? This will determine how simple or complex the experiment will be. The amount of time and funding you have will also determine the size of your experiment.

Continuing the example from step one, you may want a small study at one school involving three teachers, each teaching two sections of the same course. The treatment in this experiment is peer workshopping. Each of the three teachers will assign the same essay assignment to both classes; the treatment group will participate in peer workshopping, while the control group will receive only teacher comments on their drafts.

Conducting the experiment

At the start of an experiment, the control and treatment groups must be selected. Whereas the "hard" sciences have the luxury of attempting to create truly equal groups, educators often find themselves forced to conduct their experiments based on self-selected groups, rather than on randomization. As was highlighted in the Basic Concepts section, this makes the study a quasi-experiment, since the researchers cannot control all of the variables.

For the peer workshopping experiment, let's say that it involves six classes and three teachers with a sample of students randomly selected from all the classes. Each teacher will have a class for a control group and a class for a treatment group. The essay assignment is given and the teachers are briefed not to change any of their teaching methods other than the use of peer workshopping. You may see here that this is an effort to control a possible variable: teaching style variance.
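
A sketch of this assignment step, with invented teacher and section names: for each teacher, one of the two sections is picked at random to receive the peer-workshopping treatment, and the other serves as the control.

```python
import random

# Hypothetical sketch of the design above; teacher and section names are
# invented. For each teacher, a coin flip decides which of the two sections
# receives the peer-workshopping treatment.
rng = random.Random(7)

teachers = {
    "Teacher A": ["Section 1", "Section 2"],
    "Teacher B": ["Section 3", "Section 4"],
    "Teacher C": ["Section 5", "Section 6"],
}

assignment = {}
for sections in teachers.values():
    treated = rng.choice(sections)
    for section in sections:
        assignment[section] = "treatment" if section == treated else "control"

for section in sorted(assignment):
    print(f"{section}: {assignment[section]}")
```

Randomizing within each teacher keeps teaching-style variance from being confounded with the treatment, since every teacher contributes one class to each group.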

Analyzing the data

The fourth step is to collect and analyze the data. This is not solely a step where you collect the papers, read them, and declare your methods a success: you must show how successful they were. You must devise a scale by which you will evaluate the data you receive; therefore, you must decide which indicators will, and will not, be important.

Continuing our example, the teachers' grades are first recorded, then the essays are evaluated for a change in sentence complexity, syntactical and grammatical errors, and overall length. Any statistical analysis is done at this time if you choose to do any. Notice here that the researcher has made judgments on what signals improved writing. It is not simply a matter of improved teacher grades, but a matter of what the researcher believes constitutes improved use of the language.
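
As a toy illustration of such indicators, the sketch below computes crude length-based measures for an essay draft. Real studies would rely on validated rubrics and trained raters; these metrics are merely stand-ins.

```python
import re

def essay_metrics(text):
    """Crude, illustrative indicators: word count, sentence count, and
    average sentence length. Stand-ins for a real scoring rubric."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
    }

draft = "The essay was short. It improved after the workshop, and the sentences grew longer."
print(essay_metrics(draft))
```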

Writing the paper/presentation describing the findings

Once you have completed the experiment, you will want to share your findings by publishing an academic paper or giving a presentation. Such papers usually follow the format below, but it need not be followed strictly; sections can be combined or omitted, depending on the structure of the experiment and the journal to which you submit your paper.

  • Abstract : Summarize the project: its aims, participants, basic methodology, results, and a brief interpretation.
  • Introduction : Set the context of the experiment.
  • Review of Literature : Provide a review of the literature in the specific area of study to show what work has been done. It should lead directly to the author's purpose for the study.
  • Statement of Purpose : Present the problem to be studied.
  • Participants : Describe in detail the participants involved in the study (e.g., how many there were). Provide as much information as possible.
  • Materials and Procedures : Clearly describe materials and procedures. Provide enough information so that the experiment can be replicated, but not so much information that it becomes unreadable. Include how participants were chosen, the tasks assigned them, how they were conducted, how data were evaluated, etc.
  • Results : Present the data in an organized fashion. If it is quantifiable, it is analyzed through statistical means. Avoid interpretation at this time.
  • Discussion : After presenting the results, interpret what has happened in the experiment. Base the discussion only on the data collected and as objective an interpretation as possible. Hypothesizing is possible here.
  • Limitations : Discuss factors that affect the results. Here, you can speculate how much generalization, or more likely, transferability, is possible based on results. This section is important for quasi-experimentation, since a quasi-experiment cannot control all of the variables that might affect the outcome of a study. You would discuss what variables you could not control.
  • Conclusion : Synthesize all of the above sections.
  • References : Document works cited in the correct format for the field.

Experimental and Quasi-Experimental Research: Issues and Commentary

Several issues are addressed in this section, including the use of experimental and quasi-experimental research in educational settings, the relevance of the methods to English studies, and ethical concerns regarding the methods.

Using Experimental and Quasi-Experimental Research in Educational Settings

Charting causal relationships in human settings.

Any time a human population is involved, prediction of causal relationships becomes cloudy and, some say, impossible. Many reasons exist for this; for example,

  • researchers in classrooms add a disturbing presence, causing students to act abnormally, consciously or unconsciously;
  • subjects try to please the researcher, just because of an apparent interest in them (known as the Hawthorne Effect); or, perhaps
  • the teacher as researcher is restricted by bias and time pressures.

But such confounding variables don't stop researchers from trying to identify causal relationships in education. Educators naturally experiment anyway, comparing groups, assessing the attributes of each, and making predictions based on an evaluation of alternatives. They look to research to support their intuitive practices, experimenting whenever they try to decide which instruction method will best encourage student improvement.

Combining Theory, Research, and Practice

The goal of educational research lies in combining theory, research, and practice. Educational researchers attempt to establish models of teaching practice, learning styles, curriculum development, and countless other educational issues. The aim is to "try to improve our understanding of education and to strive to find ways to have understanding contribute to the improvement of practice," one writer asserts (Floden 1996, p. 197).

In quasi-experimentation, researchers try to develop models by involving teachers as researchers, employing observational research techniques. Although results of this kind of research are context-dependent and difficult to generalize, they can act as a starting point for further study. The "educational researcher . . . provides guidelines and interpretive material intended to liberate the teacher's intelligence so that whatever artistry in teaching the teacher can achieve will be employed" (Eisner 1992, p. 8).

Bias and Rigor

Critics contend that the educational researcher is inherently biased, sample selection is arbitrary, and replication is impossible. The key to combating such criticism has to do with rigor. Rigor is established through close, proper attention to randomizing groups, time spent on a study, and questioning techniques. This allows more effective application of standards of quantitative research to qualitative research.

Often, teachers cannot wait for piles of experimentation data to be analyzed before using the teaching methods (Lauer and Asher 1988). They ultimately must assess whether the results of a study in a distant classroom are applicable in their own classrooms. And they must continuously test the effectiveness of their methods by using experimental and qualitative research simultaneously. In addition to statistics (quantitative), researchers may perform case studies or observational research (qualitative) in conjunction with, or prior to, experimentation.

Relevance to English Studies

Situations in English studies that might encourage use of experimental methods.

Whenever a researcher would like to see if a causal relationship exists between groups, experimental and quasi-experimental research can be a viable research tool. Researchers in English Studies might use experimentation when they believe a relationship exists between two variables, and they want to show that these two variables have a significant correlation (or causal relationship).

A benefit of experimentation is the ability to control variables, such as the amount of treatment, when it is given, to whom and so forth. Controlling variables allows researchers to gain insight into the relationships they believe exist. For example, a researcher has an idea that writing under pseudonyms encourages student participation in newsgroups. Researchers can control which students write under pseudonyms and which do not, then measure the outcomes. Researchers can then analyze results and determine if this particular variable alone causes increased participation.

Transferability-Applying Results

Experimentation and quasi-experimentation allow researchers to generate transferable results, with acceptance of those results depending on experimental rigor. Transferability is an effective alternative to generalizability, which is difficult to rely upon in educational research. English scholars, reading the results of experiments with a critical eye, ultimately decide whether and how results will be implemented. They may even extend existing research by replicating experiments, generating new results and benefiting from multiple perspectives. These results will strengthen the study or discredit its findings.

Concerns English Scholars Express about Experiments

Researchers should carefully consider if a particular method is feasible in humanities studies, and whether it will yield the desired information. Some researchers recommend addressing pertinent issues combining several research methods, such as survey, interview, ethnography, case study, content analysis, and experimentation (Lauer and Asher, 1988).

Advantages and Disadvantages of Experimental Research: Discussion

In educational research, experimentation is a way to gain insight into methods of instruction. Although teaching is context-specific, results can provide a starting point for further study. Often, a teacher/researcher will have a "gut" feeling about an issue, which can then be explored through experimentation and an examination of causal relationships. Through research, intuition can shape practice.

A preconception exists that information obtained through the scientific method is free of human inconsistencies. But since the scientific method is a matter of human construction, it is subject to human error. The researcher's personal bias may intrude upon the experiment as well. For example, certain preconceptions may dictate the course of the research and affect the behavior of the subjects. The issue is compounded when researchers, although aware of the effect their personal bias exerts on their own research, are pressured to produce research that is accepted in their field of study as "legitimate" experimental research.

The researcher does bring bias to experimentation, but bias does not preclude the ability to be reflective. An ethical researcher thinks critically about results and reports them after careful reflection. Concerns over bias can be leveled against any research method.

Often, the sample may not be representative of a population, because the researcher does not have an opportunity to ensure a representative sample. For example, subjects could be limited to one location, limited in number, studied under constrained conditions and for too short a time.

Despite such inconsistencies in educational research, the researcher has control over the variables, increasing the possibility of more precisely determining the individual effects of each variable. Determining the interaction between variables is also more feasible.

Even so, the experiment may produce artificial results. It can be argued that variables are manipulated so that the experiment measures what researchers want to examine; the results are therefore merely contrived products with no bearing on material reality. Artificial results are difficult to apply in practical situations, making generalization from the results of a controlled study questionable. Experimental research essentially first decontextualizes a single question from a "real world" scenario, studies it under controlled conditions, and then tries to recontextualize the results back onto the "real world" scenario. Results may also be difficult to replicate.

Furthermore, groups in an experiment may not be comparable. Quasi-experimentation in educational research is widespread because not only are many researchers also teachers, but many subjects are also students. With the classroom as laboratory, it is difficult to implement randomizing or matching strategies. Often, students self-select into certain sections of a course on the basis of their own agendas and scheduling needs. Thus when, as often happens, one class receives a treatment and the other serves as a control, the groups may not actually be comparable. As one might imagine, people who register for a class that meets three times a week at eleven o'clock in the morning (young, no full-time job, night people) differ significantly from those who register for one on Monday evenings from seven to ten p.m. (older, full-time job, possibly more highly motivated). Each situation presents different variables, and your group might be completely different from the one in the study. Long-term studies are expensive and hard to reproduce. And although the same hypotheses are often tested by different researchers, various factors complicate attempts to compare or synthesize them. It is nearly impossible to be as rigorous as the natural-sciences model dictates.

Even when randomization of students is possible, problems arise. First, depending on the class size and the number of classes, the sample may be too small for the extraneous variables to cancel out. Second, the study population is not strictly a sample, because the population of students registered for a given class at a particular university is obviously not representative of the population of all students at large. For example, students at a suburban private liberal-arts college are typically young, white, and upper-middle class. In contrast, students at an urban community college tend to be older, poorer, and members of a racial minority. The differences can be construed as confounding variables: the first group may have fewer demands on its time, have less self-discipline, and benefit from superior secondary education. The second may have more demands, including a job and/or children, have more self-discipline, but an inferior secondary education. Selecting a population of subjects which is representative of the average of all post-secondary students is also a flawed solution, because the outcome of a treatment involving this group is not necessarily transferable to either the students at a community college or the students at the private college, nor are they universally generalizable.

When a human population is involved, experimental research becomes concerned with whether behavior can be predicted or studied with validity, for human response can be difficult to measure. Human behavior depends on individual responses. Rationalizing behavior through experimentation does not account for the process of thought, making outcomes of that process fallible (Eisenberg, 1996).

Nevertheless, we perform experiments daily anyway. When we brush our teeth every morning, we are experimenting to see whether this behavior results in fewer cavities. We rely on previous experimentation, and we transfer that experimentation to our daily lives.

Moreover, experimentation can be combined with other research methods to ensure rigor. Other qualitative methods, such as case study, ethnography, observational research, and interviews, can function as preconditions for experimentation or can be conducted simultaneously to add validity to a study.

We have few alternatives to experimentation. Mere anecdotal research, for example, is unscientific, unreplicable, and easily manipulated. Should we rely on Ed walking into a faculty meeting and telling the story of Sally? Sally screamed, "I love writing!" ten times before she wrote her essay and produced a quality paper; therefore, all the other faculty members hearing this anecdote should conclude that all other students should employ a similar technique.

One final disadvantage: frequently, political pressure drives experimentation and forces unreliable results. Specific funding and support may drive the outcomes of experimentation and cause the results to be skewed. Readers of these results may not be aware of such biases and should approach experimentation with a critical eye.

Advantages and Disadvantages of Experimental Research: Quick Reference List

Experimental and quasi-experimental research can be summarized in terms of their advantages and disadvantages. This section combines and elaborates upon many points mentioned previously in this guide.

Advantages:

  • gain insight into methods of instruction
  • intuitive practice shaped by research
  • teachers have bias but can be reflective
  • researcher can have control over variables
  • humans perform experiments anyway
  • can be combined with other research methods for rigor
  • can be used to determine what is best for a population
  • provides for greater transferability than anecdotal research

Disadvantages:

  • subject to human error
  • personal bias of researcher may intrude
  • sample may not be representative
  • can produce artificial results
  • results may only apply to one situation and may be difficult to replicate
  • groups may not be comparable
  • human response can be difficult to measure
  • political pressure may skew results

Ethical Concerns

Experimental research may be manipulated at both ends of the spectrum: by the researcher and by the reader. Researchers who report on experimental research to naive readers face ethical concerns. While creating an experiment, certain objectives and intended uses of the results might drive and skew it: looking for specific results, researchers may ask questions and examine data that support only the desired conclusions, ignoring conflicting findings. Similarly, researchers seeking support for a particular plan may look only at findings which support that goal, dismissing conflicting research.

Nor do editors and journals publish only trouble-free material. As readers of experiments, members of the press might report selected and isolated parts of a study to the public, essentially transferring that data to the general population in ways the researcher never intended. Take, for example, oat bran. A few years ago, the press reported that oat bran reduces high blood pressure by reducing cholesterol. But that bit of information was taken out of context: the actual study found that when people ate more oat bran, they reduced their intake of saturated fats. People started eating oat bran muffins by the ton, assuming a causal relationship when in actuality a number of confounding variables might influence the causal link.

Ultimately, ethical use and reportage of experimentation should be addressed by researchers, reporters and readers alike.

Reporters of experimental research often seek to recognize their audience's level of knowledge and try not to mislead readers. And readers must rely on the author's skill and integrity to point out errors and limitations. The relationship between researcher and reader may not sound like a problem, but after spending months or years on a project to produce no significant results, it may be tempting to manipulate the data to show significant results in order to jockey for grants and tenure.

Meanwhile, the reader may uncritically accept results that gain validity by being published in a journal. However, research that lacks credibility often is not published; consequently, researchers who fail to publish run the risk of being denied grants, promotions, jobs, and tenure. While few researchers are anything but earnest in their attempts to conduct well-designed experiments and present the results in good faith, rhetorical considerations often dictate a certain minimization of methodological flaws.

Concerns arise if researchers do not report all results, or otherwise alter them. This phenomenon is counterbalanced, however, by the fact that professionals are also rewarded for publishing critiques of others' work. Because the author of an experimental study is in essence making an argument for the existence of a causal relationship, he or she must be concerned not only with its integrity but also with its presentation. Achieving persuasiveness in any kind of writing involves several elements: choosing a topic of interest, providing convincing evidence for one's argument, using tone and voice to project credibility, and organizing the material in a way that meets expectations for a logical sequence. Of course, what is regarded as pertinent, accepted as evidence, required for credibility, and understood as logical varies according to context. If the experimental researcher hopes to make an impact on the community of professionals in the field, he or she must attend to the standards and orthodoxies of that audience.

Related Links

Contrasts: Traditional and computer-supported writing classrooms. This Web site presents a discussion of the Transitions Study, a year-long exploration of teachers and students in computer-supported and traditional writing classrooms. It includes a description of the study, the rationale for conducting it, and the study's results and implications.

http://kairos.technorhetoric.net/2.2/features/reflections/page1.htm

Annotated Bibliography

A cozy world of trivial pursuits? (1996, June 28) The Times Educational Supplement . 4174, pp. 14-15.

A critique discounting the methods Great Britain currently employs to fund and disseminate educational research. The belief is that research is performed for fellow researchers, not the teaching public, and that implications for day-to-day practice are never addressed.

Anderson, J. A. (1979, Nov. 10-13). Research as argument: the experimental form. Paper presented at the annual meeting of the Speech Communication Association, San Antonio, TX.

In this paper, Anderson argues that the scientist who uses the experimental form does so in order to explain that which is verified through prediction.

Anderson, Linda M. (1979). Classroom-based experimental studies of teaching effectiveness in elementary schools . (Technical Report UTR&D-R- 4102). Austin: Research and Development Center for Teacher Education, University of Texas.

Three recent large-scale experimental studies have built on a database established through several correlational studies of teaching effectiveness in elementary school.

Asher, J. W. (1976). Educational research and evaluation methods . Boston: Little, Brown.

Abstract unavailable by press time.

Babbie, Earl R. (1979). The Practice of Social Research . Belmont, CA: Wadsworth.

A textbook containing discussions of several research methodologies used in social science research.

Bangert-Drowns, R.L. (1993). The word processor as instructional tool: a meta-analysis of word processing in writing instruction. Review of Educational Research, 63 (1), 69-93.

Beach, R. (1993). The effects of between-draft teacher evaluation versus student self-evaluation on high school students' revising of rough drafts. Research in the Teaching of English, 13 , 111-119.

The question of whether teacher evaluation or guided self-evaluation of rough drafts results in increased revision was addressed in Beach's study. Differences in the effects of teacher evaluation, guided self-evaluation (using prepared guidelines), and no evaluation of rough drafts were examined. The final drafts of students (10th, 11th, and 12th graders) were compared with their rough drafts and rated by judges according to degree of change.

Beishuizen, J. & Moonen, J. (1992). Research in technology enriched schools: a case for cooperation between teachers and researchers . (ERIC Technical Report ED351006).

This paper describes the research strategies employed in the Dutch Technology Enriched Schools project to encourage extensive and intensive use of computers in a small number of secondary schools, and to study the effects of computer use on the classroom, the curriculum, and school administration and management.

Borg, W. P. (1989). Educational Research: an Introduction . (5th ed.). New York: Longman.

An overview of educational research methodology, including literature review and discussion of approaches to research, experimental design, statistical analysis, ethics, and rhetorical presentation of research findings.

Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research . Boston: Houghton Mifflin.

A classic overview of research designs.

Campbell, D.T. (1988). Methodology and epistemology for social science: selected papers . ed. E. S. Overman. Chicago: University of Chicago Press.

This is an overview of Campbell's 40-year career and his work. It covers in seven parts measurement, experimental design, applied social experimentation, interpretive social science, epistemology and sociology of science. Includes an extensive bibliography.

Caporaso, J. A., & Roos, Jr., L. L. (Eds.). Quasi-experimental approaches: Testing theory and evaluating policy. Evanston, IL: Northwestern University Press.

A collection of articles concerned with explicating the underlying assumptions of quasi-experimentation and relating these to true experimentation. With an emphasis on design. Includes a glossary of terms.

Collier, R. Writing and the word processor: How wary of the gift-giver should we be? Unpublished manuscript.

Charts the developments to date in computers and composition and speculates about the future within the framework of Willie Sypher's model of the evolution of creative discovery.

Cook, T.D. & Campbell, D.T. (1979). Quasi-experimentation: design and analysis issues for field settings . Boston: Houghton Mifflin Co.

The authors write that this book "presents some quasi-experimental designs and design features that can be used in many social research settings. The designs serve to probe causal hypotheses about a wide variety of substantive issues in both basic and applied research."

Cutler, A. (1970). An experimental method for semantic field study. Linguistic Communication, 2 , N. pag.

This paper emphasizes the need for empirical research and objective discovery procedures in semantics, and illustrates a method by which these goals may be obtained.

Daniels, L. B. (1996, Summer). Eisenberg's Heisenberg: The indeterminancies of rationality. Curriculum Inquiry, 26 , 181-92.

Places Eisenberg's theories in relation to the death of foundationalism by showing that he distorts rational studies into a form of relativism. Daniels examines Eisenberg's ideas on indeterminacy, methods, and evidence, considering what Eisenberg opposes and what we should make of his claims.

Danziger, K. (1990). Constructing the subject: Historical origins of psychological research. Cambridge: Cambridge University Press.

Danziger stresses the importance of being aware of the framework in which research operates and of the essentially social nature of scientific activity.

Diener, E., et al. (1972, December). Leakage of experimental information to potential future subjects by debriefed subjects. Journal of Experimental Research in Personality, 264-67.

Research regarding research: an investigation of the effects on the outcome of an experiment in which information about the experiment had been leaked to subjects. The study concludes that such leakage is not a significant problem.

Dudley-Marling, C., & Rhodes, L. K. (1989). Reflecting on a close encounter with experimental research. Canadian Journal of English Language Arts, 12, 24-28.

Dudley-Marling and Rhodes address some problems they encountered in their experimental approach to a study of reading comprehension. The article discusses the limitations of experimental research and presents an alternative to experimental or quantitative research.

Edgington, E. S. (1985). Random assignment and experimental research. Educational Administration Quarterly, 21, N. pag.

Edgington explores ways in which random assignment can be a part of field studies. The author discusses both non-experimental and experimental research and the need for using random assignment.

Eisenberg, J. (1996, Summer). Response to critiques by R. Floden, J. Zeuli, and L. Daniels. Curriculum Inquiry, 26, 199-201.

A response to critiques of his argument that rational educational research methods are at best suspect and at worst futile. He believes indeterminacy controls this method and worries that chaotic research is failing students.

Eisner, E. (1992, July). Are all causal claims positivistic? A reply to Francis Schrag. Educational Researcher, 21(5), 8-9.

Eisner responds to Schrag, who claimed that critics like Eisner cannot escape a positivistic paradigm no matter what attempts they make to do so. Eisner argues that Schrag misses the point by defending the paradigm solely on the basis of cause and effect without including the rest of positivistic philosophy. This, Eisner argues, weakens Schrag's case against multiple modal methods, which provide opportunities to apply the research design best suited to each question.

Floden, R. E. (1996, Summer). Educational research: Limited, but worthwhile and maybe a bargain. (Response to J. A. Eisenberg). Curriculum Inquiry, 26, 193-7.

Responds to John Eisenberg's critique of educational research by asserting the connection between improvement of practice and research results. He places high value on teachers' judgment and on the knowledge that research informs practice.

Fortune, J. C., & Hutson, B. A. (1994, March/April). Selecting models for measuring change when true experimental conditions do not exist. Journal of Educational Research, 197-206.

This article reviews methods for minimizing the effects of nonideal experimental conditions by optimally organizing models for the measurement of change.

Fox, R. F. (1980). Treatment of writing apprehension and its effects on composition. Research in the Teaching of English, 14, 39-49.

The main purpose of Fox's study was to investigate the effects of two methods of teaching writing on writing apprehension among entry-level composition students. A conventional teaching procedure was used with a control group, while a workshop method was employed with the treatment group.

Gadamer, H-G. (1976). Philosophical hermeneutics. (D. E. Linge, Trans.). Berkeley, CA: University of California Press.

A collection of essays with the common themes of the mediation of experience through language, the impossibility of objectivity, and the importance of context in interpretation.

Gaise, S. J. (1981). Experimental vs. non-experimental research on classroom second language learning. Bilingual Education Paper Series, 5, N. pag.

The aims of classroom-centered research on second language learning and teaching are considered and contrasted with the experimental approach.

Giordano, G. (1983). Commentary: Is experimental research snowing us? Journal of Reading, 27, 5-7.

Do educational research findings actually benefit teachers and students? Giordano states his opinion that research may be helpful to teaching, but is not essential and often is unnecessary.

Goldenson, D. R. (1978, March). An alternative view about the role of the secondary school in political socialization: A field-experimental study of theory and research in social education. Theory and Research in Social Education, 44-72.

This study concludes that when political discussion among experimental groups of secondary school students is led by a teacher, the degree to which the students' views were impacted is proportional to the credibility of the teacher.

Grossman, J., & Tierney, J. P. (1993, October). The fallibility of comparison groups. Evaluation Review, 556-71.

Grossman and Tierney present evidence to suggest that comparison groups are not the same as nontreatment groups.

Harnisch, D. L. (1992). Human judgment and the logic of evidence: A critical examination of research methods in special education transition literature. In D. L. Harnisch et al. (Eds.), Selected readings in transition.

This chapter describes several common types of research studies in special education transition literature and the threats to their validity.

Hawisher, G. E. (1989). Research and recommendations for computers and composition. In G. Hawisher and C. Selfe (Eds.), Critical perspectives on computers and composition instruction (pp. 44-69). New York: Teacher's College Press.

An overview of research in computers and composition to date. Includes a synthesis grid of experimental research.

Hillocks, G., Jr. (1982). The interaction of instruction, teacher comment, and revision in teaching the composing process. Research in the Teaching of English, 16, 261-278.

Hillocks conducted a study using three treatments (observational or data-collecting activities prior to writing, use or absence of revisions, and brief or lengthy teacher comments) to identify effective methods of teaching composition to seventh and eighth graders.

Jenkinson, J. C. (1989). Research design in the experimental study of intellectual disability. International Journal of Disability, Development, and Education, 69-84.

This article catalogues the difficulties of conducting experimental research where the subjects are intellectually disabled and suggests alternative research strategies.

Jones, R. A. (1985). Research methods in the social and behavioral sciences. Sunderland, MA: Sinauer Associates, Inc.

A textbook designed to provide an overview of research strategies in the social sciences, including survey, content analysis, ethnographic approaches, and experimentation. The author emphasizes the importance of applying strategies appropriately and in variety.

Kamil, M. L., Langer, J. A., & Shanahan, T. (1985). Understanding research in reading and writing. Newton, MA: Allyn and Bacon.

Examines a wide variety of problems in reading and writing, with a broad range of techniques, from different perspectives.

Kennedy, J. L. (1985). An introduction to the design and analysis of experiments in behavioral research. Lanham, MD: University Press of America.

An introductory textbook of psychological and educational research.

Keppel, G. (1991). Design and analysis: A researcher's handbook. Englewood Cliffs, NJ: Prentice Hall.

This updates Keppel's earlier book subtitled "a student's handbook." Focuses on extensive information about analytical research and gives a basic picture of research in psychology. Covers a range of statistical topics. Includes a subject and name index, as well as a glossary.

Knowles, G., Elija, R., & Broadwater, K. (1996, Spring/Summer). Teacher research: Enhancing the preparation of teachers? Teaching Education, 8, 123-31.

Researchers looked at one teacher candidate who participated in a class in which students designed their own research projects around questions they wanted answered about the teaching world. The goal of the study was to see whether preservice teachers developed reflective practice by researching appropriate classroom contexts.

Lace, J., & De Corte, E. (1986, April 16-20). Research on media in western Europe: A myth of Sisyphus? Paper presented at the annual meeting of the American Educational Research Association, San Francisco.

Identifies main trends in media research in western Europe, with emphasis on three successive stages since 1960: tools technology, systems technology, and reflective technology.

Latta, A. (1996, Spring/Summer). Teacher as researcher: Selected resources. Teaching Education, 8, 155-60.

An annotated bibliography on educational research, including milestones of thought, seminal works, successful outcomes, and immediate practical applications.

Lauer, J. M., & Asher, J. W. (1988). Composition research: Empirical designs. New York: Oxford University Press.

Approaching experimentation from a humanist's perspective, the authors focus on major research designs: case studies, ethnographies, sampling and surveys, quantitative descriptive studies, measurement, true experiments, quasi-experiments, meta-analyses, and program evaluations. The book takes on the challenge of bridging the language of social science with that of the humanities. Includes name and subject indexes, as well as a glossary and a glossary of symbols.

Mishler, E. G. (1979). Meaning in context: Is there any other kind? Harvard Educational Review, 49, 1-19.

Contextual importance has been largely ignored by traditional research approaches in social/behavioral sciences and in their application to the education field. Developmental and social psychologists have increasingly noted the inadequacies of this approach. Drawing examples from phenomenology, sociolinguistics, and ethnomethodology, the author proposes alternative approaches for studying meaning in context.

Mitroff, I., & Bonoma, T. V. (1978, May). Psychological assumptions, experimentations, and real world problems: A critique and an alternate approach to evaluation. Evaluation Quarterly, 235-60.

The authors advance the notion of dialectic as a means to clarify and examine the underlying assumptions of experimental research methodology, both in highly controlled situations and in social evaluation.

Muller, E. W. (1985). Application of experimental and quasi-experimental research designs to educational software evaluation. Educational Technology, 25, 27-31.

Muller proposes a set of guidelines for the use of experimental and quasi-experimental methods of research in evaluating educational software. By obtaining empirical evidence of student performance, it is possible to evaluate whether programs are having the desired learning effect.

Murray, S., et al. (1979, April 8-12). Technical issues as threats to internal validity of experimental and quasi-experimental designs. San Francisco: University of California.

The article reviews three evaluation models and analyzes the flaws common to them. Remedies are suggested.

Muter, P., & Maurutto, P. (1991). Reading and skimming from computer screens and books: The paperless office revisited? Behavior and Information Technology, 10(4), 257-66.

The researchers test for reading and skimming effectiveness, defined as accuracy combined with speed, for written text compared to text on a computer monitor. They conclude that, given optimal on-line conditions, both are equally effective.

O'Donnell, A., et al. (1992). The impact of cooperative writing. In J. R. Hayes, et al. (Eds.), Reading empirical research studies: The rhetoric of research (pp. 371-84). Hillsdale, NJ: Lawrence Erlbaum Associates.

A model of experimental design. The authors investigate the efficacy of cooperative writing strategies, as well as the transferability of skills learned to other, individual writing situations.

Palmer, D. (1988). Looking at philosophy. Mountain View, CA: Mayfield Publishing.

An introductory text with incisive but understandable discussions of the major movements and thinkers in philosophy from the Pre-Socratics through Sartre. With illustrations by the author. Includes a glossary.

Phelps-Gunn, T., & Phelps-Terasaki, D. (1982). Written language instruction: Theory and remediation. London: Aspen Systems Corporation.

The lack of research in written expression is addressed and an application of the Total Writing Process Model is presented.

Poetter, T. (1996, Spring/Summer). From resistance to excitement: Becoming qualitative researchers and reflective practitioners. Teaching Education, 8, 109-19.

An education professor recounts his own problematic research when he attempted to introduce an educational research component into a teacher preparation program. He encountered dissent from students and cooperating professionals but ultimately was rewarded with excitement toward research and a recognized connection to practice.

Purves, A. C. (1992). Reflections on research and assessment in written composition. Research in the Teaching of English, 26.

Three issues concerning research and assessment in writing are discussed: 1) school writing is a matter of products, not process; 2) school writing is an ill-defined domain; and 3) the quality of school writing is what observers report they see. Purves discusses these issues while looking at data collected in a ten-year study of achievement in written composition in fourteen countries.

Rathus, S. A. (1987). Psychology (3rd ed.). Poughkeepsie, NY: Holt, Rinehart, and Winston.

An introductory psychology textbook. Includes overviews of the major movements in psychology, discussions of prominent examples of experimental research, and a basic explanation of relevant physiological factors. With chapter summaries.

Reiser, R. A. (1982). Improving the research skills of instructional designers. Educational Technology, 22, 19-21.

In his paper, Reiser begins by stating the importance of research in advancing the field of education and points out that graduate students in instructional design often lack the skills needed to conduct research. The paper then outlines the practicum in the Instructional Systems Program at Florida State University, which includes: 1) planning and conducting an experimental research study; 2) writing a manuscript describing the study; and 3) giving an oral presentation of the research findings.

Report on education research. (Journal). Washington, DC: Capitol Publication, Education News Services Division.

This independent bi-weekly newsletter on research in education and learning has been published since September 1969.

Rossell, C. H. (1986). Why is bilingual education research so bad?: Critique of the Walsh and Carballo study of Massachusetts bilingual education programs. Boston: Center for Applied Social Science, Boston University. (ERIC Working Paper 86-5).

The Walsh and Carballo evaluation of the effectiveness of transitional bilingual education programs in five Massachusetts communities has five flaws, which are discussed in detail.

Rubin, D. L., & Greene, K. (1992). Gender-typical style in written language. Research in the Teaching of English, 26.

This study was designed to find out whether the writing styles of men and women differ. Rubin and Greene discuss the presuppositions that women are better writers than men.

Sawin, E. (1992). Reaction: Experimental research in the context of other methods. School of Education Review, 4, 18-21.

Sawin responds to Gage's article on methodologies and issues in educational research. He agrees with most of the article but suggests that the concept of "scientific" should not be regarded in absolute terms, and he recommends more emphasis on scientific method. He also questions the value of experiments over other types of research.

Schoonmaker, W. E. (1984). Improving classroom instruction: A model for experimental research. The Technology Teacher, 44, 24-25.

The model outlined in this article tries to bridge the gap between classroom practice and laboratory research, using what Schoonmaker calls active research. Research is conducted in the classroom with the students and is used to determine which of two teacher-chosen methods of classroom instruction is more effective.

Schrag, F. (1992). In defense of positivist research paradigms. Educational Researcher, 21(5), 5-8.

The controversial defense of the use of positivistic research methods to evaluate educational strategies; the author takes on Eisner, Erickson, and Popkewitz.

Smith, J. (1997). The stories educational researchers tell about themselves. Educational Researcher, 33(3), 4-11.

Recapitulates the main features of an ongoing debate between advocates for using the vocabularies of traditional language arts and of whole language in educational research. An "impasse" exists where advocates "do not share a theoretical disposition concerning both language instruction and the nature of research," Smith writes (p. 6). He includes a comprehensive history of the debate between traditional research methodology and qualitative methods and vocabularies. Worth reading for graduate students.

Smith, N. L. (1980). The feasibility and desirability of experimental methods in evaluation. Evaluation and Program Planning: An International Journal, 251-55.

Smith identifies the conditions under which experimental research is most desirable. Includes a review of current thinking and controversies.

Stewart, N. R., & Johnson, R. G. (1986, March 16-20). An evaluation of experimental methodology in counseling and counselor education research. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.

The purpose of this study was to evaluate the quality of experimental research in counseling and counselor education published from 1976 through 1984.

Spector, P. E. (1990). Research designs. Newbury Park, CA: Sage Publications.

In this book, Spector introduces the basic principles of experimental and nonexperimental design in the social sciences.

Tait, P. E. (1984). Do-it-yourself evaluation of experimental research. Journal of Visual Impairment and Blindness, 78, 356-363.

Tait's goal is to provide the reader who is unfamiliar with experimental research or statistics with the basic skills necessary for the evaluation of research studies.

Walsh, S. M. (1990). The current conflict between case study and experimental research: A breakthrough study derives benefits from both. (ERIC Document Number ED339721).

This paper describes a study that was not experimentally designed, but its major findings were generalizable to the overall population of writers in college freshman composition classes. The study was not a case study, but it provided insights into the attitudes and feelings of small clusters of student writers.

Waters, G. R. (1976). Experimental designs in communication research. Journal of Business Communication, 14.

The paper presents a series of discussions on the general elements of experimental design and the scientific process and relates these elements to the field of communication.

Welch, W. W. (1969, March). The selection of a national random sample of teachers for experimental curriculum evaluation. Scholastic Science and Math, 210-216.

Members of the evaluation section of Harvard Project Physics describe what is said to be the first attempt to select a national random sample of teachers and list six steps for doing so. Cost and comparison with a volunteer group are also discussed.

Winer, B. J. (1971). Statistical principles in experimental design (2nd ed.). New York: McGraw-Hill.

Combines theory and application discussions to give readers a better understanding of the logic behind statistical aspects of experimental design. Introduces the broad topic of design, then goes into considerable detail. Not for light reading. Bring your aspirin if you like statistics. Bring morphine if you're a humanist.

Winn, B. (1986, January 16-21). Emerging trends in educational technology research. Paper presented at the Annual Convention of the Association for Educational Communication Technology.

This examination of the topic of research in educational technology addresses four major areas: (1) why research is conducted in this area and the characteristics of that research; (2) the types of research questions that should or should not be addressed; (3) the most appropriate methodologies for finding answers to research questions; and (4) the characteristics of a research report that make it good and ultimately suitable for publication.

Citation Information

Luann Barnes, Jennifer Hauser, Luana Heikes, Anthony J. Hernandez, Paul Tim Richard, Katherine Ross, Guo Hua Yang, and Mike Palmquist. (1994-2024). Experimental and Quasi-Experimental Research. The WAC Clearinghouse. Colorado State University. Available at https://wac.colostate.edu/repository/writing/guides/.

Copyright Information

Copyright © 1994-2024 Colorado State University and/or this site's authors, developers, and contributors. Some material displayed on this site is used with permission.


7.3 Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix quasi means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook & Campbell, 1979). Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here.

Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design, then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.
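The similarity check described above can be put in numbers. The sketch below (all scores hypothetical, not data from the chapter) computes a standardized mean difference (Cohen's d) between two intact classes' pretest scores; a value near zero suggests the groups start out comparable on that measure, though similarity on measured variables still says nothing about unmeasured confounds.

```python
from statistics import mean, stdev

# Hypothetical standardized math pretest scores for two intact classes.
class_a = [72, 75, 68, 80, 77, 74, 71, 79, 76, 73]
class_b = [70, 74, 69, 78, 75, 72, 70, 77, 74, 71]

def cohens_d(x, y):
    """Standardized mean difference using a pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / pooled_var ** 0.5

d = cohens_d(class_a, class_b)
print(f"Class A mean: {mean(class_a):.1f}, Class B mean: {mean(class_b):.1f}")
print(f"Standardized mean difference d = {d:.2f}")
```

With these illustrative numbers the difference is small, which is the kind of evidence a researcher might cite when arguing that two nonequivalent groups are at least roughly comparable at pretest.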

Pretest-Posttest Design

In a pretest-posttest design, the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of an antidrug education program on elementary school students’ attitudes toward illegal drugs. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the antidrug program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of history. Other things might have happened between the pretest and the posttest. Perhaps an antidrug program aired on television and many of the students watched it, or perhaps a celebrity died of a drug overdose and many of the students heard about it. Another category of alternative explanations goes under the name of maturation. Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become less impulsive or better reasoners and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean. This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect.

A closely related concept—and an extremely important one in psychological research—is spontaneous remission. This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001). Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
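Regression to the mean is easy to demonstrate with a small simulation. In this sketch (all numbers hypothetical), each student's observed score is modeled as a stable "true" ability plus random luck. Selecting the lowest scorers on one test and simply retesting them, with no treatment at all, produces an apparent improvement:

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the run is reproducible

# Each student has a stable true ability; any single test score is
# that ability plus random noise.
true_ability = [random.gauss(70, 10) for _ in range(1000)]
test1 = [t + random.gauss(0, 10) for t in true_ability]
test2 = [t + random.gauss(0, 10) for t in true_ability]  # no treatment given

# Select the students who scored lowest on the first test (bottom 20%).
cutoff = sorted(test1)[200]
low_idx = [i for i, s in enumerate(test1) if s < cutoff]

m1 = mean(test1[i] for i in low_idx)
m2 = mean(test2[i] for i in low_idx)
print(f"Selected group, test 1 mean: {m1:.1f}")
print(f"Selected group, test 2 mean: {m2:.1f}  (no treatment at all)")
```

The selected group's retest mean is several points higher than its selection mean purely because the bad luck that helped put those students in the bottom group does not repeat, which is exactly why a pretest-posttest study of low scorers can look like a treatment success even when the treatment did nothing.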

Does Psychotherapy Work?

Early studies on the effectiveness of psychotherapy tended to use pretest-posttest designs. In a classic 1952 article, researcher Hans Eysenck summarized the results of 24 such studies showing that about two thirds of patients improved between the pretest and the posttest (Eysenck, 1952). But Eysenck also compared these results with archival data from state hospital and insurance company records showing that similar patients recovered at about the same rate without receiving psychotherapy. This suggested to Eysenck that the improvement that patients showed in the pretest-posttest studies might be no more than spontaneous remission. Note that Eysenck did not conclude that psychotherapy was ineffective. He merely concluded that there was no evidence that it was, and he wrote of “the necessity of properly planned and executed experimental studies into this important field” (p. 323). You can read the entire article here:

http://psychclassics.yorku.ca/Eysenck/psychotherapy.htm

Fortunately, many other researchers took up Eysenck’s challenge, and by 1980 hundreds of experiments had been conducted in which participants were randomly assigned to treatment and control conditions, and the results were summarized in a classic book by Mary Lee Smith, Gene Glass, and Thomas Miller (Smith, Glass, & Miller, 1980). They found that overall psychotherapy was quite effective, with about 80% of treatment participants improving more than the average control participant. Subsequent research has focused more on the conditions under which different types of psychotherapy are more or less effective.

Hans Eysenck

In a classic 1952 article, researcher Hans Eysenck pointed out the shortcomings of the simple pretest-posttest design for evaluating the effectiveness of psychotherapy.

Wikimedia Commons – CC BY-SA 3.0.

Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this is “interrupted” by a treatment. In one classic example, the treatment was the reduction of the work shifts in a factory from 10 hours to 8 hours (Cook & Campbell, 1979). Because productivity increased rather quickly after the shortening of the work shifts, and because it remained elevated for many months afterward, the researcher concluded that the shortening of the shifts caused the increase in productivity. Notice that the interrupted time-series design is like a pretest-posttest design in that it includes measurements of the dependent variable both before and after the treatment. It is unlike the pretest-posttest design, however, in that it includes multiple pretest and posttest measurements.

Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.

Figure 7.5 A Hypothetical Interrupted Time-Series Design


The top panel shows data that suggest that the treatment caused a reduction in absences. The bottom panel shows data that suggest that it did not.
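The figure’s logic can be sketched numerically. The following minimal example uses made-up weekly absence counts (not the actual data behind Figure 7.5) to show why multiple pre- and post-treatment measurements matter:

```python
from statistics import mean

# Hypothetical weekly absence counts for a 14-week course; the instructor
# begins taking attendance at week 8 (index 7).
effective = [8, 9, 8, 10, 9, 8, 9, 3, 2, 3, 2, 3, 2, 3]    # like the top panel
ineffective = [8, 3, 9, 2, 8, 3, 9, 2, 8, 3, 9, 2, 8, 3]   # like the bottom panel

def pre_post_drop(series, interruption=7):
    """Mean absences before the interruption minus mean absences after."""
    pre, post = series[:interruption], series[interruption:]
    return mean(pre) - mean(post)

# With only one measurement on each side (weeks 7 and 8), both series
# would show a drop of 6-7 absences; the full series reveal the difference.
print(pre_post_drop(effective))    # large, sustained drop
print(pre_post_drop(ineffective))  # small: mostly week-to-week variation
```

Note how the second series would have looked like a treatment success if only weeks 7 and 8 had been measured, which is exactly the weakness of the simple pretest-posttest design.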

Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an antidrug program, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to an antidrug program, and finally are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.
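The comparison this design supports can be sketched with a small calculation, using hypothetical attitude scores (higher = more negative toward drugs) for a treatment school and a control school:

```python
from statistics import mean

# Hypothetical pretest/posttest attitude scores (higher = more negative
# toward drugs) for a treatment school and a control school.
treat_pre, treat_post = [4.0, 5.0, 4.5, 5.5], [6.5, 7.0, 6.0, 7.5]
ctrl_pre, ctrl_post = [4.2, 5.1, 4.4, 5.3], [4.8, 5.6, 5.0, 5.9]

# The treatment school's change includes the treatment effect plus any
# history/maturation; the control school's change reflects
# history/maturation alone.
treat_change = mean(treat_post) - mean(treat_pre)
ctrl_change = mean(ctrl_post) - mean(ctrl_pre)

# The estimated treatment effect is the extra change in the treatment school.
treatment_effect = treat_change - ctrl_change
print(treatment_effect)
```

If history or maturation alone drove the change, the two change scores would be similar and the difference between them would be near zero.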

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi-experiment. In fact, it is the kind of experiment that Eysenck called for—and that has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest designs, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two college professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

Discussion: Imagine that a group of obese children is recruited for a study in which their weight is measured, then they participate for 3 months in a program that encourages them to be more active, and finally their weight is measured again. Explain how each of the following might affect the results:

  • regression to the mean
  • spontaneous remission

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings . Boston, MA: Houghton Mifflin.

Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16 , 319–324.

Posternak, M. A., & Miller, I. (2001). Untreated short-term course of major depression: A meta-analysis of outcomes from studies using wait-list control groups. Journal of Affective Disorders, 66, 139–146.

Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy . Baltimore, MD: Johns Hopkins University Press.

Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Quasi-Experimental Research Design – Types, Methods

Quasi-Experimental Design

Quasi-experimental design is a research method that seeks to evaluate the causal relationships between variables, but without the full control over the independent variable(s) that is available in a true experimental design.

In a quasi-experimental design, the researcher uses an existing group of participants that is not randomly assigned to the experimental and control groups. Instead, the groups are selected based on pre-existing characteristics or conditions, such as age, gender, or the presence of a certain medical condition.

Types of Quasi-Experimental Design

There are several types of quasi-experimental designs that researchers use to study causal relationships between variables. Here are some of the most common types:

Non-Equivalent Control Group Design

This design involves selecting two groups of participants that are as similar as possible except for the independent variable(s) that the researcher is testing. One group receives the treatment or intervention being studied, while the other group does not. The two groups are then compared to see if there are any significant differences in the outcomes.

Interrupted Time-Series Design

This design involves collecting data on the dependent variable(s) over a period of time, both before and after an intervention or event. The researcher can then determine whether there was a significant change in the dependent variable(s) following the intervention or event.

Pretest-Posttest Design

This design involves measuring the dependent variable(s) before and after an intervention or event, but without a control group. This design can be useful for determining whether the intervention or event had an effect, but it does not allow for control over other factors that may have influenced the outcomes.

Regression Discontinuity Design

This design involves selecting participants based on a specific cutoff point on a continuous variable, such as a test score. Participants on either side of the cutoff point are then compared to determine whether the intervention or event had an effect.

Natural Experiments

This design involves studying the effects of an intervention or event that occurs naturally, without the researcher’s intervention. For example, a researcher might study the effects of a new law or policy that affects certain groups of people. This design is useful when true experiments are not feasible or ethical.

Data Analysis Methods

Here are some data analysis methods that are commonly used in quasi-experimental designs:

Descriptive Statistics

This method involves summarizing the data collected during a study using measures such as mean, median, mode, range, and standard deviation. Descriptive statistics can help researchers identify trends or patterns in the data, and can also be useful for identifying outliers or anomalies.
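As a quick illustration with made-up posttest scores, Python’s standard statistics module covers these summaries directly:

```python
from statistics import mean, median, mode, stdev

# Hypothetical posttest scores from a quasi-experimental study
scores = [72, 85, 85, 90, 78, 85, 95, 60, 88, 82]

print("mean:  ", mean(scores))
print("median:", median(scores))
print("mode:  ", mode(scores))
print("range: ", max(scores) - min(scores))
print("stdev: ", round(stdev(scores), 2))  # sample standard deviation
```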

Inferential Statistics

This method involves using statistical tests to determine whether the results of a study are statistically significant. Inferential statistics can help researchers make generalizations about a population based on the sample data collected during the study. Common statistical tests used in quasi-experimental designs include t-tests, ANOVA, and regression analysis.
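As a sketch of the idea, here is Welch’s two-sample t statistic computed by hand on hypothetical group scores (the function name and data are illustrative; in practice `scipy.stats.ttest_ind(a, b, equal_var=False)` does this and also returns a p-value):

```python
from statistics import mean, variance
from math import sqrt

# Hypothetical posttest scores for treatment and comparison groups
treated = [84, 88, 79, 91, 85, 90, 87, 82]
control = [78, 74, 81, 77, 80, 72, 79, 76]

def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

t = welch_t(treated, control)
print(round(t, 2))  # compare against a t critical value for significance
```

A large |t| relative to the critical value suggests the group difference is unlikely to be due to chance alone, though in a quasi-experiment it may still reflect pre-existing group differences.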

Propensity Score Matching

This method is used to reduce bias in quasi-experimental designs by matching participants in the intervention group with participants in the control group who have similar characteristics. This can help to reduce the impact of confounding variables that may affect the study’s results.
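A heavily simplified sketch of the matching step follows. In real studies the propensity score is first estimated, typically with logistic regression; here, as a stated simplification, each treated unit is matched to the control unit closest on a single covariate, and all names and data are illustrative:

```python
from statistics import mean

# (age, outcome) pairs for treated and control participants (hypothetical)
treated = [(25, 14.0), (40, 18.0), (33, 16.0)]
control = [(24, 11.0), (31, 12.5), (42, 15.0), (55, 17.0)]

def matched_effect(treated, control):
    """Nearest-neighbor matching on one covariate, then average the
    treated-minus-matched-control outcome differences (a crude ATT)."""
    diffs = []
    for age_t, y_t in treated:
        # find the control unit closest on the matching variable
        age_c, y_c = min(control, key=lambda c: abs(c[0] - age_t))
        diffs.append(y_t - y_c)
    return mean(diffs)

print(matched_effect(treated, control))
```

Matching only balances the covariates that are measured and matched on; unmeasured confounders can still bias the estimate.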

Difference-in-differences Analysis

This method is used to compare the difference in outcomes between two groups over time. Researchers can use this method to determine whether a particular intervention has had an impact on the target population over time.
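The core arithmetic can be sketched with hypothetical group averages:

```python
# Hypothetical average outcomes (e.g., employment rate, %) before and
# after a policy, for an exposed group and a comparison group.
treat_pre, treat_post = 52.0, 61.0   # group exposed to the intervention
ctrl_pre, ctrl_post = 50.0, 54.0     # comparison group, never exposed

# Each group's change over time
treat_change = treat_post - treat_pre   # intervention + background trend
ctrl_change = ctrl_post - ctrl_pre      # background trend only

# Difference-in-differences estimate of the intervention's effect
did = treat_change - ctrl_change
print(did)  # 5.0
```

The estimate is credible only under the parallel-trends assumption: absent the intervention, both groups would have changed by the same amount.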

Interrupted Time Series Analysis

This method is used to examine the impact of an intervention or treatment over time by comparing data collected before and after the intervention or treatment. This method can help researchers determine whether an intervention had a significant impact on the target population.
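One simple way to operationalize this comparison (a sketch with hypothetical data, not the only approach) is to fit a linear trend to the pre-intervention observations and measure how far the post-intervention observations depart from its projection:

```python
from statistics import mean

# Hypothetical outcome measured at 8 time points; the intervention
# occurs between t = 4 and t = 5.
series = [10.1, 10.9, 12.2, 12.8, 14.1,   # pre-intervention (t = 0..4)
          18.0, 19.2, 20.1]               # post-intervention (t = 5..7)
cut = 5

pre_t = list(range(cut))
pre_y = series[:cut]

# Closed-form OLS slope and intercept for the pre-intervention trend
tbar, ybar = mean(pre_t), mean(pre_y)
slope = (sum((t - tbar) * (y - ybar) for t, y in zip(pre_t, pre_y))
         / sum((t - tbar) ** 2 for t in pre_t))
intercept = ybar - slope * tbar

# Average gap between observed post values and the projected pre-trend
gap = mean(series[t] - (intercept + slope * t) for t in range(cut, len(series)))
print(round(gap, 2))  # estimated level shift after the intervention
```

Full segmented-regression analyses also model a change in slope after the interruption; this sketch estimates only the level shift.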

Regression Discontinuity Analysis

This method is used to compare the outcomes of participants who fall on either side of a predetermined cutoff point. This method can help researchers determine whether an intervention had a significant impact on the target population.
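A minimal sharp-discontinuity sketch follows (hypothetical scores and outcomes; real analyses typically fit local regressions on each side of the cutoff rather than comparing raw means):

```python
from statistics import mean

cutoff, bandwidth = 70, 10

# (assignment score, outcome) pairs; scores >= cutoff received the program
data = [(58, 40), (63, 44), (66, 47), (68, 49), (69, 50),
        (71, 58), (73, 60), (75, 61), (78, 63), (84, 66)]

# Keep only observations within the bandwidth around the cutoff
below = [y for s, y in data if cutoff - bandwidth <= s < cutoff]
above = [y for s, y in data if cutoff <= s < cutoff + bandwidth]

# The jump in mean outcome at the cutoff estimates the program's effect
effect = mean(above) - mean(below)
print(effect)  # prints 13.0
```

The bandwidth trades bias against variance: a narrower window gives more comparable groups but fewer observations.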

Steps in Quasi-Experimental Design

Here are the general steps involved in conducting a quasi-experimental design:

  • Identify the research question: Determine the research question and the variables that will be investigated.
  • Choose the design: Choose the appropriate quasi-experimental design to address the research question. Examples include the pretest-posttest design, non-equivalent control group design, regression discontinuity design, and interrupted time series design.
  • Select the participants: Select the participants who will be included in the study. Participants should be selected based on specific criteria relevant to the research question.
  • Measure the variables: Measure the variables that are relevant to the research question. This may involve using surveys, questionnaires, tests, or other measures.
  • Implement the intervention or treatment: Implement the intervention or treatment to the participants in the intervention group. This may involve training, education, counseling, or other interventions.
  • Collect data: Collect data on the dependent variable(s) before and after the intervention. Data collection may also include collecting data on other variables that may impact the dependent variable(s).
  • Analyze the data: Analyze the data collected to determine whether the intervention had a significant impact on the dependent variable(s).
  • Draw conclusions: Draw conclusions about the relationship between the independent and dependent variables. If the results suggest a causal relationship, then appropriate recommendations may be made based on the findings.

Quasi-Experimental Design Examples

Here are some examples of quasi-experimental designs in real-world settings:

  • Evaluating the impact of a new teaching method: In this study, a group of students are taught using a new teaching method, while another group is taught using the traditional method. The test scores of both groups are compared before and after the intervention to determine whether the new teaching method had a significant impact on student performance.
  • Assessing the effectiveness of a public health campaign: In this study, a public health campaign is launched to promote healthy eating habits among a targeted population. The behavior of the population is compared before and after the campaign to determine whether the intervention had a significant impact on the target behavior.
  • Examining the impact of a new medication: In this study, a group of patients is given a new medication, while another group is given a placebo. The outcomes of both groups are compared to determine whether the new medication had a significant impact on the targeted health condition.
  • Evaluating the effectiveness of a job training program : In this study, a group of unemployed individuals is enrolled in a job training program, while another group is not enrolled in any program. The employment rates of both groups are compared before and after the intervention to determine whether the training program had a significant impact on the employment rates of the participants.
  • Assessing the impact of a new policy : In this study, a new policy is implemented in a particular area, while another area does not have the new policy. The outcomes of both areas are compared before and after the intervention to determine whether the new policy had a significant impact on the targeted behavior or outcome.

Applications of Quasi-Experimental Design

Here are some applications of quasi-experimental design:

  • Educational research: Quasi-experimental designs are used to evaluate the effectiveness of educational interventions, such as new teaching methods, technology-based learning, or educational policies.
  • Health research: Quasi-experimental designs are used to evaluate the effectiveness of health interventions, such as new medications, public health campaigns, or health policies.
  • Social science research: Quasi-experimental designs are used to investigate the impact of social interventions, such as job training programs, welfare policies, or criminal justice programs.
  • Business research: Quasi-experimental designs are used to evaluate the impact of business interventions, such as marketing campaigns, new products, or pricing strategies.
  • Environmental research: Quasi-experimental designs are used to evaluate the impact of environmental interventions, such as conservation programs, pollution control policies, or renewable energy initiatives.

When to use Quasi-Experimental Design

Here are some situations where quasi-experimental designs may be appropriate:

  • When the research question involves investigating the effectiveness of an intervention, policy, or program : In situations where it is not feasible or ethical to randomly assign participants to intervention and control groups, quasi-experimental designs can be used to evaluate the impact of the intervention on the targeted outcome.
  • When the sample size is small: In situations where the sample size is small, it may be difficult to randomly assign participants to intervention and control groups. Quasi-experimental designs can be used to investigate the impact of an intervention without requiring a large sample size.
  • When the research question involves investigating a naturally occurring event : In some situations, researchers may be interested in investigating the impact of a naturally occurring event, such as a natural disaster or a major policy change. Quasi-experimental designs can be used to evaluate the impact of the event on the targeted outcome.
  • When the research question involves investigating a long-term intervention: In situations where the intervention or program is long-term, it may be difficult to randomly assign participants to intervention and control groups for the entire duration of the intervention. Quasi-experimental designs can be used to evaluate the impact of the intervention over time.
  • When the research question involves investigating the impact of a variable that cannot be manipulated : In some situations, it may not be possible or ethical to manipulate a variable of interest. Quasi-experimental designs can be used to investigate the relationship between the variable and the targeted outcome.

Purpose of Quasi-Experimental Design

The purpose of quasi-experimental design is to investigate the causal relationship between two or more variables when it is not feasible or ethical to conduct a randomized controlled trial (RCT). Quasi-experimental designs attempt to emulate the randomized control trial by mimicking the control group and the intervention group as much as possible.

The key purpose of quasi-experimental design is to evaluate the impact of an intervention, policy, or program on a targeted outcome while controlling for potential confounding factors that may affect the outcome. Quasi-experimental designs aim to answer questions such as: Did the intervention cause the change in the outcome? Would the outcome have changed without the intervention? And was the intervention effective in achieving its intended goals?

Quasi-experimental designs are useful in situations where randomized controlled trials are not feasible or ethical. They provide researchers with an alternative method to evaluate the effectiveness of interventions, policies, and programs in real-life settings. Quasi-experimental designs can also help inform policy and practice by providing valuable insights into the causal relationships between variables.

Overall, the purpose of quasi-experimental design is to provide a rigorous method for evaluating the impact of interventions, policies, and programs while controlling for potential confounding factors that may affect the outcome.

Advantages of Quasi-Experimental Design

Quasi-experimental designs have several advantages over other research designs, such as:

  • Greater external validity : Quasi-experimental designs are more likely to have greater external validity than laboratory experiments because they are conducted in naturalistic settings. This means that the results are more likely to generalize to real-world situations.
  • Ethical considerations: Quasi-experimental designs often involve naturally occurring events, such as natural disasters or policy changes. This means that researchers do not need to manipulate variables, which can raise ethical concerns.
  • More practical: Quasi-experimental designs are often more practical than experimental designs because they are less expensive and easier to conduct. They can also be used to evaluate programs or policies that have already been implemented, which can save time and resources.
  • No random assignment required: Quasi-experimental designs do not require random assignment, which can be difficult or impossible in some cases, such as when studying the effects of a natural disaster. Researchers can still draw causal inferences, although they must use statistical techniques to control for potential confounding variables.
  • Greater generalizability : Quasi-experimental designs are often more generalizable than experimental designs because they include a wider range of participants and conditions. This can make the results more applicable to different populations and settings.

Limitations of Quasi-Experimental Design

There are several limitations associated with quasi-experimental designs, which include:

  • Lack of Randomization: Quasi-experimental designs do not involve randomization of participants into groups, which means that the groups being studied may differ in important ways that could affect the outcome of the study. This can lead to problems with internal validity and limit the ability to make causal inferences.
  • Selection Bias: Quasi-experimental designs may suffer from selection bias because participants are not randomly assigned to groups. Participants may self-select into groups or be assigned based on pre-existing characteristics, which may introduce bias into the study.
  • History and Maturation: Quasi-experimental designs are susceptible to history and maturation effects, where the passage of time or other events may influence the outcome of the study.
  • Lack of Control: Quasi-experimental designs may lack control over extraneous variables that could influence the outcome of the study. This can limit the ability to draw causal inferences from the study.
  • Limited Generalizability: Quasi-experimental designs may have limited generalizability because the results may only apply to the specific population and context being studied.



Quasi-experimental Research: What It Is, Types & Examples

Quasi-experimental research is research that appears to be experimental but is not.

Much like an actual experiment, quasi-experimental research tries to demonstrate a cause-and-effect link between a dependent and an independent variable. A quasi-experiment, on the other hand, does not depend on random assignment, unlike an actual experiment. The subjects are sorted into groups based on non-random variables.

What is Quasi-Experimental Research?

“Resemblance” is the meaning of “quasi.” In quasi-experimental research, the independent variable is manipulated, but individuals are not randomly assigned to conditions or orders of conditions. As a result, quasi-experimental research is research that appears to be experimental but is not.

The directionality problem is avoided in quasi-experimental research because the independent variable is manipulated before the dependent variable is measured. However, because individuals are not randomly assigned to conditions, there are likely to be additional differences across conditions.

As a result, in terms of internal validity, quasi-experiments fall somewhere between correlational research and true experiments.

The key component of a true experiment is randomly assigned groups. This means that each person has an equal chance of being assigned to the experimental group or the control group.

Simply put, a quasi-experiment is not a true experiment, because it does not feature randomly assigned groups. Why is it so crucial to have randomly assigned groups, given that they constitute the only distinction between quasi-experimental and true experimental research?

Let’s use an example to illustrate our point. Suppose we want to discover how a new psychological therapy affects depressed patients. In a true experiment, you would randomly split the patients into two groups, with half receiving the new psychotherapy and the other half receiving the standard depression treatment.

The physicians would then compare the outcomes of the new treatment with those of the standard treatment to see whether it is more effective. Doctors, however, are unlikely to agree to this true experiment if they believe it is unethical to withhold the new treatment from one group of patients.

A quasi-experimental study will be useful in this case. Instead of allocating these patients at random, you identify pre-existing groups of psychotherapists in the hospitals: some counselors will be eager to try the new therapy, while others prefer to stick to the standard approach.

These pre-existing groups can be used to compare the symptom development of individuals who received the novel therapy with those who received the normal course of treatment, even though the groups weren’t chosen at random.

If the groups are comparable on other relevant characteristics, you can be reasonably confident that any substantial differences between them are attributable to the treatment rather than to other extraneous variables.

As we mentioned before, quasi-experimental research entails manipulating an independent variable without randomly assigning people to conditions or sequences of conditions. Non-equivalent group designs, pretest-posttest designs, and regression discontinuity designs are only a few of the essential types.

What are quasi-experimental research designs?

Quasi-experimental research designs are a type of research design that is similar to experimental designs but doesn’t give full control over the independent variable(s) like true experimental designs do.

In a quasi-experimental design, the researcher changes or watches an independent variable, but the participants are not put into groups at random. Instead, people are put into groups based on things they already have in common, like their age, gender, or how many times they have seen a certain stimulus.

Because the assignments are not random, it is harder to draw conclusions about cause and effect than in a real experiment. However, quasi-experimental designs are still useful when randomization is not possible or ethical.

The true experimental design may be impossible to accomplish or just too expensive, especially for researchers with few resources. Quasi-experimental designs enable you to investigate an issue by utilizing data that has already been paid for or gathered by others (often the government). 

Because they take place in real-world settings, quasi-experiments often have higher external validity than most true experiments, and because the independent variable is manipulated, they have higher internal validity than most other non-experimental research (though lower than true experiments).

Is quasi-experimental research quantitative or qualitative?

Quasi-experimental research is a quantitative research method. It involves numerical data collection and statistical analysis. Quasi-experimental research compares groups with different circumstances or treatments to find cause-and-effect links. 

It draws statistical conclusions from quantitative data. Qualitative data can enhance quasi-experimental research by revealing participants’ experiences and opinions, but quantitative data is the method’s foundation.

Quasi-experimental research types

There are many different sorts of quasi-experimental designs. Three of the most popular varieties are described below: non-equivalent groups designs, regression discontinuity designs, and natural experiments.

Design of Non-Equivalent Groups

In this design, an existing group that receives the treatment is compared with a similar, but not randomly assigned, group that does not.

Regression Discontinuity

In this design, participants who fall just above and just below a cutoff score on a continuous measure are compared.

Natural Experiments

In a natural experiment, a naturally occurring event or policy, rather than the researcher, determines who is exposed to the treatment.

In one such case, a government program could not afford to pay for everyone who qualified, so slots were distributed by random lottery.

Experts were able to investigate the program’s impact by using enrolled people as a treatment group and those who were qualified but were not selected in the lottery as a control group.

How QuestionPro helps in quasi-experimental research?

QuestionPro can be a useful tool in quasi-experimental research because it includes features that can assist you in designing and analyzing your research study. Here are some ways in which QuestionPro can help in quasi-experimental research:

  • Design surveys
  • Randomize participants
  • Collect data over time
  • Analyze data
  • Collaborate with your team

With QuestionPro, you have access to a mature market research platform that helps you collect and analyze the insights that matter most. By leveraging InsightsHub, the unified hub for data management, you can organize, explore, search, and discover your research data in one organized data repository.

Optimize your quasi-experimental research with QuestionPro. Get started now!




Chapter 5: Experimental and Quasi-Experimental Designs

Case Study: The Impact of Teen Court

Research Study

An Experimental Evaluation of Teen Courts 1

Research Question

Is teen court more effective at reducing recidivism and improving attitudes than traditional juvenile justice processing?

Methodology

Researchers randomly assigned 168 juvenile offenders ages 11 to 17 from four different counties in Maryland to either teen court as experimental group members or to traditional juvenile justice processing as control group members. (Note: Discussion on the technical aspects of experimental designs, including random assignment, is found in detail later in this chapter.) Of the 168 offenders, 83 were assigned to teen court and 85 were assigned to regular juvenile justice processing through random assignment. Of the 83 offenders assigned to the teen court experimental group, only 56 (67%) agreed to participate in the study. Of the 85 youth randomly assigned to normal juvenile justice processing, only 51 (60%) agreed to participate in the study.

Upon assignment to teen court or regular juvenile justice processing, all offenders entered their respective sanction. Approximately four months later, offenders in both the experimental group (teen court) and the control group (regular juvenile justice processing) were asked to complete a post-test survey inquiring about a variety of behaviors (frequency of drug use, delinquent behavior, variety of drug use) and attitudinal measures (social skills, rebelliousness, neighborhood attachment, belief in conventional rules, and positive self-concept). The study researchers also collected official re-arrest data for 18 months starting at the time of offender referral to juvenile justice authorities.

Teen court participants self-reported higher levels of delinquency than those processed through regular juvenile justice processing. According to official re-arrests, teen court youth were re-arrested at a higher rate and incurred a higher average number of total arrests than the control group. Teen court offenders also reported significantly lower scores on survey items designed to measure their “belief in conventional rules” compared to offenders processed through regular juvenile justice avenues. Other attitudinal and opinion measures did not differ significantly between the experimental and control group members based on their post-test responses. In sum, those youth randomly assigned to teen court fared worse than control group members who were not randomly assigned to teen court.

Limitations with the Study Procedure

Limitations are inherent in any research study and those research efforts that utilize experimental designs are no exception. It is important to consider the potential impact that a limitation of the study procedure could have on the results of the study.

In the current study, one potential limitation is that teen courts from four different counties in Maryland were utilized. Because of the diversity in teen court sites, it is possible that there were differences in procedure between the four teen courts and such differences could have impacted the outcomes of this study. For example, perhaps staff members at one teen court were more punishment-oriented than staff members at the other county teen courts. This philosophical difference may have affected treatment delivery and hence experimental group members’ belief in conventional attitudes and recidivism. Although the researchers monitored each teen court to help ensure treatment consistency between study sites, it is possible that differences existed in the day-to-day operation of the teen courts that may have affected participant outcomes. This same limitation might also apply to control group members who were sanctioned with regular juvenile justice processing in four different counties.

A researcher must also consider the potential for differences between the experimental and control group members. Although the offenders were randomly assigned to the experimental or control group, and the assumption is that the groups were equivalent to each other prior to program participation, the researchers in this study were only able to compare the experimental and control groups on four variables: age, school grade, gender, and race. It is possible that the experimental and control group members differed by chance on one or more factors not measured or available to the researchers. For example, perhaps a large number of teen court members experienced problems at home that can explain their more dismal post-test results compared to control group members without such problems. A larger sample of juvenile offenders would likely have helped to minimize any differences between the experimental and control group members. The collection of additional information from study participants would have also allowed researchers to be more confident that the experimental and control group members were equivalent on key pieces of information that could have influenced recidivism and participant attitudes.

Finally, while 168 juvenile offenders were randomly assigned to either the experimental or control group, not all offenders agreed to participate in the evaluation. Remember that of the 83 offenders assigned to the teen court experimental group, only 56 (67%) agreed to participate in the study. Of the 85 youth randomly assigned to normal juvenile justice processing, only 51 (60%) agreed to participate in the study. While this limitation is unavoidable, it still could have influenced the study. Perhaps those 27 offenders who declined to participate in the teen court group differed significantly from the 56 who agreed to participate. If so, it is possible that the differences between those two groups could have impacted the results of the study. For example, perhaps the 27 youths who were randomly assigned to teen court but did not agree to be a part of the study were some of the least risky of potential teen court participants—less serious histories, better attitudes to begin with, and so on. In this case, perhaps the riskiest teen court participants agreed to be a part of the study, and their higher risk led to more dismal delinquency outcomes compared to the control group at the end of each respective program. Because parental consent would have been required for the study authors to collect the data needed to compare those who declined to participate with those who agreed, it is unknown whether the participants and nonparticipants differed significantly on any variables in either the experimental or control group. Moreover, of the resulting 107 offenders who took part in the study, only 75 offenders accurately completed the post-test survey measuring offending and attitudinal outcomes.

Again, despite the experimental nature of this study, such limitations could have impacted the study results and must be considered.

Impact on Criminal Justice

Teen courts are generally designed to deal with nonserious first time offenders before they escalate to more serious and chronic delinquency. Innovative programs such as “Scared Straight” and juvenile boot camps have inspired an increase in teen court programs across the country, although there is little evidence regarding their effectiveness compared to traditional sanctions for youthful offenders. This study provides more specific evidence as to the effectiveness of teen courts relative to normal juvenile justice processing. Researchers learned that teen court participants fared worse than those in the control group. The potential labeling effects of teen court, including stigma among peers, especially where the offense may have been very minor, may be more harmful than doing less or nothing. The real impact of this study lies in the recognition that teen courts and similar sanctions for minor offenders may do more harm than good.

One important impact of this study is that it utilized an experimental design to evaluate the effectiveness of a teen court compared to traditional juvenile justice processing. Despite the study’s limitations, by using an experimental design it improved upon previous teen court evaluations by attempting to ensure any results were in fact due to the treatment, not some difference between the experimental and control group. This study also utilized both official and self-report measures of delinquency, in addition to self-report measures on such factors as self-concept and belief in conventional rules, which have been generally absent from teen court evaluations. The study authors also attempted to gauge the comparability of the experimental and control groups on factors such as age, gender, and race to help make sure study outcomes were attributable to the program, not the participants.

In This Chapter You Will Learn

The four components of experimental and quasi-experimental research designs and their function in answering a research question

The differences between experimental and quasi-experimental designs

The importance of randomization in an experimental design

The types of questions that can be answered with an experimental or quasi-experimental research design

About the three factors required for a causal relationship

That a relationship between two or more variables may appear causal, but may in fact be spurious, or explained by another factor

That experimental designs are relatively rare in criminal justice and why

About common threats to internal validity or alternative explanations to what may appear to be a causal relationship between variables

Why experimental designs are superior to quasi-experimental designs for eliminating or reducing the potential of alternative explanations

Introduction

The teen court evaluation that began this chapter is an example of an experimental design. The researchers of the study wanted to determine whether teen court was more effective at reducing recidivism and improving attitudes compared to regular juvenile justice case processing. In short, the researchers were interested in the relationship between variables —the relationship of teen court to future delinquency and other outcomes. When researchers are interested in whether a program, policy, practice, treatment, or other intervention impacts some outcome, they often utilize a specific type of research method/design called experimental design. Although there are many types of experimental designs, the foundation for all of them is the classic experimental design. This research design, and some typical variations of this experimental design, are the focus of this chapter.

Although the classic experiment may be appropriate to answer a particular research question, there are barriers that may prevent researchers from using this or another type of experimental design. In these situations, researchers may turn to quasi-experimental designs. Quasi-experiments include a group of research designs that are missing a key element found in the classic experiment and other experimental designs (hence the term “quasi” experiment). Despite this missing part, quasi-experiments are similar in structure to experimental designs and are used to answer similar types of research questions. This chapter will also focus on quasi-experiments and how they are similar to and different from experimental designs.

Uncovering the relationship between variables, such as the impact of teen court on future delinquency, is important in criminal justice and criminology, just as it is in other scientific disciplines such as education, biology, and medicine. Indeed, whereas criminal justice researchers may be interested in whether a teen court reduces recidivism or improves attitudes, medical field researchers may be concerned with whether a new drug reduces cholesterol, or an education researcher may be focused on whether a new teaching style leads to greater academic gains. Across these disciplines and topics of interest, the experimental design is appropriate. In fact, experimental designs are used in all scientific disciplines; the only thing that changes is the topic. Specific to criminal justice, below is a brief sampling of the types of questions that can be addressed using an experimental design:

Does participation in a correctional boot camp reduce recidivism?

What is the impact of an in-cell integration policy on inmate-on-inmate assaults in prisons?

Does police officer presence in schools reduce bullying?

Do inmates who participate in faith-based programming while in prison have a lower recidivism rate upon their release from prison?

Do police sobriety checkpoints reduce drunken driving fatalities?

What is the impact of a no-smoking policy in prisons on inmate-on-inmate assaults?

Does participation in a domestic violence intervention program reduce repeat domestic violence arrests?

A focus on the classic experimental design will demonstrate the usefulness of this research design for addressing criminal justice questions interested in cause and effect relationships. Particular attention is paid to the classic experimental design because it serves as the foundation for all other experimental and quasi-experimental designs, some of which are covered in this chapter. As a result, a clear understanding of the components, organization, and logic of the classic experimental design will facilitate an understanding of other experimental and quasi-experimental designs examined in this chapter. It will also allow the reader to better understand the results produced from those various designs, and importantly, what those results mean. It is a truism that the results of a research study are only as “good” as the design or method used to produce them. Therefore, understanding the various experimental and quasi-experimental designs is the key to becoming an informed consumer of research.

The Challenge of Establishing Cause and Effect

Researchers interested in explaining the relationship between variables, such as whether a treatment program impacts recidivism, are interested in causation or causal relationships. In a simple example, a causal relationship exists when X (independent variable) causes Y (dependent variable), and there are no other factors (Z) that can explain that relationship. For example, offenders who participated in a domestic violence intervention program (X–domestic violence intervention program) experienced fewer re-arrests (Y–re-arrests) than those who did not participate in the domestic violence program, and no other factor other than participation in the domestic violence program can explain these results. The classic experimental design is superior to other research designs in uncovering a causal relationship, if one exists. Before a causal relationship can be established, however, there are three conditions that must be met (see Figure 5.1). 2

FIGURE 5.1 | The Cause and Effect Relationship


Timing The first condition for a causal relationship is timing. For a causal relationship to exist, it must be shown that the independent variable or cause (X) preceded the dependent variable or outcome (Y) in time. A decrease in domestic violence re-arrests (Y) cannot occur before participation in a domestic violence reduction program (X), if the domestic violence program is proposed to be the cause of fewer re-arrests. Ensuring that cause comes before effect is not sufficient to establish that a causal relationship exists, but it is one requirement that must be met for a causal relationship.

Association In addition to timing, there must also be an observable association between X and Y, the second necessary condition for a causal relationship. Association is also commonly referred to as covariance or correlation. When an association or correlation exists, this means there is some pattern of relationship between X and Y—as X changes by increasing or decreasing, Y also changes by increasing or decreasing. Here, the notion of X and Y increasing or decreasing can mean an actual increase/decrease in the quantity of some factor, such as an increase/decrease in the number of prison terms or days in a program or re-arrests. It can also refer to an increase/decrease in a particular category, for example, from nonparticipation in a program to participation in a program. For instance, subjects who participated in a domestic violence reduction program (X) incurred fewer domestic violence re-arrests (Y) than those who did not participate in the program. In this example, X and Y are associated—as X changes or increases from nonparticipation to participation in the domestic violence program, Y or the number of re-arrests for domestic violence decreases.

Associations between X and Y can occur in two different directions: positive or negative. A positive association means that as X increases, Y increases, or, as X decreases, Y decreases. A negative association means that as X increases, Y decreases, or, as X decreases, Y increases. In the example above, the association is negative—participation in the domestic violence program was associated with a reduction in re-arrests. This is also sometimes called an inverse relationship.
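The direction of an association can be checked numerically. The sketch below uses entirely hypothetical data; the `covariance_sign` helper and the arrest counts are illustrative, not drawn from any study:

```python
def covariance_sign(x, y):
    """Return +1, -1, or 0 for the direction of the association
    between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    return (cov > 0) - (cov < 0)

# Hypothetical data: 0 = no program participation, 1 = participation (X),
# paired with each subject's number of domestic violence re-arrests (Y)
participation = [0, 0, 0, 0, 1, 1, 1, 1]
re_arrests    = [3, 2, 4, 3, 1, 0, 1, 2]

print(covariance_sign(participation, re_arrests))  # -1: a negative (inverse) association
```

In practice a full correlation coefficient would be reported, but the sign of the covariance is enough to classify an association as positive or negative.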

Elimination of Alternative Explanations Although participation in a domestic violence program may be associated with a reduction in re-arrests, this does not mean for certain that participation in the program was the cause of reduced re-arrests. Just as timing by itself does not imply a causal relationship, association by itself does not imply a causal relationship. For example, instead of the program being the cause of a reduction in re-arrests, perhaps several of the program participants died shortly after completion of the domestic violence program and thus were not able to engage in domestic violence (and their deaths were unknown to the researcher tracking re-arrests). Perhaps a number of the program participants moved out of state and domestic violence re-arrests occurred but were not able to be uncovered by the researcher. Perhaps those in the domestic violence program experienced some other event, such as the trauma of a natural disaster, and that experience led to a reduction in domestic violence, an event not connected to the domestic violence program. If any of these situations occurred, it might appear that the domestic violence program led to fewer re-arrests. However, the observed reduction in re-arrests can actually be attributed to a factor unrelated to the domestic violence program.

The previous discussion leads to the third and final necessary consideration in determining a causal relationship— elimination of alternative explanations. This means that the researcher must rule out any other potential explanation of the results, except for the experimental condition such as a program, policy, or practice. Accounting for or ruling out alternative explanations is much more difficult than ensuring timing and association. Ruling out all alternative explanations is difficult because there are so many potential other explanations that can wholly or partly explain the findings of a research study. This is especially true in the social sciences, where researchers are often interested in relationships explaining human behavior. Because of this difficulty, associations by themselves are sometimes mistaken as causal relationships when in fact they are spurious. A spurious relationship is one where it appears that X and Y are causally related, but the relationship is actually explained by something other than the independent variable, or X.

One only needs to go so far as the daily newspaper to find headlines and stories of mere associations being mistaken, assumed, or represented as causal relationships. For example, a newspaper headline recently proclaimed “Churchgoers live longer.” 3 An uninformed consumer may interpret this headline as evidence of a causal relationship—that going to church by itself will lead to a longer life—but the astute consumer would note possible alternative explanations. For example, people who go to church may live longer because they tend to live healthier lifestyles and tend to avoid risky situations. These are two probable alternative explanations to the relationship independent of simply going to church. In another example, researchers David Kalist and Daniel Lee explored the relationship between first names and delinquent behavior in their manuscript titled “First Names and Crime: Does Unpopularity Spell Trouble?” 4 Kalist and Lee (2009) found that unpopular names are associated with juvenile delinquency. In other words, those individuals with the most unpopular names were more likely to be delinquent than those with more popular names. According to the authors, it is not necessarily someone’s name that leads to delinquent behavior; rather, the most unpopular names also tend to belong to individuals who come from disadvantaged home environments and experience a low socioeconomic standard of living. As the authors rightly note, these alternative explanations help to explain the link between someone’s name and delinquent behavior—a link that is not causal.

A frequently cited example provides more insight into the claim that an association by itself is not sufficient to prove causality. In certain cities in the United States, for example, as ice cream sales increase on a particular day or in a particular month, so does the incidence of certain forms of crime. If this association were represented as a causal statement, it would be that ice cream sales cause crime. There is an association, no doubt, and let us assume that ice cream sales rose before the increase in crime (timing). Surely, however, this relationship between ice cream sales and crime is spurious. The alternative explanation is that ice cream sales and crime are associated in certain parts of the country because of the weather. Ice cream sales tend to increase in warmer temperatures, and it just so happens that certain forms of crime tend to increase in warmer temperatures as well. This coincidence or association does not mean a causal relationship exists. Additionally, this does not mean that warm temperatures cause crime either. There are plenty of other alternative explanations for the increase in certain forms of crime in warmer temperatures. 6 For another example of a study subject to alternative explanations, read the June 2011 news article titled “Less Crime in U.S. Thanks to Videogames.” 7 Based on your reading, what are some other potential explanations for the crime drop other than videogames?
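The ice cream example can be simulated to show how a spurious association arises. In the sketch below, a confounder (temperature) drives both ice cream sales and crime counts, which never influence one another; all numbers are made up for illustration:

```python
import random

random.seed(1)  # reproducible illustration

# Temperature (Z) drives both ice cream sales (X) and crime counts (Y);
# X and Y have no causal connection in this simulation.
temps = [random.uniform(0, 35) for _ in range(200)]          # daily temperature
sales = [10 + 3.0 * t + random.gauss(0, 5) for t in temps]   # ice cream sales
crimes = [5 + 0.8 * t + random.gauss(0, 3) for t in temps]   # crime incidents

def corr(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# A strong positive sales-crime correlation appears anyway
print(round(corr(sales, crimes), 2))
```

Comparing only days with similar temperatures (that is, controlling for the confounder) would make the sales–crime association largely disappear.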

The preceding examples demonstrate how timing and association can be present, but the final needed condition for a causal relationship is that all alternative explanations are ruled out. While this task is difficult, the classic experimental design helps to ensure these additional explanatory factors are minimized. When other designs are used, such as quasi-experimental designs, the chance that alternative explanations emerge is greater. This potential should become clearer as we explore the organization and logic of the classic experimental design.

CLASSICS IN CJ RESEARCH

Minneapolis Domestic Violence Experiment

The Minneapolis Domestic Violence Experiment (MDVE) 5

Which police action (arrest, separation, or mediation) is most effective at deterring future misdemeanor domestic violence?

The experiment began on March 17, 1981, and continued until August 1, 1982. The experiment was conducted in two of Minneapolis’s four police precincts—the two with the highest number of domestic violence reports and arrests. A total of 314 reports of misdemeanor domestic violence were handled by the police during this time frame.

This study utilized an experimental design with the random assignment of police actions. Each police officer involved in the study was given a pad of report forms. Upon a misdemeanor domestic violence call, the officer’s action (arrest, separation, or mediation) was predetermined by the order and color of report forms in the officer’s notebook. Colored report forms were randomly ordered in the officer’s notebook and the color on the form determined the officer response once at the scene. For example, after receiving a call for domestic violence, an officer would turn to his or her report pad to determine the action. If the top form was pink, the action was arrest. If on the next call the top form was a different color, an action other than arrest would occur. All colored report forms were randomly ordered through a lottery assignment method. The result is that all police officer actions to misdemeanor domestic violence calls were randomly assigned. To ensure the lottery procedure was properly carried out, research staff participated in ride-alongs with officers to ensure that officers did not skip the order of randomly ordered forms. Research staff also made sure the reports were received in the order they were randomly assigned in the pad of report forms.
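The lottery procedure with colored report forms can be sketched as follows. This is a simplified, hypothetical illustration of the idea, not a reconstruction of the actual MDVE materials:

```python
import random

random.seed(7)  # reproducible illustration

# Each form color maps to a predetermined police action
ACTIONS = {"pink": "arrest", "blue": "separation", "yellow": "mediation"}

# Build a pad with equal numbers of each color, then randomly order it;
# the top form dictates the officer's response to each call.
pad = ["pink", "blue", "yellow"] * 10
random.shuffle(pad)

# Simulate handling the first few domestic violence calls in order
for call_number in range(3):
    color = pad.pop(0)          # take the top form off the pad
    print(call_number, ACTIONS[color])
```

Because the colors are randomly ordered before any call comes in, neither the officer nor the circumstances of the call can influence which action is assigned, which is what makes the assignment unbiased.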

To examine the relationship of different officer responses to future domestic violence, the researchers examined official arrests of the suspects in a 6-month follow-up period. For example, the researchers examined those initially arrested for misdemeanor domestic violence and how many were subsequently arrested for domestic violence within a 6-month time frame. They followed the same procedure for the police actions of separation and mediation. The researchers also interviewed the victim(s) of each incident and asked if a repeat domestic violence incident occurred with the same suspect in the 6-month follow-up period. This allowed researchers to examine domestic violence offenses that may have occurred but did not come to the official attention of police. The researchers then compared official arrests for domestic violence to self-reported domestic violence after the experiment.

Suspects arrested for misdemeanor domestic violence, as opposed to situations where separation or mediation was used, were significantly less likely to engage in repeat domestic violence as measured by official arrest records and victim interviews during the 6-month follow-up period. According to official police records, 10% of those initially arrested engaged in repeat domestic violence in the follow-up period, 19% of those who initially received mediation engaged in repeat domestic violence, and 24% of those who randomly received separation engaged in repeat domestic violence. According to victim interviews, 19% of those initially arrested engaged in repeat domestic violence, compared to 37% for separation and 33% for mediation. The general conclusion of the experiment was that arrest was preferable to separation or mediation in deterring repeat domestic violence across both official police records and victim interviews.

A few issues that affected the random assignment procedure occurred throughout the study. First, some officers did not follow the randomly assigned action (arrest, separation, or mediation) as a result of other circumstances that occurred at the scene. For example, if the randomly assigned action was separation, but the suspect assaulted the police officer during the call, the officer might arrest the suspect. Second, some officers simply ignored the assigned action if they felt a particular call for domestic violence required another action. For example, if the action was mediation as indicated by the randomly assigned report form, but the officer felt the suspect should be arrested, he or she may have simply ignored the randomly assigned response and substituted his or her own. Third, some officers forgot their report pads and did not know the randomly assigned course of action to take upon a call of domestic violence. Fourth and finally, the police chief also allowed officers to deviate from the randomly assigned action in certain circumstances. In all of these situations, the random assignment procedures broke down.

The results of the MDVE had a rapid and widespread impact on law enforcement practice throughout the United States. Just two years after the release of the study, a 1986 telephone survey of 176 urban police departments serving cities with populations of 100,000 or more found that 46 percent of the departments preferred to make arrests in cases of minor domestic violence, largely due to the effectiveness of this practice in the Minneapolis Domestic Violence Experiment. 8

In an attempt to replicate the findings of the Minneapolis Domestic Violence Experiment, the National Institute of Justice sponsored the Spouse Assault Replication Program. Replication studies were conducted in Omaha, Charlotte, Milwaukee, Miami, and Colorado Springs from 1986–1991. In three of the five replications, offenders randomly assigned to the arrest group had higher levels of continued domestic violence in comparison to other police actions during domestic violence situations. 9 Therefore, rather than providing results that were consistent with the Minneapolis Domestic Violence Experiment, the results from the five replication experiments produced inconsistent findings about whether arrest deters domestic violence. 10

Despite the findings of the replications, the push to arrest domestic violence offenders has continued in law enforcement. Today many police departments require officers to make arrests in domestic violence situations. In agencies that do not mandate arrest, department policy typically states a strong preference toward arrest. State legislatures have also enacted laws impacting police actions regarding domestic violence. Twenty-one states have mandatory arrest laws while eight have pro-arrest statutes for domestic violence. 11

The Classic Experimental Design

Table 5.1 provides an illustration of the classic experimental design. 12 It is important to become familiar with the specific notation and organization of the classic experiment before a full discussion of its components and their purpose.

Major Components of the Classic Experimental Design

The classic experimental design has four major components:

1. Treatment

2. Experimental Group and Control Group

3. Pre-Test and Post-Test

4. Random Assignment

Treatment The first component of the classic experimental design is the treatment, and it is denoted by X in the classic experimental design. The treatment can be a number of things—a program, a new drug, or the implementation of a new policy. In a classic experimental design, the primary goal is to determine what effect, if any, a particular treatment had on some outcome. In this way, the treatment can also be considered the independent variable.

TABLE 5.1 | The Classic Experimental Design

Experimental Group:   R    O1    X    O2
Control Group:        R    O1          O2

Experimental Group = Group that receives the treatment

Control Group = Group that does not receive the treatment

R = Random assignment

O1 = Observation before the treatment, or the pre-test

X = Treatment or the independent variable

O2 = Observation after the treatment, or the post-test

Experimental and Control Groups The second component of the classic experiment is an experimental group and a control group. The experimental group receives the treatment, and the control group does not receive the treatment. There will always be at least one group that receives the treatment in experimental and quasi-experimental designs. In some cases, experiments may have multiple experimental groups receiving multiple treatments.

Pre-Test and Post-Test The third component of the classic experiment is a pre-test and a post-test. A pre-test is a measure of the dependent variable or outcome before the treatment. The post-test is a measure of the dependent variable after the treatment is administered. It is important to note that the post-test is defined based on the stated goals of the program. For example, if the stated goal of a particular program is to reduce re-arrests, the post-test will be a measure of re-arrests after the program. The dependent variable also defines the pre-test. For example, if a researcher wanted to examine the impact of a domestic violence reduction program (treatment or X) on the goal of reducing re-arrests (dependent variable or Y), the pre-test would be the number of domestic violence arrests incurred before the program. Program goals may be numerous and all can constitute a post-test, and hence, the pre-test. For example, perhaps the goal of the domestic violence program is also that participants learn of different pro-social ways to handle domestic conflicts other than resorting to violence. If researchers wanted to examine this goal, the post-test might be subjects’ level of knowledge about pro-social ways to handle domestic conflicts other than violence. The pre-test would then be subjects’ level of knowledge about these pro-social alternatives to violence before they received the treatment program.
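The pre-test/post-test logic can be expressed as a simple before-and-after comparison for each group. The arrest counts below are hypothetical, invented only to show the arithmetic:

```python
# Hypothetical pre-test and post-test measures (arrest counts per subject)
experimental = {"pre": [4, 3, 5, 4], "post": [1, 2, 1, 2]}
control      = {"pre": [4, 4, 5, 3], "post": [3, 4, 4, 3]}

def mean(values):
    return sum(values) / len(values)

# Change from pre-test to post-test within each group
exp_change = mean(experimental["post"]) - mean(experimental["pre"])
ctl_change = mean(control["post"]) - mean(control["pre"])

# The estimated treatment effect is the difference between those changes
print(exp_change, ctl_change, exp_change - ctl_change)  # -2.5 -0.5 -2.0
```

Because both groups start from similar pre-test levels, the larger drop in the experimental group is attributable to the treatment rather than to some change affecting everyone.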

Although all designs have a post-test, it is not always the case that designs have a pre-test. This is because researchers may not have access or be able to collect information constituting the pre-test. For example, researchers may not be able to determine subjects’ level of knowledge about alternatives to domestic violence before the intervention program if the subjects are already enrolled in the domestic violence intervention program. In other cases, there may be financial barriers to collecting pre-test information. In the teen court evaluation that started this chapter, for example, researchers were not able to collect pre-test information on study participants due to the financial strain it would have placed on the agencies involved in the study. 13 There are a number of potential reasons why a pre-test might not be available in a research study. The defining feature, however, is that the pre-test is determined by the post-test.

Random Assignment The fourth component of the classic experiment is random assignment. Random assignment is the process of placing subjects into the experimental group or the control group through a random and unbiased procedure. Random assignment should not be mistaken for random selection as discussed in Chapter 3. Random selection refers to selecting a smaller but representative sample from a larger population. For example, a researcher may randomly select a sample from a larger city population for the purposes of sending sample members a mail survey to determine their attitudes on crime. The goal of random selection in this example is to make sure the sample, although smaller in size than the population, accurately represents the larger population.

Random assignment, on the other hand, refers to the process of assigning subjects to either the experimental or control group with the goal that the groups are similar or equivalent to each other in every way (see Figure 5.2). The exception to this rule is that one group gets the treatment and the other does not (see discussion below on why equivalence is so important). Although the concept of randomness is similar in each, the goals of random selection and random assignment differ. 14 Experimental designs all feature random assignment, but this is not true of other research designs, in particular quasi-experimental designs.

FIGURE 5.2 | Random Assignment


The classic experimental design is the foundation for all other experimental and quasi-experimental designs because it retains all of the major components discussed above. As mentioned, sometimes designs do not have a pre-test, a control group, or random assignment. Because the pre-test, control group, and random assignment are so critical to the goal of uncovering a causal relationship, if one exists, we explore them further below.

The Logic of the Classic Experimental Design

Consider a research study using the classic experimental design where the goal is to determine if a domestic violence treatment program has any effect on re-arrests for domestic violence. The randomly assigned experimental and control groups are composed of persons who had previously been arrested for domestic violence. The pre-test is a measure of the number of domestic violence arrests before the program. This is because the goal of the program is to determine whether re-arrests are impacted after the treatment. The post-test is the number of re-arrests following the treatment program.

Once randomly assigned, the experimental group members receive the domestic violence program, and the control group members do not. After the program, the researcher will compare the pre-test arrests for domestic violence of the experimental group to post-test arrests for domestic violence to determine if arrests increased, decreased, or remained constant since the start of the program. The researcher will also compare the post-test re-arrests for domestic violence between the experimental and control groups. With this example, we explore the usefulness of the classic experimental design, and the contribution of the pre-test, random assignment, and the control group to the goal of determining whether a domestic violence program reduces re-arrests.

The Pre-Test As a component of the classic experiment, the pre-test allows an examination of change in the dependent variable from before the domestic violence program to after the domestic violence program. In short, a pre-test allows the researcher to determine if re-arrests increased, decreased, or remained the same following the domestic violence program. Without a pre-test, researchers would not be able to determine the extent of change, if any, from before to after the program for either the experimental or control group.

Although the pre-test is a measure of the dependent variable before the treatment, it can also be thought of as a measure whereby the researcher can compare the experimental group to the control group before the treatment is administered. For example, the pre-test helps researchers to make sure both groups are similar or equivalent on previous arrests for domestic violence. The importance of equivalence between the experimental and control groups on previous arrests is discussed below with random assignment.

Random Assignment Random assignment helps to ensure that the experimental and control groups are equivalent before the introduction of the treatment. This is perhaps one of the most critical aspects of the classic experiment and all experimental designs. Although the experimental and control groups will be made up of different people with different characteristics, assigning them to groups via a random assignment process helps to ensure that any differences or bias between the groups is eliminated or minimized. By minimizing bias, we mean that the groups will balance each other out on all factors except the treatment. If they are balanced out on all factors prior to the administration of the treatment, any differences between the groups at the post-test must be due to the treatment—the only factor that differs between the experimental group and the control group. According to Shadish, Cook, and Campbell: “If implemented correctly, random assignment creates two or more groups of units that are probabilistically similar to each other on the average. Hence, any outcome differences that are observed between those groups at the end of a study are likely to be due to treatment, not to differences between the groups that already existed at the start of the study.” 15 Considered in another way, if the experimental and control group differed significantly on any relevant factor other than the treatment, the researcher would not know if the results observed at the post-test are attributable to the treatment or to the differences between the groups.

Consider an example where 500 domestic abusers were randomly assigned to the experimental group and 500 were randomly assigned to the control group. Because they were randomly assigned, we would likely find both more and less frequent domestic violence arrestees in each group, both older and younger arrestees in each group, and so on. If random assignment was implemented correctly, it would be highly unlikely that all of the experimental group members were the most serious or frequent arrestees and all of the control group members were less serious and/or less frequent arrestees. While there are no guarantees, we know the chance of this happening is extremely small with random assignment because it is based on known probability theory. Thus, except for a chance occurrence, random assignment will result in equivalence between the experimental and control group in much the same way that flipping a coin multiple times will result in heads approximately 50% of the time and tails approximately 50% of the time. Over 1,000 tosses of a coin, for example, should result in roughly 500 heads and 500 tails. While there is a chance that flipping a coin 1,000 times will result in heads 1,000 times, or some other major imbalance between heads and tails, this potential is small and would only occur by chance.
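The coin-flip logic above can be sketched in a few lines of Python. This is a hypothetical illustration: the counts and random seed are invented, not taken from any study discussed in this chapter.

```python
import random

# Flipping a fair coin 1,000 times yields roughly 500 heads and 500 tails;
# a large imbalance is possible but extremely unlikely. A fixed seed makes
# this sketch reproducible.
random.seed(42)
flips = [random.choice(["heads", "tails"]) for _ in range(1000)]
print("Heads:", flips.count("heads"))  # close to 500

# The same chance process assigns 1,000 hypothetical arrestees to groups,
# so serious and less serious cases tend to land in both groups.
assignments = [random.choice(["experimental", "control"]) for _ in range(1000)]
print("Experimental:", assignments.count("experimental"))
print("Control:", assignments.count("control"))
```

Running the sketch with different seeds shows group sizes hovering near 500 each, which is the probabilistic equivalence the text describes.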

The same logic from above also applies with randomly assigning people to groups, and this can even be done by flipping a coin. By assigning people to groups through a random and unbiased process, like flipping a coin, only by chance (or researcher error) will one group have more of one characteristic than another, on average. If there are no major (also called statistically significant) differences between the experimental and control group before the treatment, the most plausible explanation for the results at the post-test is the treatment.

As mentioned, it is possible by some chance occurrence that the experimental and control group members are significantly different on some characteristic prior to administration of the treatment. To confirm that the groups are in fact similar after they have been randomly assigned, the researcher can examine the pre-test if one is present. If the researcher has additional information on subjects before the treatment is administered, such as age, or any other factor that might influence post-test results at the end of the study, he or she can also compare the experimental and control group on those measures to confirm that the groups are equivalent. Thus, a researcher can confirm that the experimental and control groups are equivalent on information known to the researcher.
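As a sketch of this equivalence check, a researcher might compare group means on each known pre-treatment measure. The data and measures below are simulated for illustration only.

```python
import random
from statistics import mean

# Simulate 1,000 subjects with two known pre-treatment measures, then
# randomly assign each subject by "coin flip" and compare group means.
random.seed(0)
subjects = [{"prior_arrests": random.randint(1, 10),
             "age": random.randint(18, 60)} for _ in range(1000)]

experimental, control = [], []
for s in subjects:
    (experimental if random.random() < 0.5 else control).append(s)

for measure in ("prior_arrests", "age"):
    e = mean(s[measure] for s in experimental)
    c = mean(s[measure] for s in control)
    print(f"{measure}: experimental={e:.2f}, control={c:.2f}")
# Similar means on every known measure suggest the assignment "worked";
# a formal check would use a significance test rather than eyeballing.
```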

Being able to compare the groups on known measures is an important way to ensure the random assignment process “worked.” However, perhaps most important is that randomization also helps to ensure similarity across unknown variables between the experimental and control group. Because random assignment is based on known probability theory, all potential differences between the groups that could impact the post-test—known or unknown—are likely to balance out. Without random assignment, it is likely that the experimental and control group would differ on important but unknown factors and such differences could emerge as alternative explanations for the results. For example, if a researcher did not utilize random assignment and instead assigned the first 500 domestic abusers from an ordered list to the experimental group and the last 500 to the control group, one of the groups could be “lopsided” or imbalanced on some important characteristic that could impact the outcome of the study. With random assignment, there is a much higher likelihood that these important characteristics among the experimental and control groups will balance out because every individual has the same chance of being placed into either group. The probability of one or more characteristics being concentrated into one group and not the other is extremely small with random assignment.

To further illustrate the importance of random assignment to group equivalence, suppose the first 500 domestic violence abusers who were assigned to the experimental group from the ordered list had significantly fewer domestic violence arrests before the program than the last 500 domestic violence abusers on the list. Perhaps this is because the ordered list was organized from least to most chronic domestic abusers. In this instance, the control group would be lopsided concerning number of pre-program domestic violence arrests—they would be more chronic than the experimental group. The arrest imbalance then could potentially explain the post-test results following the domestic violence program. For example, the “less risky” offenders in the experimental group might be less likely to be re-arrested regardless of their participation in the domestic violence program, especially compared to the more chronic domestic abusers in the control group. Because of imbalances between the experimental and control group on arrests before the program was implemented, it would not be known for certain whether an observed reduction in re-arrests after the program for the experimental group was due to the program or the natural result of having less risky offenders in the experimental group. In this instance, the results might be taken to suggest that the program significantly reduces re-arrests. This conclusion might be spurious, however, for the association may simply be due to the fact that the offenders in the experimental group were much different (less frequent offenders) than the control group. Here, the program may have had no effect—the experimental group members may have performed the same regardless of the treatment because they were low-level offenders.
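The lopsided split just described can be shown with a short simulation. The arrest counts below are invented for illustration; the point is the contrast between splitting an ordered list and splitting a shuffled one.

```python
import random
from statistics import mean

# An ordered list of 1,000 abusers sorted from least to most prior arrests.
random.seed(1)
prior_arrests = sorted(random.randint(0, 15) for _ in range(1000))

# Flawed assignment: first 500 names to the experimental group, last 500
# to the control group. The control group ends up far more chronic.
print("Ordered split:", mean(prior_arrests[:500]), mean(prior_arrests[500:]))

# Random assignment: shuffle first, then split. The groups balance out.
shuffled = prior_arrests[:]
random.shuffle(shuffled)
print("Random split: ", mean(shuffled[:500]), mean(shuffled[500:]))
```

The first comparison shows a large gap in mean prior arrests between the halves; after shuffling, the two group means are nearly identical.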

The example above suggests that differences between the experimental and control groups based on previous arrest records could have a major impact on the results of a study. Such differences can arise with the lack of random assignment. If subjects were randomly assigned to the experimental and control group, however, there would be a much higher probability that less frequent and more frequent domestic violence arrestees would have been found in both the experimental and control groups and the differences would have balanced out between the groups—leaving any differences between the groups at the post-test attributable to the treatment only.

In summary, random assignment helps to ensure that the experimental and control group members are balanced or equivalent on all factors that could impact the dependent variable or post-test—known or unknown. The only factor they are not balanced or equal on is the treatment. As such, random assignment helps to isolate the impact of the treatment, if any, on the post-test because it increases confidence that the only difference between the groups should be that one group gets the treatment and the other does not. If that is the only difference between the groups, any change in the dependent variable between the experimental and control group must be attributed to the treatment and not an alternative explanation, such as significant arrest history imbalance between the groups (refer to Figure 5.2). This logic also suggests that if the experimental group and control group are imbalanced on any factor that may be relevant to the outcome, that factor then becomes a potential alternative explanation for the results—an explanation that reduces the researcher’s ability to isolate the real impact of the treatment.

WHAT RESEARCH SHOWS: IMPACTING CRIMINAL JUSTICE OPERATIONS

Scared Straight

The 1978 documentary Scared Straight introduced to the public the “Lifer’s Program” at Rahway State Prison in New Jersey. This program sought to decrease juvenile delinquency by bringing at-risk and delinquent juveniles into the prison where they would be “scared straight” by inmates serving life sentences. Participants in the program were talked to and yelled at by the inmates in an effort to scare them. It was believed that the fear felt by the participants would lead to a discontinuation of their problematic behavior so that they would not end up in prison themselves. Although originally touted as a success based on anecdotal evidence, subsequent evaluations of the program and others like it proved otherwise.

Using a classic experimental design, Finckenauer evaluated the original “Lifer’s Program” at Rahway State Prison. 16 Participating juveniles were randomly assigned to the experimental group or the control group. Results of the evaluation were not positive. Post-test measures revealed that juveniles who were assigned to the experimental group and participated in the program were actually more seriously delinquent afterwards than those who did not participate in the program. Also using an experimental design with random assignment, Yarborough evaluated the “Juvenile Offenders Learn Truth” (JOLT) program at the State Prison of Southern Michigan at Jackson. 17 This program was similar to that of the “Lifer’s Program” only with fewer obscenities used by the inmates. Post-test measurements were taken at two intervals, 3 and 6 months after program completion. Again, results were not positive. Findings revealed no significant differences between those juveniles who attended the program and those who did not.

Other experiments conducted on Scared Straight-like programs further revealed their inability to deter juveniles from future criminality. 18 Despite the intuitive popularity of these programs, these evaluations proved that such programs were not successful. In fact, it is postulated that these programs may have actually done more harm than good.

The Control Group The presence of an equivalent control group (created through random assignment) also gives the researcher more confidence that the findings at the post-test are due to the treatment and not some other alternative explanation. This logic is perhaps best demonstrated by considering how interpretation of results is affected without a control group. Absent an equivalent control group, it cannot be known whether the results of the study are due to the program or some other factor. This is because the control group provides a baseline of comparison or a “control.” For example, without a control group, the researcher may find that domestic violence arrests declined from pre-test to post-test. But the researcher would not be able to definitively attribute that finding to the program without a control group. Perhaps the single experimental group incurred fewer arrests because they matured over their time in the program, regardless of participation in the domestic violence program. Having a randomly assigned control group would allow this consideration to be eliminated, because the equivalent control group would also have naturally matured if that were the case.

Because the control group is meant to be similar to the experimental group on all factors with the exception that the experimental group receives the treatment, the logic is that any differences between the experimental and control group after the treatment must then be attributable only to the treatment itself—everything else occurs equally in both the experimental and control groups and thus cannot be the cause of results. The bottom line is that a control group allows the researcher more confidence to attribute any change in the dependent variable from pre- to post-test and between the experimental and control groups to the treatment—and not another alternative explanation. Absent a control group, the researcher would have much less confidence in the results.

Knowledge about the major components of the classic experimental design and how they contribute to an understanding of cause and effect serves as an important foundation for studying different types of experimental and quasi-experimental designs and their organization. A useful way to become familiar with the components of the experimental design and their important role is to consider the impact on the interpretation of results when one or more components are lacking.

For example, what if a design lacked a pre-test? How could this impact the interpretation of post-test results and knowledge about the comparability of the experimental and control group? What if a design lacked random assignment? What are some potential problems that could occur and how could those potential problems impact interpretation of results? What if a design lacked a control group? How does the absence of an equivalent control group affect a researcher’s ability to determine the unique effects of the treatment on the outcomes being measured?

The ability to discuss the contribution of a pre-test, random assignment, and a control group—and what is the impact when one or more of those components is absent from a research design—is the key to understanding both experimental and quasi-experimental designs that will be discussed in the remainder of this chapter. As designs lose these important parts and transform from a classic experiment to another experimental design or to a quasi-experiment, they become less useful in isolating the impact that a treatment has on the dependent variable and allow more room for alternative explanations of the results.

One more important point must be made before further delving into experimental and quasi-experimental designs. This point is that rarely, if ever, will the average consumer of research be exposed to the symbols or specific language of the classic experiment, or other experimental and quasi-experimental designs examined in this chapter. In fact, it is unlikely that the average consumer will ever be exposed to the terms pre-test, post-test, experimental group, or random assignment in the popular media, among other terms related to experimental and quasi-experimental designs. Yet, consumers are exposed to research results produced from these and other research designs every day.

For example, if a national news organization or your regional newspaper reported a story about the effectiveness of a new drug to reduce cholesterol or the effects of different diets on weight loss, it is doubtful that the results would be reported as produced through a classic experimental design that used a control group and random assignment. Rather, these media outlets would use generally nonscientific terminology such as “results of an experiment showed” or “results of a scientific experiment indicated” or “results showed that subjects who received the new drug had greater cholesterol reductions than those who did not receive the new drug.” Even students who regularly search and read academic articles for use in course papers and other projects will rarely come across such design notation in the research studies they utilize. Depiction of the classic experimental design, including a discussion of its components and their function, simply illustrates the organization and notation of the classic experimental design.

Unfortunately, the average consumer has to read between the lines to determine what type of design was used to produce the reported results. Understanding the key components of the classic experimental design allows educated consumers of research to read between those lines.

RESEARCH IN THE NEWS

“Swearing Makes Pain More Tolerable” 19

In 2009, Richard Stephens, John Atkins, and Andrew Kingston of the School of Psychology at Keele University conducted a study with 67 undergraduate students to determine if swearing affects an individual’s response to pain. Researchers asked participants to immerse their hand in a container filled with ice-cold water and repeat a preferred swear word. The researchers then asked the same participants to immerse their hand in ice-cold water while repeating a word used to describe a table (a non-swear word). The results showed that swearing increased pain tolerance compared to the non-swearing condition. Participants who used a swear word were able to hold their hand in ice-cold water longer than when they did not swear. Swearing also decreased participants’ perception of pain.
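The within-subject logic of this study, in which each participant is measured under both conditions, can be sketched as follows. All numbers are simulated for illustration and are not the study's actual data.

```python
import random
from statistics import mean

# Simulate 67 participants' immersion times (in seconds) under the neutral
# condition, then add a per-person gain for the swearing condition.
random.seed(7)
neutral = [random.gauss(80, 20) for _ in range(67)]
swearing = [t + random.gauss(30, 15) for t in neutral]

# The repeated measures comparison looks at each person's own difference,
# so stable individual traits (e.g., overall pain tolerance) cancel out.
gains = [s - n for s, n in zip(swearing, neutral)]
print(f"Mean within-subject gain: {mean(gains):.1f} seconds")
```

Because each participant serves as his or her own control, the comparison of interest is the per-person difference rather than a difference between separate groups.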

1. This study is an example of a repeated measures design. In this form of experimental design, study participants are exposed to an experimental condition (swearing with hand in ice-cold water) and a control condition (non-swearing with hand in ice-cold water) while repeated outcome measures are taken with each condition, for example, the length of time a participant was able to keep his or her hand submerged in ice-cold water. Conduct an Internet search for “repeated measures design” and explore the various ways such a study could be conducted, including the potential benefits and drawbacks to this design.

2. After researching repeated measures designs, devise a hypothetical repeated measures study of your own.

3. Retrieve and read the full research study “Swearing as a Response to Pain” by Stephens, Atkins, and Kingston while paying attention to the design and methods (full citation information for this study is listed below). Has your opinion of the study results changed after reading the full study? Why or why not?

Full Study Source: Stephens, R., Atkins, J., and Kingston, A. (2009). “Swearing as a response to pain.” NeuroReport 20, 1056–1060.

Variations on the Experimental Design

The classic experimental design is the foundation upon which all experimental and quasi-experimental designs are based. As such, it can be modified in numerous ways to fit the goals (or constraints) of a particular research study. Below are two variations of the experimental design. Again, knowledge about the major components of the classic experiment, how they contribute to an explanation of results, and what the impact is when one or more components are missing provides an understanding of all other experimental designs.

Post-Test Only Experimental Design

The post-test only experimental design could be used to examine the impact of a treatment program on school disciplinary infractions as measured or operationalized by referrals to the principal’s office (see Table 5.2). In this design, the researcher randomly assigns a group of discipline problem students to the experimental group and control group by flipping a coin—heads to the experimental group and tails to the control group. The experimental group then enters the 3-month treatment program. After the program, the researcher compares the number of referrals to the principal’s office between the experimental and control groups over some period of time, for example, discipline referrals at 6 months after the program. The researcher finds that the experimental group has a much lower number of referrals to the principal’s office in the 6-month follow-up period than the control group.

TABLE 5.2 | Post-Test Only Experimental Design

R    X    O
R         O

Several issues arise in this example study. The researcher would not know if discipline problems decreased, increased, or stayed the same from before to after the treatment program because the researcher did not have a count of disciplinary referrals prior to the treatment program (i.e., a pre-test). Although the groups were randomly assigned and are presumed equivalent, the absence of a pre-test means the researcher cannot confirm that the experimental and control groups were equivalent before the treatment was administered, particularly on the number of referrals to the principal’s office. The groups could have differed by a chance occurrence even with random assignment, and any such differences between the groups could potentially explain the post-test difference in the number of referrals to the principal’s office. For example, if the control group included much more serious or frequent discipline problem students than the experimental group by chance, this difference might explain the lower number of referrals for the experimental group, rather than the treatment producing this result.

Experimental Design with Two Treatments and a Control Group

This design could be used to determine the impact of boot camp versus juvenile detention on post-release recidivism (see Table 5.3). Recidivism in this study is operationalized as re-arrest for delinquent behavior. First, a population of known juvenile delinquents is randomly assigned to either boot camp, juvenile detention, or a control condition where they receive no sanction. To accomplish random assignment to groups, the researcher places the names of all youth into a hat and assigns the groups in order. For example, the first name pulled goes into experimental group 1, the next into experimental group 2, and the next into the control group, and so on. Once randomly assigned, the experimental group youth receive either boot camp or juvenile detention for a period of 3 months, whereas members of the control group are released on their own recognizance to their parents. At the end of the experiment, the researcher compares the re-arrest activity of boot camp participants, juvenile detention participants, and control group members during a 6-month follow-up period.

TABLE 5.3 | Experimental Design with Two Treatments and a Control Group

R    O    X    O
R    O    X    O
R    O         O

This design has several advantages. First, it includes all major components of the classic experimental design, and simply adds an additional treatment for comparison purposes. Random assignment was utilized, meaning the groups have a higher probability of being equivalent on all factors that could impact the post-test. Thus, random assignment in this example helps to ensure the only differences between the groups are the treatment conditions. Without random assignment, there is a greater chance that one group of youth was somehow different, and this difference could impact the post-test. For example, if the boot camp youth were much less serious and frequent delinquents than the juvenile detention youth or control group youth, the results might erroneously show that the boot camp reduced recidivism when in fact the youth in boot camp may have been the “best risks”—unlikely to get re-arrested with or without boot camp. The pre-test in the example above allows the researcher to determine change in re-arrests from pre-test to post-test. Thus, the researcher can determine if delinquent behavior, as measured by re-arrest, increased, decreased, or remained constant from pre- to post-test. The pre-test also allows the researcher to confirm that the random assignment process resulted in equivalent groups based on the pre-test. Finally, the presence of a control group allows the researcher to have more confidence that any differences in the post-test are due to the treatment. For example, if the control group had more re-arrests than the boot camp or juvenile detention experimental groups 6 months after their release from those programs, the researcher would have more confidence that the programs produced fewer re-arrests because the control group members were the same as the experimental groups; the only difference was that they did not receive a treatment.
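The "names in a hat" assignment used in this three-group design can be sketched like this. The youth names and group sizes are invented for illustration.

```python
import random

# Shuffle all youth (drawing names from a hat), then deal them in order to
# boot camp, detention, and control, repeating until the hat is empty.
random.seed(3)
youth = [f"youth_{i}" for i in range(99)]
random.shuffle(youth)

group_names = ["boot_camp", "detention", "control"]
groups = {name: [] for name in group_names}
for i, name in enumerate(youth):
    groups[group_names[i % 3]].append(name)  # rotate through the groups

for g, members in groups.items():
    print(g, len(members))  # 33 in each group
```

Shuffling first is what makes the round-robin deal random: without the shuffle, any ordering in the original list (such as seriousness of offense) would flow straight into the groups.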

One key feature of experimental designs is that they all retain random assignment. This is why they are considered “experimental” designs. Sometimes, however, experimental designs lack a pre-test. Knowledge of the usefulness of a pre-test demonstrates the potential problems with those designs where it is missing. For example, in the post-test only experimental design, a researcher would not be able to make a determination of change in the dependent variable from pre- to post-test. Perhaps most importantly, the researcher would not be able to confirm that the experimental and control groups were in fact equivalent on a pre-test measure before the introduction of the treatment. Even though both groups were randomly assigned, and probability theory suggests they should be equivalent, without a pre-test measure the researcher could not confirm similarity because differences could occur by chance even with random assignment. If there were any differences at the post-test between the experimental group and control group, the results might be due to some explanation other than the treatment, namely that the groups differed prior to the administration of the treatment. The same limitation could apply in any form of experimental design that does not utilize a pre-test for confirmatory purposes.

Understanding the contribution of a pre-test to an experimental design shows that it is a critical component. It provides a measure of change and also gives the researcher more confidence that the observed results are due to the treatment, and not some difference between the experimental and control groups. Despite the usefulness of a pre-test, however, the most critical ingredient of any experimental design remains random assignment.

Experimental Designs Are Rare in Criminal Justice and Criminology

The classic experiment is the foundation for other types of experimental and quasi-experimental designs. The unfortunate reality, however, is that the classic experiment, or other experimental designs, are few and far between in criminal justice. 20 Recall that one of the major components of an experimental design is random assignment. Achieving random assignment is often a barrier to experimental research in criminal justice. Achieving random assignment might, for example, require the approval of the chief (or city council or both) of a major metropolitan police agency to allow researchers to randomly assign patrol officers to certain areas of a city and/or randomly assign police officer actions. Recall the MDVE. This experiment required the full cooperation of the chief of police and other decision-makers to allow researchers to randomly assign police actions. In another example, achieving random assignment might require a judge to randomly assign a group of youthful offenders to a certain juvenile court sanction (experimental group), and another group of similar youthful offenders to no sanction or an alternative sanction as a control group. 21 In sum, random assignment typically requires the cooperation of a number of individuals and sometimes that cooperation is difficult to obtain.

Even when random assignment can be accomplished, sometimes it is not implemented correctly and the random assignment procedure breaks down. This is another barrier to conducting experimental research. For example, in the MDVE, researchers randomly assigned officer responses, but the officers did not always follow the assigned course of action. Moreover, some believe that the random assignment of criminal justice programs, sentences, or randomly assigning officer responses may be unethical in certain circumstances, and even a violation of the rights of citizens. For example, some believe it is unfair when random assignment results in some delinquents being sentenced to boot camp while others get assigned to a control group without any sanction at all or a less restrictive sanction than boot camp. In the MDVE, some believe it is unfair that some suspects were arrested and received an official record whereas others were not arrested for the same type of behavior. In other cases, subjects in the experimental group may receive some benefit from the treatment that is essentially denied to the control group for a period of time and this can become an issue as well.

There are other important reasons why random assignment is difficult to accomplish. Random assignment may, for example, involve a disruption of the normal procedures of agencies and their officers. In the MDVE, officers had to adjust their normal and established routine, and this was a barrier at times in that study. Shadish, Cook, and Campbell also note that random assignment may not always be feasible or desirable when quick answers are needed. 22 This is because experimental designs sometimes take a long time to produce results. In addition to the time required in planning and organizing the experiment, and treatment delivery, researchers may need several months if not years to collect and analyze the data before they have answers. This is particularly important because time is often of the essence in criminal justice research, especially in research efforts testing the effect of some policy or program where it is not feasible to wait years for answers. Waiting for the results of an experimental design means that many policy-makers may make decisions without the results.

Quasi-Experimental Designs

In general terms, quasi-experiments include a group of designs that lack random assignment. Quasi-experiments may also lack other parts, such as a pre-test or a control group, just like some experimental designs. The absence of random assignment, however, is the ingredient that transforms an otherwise experimental design into a quasi-experiment. Lacking random assignment is a major disadvantage because it increases the chances that the experimental and control groups differ on relevant factors before the treatment—both known and unknown—differences that may then emerge as alternative explanations of the outcomes.

Just like experimental designs, quasi-experimental designs can be organized in many different ways. This section will discuss three types of quasi-experiments: nonequivalent group design, one-group longitudinal design, and two-group longitudinal design.

Nonequivalent Group Design

The nonequivalent group design is perhaps the most common type of quasi-experiment. 23 Notice that it is very similar to the classic experimental design with the exception that it lacks random assignment (see Table 5.4). Additionally, what was labeled the experimental group in an experimental design is sometimes called the treatment group in the nonequivalent group design. What was labeled the control group in the experimental design is sometimes called the comparison group in the nonequivalent group design. This terminological distinction is an indicator that the groups were not created through random assignment.

TABLE 5.4 | Nonequivalent Group Design

NR   O   X   O
NR   O       O

NR = Not Randomly assigned

One of the main problems with the nonequivalent group design is that it lacks random assignment, and without random assignment there is a greater chance that the treatment and comparison groups differ in some way that can impact study results. Take, for example, a nonequivalent group design where a researcher is interested in whether an aggression-reduction treatment program can reduce inmate-on-inmate assaults in a prison setting. Assume that the researcher asked inmates who had previously been involved in assaultive activity to volunteer for the aggression-reduction program. Suppose the researcher placed the first 50 volunteers into the treatment group and the next 50 volunteers into the comparison group. Note that this method of assignment is not random but rather first come, first served.

Because the study utilized volunteers and there was no random assignment, it is possible that the first 50 volunteers placed into the treatment group differed significantly from the last 50 volunteers placed into the comparison group. This can lead to alternative explanations for the results. For example, if the treatment group was much younger than the comparison group, the researcher may find at the end of the program that the treatment group still maintained a higher rate of infractions than the comparison group, even after the aggression-reduction program! The conclusion might be that the aggression program actually increased the level of violence among the treatment group. That conclusion would likely be spurious. Research has revealed that younger inmates are significantly more likely to engage in prison assaults than older inmates, so the fact that the treatment group incurred more assaults than the comparison group after the program may reflect only the age differential between the groups, not that the program was ineffective or somehow increased aggression. This example highlights the importance of random assignment and the potential problems that can occur in its absence.

Although researchers who utilize a quasi-experimental design are not able to randomly assign their subjects to groups, they can employ other techniques in an attempt to make the groups as equivalent as possible on known or measured factors before the treatment is given. In the example above, it is likely that the researcher would have known the age of inmates, their prior assault record, and various other pieces of information (e.g., previous prison stays). Through a technique called matching, the researcher could make sure the treatment and comparison groups were “matched” on these important factors before administering the aggression reduction program to the treatment group. This type of matching can be done individual to individual (e.g., subject #1 in treatment group is matched to a selected subject #1 in comparison group on age, previous arrests, gender), or aggregately, such that the comparison group is similar to the treatment group overall (e.g., average ages between groups are similar, equal proportions of males and females). Knowledge of these and other important variables, for example, would allow the researcher to make sure that the treatment group did not have heavy concentrations of younger or more frequent or serious offenders than the comparison group—factors that are related to assaultive activity independent of the treatment program. In short, matching allows the researcher some control over who goes into the treatment and comparison groups so as to balance these groups on important factors absent random assignment. If unbalanced on one or more factors, these factors could emerge as alternative explanations of the results. Figure 5.3 demonstrates the logic of matching both at the individual and aggregate level in a quasi-experimental design.
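The logic of individual matching can be sketched in a few lines of code. Everything here is illustrative: the subjects, the matching variables (age and prior assaults), and the simple distance function are assumptions for demonstration, not a procedure prescribed by the text.

```python
# Hypothetical treatment-group inmates and a pool of comparison candidates.
treatment = [
    {"id": 1, "age": 22, "prior_assaults": 3},
    {"id": 2, "age": 35, "prior_assaults": 1},
]
candidates = [
    {"id": 101, "age": 23, "prior_assaults": 3},
    {"id": 102, "age": 36, "prior_assaults": 1},
    {"id": 103, "age": 50, "prior_assaults": 0},
]

def distance(a, b):
    # Crude similarity measure on the matching variables (an assumption).
    return abs(a["age"] - b["age"]) + abs(a["prior_assaults"] - b["prior_assaults"])

comparison = []
pool = list(candidates)
for t in treatment:
    best = min(pool, key=lambda c: distance(t, c))
    pool.remove(best)          # each comparison subject is matched only once
    comparison.append(best)

print([c["id"] for c in comparison])  # → [101, 102]
```

Aggregate matching would instead compare group-level summaries (mean age, proportion with prior assaults) rather than pairing individuals.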

Matching is an important part of the nonequivalent group design. By matching, the researcher can approximate equivalence between the groups on important variables that may influence the post-test. However, it is important to note that a researcher can only match subjects on factors that they have information about—a researcher cannot match the treatment and comparison group members on factors that are unmeasured or otherwise unknown but which may still impact outcomes. For example, if the researcher has no knowledge about the number of previous incarcerations, the researcher cannot match the treatment and comparison groups on this factor. Matching also requires that the information used for matching is valid and reliable, which is not always the case. Agency records, for example, are notorious for inconsistencies, errors, omissions, and for being dated, but are often utilized for matching purposes. Asking survey questions to generate information for matching (for example, how many times have you been incarcerated?) can also be problematic because some respondents may lie, forget, or exaggerate their behavior or experiences.

In addition to the above considerations, the more factors a researcher wishes to match the group members on, the more difficult it becomes to find appropriate matches. Matching on prior arrests or age is less complex than matching on several additional pieces of information. Finally, matching is never considered superior to random assignment when the goal is to construct equivalent groups. This is because random assignment offers a much higher likelihood of equivalence on factors that are measured as well as on those unknown to the researcher. Thus, the results produced from a nonequivalent group design, even with matching, are at greater risk of alternative explanations than those from an experimental design that features random assignment.

FIGURE 5.3 | (a) Individual Matching (b) Aggregate Matching


The previous discussion is not to suggest that the nonequivalent group design cannot be useful in answering important research questions. Rather, it is to suggest that the nonequivalent group design, and hence any quasi-experiment, is more susceptible to alternative explanations than the classic experimental design because of the absence of random assignment. As a result, a researcher must be prepared to rule out potential alternative explanations. Quasi-experimental designs that lack a pre-test or a comparison group are even less desirable than the nonequivalent group design and are subject to additional alternative explanations because of these missing parts. Although the quasi-experiment may be all that is available and still can serve as an important design in evaluating the impact of a particular treatment, it is not preferable to the classic experiment. Researchers (and consumers) must be attuned to the potential issues of this design so as to make informed conclusions about the results produced from such research studies.

The Effects of Red Light Camera (RLC) Enforcement

On March 15, 2009, an article appeared in the Santa Cruz Sentinel entitled "Ticket's in the Mail: Red-Light Cameras Questioned." The article stated "while studies show fewer T-bone crashes at lights with cameras and fewer drivers running red lights, the number of rear-end crashes increases." 24 The study mentioned in the newspaper, which showed fewer drivers running red lights with cameras, was conducted by Richard Retting, Susan Ferguson, and Charles Farmer of the Insurance Institute for Highway Safety (IIHS). 25 They completed a quasi-experimental study in Philadelphia to determine the impact of red light cameras (RLCs) on red light violations. In the study, the researchers selected nine intersections: six experimental sites that utilized RLCs and three comparison sites that did not. The six experimental sites were located in Philadelphia, Pennsylvania, and the three comparison sites were located in Atlantic County, New Jersey. The researchers chose the comparison sites based on their proximity to Philadelphia, the ability to collect data using the same methods as at the experimental intersections (e.g., the use of cameras for viewing red light traffic), and the fact that police officials in Atlantic County had offered assistance in selecting and monitoring the intersections.

The authors collected three phases of information in the RLC study at the experimental and comparison sites:

Phase 1 Data Collection: Baseline (pre-test) data collection at the experimental and comparison sites consisting of the number of vehicles passing through each intersection, the number of red light violations, and the rate of red light violations per 10,000 vehicles.

Phase 2 Data Collection: Number of vehicles traveling through experimental and comparison intersections, number of red light violations after a 1-second yellow light increase at the experimental sites (treatment 1), number of red light violations at comparison sites without a 1-second yellow light increase, and red light violations per 10,000 vehicles at both experimental and comparison sites.

Phase 3 Data Collection: Red light violations after a 1-second yellow light increase and RLC enforcement at the experimental sites (treatment 2), red light violations at comparison sites without a 1-second yellow increase or RLC enforcement, number of vehicles passing through the experimental and comparison intersections, and the rate of red light violations per 10,000 vehicles.

The researchers operationalized “red light violations” as those where the vehicle entered the intersection one-half of a second or more after the onset of the red signal where the vehicle’s rear tires had to be positioned behind the crosswalk or stop line prior to entering on red. Vehicles already in the intersection at the onset of the red light, or those making a right turn on red with or without stopping were not considered red light violations.

The researchers collected video data at each of the experimental and comparison sites during Phases 1–3. This allowed them to examine red light violations before, during, and after the implementation of red light enforcement and yellow light time increases. The analysis revealed that the 1-second yellow light increase led to reductions in the rate of red light violations from Phase 1 to Phase 2 at all of the experimental sites. At 2 of the 3 comparison sites, the rate of red light violations also decreased, despite no yellow light increase. From Phase 2 to Phase 3 (the enforcement of red light camera violations in addition to the 1-second yellow light increase at experimental sites), the authors noted decreases in the rate of red light violations at all experimental sites, and decreases at 2 of the 3 comparison sites without red light enforcement in effect.

Concluding their study, the researchers noted that the study “found large and highly significant incremental reductions in red light running associated with increased yellow signal timing followed by the introduction of red light cameras.” Despite these findings, the researchers noted a number of potential factors to consider in light of the findings: the follow-up time periods utilized when counting red light violations before and after the treatment conditions were instituted; publicity about red light camera enforcement; and the size of fines associated with red light camera enforcement (the fine in Philadelphia was $100, higher than in many other cities), among others.

After reading about the study used in the newspaper article, has your impression of the newspaper headline and quote changed?

For more information and research on the effect of RLCs, visit the Insurance Institute for Highway Safety at http://www.iihs.org/research/topics/rlr.html.

One-Group Longitudinal Design

Like experimental designs, quasi-experimental designs come in a variety of forms. The second quasi-experimental design covered here is the one-group longitudinal design (also called a simple interrupted time series design). 26 An examination of this design shows that it lacks both random assignment and a comparison group (see Table 5.5). A major difference between this design and the others we have covered is that it includes multiple pre-test and post-test observations.

TABLE 5.5 | One-Group Longitudinal Design

NR   O   O   O   O   X   O   O   O   O

The one-group longitudinal design is useful when researchers are interested in exploring longer-term patterns. Indeed, the term longitudinal generally means "over time": repeated measurements of the pre-test and post-test over time. This differs from cross-sectional designs, which examine the pre-test and post-test at only one point in time (e.g., at a single point before the application of the treatment and at a single point after the treatment). The nonequivalent group design and the classic experimental design previously examined, for example, are both cross-sectional because the pre-test and post-test are each measured at one point in time (e.g., at a point 6 months after the treatment). Yet these designs could easily become longitudinal if researchers took repeated pre-test and post-test measures.

The organization of the one-group longitudinal design is to examine a baseline of several pre-test observations, introduce a treatment or intervention, and then examine the post-test at several different time intervals. As organized, this design is useful for gauging the impact that a particular program, policy, or law has, if any, and how long the treatment impact lasts. Consider an example whereby a researcher is interested in gauging the impact of a tobacco ban on inmate-on-inmate assaults in a prison setting. This is an important question, for recent years have witnessed correctional systems banning all tobacco products from prison facilities. Correctional administrators predicted that there would be a major increase of inmate-on-inmate violence once the bans took effect. The one-group longitudinal design would be one appropriate design to examine the impact of banning tobacco on inmate assaults.

To construct this study using the one-group longitudinal design, the researcher would first examine the rate of inmate-on-inmate assaults in the prison system (or at an individual prison, a particular cellblock, or whatever the unit of analysis) prior to the removal of tobacco. This is the pre-test, a baseline of assault activity before the ban goes into effect. In the design presented in Table 5.5, the researcher might measure the level of assaults in each of the four months preceding the tobacco ban. When establishing a pre-test baseline, the general rule in a longitudinal design is that more time and more measurement intervals are better. For example, the rate of assaults in the preceding month is not as useful as an entire year of data on inmate assaults prior to the tobacco ban. Once the tobacco ban is implemented, the researcher would then measure the rate of inmate assaults in the coming months to determine what impact the ban had on inmate-on-inmate assaults. These are the multiple post-test measures of assaults shown in Table 5.5. Assaults may increase, decrease, or remain constant relative to the pre-test baseline over the term of the post-test.
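The baseline-versus-post-ban comparison at the heart of this design can be sketched with hypothetical monthly assault rates (all numbers below are invented for illustration):

```python
import statistics

# Hypothetical monthly assault rates per 1,000 inmates (invented data).
pre_ban  = [4.1, 3.9, 4.3, 4.0]   # four pre-test observations (the baseline)
post_ban = [4.8, 5.1, 4.9, 5.0]   # four post-test observations after the ban

baseline = statistics.mean(pre_ban)
after    = statistics.mean(post_ban)
change   = after - baseline
print(f"Change in mean assault rate after the ban: {change:+.2f}")
```

Even a clear jump like this one cannot, by itself, be attributed to the ban: without a comparison group, some other event coinciding with the ban could explain the same pattern.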

If assaults increased at the same time as the ban went into effect, the researcher might conclude that the increase was due only to the tobacco ban. But, could there be alternative explanations? The answer to this question is yes, there may be other plausible explanations for the increase even with several months of pre-test data. Unfortunately, without a comparison group there is no way for the researcher to be certain if the increase in assaults was due to the tobacco ban, or some other factor that may have spurred the increase in assaults and happened at the same time as the tobacco ban. What if assaults decreased after the tobacco ban went into effect? In this scenario, because there is no comparison group, the researcher would still not know if the results would have happened anyway without the tobacco ban. In these instances, the lack of a comparison group prevents the researcher from confidently attributing the results to the tobacco ban, and interpretation is subject to numerous alternative explanations.

Two-Group Longitudinal Design

A remedy for the previous situation would be to introduce a comparison group (see Table 5.6). Prior to the full tobacco ban, suppose prison administrators conducted a pilot program at one prison to provide insight into what would happen once the tobacco ban went into effect systemwide. To conduct this pilot, the researcher identified one prison and, within it, two different cellblocks, C-Block and D-Block. C-Block constitutes the treatment group, the cellblock of inmates who will have their tobacco taken away. D-Block is the comparison group; inmates in this cellblock will retain their tobacco privileges during the course of the study and during a determined follow-up period for measuring post-test assaults (e.g., 12 months). This is a two-group longitudinal design (also sometimes called a multiple interrupted time series design), and adding a comparison group makes this design superior to the one-group longitudinal design.

TABLE 5.6 | Two-Group Longitudinal Design

NR   O   O   O   O   X   O   O   O   O
NR   O   O   O   O       O   O   O   O

Adding a comparison group gives the researcher more confidence that the results at the post-test are due to the tobacco ban and not some alternative explanation. This is because any difference in assaults at the post-test between the treatment and comparison groups should be attributable to the only difference between them: the tobacco ban. For this interpretation to hold, however, the researcher must be sure that C-Block and D-Block are similar or equivalent on all factors that might influence the post-test. There are many potential factors to consider. For example, the researcher will want to make sure that the same types of inmates are housed in both cellblocks. If a chronic group of assaultive inmates is housed in C-Block, but not D-Block, this differential, not the treatment, could explain the results.

The researcher might also want to make sure equitable numbers of tobacco and non-tobacco users are found in each cellblock. If very few inmates in C-Block are smokers, the real effect of removing tobacco may be hidden. The researcher might also examine other areas where potential differences might arise, for example, that both cellblocks are staffed with equal numbers of officers, that officers in each cellblock tend to resolve inmate disputes similarly, and other potential issues that could influence post-test measure of assaults. Equivalence could also be ensured by comparing the groups on additional evidence before the ban takes effect: number of prior prison sentences, time served in prison, age, seriousness of conviction crime, and other factors that might relate to assaultive behavior, regardless of the tobacco ban. Moreover, the researcher should ensure that inmates in C-Block do not know that their D-Block counterparts are still allowed tobacco during the pilot study, and vice versa. If either group knows about the pilot program being an experiment, they might act differently than normal, and this could become an explanation of results. Additionally, the researchers might also try to make sure that C-Block inmates are completely tobacco free after the ban goes into effect—that they do not hoard, smuggle, or receive tobacco from officers or other inmates during the tobacco ban in or outside of the cellblock. If these and other important differences are accounted for at the individual and cellblock level, the researcher will have more confidence that any differences in assaults at the post-test between the treatment and comparison groups are related to the tobacco ban, and not some other difference between the two groups or the two cellblocks.

The addition of a comparison group aids in the ability of the researcher to isolate the true impact of a tobacco ban on inmate-on-inmate assaults. All factors that influence the treatment group should also influence the comparison group because the groups are made up of equivalent individuals in equivalent circumstances, with the exception of the tobacco ban. If this is the only difference, the results can be attributed to the ban. Although the addition of the comparison group in the two-group longitudinal design provides more confidence that the findings are attributed to the tobacco ban, the fact that this design lacks randomization means that alternative explanations cannot be completely ruled out—but they can be minimized. This example also suggests that the quasi-experiment in this instance may actually be preferable to an experimental design—noting the realities of prison administration. For example, prison inmates are not typically randomly assigned to different cellblocks by prison officers. Moreover, it is highly unlikely that a prison would have two open cellblocks waiting for a researcher to randomly assign incoming inmates to the prison for a tobacco ban study. Therefore, it is likely there would be differences among the groups in the quasi-experiment.
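The logic of using the comparison group to absorb outside influences can be sketched as a difference-in-differences style calculation. Both the framing and every number below are illustrative assumptions, not figures from the text:

```python
import statistics

# Hypothetical monthly assault counts for each cellblock (invented data).
c_block_pre,  c_block_post = [10, 12, 11, 9], [15, 16, 14, 15]   # tobacco ban
d_block_pre,  d_block_post = [10, 11, 10, 11], [11, 10, 12, 11]  # no ban

def change(pre, post):
    """Pre-to-post change in the mean monthly assault count."""
    return statistics.mean(post) - statistics.mean(pre)

# The comparison group's change estimates what would have happened anyway;
# subtracting it isolates the ban's apparent effect on the treatment group.
effect = change(c_block_pre, c_block_post) - change(d_block_pre, d_block_post)
print(f"Apparent effect of the ban: {effect:+.2f} assaults per month")
```

The subtraction only isolates the ban's effect if the two cellblocks really are equivalent; any unmeasured difference between them contaminates the estimate.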

Fortunately, if differences between the groups are present, the researcher can attempt to determine their potential impact before interpreting the results. The researcher can also use statistical models after the ban takes effect to determine the impact of any differences between the groups on the post-test. The two-group longitudinal quasi-experiment just discussed could, in principle, take the form of an experimental design if random assignment could somehow be accomplished. In practice, however, the discussion above describes a situation where an experimental design might be appropriate and desired for a particular research question but would not be realistic, considering the many barriers.

The Threat of Alternative Explanations

Alternative explanations are those factors that could explain the post-test results, other than the treatment. Throughout this chapter, we have noted the potential for alternative explanations and have given several examples of explanations other than the treatment. It is important to know that potential alternative explanations can arise in any research design discussed in this chapter. However, alternative explanations often arise because some design part is missing, for example, random assignment, a pre-test, or a control or comparison group. This is especially true in criminal justice where researchers often conduct field studies and have less control over their study conditions than do researchers who conduct experiments under highly controlled laboratory conditions. A prime example of this is the tobacco ban study, where it would be difficult for researchers to ensure that C-Block inmates, the treatment group, were completely tobacco free during the course of the study.

Alternative explanations are typically referred to as threats to internal validity. In this context, if an experiment is internally valid, it means that alternative explanations have been ruled out and the treatment is the only factor that produced the results. If a study is not internally valid, this means that alternative explanations for the results exist or potentially exist. In this section, we focus on some common alternative explanations that may arise in experimental and quasi-experimental designs. 27

Selection Bias

One of the more common alternative explanations that may occur is selection bias. Selection bias generally indicates that the treatment group (or experimental group) is somehow different from the comparison group (or control group) on a factor that could influence the post-test results. Selection bias is more often a threat in quasi-experimental designs than experimental designs due to the lack of random assignment. Suppose in our study of the prison tobacco ban, members of C-Block were substantially younger than members of D-Block, the comparison group. Such an imbalance between the groups would mean the researcher would not know if the differences in assaults are real (meaning the result of the tobacco ban) or a result of the age differential. Recall that research shows that younger inmates are more assaultive than older inmates and so we would expect more assaults among the younger offenders independent of the tobacco ban.

In a quasi-experiment, selection bias is perhaps the most prevalent type of alternative explanation and can seriously compromise results. Indeed, many of the examples above have referred to potential situations where the groups are imbalanced or not equivalent on some important factor. Selection bias is a common threat in quasi-experimental designs because of the lack of random assignment, and it can also threaten experimental designs when the groups differ by chance alone or when randomization was not maintained throughout the study (see Classics in CJ Research-MDVE above). A researcher may, however, be able to detect such differentials. For example, the researcher could compare the groups on the pre-test or other types of information before the start of the study. If differences were found, the researcher could take measures to correct them. The researcher could also use a statistical model to account or control for differences between the groups and isolate the impact of the treatment, if any. Such models are beyond the scope of this text, but they are one way to deal with selection bias and estimate its impact on study results. The researcher could also, if possible, attempt to re-match the groups in a quasi-experiment, or randomly assign the groups a second time in an experimental design, to ensure equivalence. At the least, the researcher could acknowledge the group differences and discuss their potential impact on the results. Without a pre-test or other pre-study information on study participants, however, such differences might go undetected, making it more difficult to determine how selection bias influenced the results.
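A simple balance check of the kind described above can be sketched as follows. The ages and the imbalance threshold are arbitrary illustrations, not values from the text:

```python
import statistics

# Hypothetical pre-treatment ages for the two groups (invented data).
treatment_ages  = [21, 24, 23, 26, 22]
comparison_ages = [34, 31, 36, 33, 35]

t_mean = statistics.mean(treatment_ages)
c_mean = statistics.mean(comparison_ages)

# A large pre-treatment difference flags potential selection bias:
# age, not the program, could then explain post-test differences.
print(f"Treatment mean age: {t_mean:.1f}, comparison mean age: {c_mean:.1f}")
if abs(t_mean - c_mean) > 5:   # threshold chosen arbitrarily for illustration
    print("Groups are imbalanced on age: possible selection bias")
```

In practice researchers compare the groups on every measured pre-treatment factor, not just one, and only measured factors can be checked this way.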

History

Another potential alternative explanation is history. History refers to any event experienced differently by the treatment and comparison groups in the time between the pre-test and the post-test that could impact results. Suppose during the course of the tobacco ban study several riots occurred on D-Block, the comparison group. Because of the riots, prison officers "locked down" this cellblock numerous times. Because D-Block inmates were locked down at various times, this could have affected their ability to otherwise engage in inmate assaults. At the end of the study, assaults in D-Block might have decreased from their pre-test levels because of the lockdowns, whereas in C-Block assaults may have occurred at their normal pace because there was no lockdown, or perhaps even increased from the pre-test because tobacco was also taken away. Even if the tobacco ban had no effect and assaults remained constant in C-Block from pre- to post-test, the lockdown in D-Block might make it appear that the tobacco ban led to increased assaults in C-Block. Thus, the researcher would not know whether the post-test results for the C-Block treatment group were attributable to the tobacco ban or to the simple fact that D-Block inmates were locked down and their assault activity was artificially reduced. In this instance, the comparison group becomes much less useful because the lockdown created a historical factor that imbalanced the groups during the treatment phase and nullified the comparison.

Maturation

Another potential alternative explanation is maturation. Maturation refers to the natural biological, psychological, or emotional processes we all experience as time passes: aging, becoming more or less intelligent, becoming bored, and so on. For example, if a researcher was interested in the effect of a boot camp on recidivism for juvenile offenders, it is possible that over the course of the boot camp program the delinquents naturally matured as they aged, and this maturation, not the boot camp, produced the reduction in recidivism. This threat is particularly applicable to populations that change rapidly over a relatively short period of time or when a treatment lasts a considerable period of time. However, this threat can be eliminated with a comparison group that is similar to the treatment group, because the maturation effects would then occur in both groups and the effect of the boot camp, if any, could be isolated. This assumes, however, that the groups are matched and equivalent on factors subject to the maturation process, such as age. If not, such differentials could be an alternative explanation of results. For example, if the treatment and comparison groups differ by age, on average, one group may change or mature at a different rate than the other, and this differential rate of maturation, not the treatment, could explain the results. This example demonstrates how selection bias and maturation can interact as simultaneous alternative explanations, and it underscores the importance of an equivalent control or comparison group for eliminating or minimizing maturation as an alternative explanation.

Attrition or Subject Mortality

Attrition or subject mortality is another typical alternative explanation. Attrition refers to differential loss in the number or type of subjects between the treatment and comparison groups and can occur in both experimental and quasi-experimental designs. Suppose we wanted to conduct a study to determine who is the better research methods professor among the authors of this textbook. Let’s assume that we have an experimental design where students were randomly assigned to professor 1, professor 2, or professor 3. By randomly assigning students to each respective professor, there is greater probability that the groups are equivalent and thus there are no differences between the three groups with one exception—the professor they receive and his or her particular teaching and delivery style. This is the treatment. Let’s also assume that the professors will be administering the same tests and using the same textbook. After the group members are randomly assigned, a pre-treatment evaluation shows the groups are in fact equivalent on all important known factors that could influence post-test scores, such as grade point average, age, time in school, and exposure to research methods concepts. Additionally, all groups scored comparably on a pre-test of knowledge about research methods, thus there is more confidence that the groups are in fact equivalent.

At the conclusion of the study, we find that professor 2’s group has the lowest final test scores of the three. However, because professor 2 is such an outstanding professor, the results appear odd. At first glance, the researcher thinks the results could have been influenced by students dropping out of the class. For example, perhaps several of professor 2’s students dropped the course but none did from the classes of professor 1 or 3. It is revealed, however, that an equal number of students dropped out of all three courses before the post-test and, therefore, this could not be the reason for the low scores in professor 2’s course. Upon further investigation, however, the researcher finds that although an equal number of students dropped out of each class, the dropouts in professor 2’s class were some of his best students. In contrast, those who dropped out of professor 1’s and professor 3’s courses were some of their poorest students. In this example, professor 2 appears to be the least effective teacher. However, this result appears to be due to the fact that his best students dropped out, and this highly influenced the final test average for his group. Although there was not a differential loss of subjects in terms of numbers (which can also be an attrition issue), there was differential loss in the types of students. This differential loss, not the teaching style, is an alternative explanation of the results.
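The arithmetic behind this attrition example can be illustrated with a short, hypothetical sketch (the scores and class sizes are invented). Equal numbers of students drop out of each class, yet the group means diverge because different *types* of students are lost:

```python
# Hypothetical illustration of differential attrition (all numbers invented).
# Each class starts with the same ten test scores, so the groups begin equivalent.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

def mean(xs):
    return sum(xs) / len(xs)

# Equal NUMBERS drop out of each class (two students), but the TYPES differ:
# professors 1 and 3 lose their weakest students, professor 2 loses his best.
prof1 = sorted(scores)[2:]    # drops the two lowest scores
prof3 = sorted(scores)[2:]
prof2 = sorted(scores)[:-2]   # drops the two highest scores

print(round(mean(scores), 1))  # 77.5 -- baseline for all three groups
print(round(mean(prof1), 1))   # 82.5 -- inflated by losing poor students
print(round(mean(prof2), 1))   # 72.5 -- deflated by losing strong students
```

Professor 2's lower post-test average here reflects nothing about teaching: it is produced entirely by who dropped out.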

Testing or Testing Bias

Another potential alternative explanation is testing or testing bias. Suppose that after the pre-test of research methods knowledge, professor 1 and professor 3 reviewed the test with their students and gave them the correct answers. Professor 2 did not. The fact that professor 1's and professor 3's groups did better on the post-test final exam may be explained by the finding that students in those groups remembered the answers to the pre-test, were thus biased at the pre-test, and this artificially inflated their post-test scores. Testing bias can explain the results because students in groups 1 and 3 may have simply remembered the answers from the pre-test review. In fact, the students in professor 1's and 3's courses may have scored high on the post-test without ever having been exposed to the treatment because they were biased at the pre-test.

Instrumentation

Another alternative explanation that can arise is instrumentation. Instrumentation refers to changes in the measuring instrument from pre- to post-test. Using the previous example, suppose professors 1 and 3 did not give the same final exam as professor 2. For example, professors 1 and 3 changed the final exam and professor 2 kept the final exam the same as the pretest. Because professors 1 and 3 changed the exam, and perhaps made it easier or somehow different from the pre-test exam, results that showed lower scores for professor 2’s students may be related only to instrumentation changes from pre- to post-test. Obviously, to limit the influence of instrumentation, researchers should make sure that instruments remain consistent from pre- to post-test.

Reactivity

A final alternative explanation is reactivity. Reactivity occurs when members of the treatment or experimental group change their behavior simply as a result of being part of a study. This is akin to the finding that people tend to change their behavior when they are being watched or are aware they are being studied. If members of the experiment know they are part of an experiment and are being studied and watched, it is possible that their behavior will change independent of the treatment. If this occurs, the researcher will not know if the behavior change is the result of the treatment, or simply a result of being part of a study. For example, suppose a researcher wants to determine if a boot camp program impacts the recidivism of delinquent offenders. Members of the experimental group are sentenced to boot camp and members of the control group are released on their own recognizance to their parents. Because members of the experimental group know they are part of the experiment, and hence being watched closely after they exit boot camp, they may artificially change their behavior and avoid trouble. Their change of behavior may be totally unrelated to boot camp, but rather, to their knowledge of being part of an experiment.

Other Potential Alternative Explanations

The above discussion provided some typical alternative explanations that may arise with the designs discussed in this chapter. There are, however, other potential alternative explanations that may arise. These alternative explanations arise only when a control or comparison group is present.

One such alternative explanation is diffusion of treatment. Diffusion of treatment occurs when the control or comparison group learns about the treatment its members are being denied and attempts to mimic the behavior of the treatment group. If the control group is successful in mimicking the experimental group, for example, the results at the end of the study may show similarity in outcomes between groups and cause the researcher to conclude that the program had no effect. In fact, however, the finding of no effect can be explained by the comparison group mimicking the treatment group. 28 In reality, there may be no effect of the treatment, but the researcher would not know this for sure because the control group effectively transformed into another experimental group—there is then no baseline of comparison. Consider a study where a researcher wants to determine the impact of a training program on class behavior and participation. In this study, the experimental group is exposed to several sessions of training on how to act appropriately in class and how to engage in class participation. The control group does not receive such training, but they are aware that they are part of an experiment. Suppose after a few class sessions the control group starts to mimic the behavior of the experimental group, acting the same way and participating in class the same way. At the conclusion of the study, the researcher might determine that the program had no impact because the comparison group, which did not receive the new program, showed similar progress.

In a related explanation, sometimes the comparison or control group learns about the experiment and attempts to compete with the experimental or treatment group. This alternative explanation is called compensatory rivalry. For example, suppose a police chief wants to determine if a new training program will increase the endurance of SWAT team officers. The chief randomly assigns SWAT members to either an experimental or control group. The experimental group will receive the new endurance training program and the control group will receive the normal program that has been used for years. During the course of the study, suppose the control group learns that the treatment group is receiving the new endurance program and starts to compete with the experimental group. Perhaps the control group runs five more miles per day and works out an extra hour in the weight room, in addition to their normal endurance program. At the end of the study, and due to the control group’s extra and competing effort, the results might show no effect of the new endurance program, and at worst, experimental group members may show a decline in endurance compared to the control group. The rivalry or competing behavior actually explains the results, not that the new endurance program has no effect or a damaging effect. Although the new endurance program may in reality have no effect, this cannot be known because of the actions of the control group, who learned about the treatment and competed with the experimental group.

Closely related to compensatory rivalry is the alternative explanation of comparison or control group demoralization. 29 In this instance, instead of competing with the experimental or treatment group, the control or comparison group simply gives up and changes its normal behavior. Using the SWAT example, perhaps the control group simply quits its normal endurance program when members learn that the treatment group is receiving the new endurance program. At the post-test, their endurance will likely drop considerably compared to the treatment group. Because of this, the new endurance program might emerge as a shining success. In reality, however, the researcher will not know if any changes in endurance between the experimental and control groups are a result of the new endurance program or of the control group giving up. Because the control group gave up, there is no longer a comparison group of equitable others, and the change in endurance among the treatment group members could be attributed to a number of alternative explanations, for example, maturation. If the comparison group behaves normally, the researcher will be able to exclude maturation as a potential explanation. This is because any maturation effects will occur in both groups.

The previous discussion suggests that when the control or comparison group learns about the experiment and the treatment they are denied, potential alternative explanations can arise. Perhaps the best remedy to protect from the alternative explanations just discussed is to make sure the treatment and comparison groups do not have contact with one another. In laboratory experiments this can be ensured, but sometimes this is a problem in criminal justice studies, which are often conducted in the field.

The previous discussion also suggests that there are numerous alternative explanations that can impact the interpretation of results from a study. A careful researcher would know that alternative explanations must be ruled out before reaching a definitive conclusion about the impact of a particular program. The researcher must be attuned to these potential alternative explanations because they can influence results and how results are interpreted. Moreover, the discussion shows that several alternative explanations can occur at the same time. For example, it is possible that selection bias, maturation, attrition, and compensatory rivalry all emerge as alternative explanations in the same study. Knowing about these potential alternative explanations and how they can impact the results of a study is what distinguishes a consumer of research from an educated consumer of research.

Chapter Summary

The primary focus of this chapter was the classic experimental design, the foundation for other types of experimental and quasi-experimental designs. The classic experimental design is perhaps the most useful design when exploring causal relationships. Often, however, researchers cannot employ the classic experimental design to answer a research question. In fact, the classic experimental design is rare in criminal justice and criminology because it is often difficult to ensure random assignment for a variety of reasons. In circumstances where an experimental design is appropriate but not feasible, researchers may turn to one of many quasi-experimental designs. The most important difference between the two is that quasi-experimental designs do not feature random assignment. This can create potential problems for researchers. The main problem is that there is a greater chance the treatment and comparison groups may differ on important characteristics that could influence the results of a study. Although researchers can attempt to prevent imbalances between the groups by matching them on important known characteristics, it is still much more difficult to establish equivalence than it is in the classic experiment. As such, it becomes more difficult to determine what impact a treatment had, if any, as one moves from an experimental to a quasi-experimental design.

Perhaps the most important lesson to be learned in this chapter is that to be an educated consumer of research results requires an understanding of the type of design that produced the results. There are numerous ways experimental and quasi-experimental designs can be structured. This is why much attention was paid to the classic experimental design. In reality, all experimental and quasi-experimental designs are variations of the classic experiment in some way—adding or deleting certain components. If the components and organization and logic of the classic experimental design are understood, consumers of research will have a better understanding of the results produced from any sort of research design. For example, what problems in interpretation arise when a design lacks a pre-test, a control group, or random assignment? Having an answer to this question is a good start toward being an informed consumer of research results produced through experimental and quasi-experimental designs.

Critical Thinking Questions

1. Why is randomization/random assignment preferable to matching? Provide several reasons with explanation.

2. What are some potential reasons a researcher would not be able to utilize random assignment?

3. What is a major limitation of matching?

4. What is the difference between a longitudinal study and a cross-sectional study?

5. Describe a hypothetical study where maturation, and not the treatment, could explain the outcomes of the research.

association (or covariance or correlation): One of three conditions that must be met for establishing cause and effect, or a causal relationship. Association refers to the condition that X and Y must be related for a causal relationship to exist. Association is also referred to as covariance or correlation. Although two variables may be associated (or covary or be correlated), this does not automatically imply that they are causally related

attrition or subject mortality: A threat to internal validity, it refers to the differential loss of subjects between the experimental (treatment) and control (comparison) groups during the course of a study

cause and effect relationship: A cause and effect relationship occurs when one variable causes another, and no other explanation for that relationship exists

classic experimental design or experimental design: A design in a research study that features random assignment to an experimental or control group. Experimental designs can vary tremendously, but a constant feature is random assignment, experimental and control groups, and a post-test. For example, a classic experimental design features random assignment, a treatment, experimental and control groups, and pre- and post-tests

comparison group: The group in a quasi-experimental design that does not receive the treatment. In an experimental design, the comparison group is referred to as the control group

compensatory rivalry: A threat to internal validity, it occurs when the control or comparison group attempts to compete with the experimental or treatment group

control group: In an experimental design, the control group does not receive the treatment. The control group serves as a baseline of comparison to the experimental group. It serves as an example of what happens when a group equivalent to the experimental group does not receive the treatment

cross-sectional designs: A design in which the pre-test and the post-test are each measured at a single point in time (e.g., one measurement six months before and one measurement six months after the program)

demoralization: A threat to internal validity closely associated with compensatory rivalry, it occurs when the control or comparison group gives up and changes its normal behavior. While in compensatory rivalry the group members compete, in demoralization they simply quit. Neither is a normal behavioral reaction

dependent variable: Also known as the outcome in a research study. A post-test is a measure of the dependent variable

diffusion of treatment: A threat to internal validity, it occurs when the control or comparison group members learn that they are not getting the treatment and attempt to mimic the behavior of the experimental or treatment group. This mimicking may make it seem as if the treatment is having no effect, when in fact it may be

elimination of alternative explanations: One of three conditions that must be met for establishing cause and effect. Elimination of alternative explanations means that the researcher has ruled out other explanations for an observed relationship between X and Y

experimental group: In an experimental design, the experimental group receives the treatment

history: A threat to internal validity, it refers to any event experienced differently by the treatment and comparison groups—an event that could explain the results other than the supposed cause

independent variable: Also called the cause. The variable, typically denoted by X, whose effect on the dependent variable is being examined

instrumentation: A threat to internal validity, it refers to changes in the measuring instrument from pre- to post-test

longitudinal: Refers to repeated measurements of the pre-test and post-test over time, typically for the same group of individuals. This is the opposite of cross-sectional

matching: A process sometimes utilized in some quasi-experimental designs that feature treatment and comparison groups. Matching is a process whereby the researcher attempts to ensure equivalence between the treatment and comparison groups on known information, in the absence of the ability to randomly assign the groups

maturation: A threat to internal validity, maturation refers to the natural biological, psychological, or emotional processes as time passes

negative association: An association in which the two variables move in opposite directions. A negative association is demonstrated when X increases and Y decreases, or X decreases and Y increases. Also known as an inverse relationship

operationalized or operationalization: Refers to the process of assigning a working definition to a concept. For example, the concept of intelligence can be operationalized or defined as grade point average or score on a standardized exam, among others

pilot program or test: Refers to a smaller test study or pilot to work out problems before a larger study and to anticipate changes needed for a larger study. Similar to a test run

positive association: An association in which the two variables move in the same direction: as X increases, Y increases, or as X decreases, Y decreases

post-test: The post-test is a measure of the dependent variable after the treatment has been administered

pre-test: The pre-test is a measure of the dependent variable or outcome before a treatment is administered

quasi-experiment: A quasi-experiment refers to any number of research design configurations that resemble an experimental design but primarily lack random assignment. In the absence of random assignment, quasi-experimental designs feature matching to attempt equivalence

random assignment: Refers to a process whereby members of the experimental group and control group are assigned to each group through a random and unbiased process

random selection: Refers to selecting a smaller but representative subset from a population. Not to be confused with random assignment

reactivity: A threat to internal validity, it occurs when members of the experimental (treatment) or control (comparison) group change their behavior unnaturally as a result of being part of a study

selection bias: A threat to internal validity, selection bias occurs when the experimental (treatment) group and control (comparison) group are not equivalent. The difference between the groups can be a threat to internal validity, or, an alternative explanation to the findings

spurious: A spurious relationship is one where X and Y appear to be causally related, but in fact the relationship is actually explained by a variable or factor other than X

testing or testing bias: A threat to internal validity, it refers to the potential of study members being biased prior to a treatment, and this bias, rather than the treatment, may explain study results

threat to internal validity: Also known as alternative explanation to a relationship between X and Y. Threats to internal validity are factors that explain Y, or the dependent variable, and are not X, or the independent variable

timing: One of three conditions that must be met for establishing cause and effect. Timing refers to the condition that X must come before Y in time for X to be a cause of Y. While timing is necessary for a causal relationship, it is not sufficient, and considerations of association and eliminating other alternative explanations must be met

treatment: A component of a research design, it is typically denoted by the letter X. In a research study on the impact of teen court on juvenile recidivism, teen court is the treatment. In a classic experimental design, the treatment is given only to the experimental group, not the control group

treatment group: The group in a quasi-experimental design that receives the treatment. In an experimental design, this group is called the experimental group

unit of analysis: Refers to the focus of a research study as being individuals, groups, or other units of analysis, such as prisons or police agencies, and so on

variable(s): A variable is a concept that has been given a working definition and can take on different values. For example, intelligence can be defined as a person’s grade point average and can range from low to high or can be defined numerically by different values such as 3.5 or 4.0

1 Povitsky, W., N. Connell, D. Wilson, & D. Gottfredson. (2008). “An experimental evaluation of teen courts.” Journal of Experimental Criminology, 4, 137–163.

2 Hirschi, T., and H. Selvin (1966). “False criteria of causality in delinquency.” Social Problems, 13, 254–268.

3 Robert Roy Britt, "Churchgoers Live Longer." April 3, 2006. http://www.livescience.com/health/060403_church_good.html. Retrieved on September 30, 2008.

4 Kalist, D., and D. Yee (2009). “First names and crime: Does unpopularity spell trouble?” Social Science Quarterly, 90 (1), 39–48.

5 Sherman, L. (1992). Policing domestic violence. New York: The Free Press.

6 For historical and interesting reading on the effects of weather on crime and other disorder, see Dexter, E. (1899). “Influence of weather upon crime.” Popular Science Monthly, 55, 653–660 in Horton, D. (2000). Pioneering Perspectives in Criminology. Incline Village, NV: Copperhouse.

7 http://www.escapistmagazine.com/news/view/111191-Less-Crime-in-U-S-Thanks-to-Videogames, retrieved on September 13, 2011. This news article was in response to a study titled "Understanding the effects of violent videogames on violent crime." See Cunningham, Scott, Engelstätter, Benjamin, and Ward (April 7, 2011). Available at SSRN: http://ssrn.com/abstract=1804959.

8 Cohn, E. G. (1987). “Changing the domestic violence policies of urban police departments: Impact of the Minneapolis experiment.” Response, 10 (4), 22–24.

9 Schmidt, Janell D., & Lawrence W. Sherman (1993). “Does arrest deter domestic violence?” American Behavioral Scientist, 36 (5), 601–610.

10 Maxwell, Christopher D., Joel H. Garner, & Jeffrey A. Fagan. (2001). The effects of arrest on intimate partner violence: New evidence for the spouse assault replication program. Washington D.C.: National Institute of Justice.

11 Miller, N. (2005). What does research and evaluation say about domestic violence laws? A compendium of justice system laws and related research assessments. Alexandria, VA: Institute for Law and Justice.

12 The sections on experimental and quasi-experimental designs rely heavily on the seminal work of Campbell and Stanley (Campbell, D.T., & J. C. Stanley. (1963). Experimental and quasi-experimental designs for research. Chicago: RandMcNally) and more recently, Shadish, W., T. Cook, & D. Campbell. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton Mifflin.

13 Povitsky et al. (2008). p. 146, note 9.

14 Shadish, W., T. Cook, & D. Campbell. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton Mifflin Company.

15 Ibid, 15.

16 Finckenauer, James O. (1982). Scared straight! and the panacea phenomenon. Englewood Cliffs, N.J.: Prentice Hall.

17 Yarborough, J.C. (1979). Evaluation of JOLT (Juvenile Offenders Learn Truth) as a deterrence program. Lansing, MI: Michigan Department of Corrections.

18 Petrosino, Anthony, Carolyn Turpin-Petrosino, & James O. Finckenauer. (2000). “Well-meaning programs can have harmful effects! Lessons from experiments of programs such as Scared Straight.” Crime and Delinquency, 46, 354–379.

19 "Swearing makes pain more tolerable" retrieved at http://www.livescience.com/health/090712-swearing-pain.html (July 13, 2009). Also see "Bleep! My finger! Why swearing helps ease pain" by Tiffany Sharples, retrieved at http://www.time.com/time/health/article/0,8599,1910691,00.html?xid=rss-health (July 16, 2009).

20 For an excellent discussion of the value of controlled experiments and why they are so rare in the social sciences, see Sherman, L. (1992). Policing domestic violence. New York: The Free Press, 55–74.

21 For discussion, see Weisburd, D., T. Einat, & M. Kowalski. (2008). “The miracle of the cells: An experimental study of interventions to increase payment of court-ordered financial obligations.” Criminology and Public Policy, 7, 9–36.

22 Shadish, Cook, & Campbell. (2002).

24 Kelly, Cathy. (March 15, 2009). “Tickets in the mail: Red-light cameras questioned.” Santa Cruz Sentinel.

25 Retting, Richard, Susan Ferguson, & Charles Farmer. (January 2007). “Reducing red light running through longer yellow signal timing and red light camera enforcement: Results of a field investigation.” Arlington, VA: Insurance Institute for Highway Safety.

26 Shadish, Cook, & Campbell. (2002).

27 See Shadish, Cook, & Campbell. (2002), pp. 54–61 for an excellent discussion of threats to internal validity. Also see Chapter 2 for an extended discussion of all forms of validity considered in research design.

28 Trochim, W. (2001). The research methods knowledge base, 2nd ed. Cincinnati, OH: Atomic Dog.

Applied Research Methods in Criminal Justice and Criminology by University of North Texas is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License , except where otherwise noted.


12.2: Pre-experimental and quasi-experimental design


  • Matthew DeCarlo
  • Radford University via Open Social Work Education


Learning Objectives

  • Distinguish true experimental designs from quasi-experimental and pre-experimental designs
  • Identify and describe the various types of quasi-experimental and pre-experimental designs

As we discussed in the previous section, time, funding, and ethics may limit a researcher’s ability to conduct a true experiment. For researchers in the medical sciences and social work, conducting a true experiment could require denying needed treatment to clients, which is a clear ethical violation. Even those whose research may not involve the administration of needed medications or treatments may be limited in their ability to conduct a classic experiment. When true experiments are not possible, researchers often use quasi-experimental designs.

Quasi-experimental designs are similar to true experiments, but they lack random assignment to experimental and control groups. The most basic of these quasi-experimental designs is the nonequivalent comparison groups design (Rubin & Babbie, 2017). [1] The nonequivalent comparison group design looks a lot like the classic experimental design, except it does not use random assignment. In many cases, these groups may already exist. For example, a researcher might conduct research at two different agency sites, one of which receives the intervention and the other does not. No one was assigned to treatment or comparison groups; those groupings existed prior to the study. While this method is more convenient for real-world research, researchers cannot be sure that the groups are comparable. Perhaps the treatment group has a unique characteristic, such as higher income or a different mix of diagnoses, that makes the treatment appear more effective.
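The pre-test/post-test logic of the nonequivalent comparison groups design can be sketched numerically. The figures below are invented, and the difference-in-differences calculation shown is a standard way of analyzing such designs rather than one prescribed by this text:

```python
# Hypothetical pre/post outcome means for a nonequivalent comparison groups
# design (all numbers invented for illustration).
treatment = {"pre": 40.0, "post": 55.0}   # site that receives the intervention
comparison = {"pre": 42.0, "post": 47.0}  # similar site, no intervention

change_treat = treatment["post"] - treatment["pre"]   # 15.0
change_comp = comparison["post"] - comparison["pre"]  # 5.0

# The difference-in-differences estimate credits the intervention only with
# the change beyond what the comparison group experienced anyway.
did = change_treat - change_comp
print(did)  # 10.0
```

Note that this estimate is only as good as the comparability of the two groups: if the sites differ on some unmeasured characteristic, the 10-point difference could reflect that characteristic rather than the intervention.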

Quasi-experiments are particularly useful in social welfare policy research. Social welfare policy researchers like me often look for what are termed natural experiments, or situations in which comparable groups are created by differences that already occur in the real world. For example, Stratmann and Wille (2016) [2] were interested in the effects of a state healthcare policy called Certificate of Need on the quality of hospitals. Researchers clearly cannot assign states to adopt one set of policies or another. Instead, the researchers used hospital referral regions, or the areas from which hospitals draw their patients, that spanned state lines. Because the hospitals were in the same referral region, the researchers could be reasonably sure that patient characteristics were similar. In this way, they could classify patients into experimental and comparison groups without affecting policy or telling people where to live.

There are important examples of policy experiments that use random assignment, including the Oregon Medicaid experiment. The wait list for Medicaid in Oregon was so long that state officials conducted a lottery to determine who from the wait list would receive coverage (Baicker et al., 2013). [3] Researchers used the lottery as a natural experiment that included random assignment: people selected to receive Medicaid were the experimental group, and those who remained on the wait list were the control group. There are some practical complications with using people on a wait list as a control group, most obviously: what happens when people on the wait list are accepted into the program while you're still collecting data? Natural experiments aren't a specific kind of experiment like quasi- or pre-experimental designs. Instead, they are a feature of the social world that allows researchers to use the logic of experimental design to investigate the connection between variables.


Matching is another approach in quasi-experimental design for assigning experimental and comparison groups. Researchers should think about which variables are important in their study, particularly demographic variables or attributes that might impact their dependent variable. Individual matching involves pairing participants with similar attributes. When this is done at the beginning of an experiment, each matched pair is split, with one participant going to the experimental group and the other to the comparison group. An ex post facto control group , in contrast, is created when a researcher matches individuals after the intervention is administered to some participants. Finally, researchers may engage in aggregate matching , in which the comparison group is chosen to be similar to the experimental group on important variables.
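As a minimal sketch of individual matching, the following Python snippet pairs participants on a single attribute (age), then splits each matched pair between the two groups at random. The participant names, ages, and the matching attribute are invented for illustration; real studies typically match on several variables at once.

```python
import random

# Hypothetical participants with one attribute we want to balance (age).
participants = [
    {"name": "A", "age": 23}, {"name": "B", "age": 24},
    {"name": "C", "age": 41}, {"name": "D", "age": 40},
    {"name": "E", "age": 67}, {"name": "F", "age": 66},
]

def individual_match(people, key):
    """Pair people with similar attribute values, then split each pair."""
    ordered = sorted(people, key=lambda p: p[key])
    pairs = [(ordered[i], ordered[i + 1]) for i in range(0, len(ordered) - 1, 2)]
    treatment, comparison = [], []
    for pair in pairs:
        pair = list(pair)
        random.shuffle(pair)  # coin flip decides which member of the pair is treated
        treatment.append(pair[0])
        comparison.append(pair[1])
    return treatment, comparison

random.seed(1)
treatment, comparison = individual_match(participants, "age")
# Each group now contains one member of every matched pair, so the
# age distributions of the two groups are similar by construction.
```

Because each group receives exactly one member of every pair, the attribute distributions are balanced even though no full random assignment took place.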

There are many different quasi-experimental designs in addition to the nonequivalent comparison group design described earlier. Describing all of them is beyond the scope of this textbook, but one more design is worth mentioning. The time series design uses multiple observations before and after an intervention. In some cases, experimental and comparison groups are used. In other cases where that is not feasible, a single experimental group is used. By using multiple observations before and after the intervention, the researcher can better understand the true value of the dependent variable in each participant before the intervention starts. Additionally, multiple observations afterwards allow the researcher to see whether the intervention had lasting effects on participants. Time series designs are similar to single-subjects designs, which we will discuss in Chapter 15.
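The logic of the time series design can be sketched in a few lines: compare a stable baseline built from several pre-intervention observations against the post-intervention observations, and check whether the change persists in the later observations. The observation values below are made up for illustration.

```python
# Hypothetical weekly scores for one group in a time series design:
# four observations before the intervention, four after.
before = [10, 11, 9, 10]
after = [15, 16, 15, 14]

baseline = sum(before) / len(before)   # stable pre-intervention level
post_mean = sum(after) / len(after)

effect = post_mean - baseline          # apparent intervention effect
lasting = after[-1] - baseline         # does the change persist at the last observation?

print(f"baseline={baseline}, effect={effect}, final shift={lasting}")
```

Multiple pre-intervention observations guard against mistaking a single unusual pretest score for the participant's true level, and the later post-intervention observations show whether the effect fades.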

When true experiments and quasi-experiments are not possible, researchers may turn to a pre-experimental design (Campbell & Stanley, 1963). [4] Pre-experimental designs are called such because they often happen before a true experiment is conducted. Researchers want to see if their interventions will have some effect on a small group of people before they seek funding and dedicate time to conduct a true experiment. Pre-experimental designs, thus, are usually conducted as a first step towards establishing the evidence for or against an intervention. However, this type of design comes with some unique disadvantages, which we’ll describe as we review the pre-experimental designs available.

If we wished to measure the impact of a natural disaster, such as Hurricane Katrina, we might conduct a pre-experiment by identifying an experimental group from a community that experienced the hurricane and a control group from a similar community that had not been hit by the hurricane. This study design, called a static group comparison , has the advantage of including a comparison group that did not experience the stimulus (in this case, the hurricane). Unfortunately, it is difficult to know whether those groups are truly comparable because the experimental and control groups were determined by factors other than random assignment. Additionally, the design would only allow for posttests, unless one were lucky enough to be gathering the data already before Katrina. As you might have guessed from our example, static group comparisons are useful in cases where a researcher cannot control or predict whether, when, or how the stimulus is administered, as in the case of natural disasters.

In cases where the administration of the stimulus is quite costly or otherwise not possible, a one-shot case study design might be used. In this instance, no pretest is administered, nor is a control group present. In our example of the study of the impact of Hurricane Katrina, a researcher using this design would test the impact of Katrina only among a community that was hit by the hurricane and would not seek a comparison group from a community that did not experience the hurricane. Researchers using this design must be extremely cautious about making claims regarding the effect of the stimulus, though the design could be useful for exploratory studies aimed at testing one's measures or the feasibility of further study.

Finally, if a researcher is unlikely to be able to identify a sample large enough to split into control and experimental groups, or if she simply doesn't have access to a control group, the researcher might use a one-group pre-/posttest design. In this instance, pre- and posttests are both taken, but there is no control group to which to compare the experimental group. We might be able to study the impact of Hurricane Katrina using this design if we'd been collecting data on the impacted communities prior to the hurricane. We could then collect similar data after the hurricane. Applying this design involves a bit of serendipity and chance. Without having collected data from impacted communities prior to the hurricane, we would be unable to employ a one-group pre-/posttest design to study Hurricane Katrina's impact.

As implied by the preceding examples where we considered studying the impact of Hurricane Katrina, experiments do not necessarily need to take place in the controlled setting of a lab. In fact, many applied researchers rely on experiments to assess the impact and effectiveness of various programs and policies. You might recall our discussion of arresting perpetrators of domestic violence in Chapter 6, which is an excellent example of an applied experiment. Researchers did not subject participants to conditions in a lab setting; instead, they applied their stimulus (in this case, arrest) to some subjects in the field and they also had a control group in the field that did not receive the stimulus (and therefore were not arrested).

Key Takeaways

  • Quasi-experimental designs do not use random assignment.
  • Comparison groups are often used in quasi-experiments.
  • Matching is a way of improving the comparability of experimental and comparison groups.
  • Quasi-experimental designs and pre-experimental designs are often used when experimental designs are impractical.
  • Quasi-experimental and pre-experimental designs may be easier to carry out, but they lack the rigor of true experiments.
Glossary

  • Aggregate matching: when the comparison group is determined to be similar to the experimental group along important variables
  • Ex post facto control group: a control group created when a researcher matches individuals after the intervention is administered
  • Individual matching: pairing participants with similar attributes for the purpose of assignment to groups
  • Natural experiments: situations in which comparable groups are created by differences that already occur in the real world
  • Nonequivalent comparison group design: a quasi-experimental design similar to a classic experimental design but without random assignment
  • One-group pre-/posttest design: a pre-experimental design that applies an intervention to one group and includes both a pretest and a posttest
  • One-shot case study: a pre-experimental design that applies an intervention to only one group without a pretest
  • Pre-experimental designs: variations of experimental design that lack the rigor of experiments and are often used before a true experiment is conducted
  • Quasi-experimental design: a design that lacks random assignment to experimental and control groups
  • Static group comparison: a design that uses an experimental group and a comparison group, without random assignment or pretesting
  • Time series design: a quasi-experimental design that uses multiple observations before and after an intervention

Image attributions

cat and kitten matching avocado costumes on the couch looking at the camera by Your Best Digs CC-BY-2.0

  • Rubin, A., & Babbie, E. R. (2017). Research methods for social work (9th ed.). Boston, MA: Cengage. ↵
  • Stratmann, T. & Wille, D. (2016). Certificate-of-need laws and hospital quality . Mercatus Center at George Mason University, Arlington, VA. Retrieved from: https://www.mercatus.org/system/files/mercatus-stratmann-wille-con-hospital-quality-v1.pdf ↵
  • Baicker, K., Taubman, S. L., Allen, H. L., Bernstein, M., Gruber, J. H., Newhouse, J. P., ... & Finkelstein, A. N. (2013). The Oregon experiment—effects of Medicaid on clinical outcomes. New England Journal of Medicine , 368 (18), 1713-1722. ↵
  • Campbell, D., & Stanley, J. (1963). Experimental and quasi-experimental designs for research . Chicago, IL: Rand McNally. ↵


Guide to Experimental Design | Overview, 5 steps & Examples

Published on December 3, 2019 by Rebecca Bevans . Revised on June 21, 2023.

Experiments are used to study causal relationships . You manipulate one or more independent variables and measure their effect on one or more dependent variables.

Experimental design creates a set of procedures to systematically test a hypothesis . A good experimental design requires a strong understanding of the system you are studying.

There are five key steps in designing an experiment:

  • Consider your variables and how they are related
  • Write a specific, testable hypothesis
  • Design experimental treatments to manipulate your independent variable
  • Assign subjects to groups, either between-subjects or within-subjects
  • Plan how you will measure your dependent variable

For valid conclusions, you also need to select a representative sample and control any extraneous variables that might influence your results. Doing so minimizes several types of research bias, particularly sampling bias , survivorship bias , and attrition bias as time passes. If random assignment of participants to control and treatment groups is impossible, unethical, or highly difficult, consider an observational study instead.

Table of contents

  • Step 1: Define your variables
  • Step 2: Write your hypothesis
  • Step 3: Design your experimental treatments
  • Step 4: Assign your subjects to treatment groups
  • Step 5: Measure your dependent variable
  • Frequently asked questions about experiments

Step 1: Define your variables

You should begin with a specific research question . We will work with two research question examples, one from health sciences (phone use and sleep) and one from ecology (temperature and soil respiration).

To translate your research question into an experimental hypothesis, you need to define the main variables and make predictions about how they are related.

Start by simply listing the independent and dependent variables .

Research question: Phone use and sleep
  Independent variable: Minutes of phone use before sleep
  Dependent variable: Hours of sleep per night

Research question: Temperature and soil respiration
  Independent variable: Air temperature just above the soil surface
  Dependent variable: CO2 respired from soil

Then you need to think about possible extraneous and confounding variables and consider how you might control  them in your experiment.

Phone use and sleep
  Extraneous variable: Natural variation in sleep patterns among individuals.
  How to control: Measure the average difference between sleep with phone use and sleep without phone use rather than the average amount of sleep per treatment group.

Temperature and soil respiration
  Extraneous variable: Soil moisture, which also affects respiration and can decrease with increasing temperature.
  How to control: Monitor soil moisture and add water to make sure that soil moisture is consistent across all treatment plots.

Finally, you can put these variables together into a diagram. Use arrows to show the possible relationships between variables and include signs to show the expected direction of the relationships.

Diagram of the relationship between variables in a sleep experiment

Here we predict that increasing temperature will increase soil respiration and decrease soil moisture, while decreasing soil moisture will lead to decreased soil respiration.


Step 2: Write your hypothesis

Now that you have a strong conceptual understanding of the system you are studying, you should be able to write a specific, testable hypothesis that addresses your research question.

Phone use and sleep
  Null hypothesis (H0): Phone use before sleep does not correlate with the amount of sleep a person gets.
  Alternate hypothesis (Ha): Increasing phone use before sleep leads to a decrease in sleep.

Temperature and soil respiration
  Null hypothesis (H0): Air temperature does not correlate with soil respiration.
  Alternate hypothesis (Ha): Increased air temperature leads to increased soil respiration.

The next steps will describe how to design a controlled experiment . In a controlled experiment, you must be able to:

  • Systematically and precisely manipulate the independent variable(s).
  • Precisely measure the dependent variable(s).
  • Control any potential confounding variables.

If your study system doesn’t match these criteria, there are other types of research you can use to answer your research question.

Step 3: Design your experimental treatments

How you manipulate the independent variable can affect the experiment's external validity , that is, the extent to which the results can be generalized and applied to the broader world.

First, you may need to decide how widely to vary your independent variable.

In the soil respiration experiment, for example, you could warm the plots:

  • just slightly above the natural range for your study region.
  • over a wider range of temperatures to mimic future warming.
  • over an extreme range that is beyond any possible natural variation.

Second, you may need to choose how finely to vary your independent variable. Sometimes this choice is made for you by your experimental system, but often you will need to decide, and this will affect how much you can infer from your results.

In the phone use experiment, for example, you could treat phone use as:

  • a categorical variable: either binary (yes/no) or levels of a factor (no phone use, low phone use, high phone use).
  • a continuous variable (minutes of phone use measured every night).

Step 4: Assign your subjects to treatment groups

How you apply your experimental treatments to your test subjects is crucial for obtaining valid and reliable results.

First, you need to consider the study size : how many individuals will be included in the experiment? In general, the more subjects you include, the greater your experiment’s statistical power , which determines how much confidence you can have in your results.
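The relationship between study size and statistical power can be illustrated with a small simulation using only Python's standard library. The effect size, significance threshold, and group sizes below are arbitrary choices for the sketch, not recommendations.

```python
import random
import statistics

def estimated_power(n_per_group, effect=0.8, sims=400, z_crit=1.96):
    """Fraction of simulated experiments that detect a true group difference."""
    rng = random.Random(0)  # fixed seed so the simulation is reproducible
    hits = 0
    for _ in range(sims):
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, 1) for _ in range(n_per_group)]
        diff = statistics.mean(treated) - statistics.mean(control)
        se = (2 / n_per_group) ** 0.5  # standard error, assuming known unit variance
        if abs(diff) / se > z_crit:
            hits += 1
    return hits / sims

small = estimated_power(10)   # 10 subjects per group
large = estimated_power(50)   # 50 subjects per group
# The larger sample detects the same true effect far more often.
```

With the same true effect, the larger sample's detection rate approaches 1, while the smaller sample misses the effect in a substantial share of simulated experiments.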

Then you need to randomly assign your subjects to treatment groups . Each group receives a different level of the treatment (e.g. no phone use, low phone use, high phone use).

You should also include a control group , which receives no treatment. The control group tells us what would have happened to your test subjects without any experimental intervention.

When assigning your subjects to groups, there are two main choices you need to make:

  • A completely randomized design vs a randomized block design .
  • A between-subjects design vs a within-subjects design .

Randomization

An experiment can be completely randomized or randomized within blocks (aka strata):

  • In a completely randomized design , every subject is assigned to a treatment group at random.
  • In a randomized block design (aka stratified random design), subjects are first grouped according to a characteristic they share, and then randomly assigned to treatments within those groups.
Phone use and sleep
  Completely randomized design: Subjects are all randomly assigned a level of phone use using a random number generator.
  Randomized block design: Subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups.

Temperature and soil respiration
  Completely randomized design: Warming treatments are assigned to soil plots at random, using a number generator to generate map coordinates within the study area.
  Randomized block design: Soils are first grouped by average rainfall, and then treatment plots are randomly assigned within these groups.
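Both assignment schemes can be sketched with Python's standard library. The subject list, the blocking variable (age group), and the treatment levels below are invented for illustration.

```python
import random

random.seed(42)
subjects = [{"id": i, "age_group": "under_30" if i % 2 else "over_30"}
            for i in range(12)]
treatments = ["none", "low", "high"]

def completely_randomized(subjects, treatments):
    """Assign every subject to a treatment level at random."""
    shuffled = subjects[:]
    random.shuffle(shuffled)
    return {s["id"]: treatments[i % len(treatments)]
            for i, s in enumerate(shuffled)}

def randomized_block(subjects, treatments, block_key):
    """Group subjects by a shared characteristic, then randomize within blocks."""
    blocks = {}
    for s in subjects:
        blocks.setdefault(s[block_key], []).append(s)
    assignment = {}
    for members in blocks.values():
        random.shuffle(members)
        for i, s in enumerate(members):
            assignment[s["id"]] = treatments[i % len(treatments)]
    return assignment

crd = completely_randomized(subjects, treatments)
rbd = randomized_block(subjects, treatments, "age_group")
# In the block design, each age group ends up with an equal number of
# subjects at every treatment level, so age cannot confound the comparison.
```

The block design guarantees balance on the blocking variable by construction, while the completely randomized design relies on chance to spread ages across treatment levels.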

Sometimes randomization isn’t practical or ethical , so researchers create partially-random or even non-random designs. An experimental design where treatments aren’t randomly assigned is called a quasi-experimental design .

Between-subjects vs. within-subjects

In a between-subjects design (also known as an independent measures design or classic ANOVA design), individuals receive only one of the possible levels of an experimental treatment.

In medical or social research, you might also use matched pairs within your between-subjects design to make sure that each treatment group contains the same variety of test subjects in the same proportions.

In a within-subjects design (also known as a repeated measures design), every individual receives each of the experimental treatments consecutively, and their responses to each treatment are measured.

Within-subjects or repeated measures can also refer to an experimental design where an effect emerges over time, and individual responses are measured over time in order to measure this effect as it emerges.

Counterbalancing (randomizing or reversing the order of treatments among subjects) is often used in within-subjects designs to ensure that the order of treatment application doesn’t influence the results of the experiment.

Phone use and sleep
  Between-subjects (independent measures) design: Subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment.
  Within-subjects (repeated measures) design: Subjects are assigned consecutively to zero, low, and high levels of phone use throughout the experiment, and the order in which they follow these treatments is randomized.

Temperature and soil respiration
  Between-subjects (independent measures) design: Warming treatments are assigned to soil plots at random and the soils are kept at this temperature throughout the experiment.
  Within-subjects (repeated measures) design: Every plot receives each warming treatment (1, 3, 5, 8, and 10 °C above ambient temperature) consecutively over the course of the experiment, and the order in which they receive these treatments is randomized.
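Counterbalancing by randomizing treatment order can be sketched by giving each subject an independently shuffled sequence of the treatment levels. This is a fully randomized ordering rather than a formal Latin square, and the subject IDs are hypothetical.

```python
import random

random.seed(7)
levels = ["none", "low", "high"]
subjects = ["s1", "s2", "s3", "s4", "s5", "s6"]

# Each subject receives every treatment level, but in an independently
# shuffled order, so order effects average out across the sample.
orders = {}
for s in subjects:
    seq = levels[:]
    random.shuffle(seq)
    orders[s] = seq
```

Every subject still experiences all three levels; only the order varies, which is what prevents the order of treatment application from systematically influencing the results.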


Step 5: Measure your dependent variable

Finally, you need to decide how you'll collect data on your dependent variable outcomes. You should aim for reliable and valid measurements that minimize research bias or error.

Some variables, like temperature, can be objectively measured with scientific instruments. Others may need to be operationalized to turn them into measurable observations.

To measure hours of sleep in the phone use experiment, for example, you could:

  • Ask participants to record what time they go to sleep and get up each day.
  • Ask participants to wear a sleep tracker.

How precisely you measure your dependent variable also affects the kinds of statistical analysis you can use on your data.

Experiments are always context-dependent, and a good experimental design will take into account all of the unique considerations of your study system to produce information that is both valid and relevant to your research question.


Frequently asked questions about experiments

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment .

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Cite this Scribbr article


Bevans, R. (2023, June 21). Guide to Experimental Design | Overview, 5 steps & Examples. Scribbr. Retrieved June 18, 2024, from https://www.scribbr.com/methodology/experimental-design/


Introduction to Experimental and Quasi-Experimental Design

  • First Online: 26 April 2022


  • Melissa Whatley   ORCID: orcid.org/0000-0002-7073-6772 2  

Part of the book series: Springer Texts in Education ((SPTE))


This chapter introduces readers to main concepts in experimental and quasi-experimental design. First, randomized control trials are introduced as the primary example of experimental design. Next, nonexperimental contexts, and particularly the use of propensity score matching to approximate the conditions of randomized control trials, are described. Finally, this chapter introduces two quasi-experimental designs that are particularly useful in international education research: regression discontinuity and difference-in-differences.


The extent to which randomization is possible in these scenarios is an important consideration that we will discuss later in this chapter.

Note that here I report only a subset of Meegan and Kashima’s (2010) results as I focus on only one of their two experimental conditions.

Advanced readers who are interested in these alternatives can read more about exact matching (Rubin, 1973 ), genetic matching (Diamond & Sekhon, 2013 ), or coarsened exact matching (Iacus, King, & Porro, 2011 ).

Sometimes, researchers are especially interested in estimating the impact of the treatment on the treated units (the average treatment effect on the treated [ATT]) rather than the average treatment effect for all units in a dataset (ATE). In this case, no treatment units are dropped from the dataset, even though some control units are.

A discussion of propensity score matching approaches is beyond the scope of this book, but several of the suggested readings at the end of this chapter delve deeper into matching techniques and their benefits and drawbacks.

Note that Iriondo (2020) estimates the impact of Erasmus participation on employment and salary using two different datasets. The results presented in this section correspond to his results for the Labor Insertion Survey.

Regression discontinuity and difference-in-differences are not the only two quasi-experimental designs available to researchers (other quasi-experimental designs include instrumental variable and time-series analyses). However, they are two of the more common designs and are also possibly the most useful to individuals conducting research in international education.

Recommended Reading

A deeper dive

Cunningham, S. (2021). Causal inference: The mixtape . Yale University Press.


DesJardins, S. L., & Flaster, A. (2013). Nonexperimental designs and causal analyses of college access, persistence, and completion. In L. W. Perna & A. Jones (Eds.), The state of college access and completion: Improving college success for students from underrepresented groups (pp. 190–207). Routledge.

Diamond, A., & Sekhon, J. S. (2013). Genetic matching for estimating causal effects: A general multivariate matching method for achieving balance in observational studies. Review of Economics and Statistics, 95 (3), 932–945.


Furquim, F., Corral, D., & Hillman, N. (2020). A primer for interpreting and designing difference-in-difference studies in higher education research. In L. Perna (Ed.), Higher education: Handbook of theory and research (Vol. 35, pp. 2–53).

Iacus, S. M., King, G., & Porro, G. (2011). Multivariate matching methods that are monotonic imbalance bounding. Journal of the American Statistical Association, 106 (493), 345–361.

Murnane, R. J., & Willett, J. B. (2011). Methods matter: Improving causal inference in educational and social science research. Oxford University Press.

Reynolds, C. L., & DesJardins, S. L. (2009). The use of matching methods in higher education research: Answering whether attendance at a 2-year institution results in differences in educational attainment. In  Higher education: Handbook of theory and research  (pp. 47–97). Springer.

Rosenbaum, P. R., & Rubin, D. B. (1985). Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. The American Statistician, 39 (1), 33–38.

Rubin, D. B. (1973). The use of matched sampling and regression adjustment to remove bias in observational studies.  Biometrics , (Vol. 29, pp.185–203).

Additional Examples

d’Hombres, B., & Schnepf, S. V. (2021). International mobility of students in Italy and the UK: Does it pay off and for whom? Higher Education . https://doi.org/10.1007/s10734-020-00631-1

Dicks, A., & Lancee, B. (2018). Double disadvantage in school? Children of immigrants and the relative age effect: A regression discontinuity design based on the month of birth. European Sociological Review, 34 (3), 319–333.

Marini, G., & Yang, L. (2021). Globally bred Chinese talents returning home: An analysis of a reverse brain-drain flagship policy. Science and Public Policy . https://doi.org/10.1093/scipol/scab021

Monogan, J. E., & Doctor, A. C. (2017). Immigration politics and partisan realignment: California, Texas, and the 1994 election. State Politics & Policy Quarterly, 17 (1), 3–23.


Author information

Authors and affiliations.

School for International Training, Brattleboro, VT, USA

Melissa Whatley


Corresponding author

Correspondence to Melissa Whatley .


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Whatley, M. (2022). Introduction to Experimental and Quasi-Experimental Design. In: Introduction to Quantitative Analysis for International Educators. Springer Texts in Education. Springer, Cham. https://doi.org/10.1007/978-3-030-93831-4_9


DOI : https://doi.org/10.1007/978-3-030-93831-4_9

Published : 26 April 2022

Publisher Name : Springer, Cham

Print ISBN : 978-3-030-93830-7

Online ISBN : 978-3-030-93831-4



Logo for Mavs Open Press

Want to create or adapt books like this? Learn more about how Pressbooks supports open publishing practices.

8.2 Quasi-experimental and pre-experimental designs

Learning objectives.

  • Identify and describe the various types of quasi-experimental designs
  • Distinguish true experimental designs from quasi-experimental and pre-experimental designs
  • Identify and describe the various types of quasi-experimental and pre-experimental designs

As we discussed in the previous section, time, funding, and ethics may limit a researcher’s ability to conduct a true experiment. For researchers in the medical sciences and social work, conducting a true experiment could require denying needed treatment to clients, which is a clear ethical violation. Even those whose research may not involve the administration of needed medications or treatments may be limited in their ability to conduct a classic experiment. When true experiments are not possible, researchers often use quasi-experimental designs.

Quasi-experimental designs

Quasi-experimental designs are similar to true experiments, but they lack random assignment to experimental and control groups. Quasi-experimental designs have a comparison group that is similar to a control group except assignment to the comparison group is not determined by random assignment. The most basic of these quasi-experimental designs is the nonequivalent comparison groups design (Rubin & Babbie, 2017).  The nonequivalent comparison group design looks a lot like the classic experimental design, except it does not use random assignment. In many cases, these groups may already exist. For example, a researcher might conduct research at two different agency sites, one of which receives the intervention and the other does not. No one was assigned to treatment or comparison groups. Those groupings existed prior to the study. While this method is more convenient for real-world research, it is less likely that that the groups are comparable than if they had been determined by random assignment. Perhaps the treatment group has a characteristic that is unique–for example, higher income or different diagnoses–that make the treatment more effective.

Quasi-experiments are particularly useful in social welfare policy research. Social welfare policy researchers often look for what are termed natural experiments, or situations in which comparable groups are created by differences that already occur in the real world. Natural experiments are a feature of the social world that allows researchers to use the logic of experimental design to investigate the connection between variables. For example, Stratmann and Wille (2016) were interested in the effects of a state healthcare policy called Certificate of Need on the quality of hospitals. They clearly could not randomly assign states to adopt one set of policies or another. Instead, the researchers used hospital referral regions, or the areas from which hospitals draw their patients, that spanned state lines. Because the hospitals were in the same referral region, the researchers could be reasonably confident that patient characteristics were similar. In this way, they could classify patients into experimental and comparison groups without dictating state policy or telling people where to live.


Matching is another approach in quasi-experimental design for assigning people to experimental and comparison groups. It begins with researchers thinking about what variables are important in their study, particularly demographic variables or attributes that might impact their dependent variable. Individual matching involves pairing participants with similar attributes. Then, the matched pair is split, with one participant going to the experimental group and the other to the comparison group. An ex post facto control group, in contrast, is created when a researcher matches individuals after the intervention is administered to some participants. Finally, researchers may engage in aggregate matching, in which the comparison group as a whole is selected to be similar to the experimental group on important variables (such as average age or income), rather than by pairing individuals.
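Individual matching can be sketched in a few lines of code. This is a minimal illustration with hypothetical participants and a single matching attribute (age); real studies typically match on several demographic variables at once:

```python
import random

# Hypothetical participants, each with an attribute we want balanced across groups.
participants = [
    {"name": "A", "age": 23}, {"name": "B", "age": 24},
    {"name": "C", "age": 41}, {"name": "D", "age": 40},
    {"name": "E", "age": 67}, {"name": "F", "age": 65},
]

def individual_match(people, key):
    """Pair participants with similar attribute values, then split each
    pair between the experimental and comparison groups."""
    ordered = sorted(people, key=lambda p: p[key])
    experimental, comparison = [], []
    # After sorting, adjacent participants form the most similar pairs.
    for first, second in zip(ordered[::2], ordered[1::2]):
        pair = [first, second]
        random.shuffle(pair)  # coin flip within the matched pair
        experimental.append(pair[0])
        comparison.append(pair[1])
    return experimental, comparison

exp, comp = individual_match(participants, "age")
print([p["name"] for p in exp], [p["name"] for p in comp])
```

Whichever way each coin flip lands, the two groups end up with nearly identical age distributions, which is the point of matching.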

Time series design

There are many different quasi-experimental designs in addition to the nonequivalent comparison group design described earlier. Describing all of them is beyond the scope of this textbook, but one more design is worth mentioning. The time series design uses multiple observations before and after an intervention. In some cases, experimental and comparison groups are used. In other cases where that is not feasible, a single experimental group is used. By using multiple observations before and after the intervention, the researcher can better understand the true value of the dependent variable in each participant before the intervention starts. Additionally, multiple observations afterwards allow the researcher to see whether the intervention had lasting effects on participants. Time series designs are similar to single-subjects designs, which we will discuss in Chapter 15.

Pre-experimental design

When true experiments and quasi-experiments are not possible, researchers may turn to a pre-experimental design (Campbell & Stanley, 1963). Pre-experimental designs are called such because they often happen as a precursor to conducting a true experiment. Researchers want to see if their interventions will have some effect on a small group of people before they seek funding and dedicate time to conduct a true experiment. Pre-experimental designs, thus, are usually conducted as a first step towards establishing the evidence for or against an intervention. However, this type of design comes with some unique disadvantages, which we'll describe below.

A commonly used type of pre-experiment is the one-group pretest-posttest design. In this design, pre- and posttests are both administered, but there is no comparison group against which to compare the experimental group. Researchers may be able to claim that participants receiving the treatment experienced a change in the dependent variable, but they cannot claim that the change was the result of the treatment without a comparison group. Imagine if the students in your research class completed a questionnaire about their level of stress at the beginning of the semester. Then your professor taught you mindfulness techniques throughout the semester. At the end of the semester, she administers the stress survey again. What if levels of stress went up? Could she conclude that the mindfulness techniques caused stress? Not without a comparison group! If there were a comparison group, she would be able to recognize that all students experienced higher stress at the end of the semester than at the beginning, not just the students in her research class.
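With hypothetical numbers, the stress example shows why the comparison group matters: subtracting the comparison group's change turns a misleading raw pre-post difference into a more defensible estimate (essentially a difference-in-differences calculation):

```python
# Hypothetical mean stress scores (0-100) for the mindfulness example above.
research_class = {"pre": 40, "post": 55}   # received mindfulness training
comparison     = {"pre": 41, "post": 62}   # a similar class with no training

# With only the one-group pretest-posttest, stress appears to rise under the treatment:
naive_change = research_class["post"] - research_class["pre"]   # +15

# The comparison group reveals that everyone's stress rose at semester's end;
# relative to that trend, the trained class actually fared better.
did = naive_change - (comparison["post"] - comparison["pre"])   # 15 - 21 = -6

print(naive_change, did)
```

The naive conclusion ("mindfulness raised stress by 15 points") reverses sign once the comparison group's 21-point rise is accounted for.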

In cases where the administration of a pretest is cost prohibitive or otherwise not possible, a one-shot case study design might be used. In this instance, no pretest is administered, nor is a comparison group present. If we wished to measure the impact of a natural disaster, such as Hurricane Katrina, we might conduct a pre-experiment by identifying a community that was hit by the hurricane and then measuring the levels of stress in the community. Researchers using this design must be extremely cautious about making claims regarding the effect of the treatment or stimulus. They have no idea what the levels of stress in the community were before the hurricane hit, nor can they compare the stress levels to a community that was not affected by the hurricane. Nonetheless, this design can be useful for exploratory studies aimed at testing a measure or the feasibility of further study.

In our example of the study of the impact of Hurricane Katrina, a researcher might choose to examine the effects of the hurricane by identifying a group from a community that experienced the hurricane and a comparison group from a similar community that had not been hit by the hurricane. This study design, called a static group comparison, has the advantage of including a comparison group that did not experience the stimulus (in this case, the hurricane). Unfortunately, the design uses only posttests, so it is not possible to know whether the groups were comparable before the stimulus or intervention. As you might have guessed from our example, static group comparisons are useful in cases where a researcher cannot control or predict whether, when, or how the stimulus is administered, as in the case of natural disasters.

As implied by the preceding examples where we considered studying the impact of Hurricane Katrina, experiments, quasi-experiments, and pre-experiments do not necessarily need to take place in the controlled setting of a lab. In fact, many applied researchers rely on experiments to assess the impact and effectiveness of various programs and policies. You might recall our discussion of arresting perpetrators of domestic violence in Chapter 2, which is an excellent example of an applied experiment. Researchers did not subject participants to conditions in a lab setting; instead, they applied their stimulus (in this case, arrest) to some subjects in the field and they also had a control group in the field that did not receive the stimulus (and therefore were not arrested).

Key Takeaways

  • Quasi-experimental designs do not use random assignment.
  • Comparison groups are used in quasi-experiments.
  • Matching is a way of improving the comparability of experimental and comparison groups.
  • Quasi-experimental designs and pre-experimental designs are often used when experimental designs are impractical.
  • Quasi-experimental and pre-experimental designs may be easier to carry out, but they lack the rigor of true experiments.

Glossary

  • Aggregate matching – when the comparison group as a whole is selected to be similar to the experimental group along important variables
  • Comparison group – a group in quasi-experimental design that does not receive the experimental treatment; it is similar to a control group except assignment to the comparison group is not determined by random assignment
  • Ex post facto control group – a control group created when a researcher matches individuals after the intervention is administered
  • Individual matching – pairing participants with similar attributes for the purpose of assignment to groups
  • Natural experiments – situations in which comparable groups are created by differences that already occur in the real world
  • Nonequivalent comparison group design – a quasi-experimental design similar to a classic experimental design but without random assignment
  • One-group pretest-posttest design – a pre-experimental design that applies an intervention to one group and includes both a pretest and a posttest
  • One-shot case study – a pre-experimental design that applies an intervention to only one group without a pretest
  • Pre-experimental designs – a variation of experimental design that lacks the rigor of experiments and is often used before a true experiment is conducted
  • Quasi-experimental design – a design that lacks random assignment to experimental and control groups
  • Static group comparison – a pre-experimental design that uses an experimental group and a comparison group, without random assignment or pretesting
  • Time series design – a quasi-experimental design that uses multiple observations before and after an intervention


Foundations of Social Work Research Copyright © 2020 by Rebecca L. Mauldin is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


14 - Quasi-Experimental Research

from Part III - Data Collection

Published online by Cambridge University Press:  25 May 2023

In this chapter, we discuss the logic and practice of quasi-experimentation. Specifically, we describe four quasi-experimental designs – one-group pretest–posttest designs, non-equivalent group designs, regression discontinuity designs, and interrupted time-series designs – and their statistical analyses in detail. Both simple quasi-experimental designs and embellishments of these simple designs are presented. Potential threats to internal validity are illustrated along with means of addressing their potentially biasing effects so that these effects can be minimized. In contrast to quasi-experiments, randomized experiments are often thought to be the gold standard when estimating the effects of treatment interventions. However, circumstances frequently arise where quasi-experiments can usefully supplement randomized experiments or when quasi-experiments can fruitfully be used in place of randomized experiments. Researchers need to appreciate the relative strengths and weaknesses of the various quasi-experiments so they can choose among pre-specified designs or craft their own unique quasi-experiments.


  • Quasi-Experimental Research
  • By Charles S. Reichardt , Daniel Storage , Damon Abraham
  • Edited by Austin Lee Nichols , Central European University, Vienna , John Edlund , Rochester Institute of Technology, New York
  • Book: The Cambridge Handbook of Research Methods and Statistics for the Social and Behavioral Sciences
  • Online publication: 25 May 2023
  • Chapter DOI: https://doi.org/10.1017/9781009010054.015


  • J Am Med Inform Assoc
  • v.13(1); Jan-Feb 2006

The Use and Interpretation of Quasi-Experimental Studies in Medical Informatics

Abstract

Quasi-experimental study designs, often described as nonrandomized, pre-post intervention studies, are common in the medical informatics literature. Yet little has been written about the benefits and limitations of the quasi-experimental approach as applied to informatics studies. This paper outlines a relative hierarchy and nomenclature of quasi-experimental study designs that is applicable to medical informatics intervention studies. In addition, the authors performed a systematic review of two medical informatics journals, the Journal of the American Medical Informatics Association (JAMIA) and the International Journal of Medical Informatics (IJMI), to determine the number of quasi-experimental studies published and how the studies are classified on the above-mentioned relative hierarchy. They hope that future medical informatics studies will implement higher level quasi-experimental study designs that yield more convincing evidence for causal links between medical informatics interventions and outcomes.

Quasi-experimental studies encompass a broad range of nonrandomized intervention studies. These designs are frequently used when it is not logistically feasible or ethical to conduct a randomized controlled trial. Examples of quasi-experimental studies follow. As one example of a quasi-experimental study, a hospital introduces a new order-entry system and wishes to study the impact of this intervention on the number of medication-related adverse events before and after the intervention. As another example, an informatics technology group is introducing a pharmacy order-entry system aimed at decreasing pharmacy costs. The intervention is implemented and pharmacy costs before and after the intervention are measured.

In medical informatics, the quasi-experimental, sometimes called the pre-post intervention, design often is used to evaluate the benefits of specific interventions. The increasing capacity of health care institutions to collect routine clinical data has led to the growing use of quasi-experimental study designs in the field of medical informatics as well as in other medical disciplines. However, little is written about these study designs in the medical literature or in traditional epidemiology textbooks. 1 , 2 , 3 In contrast, the social sciences literature is replete with examples of ways to implement and improve quasi-experimental studies. 4 , 5 , 6

In this paper, we review the different pretest-posttest quasi-experimental study designs, their nomenclature, and the relative hierarchy of these designs with respect to their ability to establish causal associations between an intervention and an outcome. The example of a pharmacy order-entry system aimed at decreasing pharmacy costs will be used throughout this article to illustrate the different quasi-experimental designs. We discuss limitations of quasi-experimental designs and offer methods to improve them. We also perform a systematic review of four years of publications from two informatics journals to determine the number of quasi-experimental studies, classify these studies into their application domains, determine whether the potential limitations of quasi-experimental studies were acknowledged by the authors, and place these studies into the above-mentioned relative hierarchy.

The authors reviewed articles and book chapters on the design of quasi-experimental studies. 4 , 5 , 6 , 7 , 8 , 9 , 10 Most of the reviewed articles referenced two textbooks that were then reviewed in depth. 4 , 6

We identified key advantages and disadvantages of quasi-experimental studies as they pertain to the study of medical informatics, along with the potential methodological flaws of quasi-experimental medical informatics studies that can introduce bias. In addition, we present a summary table outlining a relative hierarchy and nomenclature of quasi-experimental study designs. In general, the higher a design sits in the hierarchy, the greater the internal validity the study traditionally possesses, because the evidence of a potential causal link between the intervention and the outcome is stronger. 4

We then performed a systematic review of four years of publications from two informatics journals. First, we determined the number of quasi-experimental studies. We then classified these studies on the above-mentioned hierarchy. We also classified the quasi-experimental studies according to their application domain. The categories of application domains employed were based on categorization used by Yearbooks of Medical Informatics 1992–2005 and were similar to the categories of application domains employed by Annual Symposiums of the American Medical Informatics Association. 11 The categories were (1) health and clinical management; (2) patient records; (3) health information systems; (4) medical signal processing and biomedical imaging; (5) decision support, knowledge representation, and management; (6) education and consumer informatics; and (7) bioinformatics. Because the quasi-experimental study design has recognized limitations, we sought to determine whether authors acknowledged the potential limitations of this design. Examples of acknowledgment included mention of lack of randomization, the potential for regression to the mean, the presence of temporal confounders and the mention of another design that would have more internal validity.

All original scientific manuscripts published between January 2000 and December 2003 in the Journal of the American Medical Informatics Association (JAMIA) and the International Journal of Medical Informatics (IJMI) were reviewed. One author (ADH) reviewed all the papers to identify the number of quasi-experimental studies. Other authors (ADH, JCM, JF) then independently reviewed all the studies identified as quasi-experimental. The three authors then convened as a group to resolve any disagreements in study classification, application domain, and acknowledgment of limitations.

Results and Discussion

What Is a Quasi-experiment?

Quasi-experiments are studies that aim to evaluate interventions but that do not use randomization. Similar to randomized trials, quasi-experiments aim to demonstrate causality between an intervention and an outcome. Quasi-experimental studies can use both preintervention and postintervention measurements as well as nonrandomly selected control groups.

Using this basic definition, it is evident that many published studies in medical informatics utilize the quasi-experimental design. Although the randomized controlled trial is generally considered to have the highest level of credibility with regard to assessing causality, in medical informatics, researchers often choose not to randomize the intervention for one or more reasons: (1) ethical considerations, (2) difficulty of randomizing subjects, (3) difficulty of randomizing by location (e.g., by ward), and (4) small available sample sizes. Each of these reasons is discussed below.

Ethical considerations typically will not allow random withholding of an intervention with known efficacy. Thus, if the efficacy of an intervention has not been established, a randomized controlled trial is the design of choice to determine efficacy. But if the intervention under study incorporates an accepted, well-established therapeutic intervention, or if the intervention has either questionable efficacy or safety based on previously conducted studies, then the ethical issues of randomizing patients are sometimes raised. In the area of medical informatics, it is often believed prior to an implementation that an informatics intervention will likely be beneficial and thus medical informaticians and hospital administrators are often reluctant to randomize medical informatics interventions. In addition, there is often pressure to implement the intervention quickly because of its believed efficacy, thus not allowing researchers sufficient time to plan a randomized trial.

For medical informatics interventions, it is often difficult to randomize the intervention to individual patients or to individual informatics users. So while this randomization is technically possible, it is underused and thus compromises the eventual strength of concluding that an informatics intervention resulted in an outcome. For example, randomly allowing only half of medical residents to use pharmacy order-entry software at a tertiary care hospital is a scenario that hospital administrators and informatics users may not agree to for numerous reasons.

Similarly, informatics interventions often cannot be randomized to individual locations. Using the pharmacy order-entry system example, it may be difficult to randomize use of the system to only certain locations in a hospital or portions of certain locations. For example, if the pharmacy order-entry system involves an educational component, then people may apply the knowledge learned to nonintervention wards, thereby potentially masking the true effect of the intervention. When a design using randomized locations is employed successfully, the locations may be different in other respects (confounding variables), and this further complicates the analysis and interpretation.

In situations where it is known that only a small sample size will be available to test the efficacy of an intervention, randomization may not be a viable option. Randomization is beneficial because on average it tends to evenly distribute both known and unknown confounding variables between the intervention and control group. However, when the sample size is small, randomization may not adequately accomplish this balance. Thus, alternative design and analytical methods are often used in place of randomization when only small sample sizes are available.
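The small-sample caveat can be illustrated with a quick simulation (hypothetical ages standing in for a confounding variable): randomization balances the arms on average, but any single small randomized trial can leave the arms noticeably imbalanced.

```python
import random
import statistics

random.seed(1)

def imbalance(n):
    """Randomize n subjects (each carrying a confounder, e.g. age) to two
    arms and return the absolute difference in mean age between the arms."""
    ages = [random.gauss(50, 15) for _ in range(n)]
    random.shuffle(ages)
    arm_a, arm_b = ages[: n // 2], ages[n // 2 :]
    return abs(statistics.mean(arm_a) - statistics.mean(arm_b))

# Average imbalance over many simulated randomized trials:
small = statistics.mean(imbalance(10) for _ in range(2000))
large = statistics.mean(imbalance(200) for _ in range(2000))
print(f"mean imbalance, n=10:  {small:.2f}")
print(f"mean imbalance, n=200: {large:.2f}")
```

The typical between-arm imbalance shrinks sharply as the sample grows, which is why randomization alone cannot be relied on to balance confounders in small studies.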

What Are the Threats to Establishing Causality When Using Quasi-experimental Designs in Medical Informatics?

The lack of random assignment is the major weakness of the quasi-experimental study design. Associations identified in quasi-experiments meet one important requirement of causality since the intervention precedes the measurement of the outcome. Another requirement is that the outcome can be demonstrated to vary statistically with the intervention. Unfortunately, statistical association does not imply causality, especially if the study is poorly designed. Thus, in many quasi-experiments, one is most often left with the question: “Are there alternative explanations for the apparent causal association?” If these alternative explanations are credible, then the evidence of causation is less convincing. These rival hypotheses, or alternative explanations, arise from principles of epidemiologic study design.

Shadish et al. 4 outline nine threats to internal validity, listed below. Internal validity is defined as the degree to which observed changes in outcomes can be correctly inferred to be caused by an exposure or an intervention. In quasi-experimental studies of medical informatics, we believe that the methodological principles that most often result in alternative explanations for the apparent causal effect are (a) difficulty in measuring or controlling for important confounding variables, particularly unmeasured confounding variables, which can be viewed as a subset of the selection threat listed below; and (b) results being explained by the statistical principle of regression to the mean. Each of these two principles is discussed in turn.

Threats to Internal Validity

1. Ambiguous temporal precedence: Lack of clarity about whether intervention occurred before outcome
2. Selection: Systematic differences over conditions in respondent characteristics that could also cause the observed effect
3. History: Events occurring concurrently with intervention could cause the observed effect
4. Maturation: Naturally occurring changes over time could be confused with a treatment effect
5. Regression: When units are selected for their extreme scores, they will often have less extreme subsequent scores, an occurrence that can be confused with an intervention effect
6. Attrition: Loss of respondents can produce artifactual effects if that loss is correlated with intervention
7. Testing: Exposure to a test can affect scores on subsequent exposures to that test
8. Instrumentation: The nature of a measurement may change over time or conditions
9. Interactive effects: The impact of an intervention may depend on the level of another intervention

Adapted from Shadish et al. 4

An inability to sufficiently control for important confounding variables arises from the lack of randomization. A variable is a confounding variable if it is associated with the exposure of interest and is also associated with the outcome of interest; the confounding variable leads to a situation where a causal association between a given exposure and an outcome is observed as a result of the influence of the confounding variable. For example, in a study aiming to demonstrate that the introduction of a pharmacy order-entry system led to lower pharmacy costs, there are a number of important potential confounding variables (e.g., severity of illness of the patients, knowledge and experience of the software users, other changes in hospital policy) that may have differed in the preintervention and postintervention time periods. In a multivariable regression, the first confounding variable could be addressed with severity of illness measures, but the second would be difficult if not nearly impossible to measure and control. In addition, potential confounding variables that are unmeasured or immeasurable cannot be controlled for in nonrandomized quasi-experimental study designs and can only be properly controlled by the randomization process in randomized controlled trials.

[Figure: Example of confounding. To get the true effect of the intervention of interest, we need to control for the confounding variable.]

Another important threat to establishing causality is regression to the mean. 12 , 13 , 14 This widespread statistical phenomenon can result in wrongly concluding that an effect is due to the intervention when in reality it is due to chance. The phenomenon was first described in 1886 by Francis Galton who measured the adult height of children and their parents. He noted that when the average height of the parents was greater than the mean of the population, the children tended to be shorter than their parents, and conversely, when the average height of the parents was shorter than the population mean, the children tended to be taller than their parents.

In medical informatics, what often triggers the development and implementation of an intervention is a rise in the rate above the mean or norm. For example, increasing pharmacy costs and adverse events may prompt hospital informatics personnel to design and implement pharmacy order-entry systems. If this rise in costs or adverse events is really just an extreme observation that is still within the normal range of the hospital's pharmaceutical costs (i.e., the mean pharmaceutical cost for the hospital has not shifted), then the statistical principle of regression to the mean predicts that these elevated rates will tend to decline even without intervention. However, often informatics personnel and hospital administrators cannot wait passively for this decline to occur. Therefore, hospital personnel often implement one or more interventions, and if a decline in the rate occurs, they may mistakenly conclude that the decline is causally related to the intervention. In fact, an alternative explanation for the finding could be regression to the mean.
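A small simulation (with hypothetical cost figures) makes regression to the mean concrete: even when the underlying cost process never changes, months selected for being unusually expensive tend to be followed by cheaper months, with no intervention at all.

```python
import random
import statistics

random.seed(42)

# Simulate 10,000 months of pharmacy costs from a *stable* process:
# no trend, no intervention, just noise around a fixed mean.
costs = [random.gauss(100_000, 10_000) for _ in range(10_000)]
mean_cost = statistics.mean(costs)

# Flag "alarming" months (top decile) and look at the month following each one.
threshold = sorted(costs)[int(0.9 * len(costs))]
extremes  = [costs[i]     for i in range(len(costs) - 1) if costs[i] > threshold]
followers = [costs[i + 1] for i in range(len(costs) - 1) if costs[i] > threshold]

print(f"overall mean:             {mean_cost:,.0f}")
print(f"mean of alarming months:  {statistics.mean(extremes):,.0f}")
print(f"mean of following months: {statistics.mean(followers):,.0f}")
```

Had an intervention been launched in each alarming month, the subsequent drop back toward the overall mean could easily be mistaken for a treatment effect.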

What Are the Different Quasi-experimental Study Designs?

In the social sciences literature, quasi-experimental studies are divided into four study design groups 4 , 6 :

  • Quasi-experimental designs without control groups
  • Quasi-experimental designs that use control groups but no pretest
  • Quasi-experimental designs that use control groups and pretests
  • Interrupted time-series designs

There is a relative hierarchy within these categories of study designs, with category D studies being sounder than categories C, B, or A in terms of establishing causality. Thus, if feasible from a design and implementation point of view, investigators should aim to design studies that fall into the higher-rated categories. Shadish et al. 4 discuss 17 possible designs: seven designs fall into category A, three into category B, six into category C, and one major design into category D. In our review, we determined that most medical informatics quasi-experiments could be characterized by 11 of the 17 designs (six study designs in category A, one in category B, three in category C, and one in category D) because the other study designs were not used or feasible in the medical informatics literature. Thus, for simplicity, we have summarized the 11 study designs most relevant to medical informatics research in the table below.

Relative Hierarchy of Quasi-experimental Designs

Quasi-experimental Study Design | Design Notation
A. Quasi-experimental designs without control groups
    1. The one-group posttest-only design | X O1
    2. The one-group pretest-posttest design | O1 X O2
    3. The one-group pretest-posttest design using a double pretest | O1 O2 X O3
    4. The one-group pretest-posttest design using a nonequivalent dependent variable | (O1a, O1b) X (O2a, O2b)
    5. The removed-treatment design | O1 X O2 O3 removeX O4
    6. The repeated-treatment design | O1 X O2 removeX O3 X O4
B. Quasi-experimental designs that use a control group but no pretest
    1. Posttest-only design with nonequivalent groups | Intervention group: X O1; Control group: O2
C. Quasi-experimental designs that use control groups and pretests
    1. Untreated control group design with dependent pretest and posttest samples | Intervention group: O1a X O2a; Control group: O1b O2b
    2. Untreated control group design with dependent pretest and posttest samples using a double pretest | Intervention group: O1a O2a X O3a; Control group: O1b O2b O3b
    3. Untreated control group design with dependent pretest and posttest samples using switching replications | Intervention group: O1a X O2a O3a; Control group: O1b O2b X O3b
D. Interrupted time-series design
    1. Multiple pretest and posttest observations spaced at equal intervals of time | O1 O2 O3 O4 O5 X O6 O7 O8 O9 O10

O = Observational Measurement; X = Intervention Under Study. Time moves from left to right.

This nomenclature and relative hierarchy were used in the systematic review of four years of JAMIA and the IJMI. Similar to the relative hierarchy in the evidence-based medicine literature that ranks randomized controlled trials, cohort studies, case-control studies, and case series, the hierarchy in ▶ is not absolute: in some cases it may be infeasible to perform a higher-level study, and there may be instances where an A6 design establishes stronger causality than a B1 design. 15 , 16 , 17

Quasi-experimental Designs without Control Groups

The One-Group Posttest-Only Design

X O1

Here, X is the intervention and O is the outcome variable (this notation is continued throughout the article). In this study design, an intervention (X) is implemented and a posttest observation (O1) is taken. For example, X could be the introduction of a pharmacy order-entry intervention and O1 could be the pharmacy costs following the intervention. This design is the weakest of the quasi-experimental designs that are discussed in this article. Without any pretest observations or a control group, there are multiple threats to internal validity. Unfortunately, this study design is often used in medical informatics when new software is introduced since it may be difficult to have pretest measurements due to time, technical, or cost constraints.

The One-Group Pretest-Posttest Design

O1 X O2

This is a commonly used study design. A single pretest measurement is taken (O1), an intervention (X) is implemented, and a posttest measurement is taken (O2). In this instance, period O1 frequently serves as the “control” period. For example, O1 could be pharmacy costs prior to the intervention, X could be the introduction of a pharmacy order-entry system, and O2 could be the pharmacy costs following the intervention. Including a pretest provides some information about what the pharmacy costs would have been had the intervention not occurred.
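As a minimal sketch (all figures hypothetical), the A2 comparison reduces to contrasting summaries of the pretest and posttest periods; note that nothing in this arithmetic rules out confounding or regression to the mean as explanations for the change.

```python
# A2 (one-group pretest-posttest) boils down to comparing summaries of a
# pre and a post period. Hypothetical monthly pharmacy costs (in $1000s).
pre  = [102, 98, 105, 101, 99, 103]   # O1: before the order-entry system
post = [ 93, 95,  90,  94, 92,  96]   # O2: after the order-entry system

mean_pre = sum(pre) / len(pre)
mean_post = sum(post) / len(post)
effect = mean_post - mean_pre  # naive estimate; confounding and regression
                               # to the mean remain uncontrolled
print(f"estimated change: {effect:.1f}k per month")
```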

The One-Group Pretest-Posttest Design Using a Double Pretest

O1 O2 X O3

The advantage of this study design over A2 is that adding a second pretest prior to the intervention helps provide evidence that can be used to refute the phenomenon of regression to the mean and confounding as alternative explanations for any observed association between the intervention and the posttest outcome. For example, in a study where a pharmacy order-entry system led to lower pharmacy costs (O3 < O2 and O1), if one had two preintervention measurements of pharmacy costs (O1 and O2) and they were both elevated, this would suggest that there was a decreased likelihood that O3 is lower due to confounding and regression to the mean. Similarly, extending this study design by increasing the number of measurements postintervention could also help to provide evidence against confounding and regression to the mean as alternate explanations for observed associations.

The One-Group Pretest-Posttest Design Using a Nonequivalent Dependent Variable

(O1a, O1b) X (O2a, O2b)

This design involves the inclusion of a nonequivalent dependent variable ( b ) in addition to the primary dependent variable ( a ). Variables a and b should assess similar constructs; that is, the two measures should be affected by similar factors and confounding variables except for the effect of the intervention. Variable a is expected to change because of the intervention X, whereas variable b is not. Taking our example, variable a could be pharmacy costs and variable b could be the length of stay of patients. If our informatics intervention is aimed at decreasing pharmacy costs, we would expect to observe a decrease in pharmacy costs but not in the average length of stay of patients. However, a number of important confounding variables, such as severity of illness and knowledge of software users, might affect both outcome measures. Thus, if the average length of stay did not change following the intervention but pharmacy costs did, then the data are more convincing than if just pharmacy costs were measured.

The Removed-Treatment Design

O1 X O2 O3 removeX O4

This design adds a third posttest measurement (O3) to the one-group pretest-posttest design and then removes the intervention before a final measure (O4) is made. The advantage of this design is that it allows one to test hypotheses about the outcome in the presence of the intervention and in the absence of the intervention. Thus, if one predicts a decrease in the outcome between O1 and O2 (after implementation of the intervention), then one would predict an increase in the outcome between O3 and O4 (after removal of the intervention). One caveat is that if the intervention is thought to have persistent effects, then O4 needs to be measured after these effects are likely to have disappeared. For example, a study would be more convincing if it demonstrated that pharmacy costs decreased after pharmacy order-entry system introduction (O2 and O3 less than O1) and that when the order-entry system was removed or disabled, the costs increased (O4 greater than O2 and O3 and closer to O1). In addition, there are often ethical issues in this design in terms of removing an intervention that may be providing benefit.

The Repeated-Treatment Design

O1 X O2 removeX O3 X O4

The advantage of this design is that it demonstrates reproducibility of the association between the intervention and the outcome. For example, the association is more likely to be causal if one demonstrates that a pharmacy order-entry system results in decreased pharmacy costs when it is first introduced and again when it is reintroduced following an interruption of the intervention. As for design A5, the assumption must be made that the effect of the intervention is transient, which is most often applicable to medical informatics interventions. Because subjects may serve as their own controls in this design, it may yield greater statistical efficiency with fewer subjects.

Quasi-experimental Designs That Use a Control Group but No Pretest

Posttest-Only Design with Nonequivalent Groups

Intervention group: X O1
Control group: O2

An intervention X is implemented for one group and compared to a second group. The use of a comparison group helps protect against certain threats to validity and allows statistical adjustment for confounding variables. However, because the two groups may not be equivalent (assignment to the groups is not by randomization), confounding may exist. For example, suppose that a pharmacy order-entry intervention was instituted in the medical intensive care unit (MICU) and not the surgical intensive care unit (SICU). O1 would be pharmacy costs in the MICU after the intervention and O2 would be pharmacy costs in the SICU after the intervention. The absence of a pretest makes it difficult to know whether a change has occurred in the MICU. Also, the absence of pretest measurements comparing the SICU to the MICU makes it difficult to know whether differences in O1 and O2 are due to the intervention or to other differences between the two units (confounding variables).

Quasi-experimental Designs That Use Control Groups and Pretests

The reader should note that in all the studies in this category, the intervention is not randomized; the control groups chosen are comparison groups. Obtaining pretest measurements on both the intervention and control groups allows one to assess the initial comparability of the groups. The assumption is that the more similar the intervention and control groups are at pretest, the smaller the likelihood that important confounding variables differ between the two groups.

Untreated Control Group Design with Dependent Pretest and Posttest Samples

Intervention group: O1a X O2a
Control group: O1b O2b

The use of both a pretest and a comparison group makes it easier to avoid certain threats to validity. However, because the two groups are nonequivalent (assignment to the groups is not by randomization), selection bias may exist. Selection bias exists when selection results in differences in unit characteristics between conditions that may be related to outcome differences. For example, suppose that a pharmacy order-entry intervention was instituted in the MICU and not the SICU. If preintervention pharmacy costs in the MICU (O1a) and SICU (O1b) are similar, it suggests that it is less likely that there are differences in the important confounding variables between the two units. If MICU postintervention costs (O2a) are less than preintervention MICU costs (O1a), but SICU costs (O1b) and (O2b) are similar, this suggests that the observed outcome may be causally related to the intervention.
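One common way to summarize a C1 design like the MICU/SICU example is a difference-in-differences contrast, which nets out time trends shared by the two units. A sketch with hypothetical cost figures:

```python
# Hypothetical design-C1 data: pre/post pharmacy costs (in $1000s) for the
# intervention unit (MICU, receives order entry) and control unit (SICU).
micu_pre, micu_post = 100.0, 88.0   # O1a, O2a
sicu_pre, sicu_post = 101.0, 99.0   # O1b, O2b

# Difference-in-differences: the change in the intervention unit minus the
# change in the control unit nets out time trends shared by both units.
did = (micu_post - micu_pre) - (sicu_post - sicu_pre)
print(did)  # -10.0 => costs fell ~$10k/month more in the MICU than the SICU
```

The estimate is only credible if, absent the intervention, the two units would have followed parallel cost trends, which is why the double-pretest variant (C2) is a useful strengthening.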

Untreated Control Group Design with Dependent Pretest and Posttest Samples Using a Double Pretest

Intervention group: O1a O2a X O3a
Control group: O1b O2b O3b

In this design, the pretests are administered at two different times. The main advantage of this design is that it controls for potentially different time-varying confounding effects in the intervention group and the comparison group. In our example, measuring at points O1 and O2 would allow for the assessment of preintervention time-dependent changes in pharmacy costs (e.g., due to differences in the experience of residents) in both the intervention and control groups, and of whether these changes were similar or different between the groups.

Untreated Control Group Design with Dependent Pretest and Posttest Samples Using Switching Replications

Intervention group: O1a X O2a O3a
Control group: O1b O2b X O3b

With this study design, the researcher administers the intervention at a later time to a group that initially served as a nonintervention control. The advantage of this design over design C2 is that it demonstrates reproducibility of the intervention effect in two different settings. The study design is not limited to two groups; in fact, the results have greater validity if the intervention effect is replicated in different groups at multiple times. In the example of a pharmacy order-entry system, one could intervene in the MICU and then, at a later time, intervene in the SICU. This design is often very applicable to medical informatics, where new technology and software are often introduced or made available gradually.

Interrupted Time-Series Designs

O1 O2 O3 O4 O5 X O6 O7 O8 O9 O10

An interrupted time-series design is one in which a string of consecutive observations equally spaced in time is interrupted by the imposition of a treatment or intervention. The advantage of this design is that, with multiple measurements both pre- and postintervention, it is easier to address and control for confounding and regression to the mean. In addition, the analysis is statistically more robust: one can detect changes in the slope or intercept as a result of the intervention, in addition to a change in the mean values. 18 A change in intercept could represent an immediate effect, while a change in slope could represent a gradual effect of the intervention on the outcome. In the example of a pharmacy order-entry system, O1 through O5 could represent monthly pharmacy costs preintervention and O6 through O10 monthly pharmacy costs after the introduction of the pharmacy order-entry system. Interrupted time-series designs can be further strengthened by incorporating many of the design features previously mentioned in other categories (such as removal of the treatment, inclusion of a nonequivalent dependent variable, or the addition of a control group).
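The intercept- and slope-change analysis described above is often implemented as segmented regression. The sketch below (simulated data, not from any study) fits a baseline trend plus level-change and slope-change terms around the interruption:

```python
import numpy as np

# Segmented regression for an interrupted time series (design D):
# hypothetical monthly pharmacy costs, intervention after month 5.
np.random.seed(0)
t = np.arange(10)                      # observations O1..O10
post = (t >= 5).astype(float)          # 1 after the order-entry go-live
y = 100 + 0.5 * t - 8 * post - 1.0 * (t - 5) * post  # true level & slope change
y = y + np.random.normal(0, 0.3, size=10)            # measurement noise

# Design matrix: intercept, baseline trend, level change, slope change
X = np.column_stack([np.ones(10), t, post, (t - 5) * post])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]
# b2 estimates the immediate (intercept) effect, b3 the gradual (slope) effect
print(round(b2, 1), round(b3, 1))
```

With only five points on each side of the interruption, the estimates are noisy; in practice, more observations per segment and autocorrelation-aware models are preferable.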

Systematic Review Results

The results of the systematic review are in ▶ . In the four-year period of JAMIA publications that the authors reviewed, 25 quasi-experimental studies among 22 articles were published. Of these 25, 15 studies were of category A, five studies were of category B, two studies were of category C, and no studies were of category D. Although there were no studies of category D (interrupted time-series analyses), three of the studies classified as category A had data collected that could have been analyzed as an interrupted time-series analysis. Nine of the 25 studies (36%) mentioned at least one of the potential limitations of the quasi-experimental study design. In the four-year period of IJMI publications reviewed by the authors, nine quasi-experimental studies among eight manuscripts were published. Of these nine, five studies were of category A, one of category B, one of category C, and two of category D. Two of the nine studies (22%) mentioned at least one of the potential limitations of the quasi-experimental study design.

Systematic Review of Four Years of Quasi-designs in JAMIA

Study | Journal | Informatics Topic Category | Quasi-experimental Design | Limitation of Quasi-design Mentioned in Article
Staggers and Kobus | JAMIA | 1 | Counterbalanced study design | Yes
Schriger et al. | JAMIA | 1 | A5 | Yes
Patel et al. | JAMIA | 2 | A5 (study 1, phase 1) | No
Patel et al. | JAMIA | 2 | A2 (study 1, phase 2) | No
Borowitz | JAMIA | 1 | A2 | No
Patterson and Harasym | JAMIA | 6 | C1 | Yes
Rocha et al. | JAMIA | 5 | A2 | Yes
Lovis et al. | JAMIA | 1 | Counterbalanced study design | No
Hersh et al. | JAMIA | 6 | B1 | No
Makoul et al. | JAMIA | 2 | B1 | Yes
Ruland | JAMIA | 3 | B1 | No
DeLusignan et al. | JAMIA | 1 | A1 | No
Mekhjian et al. | JAMIA | 1 | A2 (study design 1) | Yes
Mekhjian et al. | JAMIA | 1 | B1 (study design 2) | Yes
Ammenwerth et al. | JAMIA | 1 | A2 | No
Oniki et al. | JAMIA | 5 | C1 | Yes
Liederman and Morefield | JAMIA | 1 | A1 (study 1) | No
Liederman and Morefield | JAMIA | 1 | A2 (study 2) | No
Rotich et al. | JAMIA | 2 | A2 | No
Payne et al. | JAMIA | 1 | A1 | No
Hoch et al. | JAMIA | 3 | A2 | No
Laerum et al. | JAMIA | 1 | B1 | Yes
Devine et al. | JAMIA | 1 | Counterbalanced study design |
Dunbar et al. | JAMIA | 6 | A1 |
Lenert et al. | JAMIA | 6 | A2 |
Koide et al. | IJMI | 5 | D4 | No
Gonzalez-Hendrich et al. | IJMI | 2 | A1 | No
Anantharaman and Swee Han | IJMI | 3 | B1 | No
Chae et al. | IJMI | 6 | A2 | No
Lin et al. | IJMI | 3 | A1 | No
Mikulich et al. | IJMI | 1 | A2 | Yes
Hwang et al. | IJMI | 1 | A2 | Yes
Park et al. | IJMI | 1 | C2 | No
Park et al. | IJMI | 1 | D4 | No

JAMIA = Journal of the American Medical Informatics Association; IJMI = International Journal of Medical Informatics.

In addition, three studies from JAMIA were based on a counterbalanced design. A counterbalanced design, sometimes referred to as a Latin-square arrangement, is a higher-order design than the other studies in category A. In this within-participants design, all subjects receive all the different interventions, but the order of intervention assignment is not random. 19 The design can only be used when the intervention is compared against some existing standard; for example, a new PDA-based order-entry system might be compared to a computer terminal–based order-entry system, with all subjects using both systems but in varied order (e.g., one group is given software A followed by software B, and another group is given software B followed by software A). The counterbalanced design is typically used when the available sample size is small, thus preventing the use of randomization. This design also allows investigators to study the potential effect of the ordering of the informatics intervention.
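A minimal sketch of counterbalanced assignment (hypothetical subjects and systems): each subject receives both systems, with the two possible orders used equally often so that order effects are balanced across conditions rather than confounded with them.

```python
from itertools import permutations

# Counterbalanced (Latin-square-like) assignment: every subject uses both
# systems, but the order varies across groups. Hypothetical example with
# two order-entry interfaces, A (PDA-based) and B (terminal-based).
systems = ["A", "B"]
orders = list(permutations(systems))          # [("A","B"), ("B","A")]

subjects = [f"s{i}" for i in range(1, 9)]     # eight hypothetical subjects
assignment = {s: orders[i % len(orders)] for i, s in enumerate(subjects)}

# Each order appears equally often, so order/period effects are balanced
# across conditions rather than confounded with the intervention.
counts = {o: sum(1 for a in assignment.values() if a == o) for o in orders}
print(counts)  # {('A', 'B'): 4, ('B', 'A'): 4}
```

Comparing outcomes between the two order groups also gives a direct check on the ordering effect the article mentions.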

Although quasi-experimental study designs are ubiquitous in the medical informatics literature, as evidenced by 34 studies in the past four years of the two informatics journals, little has been written about the benefits and limitations of the quasi-experimental approach. As we have outlined in this paper, a relative hierarchy and nomenclature of quasi-experimental study designs exist, with some designs being more likely than others to permit causal interpretations of observed associations. Strengths and limitations of a particular study design should be discussed when presenting data collected in the setting of a quasi-experimental study. Future medical informatics investigators should choose the strongest design that is feasible given the particular circumstances.

Supplementary Material

Dr. Harris was supported by NIH grants K23 AI01752-01A1 and R01 AI60859-01A1. Dr. Perencevich was supported by a VA Health Services Research and Development Service (HSR&D) Research Career Development Award (RCD-02026-1). Dr. Finkelstein was supported by NIH grant RO1 HL71690.


Effectiveness of an Emergency Department-Based Machine Learning Clinical Decision Support Tool to Prevent Outpatient Falls Among Older Adults: Protocol for a Quasi-Experimental Study

Affiliations

  • 1 BerbeeWalsh Department of Emergency Medicine, University of Wisconsin-Madison, Madison, WI, United States.
  • 2 Department of Population Health, University of Wisconsin-Madison, Madison, WI, United States.
  • 3 Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI, United States.
  • 4 Health Innovation Program, University of Wisconsin-Madison, Madison, WI, United States.
  • 5 Department of Applied Data Science, UWHealth Hospitals and Clinics, University of Wisconsin-Madison, Madison, WI, United States.
  • PMID: 37535416
  • PMCID: PMC10436111
  • DOI: 10.2196/48128

Background: Emergency department (ED) providers are important collaborators in preventing falls for older adults because they are often the first health care providers to see a patient after a fall and because at-home falls are often preceded by previous ED visits. Previous work has shown that ED referrals to falls interventions can reduce the risk of an at-home fall by 38%. Screening patients at risk for a fall can be time-consuming and difficult to implement in the ED setting. Machine learning (ML) and clinical decision support (CDS) offer the potential of automating the screening process. However, it remains unclear whether automation of screening and referrals can reduce the risk of future falls among older patients.

Objective: The goal of this paper is to describe a research protocol for evaluating the effectiveness of an automated screening and referral intervention. These findings will inform ongoing discussions about the use of ML and artificial intelligence to augment medical decision-making.

Methods: To assess the effectiveness of our program for patients receiving the falls risk intervention, our primary analysis will be to obtain referral completion rates at 3 different EDs. We will use a quasi-experimental design known as a sharp regression discontinuity with regard to intent-to-treat, since the intervention is administered to patients whose risk score falls above a threshold. A conditional logistic regression model will be built to describe 6-month fall risk at each site as a function of the intervention, patient demographics, and risk score. The odds ratio of a return visit for a fall and the 95% CI will be estimated by comparing those identified as high risk by the ML-based CDS (ML-CDS) and those who were not but had a similar risk profile.
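A sharp regression discontinuity assigns treatment deterministically by a cutoff on the risk score. The sketch below uses made-up scores and a made-up cutoff (not values from the study) to illustrate the assignment rule and the near-cutoff comparison group:

```python
# Sharp regression-discontinuity assignment, as in the protocol: patients
# with a fall-risk score at or above a cutoff receive the intervention.
# Scores and the cutoff below are illustrative, not from the study.
cutoff = 0.5
patients = [("p1", 0.31), ("p2", 0.49), ("p3", 0.50), ("p4", 0.72)]

treated = [pid for pid, score in patients if score >= cutoff]
control = [pid for pid, score in patients if score < cutoff]

# The causal contrast focuses on patients just either side of the cutoff,
# who are similar on average except for treatment assignment.
near = [pid for pid, score in patients if abs(score - cutoff) <= 0.05]
print(treated, control, near)
```

Because assignment depends only on the observed score, comparing just-above to just-below patients supports causal inference from observational data, as the protocol notes.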

Results: The ML-CDS tool under study has been implemented at 2 of the 3 EDs in our study. As of April 2023, a total of 1326 patient encounters have been flagged for providers, and 339 unique patients have been referred to the mobility and falls clinic. To date, 15% (45/339) of patients have scheduled an appointment with the clinic.

Conclusions: This study seeks to quantify the impact of an ML-CDS intervention on patient behavior and outcomes. Our end-to-end data set allows for a more meaningful analysis of patient outcomes than other studies focused on interim outcomes, and our multisite implementation plan will demonstrate applicability to a broad population and the possibility to adapt the intervention to other EDs and achieve similar results. Our statistical methodology, regression discontinuity design, allows for causal inference from observational data and a staggered implementation strategy allows for the identification of secular trends that could affect causal associations and allow mitigation as necessary.

Trial registration: ClinicalTrials.gov NCT05810064 ; https://www.clinicaltrials.gov/study/NCT05810064.

International registered report identifier (irrid): DERR1-10.2196/48128.

Keywords: automated screening; clinical decision support; emergency medicine; falls; geriatrics; machine learning.

©Daniel J Hekman, Amy L Cochran, Apoorva P Maru, Hanna J Barton, Manish N Shah, Douglas Wiegmann, Maureen A Smith, Frank Liao, Brian W Patterson. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 03.08.2023.

PubMed Disclaimer

Conflict of interest statement

Conflicts of Interest: None declared.

Figure: Operationalization of automated falls risk screening and referral process in the ED (adapted…)



  • Open access
  • Published: 03 June 2024

Assessing rates and predictors of cannabis-associated psychotic symptoms across observational, experimental and medical research

  • Tabea Schoeler   ORCID: orcid.org/0000-0003-4846-2741 1 , 2 ,
  • Jessie R. Baldwin 2 , 3 ,
  • Ellen Martin 2 ,
  • Wikus Barkhuizen 2 &
  • Jean-Baptiste Pingault   ORCID: orcid.org/0000-0003-2557-4716 2 , 3  

Nature Mental Health (2024)


  • Outcomes research
  • Risk factors

Cannabis, one of the most widely used psychoactive substances worldwide, can give rise to acute cannabis-associated psychotic symptoms (CAPS). While distinct study designs have been used to examine CAPS, an overarching synthesis of the existing findings has not yet been carried forward. To that end, we quantitatively pooled the evidence on rates and predictors of CAPS ( k  = 162 studies, n  = 210,283 cannabis-exposed individuals) as studied in (1) observational research, (2) experimental tetrahydrocannabinol (THC) studies, and (3) medicinal cannabis research. We found that rates of CAPS varied substantially across the study designs, given the high rates reported by observational and experimental research (19% and 21%, respectively) but not medicinal cannabis studies (2%). CAPS was predicted by THC administration (for example, single dose, Cohen’s d  = 0.7), mental health liabilities (for example, bipolar disorder, d  = 0.8), dopamine activity ( d  = 0.4), younger age ( d  = −0.2), and female gender ( d  = −0.09). Neither candidate genes (for example, COMT , AKT1 ) nor other demographic variables (for example, education) predicted CAPS in meta-analytical models. The results reinforce the need to more closely monitor adverse cannabis-related outcomes in vulnerable individuals as these individuals may benefit most from harm-reduction efforts.


Cannabis, one of the most widely used psychoactive substances in the world, 1 is commonly used as a recreational substance and is increasingly taken for medicinal purposes. 2 , 3 As a recreational substance, cannabis use is particularly prevalent among young people 1 who seek its rewarding acute effects such as relaxation, euphoria, or sociability. 4 When used as a medicinal product, cannabis is typically prescribed to alleviate clinical symptoms in individuals with pre-existing health conditions (for example, epilepsy, multiple sclerosis, chronic pain, nausea 5 ).

Given the widespread use of cannabis, alongside the shifts toward legalization of cannabis for medicinal and recreational purposes, momentum is growing to scrutinize both the potential therapeutic and adverse effects of cannabis on health. From a public health perspective, of particular concern are the increasing rates of cannabis-associated emergency department presentations, 6 the rising levels of THC (tetrahydrocannabinol, the main psychoactive ingredient in cannabis) in street cannabis, 7 the adverse events associated with medicinal cannabis use, 8 and the long-term health hazards associated with cannabis use. 9 In this context, risk of psychosis as a major adverse health outcome related to cannabis use has been studied extensively, suggesting that early-onset and heavy cannabis use constitutes a contributory cause of psychosis. 10 , 11 , 12

More recent research has started to examine the more acute cannabis-associated psychotic symptoms (CAPS) to understand better how individual vulnerabilities and the pharmacological properties of cannabis elicit adverse reactions in individuals exposed to cannabis. Indeed, transient psychosis-like symptoms, including hallucinations or paranoia during cannabis intoxication, are well documented. 5 , 13 , 14 In more rare cases, recreational cannabis users experience severe forms of CAPS, 15 requiring emergency medical treatment as a result of acute CAPS. 16 In addition, acute psychosis following THC administration has been documented in medicinal cannabis trials and experimental studies, 17 , 18 , 19 suggesting that CAPS can also occur in more-controlled environments.

While numerous studies have provided evidence on CAPS in humans, no research has yet synthesized and compared the findings obtained from different study designs and populations. More specifically, three distinct study types have focused on CAPS: (1) observational studies assessing the subjective experiences of cannabis intoxication in recreational cannabis users, (2) experimental challenge studies administering THC in healthy volunteers, and (3) medicinal cannabis studies documenting adverse events when testing medicinal cannabis products in individuals with pre-existing health conditions. As such, the availability of these three distinct lines of evidence provides a unique research opportunity, as their findings can be synthesized, inspected for convergence, and, ultimately, used to inform more evidence-based harm-reduction initiatives.

In this work, we therefore aim to perform a quantitative synthesis of all existing evidence examining CAPS to advance our understanding of the rates and predictors of CAPS: First, it is currently unknown how common CAPS are among individuals exposed to cannabis. While rates of CAPS are reported by numerous studies, estimates vary substantially (for example, from <1% (ref. 20 ) to 70% (ref. 21 )) and may differ depending on the assessed symptom profile (for example, cannabis-associated hallucinations versus cannabis-associated paranoia), the study design (for example, observational versus experimental research), and the population (for example, healthy volunteers versus medicinal cannabis users). Second, distinct study designs have scrutinized similar questions concerning the risks involved in CAPS. As such, comparisons of the results from one study design (for example, observational studies assessing self-reported cannabis use in recreational users 22 , 23 ) with another study design (for example, experimental studies administering varying doses of THC 24 , 25 ) can be used to triangulate findings on a given risk factor of interest (for example, potency of cannabis). Finally, studies focusing on predictors of CAPS typically assess hypothesized risk factors in isolation. Pooling all existing evidence across different risk factors therefore provides a more complete picture of the relative magnitude of the individual risk factors involved in CAPS.

In summary, this work sets out to synthesize all of the available evidence on CAPS across three lines of research. In light of the increasingly liberal cannabis policies around the world, alongside the rising levels of THC in cannabis, such efforts are key to informing harm-reduction strategies and future research avenues for public health. Considering that individuals presenting with acute cannabis-induced psychosis are at high risk of converting to a psychotic disorder (for example, rates ranging between 18% (ref. 26 ) and 45% (ref. 27 )), a deeper understanding of factors predicting CAPS would contribute to our understanding of the risk of long-term psychosis in the context of cannabis use.

Of 20,428 published studies identified by the systematic search, 162 were included in this work. The reasons for exclusion are detailed in the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram (Fig. 1 ; see Supplementary Fig. 1 for a breakdown of the number of independent participants included in the different analytical models). The PRISMA reporting checklist is included in the Supplementary Results . At the full-text screening stage, the majority of studies were excluded because they did not report data on CAPS (83.88% of all excluded studies). Figure 2 displays the number of published studies included ( k ) and the number of (non-overlapping) study participants ( n ) per study design, highlighting that out of all participants included in this meta-analysis ( n = 210,283), most took part in observational research ( n = 174,300; 82.89%), followed by studies assessing medicinal cannabis products ( n = 33,502; 15.93%), experimental studies administering THC ( n = 2,009; 0.96%), and quasi-experimental studies ( n = 472; 0.22%). Screening of 10% of the studies at the full-text stage by an independent researcher (E.M.) did not identify missed studies.

figure 1

Flow chart as adapted from the PRISMA flow chart ( http://www.prisma-statement.org/ ). Independent study participants are defined as the maximum number of participants available for an underlying study sample assessed in one or more of the included studies.

figure 2

Number of included studies per year of publication and study design, including observational research assessing recreational cannabis users, experimental studies administering THC in healthy volunteers, and medicinal studies assessing adverse events in individuals taking cannabis products for medicinal use. Quasi-experimental research involved testing the effects of THC administration in a naturalistic setting. 23 , 62 k , number of studies; n , number of (non-overlapping) study participants.

Rates of CAPS across the three study designs

A total of 99 studies published between 1971 and 2023 reported data on rates of CAPS and were included in the analysis, comprising 126,430 individuals from independent samples. Convergence of the data extracted by the two researchers (T.S. and W.B.) was high for the pooled rates on CAPS from observational studies (rate DIFF  = −0.01%, where rate DIFF  = rate TS  – rate WB ), experimental studies (rate DIFF  = 0%), and medicinal cannabis studies (rate DIFF  = 0%). More specifically, we included data from 41 observational studies ( n  = 92,888 cannabis users), 19 experimental studies administering THC ( n  = 754), and 79 studies assessing efficacy and tolerability of medicinal cannabis products containing THC ( n  = 32,821). In medicinal trials, the most common conditions treated with THC were pain ( k  = 19 (23.75%)) and cancer ( k  = 16 (20%)) (see Supplementary Table 1 for an overview). The age distribution of the included participants was similar in observational studies (mean age = 24.47 years, ranging from 16.6 to 34.34 years) and experimental studies (mean age = 25.1 years, ranging from 22.47 to 27.3 years). Individuals taking part in medicinal trials were substantially older (mean age = 48.16 years, ranging from 8 to 74.5 years).

As summarized in Fig. 3 and Supplementary Table 3 , substantial rates of CAPS were reported by observational studies (19.4%, 95% confidence interval (CI): 14.2%, 24.6%) and THC-challenge studies (21%, 95% CI: 11.3%, 30.7%), but not by medicinal cannabis studies (1.5%, 95% CI: 1.1%, 1.9%). The pooled rates estimated for different symptom profiles of CAPS (CAPS – paranoia, CAPS – hallucinations, CAPS – delusions) are displayed in Supplementary Fig. 2 . All individual study estimates are listed in Supplementary Table 2 .

figure 3

Pooled rates of CAPS across the three different study designs. Estimates on the y axis are the rates (in %, 95% confidence interval) obtained from models pooling together estimates on rates of CAPS (including psychosis-like symptoms, paranoia, hallucinations, and delusions) per study design.

Most models showed significant levels of heterogeneity (Supplementary Table 3 ), highlighting that rates of CAPS differed as a function of study-specific features. Risk of publication bias was indicated ( P Peters < 0.05) for one of the meta-analytical models combining all rates of CAPS (see funnel plots, Supplementary Fig. 2 ). Applying the trim-and-fill method slightly reduced the pooled rate of CAPS obtained from medicinal cannabis studies (rate unadjusted = 1.53%; rate adjusted = 1.18%). Finally, Fig. 4 summarizes rates of CAPS for a subset of studies in which CAPS was defined as the occurrence of a full-blown cannabis-associated psychotic episode (as described in Table 1 ). When combined, the rate of CAPS (full episode) was 0.52% (0.42–0.62%) across the three study designs, highlighting that around one in 200 individuals experienced a severe episode of psychosis when exposed to cannabis/THC. Rates of CAPS (full episode) as reported by the individual studies showed high levels of consistency ( I 2 = 8%, P(I 2 ) = 0.45; Fig. 4 ).
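The logic behind these pooled rates can be illustrated with a minimal random-effects model. The sketch below pools logit-transformed proportions with the DerSimonian–Laird estimator and computes the I 2 heterogeneity statistic; the event counts are hypothetical, and the actual analysis may rely on different estimators and dedicated meta-analysis software.

```python
import math

def pooled_rate(events, totals):
    """DerSimonian-Laird random-effects pooling of logit-transformed rates.

    Returns (pooled rate, 95% CI lower, 95% CI upper, I^2). A simplified
    sketch of the kind of model used to pool CAPS rates; study counts
    passed in below are hypothetical.
    """
    # Logit-transform each study's rate; the variance of a logit
    # proportion is 1/events + 1/(total - events).
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]

    # Fixed-effect weights and Cochran's Q statistic
    w = [1 / vi for vi in v]
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Between-study variance tau^2 (DL estimator) and I^2 heterogeneity
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random-effects weights, pooled logit, back-transform to a rate
    wr = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(wr, y)) / sum(wr)
    se = math.sqrt(1 / sum(wr))
    inv = lambda x: 1 / (1 + math.exp(-x))
    return inv(mu), inv(mu - 1.96 * se), inv(mu + 1.96 * se), i2

# Three hypothetical studies: CAPS cases out of cannabis-exposed participants
rate, lo, hi, i2 = pooled_rate([30, 12, 45], [150, 80, 200])
```

The logit transform keeps the back-transformed confidence interval inside (0%, 100%), which matters for rare outcomes such as full-blown psychotic episodes.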

figure 4

Studies reporting rates of cannabis-associated psychosis (full episode). Depicted in violet are the individual study estimates (in %, 95% confidence interval) of studies reporting rates of (full-blown) cannabis-associated psychotic episodes. Included are studies using medicinal cannabis, observational, or experimental samples. The pooled meta-analyzed estimate is colored in blue. The I 2 statistic (scale of 0 to 100) indexes the level of heterogeneity across the estimates included in the meta-analysis.

Predictors of cannabis-associated psychotic symptoms

Assessing predictors of CAPS, we included 103 studies published between 1976 and 2023, corresponding to 80 independent samples ( n = 170,158 non-overlapping individuals). In total, we extracted 381 Cohen’s d values that were pooled in 44 separate meta-analytical models. A summary of all extracted study estimates is provided in Supplementary Table 4 . Comparing the P values of the individual Cohen’s d to the original P values as reported in the studies revealed a high level of concordance ( r = 0.96, P = 1.1 × 10 –79 ), indicating that the conversion of the raw study estimates to a common metric did not result in a substantial loss of information. Comparing the results obtained from the data extracted by two researchers (T.S. and W.B.) identified virtually no inconsistencies when inspecting estimates of Cohen’s d , as obtained for severity of cannabis use on CAPS ( d DIFF = 0, where d DIFF = d TS − d WB ), gender ( d DIFF = 0), administration of (placebo-controlled) medicinal cannabis ( d DIFF = 0.003), psychosis liability ( d DIFF = 0), and administration of a single dose of THC ( d DIFF = 0).
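The conversion to a common metric and the P-value concordance check can be sketched with standard effect-size formulas. The group means, SDs, and sample sizes below are hypothetical, and the normal approximation for the P value is a simplification of what a full conversion pipeline would use.

```python
import math
from statistics import NormalDist

def d_from_two_groups(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d for a between-group contrast (for example, THC versus
    placebo), using the pooled standard deviation."""
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / sp

def p_from_d(d, n1, n2):
    """Two-sided P value implied by d (normal approximation), useful for
    checking that converted effect sizes reproduce the reported P values."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return 2 * (1 - NormalDist().cdf(abs(d / se)))

# Hypothetical psychotic-symptom scores in a THC group versus a placebo group
d = d_from_two_groups(2.1, 1.0, 40, 1.4, 1.1, 40)
p = p_from_d(d, 40, 40)
```

Comparing `p` against the P value reported in the source study is the kind of concordance check summarized by the r = 0.96 correlation above.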

Figure 5 summarizes the results obtained from the meta-analytical models. We examined whether CAPS was predicted by the pharmacodynamic properties of cannabis, a person’s cannabis use history, demographic factors, mental health/personality traits, neurotransmitters, genetics, and use of other drugs: With respect to the pharmacodynamic properties of cannabis, the largest effect on CAPS severity was present for a single dose of THC ( d = 0.7, 95% CI: 0.52, 0.87) as administered in experimental studies, followed by a significant dose–response effect of THC on CAPS ( d = 0.42, 95% CI: 0.25, 0.59, that is, tested as moderation effects of THC dose in experimental studies). When tested in medicinal randomized controlled trials, cannabis products significantly increased symptoms of CAPS ( d = 0.14, 95% CI: 0.05, 0.23), albeit by a smaller magnitude. Protective effects were present for low THC-COOH levels ( d = −0.22, 95% CI: −0.39, −0.05; that is, the inactive metabolite of THC), but not for the THC/CBD (cannabidiol) ratio ( d = −0.19, 95% CI: −0.43, 0.05, P = 0.13).

figure 5

Summary of pooled Cohen’s d , the corresponding 95% confidence intervals, and P values (two-sided, uncorrected for multiple testing). Positive estimates of Cohen’s d indicate increases in CAPS in response to the assessed predictor. Details regarding the classification and interpretation of each predictor are provided in the Supplementary Information . The reference list of all studies included in this figure is provided in Supplementary Table 4 . NS, neurotransmission.

Less clear were the findings with respect to the cannabis use history of the participants and its effect on CAPS. Here, neither young age of onset of cannabis use nor high-frequency use of cannabis or the preferred type of cannabis (strains high in THC, strains high in CBD) was associated with CAPS. The only demographic factors that significantly predicted CAPS were age ( d = −0.17, 95% CI: −0.292, −0.050) and gender ( d = −0.09, 95% CI: −0.180, −0.001), indicating that younger and female cannabis users report higher levels of CAPS compared with older and male users. With respect to mental health and personality, the strongest predictors of CAPS were diagnosis of bipolar disorder ( d = 0.8, 95% CI: 0.54, 1.06) and psychosis liability ( d = 0.49, 95% CI: 0.21, 0.77), followed by mood problems (anxiety d = 0.44, 95% CI: 0.03, 0.84; depression d = 0.37, 95% CI: 0.003, 0.740) and addiction liability ( d = 0.26, 95% CI: 0.14, 0.38). Summarizing the evidence from studies looking at neurotransmitter functioning showed that increased dopamine activity significantly predicted CAPS ( d = 0.4, 95% CI: 0.16, 0.64) (for example, reduced CAPS following administration of D2 blockers such as olanzapine 28 or haloperidol 29 ). By contrast, alterations in the opioid system did not reduce risk of CAPS. Similarly, none of the assessed candidate genes showed evidence of altering response to cannabis. Finally, out of 11 psychoactive substances with available data, only use histories of MDMA (3,4-methylenedioxymethamphetamine) ( d = 0.2, 95% CI: 0.03, 0.36), crack ( d = 0.13, 95% CI: 0.03, 0.23), inhalants ( d = 0.12, 95% CI: 0.03, 0.22), and sedatives ( d = 0.12, 95% CI: 0.02, 0.22) were linked to increases in CAPS.

Most of the meta-analytical models showed considerable levels of heterogeneity ( I 2 > 80%; Supplementary Table 5 ), notably when summarizing findings from observational studies (for example, severity of cannabis use: I 2 = 98%; age of onset of cannabis use: I 2 = 98%), highlighting that the individual effect estimates varied substantially across studies. By contrast, lower levels of heterogeneity were present when pooling evidence from experimental and medicinal cannabis studies (for example, effects of medicinal cannabis: I 2 = 18%; THC dose–response effects: I 2 = 37%). While risk of publication bias was indicated for four of the meta-analytical models (Egger’s test P < 0.05) (Supplementary Fig. 3 ), an inspection of trim-and-fill adjusted estimates did not alter the conclusions for (1) administration of a single dose of THC ( P Egger < 0.0001, d unadjusted = 0.7, d trim-and-fill = 0.49), (2) CBD administration ( P Egger = 0.0001, d unadjusted = −0.19, d trim-and-fill = −0.14, both P < 0.05), (3) psychosis liability ( P Egger = 0.025, d unadjusted = 0.49, d trim-and-fill = 0.49), and (4) diagnosis of depression ( P Egger = 0.019, d unadjusted = 0.37, d trim-and-fill = 0.54). Outliers were identified for seven meta-analytical models (Supplementary Fig. 4 ). Removing outliers from the models did not substantially alter the conclusions drawn from the models, as indicated for age ( d = −0.18, d corr = −0.14, both P < 0.05), anxiety ( d = 0.61, d corr = 0.47, both P < 0.05), severity of cannabis use ( d = 0.19, d corr = 0.25, both P > 0.05), depression ( d = 0.41, d corr = 0.25, both P > 0.05), gender ( d = −0.09, d corr = −0.12, both P < 0.05), psychosis liability ( d = 0.49, d corr = 0.43, both P < 0.05), and administration of a single dose of THC ( d = 0.6, d corr = 0.56, both P < 0.05).
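The publication-bias screen applied here can be illustrated with a bare-bones Egger-style regression. This is a sketch only: it regresses the standardized effect on precision with ordinary least squares over hypothetical inputs, whereas the actual analysis would use a dedicated routine (for example, `regtest` in R's metafor package) with proper weighting.

```python
import math

def egger_test(d, se):
    """Egger-style regression asymmetry test (illustrative sketch).

    Regresses the standardized effect (d / se) on precision (1 / se);
    an intercept far from zero suggests small-study effects such as
    publication bias. Returns (intercept, se_intercept); in practice
    intercept / se_intercept is compared against a t distribution with
    n - 2 degrees of freedom.
    """
    y = [di / si for di, si in zip(d, se)]   # standardized effects
    x = [1 / si for si in se]                # precisions
    n = len(d)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(ri ** 2 for ri in resid) / (n - 2)  # residual variance
    se_int = math.sqrt(s2 * (1 / n + xbar ** 2 / sxx))
    return intercept, se_int

# A perfectly symmetric (hypothetical) funnel: intercept should be ~0
intercept, se_int = egger_test([0.50, 0.50, 0.50], [0.1, 0.2, 0.3])
```

With effects that scale exactly with precision, the fitted intercept is essentially zero, which is the null pattern against which funnel-plot asymmetry is judged.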
Sensitivity checks assessing whether Cohen’s d changes as a function of the assumed within-subject correlation coefficient highlighted that the results were highly concordant (Supplementary Fig. 6 ). Minor deviations from the main analysis were present for the effects of a single dose of THC ( d r =0.3 = 0.64 versus d r =0.5 = 0.69 versus d r =0.7 = 0.77) and dose–response effects of THC ( d r =0.3 = 0.45 versus d r =0.5 = 0.42 versus d r =0.7 = 0.39), but this did not alter the interpretation of the findings.
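Why the assumed pre-post correlation matters can be shown with a minimal within-subject effect-size formula. The symptom means and SDs below are hypothetical, and the change-score standardization shown is only one of several conventions a meta-analysis may adopt.

```python
import math

def paired_d(m_pre, m_post, sd_pre, sd_post, r):
    """Cohen's d for a within-subject (pre-THC versus post-THC) comparison.

    The change-score SD, and hence d, depends on the assumed pre-post
    correlation r; this is why the sensitivity check varies r.
    """
    sd_diff = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)
    return (m_post - m_pre) / sd_diff

# Vary the assumed correlation, as in the r = 0.3 / 0.5 / 0.7 checks
ds = [paired_d(2.0, 3.1, 1.2, 1.4, r) for r in (0.3, 0.5, 0.7)]
```

Under this standardization a larger assumed r shrinks the change-score SD and inflates d, so reporting results across several plausible values of r guards against over- or under-stating within-subject effects.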

Finally, we assessed consistency of findings for predictors examined in more than one of the different study designs (observational, experimental, and medicinal cannabis studies), as illustrated for four meta-analytical models in Fig. 6 (see Supplementary Fig. 7 for the complete set of results). Triangulating the results highlighted that consistency with respect to the direction of effects was particularly high for age ( d Experiments = −0.14 versus d Observational = −0.19 versus d Quasi-Experimental = −0.16) and gender ( d Experiments = −0.09 versus d Observational = −0.07 versus d Quasi-Experimental = −0.25) on CAPS. By contrast, little consistency across the different study designs was present with respect to cannabis use histories, notably age of onset of cannabis use ( d Observational = −0.3 versus d Quasi-Experimental = 0.24) and use of high-THC cannabis ( d Observational = 0.12 versus d Quasi-Experimental = −0.13).

figure 6

Pooled estimates of Cohen’s d when estimated separately for each of the different study designs. The I 2 statistic (scale of 0 to 100) indexes the level of heterogeneity across the estimates included in the meta-analysis.

In this work, we examined rates and predictors of acute CAPS by synthesizing evidence from three distinct study designs: observational research, experimental studies administering THC, and studies testing medicinal cannabis products. Our results yielded a number of key findings regarding the risk of CAPS in individuals exposed to cannabis. First, significant rates of CAPS were reported by all three study designs. This indicates that risk of acute psychosis-like symptoms exists after exposure to cannabis, irrespective of whether it is used recreationally, administered in controlled experiments, or prescribed as a medicinal product. Second, rates of CAPS vary across the different study designs, with substantially higher rates of CAPS in observational and experimental samples than in medicinal cannabis samples. Third, not every individual exposed to cannabis is equally at risk of CAPS, as the interplay between individual differences and the pharmacological properties of the cannabis likely plays an important role in modulating risk. In particular, risk appears most amplified in vulnerable individuals (for example, young age, pre-existing mental health problems) and increases with higher doses of THC (as shown in experimental studies).

Rates of cannabis-associated psychotic symptoms

Summarizing the existing evidence on rates of CAPS, we find that cannabis can acutely induce CAPS in a subset of cannabis-exposed individuals, irrespective of whether it is used recreationally, administered in controlled experiments, or prescribed as a medicinal product. Importantly, rates of CAPS varied substantially across the designs. More specifically, similar rates of CAPS were reported by observational and experimental evidence (around 19% and 21% in cannabis-exposed individuals, respectively), while considerably lower rates of CAPS were documented in medicinal cannabis samples (between 1% and 2%).

A number of factors likely contribute to the apparently different rates of CAPS across the three study designs. First, rates of CAPS are not directly comparable as different, design-specific measures were used: in observational/experimental research, CAPS is typically defined as the occurrence of transient cannabis-induced psychosis-like symptoms, whereas medicinal trials screen for CAPS as the occurrence of first-rank psychotic symptoms, often resulting in treatment discontinuation. 20 , 30 , 31 As such, transient CAPS may indeed occur commonly in cannabis-exposed individuals (as evident in the higher rates in observational/experimental research), while risk of severe CAPS requiring medical attention is less frequently reported (resulting in lower reported rates in medicinal cannabis samples). This converges with our meta-analytic results, showing that severe CAPS (full psychotic episode) may occur in about 1 in 200 (0.5%) cannabis users. Another key difference between medicinal trials and experimental/observational research lies in the demographic profile of participants recruited into the studies. For example, individuals taking part in medicinal trials were substantially older (mean age: 48 years) compared with subjects taking part in observational or experimental studies (mean age: 24 and 25 years, respectively). As such, older age may have buffered some of the adverse effects reported by adolescent individuals. Finally, cannabis products used in medicinal trials contain noticeable levels of CBD (for example, Sativex, with a THC/CBD ratio of approximately 1:1), a ratio different from that typically found in street cannabis (for example, >15% THC and <1% CBD 32 ) and in the experimental studies included in our meta-analyses (pure THC). As such, the use of medicinal cannabis (as opposed to street cannabis) may constitute a somewhat safer option. 
However, the potentially protective effects of CBD in this context require further investigation as we did not find a consistent effect of CBD co-administration on THC-induced psychosis-like symptoms. While earlier experimental studies included in our work were suggestive of protective effects of CBD, 33 , 34 , 35 two recent studies did not replicate these findings. 36 , 37

Interestingly, lower but significant rates of CAPS were also observed in placebo groups assessed as part of THC-challenge studies (% THC  = 25% versus % placebo  = 11%) and medicinal cannabis trials (% THC  = 3% versus % placebo  = 1%), highlighting that psychotic symptoms occur not only in the context of cannabis exposure. This is in line with the notion that cannabis use can increase risk of psychosis but appears to be neither a sufficient nor necessary cause for the emergence of psychotic symptoms. 38

Predictors of CAPS

Summarizing evidence on predictors of CAPS, we found that individual vulnerabilities and the pharmacological properties of cannabis both appear to play an important role in modulating risk. Regarding the pharmacological properties of cannabis, evidence from experimental studies showed that the administration of THC increases risk of CAPS, both in a single-dose and dose-dependent manner. Given the nature of the experimental design, these effects are independent of potential confounders that bias estimates obtained from observational studies. More challenging to interpret are therefore findings on individual cannabis use histories (for example, frequency/severity of cannabis use, age of onset of use, preferred cannabis strain) as assessed in observational studies. Contrary to evidence linking high-frequency and early-onset cannabis use to long-term risk of psychosis, 39 none of these factors was associated with CAPS in our study. This discrepancy may indicate that cumulative effects of THC exposure are expressed differently for long-term risk of psychosis and acute CAPS: while users accustomed to cannabis may show a more blunted acute response as a result of tolerance, they are nevertheless at a higher risk of developing the clinical manifestation of psychosis in the long run. 38

We also tested a number of meta-analytical models for predictors tapping into demographic and mental health dimensions. Interestingly, among the assessed demographic factors, only age and gender were associated with CAPS, with younger and female individuals reporting increased levels of CAPS. Other factors often linked to mental health, such as education or socioeconomic status, were not related to CAPS. Concerning predictors indexing mental health, we found converging evidence showing that a predisposition to psychosis increased the risk of experiencing CAPS. In addition, individuals with other pre-existing mental health vulnerabilities (for example, bipolar disorder, depression, anxiety, addiction liability) also showed a higher risk of CAPS, indicating that risk may stem partly from a common vulnerability to mental health problems.

These findings align with studies focusing on the biological correlates of CAPS, showing that increased activity of dopamine, a neurotransmitter implicated in the etiology of psychosis, 40 altered sensitivity to cannabis. By contrast, none of the a priori selected candidate genes (chosen mostly to index schizophrenia liability) modulated risk of CAPS. This meta-analytic finding is consistent with results from the largest available genome-wide association study on schizophrenia, 41 where none of the candidate genes reached genome-wide significance ( P < 5 × 10 −8 ) ( Supplementary Information ). Instead, as for any complex trait, genetic risk underlying CAPS is likely to be more polygenic in nature, possibly converging on pathways yet to be identified. As such, genetic testing companies that screen for the aforementioned genetic variants to provide their customers with an individualized risk profile (such as the Cannabis Genetic Test offered by Lobo Genetics ( https://www.lobogene.com )) are unlikely to fully capture the genetic risk underlying CAPS. Similarly, genetic counseling programs targeting specifically AKT1 allele carriers in the context of cannabis use 42 may be of only limited use when trying to reduce cannabis-associated harms.

Implications for research on cannabis use and psychosis

This work has a number of implications for future research avenues. First, experimental studies administering THC constitute the most stringent available causal inference method when studying risk of CAPS. Building on recent work, future studies should therefore capitalize on experimental designs to advance our understanding of the acute pharmacological effects of cannabis in terms of standard cannabis units, 43 dose–response risk profiles, 44 and the interplay of different cannabinoids. 44 , 45

Despite the value of experimental studies in causal inference, observational studies are essential to identify predictors of CAPS that cannot be experimentally manipulated (for example, age, long-term/chronic exposure to cannabis) and to strengthen external validity. However, a particular challenge for inference from observational studies results from bias due to confounding and reverse causation. Triangulating and comparing findings across study designs can therefore help to identify potential sources of bias that are specific to the different study designs. 46 For example, despite THC dosing being robustly associated with CAPS in experimental studies, we did not find an association between cannabis use patterns (for example, use of high-THC cannabis strains) and CAPS in observational and quasi-experimental studies. This apparent inconsistency may result from THC effects that are blunted by long-term, early-onset, and heavy cannabis use. In other designs, reverse causation may bias the association between cannabis use patterns and CAPS: as individuals may reduce cannabis consumption as a result of adverse acute effects, 47 the interpretation of cross-sectional estimates concerning different cannabis exposures and risk of CAPS is particularly challenging. Future observational studies should therefore exploit more robust causal inference methods (for example, THC administration in naturalistic settings 48 or within-subject comparisons controlling for time-invariant confounds 49 ) to better approximate the experimental design. In particular, innovative designs that can provide a higher temporal resolution on cannabis exposures and related experiences (for example, experience sampling, 50 assessing daily reactivity to cannabis 51 ) are a valuable addition to the causal inference toolbox for cannabis research.
Applying genetically informed causal inferences such as Mendelian randomization analyses 52 can further help to triangulate findings, which would be possible once genome-wide summary results for both different cannabis use patterns and CAPS become available.

With respect to medicinal trials, it is important to note that an assessment of CAPS has not been a primary research focus. Although psychotic events are recognized as a potential adverse reaction to medicinal cannabis, 53 data on CAPS are rarely reported by medicinal trials: only about 20% of medicinal cannabis randomized controlled trials screen for psychosis as a potential adverse effect. 5 As such, trials should systematically monitor CAPS, in addition to longer-term follow-ups assessing the risk of psychosis as a result of medicinal cannabis use. In particular, validated instruments designed to capture more-subtle changes in CAPS should be included in trials to more adequately assess adverse reactions associated with medicinal cannabis products.

Second, with respect to factors associated with risk of CAPS, we find that these are similar to factors associated with onset of psychosis, notably pre-existing mental health vulnerabilities, 54 dose–response effects of cannabis, 55 and young age. 12 The key question deserving further attention is therefore whether CAPS constitutes, per se, a risk marker for long-term psychosis. Preliminary evidence found that among individuals with recent-onset psychosis, 37% reported having experienced their first psychotic symptoms during cannabis intoxication. 56 Future longitudinal evidence building on this is required to determine whether subclinical cannabis-associated psychotic symptoms can help to identify users at high risk of developing psychosis in the long run. Follow-up research should also examine longitudinal trajectories of adverse cannabis-induced experiences and the distress associated with these experiences, given research suggesting that high levels of distress/persistence may constitute a marker of clinical relevance of psychotic-like experiences. 57 While few studies have explored this question in the context of CAPS, there is, for example, evidence suggesting that the level of distress caused by acute adverse reactions to cannabis may depend on the specific symptom dimension. 58 Here the highest levels of distress resulted from cannabis-associated paranoia and anxiety, rather than cannabis-associated hallucinations or experiences tapping into physical sensations (for example, body humming, numbness). In addition, some evidence highlights the re-occurring nature of CAPS in cannabis-exposed individuals. 22 , 58 Further research focusing on individuals with persisting symptoms of CAPS may therefore help to advance our knowledge concerning individual vulnerabilities underlying the development of long-term psychosis in the context of cannabis use.

Importantly, our synthesizing analysis is not immune to the sources of bias that exist for the different study designs, and our findings should therefore be considered in light of the aforementioned limitations (for example, residual confounding or reverse causation in observational studies, limited external validity in experimental studies). Nevertheless, comparing findings across the different study designs allowed us to pin down areas of inconsistency, which existed mostly with regard to cannabis-related parameters (for example, age of onset, frequency of use) and CAPS. In addition, we observed large levels of heterogeneity among most meta-analysis models, highlighting that study-specific findings may vary as a result of different sample characteristics and study methodologies. Future studies aiming to further discern potential sources of variation such as study design features (for example, treatment length in medicinal trials, route of THC administration in experimental studies), statistical modeling (for example, the type of confounding factors considered in observational research), and sample demographics (for example, age of the participants, previous experience with cannabis) are therefore essential when studying CAPS.

Conclusions

Our results demonstrate that cannabis can induce acute psychotic symptoms in individuals using cannabis for recreational or medicinal purposes. Some individuals appear to be particularly sensitive to the adverse acute effects of cannabis, notably young individuals with pre-existing mental health problems and individuals exposed to high levels of THC. Future studies should therefore monitor more closely adverse cannabis-related outcomes in vulnerable individuals as these individuals may benefit most from harm-reduction efforts.

Systematic search

A systematic literature search was performed in three databases (MEDLINE, EMBASE, and PsycInfo) following the PRISMA guidelines. 59 The final search was conducted on 6 December 2023 using 26 search terms indexing cannabis/THC and 20 terms indexing psychosis-like outcomes or cannabis-intoxication experiences (see Supplementary Information for a complete list of search terms). Search terms were chosen on the basis of terminology used in studies assessing CAPS, including observational studies (self-reported cannabis-induced psychosis-like experiences), THC-challenge studies (testing change in psychosis-like symptoms following THC administration), and medicinal studies testing the efficacy and safety of medicinal cannabis products (adverse events related to medicinal cannabis). Before screening the identified studies for inclusion, we removed non-relevant article types (reviews, case reports, comments, guidelines, editorials, letters, newspaper articles, book chapters, dissertations, conference abstracts) and duplicates using the R package revtools 60 . A senior researcher experienced in meta-analyses on cannabis use (T.S.) then reviewed all titles and abstracts for their relevance before conducting full-text screening. To reduce the risk of wrongful inclusion at the full-text screening stage, 10% of the articles selected for full-text screening were cross-checked for eligibility by a second researcher (E.M.).

Data extraction

We included all study estimates that could be used to derive rates of CAPS (the proportion of cannabis-exposed individuals reporting CAPS) or effect sizes (Cohen's d) for factors predicting CAPS. CAPS was defined as the occurrence of hallucinations, paranoia, and/or delusions during cannabis intoxication. These symptom-level items have been identified as the most reliable self-report measures screening for psychosis when validated against clinical interview measures. 61 Table 1 provides examples of CAPS as measured across the three different study designs. In brief, from observational studies, we extracted data if CAPS was assessed in cannabis-exposed individuals on the basis of self-report measures screening for subjective experiences while under the influence of cannabis. From experimental studies administering THC, CAPS was measured as the degree of psychotic symptom change in response to THC, estimated from either a between-subject (placebo group versus THC group) or a within-subject (pre-THC versus post-THC assessment) comparison. We also included data from natural experiments (referred to as quasi-experimental studies hereafter), where psychosis-like experiences were monitored in recreational cannabis users before and after they consumed their own cannabis products. 23 , 62 Finally, with respect to trials testing the efficacy and/or safety of medicinal cannabis products containing THC, we extracted data on adverse events, including the occurrence of psychosis, hallucinations, delusions, and/or paranoia during treatment with medicinal cannabis products. Medicinal studies that tested the effects of cannabis products not containing THC (for example, CBD only, olorinab, lenabasum) were not included.

For 10% of the included studies, data on rates and predictors of CAPS were extracted by a second researcher (W.B.), and agreement between the two extracted datasets was assessed by comparing the pooled estimates on rates and predictors of CAPS. In addition, following recommendations for improved reproducibility and transparency in meta-analytical works, 63 we provide all extracted data, the corresponding analytical scripts, and transformation information in the study repository.

Statistical analysis

Rates of CAPS

We extracted the raw estimates of rates of CAPS as reported by observational, experimental, and medicinal cannabis studies. Classification of CAPS differs across the three study designs. In observational studies, occurrence of CAPS is typically defined as the experience of psychotic-like symptoms while under the influence of cannabis. In experimental studies administering THC, CAPS is commonly defined as a clinically significant change in psychotic symptom severity (for example, an increase of ≥3 points on the Positive and Negative Syndrome Scale positive subscale following THC 33 ). Finally, in medicinal cannabis samples, a binary measure of CAPS indicates whether psychotic symptoms occurred as an adverse event during treatment with medicinal cannabis products. We derived rates of CAPS (R_CAPS = X/N, where X is the number of individuals reporting CAPS and N the sample size) and the corresponding confidence intervals using the function BinomCI with the Clopper–Pearson method, as implemented in the R package DescTools. 64 To estimate the pooled proportions, we fitted random-effects models or multilevel random-effects models as implemented in the R package metafor. 65 Multilevel random-effects models were used whenever accounting for non-independent sampling errors was necessary (described further below). Risk of publication bias was assessed using Peters' test 66 and funnel plots and, if indicated (P_Peters < 0.05), corrected using the trim-and-fill method ( Supplementary Methods ).
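
In R, the paper computes these intervals with DescTools::BinomCI. As an illustrative, dependency-free sketch (in Python rather than R, with hypothetical counts), the Clopper–Pearson interval can be obtained by inverting the binomial CDF with bisection:

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact (Clopper-Pearson) 1-alpha confidence interval for a
    proportion x/n, obtained by inverting the binomial CDF with bisection."""
    def bisect(p_is_too_small):
        lo, hi = 0.0, 1.0
        for _ in range(80):  # ~2**-80 precision in p
            mid = (lo + hi) / 2
            if p_is_too_small(mid):
                lo = mid
            else:
                hi = mid
        return lo
    # Lower limit L solves P(X >= x | L) = alpha/2 (L = 0 when x = 0)
    lower = 0.0 if x == 0 else bisect(lambda p: 1 - binom_cdf(x - 1, n, p) < alpha / 2)
    # Upper limit U solves P(X <= x | U) = alpha/2 (U = 1 when x = n)
    upper = 1.0 if x == n else bisect(lambda p: binom_cdf(x, n, p) > alpha / 2)
    return x / n, lower, upper

# Hypothetical example: 5 of 20 cannabis-exposed participants report CAPS
rate, lo_ci, hi_ci = clopper_pearson(5, 20)
```

The bisection exploits the fact that P(X ≥ x | p) is increasing and P(X ≤ x | p) is decreasing in p, so each bound is the unique root of a monotone function.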

To derive the pooled effects of factors predicting CAPS, we converted study estimates to the standardized effect size Cohen's d as a common metric. For studies reporting mean differences, two formulas were used for the conversion. First, for studies reporting mean differences from between-subject comparisons (independent samples), we used the following formula:

d = (M_E − M_C) / SD_P

where M_E and M_C are the mean scores on a continuous scale (severity of CAPS), reported for individuals exposed (M_E) and unexposed (M_C) to a certain risk factor (for example, cannabis users with pre-existing mental health problems versus cannabis users without pre-existing mental health problems). The formulas used to derive the pooled standard deviation, SD_P, and the variance of Cohen's d are listed in the Supplementary Methods . Second, an extension of the preceding formula was used to derive Cohen's d from within-subject comparisons, comparing time-point one (M_T1) with time-point two (M_T2). The formula takes into account the dependency between the two groups: 67

d = (M_T1 − M_T2) / (SD_P √(2(1 − r)))

where r indexes the correlation between the pairs of observations, such as the correlation between the pre- and post-THC conditions in the same set of individuals for a particular outcome measure. The correlation coefficient was set to r = 0.5 for all studies included in the meta-analysis, on the basis of previous research. 13 We also assessed whether varying within-person correlation coefficients altered the interpretation of the results by re-estimating the pooled Cohen's d for predictors of CAPS with two additional coefficients (r = 0.3 and r = 0.7). The results were then compared with the findings obtained from the main analysis (r = 0.5).
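
The r-sensitivity check can be sketched as follows. The paired-design conversion shown here uses change-score standardization, under which r = 0.5 reduces to the independent-samples formula; this specific form is an assumption for illustration (the exact formulas are in the paper's Supplementary Methods), and the example numbers are hypothetical.

```python
from math import sqrt

def d_between(m_e: float, m_c: float, sd_p: float) -> float:
    """Cohen's d for independent groups: (M_E - M_C) / pooled SD."""
    return (m_e - m_c) / sd_p

def d_within(m_t1: float, m_t2: float, sd_p: float, r: float = 0.5) -> float:
    """Paired-design Cohen's d with change-score standardization.
    Assumed form for illustration: the denominator SD_P * sqrt(2(1 - r))
    reduces to the independent-samples formula when r = 0.5."""
    return (m_t1 - m_t2) / (sd_p * sqrt(2 * (1 - r)))

# Hypothetical pre/post means and SD; re-estimate d under r = 0.3, 0.5, 0.7
m_post, m_pre, sd = 14.0, 10.0, 5.0
sensitivity = {r: d_within(m_post, m_pre, sd, r) for r in (0.3, 0.5, 0.7)}
```

Larger assumed correlations shrink the denominator and thus inflate d, which is why the paper checks that the conclusions hold across r = 0.3, 0.5 and 0.7.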

From experimental studies reporting multiple time points of psychosis-like experiences following THC administration (for example, refs. 68 , 69 , 70 , 71 , 72 ), we selected the most immediate time point following THC administration. Of note, whenever studies reported test statistics instead of means (for example, t-test or F-test statistics), the preceding formula was amended to accommodate these statistics. In addition, to allow for the inclusion of studies reporting metrics other than mean comparisons (for example, regression coefficients, correlation coefficients), we converted the results to Cohen's d using existing formulas. All formulas used in this study are provided in the Supplementary Information . Whenever studies reported non-significant results without providing sufficient data to estimate Cohen's d (for example, results reported only as P > 0.05), we used a conservative estimate of P = 1 and the corresponding sample size as the input to derive Cohen's d. Finally, if studies reported estimates in figures only, we used WebPlotDigitizer ( https://automeris.io/WebPlotDigitizer ) to extract the data. Since the conversion of estimates from one metric to another may result in loss of precision, we also extracted the original P-value estimates (whenever reported as numerical values) and assessed their concordance with the P values corresponding to the estimated Cohen's d.
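
Two of the textbook conversions alluded to above (from an independent-samples t statistic and from a correlation coefficient to Cohen's d) can be sketched as follows; these are standard formulas (for example, as tabulated by Borenstein and colleagues), while the full set actually used in the study is given in its Supplementary Information.

```python
from math import sqrt

def d_from_t_independent(t: float, n1: int, n2: int) -> float:
    """Cohen's d from an independent-samples t statistic
    with group sizes n1 and n2: d = t * sqrt(1/n1 + 1/n2)."""
    return t * sqrt(1 / n1 + 1 / n2)

def d_from_r(r: float) -> float:
    """Cohen's d from a (point-biserial) correlation coefficient:
    d = 2r / sqrt(1 - r^2)."""
    return 2 * r / sqrt(1 - r ** 2)
```

For example, a reported correlation of r = 0.6 converts to d = 1.5, and t = 2 with two groups of 20 converts to d ≈ 0.63.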

Next, a series of meta-analytical models were fitted, each pooling estimates of Cohen’s d that belonged to the same class of predictors (for example, estimates indexing the effect of dopaminergic function on CAPS; estimates indexing the effect of age on CAPS). A detailed description of the classification of the included predictors is provided in the Supplementary Methods . Cohen’s d estimates were pooled if at least two estimates were available for one predictor class, using one of the following models:

Aggregation models (pooling effect sizes coming from the same underlying sample)

Random-effects models (pooling effect sizes coming from independent samples)

Multilevel random-effects models (pooling effect sizes coming from both independent and non-independent samples)

Predictors that could not meaningfully be grouped were not included in meta-analytical models but are, for completeness, reported as individual study estimates in the Supplementary Information . Levels of heterogeneity for each meta-analytical model were explored using the I² statistic, 73 indexing the contribution of study heterogeneity to the total variance. Here, I² > 30% represents moderate heterogeneity and I² > 50% represents substantial heterogeneity. Risk of publication bias was assessed visually using funnel plots, alongside Egger's test for funnel-plot asymmetry. This test was performed for meta-analytical models containing at least six effect estimates. 74 The trim-and-fill 75 method was used whenever risk of publication bias was indicated (P_Egger < 0.05). To assess whether outliers distorted the conclusions of the meta-analytical models, we applied leave-one-out and outlier analyses 76 as implemented in the R package dmetar, 77 in which the pooled estimate was recalculated after omitting studies that deviated from it. Further details on all applied sensitivity analyses are provided in the Supplementary Methods .
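
For illustration, the random-effects pooling and the I² heterogeneity statistic described above can be sketched as follows. The sketch uses the DerSimonian–Laird estimator of the between-study variance τ² for simplicity; this is an assumption, since metafor defaults to REML, and the multilevel models used for non-independent samples are more involved.

```python
def dl_random_effects(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model
    and report the I^2 heterogeneity statistic (illustrative sketch;
    metafor's default tau^2 estimator is REML, not DL)."""
    k = len(effects)
    w = [1.0 / v for v in variances]            # inverse-variance weights
    sw = sum(w)
    d_fixed = sum(wi * di for wi, di in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (di - d_fixed) ** 2 for wi, di in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)          # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    d_pooled = sum(wi * di for wi, di in zip(w_re, effects)) / sum(w_re)
    # I^2: share of total variability attributable to heterogeneity
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return d_pooled, tau2, i2
```

With identical study effects, Q = 0 and I² = 0%; with strongly discrepant effects, τ² grows and I² approaches 100%, the situation flagged as substantial heterogeneity above.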

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The data are publicly available via GitHub at github.com/TabeaSchoeler/TS2023_MetaCAPS .

Code availability

All analytical code used to analyze, summarize, and present the data is accessible via GitHub at github.com/TabeaSchoeler/TS2023_MetaCAPS .

World Drug Report 2022 (UNODC, 2022); https://www.unodc.org/unodc/en/data-and-analysis/wdr-2022_booklet-3.html

Turna, J. et al. Overlapping patterns of recreational and medical cannabis use in a large community sample of cannabis users. Compr. Psychiatry 102 , 152188 (2020).


Rhee, T. G. & Rosenheck, R. A. Increasing use of cannabis for medical purposes among US residents, 2013–2020. Am. J. Prev. Med. 65 , 528–533 (2023).

Green, B., Kavanagh, D. & Young, R. Being stoned: a review of self-reported cannabis effects. Drug Alcohol Rev. 22 , 453–460 (2003).

Whiting, P. F. et al. Cannabinoids for medical use. JAMA. 313 , 2456 (2015).

Callaghan, R. C. et al. Associations between Canada’s cannabis legalization and emergency department presentations for transient cannabis-induced psychosis and schizophrenia conditions: Ontario and Alberta, 2015–2019. Can. J. Psychiatry 67 , 616–625 (2022).


Manthey, J., Freeman, T. P., Kilian, C., López-Pelayo, H. & Rehm, J. Public health monitoring of cannabis use in Europe: prevalence of use, cannabis potency, and treatment rates. Lancet Reg. Health Eur. 10 , 100227 (2021).

Pratt, M. et al. Benefits and harms of medical cannabis: a scoping review of systematic reviews. Syst. Rev. 8 , 320 (2019).

McGee, R., Williams, S., Poulton, R. & Moffitt, T. A longitudinal study of cannabis use and mental health from adolescence to early adulthood. Addiction 95 , 491–503 (2000).

Large, M., Sharma, S., Compton, M. T., Slade, T. & Nielssen, O. Cannabis use and earlier onset of psychosis. Arch. Gen. Psychiatry 68 , 555 (2011).

Marconi, A., Di Forti, M., Lewis, C. M., Murray, R. M. & Vassos, E. Meta-analysis of the association between the level of cannabis use and risk of psychosis. Schizophr. Bull. 42 , 1262–1269 (2016).

Hasan, A. et al. Cannabis use and psychosis: a review of reviews. Eur. Arch. Psychiatry Clin. Neurosci. 270 , 403–412 (2020).

Hindley, G. et al. Psychiatric symptoms caused by cannabis constituents: a systematic review and meta-analysis. Lancet Psychiatry 7 , 344–353 (2020).

Sexton, M., Cuttler, C. & Mischley, L. K. A survey of cannabis acute effects and withdrawal symptoms: differential responses across user types and age. J. Altern. Complement. Med. 25 , 326–335 (2019).

Schoeler, T., Ferris, J. & Winstock, A. R. Rates and correlates of cannabis-associated psychotic symptoms in over 230,000 people who use cannabis. Transl. Psychiatry 12 , 369 (2022).

Winstock, A., Lynskey, M., Borschmann, R. & Waldron, J. Risk of emergency medical treatment following consumption of cannabis or synthetic cannabinoids in a large global sample. J. Psychopharmacol. 29 , 698–703 (2015).

Kaufmann, R. M. et al. Acute psychotropic effects of oral cannabis extract with a defined content of Δ9-tetrahydrocannabinol (THC) in healthy volunteers. Pharmacopsychiatry 43 , 24–32 (2010).

Cameron, C., Watson, D. & Robinson, J. Use of a synthetic cannabinoid in a correctional population for posttraumatic stress disorder-related insomnia and nightmares, chronic pain, harm reduction, and other indications. J. Clin. Psychopharmacol. 34 , 559–564 (2014).

Aviram, J. et al. Medical cannabis treatment for chronic pain: outcomes and prediction of response. Eur. J. Pain 25 , 359–374 (2021).

Serpell, M. G., Notcutt, W. & Collin, C. Sativex long-term use: an open-label trial in patients with spasticity due to multiple sclerosis. J. Neurol. 260 , 285–295 (2013).

Colizzi, M. et al. Delta-9-tetrahydrocannabinol increases striatal glutamate levels in healthy individuals: implications for psychosis. Mol. Psychiatry. 25 , 3231–3240 (2020).

Bianconi, F. et al. Differences in cannabis-related experiences between patients with a first episode of psychosis and controls. Psychol. Med. 46 , 995–1003 (2016).

Valerie Curran, H. et al. Which biological and self-report measures of cannabis use predict cannabis dependency and acute psychotic-like effects? Psychol. Med. 49 , 1574–1580 (2019).

Kleinloog, D., Roozen, F., De Winter, W., Freijer, J. & Van Gerven, J. Profiling the subjective effects of Δ9-tetrahydrocannabinol using visual analogue scales. Int. J. Methods Psychiatr. Res. 23 , 245–256 (2014).

Ganesh, S. et al. Psychosis-relevant effects of intravenous delta-9-tetrahydrocannabinol: a mega analysis of individual participant-data from human laboratory studies. Int. J. Neuropsychopharmacol. 23 , 559–570 (2020).

Kendler, K. S., Ohlsson, H., Sundquist, J. & Sundquist, K. Prediction of onset of substance-induced psychotic disorder and its progression to schizophrenia in a Swedish national sample. Am. J. Psychiatry 176 , 711–719 (2019).

Arendt, M., Rosenberg, R., Foldager, L., Perto, G. & Munk-Jørgensen, P. Cannabis-induced psychosis and subsequent schizophrenia-spectrum disorders: follow-up study of 535 incident cases. Br. J. Psychiatry 187 , 510–515 (2005).

Kleinloog, D. et al. Does olanzapine inhibit the psychomimetic effects of Δ9-tetrahydrocannabinol? J. Psychopharmacol. 26 , 1307–1316 (2012).

Liem-Moolenaar, M. et al. Central nervous system effects of haloperidol on THC in healthy male volunteers. J. Psychopharmacol. 24 , 1697–1708 (2010).

Patti, F. et al. Efficacy and safety of cannabinoid oromucosal spray for multiple sclerosis spasticity. J. Neurol. Neurosurg. Psychiatry 87 , 944–951 (2016).

Thaler, A. et al. Single center experience with medical cannabis in Gilles de la Tourette syndrome. Parkinsonism Relat. Disord . 61 , 211–213 (2019).

Chandra, S. et al. New trends in cannabis potency in USA and Europe during the last decade (2008–2017). Eur. Arch. Psychiatry Clin. Neurosci. 269 , 5–15 (2019).

Englund, A. et al. Cannabidiol inhibits THC-elicited paranoid symptoms and hippocampal-dependent memory impairment. J. Psychopharmacol. 27 , 19–27 (2013).

Gibson, L. P. et al. Effects of cannabidiol in cannabis flower: implications for harm reduction. Addict. Biol. 27 , e13092 (2022).

Sainz-Cort, A. et al. The effects of cannabidiol and delta-9-tetrahydrocannabinol in social cognition: a naturalistic controlled study. Cannabis Cannabinoid Res . https://doi.org/10.1089/can.2022.0037 (2022).

Lawn, W. et al. The acute effects of cannabis with and without cannabidiol in adults and adolescents: a randomised, double‐blind, placebo‐controlled, crossover experiment. Addiction 118 , 1282–1294 (2023).

Englund, A. et al. Does cannabidiol make cannabis safer? A randomised, double-blind, cross-over trial of cannabis with four different CBD:THC ratios. Neuropsychopharmacology 48 , 869–876 (2023).

Arseneault, L., Cannon, M., Witton, J. & Murray, R. M. Causal association between cannabis and psychosis: examination of the evidence. Br. J. Psychiatry 184 , 110–117 (2004).

Di Forti, M. et al. The contribution of cannabis use to variation in the incidence of psychotic disorder across Europe (EU-GEI): a multicentre case-control study. Lancet Psychiatry 6 , 427–436 (2019).

McCutcheon, R. A., Abi-Dargham, A. & Howes, O. D. Schizophrenia, dopamine and the striatum: from biology to symptoms. Trends Neurosci. 42 , 205–220 (2019).

Trubetskoy, V. et al. Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature 604 , 502–508 (2022).

Zwicker, A. et al. Genetic counselling for the prevention of mental health consequences of cannabis use: a randomized controlled trial‐within‐cohort. Early Interv. Psychiatry 15 , 1306–1314 (2021).

Hindocha, C., Norberg, M. M. & Tomko, R. L. Solving the problem of cannabis quantification. Lancet Psychiatry 5 , e8 (2018).

Englund, A. et al. The effect of five day dosing with THCV on THC-induced cognitive, psychological and physiological effects in healthy male human volunteers: a placebo-controlled, double-blind, crossover pilot trial. J. Psychopharmacol. 30 , 140–151 (2016).

Wall, M. B. et al. Individual and combined effects of cannabidiol and Δ9-tetrahydrocannabinol on striato-cortical connectivity in the human brain. J. Psychopharmacol. 36 , 732–744 (2022).

Hammerton, G. & Munafò, M. R. Causal inference with observational data: the need for triangulation of evidence. Psychol. Med. 51 , 563–578 (2021).

Sami, M., Notley, C., Kouimtsidis, C., Lynskey, M. & Bhattacharyya, S. Psychotic-like experiences with cannabis use predict cannabis cessation and desire to quit: a cannabis discontinuation hypothesis. Psychol. Med. 49 , 103–112 (2019).

Morgan, C. J. A., Schafer, G., Freeman, T. P. & Curran, H. V. Impact of cannabidiol on the acute memory and psychotomimetic effects of smoked cannabis: naturalistic study. Br. J. Psychiatry 197 , 285–290 (2010).

Schoeler, T. et al. Association between continued cannabis use and risk of relapse in first-episode psychosis: a quasi-experimental investigation within an observational study. JAMA Psychiatry 73 , 1173–1179 (2016).

Sznitman, S., Baruch, Y. Ben, Greene, T. & Gelkopf, M. The association between physical pain and cannabis use in daily life: an experience sampling method. Drug Alcohol Depend. 191 , 294–299 (2018).

Henquet, C. et al. Psychosis reactivity to cannabis use in daily life: an experience sampling study. Br. J. Psychiatry 196 , 447–453 (2010).

Pingault, J.-B. et al. Using genetic data to strengthen causal inference in observational research. Nat. Rev. Genet. 19 , 566–580 (2018).

Hill, K. P. Medical cannabis. JAMA 323 , 580 (2020).

Esterberg, M. L., Trotman, H. D., Holtzman, C., Compton, M. T. & Walker, E. F. The impact of a family history of psychosis on age-at-onset and positive and negative symptoms of schizophrenia: a meta-analysis. Schizophr. Res. 120 , 121–130 (2010).

Di Forti, M. et al. Proportion of patients in south London with first-episode psychosis attributable to use of high potency cannabis: a case-control study. Lancet Psychiatry 2 , 233–238 (2015).

Peters, B. D. et al. Subjective effects of cannabis before the first psychotic episode. Aust. N. Z. J. Psychiatry 43 , 1155–1162 (2009).

Karcher, N. R. et al. Persistent and distressing psychotic-like experiences using adolescent brain cognitive development study data. Mol. Psychiatry 27 , 1490–1501 (2022).

LaFrance, E. M., Stueber, A., Glodosky, N. C., Mauzay, D. & Cuttler, C. Overbaked: assessing and predicting acute adverse reactions to cannabis. J. Cannabis Res. 2 , 3 (2020).

Moher, D., Liberati, A., Tetzlaff, J. & Altman, D. G. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Brit. Med. J. 339 , b2535 (2009).

Westgate, M. J. revtools: an R package to support article screening for evidence synthesis. Res. Synth. Methods. 10 , 606–614 (2019).

Kelleher, I., Harley, M., Murtagh, A. & Cannon, M. Are screening instruments valid for psychotic-like experiences? A validation study of screening questions for psychotic-like experiences using in-depth clinical interview. Schizophr. Bull. 37 , 362–369 (2011).

Morgan, C. J. A., Freeman, T. P., Powell, J. & Curran, H. V. AKT1 genotype moderates the acute psychotomimetic effects of naturalistically smoked cannabis in young cannabis smokers. Transl. Psychiatry 6 , e738 (2016).

Ivimey‐Cook, E. R., Noble, D. W. A., Nakagawa, S., Lajeunesse, M. J. & Pick, J. L. Advice for improving the reproducibility of data extraction in meta‐analysis. Res. Synth. Methods. 14 , 911–915 (2023).

Signorell, A. et al. DescTools: Tools for Descriptive Statistics R Package version 0.99 https://cran.r-project.org/web/packages/DescTools/index.html (2019).

Viechtbauer, W. Conducting meta-analyses in R with the metafor package. J. Stat. Softw . https://doi.org/10.18637/jss.v036.i03 (2010).

Peters, J. L. Comparison of two methods to detect publication bias in meta-analysis. JAMA 295 , 676–680 (2006).

Borenstein, M., Hedges, L. V., Higgins, J. P. T. & Rothstein, H. R. in Introduction to Meta-Analysis 225–238 (John Wiley & Sons, 2009); https://doi.org/10.1002/9780470743386.ch24

Mason, O. et al. Acute cannabis use causes increased psychotomimetic experiences in individuals prone to psychosis. Psychol. Med. 39 , 951–956 (2009).

D’Souza, D. C. et al. Delta-9-tetrahydrocannabinol effects in schizophrenia: implications for cognition, psychosis, and addiction. Biol. Psychiatry 57 , 594–608 (2005).

Solowij, N. et al. A randomised controlled trial of vaporised Δ9-tetrahydrocannabinol and cannabidiol alone and in combination in frequent and infrequent cannabis users: acute intoxication effects. Eur. Arch. Psychiatry Clin. Neurosci. 269 , 17–35 (2019).

Vadhan, N. P., Corcoran, C. M., Bedi, G., Keilp, J. G. & Haney, M. Acute effects of smoked marijuana in marijuana smokers at clinical high-risk for psychosis: a preliminary study. Psychiatry Res. 257 , 372–374 (2017).

Radhakrishnan, R. et al. GABA deficits enhance the psychotomimetic effects of Δ9-THC. Neuropsychopharmacology 40 , 2047–2056 (2015).

Higgins, J. P. T. & Thompson, S. G. Quantifying heterogeneity in a meta-analysis. Stat. Med. 21 , 1539–1558 (2002).

Tang, J.-L. & Liu, J. L. Misleading funnel plot for detection of bias in meta-analysis. J. Clin. Epidemiol. 53 , 477–484 (2000).

Duval, S. & Tweedie, R. Trim and fill: a simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics 56 , 455–463 (2000).

Viechtbauer, W. & Cheung, M. W.-L. Outlier and influence diagnostics for meta-analysis. Res. Synth. Methods 1 , 112–125 (2010).

Harrer, M., Cuijpers, P., Furukawa, T. & Ebert, D. D. dmetar: Companion R Package for the Guide ’Doing Meta-Analysis in R’ R package version 0.0.9000 http://dmetar.protectlab.org/ (2019).

Thomas, H. A community survey of adverse effects of cannabis use. Drug Alcohol Depend. 42 , 201–207 (1996).

Olsson, F. et al. An observational study of safety and clinical outcome measures across patient groups in the United Kingdom Medical Cannabis Registry. Expert Rev. Clin. Pharmacol. 16 , 257–266 (2023).

Arendt, M. et al. Testing the self-medication hypothesis of depression and aggression in cannabis-dependent subjects. Psychol. Med. 37 , 935–945 (2007).

Bonn-Miller, M. O. et al. The short-term impact of 3 smoked cannabis preparations versus placebo on PTSD symptoms: a randomized cross-over clinical trial. PLoS ONE 16 , e0246990 (2021).

Stokes, P. R. A., Mehta, M. A., Curran, H. V., Breen, G. & Grasby Paul, R. A. Can recreational doses of THC produce significant dopamine release in the human striatum? Neuroimage 48 , 186–190 (2009).

Zuurman, L. et al. Effect of intrapulmonary tetrahydrocannabinol administration in humans. J. Psychopharmacol. 22 , 707–716 (2008).

Safakish, R. et al. Medical cannabis for the management of pain and quality of life in chronic pain patients: a prospective observational study. Pain Med. 21 , 3073–3086 (2020).

Favrat, B. et al. Two cases of ‘cannabis acute psychosis’ following the administration of oral cannabis. BMC Psychiatry 5 , 17 (2005).

Balash, Y. et al. Medical cannabis in Parkinson disease: real-life patients' experience. Clin. Neuropharmacol. 40 , 268–272 (2017).

Habib, G. & Levinger, U. Characteristics of medical cannabis usage among patients with fibromyalgia. Harefuah 159 , 343–348 (2020).


Beaulieu, P. Effects of nabilone, a synthetic cannabinoid, on postoperative pain. Can J. Anesth. 53 , 769–775 (2006).

Rup, J., Freeman, T. P., Perlman, C. & Hammond, D. Cannabis and mental health: adverse outcomes and self-reported impact of cannabis use by mental health status. Subst. Use Misuse 57 , 719–729 (2022).


Acknowledgments

This research was funded in whole, or in part, by the Wellcome Trust (grant nos. 218641/Z/19/Z (to T.S.) and 215917/Z/19/Z (to J.R.B.)). For the purpose of open access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. J.-B.P. is funded by the Medical Research Foundation 2018 Emerging Leaders First Prize in Adolescent Mental Health (MRF-160-0002-ELP-PINGA (to J.-B.P.)). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Authors and affiliations

Department of Computational Biology, University of Lausanne, Lausanne, Switzerland

Tabea Schoeler

Department of Clinical, Educational and Health Psychology, Division of Psychology and Language Sciences, University College London, London, UK

Tabea Schoeler, Jessie R. Baldwin, Ellen Martin, Wikus Barkhuizen & Jean-Baptiste Pingault

Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK

Jessie R. Baldwin & Jean-Baptiste Pingault


Contributions

T.S., J.R.B., and J.-B.P. conceived and designed the study. T.S., E.M., and W.B. acquired the data. T.S. analyzed the data and drafted the paper. All authors (T.S., J.R.B., E.M., W.B., and J.-B.P.) reviewed and approved the manuscript.

Corresponding author

Correspondence to Tabea Schoeler .

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Mental Health thanks Evangelos Vassos and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–7, Methods (literature search, estimation of Cohen’s d , classification of predictors of CAPS, analysis plan), and references.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–5.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Schoeler, T., Baldwin, J.R., Martin, E. et al. Assessing rates and predictors of cannabis-associated psychotic symptoms across observational, experimental and medical research. Nat. Mental Health (2024). https://doi.org/10.1038/s44220-024-00261-x


Received : 06 September 2023

Accepted : 26 April 2024

Published : 03 June 2024

DOI : https://doi.org/10.1038/s44220-024-00261-x




  • Open access
  • Published: 18 June 2024

The effects of an educational intervention based on the protection motivation theory on the protective behaviors of emergency ward nurses against occupational hazards: a quasi-experimental study

  • Mohadeseh Nouri 1 ,
  • Saeed Ghasemi 2 ,
  • Sahar Dabaghi 2 &
  • Parvin Sarbakhsh 3  

BMC Nursing volume 23, Article number: 409 (2024)


Emergency ward nurses face a variety of occupational hazards due to the nature of their professional duties, which can negatively affect their health. Therefore, this study aimed to evaluate the effects of an educational intervention based on the protection motivation theory on the protective behaviors of emergency ward nurses against occupational hazards in Tehran, Iran, in 2023.

The present quasi-experimental study was conducted with an intervention group and a control group, using a pretest-posttest design. A total of 124 nurses working in the emergency wards of four hospitals (two hospitals randomly assigned to the intervention group and two to the control group) were selected using a multistage sampling method. The educational intervention based on the protection motivation theory was implemented for the intervention group over three weeks. The nurses of both groups completed a demographic questionnaire and the scale of emergency ward nurses’ protective behaviors against occupational hazards before, immediately after, and one month after the intervention. Data analysis was performed using descriptive and inferential statistical methods.

The two groups were similar in terms of demographic characteristics at baseline ( p  > 0.05). Protective behaviors of emergency nurses against occupational hazards and their sub-scales (physical, chemical, biological, ergonomic, and psychosocial hazards) were higher in the intervention group than in the control group immediately and one month after the educational intervention. In addition, repeated measurements over time showed a positive effect of both time and the educational intervention on the protective behaviors of emergency nurses against occupational hazards and their sub-scales in the intervention group.

These findings showed that an educational intervention based on the protection motivation theory can be effective and helpful in improving the protective behaviors of emergency ward nurses against occupational hazards and their sub-scales. Future studies can focus on designing such interventions more specifically around the types of occupational hazards and the needs of nurses in different wards.


Background

The main occupational hazards for HealthCare Workers (HCWs), including emergency ward nurses, are physical, chemical, biological, ergonomic, and psychosocial hazards [ 1 , 2 , 3 ]. Emergency ward nurses face various occupational hazards while performing their duties, and the safety of both nurses and patients depends on the nurses’ knowledge of these hazards and on appropriate protective behavior [ 4 ].

Physical hazards include exposure to extreme temperatures, tripping, slipping, cuts, falling, various radiations, unusual noise, electric shock, fire, and explosions [ 1 , 2 , 3 ]. The results of one study in Egypt showed that most nurses (62.4%) had poor knowledge about physical occupational hazards [ 5 ].

Chemical hazards include exposure to cleaning and disinfecting agents, sterilant materials, mercury, toxic drugs, pesticides, latex, and laboratory chemicals and reagents; these exposures may lead to poisoning, allergic reactions, dermatitis, cancer, and maternal health effects, and may occur during compounding, unpacking, cleaning the environment, etc. [ 1 , 2 , 3 ]. A systematic review showed that the incidence of occupational contact dermatitis was high for some groups of HCWs [ 6 ].

Biological hazards include exposure to blood-borne and airborne pathogens, such as hepatitis B virus (HBV), hepatitis C virus (HCV), human immunodeficiency virus (HIV), and tuberculosis [ 1 , 2 , 3 , 4 ]. Another systematic review showed a high prevalence of needlestick injuries among HCWs and concluded that the related health services should be improved [ 7 ].

Ergonomic hazards include inappropriate design of the work environment, inappropriate working positions, and repetitive procedures, which may lead to musculoskeletal disorders [ 1 , 2 , 3 ]. In a study in Malaysia, almost all nurses (97.3%) had experienced work-related musculoskeletal disorders during the past year, so this problem should be taken seriously [ 8 ]. In another study, in Saudi Arabia, 85% of participating nurses reported at least one musculoskeletal disorder, which was associated with factors such as working hours and the nurses’ weight [ 9 ].

Psychosocial hazards include stressful conditions, workplace violence, job strain, burnout, exhausting work shifts, long working hours, loss of reputation, being threatened or bullied by colleagues, interpersonal communication problems at work, job dissatisfaction, and imbalanced roles and responsibilities [ 1 , 2 , 3 ]. One study reported that problems and stressors faced by emergency ward nurses included burnout, workplace violence, moral distress, and a chaotic work environment [ 10 ]. A study in the United States of America (USA) showed that psychosocial job stress was prevalent among emergency ward nurses [ 11 ]. Another study, in Kenya, also revealed a high prevalence of workplace violence against emergency nurses (81.7% lifetime and 73.2% in the past year), a significant problem [ 12 ].

Social and behavioral theories can be useful for designing educational interventions to improve the protective behaviors of HCWs against occupational hazards [ 13 ]. Protection motivation theory (PMT), introduced by Rogers in 1975, is one such theory and has since been widely adopted as a framework for interventions targeting health-related behavior [ 14 ]. One study indicated that education based on PMT constructs increased the protective behaviors of medical laboratory staff [ 15 ]. Another found that a PMT-based educational intervention increased the preventive behaviors of a group of hospital staff against respiratory infections [ 16 ]. Studies on professions outside the health system have reported similar results; for example, PMT-based educational interventions promoted protective behaviors against brucellosis among farmers and ranchers [ 17 ] and against COVID-19 among employees of governmental offices [ 18 ]. However, such interventions have not always succeeded in changing protective and health behaviors in other populations and contexts [ 19 , 20 ]. In summary, occupational hazards and emergency nurses’ protective behaviors against them are important issues in health systems, PMT offers a framework for designing educational interventions that may promote these behaviors, and the research team found insufficient scientific evidence in this field; therefore, this study aimed to evaluate the effects of an educational intervention based on the PMT on the protective behaviors of emergency ward nurses against occupational hazards.

Research design and setting

This quasi-experimental study was conducted with intervention and control groups, using a pretest-posttest design, among emergency ward nurses of four educational hospitals (two hospitals randomly allocated to each group) in Tehran, Iran, in 2023.

Sample size and sampling methods

The study used multistage sampling. To prevent the transfer of information between the intervention and control groups, randomization was performed at the hospital level: from the 12 eligible educational hospitals, four were randomly selected by lottery (taking executive capacity and study facilities into account), and of these, two were randomly assigned to each of the intervention and control groups. After estimating the number of nurses required from each hospital, nurses who met the inclusion criteria were selected by convenience sampling. Emergency nurses who met any exclusion criterion were excluded from the study and replaced by other nurses from the same hospital; this process continued until data collection was completed. Ultimately, 31 nurses were enrolled from each hospital, for 62 nurses per group and a total of 124 nurses (Fig.  1 ).
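The hospital-level lottery described above can be sketched in a few lines; the hospital labels and the fixed seed below are illustrative placeholders, not the study's actual sites or procedure.

```python
import random

# Sketch of hospital-level randomization: draw 4 of 12 eligible hospitals
# by lottery, then split them evenly into intervention and control arms.
# Labels are placeholders; the seed only makes this illustration reproducible.
rng = random.Random(2023)
hospitals = [f"hospital_{i:02d}" for i in range(1, 13)]

selected = rng.sample(hospitals, 4)                 # lottery: 4 of 12 hospitals
intervention, control = selected[:2], selected[2:]  # 2 hospitals per arm

print("intervention:", intervention)
print("control:", control)
```

Randomizing at the cluster (hospital) level, rather than per nurse, is what prevents contamination between arms when participants share a workplace.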

The total sample size for this cluster-randomized study was calculated assuming 90% power, 95% confidence, an estimated standard deviation, and an effect size of at least a 20% improvement in self-efficacy due to the educational intervention, based on a similar study [ 21 ]. With two hospitals per group and an intraclass correlation coefficient (ICC) of 0.2, the required sample size was n = 62 emergency nurses per group (31 nurses recruited from each hospital).
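The cluster adjustment behind such calculations can be illustrated with the standard design-effect formula; this is a sketch of that adjustment only, using the cluster size and ICC stated above, and does not reproduce the authors' full power analysis (which also used an estimated standard deviation and effect size).

```python
# Standard cluster-sampling adjustment, for illustration only.
def design_effect(m, icc):
    """DEFF = 1 + (m - 1) * ICC for equal cluster sizes m."""
    return 1 + (m - 1) * icc

def effective_n(n, m, icc):
    """Effective sample size after accounting for within-cluster correlation."""
    return n / design_effect(m, icc)

deff = design_effect(m=31, icc=0.2)   # 31 nurses per hospital, ICC = 0.2
print(deff)                           # 7.0
print(effective_n(62, 31, 0.2))       # ≈ 8.86 effectively independent nurses per arm
```

The design effect inflates the variance of cluster-sampled estimates, which is why the analysis below must also account for clustering rather than treat the 124 nurses as fully independent.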

Inclusion criteria were as follows: verbal and written informed consent, desire to participate in the study, and having appropriate communication status to participate in the study. Exclusion criteria were failure to complete the questionnaires, missing more than one of the education sessions in the intervention group, translocation to other wards during the study, and participation in similar training courses.

Figure 1: Consort diagram

Intervention group procedure

The educational content of the intervention covered nearly all topics related to occupational hazards for emergency ward nurses and was prepared from the relevant literature [ 1 , 2 , 3 , 4 , 14 , 22 ] and the experiences of the research team. The initial content was evaluated by three experts outside the research team, all holding PhD degrees in nursing and serving as faculty members of the Department of Community Health Nursing of Shahid Beheshti University of Medical Sciences; their comments were reviewed and applied by the research team where needed. The final educational content was confirmed by the three experts and the research team.

The educational intervention in this study was prepared based on constructs of the PMT (Protective behaviors, intention, perceived severity, perceived vulnerability, fear, response costs, rewards of maladaptive response, self-efficacy, and response efficacy) (Table  1 ).

The educational intervention was implemented in three sessions (one session per week). The educational content was first presented face to face (lecture, Q&A, PowerPoint slides, PDF files), and the slides and educational pamphlets were then delivered to the nurses via their cellphones for convenience.

Control group procedure

The control group did not receive any particular intervention during the study; the educational content was provided to those who were willing to receive it only after completing the study.

Instruments

The instrument used to collect data consisted of two sections: a demographic characteristics form (13 items) and a scale measuring emergency ward nurses’ protective behaviors against occupational hazards (39 items).

Demographic characteristics included age, sex, marital status, having children, education level (in nursing), work experience, types of work shifts, working in additional centers, working overtime, history of exposure to occupational hazards and diseases, suffering from underlying diseases, history of allergy to latex, and history of vaccination against potential occupational diseases.

The initial scale for measuring emergency ward nurses’ protective behaviors against occupational hazards was developed for this study by the authors, based on the relevant literature [ 1 , 2 , 3 , 4 , 14 , 22 ] and the researchers’ experiences, and included 47 items. Its face validity was assessed qualitatively and quantitatively with ten nurses whose working conditions were similar to those of the study participants; all items obtained an impact score higher than 1.5. Content validity was assessed qualitatively and quantitatively using the Content Validity Index (CVI) and Content Validity Ratio (CVR), with the participation of 15 occupational health experts and nursing professors and instructors; the overall CVI was 0.96. For reliability, Cronbach’s alpha and the Intraclass Correlation Coefficient (ICC, over a 2-week interval) were estimated with 20 nurses. Of the initial 47 items, 5 were removed because their CVR was below 0.49 [ 23 ], and 3 were removed for covering the same concept, according to the experts’ opinions and with the agreement of the research team; item reduction was carried out so that the original content of the scale remained intact. The final scale included 39 items in five sub-scales, covering nurses’ protective behaviors against physical (items 1–6; score range 6–30), chemical (items 7–11; 5–25), biological (items 12–21; 10–50), ergonomic (items 22–26; 5–25), and psychosocial hazards (items 27–39; 13–65), plus a total score (items 1–39; 39–195). To allow comparison across sub-scales and the total score, the mean score (1–5) of each was also calculated. Items were scored on a 5-point Likert scale (from Never (1) to Always (5)) with no reverse-scored items; higher scores indicate greater compliance with protective behaviors against occupational hazards ( Supplementary File ).

After obtaining the necessary permits and explaining the objectives of the study, written informed consent was obtained from the participants. Nurses in both groups completed the demographic characteristics form and the protective-behaviors scale before, immediately after, and one month after the intervention. Among the 124 participants of this study, the scale’s Cronbach’s alpha was 0.930 and its ICC (2-week interval) was 0.832.
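The validity and reliability indices used above can be sketched briefly; the formulas are the standard ones (Lawshe's CVR and Cronbach's alpha), and the example responses below are invented for illustration, not study data.

```python
import numpy as np

def cvr(n_essential, n_experts):
    """Lawshe's content validity ratio: (n_e - N/2) / (N/2)."""
    half = n_experts / 2
    return (n_essential - half) / half

# With the study's 15-expert panel, the critical CVR is ~0.49 [23], so an item
# must be rated "essential" by at least 12 of 15 experts to be retained.
print(cvr(12, 15))  # 0.60 -> retained
print(cvr(11, 15))  # ~0.47 -> below 0.49, removed

def cronbach_alpha(items):
    """Cronbach's alpha; rows = respondents, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Invented responses from 5 respondents on 3 Likert items (1-5):
demo = [[4, 5, 4], [3, 3, 3], [5, 5, 4], [2, 3, 2], [4, 4, 5]]
print(round(cronbach_alpha(demo), 3))  # ≈ 0.913
```

An alpha of 0.930 on the final 39-item scale, as reported above, indicates high internal consistency across items.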

Data analysis

Data were analyzed using descriptive statistics (mean, Standard Deviation (SD), Mean Difference (MD), frequency, and percentage) and inferential methods (chi-square (χ²) or Fisher’s exact test, independent t-test, Analysis of Variance (ANOVA), and repeated-measures ANOVA) in SPSS software (version 26; IBM Corp., Armonk, NY, USA). The assumptions of repeated-measures ANOVA (normality, homogeneity of variance, homogeneity of covariances (sphericity), and absence of significant outliers) were tested for the protective-behavior variables. These assumptions held except for sphericity in some variables, which was addressed with the Greenhouse-Geisser correction. In the final analysis, to assess the intervention effect while allowing for the clustered design, we used a random-effects model with a random effect for the clusters. The significance level was set at p < 0.05.
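The cluster-aware random-effects analysis described above can be sketched as a linear mixed model with a random intercept per hospital and fixed group, time, and group-by-time effects. The variable names and the simulated dataset below are illustrative assumptions, not the study's data or its exact SPSS specification.

```python
# Sketch of a clustered random-effects analysis: random intercept for hospital
# (the cluster), fixed effects for group, time, and their interaction.
# All data below are simulated for illustration only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for hospital in range(4):                      # 4 hospitals, 2 per arm
    group = "intervention" if hospital < 2 else "control"
    cluster_effect = rng.normal(0, 0.1)        # hospital-level random effect
    for nurse in range(31):                    # 31 nurses per hospital
        for time in (0, 1, 2):                 # pre, post, 1-month follow-up
            gain = 0.5 * time if group == "intervention" else 0.0
            score = 3.0 + gain + cluster_effect + rng.normal(0, 0.3)
            rows.append(dict(hospital=hospital, group=group,
                             time=time, score=score))
df = pd.DataFrame(rows)

model = smf.mixedlm("score ~ C(group) * time", df, groups=df["hospital"])
result = model.fit()
print(result.params)  # the group x time interaction estimates the intervention effect
```

The interaction coefficient recovers the simulated per-visit gain in the intervention arm while the random intercept absorbs between-hospital differences, which is the reason for modeling the cluster explicitly rather than pooling nurses.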

Results

The mean age of the participants was 33.79 ± 7.43 years, and their mean work experience was 8.55 ± 6.42 years. Most participants were female, married, and held Bachelor of Science (BSc) degrees in nursing. The two groups were homogeneous in demographic characteristics, with no statistically significant differences except for the types of work shifts (Table  2 ).

Independent-samples t-tests showed no statistically significant differences between the control and intervention groups in the mean scores of protective behaviors against ergonomic and psychosocial hazards before the intervention (p > 0.05); however, the baseline mean scores of protective behaviors against physical, chemical, biological, and total hazards were significantly higher in the control group (p < 0.05). Immediately and one month after the educational intervention, the mean scores of protective behaviors in all dimensions were significantly higher in the intervention group than in the control group (p < 0.05), except for the physical-hazard sub-scale measured immediately after the intervention (t = 1.342, p = 0.182) (Table  3 ).

Intragroup comparison using one-way repeated-measures ANOVA showed a significant increase in the total mean score of protective behaviors and its sub-scales in the intervention group over time, reflecting the impact of the educational intervention, while a declining trend was observed in the control group (Table  3 ). Bonferroni post-hoc comparisons indicated that the total mean scores of protective behaviors against occupational hazards and all sub-scales differed significantly between the pre-intervention, immediate post-intervention, and one-month measurements in both groups (p < 0.05), except for physical (MD = 0.075, p = 0.089) and ergonomic hazards (MD = 0.023, p = 1) between pre-intervention and immediately after the intervention in the control group, and for psychosocial hazards (MD = 0.046, p = 0.056) in the control group and ergonomic hazards (MD = -0.071, p = 0.461) in the intervention group between the immediate and one-month measurements.

Discussion

The present study aimed to evaluate the effects of an educational intervention based on the protection motivation theory on the protective behaviors of emergency ward nurses against occupational hazards. Nurses in the intervention and control groups were similar in demographic characteristics. Most nurses were female, married, without children, and held BSc degrees. Most participants worked in only one hospital and had a history of vaccination against HBV and COVID-19; most had no history of latex allergy, no underlying disease, and no history of exposure to occupational hazards and diseases. The results indicated that the PMT-based educational intervention improved the emergency ward nurses’ protective behaviors against all studied types of occupational hazards (physical, chemical, biological, ergonomic, and psychosocial) in the intervention group. A study in Iran showed that training on a standard guideline for the safe handling of antineoplastic drugs effectively improved the knowledge and behaviors of chemotherapy ward nurses [ 24 ]. Another Iranian study showed that efficacy, effectiveness, and rewards were the strongest PMT predictor constructs of adherence to safe injection guidelines among nurses, suggesting that educational interventions for nurses should focus on these constructs [ 25 ]. In the present study, we included the most important PMT constructs when preparing and delivering the educational content. A study in India also revealed that educational workshops improved HCWs’ knowledge about occupational hazards [ 26 ], and a literature review highlighted the positive impacts of e-training programs on employees’ knowledge and behavior regarding occupational health and safety and on reducing workplace injuries [ 27 ].

These findings are consistent with the present study regarding the impact of educational interventions on protective behaviors against occupational hazards; notably, parts of the educational intervention in the present study were likewise delivered virtually on mobile platforms. By contrast, a study of web-based learning for preventing exposure to occupational hazards in a clinical nursing setting showed that this type of education significantly boosted knowledge but produced no notable changes in attitudes or behaviors [ 28 ]. Our results differed from that survey with regard to behavior, which could reflect differences in training methods and educational content: the present study used a multi-method approach combining face-to-face and virtual education, whereas education in that study was purely web-based. It should also be noted that behavior change does not depend solely on knowledge; other factors such as workload, time availability, access to facilities, and self-efficacy may also be influential. For instance, one study identified type of profession, self-efficacy, and behavioral intention as factors related to HCWs’ protective behaviors against COVID-19 [ 22 ]. In the present study, education was based on the PMT constructs, and the various factors influencing protective behaviors were discussed with the participants. Overall, education is considered an effective factor for changing behavior across other topics and contexts as well [ 15 , 16 , 17 , 18 ].

A study in India investigated the impact of an educational program on overall occupational safety and on ergonomic, biological, radiation, and chemical hazards among nurses and other HCWs and confirmed that the program boosted knowledge of these hazards. Its effect on knowledge was highest for biological hazards and lowest for radiation and chemical hazards, and participants suggested adding psychosocial hazards to such educational programs [ 29 ]. The present study did consider psychosocial hazards. We also observed that, in the intervention group one month after the intervention, the highest and lowest mean scores of protective behaviors belonged to biological and ergonomic hazards, respectively; our results were thus consistent with the above study for biological hazards but not for ergonomic hazards. This discrepancy may relate to factors such as the educational content, nurses’ self-efficacy, and access to equipment for performing protective behaviors against occupational hazards.

This study has several limitations. The construct validity of the scale of emergency nurses’ protective behaviors against occupational hazards was not investigated. The scale was self-reported, so the data in some dimensions may not reflect nurses’ actual levels of protective behavior; future studies could use more objective measures. Additionally, participants were selected from educational and public hospitals, which might limit the generalizability of the results to nurses working in private hospitals. Organizational factors such as rules and regulations were not evaluated in this study and deserve attention in future work. Finally, data were collected immediately and one month after the intervention, so longer follow-ups (3–6 months) are recommended to determine the durability of protective behaviors and the long-term effects of the educational intervention.

Conclusions

The study results showed that an educational intervention based on PMT constructs can be effective and valuable in increasing the protective behaviors of emergency nurses against occupational hazards. Education alone, however, is insufficient to change nurses’ health behaviors; attention should also be paid to other factors affecting health and protective behaviors, such as access to personal protective equipment (PPE), working conditions, facilities, organizational regulations, state rules and laws, workload, and time restrictions. Future research can focus on designing more specific educational interventions based on nurses’ needs and include nurses from various hospital wards.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

ANOVA: Analysis of Variance
BSc: Bachelor of Science
CVI: Content Validity Index
CVR: Content Validity Ratio
HBV: Hepatitis B Virus
HCV: Hepatitis C Virus
HCWs: HealthCare Workers
HIV: Human Immunodeficiency Virus
ICC: Intraclass Correlation Coefficient
MD: Mean Difference
MSc: Master of Science
PMT: Protection Motivation Theory
SD: Standard Deviation
SPSS: Statistical Package for the Social Sciences
USA: United States of America

References

1. Levy BS, Wegman DH, Baron SL, Sokas RK, editors. Occupational and Environmental Health. 4th ed. Oxford University Press; 2017. Chapter 32C, Lipscomb JA: Hazards for Healthcare Workers; p. 673–680. https://doi.org/10.1093/oso/9780190662677.003.0037
2. Che Huei L, Ya-Wen L, Chiu Ming Y, Li Chen H, Jong Yi W, Ming Hung L. Occupational health and safety hazards faced by healthcare professionals in Taiwan: a systematic review of risk factors and control strategies. SAGE Open Med. 2020;8:2050312120918999.
3. World Health Organization (WHO). Occupational hazards in the health sector. https://www.who.int/tools/occupational-hazards-in-health-sector (accessed 13 February 2024).
4. Ramsay J, Denny F, Szirotnyak K, Thomas J, Corneliuson E, Paxton KL. Identifying nursing hazards in the emergency department: a new approach to nursing job hazard analysis. J Saf Res. 2006;37(1):63–74.
5. El-Sallamy RM, Kabbash IA, El-Fatah SA, El-Feky A. Physical hazard safety awareness among healthcare workers in Tanta university hospitals, Egypt. Environ Sci Pollut Res Int. 2018;25(31):30826–38.
6. Larese Filon F, Pesce M, Paulo MS, Loney T, Modenese A, John SM, et al. Incidence of occupational contact dermatitis in healthcare workers: a systematic review. J Eur Acad Dermatol Venereol. 2021;35(6):1285–9.
7. Mengistu DA, Tolera ST, Demmu YM. Worldwide prevalence of occupational exposure to needle stick injury among healthcare workers: a systematic review and meta-analysis. Can J Infect Dis Med Microbiol. 2021;2021:9019534.
8. Krishnan KS, Raju G, Shawkataly O. Prevalence of work-related musculoskeletal disorders: psychological and physical risk factors. Int J Environ Res Public Health. 2021;18(17):9361.
9. Attar SM. Frequency and risk factors of musculoskeletal pain in nurses at a tertiary centre in Jeddah, Saudi Arabia: a cross sectional study. BMC Res Notes. 2014;7:61.
10. Rozo JA, Olson DM, Thu HS, Stutzman SE. Situational factors associated with burnout among emergency department nurses. Workplace Health Saf. 2017;65(6):262–5.
11. Bardhan R, Heaton K, Davis M, Chen P, Dickinson DA, Lungu CT. A cross sectional study evaluating psychosocial job stress and health risk in emergency department nurses. Int J Environ Res Public Health. 2019;16(18):3243.
12. Kibunja BK, Musembi HM, Kimani RW, Gatimu SM. Prevalence and effect of workplace violence against emergency nurses at a tertiary hospital in Kenya: a cross-sectional study. Saf Health Work. 2021;12(2):249–54.
13. Guerin RJ, Sleet DA. Using behavioral theory to enhance occupational safety and health: applications to health care workers. Am J Lifestyle Med. 2020;15(3):269–78.
14. Conner M, Norman P. Predicting and Changing Health Behavior: Research and Practice with Social Cognition Models. 4th ed. McGraw-Hill Education; 2015.
15. Hosseini Zijoud SS, Rahaei Z, Hekmatimoghaddam S, Zarei S, Sadeghian HA. Effect of education based on the protection motivation theory on the promotion of protective behaviors in medical laboratories’ staff in Yazd, Iran. Int Arch Health Sci. 2023;10(4):171–6.
16. Rakhshani T, Nikeghbal S, Kashfi SM, Kamyab A, Harsini PA, Jeihooni AK. Effect of educational intervention based on protection motivation theory on preventive behaviors of respiratory infections among hospital staff. Front Public Health. 2024;11:1326760.
17. Soleimanpour Hossein Abadi S, Mehri A, Rastaghi S, Hashemian M, Joveini H, Rakhshani MH, et al. Effectiveness of educational intervention based on protection motivation theory to promote preventive behaviors against brucellosis among farmers and ranchers. J Educ Community Health. 2021;8(1):11–9.
18. Matlabi M, Esmaeili R, Mohammadzadeh F, Hassanpour-Nejad H. The effect of educational intervention based on the protection motivation theory in promotion of preventive behaviors against COVID-19. J Health Syst Res. 2022;18(1):30–8.
19. Boeka AG, Prentice-Dunn S, Lokken KL. Psychosocial predictors of intentions to comply with bariatric surgery guidelines. Psychol Health Med. 2010;15(2):188–97.
20. Bassett SF, Prapavessis H. A test of an adherence-enhancing adjunct to physiotherapy steeped in the protection motivation theory. Physiother Theory Pract. 2011;27(5):360–72.
21. Sadeghi R, Hashemi M, Khanjani N. The impact of educational intervention based on the health belief model on observing standard precautions among emergency center nurses in Sirjan, Iran. Health Educ Res. 2018;33(4):327–35.
22. Toghanian R, Ghasemi S, Hosseini M, Nasiri M. Protection behaviors and related factors against COVID-19 in the healthcare workers of the hospitals in Iran: a cross-sectional study. Iran J Nurs Midwifery Res. 2022;27(6):587–92.
23. Lawshe CH. A quantitative approach to content validity. Pers Psychol. 1975;28(4):563–75.
24. Nouri A, Seyed Javadi M, Iranijam E, Aghamohammadi M. Improving nurses’ performance in the safe handling of antineoplastic agents: a quasi-experimental study. BMC Nurs. 2021;20(1):247.
25. Karimi M, Khoramaki Z, Faradonbeh MR, Ghaedi M, Ashoori F, Asadollahi A. Predictors of hospital nursing staff’s adherence to safe injection guidelines: application of the protection motivation theory in Fars Province, Iran. BMC Nurs. 2024;23(1):25.
26. Khapre M, Agarwal S, Dhingra V, Singh V, Kathrotia R, Goyal B, et al. Comprehensive structured training on occupational health hazards and vaccination: a novel initiative toward employee safety. J Family Med Prim Care. 2022;11(7):3746–53.
27. Barati Jozan MM, Ghorbani BD, Khalid MS, Lotfata A, Tabesh H. Impact assessment of e-trainings in occupational safety and health: a literature review. BMC Public Health. 2023;23(1):1187.
28. Tung CY, Chang CC, Ming JL, Chao KP. Occupational hazards education for nursing staff through web-based learning. Int J Environ Res Public Health. 2014;11(12):13035–46.
29. Naithani M, Khapre M, Kathrotia R, Gupta PK, Dhingra VK, Rao S. Evaluation of sensitization program on occupational health hazards for nursing and allied health care workers in a tertiary health care setting. Front Public Health. 2021;9:669179.


Acknowledgements

The authors would like to appreciate the Shahid Beheshti University of Medical Sciences (SBMU), the school of Nursing and Midwifery of the university, and the authorities of the four educational hospitals in Tehran, Iran (Imam Hossein, Shahid Modarres, Shohada-e-Tajrish, and Loghman Hakim) for their support, cooperation, and assistance throughout the study. We would also like to thank all the participants who took part in the study.

Shahid Beheshti University of Medical Sciences (SBMU), Tehran, Iran.

Author information

Authors and Affiliations

Student Research Committee, Department of Community Health Nursing, School of Nursing and Midwifery, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Mohadeseh Nouri

Department of Community Health Nursing, School of Nursing and Midwifery, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Saeed Ghasemi & Sahar Dabaghi

Department of Statistics and Epidemiology, Faculty of Health, Tabriz University of Medical Sciences, Tabriz, Iran

Parvin Sarbakhsh


Contributions

This manuscript is the result of the collaboration of all authors. SG, MN, SD, and PS designed the study and wrote the study proposal. MN conducted data collection. PS, SG, MN, and SD analyzed the data. SG, MN, SD, and PS wrote the final draft of the manuscript and prepared the tables. SG submitted the manuscript to the journal. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Saeed Ghasemi .

Ethics declarations

Ethics approval and consent to participate

The necessary permits and approvals for this study were obtained from the Research Ethics Committees of School of Pharmacy and Nursing & Midwifery-Shahid Beheshti University of Medical Sciences (Approval ID: IR.SBMU.PHARMACY.REC.1401.195, Approval Date: 2022-12-06). The protocols were in accordance with the Declaration of Helsinki. Participants were provided with information about the research and its objectives, the confidentiality of their information, their right to withdraw from the study, and their access to the study findings. Written informed consent was obtained from all participants, and the necessary permissions were obtained from authorities before sampling.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.


Supplementary Material 1: The online version contains a supplementary file (The Scale of emergency ward nurses’ protective behaviors against occupational hazards)

Rights and permissions

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ).

About this article

Cite this article.

Nouri, M., Ghasemi, S., Dabaghi, S. et al. The effects of an educational intervention based on the protection motivation theory on the protective behaviors of emergency ward nurses against occupational hazards: a quasi-experimental study. BMC Nurs 23 , 409 (2024). https://doi.org/10.1186/s12912-024-02053-1

Received : 19 February 2024

Accepted : 30 May 2024

Published : 18 June 2024

DOI : https://doi.org/10.1186/s12912-024-02053-1


  • Emergency wards
  • Health behaviors
  • Occupational health
  • Quasi-experimental study

BMC Nursing

ISSN: 1472-6955

Experimental and Quasi-Experimental Examples

Dynamic mechanical behavior and nonlinear hyper-viscoelastic constitutive model of SiO2 particle-reinforced silicone rubber composite: experimental and numerical investigation

  • Zhang, Xihuang
  • Wu, Xuexing
  • Cheng, Xiangli

SiO2 particle-reinforced silicone rubber composite (SP-RSRC) is a widely utilized material that offers shock-absorption protection to various engineering structures in impact environments. This paper presents a comprehensive investigation of the mechanical behavior of SP-RSRC under various strain rates, employing a combination of experimental, theoretical, and numerical analyses. First, quasi-static and dynamic compression tests were performed on SP-RSRC utilizing a universal testing machine and a split Hopkinson pressure bar (SHPB) apparatus. Nonlinear stress-strain relationships of SP-RSRC were obtained for strain rates ranging from 1×10⁻³ to 3065 s⁻¹. The results indicated that the composite showed evident strain-rate sensitivity, along with nonlinearity. Then, a nonlinear visco-hyperelastic constitutive model was developed, consisting of a hyperelastic component utilizing the 3rd-order Ogden energy function and a viscous component employing a rate-dependent relaxation time scheme. The model accurately characterized the dynamic mechanical response of SP-RSRC, effectively mitigating the challenge of calibrating the excessive number of material parameters inherent in conventional viscoelastic models. Furthermore, the simplified rubber material (SRM) model, integrated within the LS-DYNA software, was chosen to depict the mechanical properties of SP-RSRC in numerical simulations. The parameters of the SRM model were further calibrated based on the stress-strain relationships of SP-RSRC, as predicted by the developed nonlinear visco-hyperelastic constitutive model. Finally, an inverse ballistic experiment using a single-stage air gun was conducted for SP-RSRC. Numerical simulations of the SHPB experiments and the inverse ballistic experiment were then performed, and the reliability of the calibrated SRM model was verified by comparing the experimental and numerical results. This study offers a valuable reference for the utilization of SP-RSRC in the realm of impact protection.

  • Silicone rubber composites;
  • Dynamic mechanical properties;
  • Nonlinear visco-hyperelastic constitutive model;
  • Inverse ballistic experiment
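For reference, the 3rd-order Ogden strain energy function mentioned in the abstract can be written, in one common convention (with material parameters μ_p and α_p, and principal stretches λ₁, λ₂, λ₃; sign and scaling conventions vary between references), as:

```latex
W = \sum_{p=1}^{3} \frac{\mu_p}{\alpha_p}
    \left( \lambda_1^{\alpha_p} + \lambda_2^{\alpha_p} + \lambda_3^{\alpha_p} - 3 \right)
```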

COMMENTS

  1. Quasi-Experimental Design

    Revised on January 22, 2024. Like a true experiment, a quasi-experimental design aims to establish a cause-and-effect relationship between an independent and dependent variable. However, unlike a true experiment, a quasi-experiment does not rely on random assignment. Instead, subjects are assigned to groups based on non-random criteria.

  2. Quasi-Experimental Design: Types, Examples, Pros, and Cons

Quasi-Experimental Design: Types, Examples, Pros, and Cons. A quasi-experimental design can be a great option when ethical or practical concerns make true experiments impossible, but the research methodology does have its drawbacks. Learn all the ins and outs of a quasi-experimental design.

  3. Quasi Experimental Design Overview & Examples

    A significant advantage of quasi-experimental research over purely observational studies and correlational research is that it addresses the issue of directionality, determining which variable is the cause and which is the effect. In quasi-experiments, an intervention typically occurs during the investigation, and the researchers record outcomes before and after it, increasing the confidence ...

  4. Experimental vs Quasi-Experimental Design: Which to Choose?

    A quasi-experimental design is a non-randomized study design used to evaluate the effect of an intervention. The intervention can be a training program, a policy change or a medical treatment. Unlike a true experiment, in a quasi-experimental study the choice of who gets the intervention and who doesn't is not randomized.

  5. Experiments and Quasi-Experiments

    Often, however, it is not possible or practical to control all the key factors, so it becomes necessary to implement a quasi-experimental research design. Similarities between true and quasi-experiments: ... For example, a research study shows that a new curriculum improved reading comprehension of third-grade children in Iowa. To assess the ...

  6. Experimental and Quasi-Experimental Research

    Experimental and Quasi-Experimental Research. Guide Title: Experimental and Quasi-Experimental Research Guide ID: 64. You approach a stainless-steel wall, separated vertically along its middle where two halves meet. After looking to the left, you see two buttons on the wall to the right. You press the top button and it lights up.

  7. 7.3 Quasi-Experimental Research

Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one. The prefix quasi means "resembling." Thus quasi-experimental research is research that resembles experimental research but is not true experimental research.

  8. Quasi-Experimental Research Design

    Quasi-Experimental Design Examples. Here are some examples of real-time quasi-experimental designs: Evaluating the impact of a new teaching method: In this study, a group of students are taught using a new teaching method, while another group is taught using the traditional method. The test scores of both groups are compared before and after ...

  9. Quasi-experimental Research: What It Is, Types & Examples

Quasi-experimental research designs are a type of research design that is similar to experimental designs but doesn't give full control over the independent variable(s) like true experimental designs do. In a quasi-experimental design, the researcher changes or watches an independent variable, but the participants are not put into groups at ...

  10. 5 Chapter 5: Experimental and Quasi-Experimental Designs

    The four components of experimental and quasi-experimental research designs and their function in answering a research question. ... The teen court evaluation that began this chapter is an example of an experimental design. The researchers of the study wanted to determine whether teen court was more effective at reducing recidivism and ...

  11. PDF Quantitative Research Designs: Experimental, Quasi-Experimental, and

    shows examples of statistics that may be used to answer these two questions. TIP When you read a study, first read the abstract to determine whether there is an intervention. If so, the study is either experimental or quasi-experimental. If not, the study will fit into one of the other categories.

  12. Quasi-experiment

    A quasi-experiment is an empirical interventional study used to estimate the causal impact of an intervention on target population without random assignment. Quasi-experimental research shares similarities with the traditional experimental design or randomized controlled trial, but it specifically lacks the element of random assignment to ...

  13. Experimental and Quasi-Experimental Designs in Implementation Research

    Quasi-experimental designs allow implementation scientists to conduct rigorous studies in these contexts, albeit with certain limitations. We briefly review the characteristics of these designs here; other recent review articles are available for the interested reader (e.g. Handley et al., 2018 ). 2.1.

  14. 12.2: Pre-experimental and quasi-experimental design

    Quasi-experimental designs are similar to true experiments, but they lack random assignment to experimental and control groups. ... For example, Stratmann and Wille (2016) [2] were interested in the effects of a state healthcare policy called Certificate of Need on the quality of hospitals. They clearly cannot assign states to adopt one set of ...

  15. Guide to Experimental Design

An experimental design where treatments aren't randomly assigned is called a quasi-experimental design. Between-subjects vs. within-subjects. ... Quasi-experimental design attempts to establish a cause-and-effect relationship by using criteria other than randomization.

  16. Introduction to Experimental and Quasi-Experimental Design

To close out this chapter, I turn to two examples of quasi-experimental design, regression discontinuity and difference-in-differences. In some contexts, and under specific assumptions, these two research designs are able to approximate the results that we would otherwise get from a true experiment, an RCT. Indeed, quasi-experimental ...

  17. (PDF) Experimental and quasi-experimental designs

Experimental and quasi-experimental research designs examine whether there is a causal relationship between independent and dependent variables. Simply defined, the independent variable is the ...

  18. 8.2 Quasi-experimental and pre-experimental designs

    Pre-experimental designs - a variation of experimental design that lacks the rigor of experiments and is often used before a true experiment is conducted. Quasi-experimental design - designs lack random assignment to experimental and control groups. Static group design - uses an experimental group and a comparison group, without random ...

  19. Quasi-Experimental Design

    For example, in a quasi-experimental study, researchers may be interested in studying a particular group of students (meaning that participants are predetermined). Another example is the desire to ...

  20. Quasi-experimental Studies in Health Systems Evidence Synthesis

    Quasi-experimental (QE) studies have a key role in the development of bodies of evidence to both inform health policy decisions and guide investments for health systems strengthening. Studies of this type entail a nonrandomized, quantitative approach to causal inference, which may be applied prospectively (as in a trial) or retrospectively (as in the analysis of routine observational or ...

  21. 14

    Both simple quasi-experimental designs and embellishments of these simple designs are presented. Potential threats to internal validity are illustrated along with means of addressing their potentially biasing effects so that these effects can be minimized. In contrast to quasi-experiments, randomized experiments are often thought to be the gold ...

  22. Use of Quasi-Experimental Research Designs in Education Research

The increasing use of quasi-experimental research designs (QEDs) in education, brought into focus following the "credibility revolution" (Angrist & Pischke, 2010) in economics, which sought to use data to empirically test theoretical assertions, has indeed improved causal claims in education (Loeb et al., 2017). However, more recently, scholars, practitioners, and policymakers have ...

  23. The Use and Interpretation of Quasi-Experimental Studies in Medical

    Examples of quasi-experimental studies follow. As one example of a quasi-experimental study, a hospital introduces a new order-entry system and wishes to study the impact of this intervention on the number of medication-related adverse events before and after the intervention. As another example, an informatics technology group is introducing a ...

  24. The Impact of Aid on Economic Growth: Analysis of Quasi-Experimental

    Recommendations Advocate for quasi-experimental design to improve quality of findings on sustainable economic development. Support interventions to affect change at the household and community level as they can have significant, growth-related impacts. Evaluate projects after completion to measure sustainability and emphasize building capacity to ensure long-term viability. Specify targeted ...

  25. Effectiveness of an Emergency Department-Based Machine ...

    We will use a quasi-experimental design known as a sharp regression discontinuity with regard to intent-to-treat, since the intervention is administered to patients whose risk score falls above a threshold. A conditional logistic regression model will be built to describe 6-month fall risk at each site as a function of the intervention, patient ...

  26. The influence of GenAI on the effectiveness of argumentative writing in

    A total of 61 Chinese university students from two classes were invited to participate in a quasi-experimental study with different learning methods as intervention measures, and student perceptions were gathered through interviews. The results show that the proposed LCAW method significantly improves students' writing performance in terms of ...

  27. Assessing rates and predictors of cannabis-associated ...

    The authors synthesize data from previous literature on observational, experimental and medicinal cannabis research to assess rates and predictors of cannabis-associated psychotic symptoms.

  28. The effects of an educational intervention based on the protection

    This quasi-experimental study was conducted on two groups, intervention and control, using the pretest-posttest design among emergency ward nurses of four educational hospitals (two hospitals for each group by random allocation) in Tehran, Iran, in 2023. Sample size and sampling methods. The sampling method of this study was multistage.

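Several of the resources above describe regression discontinuity, interrupted time series, and difference-in-differences as standard quasi-experimental estimators. As a minimal sketch of the last of these, the basic difference-in-differences estimate reduces to a comparison of four group means; the function name and the score values below are hypothetical, invented purely for illustration:

```python
# Difference-in-differences (DiD): a common quasi-experimental estimator.
# The treatment effect is estimated as the pre-to-post change in the
# treated group minus the pre-to-post change in the comparison group,
# which nets out any shared time trend.

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Return the DiD estimate from four group means."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical mean test scores before and after a new teaching method,
# for a class that received it (treated) and one that did not (control).
effect = diff_in_diff(treated_pre=70.0, treated_post=78.0,
                      control_pre=71.0, control_post=74.0)
print(effect)  # (78 - 70) - (74 - 71) = 5.0
```

In practice the same estimate is usually obtained from a regression with group, period, and group-by-period interaction terms, which also yields standard errors; the subtraction above is the core identity the regression recovers.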